Data Processing

In this category, you will find instructions for preparing and parsing your original raw data. Data Slicer is specifically dedicated to transforming numeric data into binned categories. Finally, querying facilities are explained, which let you extract sub-corpora or enrich existing tables in your corpus.

Data processing documentation

Data formats

CorText Manager offers a full ecosystem of modeling and exploratory tools for analyzing data, which can be more or less calibrated. A wide range of data formats can be imported into CorText Manager, giving you great flexibility in the types of data that can be processed. CorText Manager is particularly well suited to processing text:...

Upload corpus

Once you have collected your data (see the Data formats section for more information about the data supported by CorText Manager), you should first compress the file(s) that compose your corpus into a single zip archive before uploading it into CorText Manager. The zip archive should be created regardless of the format of your original data...
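As an illustration of the zipping step, here is a minimal Python sketch that bundles a list of files into a single zip archive using the standard zipfile module. The function name zip_corpus and the choice to store files flat (without their folder paths) are assumptions made for this example, not requirements stated by CorText Manager.

```python
import zipfile
from pathlib import Path


def zip_corpus(files, archive_path):
    """Bundle the given files into one zip archive.

    Hypothetical helper: files are stored flat, under their base name only.
    """
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in files:
            file_path = Path(file_path)
            # arcname drops the directory part so the archive has no nested folders
            archive.write(file_path, arcname=file_path.name)
    return archive_path
```

For instance, zip_corpus(["records_1.txt", "records_2.txt"], "my_corpus.zip") would produce one archive containing both files, where the file names are placeholders for your own data.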

Data Parsing

After uploading your corpus, the first mandatory step before you can apply analysis scripts in CorText Manager is to run the “Data Parsing” script. The data parser transforms your corpus into a SQLite database (see the “what does the parsing step?” paragraph below for more details about the database structure). The “Data Parsing” script can only process...
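Because the parsed corpus is a SQLite database, it can in principle be inspected with any SQLite client. The sketch below uses Python's standard sqlite3 module to list the tables a database file contains. It makes no assumption about the table names CorText Manager produces; the helper name list_tables and the file name used in the usage note are hypothetical.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in a SQLite database, sorted alphabetically."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Each row is a 1-tuple holding the table name
    return [name for (name,) in rows]
```

With a downloaded database you might call list_tables(sqlite3.connect("my_corpus.db")), replacing the placeholder file name with your own.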

Query Corpus

This script allows you to query your corpus and build a subcorpus or create new fields with SQL-like queries. Two modes of querying are offered, SQL being the standard one. Querying your corpus in a SQL-like mode: the principle is to allow users to directly perform SQL-like queries on their corpus. Query type: Choose sql...
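To give a feel for what a SQL-like selection of a subcorpus looks like, here is a small self-contained sketch using Python's sqlite3 module on a toy in-memory table. The table documents, its columns (id, year, title), the sample rows, and the function select_subcorpus are all invented for this example; they do not describe the schema or the query form used by CorText Manager.

```python
import sqlite3

# Toy database with a hypothetical schema, for illustration only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, year INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [
        (1, 2005, "First document"),
        (2, 2010, "Second document"),
        (3, 2015, "Third document"),
        (4, 2020, "Fourth document"),
    ],
)


def select_subcorpus(connection, first_year, last_year):
    """Return the ids of documents whose year lies in [first_year, last_year]."""
    rows = connection.execute(
        "SELECT id FROM documents WHERE year BETWEEN ? AND ? ORDER BY id",
        (first_year, last_year),
    ).fetchall()
    return [doc_id for (doc_id,) in rows]
```

The ids returned by such a query identify the documents that would make up the subcorpus; BETWEEN is inclusive at both ends.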

Data Slicer

Data Slicer slices numeric data (provided that they are integer values) into any given number of quantiles (to be chosen in the form). For example, if a database compiles information about individuals, including their age, it may be useful to transform this field into bins of significant age ranges. In turn, it...
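The idea of quantile slicing can be sketched in a few lines of Python. This is a generic equal-frequency binning, not CorText Manager's actual implementation: the function names and the convention for placing cut points (taking the value found at each k/n_bins position of the sorted data) are assumptions made for this example.

```python
from bisect import bisect_right


def quantile_edges(values, n_bins):
    """Return the n_bins - 1 inner cut points for roughly equal-frequency bins."""
    if not values or n_bins < 1:
        raise ValueError("need at least one value and one bin")
    ordered = sorted(values)
    n = len(ordered)
    # Assumed convention: the k-th cut point is the value at position k * n / n_bins
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def slice_into_bins(values, n_bins):
    """Map each integer value to a bin index between 0 and n_bins - 1."""
    edges = quantile_edges(values, n_bins)
    # A value equal to a cut point falls into the upper bin
    return [bisect_right(edges, value) for value in values]


ages = [18, 22, 25, 31, 38, 44, 52, 67]
print(slice_into_bins(ages, 4))  # prints [0, 0, 1, 1, 2, 2, 3, 3]
```

Identical values always land in the same bin, so bins can be uneven when the data contain many ties.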

Upload resource

You can upload any kind of document (doc, pdf, PowerPoint) into your project. This is particularly useful for sharing documents with the participants you are working with in the project or, for example, for storing in your project a scientific article that would be useful for your analysis. But you may want to...

Data Curation

The Data Curation script helps you apply transformations to your corpus. Database level. Rename a Database: rename your corpus/database with a new name. This is useful to shorten the names of databases built by Query Corpus, which are usually long. Remove duplicate entries: this option allows you to get rid...
