But you may also want to add a specific resource to your corpus and work with it: a term list (from Corpus Terms Indexer), a list of equivalent entities (from List Builder) that you want to clean, an external classification… These files may come from CorText Manager itself, after being downloaded and refined outside CorText Manager, or from an external source (e.g., dictionaries built or found elsewhere).
This is a core feature of CorText Manager, which was also designed to extract, enrich, and refine the initial information in your corpora in order to strengthen your analyses.
If you have refined your term list or list of entities directly in CorText Manager using the built-in csv editor, there is no need to go through this step.
Upload a resource (e.g., a list of terms)
For a file of reasonable size (up to several hundred megabytes), do not zip it: simply click on the red “upload file” button, drag and drop the tsv or csv file, and wait. And that's all!
At the end of the “upload process”, the file will be automatically added to your project.
Common format of a resource (e.g., a list of terms)
The most straightforward way to format your resource and to work with it in CorText Manager is to encode it in UTF-8, separate its columns with tabs, and store it in a .tsv or .csv file.
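If you prepare the resource outside CorText Manager, a short script can help guarantee the encoding and the separator. The following is a minimal sketch in Python using only the standard library. The column names (`term`, `forms`), the example rows, and the file name `my_terms.tsv` are hypothetical and only illustrate the idea: the columns your resource actually needs depend on the script you plan to run on it.

```python
import csv


def write_tsv(path, header, rows):
    """Write a header and rows to a UTF-8, tab-separated file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


def read_tsv(path):
    """Read the file back as a list of rows to check its content."""
    with open(path, "r", encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))


if __name__ == "__main__":
    # Hypothetical columns and values, for illustration only.
    header = ["term", "forms"]
    rows = [
        ["climate change", "climate changes"],
        ["café", "cafés"],
    ]
    write_tsv("my_terms.tsv", header, rows)
    for row in read_tsv("my_terms.tsv"):
        print(row)
```

Reading the file back before uploading it is a cheap way to confirm that accented characters survived and that every row has the same number of columns.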
Apply the resource to the corpus
To work with this new resource in your corpus, you have to apply it to the dataset already in your project. Depending on your initial purpose, you may need to run the Corpus Terms Indexer or Corpus List Indexer scripts, or any other script that can work with a resource.