Merge Meta Data in a csv file to a text corpus

PaulG asked 7 years ago

Hi,
First of all, thank you very much for your app.
I’m having trouble doing something that should be simple.
I uploaded a text corpus (.text) and then a CSV file with all the metadata for each text.
Now I would like to merge these two databases (the filename of each text corresponds to a code column in the metadata).
I tried to use the “Corpus List Indexer” script, as shown in the distant reading video tutorial, but the new list only gathers the filenames of the two databases.
Sorry if it’s a stupid question, but is there a simpler way to merge them, so that I can later create subcorpora or run textual analyses based on certain variables?

Thank you again

Question Tags: MergeMetaData

Jean-Philippe Cointet Staff replied 7 years ago

Hi PaulG,
I’m not sure I fully understood which operations you performed.
But if you already have an existing corpus parsed in your workspace (let’s suppose it comes from a bunch of txt files), you can enrich your dataset with metadata coming from a csv file.
To do so:
– make sure that the first column of your csv file is filled with the filenames of the corpus. The safest way to do so is simply to run list builder against the field original_filename and retrieve the csv file that is produced.
– add as many columns to the spreadsheet as you have dimensions of metadata
– save the file as .csv (tab separated)
– upload the csv file and select “term list” when prompted (this is not a dataset)
– run corpus list indexer to map the original filenames to the different columns present in your bespoke csv file
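
If you would rather prepare the tab-separated file with a script than in a spreadsheet, the steps above can be sketched in Python as follows. This is only a minimal illustration under assumptions: the function name build_metadata_table, the sample filenames and the "author" and "year" columns are hypothetical, and whether Cortext expects a header row in a term list is not confirmed here, so drop the header line if it does not.

```python
import csv
import io


def build_metadata_table(filenames, metadata, columns):
    """Return tab-separated text with one row per corpus file.

    The filename always goes in the first column, followed by one
    column per metadata dimension (hypothetical helper, not a Cortext API).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header row: an assumption, remove it if the term list should have none.
    writer.writerow(["original_filename"] + list(columns))
    for name in filenames:
        row = metadata.get(name, {})
        # Missing values are left empty rather than guessed.
        writer.writerow([name] + [row.get(col, "") for col in columns])
    return buffer.getvalue()


# Hypothetical sample data: filenames as produced by list builder,
# metadata keyed by the same filenames.
filenames = ["text_001.txt", "text_002.txt"]
metadata = {"text_001.txt": {"author": "A. Example", "year": "1999"}}
table = build_metadata_table(filenames, metadata, ["author", "year"])
```

Writing the returned text to a file with a .csv extension gives something you can upload as a “term list”. Files with no metadata keep their row with empty cells, so every filename in the corpus still has a matching line.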

PaulG answered 7 years ago

Hi, thank you very much for your quick answer!
It works now. My mistake was importing the metadata csv as a dataset rather than as a term list!
I guess it is possible to declare a metadata variable as a timestep afterwards.
Thank you again