Hi,
First of all, thank you very much for your app.
I’m having some trouble doing something that should be simple.
I uploaded a text corpus (.text) and then a CSV file with all the metadata for each text.
Now I would like to merge those two databases (the filename of each text corresponds to a code column in the metadata).
I tried to use the “Corpus List Indexer” script, as shown in the distant reading video tutorial, but the new list contains only the filenames from the two databases.
Sorry if it’s a stupid question, but is there a simpler way to merge them? (I’d like to be able to create subcorpora or run textual analyses based on certain variables later on.)
Thank you again
Hi PaulG,
I’m not sure I fully understood which operations you performed.
But if you already have an existing corpus parsed in your working space (let’s suppose it comes from a bunch of txt files), you can enrich your dataset with metadata coming from a csv file.
To do so:
– make sure that the first column of your csv file is filled with the filenames of the corpus. The safest way to do this is simply to run list builder against the field original_filename and retrieve the csv file that is produced.
– add as many columns to the spreadsheet as you need dimensions of metadata
– save the file as .csv (tab separated)
– upload the csv file and select “term list” when prompted to (this is not a dataset)
– run corpus list indexer to map the original filenames to the different columns present in your bespoke csv file
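
If you would rather prepare that file with a script than by hand in a spreadsheet, the first three steps can be sketched in Python roughly as follows. This is only an illustration done outside the app, under a few assumptions: the function name build_metadata_tsv, the column name "code" and the sample values are hypothetical, each filename without its extension is assumed to equal the code in your metadata, and the header row is a guess at what the upload expects, so adjust it to match the file that list builder produces.

```python
import csv
import io
from pathlib import PurePath


def build_metadata_tsv(filenames, metadata_rows, code_column="code"):
    """Return tab-separated text whose first column is the corpus filename."""
    # Index the metadata rows by code so each filename can look up its row.
    by_code = {row[code_column]: row for row in metadata_rows}
    # Every metadata column except the code becomes an extra column.
    extra_columns = [c for c in metadata_rows[0] if c != code_column]

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Assumed header; rename it to match the list builder export if needed.
    writer.writerow(["original_filename"] + extra_columns)
    for name in filenames:
        # Assumption: the filename without its extension equals the code.
        code = PurePath(name).stem
        row = by_code.get(code)
        if row is None:
            continue  # no metadata for this text, so leave it out
        writer.writerow([name] + [row[c] for c in extra_columns])
    return out.getvalue()


# Hypothetical sample data standing in for the list builder export
# and for the rows of the metadata spreadsheet.
filenames = ["A001.txt", "A002.txt"]
metadata_rows = [
    {"code": "A001", "year": "1999", "author": "Smith"},
    {"code": "A002", "year": "2004", "author": "Jones"},
]
tsv_text = build_metadata_tsv(filenames, metadata_rows)
print(tsv_text)
```

Write tsv_text to a file ending in .csv, then upload it as a “term list” and run corpus list indexer as described above.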