distant reading

Carolina Rau asked 3 days ago

Hi! I am following the video tutorial for distant reading using a Spanish PDF corpus. I noticed that the terms in the top_1000 list created by the list-building script do not correspond exactly to the names of my files. My filenames are separated by tabs, with the year at the end. When the script creates the list, it drops the first term of every filename. I am also having trouble separating the dates and reindexing.
Thank you for your help!

Carolina Rau replied 3 days ago

I already figured it out 🙂