Dear community,
I am having trouble tracking down the cause of a parsing script error (see below) with a WoS corpus. For both corpora I exported the records with “save as” plain text and followed the other steps outlined in the CorText documentation. In one case the parsing worked and in the other it did not.
Source:
Type of Data: dataset
Corpus Format: isi
Ignore entries with incorrectly formatted time steps: true
2019-04-29 17:30:21 INFO : Preparing raw data
2019-04-29 17:30:21 INFO : Parsing file /srv/local/documents/3e08/3e08a1de6c4b15c7edb1d86d0b534d71/savedrecs/savedrecs (4).txt
2019-04-29 17:30:21 DEBUG : Something went wrong while trying to parse, are you sure you selected the correct corpus format ?
Before that, I parsed another WoS corpus with the same settings and it worked:
Source:
Type of Data: dataset
Corpus Format: isi
Ignore entries with incorrectly formatted time steps: true
2019-04-29 16:29:33 INFO : Preparing raw data
2019-04-29 16:29:33 INFO : Parsing file /srv/local/documents/9a1a/9a1a663cd071787e187bc2f4e1763450/29-04-full-98-18-sorted-by-relevance/29.04 Full 98-18 sorted by relevance.txt
2019-04-29 16:29:34 INFO : Data Enriching
Many thanks in advance!
After some more troubleshooting I may have found the solution:
When I download the corpus from WoS with the “Other reference software” option (which also produces a .txt file) instead of “plain text” (as outlined in the documentation), the parsing works.
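For anyone comparing a working export with a failing one, a quick check along these lines might help narrow things down. It is only a sketch. It assumes that the "isi" corpus format means the usual field-tagged WoS layout (an `FN` header line, a `VR` line, records opening with `PT` and closing with `ER`, and a final `EF`). I do not know what the CorText parser actually checks, and the function name `inspect_wos_export` is made up for illustration.

```python
def inspect_wos_export(text):
    """Report whether `text` looks like a field-tagged WoS export.

    Assumption: the expected layout is FN/VR header lines, records that
    open with a "PT " line and close with an "ER" line, and a final "EF".
    This is a diagnostic sketch, not a reimplementation of any parser.
    """
    # Some exports start with a UTF-8 byte-order mark; note it and drop it
    has_bom = text.startswith("\ufeff")
    lines = text.lstrip("\ufeff").splitlines()
    non_empty = [ln for ln in lines if ln.strip()]
    first = non_empty[0] if non_empty else ""

    return {
        "has_bom": has_bom,
        "starts_with_FN": first.startswith("FN "),
        "has_VR_line": any(ln.startswith("VR ") for ln in non_empty[:5]),
        # Count record openings and closings; they should match
        "records_opened": sum(ln.startswith("PT ") for ln in lines),
        "records_closed": sum(ln.strip() == "ER" for ln in lines),
        "ends_with_EF": bool(non_empty) and non_empty[-1].strip() == "EF",
    }
```

To use it, read each downloaded file as text (for example with `open(path, encoding="utf-8", errors="replace").read()`) and compare the two reports. If the failing file has no `FN` line, or its opened and closed record counts differ, that would suggest the export itself differs from the working one, not the CorText settings.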