Agathe Riou asked 1 year ago

I’m working with a list of authors, and several of them appear under different aliases (e.g. John Smith, J. Smith, J Smith…), so the same author can appear several times in the co-author network map. 
I tried term indexing, but it seems that CorText doesn’t handle last names well (it only picks up a few of them). I also tried copying the list of authors from a CSV file into the column “main form” and putting the aliases in the column “forms using” |&|, but it didn’t work. 
I then tried to index a list, putting all the authors in the column “first entity” and, for each alias, the correct name in the column “second entity”, but that didn’t work either. 
Thank you in advance for your help and have a great day! 

2 Answers
Lionel Staff answered 1 year ago

Hi Agathe,
Thanks for your question.
The most straightforward way to achieve what you want is to:

  • Run a list builder to extract all the variations of your names. The result should match the list you already have; if it does, you do not need to run it again;
  • Download the list, or edit it directly online, and apply the transformations you want. See an example there on a list of keywords. Please use tools that preserve the raw text formatting (Google Sheets and LibreOffice Calc can handle TSV files, for example);
  • Upload it again (if downloaded);
  • Run a list indexer, select the “Add a dictionary of equivalent strings” feature, and optionally add the name you want for your new variable.
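
For a long author list, the editing step above can be partly prepared outside CorText. The following is only a minimal Python sketch, not a CorText feature. It assumes that aliases can be grouped by last name plus first initial (which may wrongly merge two different people, so the result should be reviewed by hand), and the column headers "main form" and "forms" are placeholders that should be checked against a list actually exported by the list builder. The function names are hypothetical.

```python
import csv
import io
import re
from collections import defaultdict

SEPARATOR = "|&|"  # separator between equivalent forms, as mentioned in the question


def alias_key(name):
    """Hypothetical grouping key: last name + first initial, lower-cased."""
    tokens = re.findall(r"[^\W\d_]+", name.lower())  # letters only, drops dots
    if not tokens:
        return ""
    if len(tokens) == 1:
        return tokens[0]
    return f"{tokens[-1]} {tokens[0][0]}"


def group_aliases(names):
    """Return (main form, joined forms) pairs, one per group of aliases."""
    groups = defaultdict(list)
    for name in names:
        cleaned = name.strip()
        if cleaned and cleaned not in groups[alias_key(cleaned)]:
            groups[alias_key(cleaned)].append(cleaned)
    rows = []
    for forms in groups.values():
        main_form = max(forms, key=len)  # assumption: longest spelling is the full name
        rows.append((main_form, SEPARATOR.join(forms)))
    return rows


def to_tsv(rows):
    """Serialise the rows as tab-separated text with placeholder headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["main form", "forms"])  # placeholder headers, check your export
    writer.writerows(rows)
    return buffer.getvalue()


names = ["John Smith", "J. Smith", "J Smith", "Marie Curie", "M. Curie"]
rows = group_aliases(names)
print(to_tsv(rows))
```

The printed text can be saved as a .tsv file, reviewed, and then uploaded as described in the steps above.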

I hope this helps!

Agathe Riou answered 1 year ago

Hi Lionel, 
Thank you so much for your answer! It worked! I don’t have duplicates anymore. 
Thanks again and have a great day,