Relevance of attribution to a unique cluster based on keywords data

CorText Manager Q&A forum | Category: Network mapping
Beadavi asked 4 years ago

Hello,
I would like to know whether attributing a reference to a cluster based on keyword cooccurrences (author keywords and ISI keywords) is relevant, even though the number of occurrences of each term in a document is always 0 or 1, since the keywords are not free text.
Thank you in advance

Lionel Staff replied 4 years ago

see below

1 Answer
Lionel Staff answered 4 years ago

Dear Béatrice,
If I have understood correctly: it is not a question of 0 or 1, nor of whether the keywords come from full text or from the indexed keywords supplied by the data provider.
A document is represented by a specific set of keyword combinations (cooccurrences). It does not matter whether these keywords come from author keywords or are extracted from full text.

Summed over all the documents of the corpus, these cooccurrences build a matrix: the relationships between keywords are calculated with the chosen proximity measure. CorText Manager classifies these cooccurrences based on their relationships to build the clusters.
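To illustrate why 0/1 keyword data is enough, here is a minimal sketch of how cooccurrences can be summed over a corpus. It is not CorText Manager's code: the toy corpus, the function name `cooccurrence_counts`, and the use of raw counts (rather than a proximity measure) are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each document is only a set of keywords,
# so a keyword is either present (1) or absent (0) in a document.
docs = [
    {"graphene", "nanotube", "conductivity"},
    {"graphene", "conductivity"},
    {"policy", "innovation"},
    {"policy", "innovation", "graphene"},
]

def cooccurrence_counts(documents):
    """Count, over the whole corpus, how many documents contain each keyword pair."""
    counts = Counter()
    for keywords in documents:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(keywords), 2):
            counts[(a, b)] += 1
    return counts

cooc = cooccurrence_counts(docs)
print(cooc[("conductivity", "graphene")])  # 2: the pair appears in two documents
print(cooc[("graphene", "policy")])        # 1: the pair appears in one document
```

Even though each keyword appears at most once per document, the sums over the corpus differ from pair to pair, and it is these differences that a proximity measure works on.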

To project a document onto the clusters, CorText Manager simply measures which clusters are closest to the document, by comparing the keywords listed for the document with the keywords of each cluster and their relationships. If “Assign a unique cluster to each record (best match)” is set to “yes”, the best match is chosen.
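The idea of a best match can be sketched as follows. This is only an illustration under simplifying assumptions: clusters are reduced to plain keyword sets, the score is the share of the document's keywords found in the cluster, and the names `match_scores`, `assign` and `unique` are hypothetical. The comparison CorText Manager performs also takes the relationships between keywords into account, which this sketch ignores.

```python
def match_scores(doc_keywords, clusters):
    """Score each cluster by the share of the document's keywords it contains."""
    scores = {}
    for name, cluster_keywords in clusters.items():
        shared = doc_keywords & cluster_keywords
        scores[name] = len(shared) / len(doc_keywords) if doc_keywords else 0.0
    return scores

def assign(doc_keywords, clusters, unique=True):
    """Return the matching clusters, best first; only the best one if unique is True."""
    scores = match_scores(doc_keywords, clusters)
    matching = [name for name, score in scores.items() if score > 0]
    # Highest score first; ties are broken alphabetically to keep the result stable.
    ranked = sorted(matching, key=lambda name: (-scores[name], name))
    return ranked[:1] if unique else ranked

# Hypothetical clusters and document, for illustration only.
clusters = {
    "materials": {"graphene", "nanotube", "conductivity"},
    "policy": {"policy", "innovation", "regulation"},
}
doc = {"graphene", "conductivity", "policy"}

print(assign(doc, clusters))                # ['materials']
print(assign(doc, clusters, unique=False))  # ['materials', 'policy']
```

With `unique=True` the document is attached to a single cluster, which mirrors the effect of setting the option to “yes”; with `unique=False` every cluster sharing at least one keyword is returned.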

I hope this helps!