Hello,
I have made a co-word network map with 500 nodes (node weight left at its default, so node weights scale with the sum of their co-occurrences). I did this after running a “Corpus Terms Indexer” on a custom terms list.
I have three questions:
– is it correct that the script proceeds in this order: first it calculates for each term in the list provided the sum of its co-occurrences, then ranks the terms according to this score, then calculates the “similarity” (according to the chosen proximity measure) only for the 500 nodes with the highest cooc-sum scores?
– why does the resulting network only contain “497 nodes” (according to the “Graph overview” in the Explore tab in Retina)? Does this mean that the parameters I’ve chosen in the Edges tab result in 3 of the top 500 nodes being orphaned (i.e. not having a strong enough link with any other node on the map to appear on it)?
– how can I obtain a file in which the sum of co-occurrences of each term is specified? (I could find it neither in the “terms-expanded.tsv” file resulting from the indexation, nor in the “maps_output.zip” folder resulting from the network mapping).
Thanks in advance for your reply!
Aurélien
Dear Aurélien,
Yes!
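To make the order of operations from the first question concrete, here is a minimal sketch in Python. It is not the actual CorText script: the function names, the toy documents, and the use of cosine similarity as a stand-in for the chosen proximity measure are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
import math

def cooccurrence_counts(docs):
    """Count, for each unordered term pair, the documents containing both terms."""
    pairs = Counter()
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs

def cooc_sums(pairs):
    """Step 1: sum of co-occurrences for each term."""
    sums = Counter()
    for (a, b), n in pairs.items():
        sums[a] += n
        sums[b] += n
    return sums

def top_terms(sums, n):
    """Step 2: rank terms by co-occurrence sum and keep the n highest."""
    ranked = sorted(sums.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]

def cosine(u, v):
    """Stand-in proximity measure (assumption): cosine of two count profiles."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarities(pairs, kept):
    """Step 3: compute the proximity only between the kept terms."""
    profiles = {t: Counter() for t in kept}
    for (a, b), n in pairs.items():
        if a in profiles:
            profiles[a][b] += n
        if b in profiles:
            profiles[b][a] += n
    return {(a, b): cosine(profiles[a], profiles[b])
            for a, b in combinations(kept, 2)}

# Toy corpus: each document is the list of indexed terms it contains.
docs = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["d", "a"]]
pairs = cooccurrence_counts(docs)
sums = cooc_sums(pairs)
kept = top_terms(sums, 3)          # stands in for the top 500 nodes
edges = similarities(pairs, kept)  # one candidate edge per pair of kept terms
```

In this toy corpus the term "d" has the lowest co-occurrence sum, so it is dropped before any proximity is computed.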
You have activated the “find optimal threshold” option, which automatically calculates the threshold used to remove weak edges. In your network, the edge values are calculated with the distributional measure.
In your network:
- The final proximity threshold is 0.35 (calculated by the “find optimal threshold” option, based on the distributional measure)
- 500 nodes are selected, but for 3 of them every edge to another node is below this value. So, these three nodes are excluded: penicillin-binding protein, acne, machine learning
To show the three disconnected nodes on your map, you can set the “Hide isolated nodes” option to “no”. They will then be plotted in a corner of the map.
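As an illustration of why a threshold can leave some nodes without any edge, here is a small Python sketch. The function name, the node names, and the edge weights are invented for the example (they are not values from your network), and keeping edges whose weight is at or above the threshold is an assumption about the exact comparison used.

```python
def split_isolated(nodes, edges, threshold):
    """Drop edges below the threshold, then list nodes left with no edge.

    edges maps an unordered pair (a, b) to its proximity value.
    """
    kept_edges = {pair: w for pair, w in edges.items() if w >= threshold}
    connected = {node for pair in kept_edges for node in pair}
    isolated = [node for node in nodes if node not in connected]
    return kept_edges, isolated

# Hypothetical toy network with made-up proximity values.
nodes = ["term_a", "term_b", "term_c", "term_d"]
edges = {
    ("term_a", "term_b"): 0.60,
    ("term_b", "term_c"): 0.40,
    ("term_c", "term_d"): 0.20,
    ("term_a", "term_d"): 0.10,
}

kept_edges, isolated = split_isolated(nodes, edges, threshold=0.35)
```

Here "term_d" has two edges, but both are below 0.35, so it ends up isolated. This is the same situation as the three nodes missing from the 497-node map.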
I hope it helps.
L