Link between the data produced by CorText and those produced by Gephi

Anne-Lise Dauphiné-Morer asked 2 years ago

Dear Cortext Team,
I am coming back to you to ask for help in linking the data produced by CorText and those produced by Gephi. To analyse my co-occurrence networks, I need several measures: weight (only calculated by CorText), degree, betweenness, eigenvector centrality, PageRank, clustering coefficient and modularity. Some of these are already calculated by CorText but others are not (for example the clustering coefficient). So I work with Gephi (based on the .gexf data from Cortext). But the same measures calculated by Gephi and CorText never give the same values. For example, the betweenness for a given node is 0.002454185 in Cortext and 2190.783367 in Gephi (and the multiplying factor between CorText and Gephi varies from one node to another). I have the same kind of problem with the clustering calculation: CorText proposes 11 clusters and Gephi only 10.
Could you help me understand why the data are different? And can I use CorText data and Gephi data together (especially the clustering coefficient and modularity), or do I have to choose one of these tools (i.e. work only with CorText data or only with Gephi)?
And finally, there is an attribute named “level-edge” which is “low” for all edges; I am not sure how to interpret it.
Thank you in advance for your help!

4 Answers
Lionel Staff answered 2 years ago

Dear Anne-Lise,
The betweenness centrality in CorTexT Manager is normalized between 0 and 1. In order to find a similar behavior with Gephi, you have to activate the option “Normalized [0,1]” when using the Diameter calculation.
Beyond that, there are many other small adjustments made by CorTexT that you will have some difficulty reproducing in Gephi:

  • directed vs non-directed: Gephi sees the gexf as non-directed, but most of the time CorTexT produces a network which is not symmetrical. This plays a role in the betweenness centrality;
  • Weight vs weight_edge;
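To illustrate the normalization and the directed vs non-directed points, here is a minimal Python sketch. It is not CorTexT's or Gephi's actual code: the toy graph and the helper names `betweenness` and `symmetrize` are assumptions made for the example, and the normalization shown (dividing by the number of node pairs) is one common convention. It suggests that normalization alone rescales every node by the same factor, whereas reading an asymmetrical network as non-directed changes the ratio from one node to another.

```python
from collections import deque

def betweenness(adj, directed=False, normalized=False):
    """Brandes betweenness for an unweighted graph given as {node: set(successors)}."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    pairs = (n - 1) * (n - 2)          # ordered pairs of other nodes
    for v in bc:
        if not directed:
            bc[v] /= 2                 # each unordered pair was counted twice
        if normalized and n > 2:
            bc[v] /= pairs if directed else pairs / 2
    return bc

def symmetrize(adj):
    """Return the non-directed version of a directed adjacency dict."""
    und = {v: set(ns) for v, ns in adj.items()}
    for v, ns in adj.items():
        for w in ns:
            und[w].add(v)
    return und

# Hypothetical asymmetrical network: a -> b, b -> c, b -> e, c -> d
directed_adj = {"a": {"b"}, "b": {"c", "e"}, "c": {"d"}, "d": set(), "e": set()}
undirected_adj = symmetrize(directed_adj)

raw_und = betweenness(undirected_adj)
norm_und = betweenness(undirected_adj, normalized=True)
raw_dir = betweenness(directed_adj, directed=True)

for node in ("b", "c"):
    print(node,
          "raw/normalized:", raw_und[node] / norm_und[node],      # same factor for every node
          "undirected/directed:", raw_und[node] / raw_dir[node])  # differs per node
```

In this toy case the raw-to-normalized factor is identical for both nodes, while the undirected-to-directed ratio is not, which is consistent with a multiplying factor that varies from node to node.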

I think the most important thing is to stay consistent between the tools and methods used, and not to combine metrics calculated by different tools.
The “low” indicates that you are working at the node level. In the folder where you found the gexf, you have another file which is “High”, as that one is done at the cluster level. You can navigate between the two scales; it could be interesting for you!
I hope it helps.

Anne-Lise Dauphiné-Morer answered 2 years ago

Dear Lionel,
Thank you very much for your answer!

As the communities detected by Cortext were the basis of the other steps of my research, and Gephi does not detect the same number, does that mean I can’t work with Gephi data?
If this is the case, is there a way to recover the modularity, density and clustering coefficient, directly calculated by Cortext?

Thanks in advance for your help!


Lionel Staff answered 2 years ago

If your work is based on the community detection made in CorTexT Manager, you should use it and stick to the clusters and modularity produced by it. See this also, which could help you with the community detection algorithm. The optimal modularity of the network is calculated and shown in the log of the job that corresponds to the Network analysis, next to the line: “Optimal Modularity obtained by Louvain Algorithm for network” …
My comment was more to warn you that when you calculate other global metrics outside, such as the density, they will be based on a network produced after some calculations and filters performed by CorTexT Manager (the selected proximity measure, the selection of the important edges, etc.) and not on the original network.
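As a rough illustration of why the modularity value is tied to one particular set of clusters, the sketch below computes the standard (Newman) modularity of a given partition on a weighted, non-directed edge list. The function name `modularity`, the toy network and the two partitions are assumptions for the example; this is not the code CorTexT Manager runs, and self-loops are not handled.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition; edges are (u, v, weight) tuples, non-directed."""
    total = 0.0                      # total edge weight W
    internal = defaultdict(float)    # weight of edges inside each community
    strength = defaultdict(float)    # summed weighted degree of each community
    for u, v, w in edges:
        total += w
        strength[community_of[u]] += w
        strength[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    # Q = sum over communities of [ W_c / W - (S_c / 2W)^2 ]
    return sum(internal[c] / total - (strength[c] / (2 * total)) ** 2
               for c in strength)

# Hypothetical network: two triangles joined by a single edge
edges = [(1, 2, 1.0), (2, 3, 1.0), (1, 3, 1.0),
         (4, 5, 1.0), (5, 6, 1.0), (4, 6, 1.0),
         (3, 4, 1.0)]

partition_a = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
partition_b = {1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "B"}

q_a = modularity(edges, partition_a)
q_b = modularity(edges, partition_b)
print(q_a, q_b)  # the same network gives a different Q for each partition
```

A modularity reported for one partition (for example 11 clusters) therefore cannot be paired with a different partition (for example 10 clusters) of the same network.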
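The effect of filtering on global metrics can be sketched as follows. The weight threshold used here is only a stand-in for CorTexT Manager's edge selection, whose details are not given in this thread; the edge list and the helper names (`build_adjacency`, `density`, `average_clustering`) are likewise assumptions for the illustration.

```python
from itertools import combinations

def build_adjacency(nodes, edges):
    """Non-directed adjacency sets from (u, v, weight) tuples; keeps isolated nodes."""
    adj = {v: set() for v in nodes}
    for u, v, _w in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def density(adj):
    """Density of a simple non-directed graph: 2m / (n (n - 1))."""
    n = len(adj)
    m = sum(len(ns) for ns in adj.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for ns in adj.values():
        k = len(ns)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(ns, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

nodes = ["a", "b", "c", "d"]
# Hypothetical weighted co-occurrence edges
all_edges = [("a", "b", 0.9), ("b", "c", 0.8), ("a", "c", 0.7),
             ("c", "d", 0.2), ("a", "d", 0.1)]
# Assumed filter: keep only edges with weight >= 0.5
kept_edges = [e for e in all_edges if e[2] >= 0.5]

full = build_adjacency(nodes, all_edges)
filtered = build_adjacency(nodes, kept_edges)

print("density:   ", density(full), "->", density(filtered))
print("clustering:", average_clustering(full), "->", average_clustering(filtered))
```

Both metrics change once the weak edges are dropped, so a value computed on the exported gexf describes the filtered network only.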

Anne-Lise Dauphiné-Morer answered 2 years ago

Dear Lionel,
Thank you for these explanations! I think I have everything I need.