Hi, I am analysing the historical evolution of a network, using time periods to build three different maps. To better understand the inner structure of each period and each cluster, I am trying to access the cluster information. However, I can’t find these details (the cluster name and the list of terms in each cluster) in the output that the script produces.
To access these details, I thought I would query the corpus to build three different data sets, one for each period. In practice this had two shortcomings:
- I am not able to query the third period (that is why I was asking this: https://docs.cortext.net/question/not-being-able-to-query-a-new-corpus/), so this option is off the table for the moment.
- The maps and clusters I get differ, and I get the feeling that I could be missing something important here. Both graphics have been produced with the same specs (top150 nodes) and the same term list (terms_400). In the historical network mapping I selected the option that gives each period the same number of nodes; if I don’t do this, I get a very small network compared with the following periods, which makes analysis and comparison difficult. Here are some screenshots I made:
In case the images do not upload, here are the links:
Historical Network: https://www.dropbox.com/s/44mlpa5gqjdmv9o/1-hn-ener-92-2016-selglobal0-0top150-isitermsglobal-400-isitermsglobal-400-distributionalcooc-99999-ot0-47-9999-loutrue.png?dl=0
Network from the same period using the Queried corpus: https://www.dropbox.com/s/tu4fqumxfau1a2o/2-hn-ener-92-2016-selglobal1992-2000top150-isitermsglobal-400-isitermsglobal-400-distributionalcooc-99999-ot0-39-9999-loutrue.png?dl=0
Do you have any clue about what is going on here? Maybe I am being too picky about this; from a general perspective the ideas do not differ that much. Still, some clusters appear connected through different words and are placed at different distances, so I get the feeling that there is something I am not getting.
Thanks in advance!
I just had a look at the query script and made some changes so that the temporal fields appear in the form as Time Steps (instead of ISIpubdate) and Periods (instead of ISIpubdate_custom). This should make things easier in the future, although it does not explain the bug you observed.
I tested it, and in my case selecting Periods as the target field for my query and typing data = '2' worked very well.
I assume you may have an issue with the way the period slicer operated. Have you checked the log of this script to verify the number of periods that were created?
Regarding the second issue: do you mean that producing three different maps from a different sub-corpus yields a different outcome than generating three temporal maps?
If so, it is probably because the automatic thresholding adapts precisely to each map in one case, and in the other case adopts the smallest threshold that guarantees that the network is “connected”. The networks are therefore not comparable.
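To illustrate why a different threshold alone changes the picture, here is a minimal Python sketch. It is not the platform's code: the terms and weights are invented, and the two thresholds 0.47 and 0.39 are borrowed from the "ot0-47" and "ot0-39" fragments of the two file names above, on the assumption (not confirmed) that those fragments encode the threshold used for each map.

```python
def filter_edges(edges, threshold):
    """Keep only edges whose weight is at least `threshold`."""
    return {pair: w for pair, w in edges.items() if w >= threshold}


def is_connected(nodes, edges):
    """Return True if every node can be reached from the first one."""
    nodes = list(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nxt in neighbours[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(nodes)


# Invented similarity weights between four hypothetical terms.
nodes = ["solar", "wind", "grid", "storage"]
edges = {
    ("solar", "wind"): 0.60,
    ("wind", "grid"): 0.50,
    ("grid", "storage"): 0.48,
    ("solar", "storage"): 0.42,
    ("solar", "grid"): 0.40,
}

strict = filter_edges(edges, 0.47)  # assumed threshold of the first map
loose = filter_edges(edges, 0.39)   # assumed threshold of the second map

print(sorted(strict))
print(sorted(loose))
print(is_connected(nodes, strict), is_connected(nodes, loose))
```

Both filtered networks are connected, but the lower threshold keeps two extra links, so a layout or clustering step run on each would start from different graphs even though the terms are identical.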
(regarding the FIRST question)
The period slicer says that it has ‘correctly finished’.
Also, with ‘corpus explorer’ I can see that everything is fine in the ‘Pubdate_Cu’ (Custom Pub Date) field: the value is correctly set to ‘2’.
2018-11-05 22:13:13 INFO : Period Slicer Started
2018-11-05 22:13:13 INFO :
Period slices definition:
Enter a custom time partition: ‘[1991:2000];[2001:2007];[2008:2016]’
2018-11-05 22:13:15 INFO : Creating time slices according to this time division: [‘1991_2000’, ‘2001_2007’, ‘2008_2016’]
2018-11-05 22:13:17 INFO : New Time table ISIpubdate_custom created
2018-11-05 22:13:17 INFO : Period Slicer correctly finished
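The partition string in the log can also be checked independently of the platform. The following is a minimal Python sketch, not the Period Slicer's own code: the function names are hypothetical, and it assumes that both bounds of each range are inclusive.

```python
import re


def parse_partition(spec):
    """Turn '[1991:2000];[2001:2007]' into [(1991, 2000), (2001, 2007)]."""
    return [(int(a), int(b)) for a, b in re.findall(r"\[(\d{4}):(\d{4})\]", spec)]


def period_label(year, periods):
    """Return the 'start_end' label of the period containing `year`, or None."""
    for start, end in periods:
        if start <= year <= end:  # assumption: bounds are inclusive
            return f"{start}_{end}"
    return None


periods = parse_partition("[1991:2000];[2001:2007];[2008:2016]")
labels = [f"{start}_{end}" for start, end in periods]

print(labels)
print(period_label(2005, periods))
print(period_label(1990, periods))
```

Running the publication years of the corpus through `period_label` and counting the results per label would show whether all three periods actually contain documents.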
Regarding the SECOND question: Do you mean that producing three different maps from a different sub-corpus yields a different outcome than generating three temporal maps?
Answer: I am sorry, I don’t think I explained myself well. I have two networks; both analyse the same period from the same database. One is made with ‘Network mapping’, using the periods detected with ‘Time Period’ and constructed with ‘Period slicer’. The other is made from a separate database for the same period, obtained with ‘query corpus’. Both networks should represent the same thing (or so I thought), since they use the same term_list, the same number of nodes and the ‘louvain’ algorithm to build the clusters.
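One way to quantify how much the two maps differ is to compare the term sets of their clusters. The following is only a sketch in Python: the cluster names and terms are invented, and it assumes the term list of each cluster has been collected by some means (for example, transcribed from the two maps).

```python
def jaccard(a, b):
    """Share of terms two clusters have in common (1.0 = identical sets)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_matches(clusters_a, clusters_b):
    """For each cluster in A, find the most similar cluster in B."""
    matches = {}
    for name_a, terms_a in clusters_a.items():
        name_b = max(clusters_b, key=lambda n: jaccard(terms_a, clusters_b[n]))
        matches[name_a] = (name_b, jaccard(terms_a, clusters_b[name_b]))
    return matches


# Invented clusters standing in for the two maps of the same period.
historical = {"A": {"solar", "wind", "turbine"}, "B": {"grid", "storage"}}
queried = {"X": {"solar", "wind"}, "Y": {"grid", "storage", "turbine"}}

matches = best_matches(historical, queried)
for name, (other, score) in matches.items():
    print(name, "->", other, round(score, 2))
```

Scores close to 1 would mean the two maps group the terms almost identically and only the layout differs; low scores would point to a real difference in how the clusters were built.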