I am making a network map of rhetoric in political speeches. My corpus consists of ~20 .txt files containing the full text of speeches given by presidential candidates, with each file named for the candidate who gave the speech. I construct a network showing clusters of co-occurring words (with co-occurrence computed at the document level), and I am using “file_name” to essentially tag each cluster with the candidate whose speech is most similar to the words contained in that cluster. It seems that I have done this successfully, as I get clusters of words tagged with a file name (in this case a candidate name) that makes sense.
My question is: how can I find a measurement of the strength of the similarity between a “discursive category” identified by the community detection algorithm and a tag? This would be really useful in my project, especially when multiple candidates are tagged to a single cluster/community: I know that some candidates must have a “stronger” relationship to the cluster than others! I assume this is calculated somewhere in the tagging process, but I haven’t been able to find it in the CorText outputs. Any help would be wonderful!
Thanks so much for this amazing tool.
When you build a network showing clusters of co-occurring words with the Network Mapping script, a table/variable is added to your dataset, named following this pattern: “PC_and_names_of_the_two_variables_used”.
To achieve what you want, consider the Profiling script and/or the Contingency Matrix script, selecting the semantic clusters (the PC_ … table/variable) and the tag variable as the fields to compare.
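If you would rather check the numbers yourself outside CorText, the idea behind a contingency table can be sketched in a few lines of Python. This is only an illustration, not the formula CorText uses internally: it assumes you have exported a list of (cluster, tag) pairs, and the function name `association_strength`, the cluster labels and the candidate names below are all made up for the example. The score is a simple observed/expected ratio, where values above 1 mean the tag appears in the cluster more often than chance would suggest.

```python
from collections import Counter


def association_strength(pairs):
    """Return observed/expected ratios for each (cluster, tag) pair.

    `pairs` is an iterable of (cluster, tag) tuples, one per tagged unit.
    This is a plain contingency-table ratio, not CorText's own metric.
    """
    pairs = list(pairs)
    total = len(pairs)
    pair_counts = Counter(pairs)
    cluster_counts = Counter(cluster for cluster, _ in pairs)
    tag_counts = Counter(tag for _, tag in pairs)

    scores = {}
    for (cluster, tag), observed in pair_counts.items():
        # Count expected if cluster and tag were statistically independent.
        expected = cluster_counts[cluster] * tag_counts[tag] / total
        scores[(cluster, tag)] = observed / expected
    return scores


# Invented toy data: each tuple is one unit assigned to a cluster and a tag.
pairs = [
    ("economy", "candidate_a"),
    ("economy", "candidate_a"),
    ("economy", "candidate_b"),
    ("security", "candidate_b"),
]
scores = association_strength(pairs)
for (cluster, tag), score in sorted(scores.items()):
    print(f"{cluster:10s} {tag:12s} {score:.2f}")
```

With this toy data, candidate_a gets a higher ratio than candidate_b for the "economy" cluster, which is the kind of ranking you are after when several candidates are tagged to the same cluster.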
I hope it helps!