demography script – interpretation

Chiara asked 1 year ago

Hello all, 
I have run the demography script to explore the temporal distribution of the topics that I extracted with the ‘Topic modeling’ script. I now have some doubts about how to interpret the temporal evolution. How is the occurrence of topics (as groups of words) calculated over time? In other words, what does the y axis indicate when the variables used are topics rather than single terms?
Thank you

1 Answer
Lionel Staff answered 12 months ago

Dear Chiara,
The y axis shows the number of documents associated with each topic in each period. Keep in mind, however, that this is a raw count of documents.
So there are two issues with using topic names (from the topic modelling script) in the demography script:

  • Each document may be associated with more than one topic, so the same document can be counted under several topics;
  • Within a document, the topics are not all present with the same intensity. Some topics may be strongly present in your documents while others are marginal. So the demography script does not show the real evolution of a topic's importance in the content of the documents.
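
To make the difference concrete, here is a minimal sketch in Python. It is not the demography script's actual code: the documents, the topic names, and the per-document topic weights are all invented for illustration, and the functions `raw_counts` and `weighted_sums` are hypothetical helpers. It contrasts a raw count of documents per topic and period with a sum of topic weights, which is one possible way to account for intensity.

```python
from collections import defaultdict

# Hypothetical data: each document has a period and a weight per topic.
# Topic names and weights are invented; they are not real script output.
docs = [
    {"period": 2019, "topics": {"topic_A": 0.9, "topic_B": 0.1}},
    {"period": 2019, "topics": {"topic_A": 0.2, "topic_B": 0.8}},
    {"period": 2020, "topics": {"topic_A": 0.1, "topic_B": 0.9}},
    {"period": 2020, "topics": {"topic_B": 1.0}},
]


def raw_counts(docs):
    """Count documents per (period, topic), ignoring how strong the topic is."""
    counts = defaultdict(int)
    for doc in docs:
        for topic in doc["topics"]:
            counts[(doc["period"], topic)] += 1
    return dict(counts)


def weighted_sums(docs):
    """Sum topic weights per (period, topic), so marginal topics count less."""
    sums = defaultdict(float)
    for doc in docs:
        for topic, weight in doc["topics"].items():
            sums[(doc["period"], topic)] += weight
    return dict(sums)


if __name__ == "__main__":
    counts = raw_counts(docs)
    sums = weighted_sums(docs)
    for key in sorted(counts):
        print(key, "documents:", counts[key], "summed weight:", round(sums[key], 2))
```

In this made-up example, topic_A is counted in two documents in 2019 and one in 2020, while its summed weight falls from about 1.1 to about 0.1. The raw count therefore understates how much the topic has receded, which is the limitation described in the two points above.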

I hope this helps.