Demography processes each field of the corpus and counts the raw occurrences of its top items over time. You will simply be asked to specify the number of top items you wish to evaluate and whether to use custom or regular time periods.

The script creates two directories called “global distributions” and “temporal evolution”.


Synthetic Biology scientific publications – top 20 countries since 2001

  1. The first directory, “global distribution”, lists for each field the distribution of items per document and the distribution of documents per item. These files help to understand, for instance, the distribution of the number of authors per article or of the number of papers written per author in a scientific database (by selecting the Authors field). Note that these distributions are computed over all entries in the database, regardless of the number of top items selected.
  2. In the “temporal evolution” directory, each field of the corpus is tracked over time in a csv file compiling the occurrences, at each time step, of the top items of that field (counts averaged over windows of 3 or 5 time steps are also available if the raw statistics are too noisy). A dedicated web interface (see illustration) is also provided: open the html files to visualize and customize the chart of each chosen field.
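
To make the two outputs more concrete, here is a minimal Python sketch of the kind of computation described above. It is not the script's actual code: the toy corpus, the function names (global_distributions, temporal_evolution, moving_average) and the choice of a centered window that shrinks at the edges are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: one (year, items of the chosen field) pair per document.
documents = [
    (2001, ["France", "USA"]),
    (2001, ["USA"]),
    (2002, ["USA", "UK"]),
    (2002, ["France"]),
    (2003, ["USA", "France", "UK"]),
    (2003, ["USA"]),
]


def global_distributions(docs):
    """Histograms computed over all entries, independent of any top-N choice."""
    # How many documents contain 1, 2, 3... distinct items.
    items_per_doc = Counter(len(set(items)) for _, items in docs)
    # In how many documents each item appears.
    doc_freq = Counter(item for _, items in docs for item in set(items))
    # How many items appear in 1, 2, 3... documents.
    docs_per_item = Counter(doc_freq.values())
    return items_per_doc, docs_per_item


def temporal_evolution(docs, top_n):
    """Raw occurrences per time step, restricted to the top_n most frequent items."""
    totals = Counter(item for _, items in docs for item in items)
    top_items = [item for item, _ in totals.most_common(top_n)]
    years = sorted({year for year, _ in docs})
    position = {year: i for i, year in enumerate(years)}
    counts = {item: [0] * len(years) for item in top_items}
    for year, items in docs:
        for item in items:
            if item in counts:
                counts[item][position[year]] += 1
    return years, counts


def moving_average(series, window):
    """Centered moving average; the window shrinks at the edges (an assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


items_per_doc, docs_per_item = global_distributions(documents)
years, counts = temporal_evolution(documents, top_n=2)
smoothed_usa = moving_average(counts["USA"], window=3)
```

With this toy corpus, UK is excluded from the temporal counts because only the top 2 items are kept, while it still contributes to the global distributions. Smoothing flattens year-to-year fluctuations at the cost of some temporal resolution, which is why the raw counts remain the default view.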

