Hi, I am working with a fairly large database (around 265k entries) and trying to run a Structural Analysis. I found out that the script stops computing when the number of entities exceeds 30,000, because it becomes computationally too heavy. I have a total of 319,189 authors, which is the field I am using as input.
I only want to understand the newcomer dynamics (ratio and total number per year), so I do not need the full analysis. Anyway, I have some questions that you might be able to help me with, since I couldn’t find further information on the documentation page https://docs.cortext.net/structural-analysis/
- How are the TOP ENTITIES calculated?
I activated ‘top nodes filtering for the whole period’ and only get to analyse about 10% of my population, so I need to know how the data is being selected if I want to interpret the results correctly. I assume the script takes the most significant authors, but I don’t know how it determines which ones those are.
- Is it possible to somehow set these TOP ENTITIES aside in a separate database or variable?
This would be useful to understand the composition of this top-entities population. I know this might sound a bit ‘too much’, but I had to ask.
- Which alternative method do you suggest for doing an author count for each year of the period?
If I manage to do this, it would be useful to better describe the field and the bias introduced by the filtering.
- Is there any other way to show the newcomer evolution in such a large database?
I have been thinking about it a lot, but I can’t seem to find a way around this.
I know my questions might be difficult to answer, so thanks in advance for your kind effort.
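To make the last two questions concrete, here is a minimal sketch of the per-year count I have in mind, computed outside the platform. It assumes the corpus can be exported as (year, list of authors) pairs and that a "newcomer" is an author whose first appearance in the corpus falls in that year. The function name `newcomer_stats` and the toy records are hypothetical, and this is not a description of how the Structural Analysis script works internally.

```python
from collections import defaultdict


def newcomer_stats(records):
    """Count active authors and newcomers per year.

    records: iterable of (year, [author, ...]) pairs, one per database entry
             (an assumed export format, not a CorText one).
    Returns {year: (n_authors, n_newcomers, newcomer_ratio)}.
    """
    # Collect the set of distinct authors active in each year.
    authors_by_year = defaultdict(set)
    for year, authors in records:
        authors_by_year[year].update(authors)

    seen = set()  # every author observed in any earlier year
    stats = {}
    for year in sorted(authors_by_year):
        active = authors_by_year[year]
        newcomers = active - seen  # authors never seen before this year
        ratio = len(newcomers) / len(active) if active else 0.0
        stats[year] = (len(active), len(newcomers), ratio)
        seen |= active
    return stats


# Toy data, for illustration only.
records = [
    (2001, ["A", "B"]),
    (2001, ["B", "C"]),
    (2002, ["A", "D"]),
    (2003, ["D", "E", "F"]),
]
stats = newcomer_stats(records)
for year, (n_authors, n_new, ratio) in stats.items():
    print(year, n_authors, n_new, round(ratio, 2))
```

Since it only keeps sets of author names in memory, something like this should cope with a few hundred thousand authors. Note that every author in the first year counts as a newcomer, so that year's ratio is not informative.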
Thanks a lot!
I found a way to check whether the dynamics I was observing for these 30,000 TOP ENTITIES also hold for a broader sample. I ran the PERIOD DETECTOR script on the ‘author’ field alone with the TOP 200,000 authors, and found that the dynamics I had observed do reflect the dynamics of the field. This partially answers my question, so I thought it could be useful to post it here.
Anyway, I would still like to know how the TOP ENTITIES are calculated. Does anybody know where I can find information on this?