What’s the difference between the two distributional proximity measures: distributional vs. distributional (l-l r)? I only see the first defined in the documentation.
Thanks for your question. We will update the documentation!
The Distributional l-l r proximity measure is similar to the classic Distributional measure, with one distinction:
- The classic Distributional: the co-occurrence matrix is based on the Mutual Information between nodes.
- The Distributional l-l r: the co-occurrence matrix is based on the Log-Likelihood Ratio between nodes.
The two share the same characteristics: nodes (e.g. keywords) are close when they play similar roles or functions with regard to the contexts in which they appear (e.g. sentences or paragraphs). However, the distributional log-likelihood ratio tends to better preserve indirect relations that involve low-frequency nodes.
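To make the distinction concrete, here is a minimal sketch of the two textbook association scores, computed from a 2x2 contingency table of context counts. This is not the software's own code. It assumes the standard pointwise mutual information and Dunning's log-likelihood ratio (G²). The function names and the table layout (k11, k12, k21, k22) are hypothetical, chosen for illustration. The exact formula used in the software, and how the resulting matrix is turned into a proximity, are not stated in this thread.

```python
import math

def pmi(k11, k12, k21, k22):
    """Pointwise mutual information (in bits) between nodes A and B.

    Assumed table layout (illustrative, not taken from the software):
      k11 = contexts containing both A and B
      k12 = contexts containing A but not B
      k21 = contexts containing B but not A
      k22 = contexts containing neither
    """
    n = k11 + k12 + k21 + k22
    p_ab = k11 / n
    p_a = (k11 + k12) / n
    p_b = (k11 + k21) / n
    return math.log2(p_ab / (p_a * p_b))

def _xlogx(k):
    # Convention: 0 * log(0) = 0
    return k * math.log(k) if k > 0 else 0.0

def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for the same 2x2 table."""
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    return 2.0 * (cells + _xlogx(n) - rows - cols)

# A pair seen together once vs. a pair seen together 100 times,
# each out of 10,000 contexts and never seen apart.
rare = (1, 0, 0, 9999)
frequent = (100, 0, 0, 9900)

print("PMI rare / frequent:", pmi(*rare), pmi(*frequent))
print("LLR rare / frequent:", llr(*rare), llr(*frequent))
```

In this sketch, mutual information depends only on probability ratios, so the pair observed once scores higher than the pair observed 100 times. The log-likelihood ratio also weighs how much evidence supports the association, so it ranks the two pairs the other way around.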
I hope this helps.
I assume that the equation differs somewhat. Is it possible for you to post it (or send it to firstname.lastname@example.org)? I plan to submit an article based on the analysis and would like to include the equation.