How to select specific text within a corpus.

matias.milia asked 10 months ago

Hi guys, I am currently working on a rather complicated corpus with a very specific set of questions in mind. The problem is that terms extraction is not much help in separating the useful from the superfluous. I sense that there is a lot of noise in the data, at least with regard to my interests. It is a big set of texts where many issues are discussed, and I just want to focus on one. I have therefore made a list of the words I consider descriptive of my interests. What I am having trouble with is using this self-made dictionary to filter the text content: I want to keep only the text entries where these words are used. I have been stumbling over this for a couple of hours with no luck. Maybe I am just too tired, but I was wondering if someone could give me basic instructions on this. It would be really, really appreciated.
I have managed to index a list of terms, so all these terms are now merged under one label. That is as far as I have got.
Thanks in advance!
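
For anyone who wants to check the idea outside CorText, here is a minimal sketch in plain Python of the filtering described above: keep only the text entries that mention at least one term from a hand-made list. It is not a CorText script; the names `build_pattern` and `filter_entries` and the sample data are assumptions made up for illustration.

```python
import re

def build_pattern(terms):
    # Hypothetical helper: one regex that matches any term as a whole word.
    # Longer terms go first so multi-word terms win over shorter ones.
    escaped = [re.escape(t) for t in sorted(terms, key=len, reverse=True)]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

def filter_entries(entries, terms):
    # Keep only the entries that contain at least one dictionary term.
    if not terms:
        return []
    pattern = build_pattern(terms)
    return [entry for entry in entries if pattern.search(entry)]

# Made-up sample data, only to show the call.
terms = ["climate change", "emissions"]
entries = [
    "New report on emissions targets.",
    "Local football results.",
    "Debate about Climate Change policy.",
]
kept = filter_entries(entries, terms)
```

With the sample data, `kept` holds the first and third entries. Matching here is case-insensitive but still sensitive to accents.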

matias.milia replied 10 months ago

I’ve tried using a ‘pivot word’, but I must not be doing it properly, because it crashes. The documentation doesn’t say much about how to format a ‘pivot word’ or whether it is possible to use more than one.

Jean-Philippe Cointet Staff replied 10 months ago

Could you not simply use the query script against this “one label” you created to identify your target topic?

matias.milia replied 10 months ago

I think I managed to find and solve the problem. The corpus is in Spanish, and the accents got in the way and made text recognition difficult.
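
In case it helps someone with the same issue, this is a rough sketch in plain Python of how accents can be neutralised before matching terms, so that "educación" and "educacion" count as the same word. It only uses the standard `unicodedata` module and is not how CorText handles it internally; `strip_accents`, `fold` and `mentions_any` are hypothetical names chosen for the example.

```python
import unicodedata

def strip_accents(text):
    # NFD splits "ó" into "o" plus a combining accent; drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def fold(text):
    # Accent-free, case-insensitive form used on both sides of the comparison.
    return strip_accents(text).casefold()

def mentions_any(entry, terms):
    # True if any term occurs in the entry once both are folded.
    # This is plain substring matching, so it does not respect word boundaries.
    folded = fold(entry)
    return any(fold(term) in folded for term in terms)
```

One caveat for Spanish: this also turns "ñ" into "n", which may merge words that should stay distinct (for example "año" and "ano"), so it is worth checking the term list after folding.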

matias.milia replied 10 months ago

Thanks for your quick response Jean-Philippe!
