Dear Vladimir,
I want to find out what the most common collocations in the corpus are.
A frequency list tells you what the most prevalent words are in the
corpus, but not what they collocate with frequently.
Perhaps I am not asking a sensible question, but I would have
thought that there would be a way to see the most common collocations
in the corpus as a whole.
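
To make the request concrete, here is a minimal sketch in plain Python of the kind of result I have in mind. It does not use Manatee at all: the function name `top_collocations`, the `window` parameter and the sample sentence are all made up for illustration, and it only counts raw co-occurrence of a node with the words that follow it.

```python
from collections import Counter


def top_collocations(tokens, window=1, n=10):
    """Return the n most frequent (node, collocate) pairs.

    A pair is counted whenever `collocate` occurs within `window`
    tokens to the right of `node`. Every token acts as a node.
    """
    counts = Counter()
    for i, node in enumerate(tokens):
        # Look at the next `window` tokens after the node.
        for collocate in tokens[i + 1 : i + 1 + window]:
            counts[(node, collocate)] += 1
    return counts.most_common(n)


# Hypothetical toy "corpus", already tokenised.
tokens = "the court held that the court may order the sale".split()
print(top_collocations(tokens, window=1, n=3))
```

Raw counts like these will tend to be dominated by pairs of very frequent function words, so in practice some association score would probably be needed on top of the counts.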
Thanks for your reply.
Best wishes
Evan
On 21 June 2017 at 19:47, Vladimír Benko <vla...@juls.savba.sk> wrote:
> Dear Evan,
>
> How large are your corpora? And, I am a bit surprised by your query -- what
> do you expect to find out? How should the result differ from, say, a plain
> frequency list?
>
> I am using NoSketchEngine on a project involving legal texts. This is
> working fine so far but I have a couple of queries about additional
> functionality that I want to implement.
>
> I am using the latest version of Manatee through the Python API.
>
> 1. I am able to retrieve collocations and concordances based on a
> query for a specific node word and to order the results by frequency.
> In addition, I would like to be able to retrieve the most common
> collocations irrespective of the node - i.e. the collocations which
> occur most frequently in a given corpus for any node. I have tried
> querying with:
>
> rangestream = corpus.eval_query("[word=\"*\"]")
>
>
> I do not have any experience with the Python API. The regular expression for
> "everything", however, should be:
>
> ".*"
>
> Or, if you do not want to match "non-words":
>
> "[[:alpha:]]*"
>
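
The difference between a bare "*" and ".*" can be checked with a small sketch using Python's own `re` module. This is only a stand-in: Manatee's regular expression engine is a separate implementation and may behave differently. Python's `re` has no POSIX classes such as `[[:alpha:]]`, so the sketch assumes `[^\W\d_]` as a rough equivalent for "any letter". The helper name `is_valid` and the word list are made up for illustration.

```python
import re


def is_valid(pattern):
    """Return True if `pattern` compiles as a Python regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Hypothetical attribute values: two words, a number and a symbol.
words = ["court", "2017", "§", "sale"]

# A bare "*" is a quantifier with nothing in front of it to repeat.
star_ok = is_valid("*")
# ".*" means "any character, zero or more times".
dot_star_ok = is_valid(".*")

# Full-string matches, as a corpus attribute value would be tested.
match_all = [w for w in words if re.fullmatch(r".*", w)]
# Letters only: word characters minus digits and underscore.
alpha_only = [w for w in words if re.fullmatch(r"[^\W\d_]*", w)]

print(star_ok, dot_star_ok)
print(match_all)
print(alpha_only)
```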
> Best,
>
> Vlado B, 20:45
>
>
> --
> Vladimír Benko
>
> Slovak Academy of Sciences
> Ľ. Štúr Institute of Linguistics
> Panská 26, SK-81101 Bratislava
>
> Tel +421-2-54431762 Fax -54431756
>
> http://aranea.juls.savba.sk/guest/
>
> https://www.facebook.com/araneawebcorpora/