Hello!
I am working on the CLASSLA-web.bs (Bosnian Web) corpus and I am trying to recreate the values for MI score, log likelihood and T-score using the formulas provided in the Statistics section, but my results are nowhere near the values shown in the collocation statistics. Are there any tips that could help me?
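For reference, the calculation can be sketched roughly as follows. This is only a minimal sketch that assumes the commonly cited definitions of the three measures (MI as a base-2 log ratio of observed to expected co-occurrences, T-score normalised by the square root of the observed count, and log likelihood as Dunning's G2 over a 2x2 contingency table); the formulas the interface actually applies may differ. The function names are hypothetical and the counts in the usage lines are invented, not taken from the corpus.

```python
import math

# Assumed inputs (hypothetical names):
#   f_ab: co-occurrence count of node and collocate
#   f_a:  frequency of the node, f_b: frequency of the collocate
#   n:    corpus size

def mi_score(f_ab, f_a, f_b, n):
    # log2 of observed co-occurrences over those expected under independence
    return math.log2(f_ab * n / (f_a * f_b))

def t_score(f_ab, f_a, f_b, n):
    # (observed - expected) divided by the square root of observed
    return (f_ab - f_a * f_b / n) / math.sqrt(f_ab)

def log_likelihood(f_ab, f_a, f_b, n):
    # Dunning's G2: 2 * sum(O * ln(O / E)) over the four contingency cells
    observed = [f_ab, f_a - f_ab, f_b - f_ab, n - f_a - f_b + f_ab]
    expected = [
        f_a * f_b / n,
        f_a * (n - f_b) / n,
        (n - f_a) * f_b / n,
        (n - f_a) * (n - f_b) / n,
    ]
    # Cells with an observed count of 0 contribute nothing to the sum
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Invented counts, for illustration only
print(mi_score(10, 100, 200, 10000))
print(t_score(10, 100, 200, 10000))
print(log_likelihood(10, 100, 200, 10000))
```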
Also, I was wondering why I sometimes see different values for the statistics of the same terms. I searched for "osloboditi" with the collocate "se" in the Bosnian Web corpus, and even though the co-occurrence and occurrence counts were identical on two separate occasions, the T-score shifted from 107 to 109. Is this expected behaviour, or is something wrong with my browser (or anything else)? Thank you for your time!
Best,
Matea Tolic