Error occurs when the Context Size is higher than 30 tokens


Illia Ilyin

Dec 27, 2021, 9:24:29 AM
to AntConc-Discussion
Dear Laurence Anthony! 

Thank you for the new version of AntConc! I can't begin to express my gratitude; your work is very much appreciated!

I noticed that in the new version the Context Size can be set as high as 99 tokens, but when I tried to set it that high (even at just 31 tokens), an error occurred (a window with "No hits found", etc.). I'm using Windows 8.1, and I had no similar problem with earlier versions (when I set a 20-word span).

How can I deal with this issue?

Also, as far as I can tell, the Collocate word span has been reduced to 10, instead of 20 as in the earlier version. Is there a way to get the 20-word span back?

Please help me.

Warm regards, Illia (from Ukraine; sorry for my English).

Laurence Anthony

Dec 27, 2021, 10:40:49 AM
to ant...@googlegroups.com
Hi Illia,

Thank you for the comments.

When you say "Context Size", are you talking about the KWIC tool, the Collocate tool, the advanced search, or something more general?

As for the collocate span, I didn't realize that the maximum had been fixed at 10. This is a kind of bug. That said, research has shown that collocate spans have little meaning beyond 4 or 5 tokens. So, anything more than 10 would be almost meaningless. Can I ask why you are trying to find collocates at such large distances?

Laurence.



###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################


Illia Ilyin

Dec 27, 2021, 10:51:12 AM
to AntConc-Discussion
Hi Anthony!

Sorry. I meant the KWIC tool when I wrote about the error. 

I am trying to find collocates at such large distances because I'm working on a corpus of Marx and Engels's works in order to establish terminological links.

On Monday, December 27, 2021 at 17:40:49 UTC+2, Laurence Anthony wrote:

Laurence Anthony

Dec 27, 2021, 11:07:35 AM
to ant...@googlegroups.com
Hi Illia,

In AntConc 4, going over a context size of 50 tokens in the KWIC tool is likely to cause problems because of the design of the database. But I also see the error you describe appearing at 32 tokens (for a one-word search). I'll need to check the database to see what is causing the error. That said, a 32-token span is almost meaningless because the display is simply not that wide and you could never see that level of context. This is what the File view tool is for. How are you seeing the words at such a distance? Note that in previous versions (e.g. 3.5.9) the context size for the KWIC tool was based on the number of *characters*. So, even a 30 *token* span is roughly 3 times that limit.
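
To illustrate the difference between a character-based and a token-based context window described above, here is a minimal Python sketch. It is not AntConc's implementation: the function names `kwic_tokens` and `kwic_chars`, the simple `\w+` tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re

def kwic_tokens(text, query, context_size):
    """Return (left, hit, right) tuples with context measured in tokens."""
    # Assumption: a token is any run of word characters.
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == query.lower():
            left = tokens[max(0, i - context_size):i]
            right = tokens[i + 1:i + 1 + context_size]
            lines.append((" ".join(left), tok, " ".join(right)))
    return lines

def kwic_chars(text, query, window):
    """Return (left, hit, right) tuples with context measured in characters."""
    pattern = r"\b%s\b" % re.escape(query)
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - window):m.start()]
        right = text[m.end():m.end() + window]
        lines.append((left, m.group(0), right))
    return lines

sample = ("The value of labour power is determined by the "
          "labour time necessary for production")

# Two tokens of context on each side of every hit.
for line in kwic_tokens(sample, "labour", 2):
    print(line)

# Five characters of context on each side of every hit.
for line in kwic_chars(sample, "labour", 5):
    print(line)
```

With the token-based window, the amount of visible text depends on word length, which is why a fixed number of tokens and a fixed number of characters are not directly comparable.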

As for the collocate span, I'm not sure the collocate approach is what you should be using. The collocate tool calculates a statistical measure of co-occurrence. At 10+ words, I would expect this calculation to produce many, many spurious results.
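
As a rough illustration of why wide spans pull in more candidate collocates, the following Python sketch counts the words that fall within a given span of a node word. It only counts raw co-occurrences; it does not reproduce whatever statistical measure AntConc uses, and the function name `collocate_counts`, the tokenizer, and the sample text are assumptions for illustration.

```python
import re
from collections import Counter

def collocate_counts(text, node, span):
    """Count words occurring within `span` tokens left or right of `node`."""
    # Assumption: lowercase word-character runs are the tokens.
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

sample = ("capital employs labour and labour creates value "
          "while capital accumulates")

narrow = collocate_counts(sample, "labour", 1)
wide = collocate_counts(sample, "labour", 4)

print(len(narrow), "distinct candidates at span 1")
print(len(wide), "distinct candidates at span 4")
```

Even in this tiny sample the wider span sweeps in most of the words in the text, so any co-occurrence statistic computed over it has many more candidates to score.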

Let me know what your ideal limits would be and I'll see if they are possible.

Laurence.

Illia Ilyin

Dec 27, 2021, 11:33:35 AM
to AntConc-Discussion
Hi Anthony! 

In AntConc 3.5.9, I used 450 characters for Search Window Size (Concordance), which I thought was optimal for understanding the meaningful context of theoretical texts. Now, when I use 30 tokens, I see a context that is about half that size.

I usually save the results as txt, convert them to docx, and then read them. That's how I manage to see the context in full.

I found some interesting results when I used a 20-word span in the Collocate section. So if the old limit returns, I'll be very pleased.
On Monday, December 27, 2021 at 18:07:35 UTC+2, Laurence Anthony wrote:

Laurence Anthony

Dec 27, 2021, 11:57:43 AM
to ant...@googlegroups.com
Hi Illia,

I feel like we've had this discussion sometime in the past. Sorry if I forgot!

>In AntConc 3.5.9, I used 450 characters for Search Window Size (Concordance), which I thought was optimal for understanding the meaningful context of theoretical texts. Now, when I use 30 tokens, I see a context that is about half that size.

Am I understanding correctly that you are searching for a word or phrase within a huge span, exporting the results so that you can see the span properly, and then scanning the concordance lines to find some kind of pattern at a huge distance from the search word/phrase? If so, can I ask why you don't just search for the word/phrase in the raw text? What advantage does the concordance line give you? The normal file view tool seems to be a much better tool for this job.

I look forward to hearing your response.

I've also now identified what the problem was. There is an effective 64-column limit in the underlying database that I cannot change. So, if you try to generate spans beyond this limit, you'll get an error. The column limit is very large, so for almost all normal uses I cannot see a problem. We just need to figure out a better way for you to do your analysis. It does seem that you are using the wrong tool for the job.
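
The arithmetic behind a column limit like this can be sketched as follows. This is only a back-of-the-envelope model under a loud assumption: that each context token on the left and right takes one column, plus some fixed number of other columns (`fixed_columns`, a hypothetical parameter covering the hit itself and any bookkeeping). The thread does not state the actual table layout, so the exact cutoff may differ.

```python
def columns_needed(context_size, fixed_columns=1):
    """Columns used under the assumed layout: left + right context + fixed."""
    return 2 * context_size + fixed_columns

def max_context_size(column_limit=64, fixed_columns=1):
    """Largest context size that still fits within the column limit."""
    return (column_limit - fixed_columns) // 2

# Hypothetical layouts: only the hit column vs. a few extra bookkeeping columns.
for fixed in (1, 4):
    print("fixed columns:", fixed,
          "-> largest context size that fits:", max_context_size(64, fixed))
```

Under this model the usable context size lands at roughly 30 tokens per side, which is in line with the errors reported at 31 and 32 tokens and well short of the 99 tokens the setting allows.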

Regards,

Laurence.


Illia Ilyin

Dec 30, 2021, 7:23:46 PM
to AntConc-Discussion
Hi Anthony! 

Sorry for bothering you with these not-so-useful questions. I'm a newbie in corpus research, you see. Thank you for your patience and support!

No, we have not discussed this issue before. Concordances give me a context of a precise size, so I don't have to copy all the information manually. Maybe I should use this tool in the normal way, selecting an appropriate size for the concordance lines and the collocate context. I would be grateful if you could give me links to articles or books on the correct use of concordance and collocate contexts. Thank you in advance!

Thanks for the clarification about the KWIC tool! Perhaps it would then be appropriate to limit the Context Size to 30 tokens, rather than offering the current option of up to 99 tokens, which cannot work?

Thank you for your help and sorry for my ignorance!

Warm regards, 
Illia

On Monday, December 27, 2021 at 18:57:43 UTC+2, Laurence Anthony wrote: