Restricting KWIC context to a <doc> element

Normunds Grūzītis

May 30, 2023, 11:56:30 AM

to NoSketch Engine

Hello everyone,

In our group, we use NoSke extensively for text corpora (https://korpuss.lv/en/), and we are now testing it for speech corpora.

In speech corpora, "documents" can be very short, often just isolated phrases or segments; consider, for instance, the Common Voice corpora.

Is it possible to somehow restrict the context in the concordance view to a single document?

I have attached a screenshot illustrating that the "previous" and "next" documents are included in the context window by default. Our users find this very confusing.

Best regards,

Normunds

University of Latvia

ailab.lv | korpuss.lv | tezaurs.lv

doc.png

Miloš Jakubíček

May 30, 2023, 1:10:24 PM

to Normunds Grūzītis, NoSketch Engine

Hi Normunds,

in the concordance view, you can switch from KWIC view to sentence view (see https://www.sketchengine.eu/my_keywords/kwic/) if you have sentences marked with the <s> structure in the corpus.

So, either turn each <doc> into an <s> and recompile the corpus, or it looks like you could also tweak this in run.cgi by setting senleftctx = '-1:doc' and senrightctx = '1:doc' in the properties of the BonitoCGI class (these default to '-1:s' and '1:s' in conccgi.py). I haven't tried this myself, though, so it is just a quick hunch based on looking into the code for two minutes ;-)
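
As a rough illustration of the second option, the override might look something like the sketch below. The ConcCGI class here is only a stand-in stub so that the snippet runs on its own; the real class lives in conccgi.py and has many more attributes, the real BonitoCGI in run.cgi may have further base classes and properties, and the whole thing is untested against an actual installation.

```python
# Hypothetical stub standing in for ConcCGI from conccgi.py, reduced to the
# two sentence-context defaults mentioned above.
class ConcCGI:
    senleftctx = '-1:s'   # default: left context ends at the sentence boundary
    senrightctx = '1:s'   # default: right context ends at the sentence boundary


# In run.cgi, the two properties would be overridden on BonitoCGI so that
# the context is bounded by <doc> instead of <s>.
class BonitoCGI(ConcCGI):
    senleftctx = '-1:doc'
    senrightctx = '1:doc'
```

In a real run.cgi only the two assignments would be added to the existing BonitoCGI class; nothing else needs to be redefined.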

Best

Milos

Milos Jakubicek

CEO, Lexical Computing

Brno, CZ | Brighton, UK

http://www.lexicalcomputing.com

http://www.sketchengine.eu

Normunds Grūzītis

May 30, 2023, 2:54:38 PM

to Miloš Jakubíček, NoSketch Engine

Thanks, Miloš, it works, although the KWIC alignment is lost.

We will try the run.cgi setting, which seems an even better solution.

Best,

Normunds
