
Display/computational limit -- a question


Pahor de Maiti, Kristina

Nov 6, 2024, 11:10:27 AM
to NoSketch Engine
Dear team,

I want to search and analyse a subset that returns more than 10 million hits. Because of this, I am unable to compute the general text-type statistics for the subset or to save it as a separate subset in the logged-in version of NoSkE. Is there a way around this limit? Specifically, I would like to compare the speeches of one set of parties with those of another set of parties over roughly the last five years, but this returns too many hits to be displayed or used for analyses. Any ideas would be greatly appreciated.

The specific query is the following: https://www.clarin.si/ske/#concordance?corpname=parlamint41_si&tab=advanced&queryselector=cql&attrs=word&viewmode=kwic&attr_allpos=all&refs_up=0&shorten_refs=1&glue=1&gdexcnt=300&show_gdex_scores=0&itemsPerPage=20&structs=s%2Cg&refs=%3Dspeech.speaker_id%2C%3Dspeech.date&default_attr=lemma&cql=%5B%5D%20within%20%3Cspeech%20date%3D%222017.*%7C2018.*%7C2019.*%7C2020.*%7C2021.*%7C2022.*%22%20%26%20speaker_party%20!%3D%20%22SDS%7CSNS%7CNSi%22%2F%3E&showresults=1&showTBL=0&tbl_template=&gdexconf=&f_tab=basic&f_showrelfrq=1&f_showperc=0&f_showreldens=0&f_showreltt=0&c_customrange=0&t_attr=&t_absfrq=0&t_trimempty=1&t_threshold=5&operations=%5B%7B%22name%22%3A%22cql%22%2C%22arg%22%3A%22%5B%5D%20within%20%3Cspeech%20date%3D%5C%222017.*%7C2018.*%7C2019.*%7C2020.*%7C2021.*%7C2022.*%5C%22%20%26%20speaker_party%20!%3D%20%5C%22SDS%7CSNS%7CNSi%5C%22%2F%3E%22%2C%22query%22%3A%7B%22queryselector%22%3A%22cqlrow%22%2C%22cql%22%3A%22%5B%5D%20within%20%3Cspeech%20date%3D%5C%222017.*%7C2018.*%7C2019.*%7C2020.*%7C2021.*%7C2022.*%5C%22%20%26%20speaker_party%20!%3D%20%5C%22SDS%7CSNS%7CNSi%5C%22%2F%3E%22%2C%22default_attr%22%3A%22lemma%22%7D%2C%22id%22%3A6807%7D%5D

Best regards,
Kristina

H Pirker

Nov 7, 2024, 4:41:54 AM
to NoSketch Engine, Pahor de Maiti, Kristina
My educated guess:
I don't think there is ever a need to search for `[] within <...>`, because that returns each and every token inside the matching structures as a separate hit, which is presumably what pushes the result past 10 million.

Instead, just use:

`<speech date="2017.*|2018.*|2019.*|2020.*|2021.*|2022.*" & speaker_party != "SDS|SNS|NSi"/>` 

This returns the 53,000 <speech> structures you are interested in, one hit per speech, and you can proceed from there.
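For anyone who wants to build such queries programmatically, here is a minimal Python sketch of the two query forms. The helper name `speech_query` is hypothetical and not part of NoSketch Engine; the attribute names `date` and `speaker_party` and the party list are taken from the query above. The last step only percent-encodes the string the way the `cql` parameter appears in the link in the original post; it does not call any NoSketch Engine API.

```python
from urllib.parse import quote


def speech_query(years, excluded_parties):
    """Build a CQL query that matches whole <speech> structures.

    Hypothetical helper: it only assembles the query string shown above.
    """
    date_re = "|".join(f"{year}.*" for year in years)
    party_re = "|".join(excluded_parties)
    return f'<speech date="{date_re}" & speaker_party != "{party_re}"/>'


# One hit per <speech>: the form suggested in this reply.
cql = speech_query(range(2017, 2023), ["SDS", "SNS", "NSi"])

# One hit per token inside those speeches: the form from the original post.
token_cql = f"[] within {cql}"

# Percent-encode the query for use as a URL parameter value.
encoded = quote(cql, safe="")

print(cql)
print(token_cql)
print(encoded)
```

The only difference between the two strings is the `[] within` prefix, which is what turns one hit per speech into one hit per token.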

cheers 
Hannes