Subcorpus not found - Advanced Wordlist search fails when adding any Text types

Valdis Saulespurens

Oct 12, 2023, 8:18:08 AM
to NoSketch Engine
All of our corpora (at nosketch.lnb.lv) fail when using Advanced Wordlist search with any Text type filter.

The error message is as follows:

Something went wrong…
Subcorpus "###8853811370775438242###" not found

  • Advanced Wordlist search works with the regular Find options (words, lemmas)
  • It fails as soon as you add a filter from any of the Text types
Depending on the corpus type, we have author, title, firstPublished and other Text types, but all of them fail as filters in the Advanced Wordlist search.

The number in the subcorpus hash (###numbers###) changes depending on the filter you choose, but the search always fails.

My hypothesis is that there is some issue with indexing, possibly caused by an incorrect configuration file.

What is puzzling is that these same Text type filters WORK in the Advanced concordance search.

Any ideas? Maybe there are log files to inspect?

Sincerely,
   Valdis Saulespurens
Researcher, National Library of Latvia


PSS Simple 4 core VPS - 16GB RAM
Ubuntu 18.04.1 LTS
Python 2.7.17
Python 3.6.9

Crystal 2.14
Bonito 4.24
Manatee 2.167

Tomáš Svoboda

Oct 12, 2023, 9:21:17 AM
to NoSketch Engine, valdis.sa...@gmail.com
Dear Valdis,
I see that you are using a NoSketch Engine installation that is more than three years old, so it is not easy to give you good advice. If you can, please update your Manatee, Bonito and Crystal.

Anyway, my best guess is that your web server does not have permission to write to the subcorpus directory. The path to that directory is set in run.cgi, in the "subcpath" variable.
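
A minimal sketch of such a permission check, assuming Python 3 is available on the server. The function name can_write_into and the script name used below are only illustrative; the directory to test is whatever subcpath points to in your run.cgi. The script has to be run as the web server user, otherwise it reports the permissions of your own account.

```python
import os
import sys


def can_write_into(directory):
    """Return True if the current user may create entries in `directory`."""
    # Creating a file or subdirectory needs both write and search (execute)
    # permission on the containing directory.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Directories to test are passed on the command line,
    # e.g. the entries of subcpath from run.cgi.
    for path in sys.argv[1:]:
        verdict = "writable" if can_write_into(path) else "NOT writable"
        print(path, verdict)
```

For example, if the script were saved as check_writable.py (a hypothetical name) and the web server ran as www-data, it could be called as: sudo -u www-data python3 check_writable.py followed by the subcpath directory.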


Best regards!

Tomas
On Thursday, October 12, 2023 at 14:18:08 UTC+2, valdis.sa...@gmail.com wrote:

Valdis Saulespurens

Oct 13, 2023, 8:14:53 AM
to NoSketch Engine, tomas....@sketchengine.co.uk, Valdis Saulespurens
Thank you, Tomas, for the advice!

We do plan to upgrade to the latest versions as soon as possible.

For the time being, I checked run.cgi for Bonito:


    # ConcCGI options
    _cache_dir = _data_dir + '/cache'
    _tmp_dir = _data_dir + '/tmp'
    subcpath = [_data_dir + '/subcorp/GLOBAL']
    gdexpath = [] # [('confname', '/path/to/gdex.conf'), ...]
    user_gdex_path = "" # /path/to/%s/gdex/ %s to be replaced with username


I did not define any gdex options because we are not using gdex.

valdiss@nosketch:/var/lib/bonito$ ls -l
total 16
drwxr-xr-x 21 www-data www-data 4096 Oct 12 07:59 cache
-rw-r--r--  1 www-data www-data    0 Oct 20  2021 htpasswd
drwxr-xr-x  2 www-data www-data 4096 Oct 20  2021 jobs
drwxr-xr-x  2 www-data www-data 4096 Oct 20  2021 options
drwxr-xr-x  3 www-data www-data 4096 Oct 13 11:19 subcorp


It looks normal. I did notice that there was no GLOBAL directory inside subcorp, so I created one and assigned ownership to www-data:www-data:

valdiss@nosketch:/var/lib/bonito/subcorp$ mkdir GLOBAL
valdiss@nosketch:/var/lib/bonito/subcorp$ sudo chown -R www-data:www-data GLOBAL/
valdiss@nosketch:/var/lib/bonito/subcorp$ ls -l
total 4
drwxr-xr-x 2 www-data www-data 4096 Oct 13 11:19 GLOBAL

Restarted apache2 just in case.
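
The manual ownership checks above could also be scripted. This is only a rough sketch: the describe helper is hypothetical, and the value /var/lib/bonito for _data_dir is an assumption taken from the shell prompt rather than from run.cgi itself. It prints mode, owner and group for each directory named in the ConcCGI options and flags any that do not exist.

```python
import grp
import os
import pwd
import stat


def describe(path):
    """Return 'mode owner group path' for `path`, or 'MISSING path'."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return "MISSING " + path
    # Fall back to the numeric id when no user or group name is registered.
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return "{} {} {} {}".format(stat.filemode(st.st_mode), owner, group, path)


if __name__ == "__main__":
    data_dir = "/var/lib/bonito"  # assumption: the value of _data_dir in run.cgi
    # Subdirectories referenced by _cache_dir, _tmp_dir and subcpath.
    for sub in ("cache", "tmp", "subcorp", "subcorp/GLOBAL"):
        print(describe(os.path.join(data_dir, sub)))
```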

The problem still persists: the concordance Advanced search can use Text type filters, but the Wordlist Advanced search cannot.

Will update, if/when I make any progress with this issue.

Sincerely,
   Valdis 

Vojtěch Kovář

Oct 26, 2023, 12:44:22 PM
to Valdis Saulespurens, NoSketch Engine, tomas....@sketchengine.co.uk
Dear Valdis,

Were you able to make any progress on this? If not, could you please double-check that the web server is able to write (including creating subdirectories) into the subcorp directory? Unlike text types in the concordance, using text types in the wordlist feature tries to create a new "instant" subcorpus, derived from the selected text types, in a new special directory inside subcorp. The "subcorpus not found" message most likely means that this creation failed.
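
One way to test this is a small probe that actually creates a subdirectory with a file in it and then removes it again. The sketch below is only an illustration: try_create_subdir is a made-up helper and it does not reproduce how the subcorpus itself is created, it only exercises the same kind of filesystem operations. It needs to run as the web server user (www-data in your listing) to be meaningful.

```python
import os
import shutil
import sys
import tempfile


def try_create_subdir(base):
    """Try to create a subdirectory with a file in `base`; return (ok, message)."""
    try:
        # mkdtemp picks an unused name, so nothing existing is overwritten.
        probe = tempfile.mkdtemp(prefix="write-probe-", dir=base)
    except OSError as exc:
        return False, "cannot create subdirectory: {}".format(exc)
    try:
        with open(os.path.join(probe, "probe.txt"), "w") as handle:
            handle.write("ok\n")
        return True, "subdirectory and file created, then removed"
    except OSError as exc:
        return False, "cannot write file: {}".format(exc)
    finally:
        # Remove the probe directory again, whatever happened.
        shutil.rmtree(probe, ignore_errors=True)


if __name__ == "__main__":
    # Pass the subcorp directory (and any others to test) as arguments.
    for base in sys.argv[1:]:
        ok, message = try_create_subdir(base)
        print("OK  " if ok else "FAIL", base, "-", message)
```

If the probe fails for the subcorp directory while it succeeds for your own account, the web server user lacks the permissions described above.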

If this does not help, would you be able to share the whole contents of the run.cgi file?

Best regards,

Vojtech Kovar
Sketch Engine Team


Valdis Saulespurens

Jul 4, 2024, 5:51:52 AM
to NoSketch Engine, vojtech.kovar, NoSketch Engine, tomas....@sketchengine.co.uk, Valdis Saulespurens
Just wanted to update you on progress.

We ended up migrating to a newer version of nosketch at https://korpuss.lnb.lv

This newer version of nosketch does not have this issue. :)

Best regards,
   Valdis Saulespurens
Researcher, National Library of Latvia

