Dear NoSketch Engine team,
I hope you're doing well. I am working with the General Regionally Annotated Corpus of Ukrainian, and I have a question about compiling corpora (manatee-open-2.223.6).
Are there any particular considerations when compiling corpora with compilecorp using many threads? I've noticed that compilecorp occasionally loses part of the corpus during the process. While recompiling sometimes solves this issue, there might be better approaches. Currently, I'm using GNU Parallel for this task.
Any insights or best practices you can share would be greatly appreciated.
Thank you for your time.
Best regards,
Arsenij