According to the documentation page at
https://www.dotcms.com/docs/latest/how-content-is-mapped-to-elasticsearch
"Only the first 8192 characters of Raw fields are indexed, and thus sorting is only performed based on the first 8192 characters for these fields."
It appears that all text, textarea and WYSIWYG fields are Raw. If so, our large blog articles will not be fully indexed. Full indexing is a business requirement for us, and it worked fine when we used Solr (which is similar to Elasticsearch).
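To make the concern concrete, here is a minimal Python sketch of how I read that rule. It assumes the limit behaves like plain truncation of the field value, which the documentation quote suggests but which I have not verified against dotCMS itself. The helper `raw_value` and the sample articles are hypothetical and only for illustration.

```python
RAW_LIMIT = 8192  # figure quoted from the dotCMS documentation page


def raw_value(text: str, limit: int = RAW_LIMIT) -> str:
    """Hypothetical helper: keep only the leading part of a field value.

    Assumption: the Raw field stores a simple prefix of the original text.
    """
    return text[:limit]


# Two long articles that differ only after the first 8192 characters.
shared_prefix = "x" * RAW_LIMIT
article_a = shared_prefix + " apple"
article_b = shared_prefix + " zebra"

# Under the truncation assumption the two Raw values are identical,
# so a sort on the Raw field could not tell the articles apart.
same_after_truncation = raw_value(article_a) == raw_value(article_b)

# Text past the limit is absent from the truncated value.
tail_survives = "zebra" in raw_value(article_b)

print(same_after_truncation, tail_survives)
```

If the limit really works this way, anything past the first 8192 characters of an article would play no part in sorting on that field.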
I can see that one advantage of this rule is that it makes indexing faster and keeps the index smaller. We currently have some 70,000 documents and are able to generate an incremental index hourly on relatively standard hardware (a single server for the CMS and index generation, with the search instance in the cloud).
Does anyone know where this value is controlled? Or will we need to set up a custom instance of Elasticsearch and/or a manual indexing process?
Any hints are welcome, thanks.
- Jon