[5.0.3] SiteSearch: how to exclude some content and how to get aggregations and suggestions


Giacomo Petillo

Jan 25, 2021, 6:17:10 AM
to dotCMS User Group
Hi all,
is there a way to use the "SiteSearch" viewtool with type exclusion, e.g. to hide images, PDF files and other file types?
I added the "DOTCMS_SITESEARCH" user agent exclusion before the "<img>" tag and set the "Exclude" field to "/application, /resourses/", but when I search for a term the results still contain both images and JS/CSS files.


Another question: it seems that this viewtool doesn't support the "aggregations" and "suggestions" options of an ES query. It just returns "SiteSearchResults", which doesn't contain that information. Is there a method for this?


I have to implement a site search that excludes all images, PDFs (and other types) and returns the category aggregations for the results. Any ideas? Is the EsTool the best choice?



Thanks,
G.

Xander Steinmann

Jan 26, 2021, 2:33:21 AM
to dotCMS User Group
Hi Giacomo,

If you want basic search then the SiteSearch tool is fine. You can use the DOTCMS_SITESEARCH user agent check to keep certain parts of pages out of the index, and the Exclude field to exclude files (I believe it should be /resources/*). If you want to take it a step further you can do the following things:
  1. Filter the results you get from the viewtool, for instance to remove all file results
  2. Create a custom ES query that returns better results (you can test it in the back-end)
  3. Create a plugin that calls the site search methods
  4. Completely customize Elasticsearch and what it returns - this is possible but not recommended
One thing I noticed with the index is that it is never cleaned up; things are only ever added to it. This means that if a page is indexed and later removed, it will keep popping up in the results. That's why, for one client, we went for option 3 and added code that checks whether results actually exist. We also added a checkbox to pages so the customer can select which pages should turn up in the site search results and which shouldn't.
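
To make option 1 and the "does it still exist" check a bit more concrete, here is a rough sketch in Python rather than Velocity or plugin Java, just to show the logic. Everything in it is an assumption for illustration: the `uri` key on each hit, the extension and prefix lists, and the `still_exists` callback are hypothetical and not part of the SiteSearch API.

```python
# Extensions treated as "file" results (assumed list, adjust as needed).
EXCLUDED_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".pdf", ".js", ".css")
# Path prefixes to drop entirely (assumed, mirroring the Exclude field).
EXCLUDED_PREFIXES = ("/application/", "/resources/")


def filter_results(results, still_exists=lambda uri: True):
    """Return only the hits that look like live pages.

    results:      iterable of dicts with a "uri" key (hypothetical shape).
    still_exists: hypothetical callback that says whether the page behind
                  a URI is still published; defaults to "always yes".
    """
    kept = []
    for hit in results:
        uri = hit.get("uri", "")
        # Ignore any query string and compare case-insensitively.
        path = uri.split("?", 1)[0].lower()
        if path.endswith(EXCLUDED_EXTENSIONS):
            continue  # image, PDF, script or stylesheet
        if path.startswith(EXCLUDED_PREFIXES):
            continue  # lives under an excluded folder
        if not still_exists(uri):
            continue  # stale index entry for a removed page
        kept.append(hit)
    return kept
```

Note that filtering after the query means a page of results can come back shorter than requested, so pagination may need to ask for more hits than it shows.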

Kind regards,

Xander

Giacomo Petillo

Jan 27, 2021, 12:00:59 PM
to dotCMS User Group
Thank you, Xander, for the suggestions.

The DOTCMS-SITESEARCH user agent exclusion is used to remove some content from rendered pages, and it works well!

I studied both the docs and the source code, and I'll upgrade to 5.3.8.2 (LTS) as soon as possible, because the new SiteSearch viewtool has a "getAggregation" method, which is what I need.

N.B. Use the "keywords" meta tag to aggregate over field values or custom values, as shown here: https://dotcms.com/docs/latest/site-search#MetaKeywords
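
For anyone who ends up writing the raw ES query instead (option 2 above), this is roughly the shape I would expect the request body to take, sketched in Python as a plain dict. It uses standard Elasticsearch query DSL (`bool`/`must_not`, a `terms` aggregation, a `term` suggester), but the field names `uri`, `keywords` and `content` are my assumptions about the site search index mapping, and `build_search_body` is a hypothetical helper, so check the real mapping before relying on it.

```python
import json

# File extensions to keep out of the results (assumed list).
EXCLUDED_EXTENSIONS = ["png", "jpg", "jpeg", "gif", "pdf", "js", "css"]


def build_search_body(term, size=10):
    """Build an Elasticsearch request body for a filtered site search.

    Field names ("uri", "keywords", "content") are assumptions and must
    match the actual index mapping.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match on the user's search term.
                "must": [{"query_string": {"query": term}}],
                # Drop hits whose URI ends in an excluded extension.
                "must_not": [
                    {"wildcard": {"uri": f"*.{ext}"}}
                    for ext in EXCLUDED_EXTENSIONS
                ],
            }
        },
        # Bucket the hits by keyword value.
        "aggs": {"keywords": {"terms": {"field": "keywords", "size": 20}}},
        # Spelling suggestions for the search term.
        "suggest": {"did_you_mean": {"text": term, "term": {"field": "content"}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("dotcms"), indent=2))
```

A `terms` aggregation only works on a field that is mapped as a keyword type (or has fielddata enabled), so the buckets will only be useful if the keywords field is indexed that way.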

G.