PDF and MS documents are not being crawled

anve...@gmail.com

Oct 12, 2018, 11:30:51 AM
to DigitalPebble
Hi,

I am working with Apache Storm 1.2.2 and Elasticsearch 6.4. The web crawler is performing well, but when I look for documents such as PDF and DOCX files, they do not appear in the results. I checked the regex and there are no restrictions on these document types. How can I tell the crawler to fetch these documents?


Regards,
Anvesh.

DigitalPebble

Oct 12, 2018, 11:32:59 AM
to DigitalPebble
