noob request

Shashvat Walia

Dec 11, 2021, 7:52:27 PM

to MG4J

1) If we deploy BUbiNG, can we set the crawler to automatically store the crawled index in a database (such as Hypertable), or do we always have to import it from the WARC file?

2) BUbiNG is only the crawler. Can we use BUbiNG with Hypertable?

3) Could you suggest how we can rank the results using open-source software?

4) Can we configure the crawler to crawl only hyperlinks along with their anchor text, plus files such as *.pdf, as opposed to the entire page?