Crawler crawler = new Crawler(5);
Frontier frontier = new BasicFrontier();
frontier.add(new URI("http://**********")); -> This is where I put the URI of the dump, isn't it?
Next comes the LinkFilter. If I want to download only one dataset, for example DBpedia, without also downloading the datasets it links to, do I set that here? How?
LinkFilter linkFilter = new LinkFilterDefault(frontier);
crawler.setLinkFilter(linkFilter);
The next step is to set the ContentHandler. If I want to download datasets in any format, is it mandatory to use an Any23 server?
ContentHandler contentHandler = new ContentHandlers(new ContentHandlerRdfXml(), new ContentHandlerNx());
crawler.setContentHandler(contentHandler);
Finally I have to set the Sink, where all the information from the downloaded dataset is stored. How do I set it up to save that information as RDF?
OutputStream os = new FileOutputStream("/Users/*************"); -> Do I put here the path to the file where I want the information to be written?
Sink sink = new SinkCallback(new CallbackNQOutputStream(os));
crawler.setOutputCallback(sink);
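To make the Sink question above concrete: as far as I understand, CallbackNQOutputStream serializes each crawled statement as one N-Quads line into whatever OutputStream you hand it, so the FileOutputStream path simply decides which file those lines end up in. Here is a minimal stdlib-only sketch of that output side, without the crawler itself; the temp file and the sample quad are hypothetical, just to show the line format the sink would write.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SinkOutputSketch {
    public static void main(String[] args) throws IOException {
        // Hypothetical output file; in the crawler this is the path given to FileOutputStream.
        Path out = Files.createTempFile("crawl", ".nq");

        // A sample statement in N-Quads form: subject, predicate, object, graph, one per line.
        String quad = "<http://example.org/s> <http://example.org/p> "
                + "<http://example.org/o> <http://example.org/g> .";

        // Write it through a plain OutputStream, the way a sink writes crawled data.
        try (OutputStream os = new FileOutputStream(out.toFile())) {
            os.write((quad + "\n").getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back to show what landed on disk.
        System.out.println(Files.readAllLines(out).get(0));
    }
}
```

So, if I understand correctly, pointing the FileOutputStream at a file ending in .nq and letting the sink write to it should give me the crawled data as RDF in N-Quads form.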
Thank you, I look forward to your answers.
Regards