Creating a seeds file

grant....@griffithuni.edu.au

Aug 27, 2018, 11:47:55 PM
to LDSpider
Hi,

I've been working on updating LDSpider recently and I was wondering if there is an option within the CLI to crawl a URL looking for links to Linked Data. For example, can I point LDSpider at http://harth.org/andreas/ and have it find http://harth.org/andreas/foaf? And can it either export http://harth.org/andreas/foaf to build a seed file, or crawl through it automatically?

Thanks.

- Grant Burgess

tobias...@kit.edu

Aug 28, 2018, 1:16:18 PM
to LDSpider
Hi Grant,

Great to see continuing interest in LDSpider. LDSpider can follow RDF links. Hence, if an extractor pulls RDF out of the corresponding HTML headers, LDSpider can follow the links it finds. Any23 does this kind of extraction and can be used from LDSpider. Other RDFa extractors may also be helpful. In the case of Andreas' homepage, you would want to follow the <http://www.w3.org/1999/xhtml/vocab#meta> predicate (e.g. using a LinkFilter), which is admittedly very generic.
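
To illustrate the idea outside of LDSpider, here is a rough Python sketch using only the standard library. It is not LDSpider's or Any23's API. It assumes the homepage advertises its RDF document through a <link rel="meta" href="..."> element in the HTML head (the markup that corresponds to the xhtml/vocab#meta predicate), and that a seed file is simply one URI per line. The sample markup and the function names are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaLinkParser(HTMLParser):
    """Collects the href of every <link> element whose rel contains `rel`."""

    def __init__(self, base_url, rel="meta"):
        super().__init__()
        self.base_url = base_url
        self.rel = rel
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "meta alternate"
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if self.rel in rels and href:
            # Resolve relative hrefs such as "foaf" against the page URL
            self.links.append(urljoin(self.base_url, href))


def find_seed_uris(html, base_url, rel="meta"):
    """Return absolute URIs of the documents linked with the given rel."""
    parser = MetaLinkParser(base_url, rel)
    parser.feed(html)
    parser.close()
    return parser.links


def seeds_text(uris):
    """Format URIs as seed-file content (assumed format: one URI per line)."""
    return "".join(uri + "\n" for uri in uris)


if __name__ == "__main__":
    # Hypothetical markup, not copied from the real page
    sample = '<html><head><link rel="meta" href="foaf"/></head></html>'
    print(seeds_text(find_seed_uris(sample, "http://harth.org/andreas/")), end="")
```

The output of seeds_text could be written to a file and used as the seed list for a crawl, if the one-URI-per-line assumption holds for your setup.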

FWIW, if you have a mid- or long-term interest in using LDSpider, you may want to look at the nxparser2.2 branch, where we started to refactor LDSpider to use updated libraries, which include bug fixes and easier-to-use APIs. We stopped the update when the libraries continued to introduce API-breaking changes, but we can give you hints on how to finish the work.

Cheers,

Tobias