Error when performing crawling

Jaakko Lappalainen

Dec 10, 2012, 4:18:54 PM
to ldsp...@googlegroups.com
Hi, I am using the following options with ldspider-1.1e.jar:

int numThreads = 3;
int depth = 2;
int maxURIs = 100;

I invoke the crawling method as:

crawler.evaluateBreadthFirst(frontier, depth,maxURIs,1, Crawler.Mode.ABOX_ONLY);

What is the 4th argument? Are the rest of the arguments correct?

Anyway, the program seems to be frozen in the method 'evaluateBreadthFirst'. I let the code run for more than 500 minutes in NetBeans. Is this normal?

Some output is here:

Dec 10, 2012 10:14:20 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread <init>
INFO: Initialised CloseIdleConnectionThread with sleepTime 60000 ms
Dec 10, 2012 10:14:20 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Starting CloseIdleConnectionThread
Dec 10, 2012 10:14:20 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list [data.gov.au]
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 1 plds done in 5 ms
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: data.gov.au: 1

Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: data.gov.au: 1

Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 0 with 1 uris
LT-0
LT-1
LT-2
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:21 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Exception in thread "LT-0:http://lab.environment.data.gov.au/data/acorn/climate/slice.html" java.lang.NoSuchMethodError: org.osjava.norbert.NoRobotClient.parse(Ljava/lang/String;Ljava/net/URL;)V
at com.ontologycentral.ldspider.http.robot.Robot.<init>(Unknown Source)
at com.ontologycentral.ldspider.http.robot.Robots.accessOk(Unknown Source)
at com.ontologycentral.ldspider.http.LookupThread.run(Unknown Source)
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 0 DONE with 0 uris remaining in queue
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list []
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done in 0 ms
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: 
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 1 with 0 uris
LT-0
LT-1
LT-2
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 1 DONE with 0 uris remaining in queue
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list []
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done in 1 ms
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: 
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 2 with 0 uris
LT-0
LT-1
LT-2
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 2 DONE with 0 uris remaining in queue
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list []
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done in 0 ms
Dec 10, 2012 10:14:22 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: 
Crawling finished.
Dec 10, 2012 10:15:20 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 10, 2012 10:16:20 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 10, 2012 10:17:20 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections

If you need any more info, don't hesitate to ask.

Andreas Harth

Dec 10, 2012, 6:45:54 PM
to ldsp...@googlegroups.com
Hi Jaakko,

do you use the code or the jar? What's your seed set? Maybe you can
minimise the seed to help narrow down the issue?

Cheers,
Andreas.

Andreas Harth

Dec 10, 2012, 7:05:16 PM
to ldsp...@googlegroups.com
Hi,

just saw that you use the jar. I think we had a very short timeout
in there (500 ms or so), so the crawler is starving, i.e. not getting
any new URIs. FWIW, there's probably a small bug in that the crawler
does not terminate.

The current code in the SVN has more conservative timeout values,
so I'd advise you to use that.
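
(If you are stuck on the old jar for some reason, you could try bumping the timeouts before constructing the crawler. This is only a rough sketch; the field names in CrawlerConstants are an assumption here, so check the class in your version:)

// Sketch only: CONNECTION_TIMEOUT / SOCKET_TIMEOUT are assumed field names,
// verify against CrawlerConstants in your checkout before relying on this.
CrawlerConstants.CONNECTION_TIMEOUT = 8000; // ms
CrawlerConstants.SOCKET_TIMEOUT = 8000;     // ms
Crawler crawler = new Crawler(3);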

Best regards,
Andreas.

Jaakko Lappalainen

Dec 11, 2012, 4:08:40 AM
to ldsp...@googlegroups.com
Thank you Andreas, I am using the code from the SVN right now.

And I am calling the method like this:

crawler.evaluateBreadthFirst(frontier, 2, 100, 100, 100, false, Crawler.Mode.ABOX_ONLY);

I am just using random numbers as arguments and the program still freezes. Here's the output:

Dec 11, 2012 10:06:12 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread <init>
INFO: Initialised CloseIdleConnectionThread with sleepTime 60000 ms
Dec 11, 2012 10:06:12 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Starting CloseIdleConnectionThread
Dec 11, 2012 10:06:12 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Exception in thread "main" java.lang.NullPointerException
at com.ontologycentral.ldspider.Crawler.evaluateBreadthFirst(Crawler.java:239)
at aglinkeddatacrawler.AgLinkedDataCrawler.main(AgLinkedDataCrawler.java:73)
Dec 11, 2012 10:07:12 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections

Thank you very much for your responses

Andreas Harth

Dec 11, 2012, 4:55:25 AM
to ldsp...@googlegroups.com
Hi Jaakko,

which seed set do you use?

@Tobias: Providing the "simple" interface w/o load balancing (as
in the wiki) with sensible default values might be a good idea.

Cheers,
Andreas.

Jaakko Lappalainen

Dec 11, 2012, 5:21:28 AM
to ldsp...@googlegroups.com

Andreas Harth

Dec 11, 2012, 10:38:39 AM
to ldsp...@googlegroups.com
Hi,

$ svn update
$ ant clean; ant dist
$ mkdir crawl-jaakko
$ cd crawl-jaakko
$ vi seed.txt # create file with seed URIs, one per line
$ java -jar ../dist/ldspider-trunk.jar -b 2 -s seed.txt -a access.log -o data.nq -r redirects.nx

actually works (breadth-first traversal depth 2, with 2 threads,
default wait time 2 secs) and does lookups; I stopped once I started
getting timeouts from the server.

If that does not solve your problem, did you register all the handlers
as described in [1]? If so (and it still doesn't work), could you
provide a minimal test case?

Cheers,
Andreas.

[1] http://code.google.com/p/ldspider/wiki/GettingStartedAPI
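
For reference, a minimal sketch of that handler setup, assembled from the classes that come up later in this thread; the seed URI and the Frontier.add(URI) call for seeding are assumptions based on the wiki page:

Crawler c = new Crawler(2);
Frontier frontier = new BasicFrontier();
frontier.add(new URI("http://example.org/")); // placeholder seed, replace with your own
c.setLinkFilter(new LinkFilterDefault(frontier));
c.setErrorHandler(new ErrorHandlerLogger(System.out, null, false));
c.setRedirsClass(HashTableRedirects.class);
c.setContentHandler(new ContentHandlers(new ContentHandlerRdfXml(), new ContentHandlerNx()));
c.setOutputCallback(new SinkCallback(new CallbackNQOutputStream(System.out)));
c.evaluateBreadthFirst(frontier, 2, -1, -1, -1, false);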

Jaakko Lappalainen

Dec 11, 2012, 6:01:47 PM
to ldsp...@googlegroups.com
Hi, no. I get these exceptions:

java.lang.NullPointerException
at org.osjava.norbert.NoRobotClient.isUrlAllowed(NoRobotClient.java:219)
at com.ontologycentral.ldspider.http.robot.Robot.isUrlAllowed(Robot.java:115)
at com.ontologycentral.ldspider.http.robot.Robots.accessOk(Robots.java:91)
at com.ontologycentral.ldspider.http.LookupThread.run(LookupThread.java:111)


Although I get data in my output file.

Andreas Harth

Dec 12, 2012, 3:12:30 AM
to ldsp...@googlegroups.com
Hi,

On 12/12/12 00:01, Jaakko Lappalainen wrote:
> Hi, no I get these exceptions
>
> java.lang.NullPointerException
> at org.osjava.norbert.NoRobotClient.isUrlAllowed(NoRobotClient.java:219)
> at
> com.ontologycentral.ldspider.http.robot.Robot.isUrlAllowed(Robot.java:115)
> at com.ontologycentral.ldspider.http.robot.Robots.accessOk(Robots.java:91)
> at com.ontologycentral.ldspider.http.LookupThread.run(LookupThread.java:111)

you can ignore that error. It means there was a robots.txt which could
not be parsed. This happens quite often when a lookup on robots.txt returns
an HTML 404 page.

> Although I get data in my output file.

Great!

Best regards,
Andreas.

Jaakko Lappalainen

Dec 14, 2012, 4:14:13 AM
to ldsp...@googlegroups.com
Hi again, how can I reproduce this execution

java -jar ../dist/ldspider-trunk.jar -b 2 -s seed.txt -a access.log -o data.nq -r redirects.nx

using the API? 

Jürgen Umbrich

Dec 14, 2012, 4:50:34 AM
to ldsp...@googlegroups.com
Hi 

A very good starting point and a comprehensive example of how command-line parameters are mapped to API calls is the following class:
http://code.google.com/p/ldspider/source/browse/trunk/src/com/ontologycentral/ldspider/Main.java

The run method in particular (starting at line 333) shows very nicely how the LDSpider framework is initialised and how hooks can easily be plugged into the system.

Let us know if that helps.

Best
   Juergen

Jaakko Lappalainen

Dec 14, 2012, 5:23:05 AM
to ldsp...@googlegroups.com
Hi, thanks for the quick reply. 

My main problem is that I need a mapping between this prototype,

public void evaluateBreadthFirst(Frontier frntr, int i, int i1, int i2, int i3, boolean bln, Mode mode)

which I'm running from the trunk .jar file you provided above, and the actual prototype from the code you also pointed me to in Main.java:

public void evaluateBreadthFirst(Frontier frontier, int depth, int maxuris, int maxplds, int minActPlds, boolean minActPldsAlready4Seedlist, Mode crawlingMode)
Thank you very much

Jürgen Umbrich

Dec 14, 2012, 5:37:04 AM
to ldsp...@googlegroups.com
The "evaluateBreadthFirst" code is shown here: http://code.google.com/p/ldspider/source/browse/trunk/src/com/ontologycentral/ldspider/Crawler.java

I think the necessary mapping for you should be (as per the source code, line 229, http://code.google.com/p/ldspider/source/browse/trunk/src/com/ontologycentral/ldspider/Crawler.java#229):

public void evaluateBreadthFirst(Frontier frontier, int depth, int maxuris, int maxplds, int minActPlds, boolean minActPldsAlready4Seedlist) {
    evaluateBreadthFirst(frontier, depth, maxuris, maxplds, minActPlds, minActPldsAlready4Seedlist, Mode.ABOX_AND_TBOX);
}

Hope that helps

Jaakko Lappalainen

Dec 14, 2012, 1:34:48 PM
to ldsp...@googlegroups.com
Ok, then I'll assume that

i = depth
i1 = maxURIs
i2 = maxplds
i3 = minActPlds
bln = minActPldsAlready4Seedlist

I use this configuration

int depth = 2;
int maxURIs = 100;
int maxplds = 5;
int minActPlds = 1;
boolean minActPldsAlready4Seedlist = false;
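
I invoke the crawl with the six-argument prototype above:

crawler.evaluateBreadthFirst(frontier, depth, maxURIs, maxplds, minActPlds, minActPldsAlready4Seedlist);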

And I still can't get any results. This is the output:

Exception in thread "main" java.lang.NullPointerException
at com.ontologycentral.ldspider.Crawler.evaluateBreadthFirst(Crawler.java:239)
at com.ontologycentral.ldspider.Crawler.evaluateBreadthFirst(Crawler.java:230)
at aglinkeddatacrawler.AgLinkedDataCrawler.main(AgLinkedDataCrawler.java:78)

What would be the equivalent configuration when calling the jar like this?

java -jar ../dist/ldspider-trunk.jar -b 2 -s seed.txt -a access.log -o data.nq -r redirects.nx

Thank you very much

Jürgen Umbrich

Dec 14, 2012, 2:07:19 PM
to ldsp...@googlegroups.com
Hi Jaakko, 

Again, I would like to point you to the main class of LDSpider.

This class is well documented and contains the logic behind the mapping of command-line parameters to the actual API initialisation.
To see exactly how these parameters are implemented, just search for the string cmd.hasOption("<PARAMETER>"),
e.g., line 636: if (cmd.hasOption("b")) { ...

Juergen

Jaakko Lappalainen

Dec 14, 2012, 3:01:01 PM
to ldsp...@googlegroups.com
Thank you Juergen; let me quote the code you referred me to:

if (cmd.hasOption("b")) {
    String[] vals = cmd.getOptionValues("b");

    int depth = Integer.parseInt(vals[0]);
    int maxuris = -1;
    int maxplds = -1;

    if (vals.length > 1) {
        maxuris = Integer.parseInt(vals[1]);
        if (vals.length > 2) {
            maxplds = Integer.parseInt(vals[2]);
        }
    }

    _log.info("breadth-first crawl with " + CrawlerConstants.NB_THREADS + " threads, depth " + depth + " maxuris " + maxuris + " maxplds " + maxplds + " minActivePlds " + cmd.getOptionValue("minpld", "unspecified"));

    c.evaluateBreadthFirst(frontier, depth, maxuris, maxplds, Integer.parseInt(cmd.getOptionValue("minpld", "-1")), cmd.hasOption("mapseed"));
}

From what I understand of this code fragment and the command line, 'vals' only has one number, '2' in this case. So I have to invoke the method with '-1' for maxuris, maxplds and minpld, and 'false' for the boolean argument.
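
So, for illustration, the API call equivalent to plain '-b 2' should presumably be:

// depth from "-b 2"; maxuris, maxplds and minpld default to -1; no "mapseed" option -> false
crawler.evaluateBreadthFirst(frontier, 2, -1, -1, -1, false);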

But still, I get the output I referred you to previously. I must be missing something, but what?

Jürgen Umbrich

Dec 15, 2012, 5:27:02 AM
to ldsp...@googlegroups.com
Hi, 

For which code did you get the error?
I am asking because the Crawler class should not throw a NullPointerException at line 239 (http://code.google.com/p/ldspider/source/browse/trunk/src/com/ontologycentral/ldspider/Crawler.java#239).

Are you using the source code or the binary?

Jaakko Lappalainen

Dec 18, 2012, 9:33:45 AM
to ldsp...@googlegroups.com
Hi, I am using the ldspider-trunk.jar that Andreas suggested to me in previous posts.

Jürgen Umbrich

Dec 18, 2012, 9:41:00 AM
to ldsp...@googlegroups.com
Hi Jaakko,

You might want to check out the source code, as Andreas also suggested:

$ svn update
$ ant clean; ant dist
$ java -jar ../dist/ldspider-trunk.jar 

This would guarantee that we are talking about the same code, so we can check the error messages you report.

Jaakko Lappalainen

Dec 18, 2012, 10:13:30 AM
to ldsp...@googlegroups.com
Done, still getting the same error.

I checked out from http://ldspider.googlecode.com/svn/trunk/ into ldspider-read-only.

Is that right?

Jürgen Umbrich

Dec 18, 2012, 10:44:04 AM
to ldsp...@googlegroups.com
Yep, that's right.

Can you provide a minimal working example (MWE) or a short, self-contained, correct example (SSCCE), so we can try to track the problem down on our end?

Jaakko Lappalainen

Dec 18, 2012, 11:29:52 AM
to ldsp...@googlegroups.com
Hi, here is my main method in Java:

public static void main(String[] args) throws ClassNotFoundException, SQLException, URISyntaxException, FileNotFoundException, IOException {

        Crawler crawler = new Crawler(2);
        Frontier frontier = new BasicFrontier();
        LinkFilter linkFilter = new LinkFilterDefault(frontier);
        crawler.setLinkFilter(linkFilter);

        ContentHandler contentHandler = new ContentHandlers(new ContentHandlerRdfXml(), new ContentHandlerNx());
        crawler.setContentHandler(contentHandler);

        Sink sink = new SinkCallback(new CallbackRDFXMLOutputStream(System.out));
        crawler.setOutputCallback(sink);

        int depth = 4;
        int maxURIs = 100;
        int maxplds = 1;
        int minActPlds = 1;
        boolean minActPldsAlready4Seedlist = false;
        boolean includeABox = true;
        boolean includeTBox = false;
        crawler.evaluateBreadthFirst(frontier, depth, maxURIs, maxplds, minActPlds, minActPldsAlready4Seedlist);
        System.err.println("Crawling finished.");
    }

Jürgen Umbrich

Dec 19, 2012, 5:16:48 AM
to ldsp...@googlegroups.com

Hi, 

The general problem is that you did not initialise the crawler with all the necessary components.

Following the detailed how-to on our wiki page should get you going;
see http://code.google.com/p/ldspider/wiki/GettingStartedAPI

What I spotted directly in your code is that you are missing something like:

crawler.setRedirsClass(HashTableRedirects.class);
ErrorHandler eh = new ErrorHandlerLogger(System.out, null, false);
crawler.setErrorHandler(eh);

Jaakko Lappalainen

Dec 19, 2012, 9:15:42 AM
to ldsp...@googlegroups.com
Thank you Juergen, I solved the NullPointerException, and now I get the robots-related exception that Andreas previously said wasn't relevant. But I don't get any results, and I have been waiting for the program to finish for about 30 minutes.

Do you suggest I change the seed I'm using, just in case there's something wrong with it?

Do you get results from the program I pasted in a previous post?

Jürgen Umbrich

Dec 19, 2012, 9:40:04 AM
to ldsp...@googlegroups.com
Hi, 

I don't understand what you mean: you get no results?

Given your current code, the crawled data is written to System.out.

The following code snippet writes the data to data.rdf.xml instead.
------

Crawler crawler = new Crawler(2);
Frontier frontier = new BasicFrontier();
LinkFilter linkFilter = new LinkFilterDefault(frontier);
crawler.setLinkFilter(linkFilter);
crawler.setRedirsClass(HashTableRedirects.class);
ErrorHandler eh = new ErrorHandlerLogger(System.out, null, false);
crawler.setErrorHandler(eh);
ContentHandler contentHandler = new ContentHandlers(new ContentHandlerRdfXml(), new ContentHandlerNx());
crawler.setContentHandler(contentHandler);

File out = new File("data.rdf.xml");
FileOutputStream fos = new FileOutputStream(out);
Sink sink = new SinkCallback(new CallbackRDFXMLOutputStream(fos));
crawler.setOutputCallback(sink);

int depth = 4;
int maxURIs = 100;
int maxplds = 1;
int minActPlds = 1;
boolean minActPldsAlready4Seedlist = false;
boolean includeABox = true;
boolean includeTBox = false;
crawler.evaluateBreadthFirst(frontier, depth, maxURIs, maxplds, minActPlds, minActPldsAlready4Seedlist);
System.err.println("Crawling finished.");
System.err.println("Output is written to " + out.getAbsolutePath());
fos.close();
crawler.close();
-----

You might want to change the output format to N-Quads if you want to keep the source information (see the sketch below), and check the Main class of LDSpider again to see how to iterate over errors and properly close the crawler.
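
A minimal sketch of that sink swap, using CallbackNQOutputStream from the same callback package (the file name is just an example):

File nqOut = new File("data.nq");
OutputStream os = new FileOutputStream(nqOut);
Sink sink = new SinkCallback(new CallbackNQOutputStream(os)); // N-Quads keeps the context per statement
crawler.setOutputCallback(sink);
// remember to close os after the crawl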

Jaakko Lappalainen

Dec 19, 2012, 9:57:50 AM
to ldsp...@googlegroups.com
Hi,

I tried to set the output to N-Quads as you suggested, with:

OutputStream os = new FileOutputStream(outputFile);
Sink sink = new SinkCallback(new CallbackNQOutputStream(os));
crawler.setOutputCallback(sink);

But I still get an empty output.

Jürgen Umbrich

Dec 19, 2012, 9:58:39 AM
to ldsp...@googlegroups.com
Can you provide a minimal working example again?
A Dropbox link or similar is fine.

Jaakko Lappalainen

Dec 19, 2012, 10:00:43 AM
to ldsp...@googlegroups.com
I use your trunk version of the jar, and the main method you pasted before:

public static void main(String[] args) throws IOException, URISyntaxException {
    Crawler crawler = new Crawler(2);
    Frontier frontier = new BasicFrontier();
    LinkFilter linkFilter = new LinkFilterDefault(frontier);
    crawler.setLinkFilter(linkFilter);
    crawler.setRedirsClass(HashTableRedirects.class);
    ErrorHandler eh = new ErrorHandlerLogger(System.out, null, false);
    crawler.setErrorHandler(eh);
    ContentHandler contentHandler = new ContentHandlers(new ContentHandlerRdfXml(), new ContentHandlerNx());
    crawler.setContentHandler(contentHandler);

    File out = new File("data.drf.xml");
    FileOutputStream fos = new FileOutputStream(out);
    //Sink sink = new SinkCallback(new CallbackRDFXMLOutputStream(fos));
    //crawler.setOutputCallback(sink);

    OutputStream os = new FileOutputStream(out);
    Sink sink = new SinkCallback(new CallbackNQOutputStream(os));
    crawler.setOutputCallback(sink);

    int depth = 4;
    int maxURIs = 100;
    int maxplds = 1;
    int minActPlds = 1;
    boolean minActPldsAlready4Seedlist = false;
    boolean includeABox = true;
    boolean includeTBox = false;
    crawler.evaluateBreadthFirst(frontier, depth, maxURIs, maxplds, minActPlds, minActPldsAlready4Seedlist);
    System.err.println("Crawling finished.");
    System.err.println("Output is written to " + out.getAbsolutePath());
    fos.close();
    crawler.close();
}

I guess it is based on the one I pasted you before :)

Jürgen Umbrich

Dec 19, 2012, 10:04:32 AM
to ldsp...@googlegroups.com
The following example works perfectly fine for me.
Firstly, you opened TWO streams to the same output file.
Secondly, you did not close the correct output stream.

Please note, I changed the depth from 4 to 1.

Run the program and have a look at data.nq to see the output.

Crawler crawler = new Crawler(2);
Frontier frontier = new BasicFrontier();
LinkFilter linkFilter = new LinkFilterDefault(frontier);
crawler.setLinkFilter(linkFilter);
crawler.setRedirsClass(HashTableRedirects.class);
ErrorHandler eh = new ErrorHandlerLogger(System.out, null, false);
crawler.setErrorHandler(eh);
ContentHandler contentHandler = new ContentHandlers(new ContentHandlerRdfXml(), new ContentHandlerNx());
crawler.setContentHandler(contentHandler);

File out = new File("data.nq");

OutputStream os = new FileOutputStream(out);
Sink sink = new SinkCallback(new CallbackNQOutputStream(os));
crawler.setOutputCallback(sink);

int depth = 1;
int maxURIs = 100;
int maxplds = 1;
int minActPlds = 1;
boolean minActPldsAlready4Seedlist = false;
boolean includeABox = true;
boolean includeTBox = false;
crawler.evaluateBreadthFirst(frontier, depth, maxURIs, maxplds, minActPlds, minActPldsAlready4Seedlist);
System.err.println("Crawling finished.");
System.err.println("Output is written to " + out.getAbsolutePath());
os.close();
crawler.close();

Jaakko Lappalainen

Dec 19, 2012, 10:58:27 AM
to ldsp...@googlegroups.com
Hello Juergen, I have executed the exact code and data.nq is still an empty file.

Jürgen Umbrich

Dec 19, 2012, 11:04:06 AM
to ldsp...@googlegroups.com
That is more than strange.

Maybe some of the others on the list can verify that the code should produce data?

In the meantime, can you provide the access log from the crawl, to see if there might be a problem?

Jaakko Lappalainen

Dec 19, 2012, 11:18:38 AM
to ldsp...@googlegroups.com
Hi Juergen, I guess you mean the stderr output:

Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread <init>
INFO: Initialised CloseIdleConnectionThread with sleepTime 60000 ms
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Starting CloseIdleConnectionThread
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) [data.gov.au]
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 1 plds done (1 URIs) in 2 ms. This was schedule No. 1
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: data.gov.au: 1
Plus 0 redirects.

Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: data.gov.au: 1
Plus 0 redirects.

Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 0 with 1 uris
LT-0
LT-1
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 19, 2012 5:16:42 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Exception in thread "LT-0:http://lab.environment.data.gov.au/data/acorn/climate/slice" java.lang.NoClassDefFoundError: org/apache/http/Consts
at org.apache.http.client.utils.URIBuilder.digestURI(URIBuilder.java:162)
at org.apache.http.client.utils.URIBuilder.<init>(URIBuilder.java:89)
at org.apache.http.client.utils.URIUtils.rewriteURI(URIUtils.java:134)
at org.apache.http.client.utils.URIUtils.rewriteURI(URIUtils.java:158)
at org.apache.http.impl.client.DefaultRequestDirector.rewriteRequestURI(DefaultRequestDirector.java:389)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:498)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at com.ontologycentral.ldspider.http.ConnectionManager.connect(ConnectionManager.java:103)
at com.ontologycentral.ldspider.http.robot.Robot.<init>(Robot.java:50)
at com.ontologycentral.ldspider.http.robot.Robots.accessOk(Robots.java:78)
at com.ontologycentral.ldspider.http.LookupThread.run(LookupThread.java:111)
Caused by: java.lang.ClassNotFoundException: org.apache.http.Consts
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 13 more
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 0 DONE with 0 uris remaining in queue
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Last non-empty context of this hop (# 0 ): null
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) []
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done (0 URIs) in 1 ms. This was schedule No. 2
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: Plus 0 redirects.

Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 1 with 0 uris
LT-0
LT-1
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 1 DONE with 0 uris remaining in queue
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Last non-empty context of this hop (# 1 ): null
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) []
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done (0 URIs) in 1 ms. This was schedule No. 3
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: Plus 0 redirects.

Crawling finished.
Output is written to /home/user/NetBeansProjects/test/data.nq
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread shutdown
INFO: Stopping CloseIdleConnectionThread
Dec 19, 2012 5:16:43 PM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Stopped CloseIdleConnectionThread

Jürgen Umbrich

Dec 19, 2012, 1:24:20 PM
to ldsp...@googlegroups.com
Ok,

Reading the logs clearly shows that you missed including some necessary libraries, e.g.:

Exception in thread "LT-0:http://lab.environment.data.gov.au/data/acorn/climate/slice" java.lang.NoClassDefFoundError: org/apache/http/Consts

Please make sure that you use all the necessary libraries in the lib folder of the LDSpider project:
http://code.google.com/p/ldspider/source/browse/#svn%2Ftrunk%2Flib

NetBeans has a couple of good documents which will assist you in setting up a Java project and adding the necessary third-party libraries to the classpath, e.g.:
http://netbeans.org/kb/docs/java/project-setup.html
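
As a quick sanity check from within your own code, you can test whether the class from the trace above is actually on the classpath (just a sketch):

try {
    Class.forName("org.apache.http.Consts"); // the class from the NoClassDefFoundError above
    System.out.println("httpcore is on the classpath");
} catch (ClassNotFoundException e) {
    System.out.println("httpcore jar is missing from the classpath");
}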

Hope that helps


Jaakko Lappalainen

Dec 23, 2012, 11:54:56 AM
to ldsp...@googlegroups.com
You are absolutely right, there were two missing inclusions :)

But it is still the same problem. I'll paste the log of the execution started this morning, which is still running with no throughput:

Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread <init>
INFO: Initialised CloseIdleConnectionThread with sleepTime 60000 ms
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Starting CloseIdleConnectionThread
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) [data.gov.au]
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 1 plds done (2 URIs) in 5 ms. This was schedule No. 1
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: data.gov.au: 2
Plus 0 redirects.

Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: data.gov.au: 2
Plus 0 redirects.

Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 0 with 2 uris
LT-0
LT-1
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:25:51 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue pollInternal
INFO: delaying queue 500 ms ...
Dec 23, 2012 11:25:52 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue pollInternal
INFO: queue turnaround in 12 ms
Dec 23, 2012 11:26:07 AM com.ontologycentral.ldspider.hooks.error.ErrorHandlerLogger handleError
Dec 23, 2012 11:26:08 AM com.ontologycentral.ldspider.hooks.error.ErrorHandlerLogger handleError
Dec 23, 2012 11:26:23 AM com.ontologycentral.ldspider.http.LookupThread run
WARNING: Exception org.apache.http.conn.ConnectTimeoutException http://lab.environment.data.gov.au/data/acorn/climate/slice
Dec 23, 2012 11:26:23 AM com.ontologycentral.ldspider.hooks.error.ErrorHandlerLogger handleError
Dec 23, 2012 11:26:23 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 1 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
WARNING: Exception org.apache.http.conn.ConnectTimeoutException http://lab.environment.data.gov.au/def/acorn/time-series
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.hooks.error.ErrorHandlerLogger handleError
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 1 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 0 DONE with 0 uris remaining in queue
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Last non-empty context of this hop (# 0 ): null
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) []
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done (0 URIs) in 1 ms. This was schedule No. 2
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: Plus 0 redirects.

Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 1 with 0 uris
LT-0
LT-1
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 1 DONE with 0 uris remaining in queue
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Last non-empty context of this hop (# 1 ): null
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) []
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done (0 URIs) in 1 ms. This was schedule No. 3
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: Plus 0 redirects.

Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 2 with 0 uris
LT-0
LT-1
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 2 DONE with 0 uris remaining in queue
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Last non-empty context of this hop (# 2 ): null
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) []
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done (0 URIs) in 2 ms. This was schedule No. 4
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: Plus 0 redirects.

Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 3 with 0 uris
LT-0
LT-1
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 3 DONE with 0 uris remaining in queue
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Last non-empty context of this hop (# 3 ): null
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) []
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done (0 URIs) in 1 ms. This was schedule No. 5
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: Plus 0 redirects.

Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Starting threads round 4 with 0 uris
LT-0
LT-1
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: starting thread ...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.http.LookupThread run
INFO: finished thread after fetching 0 uris; 0 in all threads overall until now (0 with non-empty RDF).
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: ROUND 4 DONE with 0 uris remaining in queue
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.Crawler evaluateBreadthFirst
INFO: Last non-empty context of this hop (# 4 ): null
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: start scheduling...
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: sorted pld list (sorted only if maximum for plds or uris has been set) []
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: scheduling 0 plds done (0 URIs) in 1 ms. This was schedule No. 6
Dec 23, 2012 11:26:24 AM com.ontologycentral.ldspider.queue.BreadthFirstQueue schedule
INFO: Plus 0 redirects.

Crawling finished.
Dec 23, 2012 11:26:51 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 23, 2012 11:27:51 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 23, 2012 11:28:51 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
Dec 23, 2012 11:29:51 AM com.ontologycentral.ldspider.http.internal.CloseIdleConnectionThread run
INFO: Closing expired and idle connections
---------------Same lines repeatedly------------------


I hope this helps in figuring out the problem.