Problem with the Spotlight endpoint: timeout

Simone Romero

Jun 18, 2015, 8:44:59 PM
to rm...@googlegroups.com
Hi, 

I'm trying to use the DBpedia Spotlight Linker operator to enrich some data (data.csv attached), 
but I receive a timeout error, as shown in the attached image.

I used the default SPARQL endpoint: http://dbpedia.org/sparql/ 

However, I receive this error only with the English Service URL (Spotlight Linker parameters). 
I tried other Service URLs, such as Portuguese and Italian, and they worked normally.

Any idea?

Thanks in advance.

Simone Romero


data.csv
testSpotlight.rmp
timeout_spotlight.png

Heiko Paulheim

Jun 22, 2015, 6:28:19 AM
to Simone Romero, rm...@googlegroups.com
Hi Simone,

Have you tried the English service via other means, e.g., a Web browser? If that does not work either, I'd assume the problem is on the Spotlight side rather than within the RM LOD extension.

If it works in a browser, but not in RM LOD, we'll have a look. Do you use any specific proxy settings?

Best,
Heiko
--
You received this message because you are subscribed to the Google Groups "RapidMiner Linked Open Data Extension" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rmlod+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
Prof. Dr. Heiko Paulheim
Data and Web Science Group
University of Mannheim
Phone: +49 621 181 2646
B6, 26, Room C1.08
D-68159 Mannheim

Mail: he...@informatik.uni-mannheim.de
Web: www.heikopaulheim.com

Petar Ristoski

Jun 22, 2015, 6:50:30 AM
to rm...@googlegroups.com, he...@informatik.uni-mannheim.de, simone.r...@gmail.com
Hi Simone,

Heiko is right, the official English DBpedia Spotlight service is down (you can check that here).
However, there is a workaround for this problem: first, select the "Custom" option for the "Service URL" parameter of the "DBpedia Spotlight Linker" operator. This reveals an additional text field parameter, "Custom Service URL". Set its value to "http://spotlight.dbpedia.org/", then run the process again.
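To check outside RapidMiner whether a given Spotlight service URL answers at all, a small script along the following lines may help. It is only a sketch: the "rest/annotate" path and the "text" and "confidence" parameter names are assumptions about the Spotlight REST interface, and the helper names are made up for illustration.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request, urlopen


def build_annotate_request(base_url, text, confidence=0.5):
    """Build a GET request for a Spotlight-style annotate call.

    Assumption: the service exposes "rest/annotate" below the base URL
    and accepts "text" and "confidence" as query parameters.
    """
    if not base_url.endswith("/"):
        base_url += "/"
    query = urlencode({"text": text, "confidence": confidence})
    url = urljoin(base_url, "rest/annotate") + "?" + query
    return Request(url, headers={"Accept": "application/json"})


def service_responds(base_url, timeout=10):
    """Return True if the service answers a tiny request with HTTP 200
    within `timeout` seconds, False on any network or HTTP error."""
    request = build_annotate_request(base_url, "Berlin is a city.")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

Calling service_responds("http://spotlight.dbpedia.org/") would then tell you whether that URL is worth entering as the "Custom Service URL".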

Hope this works for you.

Regards,

Petar


Simone Romero

Jun 22, 2015, 8:52:25 AM
to Petar Ristoski, rm...@googlegroups.com, he...@informatik.uni-mannheim.de
Hi Heiko and Petar,

Thanks for your responses. The Custom Service URL worked very well!

I have another question:

I tried to use other linkers, such as the Pattern-based and Label-based Linker, but I receive a similar error (attached image).
In this case, can I set another Service URL, as you suggested for the Spotlight operator? How can I do that? 
Apparently, these operators do not offer this option.

Cheers,
Simone Romero



--
Simone Romero
B.Sc. in Computer Science - Universidade Estadual do Oeste do Paraná

ProcessFailedLabel-Based.png

Petar Ristoski

Jun 22, 2015, 1:58:54 PM
to rm...@googlegroups.com, simone.r...@gmail.com, he...@informatik.uni-mannheim.de, simone.r...@gmail.com
Hi Simone,

Based on the error you got, the DBpedia SPARQL endpoint was unavailable at the time you tried to use the Label-based Linker. 
Please try to run the process again. If you get the same error, please check whether the DBpedia SPARQL endpoint is available by running a simple query on the official DBpedia SPARQL endpoint.
If the problem is not caused by the DBpedia endpoint itself, please send us your RapidMiner process and an excerpt of your input file.
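One way to run such a simple query without a browser is sketched below. It sends a trivial ASK query over the standard SPARQL protocol (a "query" parameter on a GET request) and reads the "boolean" field of the JSON result. The function names and the timeout value are illustrative choices, not part of the extension.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Trivial query: is there at least one triple in the store?
ASK_QUERY = "ASK { ?s ?p ?o }"


def build_sparql_request(endpoint, query=ASK_QUERY):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def endpoint_available(endpoint, timeout=15):
    """Return True if the endpoint answers the ASK query with `true`
    within `timeout` seconds, False on errors or unexpected replies."""
    try:
        with urlopen(build_sparql_request(endpoint), timeout=timeout) as response:
            result = json.load(response)
    except (OSError, ValueError):
        # Network errors, HTTP errors, timeouts, or a reply that is not JSON.
        return False
    return isinstance(result, dict) and result.get("boolean") is True
```

For example, endpoint_available("http://dbpedia.org/sparql/") returns False when the endpoint times out or answers with an HTTP error such as 500.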

Regards,

Petar

Simone Romero

Jun 22, 2015, 6:33:59 PM
to Petar Ristoski, rm...@googlegroups.com, Heiko Paulheim
Hi Petar,

I ran the process over and over again, but I still get the same error. I checked http://dbpedia.org/sparql/ and it is working. 
My RapidMiner process (attached) is simple, just a test of the LOD extension. I use the same data as in the first e-mail (data.csv).

Cheers,
--
Simone Romero
B.Sc. in Computer Science - Universidade Estadual do Oeste do Paraná
Master's student, Graduate Program in Computing - UFRGS
testLabelBased.rmp

Petar Ristoski

Jun 22, 2015, 7:45:46 PM
to rm...@googlegroups.com, simone.r...@gmail.com, he...@informatik.uni-mannheim.de, simone.r...@gmail.com
Hi Simone,

For your type of data I would recommend using the DBpedia Spotlight operator rather than the label-based linker. The label-based linker is designed to be used with a list of entities as input, e.g., a list of countries. Therefore it will not give you useful results for your data. 
Furthermore, the label-based operator links the input entities with LOD entities based on the rdfs:label. In your case, where you have enabled the "Search by n-grams" parameter, the operator will first generate all possible n-grams (where n ranges between 1 and the number of tokens in the tweet), and for each of them it will issue a separate SPARQL query to the endpoint, trying to find the best matching entity based on the rdfs:label. This will always cause a timeout because of the complexity of the task.
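To give a feeling for how quickly the number of queries grows, here is a minimal sketch of n-gram generation. It assumes plain whitespace tokenization, which may differ from what the operator actually does; the function name is made up for illustration.

```python
def word_ngrams(text):
    """Return all contiguous word n-grams of `text`,
    for n from 1 up to the number of tokens."""
    tokens = text.split()  # assumption: simple whitespace tokenization
    return [
        " ".join(tokens[start:start + n])
        for n in range(1, len(tokens) + 1)
        for start in range(len(tokens) - n + 1)
    ]


print(word_ngrams("New York City"))
# ['New', 'York', 'City', 'New York', 'York City', 'New York City']

# A text with k tokens yields k * (k + 1) / 2 n-grams, i.e. that many queries.
print(len(word_ngrams(" ".join(["word"] * 20))))  # 210
```

So a single 20-token tweet already means 210 separate SPARQL queries, and that number is multiplied by the number of rows in the input.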

Regards,

Petar

Simone Romero

Jun 22, 2015, 9:30:27 PM
to Petar Ristoski, rm...@googlegroups.com, Heiko Paulheim
Petar, 

Following your explanation, and with the same process, I tried to use the Label-Based operator with a list of cities (and nouns), but, again, I receive an error (HTTP 500 error making the query).

Is there a particular format the list needs to follow?

What am I doing wrong?

Cheers,

city.csv
errorLabel2.png
nouns.csv

Petar Ristoski

Jun 23, 2015, 5:22:50 AM
to rm...@googlegroups.com, simone.r...@gmail.com, he...@informatik.uni-mannheim.de, simone.r...@gmail.com
Hi Simone,

For your use case you should use the default settings of the operator, i.e., "Search by N-Grams" and "Detect column class type" should be set to false. However, even then you might get a timeout exception, because the DBpedia endpoint currently seems to be under heavy load. 
Instead of the label-based operator, I would recommend using the "DBpedia Lookup Linker" operator (select the corresponding type you want to link, e.g., City), or the "pattern-based linker" operator (in combination with the "web validator").

Regards,

Petar

Simone Romero

Jun 23, 2015, 8:20:16 AM
to Petar Ristoski, rm...@googlegroups.com, Heiko Paulheim
Hi Petar,

I will try these other operators.
Thanks a lot!

Cheers,
