I came across two issues and I'm unsure whether I should push them.
1) BBC's content negotiation seems borken:
$ rapper -c "http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0"
rapper: Parsing URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 with parser rdfxml
rapper: Error - URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0:1 - Using property attribute 'lang' without a namespace is forbidden.
rapper: Error - URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 - Resolving URI failed: Failed writing body (0 != 2736)
rapper: Failed to parse URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 rdfxml content
rapper: Parsing returned 1 triple
$
Adding an .rdf to the filename works, but then the thing-source correspondence
is lost.
2) OpenCyc returns text/xml as content-type (e.g., at [2]), and I would like
them to return application/rdf+xml so that I don't have to feed all text/xml
files that I get into an RDF/XML parser.
Best regards,
Andreas.
[1] http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0
[2] http://sw.opencyc.org/concept/Mx4r-T6OkHdRS-eUiqO5n8NA1g
> Dear fellow pedants,
>
> I came across two issues and I'm unsure whether I should push them.
>
> 1) BBC's content negotiation seems borken:
>
> $ rapper -c "http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0"
> rapper: Parsing URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 with parser rdfxml
> rapper: Error - URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0:1 - Using property attribute 'lang' without a namespace is forbidden.
> rapper: Error - URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 - Resolving URI failed: Failed writing body (0 != 2736)
> rapper: Failed to parse URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 rdfxml content
> rapper: Parsing returned 1 triple
> $
Without more information -- namely what Accept header rapper is sending -- it's not clear that it's broken. My rapper has RDFa parsing built in, so it might well be asking for HTML.
I'll see if I can find out. In the meantime try:
$ rapper -c -g "http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0"
rapper: Parsing URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 with parser guess
raptor_guess.c:113:raptor_guess_parse_content_type_handler: Got content type 'text/html'
rapper: Guessed parser name 'rdfa'
rapper: Error - - XML parser error: AttValue: " or ' expected
rapper: Error - - XML parser error: attributes construct error
rapper: Error - - XML parser error: Specification mandate value for attribute og:image
rapper: Error - - XML parser error: attributes construct error
rapper: Error - URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 - Resolving URI failed: Failed writing body (0 != 2896)
rapper: Failed to parse URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 guess content
rapper: Parsing returned 13 triples
Damian
>
> On 25 Aug 2011, at 16:19, Andreas Harth wrote:
>
>> Dear fellow pedants,
>>
>> I came across two issues and I'm unsure whether I should push them.
>>
>> 1) BBC's content negotiation seems borken:
>>
>> $ rapper -c "http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0"
>> rapper: Parsing URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 with parser rdfxml
>> rapper: Error - URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0:1 - Using property attribute 'lang' without a namespace is forbidden.
>> rapper: Error - URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 - Resolving URI failed: Failed writing body (0 != 2736)
>> rapper: Failed to parse URI http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 rdfxml content
>> rapper: Parsing returned 1 triple
>> $
>
> Without more information -- namely what accept rapper is sending -- it's not clear that it's broken. My rapper has rdfa parsing built in, so it might well be asking for html.
(Apologies for the formatting, via ngrep)
GET /music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0 HTTP/1.1
Host: www.bbc.co.uk
Accept: application/rdf+xml, text/rdf;q=0.6, text/plain;q=0.1, text/turtle, application/x-turtle, application/turtle, text/n3;q=0.3, text/rdf+n3;q=0.3, application/rdf+n3;q=0.3, application/x-trig, application/rss;q=0.8, application/rss+xml;q=0.8, text/rss;q=0.8, application/xml;q=0.3, text/xml;q=0.3, application/atom+xml;q=0.3, text/html;q=0.2, application/xhtml+xml;q=0.4, text/html;q=0.6, application/xhtml+xml;q=0.8, text/x-nquads, */*;q=0.1
which looks fine to me. (x)html is well down the list.
And, indeed:
$ curl -I -H 'Accept: application/rdf+xml' http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0
HTTP/1.1 200 OK
Date: Thu, 25 Aug 2011 15:46:24 GMT
Server: Apache
Vary: Accept
Cache-Control: no-cache
Content-Type: text/html
Transfer-Encoding: chunked
Set-Cookie: BBC-UID=84ce357636bee5c0429f8949b050c91a7aecb2ec108040bc72c9830fa9824cbf0curl%2f7%2e21%2e6%20%28x86%5f64%2dapple%2ddarwin10%2e7%2e0%29%20libcurl%2f7%2e21%2e6%20OpenSSL%2f1%2e0%2e0d%20zlib%2f1%2e2%2e5%20libidn%2f1%2e22; expires=Mon, 24-Aug-15 15:46:24 GMT; path=/; domain=bbc.co.uk;
Damian
As Damian said, the server is returning HTML, so it's not clear that anything is really wrong.
> 2) OpenCyc returns text/xml as content-type (e.g., at [2]), and I would like
> them to return application/rdf+xml that I don't have to feed all text/xml
> files that I get into an RDF/XML parser.
Yeah, it's worth mentioning to them. See also:
http://pedantic-web.org/fops.html#contenttype
As a heuristic, you can scan the first 5k or so for the RDF namespace URI. If it occurs, it's worth throwing an RDF/XML parser at it.
(Any23 has these heuristics built-in for all its supported RDF syntaxes, and it makes life much easier.)
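
For what it's worth, here is a minimal sketch of that heuristic in Python. The function name looks_like_rdfxml and the decision to sniff only responses labelled text/xml or application/xml (or with no type at all) are my own assumptions for illustration; this is not how Any23 or any other tool actually does it.

```python
# The RDF namespace URI that RDF/XML documents declare.
RDF_NS = b"http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# "The first 5k or so": an arbitrary cut-off, tune as needed.
SNIFF_BYTES = 5 * 1024


def looks_like_rdfxml(body: bytes, content_type: str = "") -> bool:
    """Return True if the payload seems worth handing to an RDF/XML parser."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/rdf+xml":
        return True
    # Assumption: only sniff generic XML types or untyped responses.
    if media_type not in ("", "text/xml", "application/xml"):
        return False
    # Look for the RDF namespace URI near the start of the document.
    return RDF_NS in body[:SNIFF_BYTES]
```

A crawler would call this with the first chunk of the response body and the Content-Type header, and only invoke the RDF/XML parser when it returns True.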
Best,
Richard
many thanks for making Cyc available on the Semantic Web!
There is one small issue though: would it be possible to serve the RDF/XML
files with an "application/rdf+xml" content type rather than the more generic
"text/xml" one? That way, systems that use your data (and other people's)
can directly use the right parser.
If you need direction regarding the Apache configuration let me know.
Cheers,
Andreas.
I'm a bit confused about this... they already have an RDF/XML
description [1], which describes the artist [2]. It seems a waste not to
have [2] dereference to [1] when "Accept: application/rdf+xml" is
specified... or am I missing something?
Cheers,
Aidan
[1]
http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0.rdf
[2]
http://www.bbc.co.uk/music/artists/4cf1eab6-0a14-4ab0-8c11-38d4157f91e0#artist
> I'm a bit confused about this... they already have an RDF/XML
> description [1], which describes the artist [2]. It seems a waste not to
> have [2] dereference to [1] when "Accept: application/rdf+xml" is
> specified... or am I missing something?
A brain dump in lieu of actual thought on my part:
a) Content negotiating from [1] to the content of [2] would be ideal, I
agree. Client's wishes are fulfilled.
b) It's a shame that there seems to be no way to find [2] from [1]. A
link alternate would be useful.
c) [1] and [2] appear to contain the same triples (give or take some
rdfa artifacts).
So I agree, it is a waste. However I have mixed feelings about content
negotiation as a way to discover [2] and consider b) more problematic.
Damian
Here is some news regarding the BBC content negotiation issue.
Cheers
Bob
On 8/26/2011 12:07 PM, Nicholas Humfrey wrote:
> Hi Bob,
>
> Yes, content negotiation is currently broken. Hope to have it fixed (and
> actually doing proper content negotiation, rather than string matching) in
> the next release.
>
> nick.
>
>
> On 26/08/2011 09:04, "Bob Ferris"<za...@smiy.org> wrote:
>
>> Hi Nic,
>>
>> I don't know, whether you are already aware of the following ongoing
>> discussion on the pedantic web list. If not, please feel free to
>> interact there. AFAIK, you are dealing with these issues at BBC, or?
>>
>> Cheers,
>>
>>
>> Bob
>>
>>
>> -------- Original Message --------
>> Subject: [pedantic-web] Possibly minor issues with BBC and Cyc
>> Date: Thu, 25 Aug 2011 17:19:19 +0200
>> From: Andreas Harth<and...@harth.org>
>> Reply-To: pedant...@googlegroups.com
>> To: pedant...@googlegroups.com
>>
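
To illustrate the difference Nick mentions between string matching and proper content negotiation, here is a rough Python sketch of q-value based selection. The function names (parse_accept, quality, negotiate) are hypothetical and this is a simplified reading of the HTTP rules, not the BBC's implementation; I don't know what their string matching actually looked like.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def quality(media_type: str, ranges: List[Tuple[str, float]]) -> float:
    """q-value the client gives media_type; the most specific range wins."""
    wanted = media_type.lower()
    wanted_type = wanted.partition("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        range_type, _, range_subtype = media_range.partition("/")
        if media_range == wanted:
            specificity = 2          # exact match, e.g. text/html
        elif range_type == wanted_type and range_subtype == "*":
            specificity = 1          # type wildcard, e.g. text/*
        elif media_range == "*/*":
            specificity = 0          # full wildcard
        else:
            continue
        # On duplicates of equal specificity, keep the higher q (an assumption).
        if specificity > best_specificity or (
            specificity == best_specificity and q > best_q
        ):
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(header: str, available: List[str]) -> Optional[str]:
    """Pick the available media type with the highest q, or None if none fits."""
    ranges = parse_accept(header)
    best, best_q = None, 0.0
    for media_type in available:
        q = quality(media_type, ranges)
        if q > best_q:
            best, best_q = media_type, q
    return best
```

With an Accept header like the one rapper sends (application/rdf+xml at the default q=1, text/html at q=0.6 at most), negotiate() over ["text/html", "application/rdf+xml"] picks application/rdf+xml. A header such as "application/rdf+xml;q=0, text/html" shows where naive substring matching can go wrong: the string "rdf+xml" is present, but the client has explicitly ruled it out.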