Re: parsing documents that describe users

Melvin Carvalho

Jul 25, 2012, 8:23:38 AM

to Michiel de Jong, federated-social-web

On 25 July 2012 14:06, Michiel de Jong <mic...@unhosted.org> wrote:

On Wed, Jul 25, 2012 at 12:14 PM, Melvin Carvalho <melvinc...@gmail.com> wrote:

*every* page on the web supports content negotiation

ok, maybe i should have made my question more specific - can you give an example where the 'Accept' request header is being taken into account? For instance if i try:

curl -IH'Accept: application/json' http://melvincarvalho.com/

I still get text/html back. So either my curl command is wrong, or content negotiation (in the strict sense) is not supported for that resource.
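A minimal Python sketch of the same check, for anyone who wants to script it rather than eyeball curl output. The function names (`media_type`, `fetch_content_type`, `accept_was_honored`) are made up for illustration, and the comparison assumes a single media type is requested, as in the curl command above.

```python
import urllib.request


def media_type(header_value):
    """Return the bare media type, dropping parameters such as charset."""
    return header_value.split(";", 1)[0].strip().lower()


def fetch_content_type(url, accept):
    """Send a HEAD request with the given Accept header; return Content-Type."""
    request = urllib.request.Request(url, headers={"Accept": accept}, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Type", "")


def accept_was_honored(accept, content_type):
    """True if the response media type is the single type that was requested."""
    return media_type(accept) == media_type(content_type)


# Offline check using the result reported above: JSON requested, HTML returned.
print(accept_was_honored("application/json", "text/html"))  # prints False
```

Against a live server, `accept_was_honored(accept, fetch_content_type(url, accept))` would do the whole round trip; the network call is kept out of the example so it runs offline.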

The content type will tell you what you are getting and you parse it accordingly.

yes, that i do, of course. but my question was whether i can also influence it with a request header.

FB is not perfect, but certainly developer friendly, and better than most.

sure

it's well documented JSON in the open graph protocol. They tell you what every field means.

yes, but well documented != self-documenting.

Facebook is fully self documenting e.g. in tabulator

no, it is not. In the sense that if i have never heard of facebook, and am not aware that it's a proprietary walled garden with a specific proprietary API format, I retrieve the following:

curl -i https://graph.facebook.com/512908782

HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Cache-Control: private, no-cache, no-store, must-revalidate
Content-Type: text/javascript; charset=UTF-8
ETag: "82ed88dd63a251e2139b0ab085ae439532f2b0ec"
Expires: Sat, 01 Jan 2000 00:00:00 GMT
Pragma: no-cache
X-FB-Rev: 597466
X-FB-Debug: nPsD/2KFDF8gYv5mESFD+iRy3rNMkAKVeuey3dL/eYI=
Date: Wed, 25 Jul 2012 12:02:34 GMT
Connection: keep-alive
Content-Length: 204

{"id":"512908782","name":"Tim Berners-Lee","first_name":"Tim", /
"last_name":"Berners-Lee","link":"http:\/\/www.facebook.com\/tim.bernerslee.9", /
"username":"tim.bernerslee.9","gender":"male","locale":"en_US"}

Where is the documentation? How do i know what "link" means in this JSON document?

Type: Property
Comment: The user's full name. No `access_token` required. `string`.
Domain: user
Label: name

etc.

i tried this but it didn't work for me - anyway, it's interesting because it does what we want! do you know where/how tabulator retrieves this information? From the example above, I see no way to discover that info.

I think the following would be an answer to all of the above.

curl -H 'Accept: text/turtle' -i https://graph.facebook.com/512908782

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix api: <tag:graph.facebook.com,2011:/> .
@prefix og: <http://ogp.me/ns#> .
@prefix fb: <http://ogp.me/ns/fb#> .
@prefix : <http://graph.facebook.com/schema/~/> .
@prefix user: <http://graph.facebook.com/schema/user#> .
</512908782#>
    user:id "512908782" ;
    user:name "Tim Berners-Lee" ;
    user:first_name "Tim" ;
    user:last_name "Berners-Lee" ;
    user:link <http://www.facebook.com/tim.bernerslee.9> ;
    user:username "tim.bernerslee.9" ;
    user:gender "male" ;
    user:locale "en_US" .

so user:name will resolve to:

http://graph.facebook.com/schema/user#name

"name": { "description": "The user's full name. No `access_token` required. `string`.", "uri": "http://graph.facebook.com/schema/user#name" },

I'm sure this is all documented in the ogp protocol, but as you can see I just used curl with turtle and followed my nose to get there
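For the "follow your nose" step, here is a rough Python sketch of how a prefixed name like user:name expands to a full URI using the @prefix lines in the Turtle response. It uses a simple regular expression rather than a real Turtle parser, so it only handles plain @prefix declarations like the ones shown; `parse_prefixes` and `expand` are illustrative names.

```python
import re

# Two of the @prefix lines from the Turtle response above.
TURTLE_PREFIXES = """\
@prefix og: <http://ogp.me/ns#> .
@prefix user: <http://graph.facebook.com/schema/user#> .
"""

# Matches "@prefix name: <uri> ." (the name may be empty, as in "@prefix : <...> .").
PREFIX_RE = re.compile(r"@prefix\s+([A-Za-z][\w-]*)?:\s*<([^>]*)>\s*\.")


def parse_prefixes(turtle):
    """Map each declared prefix to its namespace URI."""
    return {m.group(1) or "": m.group(2) for m in PREFIX_RE.finditer(turtle)}


def expand(prefixed_name, prefixes):
    """Expand e.g. 'user:name' to the full URI it abbreviates."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


prefixes = parse_prefixes(TURTLE_PREFIXES)
print(expand("user:name", prefixes))
# prints http://graph.facebook.com/schema/user#name
```

The expanded URI is the thing you then dereference to find the description of the property.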

Facebook deserve a huge amount of credit for getting this far ahead of their rivals. FSW just needs to emulate this and we'll have an awesome network 1 billion+ strong, where the users choose which providers they want to use.

Melvin Carvalho

Jul 25, 2012, 9:49:04 AM

to Michiel de Jong, federated-social-web

On 25 July 2012 15:34, Michiel de Jong <mic...@unhosted.org> wrote:

On Wed, Jul 25, 2012 at 2:23 PM, Melvin Carvalho <melvinc...@gmail.com> wrote:

curl -H 'Accept: text/turtle' -i https://graph.facebook.com/512908782

that's supercool! i'm going to add that. So then we'll have 5 parsers: json, html, rdf, xrd, turtle. I'll send "Accept: text/turtle" by default so that i get turtle whenever it's supported.
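One way the five parsers could be wired up is a lookup on the response Content-Type, sketched below in Python. The parser names are just labels, `pick_parser` is a hypothetical helper, and treating text/javascript as JSON is an assumption based on the Facebook response earlier in the thread.

```python
# Media type -> parser label. The labels match the five parsers listed above.
PARSER_FOR_TYPE = {
    "application/json": "json",
    "text/javascript": "json",  # assumption: Facebook served JSON under this type
    "text/html": "html",
    "application/rdf+xml": "rdf",
    "application/xrd+xml": "xrd",
    "text/turtle": "turtle",
}


def pick_parser(content_type):
    """Return the parser label for a Content-Type header value, or None."""
    # Drop parameters such as "; charset=UTF-8" before the lookup.
    bare_type = content_type.split(";", 1)[0].strip().lower()
    return PARSER_FOR_TYPE.get(bare_type)


print(pick_parser("text/javascript; charset=UTF-8"))  # prints json
print(pick_parser("text/turtle"))  # prints turtle
```

Returning None for an unknown type leaves it to the caller to decide whether to skip the document or fall back to sniffing the body.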

See the wiki page on content negotiation, the Accept header works like this:

Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1

You give it a list: "This is what I can parse", and you state your preference order.

The server will give you back the form it has that ranks highest in your list (the highest q value). The Content-Type header says "this is what I have"

In the case of facebook they have both json and turtle

In the case of status.net they have HTML and FOAF

In many cases it will be JSON only

CSS and fonts work in a similar way
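To make the selection rule concrete, here is a simplified Python sketch of what a negotiating server might do with the Accept header above: parse the q values, then return the available form with the highest q. It is not what any particular server implements, all function names are illustrative, and many real servers simply ignore Accept and return their default.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1.0 when omitted
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    return prefs


def specificity(media_range):
    """Rank exact types above type/* above */*."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2


def matches(media_range, media_type):
    """True if media_type falls within media_range."""
    if media_range == "*/*":
        return True
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = media_type.partition("/")
    return r_type == m_type and r_sub in ("*", m_sub)


def quality(media_type, prefs):
    """q value of the most specific range matching media_type, else 0.0."""
    best = None
    for media_range, q in prefs:
        if matches(media_range, media_type):
            candidate = (specificity(media_range), q)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best[1] if best else 0.0


def choose(available, accept_header):
    """Pick the available media type the client ranks highest, or None."""
    prefs = parse_accept(accept_header)
    best_type, best_q = None, 0.0
    for media_type in available:
        q = quality(media_type, prefs)
        if q > best_q:
            best_type, best_q = media_type, q
    return best_type


ACCEPT = ("text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, "
          "image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1")
# A server with JSON and Turtle: text/* (0.8) beats */* (0.1).
print(choose(["application/json", "text/turtle"], ACCEPT))  # prints text/turtle
```

With only JSON available, the same header still yields application/json through the `*/*; q=0.1` catch-all, which matches the "in many cases it will be JSON only" situation.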

Melvin Carvalho

Jul 25, 2012, 12:17:37 PM

to Michiel de Jong, federated-social-web

On 25 July 2012 18:06, Michiel de Jong <mic...@unhosted.org> wrote:

On Wed, Jul 25, 2012 at 3:49 PM, Melvin Carvalho
<melvinc...@gmail.com> wrote:
> See the wiki page on content negotiation, the Accept header works like this:
>
> Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg;
> q=0.6, image/*; q=0.5, */*; q=0.1

ok, i'll send text/turtle; q=1.0, application/xrd+xml; q=0.8,
application/rdf+xml; q=0.6, application/json; q=0.4, text/html; q=0.2
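A small Python sketch of turning that preference list into an Accept header value, with each weight written out as a q parameter. `build_accept` is a hypothetical helper, and the one-decimal formatting is just a choice made here.

```python
# Preference list from the message above: (media type, weight).
PREFERENCES = [
    ("text/turtle", 1.0),
    ("application/xrd+xml", 0.8),
    ("application/rdf+xml", 0.6),
    ("application/json", 0.4),
    ("text/html", 0.2),
]


def build_accept(preferences):
    """Join (media_type, q) pairs into an Accept header value."""
    return ", ".join(f"{media_type};q={q:.1f}" for media_type, q in preferences)


print(build_accept(PREFERENCES))
# prints text/turtle;q=1.0, application/xrd+xml;q=0.8, application/rdf+xml;q=0.6, application/json;q=0.4, text/html;q=0.2
```

The resulting string can be passed as the value of the Accept header in whatever HTTP client is in use.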

> In the case of status.net they have HTML and FOAF

this didn't work when i tried it, and also i think foaf is not a
content-type, its content-type would be application/rdf+xml i think.
also, i think in the case of StatusNet they are available as
https://identi.ca/michielbdejong for the html and
https://identi.ca/michielbdejong/foaf for foaf. Google does it by
adding a '&fmt=foaf' query parameter.

so many different ways of doing things, but i'm creating a unit test
for each one, so i'll get them all working. :)

Probably a case of the programmer putting something together in a hurry
