support for csv

Giovanni Tummarello

Jun 19, 2011, 5:34:20 AM

to any23-dev

Guys, could you please add support for CSV?

The modelling must be very simple:

1 entity called "table" which describes how many nodes, how many rows, and gives the names of the rows

1 entity PER ROW with

a) properties generated from the header names and corresponding values (must be URIfied, e.g. spaces removed etc.; use the to-URI functions of course)

b) an extra property called rownumber which counts up progressively: 1, 2, 3, etc.
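
A minimal sketch of the modelling above, in plain Python rather than Any23's own code. The `urify` helper is a hypothetical stand-in for the existing to-URI functions, the keys of the "table" entity are made-up names, and reading "nodes" as columns is an assumption:

```python
import csv
import io
import re


def urify(name: str) -> str:
    """Hypothetical stand-in for the to-URI helper: keep only URI-safe characters."""
    return re.sub(r"[^A-Za-z0-9_-]", "", name.strip())


def model_csv(text: str) -> dict:
    """Build one 'table' entity plus one entity per data row."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {"table": {"columns": 0, "rows": 0, "names": []}, "rows": []}
    header, body = rows[0], rows[1:]
    props = [urify(h) for h in header]
    # The "table" entity: sizes and header names (key names are assumptions).
    table = {"columns": len(header), "rows": len(body), "names": header}
    entities = []
    for number, cells in enumerate(body, start=1):
        entity = dict(zip(props, cells))   # a) one property per header name
        entity["rownumber"] = number       # b) progressive row number
        entities.append(entity)
    return {"table": table, "rows": entities}


result = model_csv("first name,age\nAda,36\nAlan,41\n")
print(result["rows"][0])
```

Here the header `first name` becomes the property `firstname`, and each row entity carries its own `rownumber`.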

Shall we stick it in 0.6? Cheers

Gio

Davide Palmisano

Jun 19, 2011, 7:40:00 AM

to any2...@googlegroups.com

it's ok for me.
but: which vocab shall we use? something like:
http://any23.org/vocab/csv/{property} ?
or runtime declared by the user ?

cheers,

Davide

--
Davide Palmisano

http://davidepalmisano.com
http://twitter.com/dpalmisano

Giovanni Tummarello

Jun 19, 2011, 10:30:03 AM

to any2...@googlegroups.com

Sorry, I also wanted to suggest this earlier.

Check if the column name is a valid URL. If it is, just keep that as a property name.

If it isn't, then add any23/vocab/csv/rowname, why not (plus URLify of course).

It would make sense to also declare that property as a property indeed and give it a label equal to the row name of course.

Also, the row entities should not be blank nodes but indeed have real URIs, taken from the URL of the file, e.g. http://g1o.net/friends.csv#row13

To make it super cool, you might want to check the CELL values, and if a value is a URL then don't use a literal but a URI property.
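
The rules above could be sketched roughly as follows. This is plain Python, not Any23 code; the `VOCAB` namespace is an assumption based on the vocabulary URL floated earlier in the thread, and the `("uri" | "literal", value)` tuples are just an illustrative way to mark the object type. Declaring each property and giving it a label is left out to keep the sketch short:

```python
import csv
import io
import re
from urllib.parse import urlsplit

# Assumed namespace, based on the suggestion earlier in the thread.
VOCAB = "http://any23.org/vocab/csv/"


def is_url(value: str) -> bool:
    """Treat a value as a URL only if it has an http(s) scheme and a host."""
    try:
        parts = urlsplit(value.strip())
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def property_uri(column: str) -> str:
    """Keep a column name that is already a URL, otherwise mint a vocab URI."""
    column = column.strip()
    if is_url(column):
        return column
    return VOCAB + re.sub(r"[^A-Za-z0-9_-]", "", column)


def row_triples(text: str, file_url: str) -> list:
    """Emit (subject, predicate, object) tuples with real row URIs."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    props = [property_uri(h) for h in rows[0]]
    triples = []
    for number, cells in enumerate(rows[1:], start=1):
        subject = f"{file_url}#row{number}"  # row URI derived from the file URL
        for prop, cell in zip(props, cells):
            kind = "uri" if is_url(cell) else "literal"
            triples.append((subject, prop, (kind, cell.strip())))
    return triples


text = "name,http://xmlns.com/foaf/0.1/homepage\nAda,http://example.org/ada\n"
triples = row_triples(text, "http://g1o.net/friends.csv")
for triple in triples:
    print(triple)
```

In this sample the `name` column gets a minted vocabulary URI, while the second column keeps its own URL as the property and its cell value is emitted as a URI rather than a literal.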

:) This has to be the easiest way to get data into Sindice ever.

Gio

Davide Palmisano

Jun 19, 2011, 10:32:38 AM

to any2...@googlegroups.com

ok. makes sense to me. I'm going to open an issue and then tomorrow me and Mic
will discuss a bit more.

cheers,

Davide

Richard Cyganiak

Jun 19, 2011, 12:04:32 PM

to any2...@googlegroups.com, Giovanni Tummarello, Michael Hausenblas

There's an IETF draft for a CSV fragment identifier syntax that would yield identifiers for the rows and columns of a CSV file:

http://tools.ietf.org/html/draft-hausenblas-csv-fragment-00

This is an early draft and will most certainly still undergo some changes, but AFAIK this is *exactly* what it's intended for :-)

CC Michael who is one of the authors of this draft.

Best,
Richard

Michael Hausenblas

Jun 19, 2011, 12:08:38 PM

to Richard Cyganiak, Giovanni Tummarello, any2...@googlegroups.com

> This is an early draft and will most certainly still undergo some
> changes, but AFAIK this is *exactly* what it's intended for :-)

Yup, it will be cut down ;)

Schema: http://vocab.deri.ie/scsv

Example: http://omnidator.appspot.com/ (use the 'CSV to JSON' example
as a basis and switch to RDF output) ...

Cheers,
Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

Giovanni Tummarello

Jun 19, 2011, 6:05:50 PM

to Richard Cyganiak, any2...@googlegroups.com, Michael Hausenblas

Thanks for pointing this out.

I think there is a conceptual mismatch in this, however.

The schema is intended mostly as something that SERVERS should implement.

E.g. it says that if you ask for http://example.com/data.csv#head the
server should return the first row.

Same thing if you ask for http://example.com/data.csv#row:2: you get the
second row (but you can ask for more fancy slicing etc.).

This is really not what would happen if we name our resources like
that (e.g. http://example.com/data.csv#row:2 for row 2 in RDF):
dereferencing http://example.com/data.csv#row:2 you still get the full
table, and you get it in RDF, not in CSV, which is what the specification
obviously expects.

So I am really not sure how this applies.

Please explain in detail.

Gio

Giovanni Tummarello

Jun 19, 2011, 6:21:36 PM

to Michael Hausenblas, Richard Cyganiak, any2...@googlegroups.com

Hi Michael

I did look at the Omnidator example, and that's why I came up with the
specs above, which are different and not compatible with that ontology,
given I won't have URIs for cells. Cells instead are the literals
attached directly to the node of a row.

Please be aware of the issue above wrt your conversion to RDF: hosting
an RDF file produced the Omnidator way would not comply with
the IETF draft. (I mean a semantic web client would understand that as
a URI, but that's not what the draft says.)

Gio

Michael Hausenblas

Jun 20, 2011, 3:11:50 AM

to Giovanni Tummarello, Richard Cyganiak, any2...@googlegroups.com

> the schema is intended mostly as something that SERVERS should
> implement ..
>
> e.g. it says that if you ask http://example.com/data.csv#head the
> server should return the first row

Unless HTTPbis intends to allow sending the fragment ID with the
request, the server will never see the #head part, hence a hash-based
solution is confined to the User Agent.

Cheers,
Michael

Michael Hausenblas

Jun 20, 2011, 3:15:53 AM

to Giovanni Tummarello, Richard Cyganiak, any2...@googlegroups.com

> please be aware of the issue above wrt your conversion to RDF, hosting
> an RDF file produced the way of the omnidator would not comply with
> the IETF draft. (i mean a semantic web client would understand that as
> a uri but that's not what the draft says)

The current IETF draft was the first stab. I implemented both client
and server versions, see [1] for the code and [2] for a Node.js
deployment.

Omnidator, together with the vocab.deri.ie stuff, is now the second
revision. Consider the -01 version of the IETF draft to be along the
lines of what Omnidator does. ETA of the new IETF draft is early July.

Cheers,
Michael

[1] https://github.com/mhausenblas/addrable
[2] http://addrable.no.de/

Giovanni Tummarello

Jun 20, 2011, 3:48:34 AM

to Michael Hausenblas, Richard Cyganiak, any2...@googlegroups.com

>> e.g. it says that if you ask http://example.com/data.csv#head the
>> server should return the first row
>
>
> Unless HTTPbis intents to allow to send the fragment ID with the request,
> the server will never see the #head part, hence a hash-based solution is
> confined to the User Agent.
>

Oops, true!
Thanks for the clarification.
Gio
