Re: how to convert boards.ie (SIOC formatted) data into some database


Jodi Schneider

Oct 12, 2012, 7:11:33 AM
to icwsm...@googlegroups.com, sioc...@googlegroups.com
Hi Shahzad,

If your main need is to query the data, the best sort of database to use would be a triple store [1]. Triple stores are designed for RDF data such as that in SIOC. It is possible to convert from relational databases to RDF databases; there are a variety of tools, and the W3C RDB2RDF group webpage might be helpful in identifying a suitable mapping tool [2].
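To make the triple model concrete, here is a toy sketch in plain Python. It is not a real triple store and not the SPARQL you would write against one; it only shows how data stored as (subject, predicate, object) triples can be queried by pattern. The `sioc:` property names are from the SIOC ontology, but the example.org URIs, the sample text, and the helper names `match` and `posts_in_thread` are hypothetical.

```python
# Toy illustration of the triple model; a real triple store would replace this.
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical sample data: URIs and text are made up for illustration.
triples = [
    ("http://example.org/post/1", RDF_TYPE, SIOC + "Post"),
    ("http://example.org/post/1", SIOC + "has_container", "http://example.org/thread/10"),
    ("http://example.org/post/1", SIOC + "content", "First post"),
    ("http://example.org/post/2", RDF_TYPE, SIOC + "Post"),
    ("http://example.org/post/2", SIOC + "has_container", "http://example.org/thread/10"),
    ("http://example.org/post/2", SIOC + "content", "A reply"),
    ("http://example.org/post/3", RDF_TYPE, SIOC + "Post"),
    ("http://example.org/post/3", SIOC + "has_container", "http://example.org/thread/11"),
    ("http://example.org/post/3", SIOC + "content", "Another thread"),
]


def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def posts_in_thread(store, thread_uri):
    """Return (post URI, content) pairs for posts contained in thread_uri."""
    result = []
    for post, _, _ in match(store, p=SIOC + "has_container", o=thread_uri):
        for _, _, text in match(store, s=post, p=SIOC + "content"):
            result.append((post, text))
    return result


thread_posts = posts_in_thread(triples, "http://example.org/thread/10")
```

An actual triple store does the same kind of pattern matching with indexes and a query language, so "all posts in a thread" becomes a single query rather than a join you design a schema for.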

Documentation on SIOC may also be helpful:

Perhaps others will have additional advice for you?

Best,

Jodi


On Fri, Oct 12, 2012 at 8:52 AM, shahzad <mshahza...@gmail.com> wrote:

Greetings

I have obtained the boards.ie data set from ICWSM.

I want to use the thread and post conversation text for mining purposes. For this I need all threads, with their respective posts, in some database.

But the data is in SIOC format.

Kindly suggest how to convert this SIOC-formatted data into a database such as MySQL or SQL Server.

Regards

--
You received this message because you are subscribed to the Google Groups "icwsm-data" group.
To view this discussion on the web visit https://groups.google.com/d/msg/icwsm-data/-/xP5OG1-dvycJ.
To post to this group, send email to icwsm...@googlegroups.com.
To unsubscribe from this group, send email to icwsm-data+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/icwsm-data?hl=en.

Phil Smith

Oct 12, 2012, 8:47:26 AM
to sioc...@googlegroups.com, icwsm...@googlegroups.com
Hi Shahzad,

I agree with Jodi: a triple store would be the best approach. If it absolutely has to go into a relational database, perhaps an XSLT stylesheet could transform your dataset into a more compatible format?

Also: where did you get the boards.ie data set from?

Regards,
phil


Phil Smith

Oct 12, 2012, 9:07:34 AM
to sioc...@googlegroups.com, icwsm...@googlegroups.com, jschn...@pobox.com
Hi Shahzad,

I agree with Jodi: a triple store would be the best approach. If it absolutely has to go into a relational database, perhaps XSLT could transform your dataset into a more compatible schema?

Also: where did you get the boards.ie data set from?

Regards,
Phil

Amendra Shrestha

Jun 13, 2013, 4:48:28 AM
to sioc...@googlegroups.com, icwsm...@googlegroups.com, jschn...@pobox.com
You can write your own parser that extracts only the tags you need from the RDF files and stores them in a relational database.
I also used the boards.ie data for my master's thesis. I parsed the SIOC-format files in Java using a SAX parser and stored the results in MySQL tables.
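The approach described above can be sketched as follows. This is not Amendra's code: it swaps in Python's `xml.sax` for the Java SAX parser and an in-memory SQLite database for MySQL, so that it runs with the standard library alone. It assumes each post is a top-level `sioc:Post` element whose thread is given by `sioc:has_container` with an `rdf:resource` attribute and whose text is in `sioc:content`; the real boards.ie files may be laid out differently. The sample document, the `posts` table, and the names `PostHandler`, `parse_posts`, and `store_posts` are hypothetical.

```python
import io
import sqlite3
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

SIOC_NS = "http://rdfs.org/sioc/ns#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class PostHandler(ContentHandler):
    """Collects top-level sioc:Post elements; everything else is ignored."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._post = None
        self._in_content = False
        self._text = []
        self._depth = 0  # rdf:RDF is depth 1, posts 2, their properties 3

    def startElementNS(self, name, qname, attrs):
        self._depth += 1
        if self._depth == 2 and name == (SIOC_NS, "Post"):
            uri = attrs.get((RDF_NS, "about"))
            self._post = {"uri": uri, "thread": None, "content": ""}
        elif self._depth == 3 and self._post is not None:
            if name == (SIOC_NS, "has_container"):
                self._post["thread"] = attrs.get((RDF_NS, "resource"))
            elif name == (SIOC_NS, "content"):
                self._in_content = True
                self._text = []

    def characters(self, data):
        # SAX may deliver text in several chunks, so accumulate them.
        if self._in_content:
            self._text.append(data)

    def endElementNS(self, name, qname):
        if self._in_content and self._depth == 3:
            self._post["content"] = "".join(self._text)
            self._in_content = False
        elif self._depth == 2 and self._post is not None:
            self.posts.append(self._post)
            self._post = None
        self._depth -= 1


def parse_posts(xml_bytes):
    """Parse RDF/XML bytes and return a list of post dictionaries."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = PostHandler()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.posts


def store_posts(conn, posts):
    """Create the (hypothetical) posts table if needed and insert the posts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(uri TEXT PRIMARY KEY, thread TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO posts VALUES (:uri, :thread, :content)", posts
    )
    conn.commit()


# Hypothetical sample; the nested sioc:Post inside has_reply is skipped.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sioc="http://rdfs.org/sioc/ns#">
  <sioc:Post rdf:about="http://example.org/post/1">
    <sioc:has_container rdf:resource="http://example.org/thread/10"/>
    <sioc:content>First post</sioc:content>
  </sioc:Post>
  <sioc:Post rdf:about="http://example.org/post/2">
    <sioc:has_container rdf:resource="http://example.org/thread/10"/>
    <sioc:content>A reply</sioc:content>
    <sioc:has_reply><sioc:Post rdf:about="http://example.org/post/3"/></sioc:has_reply>
  </sioc:Post>
</rdf:RDF>"""

conn = sqlite3.connect(":memory:")
store_posts(conn, parse_posts(SAMPLE))
rows = conn.execute(
    "SELECT uri, content FROM posts WHERE thread = ? ORDER BY uri",
    ("http://example.org/thread/10",),
).fetchall()
```

Because SAX streams the file rather than building a tree, the same handler should cope with large dump files; for a real run you would pass an open file to the parser and point the connection at your own database.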

Jodi Schneider

Jun 13, 2013, 4:51:37 AM
to Amendra Shrestha, sioc...@googlegroups.com, icwsm...@googlegroups.com
Amendra,

On Thu, Jun 13, 2013 at 9:48 AM, Amendra Shrestha <amendra...@gmail.com> wrote:
You can write your own parser that extracts only the tags you need from the RDF files and stores them in a relational database.
I also used the boards.ie data for my master's thesis. I parsed the SIOC-format files in Java using a SAX parser and stored the results in MySQL tables.

Perhaps you'd be willing to share the code? I'm sure others would find it useful.

-Jodi

Amendra Shrestha

Jun 13, 2013, 6:32:59 AM
to sioc...@googlegroups.com, icwsm...@googlegroups.com, jschn...@pobox.com
I have hosted the code on Google Code. You can find it at the following link:

http://board-ie-parser.googlecode.com/svn/trunk/

You will need to edit the database file so that it uses your own database name.
Thank you.



Jodi Schneider

Jun 13, 2013, 6:57:00 AM
to Amendra Shrestha, shahzad, sioc...@googlegroups.com, icwsm...@googlegroups.com
Shahzad, if you're still interested in the boards.ie data, the code below should help you:
http://board-ie-parser.googlecode.com/svn/trunk/

Amendra parsed the boards.ie SIOC files into MySQL tables using Java with the SAX parser. 

Thanks, Amendra! It would also be great to get a link to your Master's thesis when it's done.

-Jodi