Convert Excel Spreadsheet to RDF

1,867 views
Skip to first unread message

OntoLearner

unread,
Oct 29, 2012, 2:38:18 PM10/29/12
to topbrai...@googlegroups.com

Hello,

I am trying to convert the below spreadsheet using SPARQL.

I would like to create a taxonomy, which has top 3, 4, or 5 categories with the term as the lowest being element. The term can have a synonym and/or acronym, and a definition.

Because the synonym, acronym, and definition may exist or not, I want to make them optional in my query.

Hopefully someone can help with the above.

I am able to import the spreadsheet into spreadsheet ontology inside TBCME, but I am struggling to create the categories together with the terms and its synonym, acronym or definition.

Category 1 Category 2 Category 3 Term Term synonym (skos:altLabel) Term Acronym(termAcronym) Term Definition (skos:definition)
Test 1 Test 2 Test 3 Term1 Acronym 1
Test 1 Test 2 Test 3 Term2 Synonym 2 Acronym 2 Sample definition
Test 1 Test 2 Test 3 Term3 Synonym 3
Test 1 Test 2 Test 3 Term4 Synonym 4 Acronym 4 Sample definition
Test 1 Test 2 Test 3 Term5 Synonym 5 Acronym 5
Test 1 Test 2 Test 3 Term6 Synonym 6
Test 1 Test 2 Test 3 Term7 Synonym 7 Acronym 7
Test 1 Test 2 Test 3 Term8 Acronym 8
Test 1 Test 2 Test 3 Term9 Synonym 9 Acronym 9
Test 1 Test 2 Test 3 Term10 Synonym 10 Acronym 10
Test 1 Test 2 Test 3 Term11 Acronym 11

Thank you for your help.

Scott Henninger

unread,
Oct 29, 2012, 3:49:16 PM10/29/12
to TopBraid Suite Users
OntoLearner; I wasn't clear on how you were getting the spreadsheet
into triples. Here are two possibilities:

1. Tab-delimited file

In this case, I'd suggest using tab-delimited import (see Help > >
Importing Data Sources > Import external information > Import
Spreadsheets
Import Tab-Separated Spreadsheets, part 2 "Importing the spreadsheet
to the currently selected ontology".

This would require adding a first column that named the class you
import into (row 0, column 0) and each instance (name them anything,
you can always re-name them via SPARQL transforms later. An outline
of the process:
a. Create a file and import SKOS
b. Create a class definition that corresponds to row 0, column 0
c. Right-click the file in the Navigator and choose Import >
TopBraid Composer > Import Tab-Delimited Spreadsheet File
d. In first page of wizard, be sure to choose "Import to the current
ontology"
e. In last page of wizard, you can change the types (Property Range)
to skos:altLabel, skos:definition, etc

2. Excel (.xls) import

In this case, open the file in Excel and save as a .xls file. Place
in Composer's navigator (copy/paste or drag/drop) and open this
directly in Composer - open with... > TopBraid (Excel Documents)

The effect will be similar to 1., except that the instances and class
will be named in order. Again, you can use a SPARQL transformation to
rename as needed.

3. EVN Spreadsheet import

In this case the spreadsheet you outline can be converted into a
hierarchy with the spreadsheet importer. See the Column-Based Tree at
http://topquadrant.com/topbraid/evn/362doc/userguide.html#importing-spreadsheet-data

There was a question about how to make something optional in a query.
Again, I don't know exactly how your data shows in triples, but the
SPARQL would look something like:

SELECT *
WHERE
{ ?s :category1 ?c1 .
?s :category2 ?c2 .
?s :catefory3 ?c3 .
OPTIONAL { ?s skos:definition }
OPTIONAL

Scott Henninger

unread,
Oct 29, 2012, 3:51:00 PM10/29/12
to TopBraid Suite Users
Funble-fingers errantly hit a key. The last part of the query is:

SELECT *
WHERE
{ ?s :category1 ?c1 .
?s :category2 ?c2 .
?s :catefory3 ?c3 .
OPTIONAL { ?s skos:definition ?def }
OPTIONAL { ?s skos:altLabel ?label }
}

-- Scott

On Oct 29, 2:49 pm, Scott Henninger <shennin...@topquadrant.com>
wrote:
> hierarchy with the spreadsheet importer.  See the Column-Based Tree athttp://topquadrant.com/topbraid/evn/362doc/userguide.html#importing-s...

OntoLearner

unread,
Oct 29, 2012, 5:03:01 PM10/29/12
to topbrai...@googlegroups.com
Thanks a lot Scott,

I have tried to follow the first 2 steps but I am not getting the desired results:

Basically what I need is to be able to make category 1 the top category, then category 2 becomes subClassOf of category 1, then category 3 becomes subClassOf of category 2, then the term is becomes a child of category 3. Because the term may have or may not have a synonym, acronym, and/or definition, I would like to have them created but optionally.

I have attached my rdf file after the conversion for easy reference.

Thank you again for your help as I am still learning.
category.rdf

Scott Henninger

unread,
Oct 30, 2012, 10:59:43 AM10/30/12
to TopBraid Suite Users
OK, that is quite a different problem. One solution is to use the out-
of-the box converter in EVN - option 3 in the above. This takes a bit
of setup for EVN and then a couple of steps: export the model from EVN
and convert skos:borader to rdfs:subClassOf.

If you want to follow that option, which does work nicely, let me know
and we can walk you through that.

Another approach would be to use SPIN to process a query for each row
of the spreadsheet. The steps are as follows:

1. Create a SPIN file via New > RDF/OWL/SPIN file. Click on SKOS to
import that model.

2. Right click on the .xls file and choose Open With... > TopBraid
(Excel Documents)

3. When the wizard asks whether you have a Schema, choose "Yes" and
choose the file you created in step 1.

4. Open the class created by the .xls import and copy the following
query into the spin:rule property (for the class):

CONSTRUCT {
?r1 a owl:Class .
?r2 a owl:Class .
?r3 a owl:Class .
?termRsc a owl:Class .
?r2 rdfs:subClassOf ?r1 .
?r3 rdfs:subClassOf ?r2 .
?termRsc rdfs:subClassOf ?r3 .
?termRsc skos:definition ?def .
?termRsc skos:altLabel ?label .
?termRsc ontolearner:termAcronym ?acronym .
}
WHERE {
?this ontolearner:category1 ?c1 .
?this ontolearner:term ?term .
OPTIONAL {?this ontolearner:category2 ?c2 .} .
OPTIONAL {?this ontolearner:category3 ?c3 .} .
OPTIONAL {?this ontolearner:termAcronymtermAcronym ?acronym .} .
OPTIONAL {?this ontolearner:termDefinitionskosdefinition ?def .} .
OPTIONAL {?this ontolearner:termSynonymskosaltLabel ?label .} .
BIND (IRI(fn:concat("http://example.org/classes#", REPLACE(?term,
" ", ""))) AS ?termRsc) .
BIND (IRI(fn:concat("http://example.org/classes#", REPLACE(?c1, "
", ""))) AS ?r1) .
BIND (IRI(fn:concat("http://example.org/classes#", REPLACE(?c2, "
", ""))) AS ?r2) .
BIND (IRI(fn:concat("http://example.org/classes#", REPLACE(?c3, "
", ""))) AS ?r3) .
}

...note that your prefixes an properties may be different. Adjust as
needed. There's a lot to this query, so if you have questions, let us
know.

5. Make sure the TopSPIN engine is configured. See Help > How to? >
Run and Configure Inference Engines

Execute the inference and your subclass tree will be inferred. From
there you can adjust the query as needed to get the data to look the
way you want it to. If you want to assert these, choose the inferred
triples in the Inferences View (bottom-middle row of tabs), select the
triples then choose "Assert selected statements" (the arrow pointing
up).

HTH

-- Scott
>  category.rdf
> 5KViewDownload

OntoLearner

unread,
Oct 30, 2012, 12:28:09 PM10/30/12
to topbrai...@googlegroups.com
Thank you Scott for the detailed reply.

I just tried to follow the second suggestion as it looked to me well detailed to follow. The one part I did not quite understand is "...note that your prefixes an properties may be different.  Adjust as needed.  There's a lot to this query, so if you have questions, let us know."

Could you please elaborate a little bit?

I also configured the TopSPIN engine as shown in the screenshot.

I run the inferences but the result does not seem to be correct as you can see attached.

I was wondering if you were able to run it the SPIN rule successfully on your side based on sample data.

Regards.
category.rdf
Configure Inferencing.png

OntoLearner

unread,
Oct 30, 2012, 1:48:54 PM10/30/12
to topbrai...@googlegroups.com
Hello again Scott,

I have made some progress as can be seen in the attached screenshot.

The problem I am now having is that I am not able to export the portion of the RDF file that is relevant to me, which is exactly what is shown on the left side of the screenshot.

I would like to save/export only that portion, but I keep getting more in my RDF file than I really need. Still learning.

I am attaching both the screenshot and the exported RDF file.

Thanks again for your help.
test6.rdf

OntoLearner

unread,
Oct 30, 2012, 11:51:21 PM10/30/12
to topbrai...@googlegroups.com
Hello Scott,

I was looking at the below SPARQL query, that Bob DuCharme provided back in July for a request I had sent, and wondered if it can be adapted to include the categories and the optional attributes that I want to include in it.


CONSTRUCT {
    ?newClass a rdfs:Class .
    ?newClass rdfs:subClassOf ?superClass .
}
WHERE {
    ?c a ss:Cell .
    ?c ss:row ?row .
    ?c ss:column ?col .
    ?c ss:cellContents ?contents .
    FILTER (?row > 0) .
    BIND ((?col - 1) AS ?parentCol) .
    OPTIONAL {
        ?parent ss:row ?row .
        ?parent ss:column ?parentCol .
        ?parent ss:cellContents ?parentContents .
    } .
    BIND (IRI(fn:concat("http://example.org/", ?contents)) AS ?newClass) .
    BIND (IRI(fn:concat("http://example.org/", ?parentContents)) AS ?superClass) .
}

I am just wondering, I am still new to SPARQL and all it can do, please just provide the best way to get the triples in the format I requested.

Thank you for your continued support.

Scott Henninger

unread,
Oct 31, 2012, 12:57:42 AM10/31/12
to topbrai...@googlegroups.com
OntoLearner; The previous email was missing the screenshot.  I did run my query against your data and it was coming out as expected.

If the issue is now to extract the new data, then it may be a good idea to look into SPARQMotion (http://topquadrant.com/products/SPARQLMotion.html).  Is may be a bit of al leap, but once you have a bit down about SPARQMotion, this problem becomes simple.

First, a bit about SPARQMotion.  It's basically a visual scripting language that passes triples from one module to the next, potentially transformed by, you guessed it, SPARQL.  See the home page and examples for more.

After that leap, your problem is basically to set up an ApplyTopSPIN module, with inputs of your data and the rules, that retains only the triples created by the rules.  This accomplished by setting a 'replace' property to true - basically stating "replace the triples in the steam with whatever was created in the ApplyTopSPIN module".

At least, this is how I'd do it given the rule I sent earlier.  It is, of course, possible to do extraction with a SPARQL query, but I'm not sure what a good starting point is.  Do you want to just extract what the rule created?  And if so to where?

And if you're really interested in applying this, take a serious look at http://topquadrant.com/training/training_overview.html.  Sorry for the shameless plug, but it truly is the quickest way to get started.

-- Scott

On Oct 30, 2012, at 10:51 PM, OntoLearner <elor...@gmail.com> wrote:

Hello Scott,

I was looking at the below SPARQL query, that Bob DuCharme provided back in July for a request I had sent, and wondered if it can be adapted to include the categories and the optional attributes that I want to include in it.


CONSTRUCT {
    ?newClass a rdfs:Class .
    ?newClass rdfs:subClassOf ?superClass .
}
WHERE
    ?c a ss:Cell .
    ?c ss:row ?row .
    ?c ss:column ?col .
    ?c ss:cellContents ?contents
--
-- You received this message because you are subscribed to the Google
Group "TopBraid Suite Users", the topics of which include Enterprise Vocabulary Network (EVN), TopBraid Composer, TopBraid Live,
TopBraid Ensemble, SPARQLMotion, SPARQL Web Pages and SPIN.
To post to this group, send email to
topbrai...@googlegroups.com
To unsubscribe from this group, send email to
topbraid-user...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/topbraid-users?hl=en
 
 

OntoLearner

unread,
Oct 31, 2012, 1:49:26 PM10/31/12
to topbrai...@googlegroups.com
Scott,

I am definitely going to take the training. I will keep trying to solve the current problem.

Attached is the screenshot for the inferences that I am not able to export, for some reason, with only the triples that I need to the format as  displayed in TBCME. I am going through the resources you sent.

I have attached the rdf and the resulting inferences from the SPIN engine.

I don't know what I am doing wrong for the export part, otherwise the creation of the inferred triples is ok.

Thanks.
Inference screenshot.png
category.rdf
Reply all
Reply to author
Forward
0 new messages