Scones

Diana Tanase

May 14, 2012, 4:46:48 PM
to open-semant...@googlegroups.com
Hi All,

I've been trying out the Scones component for OSF and I already have a GATE application for annotating documents that does concept identification differently from the example scones.xgapp.

I'd like to do the annotations without Scones and place my annotated files in the <annotated folder>.
I need this for a variety of reasons: the volume of documents, and a different pipeline that allows annotating multilingual documents.

Can this be done and what do I need to do to integrate the annotated documents with my running instance of OSF?

Thank you for your help,
Diana

Frederick Giasson

May 15, 2012, 10:51:43 AM
to open-semant...@googlegroups.com
Hi Diana!

> I've been trying out the Scones component for OSF and I already have a GATE application for annotating documents that does concept identification differently from the example scones.xgapp.

There should not be any problem. Are you using the onto gazetteer in
your workflow?

> I'd like to do the annotations without Scones and place my annotated files in the <annotated folder>
> I need this for a variety of reasons -- volume of documents and also a different pipeline that allows annotating multilingual documents.

Hmm, I'm not sure what you mean here. Do you want to use Scones or not?
If not, what would you like to do with OSF? If you don't want to use
Scones, you can always do what you need with your Gate application,
then convert the annotated documents into RDF using a procedure of
your own, and then import everything into OSF using the
Crud: Create endpoint.


> Can this be done and what do I need to do to integrate the annotated documents with my running instance of OSF?

Please specify the workflow you are currently envisioning, and do tell
us if you want to use Scones, or to replace it.

Thanks!

Take care,

Fred

Diana Tanase

May 15, 2012, 11:40:59 AM
to open-semant...@googlegroups.com
Hi Fred,

What I want to do is index a collection of multilingual documents with concepts from an ontology, and be able to search those documents by concept name.

This is my scenario:

1. Take a collection of documents and annotate them by running GATE Embedded from Java. The Gate app does not use a gazetteer, but a plugin called Apolda, which allows me to load an ontology and annotate documents based only on information in the ontology (I can use prefLabel and altLabel for annotating with the same concept). The advantage over gazetteers is that I don't need to build the definition lists. It is all just about identification of concepts, with no disambiguation; the disambiguation part is separate and outside of GATE.

2. Use the annotations to index documents both in Virtuoso and Solr. Now, this is where I'm confused about the correct use of Scones and of OSF.

Without Scones, I treat the documents as datasets and add them in RDF form, as you suggested. If I have already loaded the ontology used for annotation into my instance of OSF, does it have any effect on how the RDF versions of my documents get imported?


With Scones, I'm not clear what happens once the annotation process ends. For example, will there be a triple "conceptA annotates docB" in Virtuoso?

I'm missing some connecting glue of the OSF puzzle!

Thank you for your help,
Diana



Frederick Giasson

May 15, 2012, 1:45:23 PM
to open-semant...@googlegroups.com
Hi!


> What I want to do is index a collection of multilingual documents with concepts from an ontology, and be able to search those documents by concept name.
>
> This is my scenario:
>
> 1. Take a collection of documents and annotate them by running GATE Embedded from Java. The Gate app does not use a gazetteer, but a plugin called Apolda, which allows me to load an ontology and annotate documents based only on information in the ontology (I can use prefLabel and altLabel for annotating with the same concept). The advantage over gazetteers is that I don't need to build the definition lists. It is all just about identification of concepts, with no disambiguation; the disambiguation part is separate and outside of GATE.

OK, good; that looks more or less like what we end up with using the
default Gate application that ships with the Scones endpoint.

> 2. Use the annotations to index documents both in Virtuoso and Solr. Now, this is where I'm confused about the correct use of Scones and of OSF.
>
> Without Scones, I treat the documents as datasets and add them in RDF form, as you suggested. If I have already loaded the ontology used for annotation into my instance of OSF, does it have any effect on how the RDF versions of my documents get imported?

Right now, the Scones web service endpoint is *configured* to use one or
multiple ontologies (the gazetteers). Then, if you look at the
endpoint's documentation on the TechWiki, you will see that it takes a
document as input, and outputs a *Gate document*. So, it doesn't return
RDF (yet), but really a Gate document.

For its part, structScones (the Scones user interface, in the conStruct
set of modules) extracts the RDF from that document, generates the
RDF records, and indexes everything in OSF using the Crud: Create and
Crud: Update endpoints.

The ontologies loaded in OSF are not currently used automatically by
Scones (this is a task that we have to tackle in 2012). But they do have
an effect on the records imported using the other structWSF web service
endpoints.


> With Scones, I'm not clear what happens once the annotation process ends. For example, will there be a triple "conceptA annotates docB" in Virtuoso?

No, as I said above, you will get a Gate annotations document. From
there, you have to extract the triples. Check structScones to see how it
can be done.

> I'm missing some connecting glue of the OSF puzzle!

Yeah, so the connecting glue for that is a script that extracts the
triples (you can find the code in structScones) and sends the
resulting RDF to structWSF using the Crud: Create web service
endpoint. Once you have done that, you will have access to these
imported Gate annotations via all the other services (Search, Read, etc.).
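
As a rough illustration of that glue step, here is a minimal sketch in Python (structScones itself is PHP, so this is not its code). It turns a list of tagged concept URIs into N-Triples and form-encodes them for an HTTP POST. The predicate, the example URIs, and the form parameter names (`document`, `mime`, `dataset`) are assumptions made for illustration; check them against the Crud: Create documentation before relying on them.

```python
from urllib.parse import urlencode

# Hypothetical predicate linking a document to a tagged concept; use the
# property that your own ontology defines for this relation.
TAGGED_WITH = "http://example.org/ontology#taggedWith"


def annotations_to_ntriples(document_uri, concept_uris, predicate=TAGGED_WITH):
    """Serialize one 'document predicate concept' triple per distinct concept."""
    lines = []
    for concept_uri in sorted(set(concept_uris)):
        lines.append(f"<{document_uri}> <{predicate}> <{concept_uri}> .")
    return "\n".join(lines)


def build_create_request_body(rdf_document, dataset_uri):
    """Form-encode the RDF for an HTTP POST.

    The parameter names and the mime value are assumptions, not taken from
    the endpoint documentation.
    """
    return urlencode({
        "document": rdf_document,
        "mime": "application/rdf+n3",
        "dataset": dataset_uri,
    })


if __name__ == "__main__":
    triples = annotations_to_ntriples(
        "http://example.org/docs/docB",
        ["http://example.org/concepts/conceptA"],
    )
    print(triples)
    print(build_create_request_body(triples, "http://example.org/datasets/stories/"))
```

The request body is only built here, not sent, so the sketch can be tried without a running structWSF instance.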



Tell me if you need documentation/code URLs related to this answer :)

Thanks,

Fred

Diana Tanase

May 16, 2012, 9:52:41 AM
to open-semant...@googlegroups.com
Thank you for these clarifications. I will work on piggybacking on the structScones mechanism.

Best,
Diana

Diana Tanase

May 22, 2012, 5:30:22 PM
to open-semant...@googlegroups.com
Hi Fred,

I've set up my little Gate app to work with Scones and created a couple of stories using Scones (no working Portable Control Application). I can see a number of corresponding triples in Virtuoso.

What I don't see is the individual annotations mapped to RDF triples. I looked at the code, and the $recordDescriptionN3 only adds triples for prefLabel, abstract, annotatedTextUri.

Have I missed something in the setup? I've changed the AnnotationSet's name, but I've also changed it in the Scones Settings!

Thank you,
Diana




Frederick Giasson

May 22, 2012, 5:58:44 PM
to open-semant...@googlegroups.com
Hi Diana!


> I've set up my little Gate app to work with Scones and created a couple of stories using Scones (no working Portable Control Application). I can see a number of corresponding triples in Virtuoso.

Good, this is a good sign :)

> What I don't see is the individual annotations mapped to RDF triples. I looked at the code, and the $recordDescriptionN3 only adds triples for prefLabel, abstract, annotatedTextUri.
>
> Have I missed something in the setup? I've changed the AnnotationSet's name, but I've also changed it in the Scones Settings!

Could you send one of the XML files (a Gate XML file) generated for a
story processed by structScones? I would like to see what got tagged
by Gate (if anything). This should give me a few clues.


Thanks,

Fred

Diana Tanase

May 22, 2012, 6:03:10 PM
to open-semant...@googlegroups.com
It's some random bit of text from the Guardian ;-) thank you

a9cbc0c95c62a7308edd5d86f78b6c6f_filtered.xml
a9cbc0c95c62a7308edd5d86f78b6c6f.txt
a9cbc0c95c62a7308edd5d86f78b6c6f.xml

Frederick Giasson

May 22, 2012, 6:12:05 PM
to open-semant...@googlegroups.com
Hi

Ok good, I can see that it got tagged. So, the settings in structScones
are such that the annotation set is "Mention", right?

If so, I will have to check whether this XML structure is different from
what the code was expecting. So please confirm the above, and I will tell
you the next debugging step afterwards.

Also, do you have the generated RDF document (exported from SPARQL)?

Thanks,

Fred

Diana Tanase

May 22, 2012, 6:38:08 PM
to open-semant...@googlegroups.com
Hi,

Yes, I've edited the scones settings to 'Mention' (beforehand).

Here is the output from a SPARQL CONSTRUCT query (I've kept all triples related to the previous document):

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdf:Description rdf:about="http://localhost/wsf/datasets/stories/resource/a9cbc0c95c62a7308edd5d86f78b6c6f"><n0pred:created xmlns:n0pred="http://purl.org/dc/terms/">2012-05-22T20:53:10+00:00</n0pred:created></rdf:Description>
<rdf:Description rdf:about="http://localhost/wsf/datasets/stories/resource/a9cbc0c95c62a7308edd5d86f78b6c6f"><n0pred:storyAnnotatedTextUri xmlns:n0pred="http://purl.org/ontology/sco#">http://localhost/ws/annotatedDocuments/a9cbc0c95c62a7308edd5d86f78b6c6f.xml</n0pred:storyAnnotatedTextUri></rdf:Description>
<rdf:Description rdf:about="http://localhost/wsf/datasets/stories/resource/a9cbc0c95c62a7308edd5d86f78b6c6f"><rdf:type rdf:resource="http://purl.org/ontology/bibo/Document"/></rdf:Description>
<rdf:Description rdf:about="http://localhost/wsf/datasets/stories/resource/a9cbc0c95c62a7308edd5d86f78b6c6f"><n0pred:abstract xmlns:n0pred="http://purl.org/ontology/bibo/">Perched as we are between the westerly winds from an ocean and easterly winds from a continent it could be cynically said that we have no climate in the UK, just weather, and it is of course mostly weather that determines when things grow. On the whole pla...</n0pred:abstract></rdf:Description>
<rdf:Description rdf:about="http://localhost/wsf/datasets/stories/resource/a9cbc0c95c62a7308edd5d86f78b6c6f"><n0pred:prefLabel xmlns:n0pred="http://purl.org/ontology/iron#">story 2</n0pred:prefLabel></rdf:Description>
<rdf:Description rdf:about="http://localhost/wsf/datasets/stories/resource/a9cbc0c95c62a7308edd5d86f78b6c6f"><n0pred:storyTextUri xmlns:n0pred="http://purl.org/ontology/sco#">http://localhost/ws/annotatedDocuments/a9cbc0c95c62a7308edd5d86f78b6c6f.txt</n0pred:storyTextUri></rdf:Description>
</rdf:RDF>


THANK YOU, diana

Frederick Giasson

May 22, 2012, 7:27:29 PM
to open-semant...@googlegroups.com
Hi!

Yeah, the tagged concepts are not there. Will have to check in the code
tomorrow morning.

Will keep you updated.

Thanks,


Fred

Frederick Giasson

May 23, 2012, 9:04:06 AM
to open-semant...@googlegroups.com
Hi Diana,

You were not using the Ontology gazetteer, right?

The problem here is that the code was expecting a different kind of
Feature from Gate.

Right now, you have:


==================
<Annotation Id="579" Type="Mention" StartNode="138" EndNode="145">
  <Feature>
    <Name className="java.lang.String">ontology</Name>
    <Value className="java.lang.String">gemet-definitions.owl_00015</Value>
  </Feature>
  <Feature>
    <Name className="java.lang.String">class</Name>
    <Value className="java.lang.String">http://www.eionet.europa.eu/gemet/concept/1462</Value>
  </Feature>
</Annotation>
==================

However, the <Feature> <Name> that was expected is "type" or "URI",
where the type is a "class".

Search the structScones.module file for "Feature", and you will
see how the Gate document is read to extract the tags from the
Gate-generated XML document.

So, two possibilities here:

(1) You change your Gate application to rename these features for
each annotation.
(2) You update structScones.module to accommodate this.
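
To make option (2) concrete, here is a minimal sketch in Python (not the PHP from structScones.module) of reading the annotations with a configurable list of feature names. It assumes the concept URI sits in whichever listed feature name is present, "URI" or "class". The `<GateDocument>` and `<AnnotationSet>` wrapper in the sample and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Feature names that may carry the concept URI: "URI" is one of the names the
# code expected, "class" is the name this Gate application emits.
URI_FEATURE_NAMES = ("URI", "class")


def extract_concept_uris(gate_xml, annotation_type="Mention",
                         uri_feature_names=URI_FEATURE_NAMES):
    """Return (start_node, end_node, uri) tuples for each tagged annotation."""
    root = ET.fromstring(gate_xml)
    results = []
    for annotation in root.iter("Annotation"):
        if annotation.get("Type") != annotation_type:
            continue
        # Collect the <Feature> name/value pairs of this annotation.
        features = {}
        for feature in annotation.findall("Feature"):
            name = (feature.findtext("Name") or "").strip()
            value = (feature.findtext("Value") or "").strip()
            features[name] = value
        # Take the first configured feature name that is present.
        uri = next((features[n] for n in uri_feature_names if n in features), None)
        if uri:
            results.append((int(annotation.get("StartNode")),
                            int(annotation.get("EndNode")), uri))
    return results


# Sample built from the annotation quoted above; the wrapper elements are assumed.
SAMPLE = """<GateDocument>
<AnnotationSet Name="Mention">
<Annotation Id="579" Type="Mention" StartNode="138" EndNode="145">
<Feature>
<Name className="java.lang.String">ontology</Name>
<Value className="java.lang.String">gemet-definitions.owl_00015</Value>
</Feature>
<Feature>
<Name className="java.lang.String">class</Name>
<Value className="java.lang.String">http://www.eionet.europa.eu/gemet/concept/1462</Value>
</Feature>
</Annotation>
</AnnotationSet>
</GateDocument>"""

if __name__ == "__main__":
    print(extract_concept_uris(SAMPLE))
```

Passing the feature names as a parameter is the kind of generalization that a new structScones setting could expose.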


Tell me if you have any other questions related to that.

Thanks,

Fred

Diana Tanase

May 23, 2012, 9:33:08 AM
to open-semant...@googlegroups.com

No gazetteer...

This makes sense, and I found the code! I'll do some tweaking!

Thank you, Diana

Frederick Giasson

May 23, 2012, 9:56:40 AM
to open-semant...@googlegroups.com
Hi Diana!

> No gazetteer...
>
> This makes sense, and I found the code! I'll do some tweaking!


OK, good. Well, if you find the time to generalize that (maybe with some
new settings in structScones?), then you could push it to Drupal's Git.


Thanks,

Fred