Hi!
> What I want to do is index a collection of multilingual documents with concepts from an ontology, and be able to search those documents by concept name.
>
> This is my scenario:
>
> 1. Take a collection of documents and annotate them by running GATE Embedded from Java. The GATE app does not use a gazetteer, but a plugin called Apolda, which allows me to load an ontology and annotate documents based only on information in the ontology (I can use prefLabel and altLabel for annotating with the same concept). The advantage over gazetteers is that I don't need to build the definition lists. It is all just about
> identification of concepts, no disambiguation. The disambiguation part is separate and outside of GATE.
OK, good; this looks more or less like what we end up with in the
default GATE application that ships with the Scones endpoint.
> 2. Use the annotations to index documents both in Virtuoso and Solr. Now, this is where I'm confused about the correct use of Scones and of OSF.
>
> Without Scones, I treat the documents as datasets and add them as you suggested in RDF form. If I have already loaded the ontology used for annotation in my instance of OSF,
> does it have any effect on how the RDF versions of my documents get imported?
Right now, the Scones web service endpoint is *configured* to use one or
more ontologies (the gazetteers). If you look at the endpoint's
documentation on the TechWiki, you will see that it takes a document as
input and outputs a *GATE document*. So it does not return RDF (yet),
only a GATE document.
For its part, structScones (the Scones user interface, in the conStruct
set of modules) extracts the RDF from that document, generates the RDF
records, and indexes everything in OSF using the Crud: Create and
Crud: Update endpoints.
The ontologies loaded in OSF are not currently used automatically by
Scones (this is a task that we have to tackle in 2012). But they do have
an effect on the records imported using the other structWSF web service
endpoints.
> With Scones, I'm not clear what happens once the annotation process ends. For example, will there be a triple "conceptA annotates docB" in Virtuoso?
No, as I said above, you will get a GATE annotations document. From
there, you have to extract the triples yourself. Check structScones to
see how it can be done.
> I'm missing some connecting glue of the OSF puzzle!
Yeah, so the connecting glue is a script that extracts the triples (you
can find the code in structScones) and sends the resulting RDF to
structWSF using the Crud: Create web service endpoint. Once you have
done that, you will have access to these imported GATE annotations via
all the other services (Search, Read, etc).
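
To give a rough idea of the extraction half of such a glue script, here
is a minimal Python sketch. It is not the structScones code. It assumes
the GATE document comes back in GATE's XML serialization, and that each
annotation carries the concept URI in a feature; the feature name
("class"), the "annotates" predicate, and the example.org URIs are all
hypothetical placeholders that you would replace with whatever your
Apolda setup and ontology actually use.

```python
import xml.etree.ElementTree as ET

# Hypothetical predicate for "conceptA annotates docB"; use your own ontology's.
ANNOTATES = "http://example.org/ontology#annotates"
# Hypothetical name of the annotation feature that holds the concept URI.
CONCEPT_FEATURE = "class"


def extract_concept_uris(gate_xml):
    """Return the sorted, de-duplicated concept URIs found on annotations."""
    root = ET.fromstring(gate_xml)
    uris = set()
    for annotation in root.iter("Annotation"):
        for feature in annotation.findall("Feature"):
            name = (feature.findtext("Name") or "").strip()
            value = (feature.findtext("Value") or "").strip()
            if name == CONCEPT_FEATURE and value:
                uris.add(value)
    return sorted(uris)


def to_ntriples(doc_uri, concept_uris):
    """Serialize one 'concept annotates document' triple per concept."""
    return "".join(
        f"<{concept}> <{ANNOTATES}> <{doc_uri}> .\n" for concept in concept_uris
    )


# Simplified stand-in for a GATE XML document (made up for illustration).
SAMPLE = """<GateDocument>
  <AnnotationSet>
    <Annotation Id="1" Type="Mention" StartNode="0" EndNode="5">
      <Feature>
        <Name className="java.lang.String">class</Name>
        <Value className="java.lang.String">http://example.org/ontology#ConceptA</Value>
      </Feature>
    </Annotation>
  </AnnotationSet>
</GateDocument>"""

if __name__ == "__main__":
    concepts = extract_concept_uris(SAMPLE)
    print(to_ntriples("http://example.org/docs/docB", concepts), end="")
```

The N-Triples string this produces is what you would then send to the
Crud: Create endpoint; the exact request parameters are in the endpoint's
documentation, so I have left that call out of the sketch.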
Tell me if you need documentation/code URLs related to this answer :)
Thanks,
Fred