Exporting only selected ConceptSchemes?


Thomas Francart

Jan 15, 2018, 6:11:22 AM

to vocbench-user

Hello

I have a project with multiple ConceptSchemes in it. I would like to export only concepts belonging to selected ConceptSchemes. I suspect this is done through a "SPARQL Export Filter", but I have no clue on how to set this up.

I read http://vocbench.uniroma2.it/doc/user/global_data_management.jsf#export_data. It does not document the available export filters and their parameters.

If using a "SPARQL Export Filter" is the correct way to do this, could you document or indicate the structure of the SPARQL query to enter? By the way, a larger textarea for the "filter" field in the filter's "Configure" menu would be nice.

Having a dedicated export filter to export only selected schemes could also be nice.

Thanks !

Thomas

Thomas Francart - SPARNA
Web de données | Architecture de l'information | Accès aux connaissances
blog : blog.sparna.fr, site : sparna.fr, linkedin : fr.linkedin.com/in/thomasfrancart
tel : +33 (0)6.71.11.25.97, skype : francartthomas

Armando Stellato

Jan 15, 2018, 11:29:56 AM

to Thomas Francart, vocbench-user

Dear Thomas,

you are right, the “filter” field cannot be resized. We will make all fields resizable in the future.

These configuration fields are modeled according to a common vocabulary that we defined for all configurations of all extension points. The better solution would be an annotation telling whether a field should be a one-line entry (text field) or a text area, but making the fields resizable will do for now. In the meantime, the workaround is to write the SPARQL query somewhere else (e.g. a text editor) and then paste it there.

Concerning the content of the SPARQL update: if you are familiar with SPARQL, it is easy. Just consider that the whole repository is first copied into an in-memory copy (with no reasoning, just plain triples), to which you can apply all sorts of updates in order to change its content before it is exported. You can act destructively, since you are not modifying the original repository.

The SPARQL update does not need to address graphs, as the graphs can be selected a priori. By default (automatically set by the client), only the working graph (the one containing the dataset you are editing) is selected.

In the future we will provide a dedicated page on the site hosting a “library of export filters”, so that users can copy and paste them and/or adapt them to their needs. As for your specific case, we will provide something for exporting schemes. However, please consider that exporting a scheme is not always straightforward: an RDF thesaurus may contain many specific RDF constructs developed by the community editing it, and not all of them are necessarily local to schemes.

So, a generic export UPDATE would probably:

· Remove all triples whose subject is a concept not belonging to the selected scheme

· Remove all triples whose object is a concept not belonging to the selected scheme. This might not be desired: for instance, if you have two schemes A and B and you want to export A only, you might or might not want to remove a triple of the type:
:CA skos:related :CB
where :CB belongs to scheme B (it would be a “mention” in the resulting dataset)

· In case of SKOS-XL, remove all triples defining the reified labels of concepts not in the scheme (usually two per label, plus all triples starting or ending in the URI of the reified label in case of lexical relationships)

But then, depending on specific constructs (elaborated entities described by more triples) you might have in your specific thesaurus, there is no way to make a general exporter unless you know what you are representing.

That’s why a very precise exporter cannot be written in a general way.
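As a rough illustration of the first two removal steps listed above, here is a toy model in Python. It is not VocBench code: triples are plain tuples, the function works on a copy just as the export filter does, and `filter_for_scheme` and its `keep_mentions` flag are hypothetical names for the choice about triples such as :CA skos:related :CB. It assumes scheme membership is stated only through skos:inScheme and leaves out the SKOS-XL step.

```python
# Toy model of an export filter: copy the data, delete triples, export what is left.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CONCEPT = SKOS + "Concept"
IN_SCHEME = SKOS + "inScheme"
RELATED = SKOS + "related"


def filter_for_scheme(triples, scheme, keep_mentions=True):
    """Return the triples to export for `scheme`; the input is left untouched."""
    work = set(triples)  # in-memory copy, so deleting is safe
    concepts = {s for (s, p, o) in work if p == RDF_TYPE and o == CONCEPT}
    in_scheme = {s for (s, p, o) in work if p == IN_SCHEME and o == scheme}
    outside = concepts - in_scheme
    # Step 1: drop triples whose subject is a concept outside the scheme.
    work = {t for t in work if t[0] not in outside}
    # Step 2 (optional): drop triples whose object is a concept outside the scheme.
    if not keep_mentions:
        work = {t for t in work if t[2] not in outside}
    return work


if __name__ == "__main__":
    ex = "http://example.org/"  # assumed example namespace
    data = [
        (ex + "CA", RDF_TYPE, CONCEPT),
        (ex + "CA", IN_SCHEME, ex + "A"),
        (ex + "CB", RDF_TYPE, CONCEPT),
        (ex + "CB", IN_SCHEME, ex + "B"),
        (ex + "CA", RELATED, ex + "CB"),
    ]
    for keep in (True, False):
        print(keep, len(filter_for_scheme(data, ex + "A", keep_mentions=keep)))
```

With `keep_mentions=True` the skos:related triple survives as a “mention” of :CB; with `False` it is dropped along with everything else about :CB.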

Anyway, we’ll send you soon a general purpose scheme-export update that does most of the job ;-)

Cheers,

Armando

--
You received this message because you are subscribed to the Google Groups "vocbench-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vocbench-use...@googlegroups.com.
Visit this group at https://groups.google.com/group/vocbench-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/vocbench-user/CAPugn7WBWJWLednxgQ-9dCMGA03xgqFwSmjDzNZx_9oNamHGbg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Andrea Turbati

Jan 16, 2018, 4:55:56 AM

to vocbench-user, Armando Stellato, Thomas Francart

Dear Thomas,
a SPARQL update that exports only the desired scheme is the following:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>
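DELETE {
    ?concept ?p ?o .
    ?label ?pLabel ?oLabel .
    ?scheme ?pScheme ?oScheme .
    ?labelScheme ?pLabelScheme ?oLabelScheme .
}
WHERE {
    {
        # Concepts not in the desired scheme, with their skosxl:Label resources.
        ?concept a skos:Concept .
        FILTER NOT EXISTS {
            ?concept skos:inScheme|skos:topConceptOf|^skos:hasTopConcept <DESIRED_SCHEME> .
        }
        ?concept ?p ?o .
        OPTIONAL {
            ?concept ?p2 ?label .
            ?label a skosxl:Label .
            ?label ?pLabel ?oLabel .
        }
    }
    UNION
    {
        # All other schemes, with their skosxl:Label resources.
        # UNION keeps the two deletions independent: each one still applies
        # when the other matches nothing.
        ?scheme a skos:ConceptScheme .
        FILTER (?scheme != <DESIRED_SCHEME>)
        ?scheme ?pScheme ?oScheme .
        OPTIONAL {
            ?scheme ?p3 ?labelScheme .
            ?labelScheme a skosxl:Label .
            ?labelScheme ?pLabelScheme ?oLabelScheme .
        }
    }
}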

To run this update, go to:
"Global Data Management"->"Export Data"
then, in the "Export Filter" section, click on the "+" sign and select "SPARQLExportFilterFactory".
Click on "Configure" and paste the SPARQL update above into the "filter*" field, replacing both occurrences of "DESIRED_SCHEME" with the URI of the scheme you want to export.
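If you prepare the filter text in an editor before pasting it, the substitution step can be scripted. The following is only a sketch: `fill_scheme` is a hypothetical helper, not part of VocBench, and it assumes the placeholder appears in the template exactly as `<DESIRED_SCHEME>`.

```python
import re

PLACEHOLDER = "<DESIRED_SCHEME>"
# Characters that the SPARQL grammar does not allow inside a <...> IRI.
_FORBIDDEN = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def fill_scheme(template: str, scheme_iri: str) -> str:
    """Replace every <DESIRED_SCHEME> in `template` with <scheme_iri>."""
    if not scheme_iri or _FORBIDDEN.search(scheme_iri):
        raise ValueError(f"not usable as a SPARQL IRI: {scheme_iri!r}")
    if PLACEHOLDER not in template:
        raise ValueError("template does not contain " + PLACEHOLDER)
    return template.replace(PLACEHOLDER, f"<{scheme_iri}>")


if __name__ == "__main__":
    # Shortened stand-in for the full update text; the scheme IRI is an example.
    fragment = (
        "?concept skos:inScheme <DESIRED_SCHEME> .\n"
        "FILTER (?scheme != <DESIRED_SCHEME>)"
    )
    print(fill_scheme(fragment, "http://example.org/schemeA"))
```

Rejecting characters that cannot appear in an IRI avoids pasting a filter that fails to parse at export time.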

That SPARQL DELETE removes all triples having as subject a concept not belonging to the desired scheme, together with all its associated skosxl:Label resources. It also removes the information about the other schemes (and all their associated skosxl:Label resources).

As Armando said, if you need to remove other special constructs from the export of your data as well, you need to change the SPARQL DELETE accordingly.

I hope that this could help you.

Cheers

Andrea


-- 
-------------------------------------------------
 
Dott. Andrea Turbati, PhD
AI Research Group,
Dept. of Enterprise Engineering
University of Roma, Tor Vergata
Via del Politecnico 1 00133 ROMA (ITALY)
tel: +39 06 7259 7334 
lab: +39 06 7259 7332 
e_mail: tur...@info.uniroma2.it
home page: http://art.uniroma2.it/turbati/

--------------------------------------------------
