regex in FILTER query

Dominique Guardiola Falco

Aug 29, 2011, 5:11:06 AM
to sur...@googlegroups.com
Sorry to bother you again, hope that I'll be able to help others here in the future ...

I'm trying to filter some triples based on a regex on their URI.
With the rdflib memory store I get strange results, and with the librdf one no results at all:

import surf
from surf import ns, query
from surf.query import select,a,filter
#store = surf.Store(reader='rdflib',writer='rdflib',rdflib_store='IOMemory')
store = surf.Store(**{"reader": "librdf", "writer" : "librdf", })
session = surf.Session(store)
store.load_triples(source='http://rdfs.org/sioc/ns#')
store.enable_logging(True)

Here's what I'm looking for:

for i in session.get_class(ns.OWL.Class).all():print i

{http://xmlns.com/foaf/0.1/OnlineAccount : http://www.w3.org/2002/07/owl#Class}
{http://xmlns.com/foaf/0.1/Agent : http://www.w3.org/2002/07/owl#Class}
{http://xmlns.com/foaf/0.1/Document : http://www.w3.org/2002/07/owl#Class}
{http://rdfs.org/sioc/ns#Community : http://www.w3.org/2002/07/owl#Class}
{http://rdfs.org/sioc/ns#Container : http://www.w3.org/2002/07/owl#Class}
...

This ontology (SIOC) includes classes from FOAF, and I want to filter them out and list only the "pure" SIOC classes.
So I do:

query = select("?s").where(("?s",a,ns.OWL.Class)).filter('(regex (?s,"http://rdfs.org/sioc/ns","i"))')
list(store.reader._to_table(store.reader._execute(query)))

With librdf/redland the logging shows a correct SPARQL query, but the result list is empty:

DEBUG:ReaderPlugin:SELECT  ?s   WHERE {  ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .  FILTER (regex (?s,"http://rdfs.org/sioc/ns","i"))  }   
[]


The rdflib store returns the correct number of results, but they look like this:

DEBUG:ReaderPlugin:SELECT  ?s   WHERE {  ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .  FILTER (regex (?s,"http://rdfs.org/sioc/ns","i"))  } 
[{u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}, {u's': u'h'}]



Christoph Burgmer

Aug 29, 2011, 6:07:06 PM
to sur...@googlegroups.com
> Sorry to bother you again, hope that I'll be able to help others here in
> the future ...

Feel free to ask :)

> I'm trying to filter some triples based on a regex on their URI
> With the rdflib memory store, I've got strange results, and with the librdf
> one no result at all:

For one I would search by writing

regex (?s,"^http://rdfs.org/sioc/ns","i")

(mind the additional ^ here). However your regex might be correct, I am not
sure.
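
To illustrate what the ^ changes, here is a minimal sketch using Python's re module as a stand-in for the SPARQL regex() function (both count a match anywhere in the string unless the pattern is anchored). The example.org URI is a made-up value, not something from the SIOC ontology:

```python
import re

# Pattern strings as they would appear inside the SPARQL regex() call.
UNANCHORED = "http://rdfs.org/sioc/ns"
ANCHORED = "^http://rdfs.org/sioc/ns"


def matches(pattern, uri):
    """Return True if the pattern is found in uri, ignoring case (the "i" flag)."""
    # re.search, like regex(), accepts a match at any position in the string.
    return re.search(pattern, uri, re.IGNORECASE) is not None


sioc_post = "http://rdfs.org/sioc/ns#Post"
# Hypothetical URI that only contains the namespace further in.
embedded = "http://example.org/page?about=http://rdfs.org/sioc/ns#Post"

print(matches(UNANCHORED, sioc_post), matches(ANCHORED, sioc_post))
print(matches(UNANCHORED, embedded), matches(ANCHORED, embedded))
```

For the classes in this ontology both patterns select the same URIs; the anchor only matters if some other URI happens to embed the namespace string.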

I can reproduce your results for librdf. I don't get any hits either. However,
for rdflib I get the following:

>>> query = select("?s").where(("?s",a,ns.OWL.Class)).filter('(regex (?s,"^http://rdfs.org/sioc/ns","i"))')
>>> store.execute_sparql(unicode(query))
DEBUG:ReaderPlugin:SELECT ?s WHERE { ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> . FILTER (regex (?s,"^http://rdfs.org/sioc/ns","i")) }
{u'head': {u'vars': [u's']}, u'results': {u'distinct': False, u'bindings':
[{u's': {u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Item'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Usergroup'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#UserAccount'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Forum'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Space'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Container'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Community'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Site'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Post'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Thread'}}, {u's':
{u'type': u'uri', u'value': u'http://rdfs.org/sioc/ns#Role'}}], u'ordered':
False}}
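
The dictionary above follows the SPARQL JSON results layout (head / results / bindings), so the class URIs can be pulled out with a few lines. A minimal sketch; binding_values is a hypothetical helper, not part of SuRF, and the sample dictionary is just the first two bindings from the output above:

```python
def binding_values(result, var):
    """Collect the 'value' field bound to var in each result row."""
    rows = result["results"]["bindings"]
    # A variable can be unbound in a row, so skip rows that lack it.
    return [row[var]["value"] for row in rows if var in row]


# Shortened copy of the result shown above.
sample = {
    "head": {"vars": ["s"]},
    "results": {
        "bindings": [
            {"s": {"type": "uri", "value": "http://rdfs.org/sioc/ns#Item"}},
            {"s": {"type": "uri", "value": "http://rdfs.org/sioc/ns#Usergroup"}},
        ],
    },
}

for uri in binding_values(sample, "s"):
    print(uri)
```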

Is there an easy way to test librdf other than through the SuRF plugin?
You might want to consider filing a bug report with librdf.

While SuRF's plugin system initially suggests that you can easily switch
backends and keep the same API, the backends often differ significantly;
see for example my initial documentation here:
http://code.google.com/p/surfrdf/wiki/BackendPeculiarities

Sorry I can't be much help here.
-Christoph

Christoph Burgmer

Aug 29, 2011, 6:17:10 PM
to sur...@googlegroups.com
On Tuesday, 30 August 2011, Christoph Burgmer wrote:
> I can reproduce your results for librdf. I don't get any hits either. However,
> for rdflib I get the following:

I forgot to add that I have rasqal 0.9.20 installed (from Ubuntu) which is a
year old now.

-Christoph

Dominique Guardiola Falco

Aug 30, 2011, 4:53:14 AM
to sur...@googlegroups.com
OK, I found the answer here; my query was wrong:

http://www.thefigtrees.net/lee/sw/sparql-faq#regex-language-tag

I needed to write str(?s) instead of passing ?s directly: ?s is a URI, not a literal, and regex() expects a string. rdflib is more lenient about this.
Now the query works well with the librdf store.
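
For reference, a minimal sketch of the corrected query as a plain string. The function name sioc_class_query is made up for illustration; the two type URIs are the ones from the logged query, and I am assuming the string can be passed to store.execute_sparql() the same way as earlier in this thread:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"


def sioc_class_query(prefix="http://rdfs.org/sioc/ns"):
    """Build a SELECT query for OWL classes whose URI starts with prefix."""
    # str(?s) turns the URI bound to ?s into a plain string, which is
    # what regex() operates on; passing ?s directly gave no hits on librdf.
    return (
        'SELECT ?s WHERE { ?s <%s> <%s> . '
        'FILTER (regex(str(?s), "^%s", "i")) }' % (RDF_TYPE, OWL_CLASS, prefix)
    )


print(sioc_class_query())
```

With the query builder used above, the equivalent would be to pass '(regex (str(?s),"^http://rdfs.org/sioc/ns","i"))' to .filter().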

But the strange output I had with rdflib is still there (I don't get your results).
It really looks like the rdflib and librdf installs are tangled together:

>>> store = surf.Store(reader='rdflib',writer='rdflib',rdflib_store='IOMemory')
librdf error - storage postgresql already registered
librdf error - storage virtuoso already registered
librdf error - query language vsparql already registered

This happens without loading librdf at all... I need to do some cleaning here...
