"Foreign keys" usage while subclassing rdfSubject

Tatiana Al-Chueyr

Jun 22, 2012, 3:19:11 PM
to rdfalch...@googlegroups.com
Hello Guys,

I've been trying to "replace" a SPARQL query using RDFAlchemy's rdfSubject, but I didn't succeed.

Consider the query below, with which I'd like to retrieve 10 musicians from DBpedia who play guitar or piano. I'd also like to retrieve the label of the instrument they play:

"""
SELECT distinct ?name ?label {
    ?artist
        a dbonto:MusicalArtist;
        dbprop:birthName ?name.
    {
        ?artist dbonto:instrument db:Guitar .
        db:Guitar rdfs:label ?label.
    }
    UNION
    {
    ?artist  dbonto:instrument db:Piano .
    db:Piano rdfs:label ?label .
    }
    FILTER ( lang(?name) = "en" )
}
ORDER BY ?name
LIMIT 10
"""

I considered using rdfSubject as shown below, but I'm afraid the line marked with the "FIXME" comment is wrong:

------------------------------------------------------------------------
class Instrument(rdfSubject):
    rdfs_label = rdfSingle(RDFS.label, 'label')

class MusicalArtist(rdfSubject):
    rdf_type = DBONTO.MusicalArtist
    dbprop_birthname = DBPROP.birthName
    dbonto_instrument = Instrument  # ? FIXME
------------------------------------------------------------------------

The idea is to be able to use rdfSubject as shown below. This would return the musicians who play guitar, for instance (simpler than the original query, but a step towards it ;)).

------------------------------------------------------------------------
guitar_players_list = MusicalArtist.get_by(dbonto_instrument=DB.Guitar)
------------------------------------------------------------------------

However, in this case, it throws an AttributeError, saying "type object 'rdfSubject' has no attribute 'pred'."

What would be the best way of doing this?

The complete source code for this example is available here:

Thanks in advance!

Tatiana

Philip Cooper

Jun 22, 2012, 6:20:32 PM
to rdfalch...@googlegroups.com
On 6/22/12 1:19 PM, Tatiana Al-Chueyr wrote:
> Hello Guys,
>
> I've been trying to "replace" a SPARQL query using RDFAlchemy's rdfSubject, but I didn't succeed.

Probably not a good idea here.  Let the database engine optimize the query and send back only the relevant results.  With a SQL database you would do the same.
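
One way to follow this advice is to send the SPARQL text to the endpoint unchanged and let it do the work. The following is only a sketch using the Python 3 standard library. The endpoint URL and the PREFIX declarations are assumptions (the thread never spells them out), and `build_request` and `run_query` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # assumption: public DBpedia endpoint

# Assumption: the prefixes used in the thread map to these namespaces.
PREFIXES = """\
PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbonto: <http://dbpedia.org/ontology/>
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db:     <http://dbpedia.org/resource/>
"""

QUERY = PREFIXES + """\
SELECT DISTINCT ?name ?label {
    ?artist a dbonto:MusicalArtist ;
            dbprop:birthName ?name .
    { ?artist dbonto:instrument db:Guitar . db:Guitar rdfs:label ?label . }
    UNION
    { ?artist dbonto:instrument db:Piano . db:Piano rdfs:label ?label . }
    FILTER ( lang(?name) = "en" )
}
ORDER BY ?name
LIMIT 10
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the result bindings (needs network access)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as resp:
        data = json.load(resp)
    return data["results"]["bindings"]
```

Calling `run_query(QUERY)` would then return a list of rows, each with `row["name"]["value"]` and `row["label"]["value"]`, provided the endpoint is reachable and accepts the prefixes assumed above.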

That being said, to help you get on the right path:

   * a player has one birthname and could play multiple instruments, so use rdfMultiple for the instruments
   * in the class definition, you use a descriptor (e.g. rdfSingle, rdfMultiple etc.) on the RHS (see below)
   * get_by expects to return one item (i.e. the predicate is like an SQL primary key or an OWL inverseFunctionalProperty).  get_by is no longer used in later versions of SQLAlchemy, but I kept it.  filter_by is more appropriate here.
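
To illustrate the descriptor point, here is a toy, pure-Python model of the idea, written for Python 3. It is not RDFAlchemy's implementation: `TRIPLES`, `Single`, `Multiple`, `Subject`, and the `ex:` data are all hypothetical, and the "store" is just an in-memory list.

```python
# Toy model of descriptor-based mapping; NOT RDFAlchemy's real code.
# Every name and all data below are made up for illustration.

TRIPLES = [
    ("ex:artist1", "dbprop:birthName", "Example Name"),
    ("ex:artist1", "dbonto:instrument", "db:Piano"),
    ("ex:artist1", "dbonto:instrument", "db:Guitar"),
    ("ex:artist2", "dbonto:instrument", "db:Piano"),
]


class Single:
    """Descriptor returning the first object for (subject, pred), or None."""

    def __init__(self, pred):
        self.pred = pred

    def __get__(self, obj, owner=None):
        if obj is None:  # accessed on the class: hand back the descriptor
            return self
        for s, p, o in TRIPLES:
            if s == obj.uri and p == self.pred:
                return o
        return None


class Multiple(Single):
    """Descriptor returning every object for (subject, pred) as a list."""

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return [o for s, p, o in TRIPLES if s == obj.uri and p == self.pred]


class Subject:
    def __init__(self, uri):
        self.uri = uri

    @classmethod
    def filter_by(cls, **kwargs):
        """Lazily yield instances whose triples match every keyword."""
        # Reading .pred off the class attribute only works for descriptors.
        wanted = [(getattr(cls, k).pred, v) for k, v in kwargs.items()]
        seen = set()
        for s, _, _ in TRIPLES:
            if s not in seen and all((s, p, o) in TRIPLES for p, o in wanted):
                seen.add(s)
                yield cls(s)


class MusicalArtist(Subject):
    birthname = Single("dbprop:birthName")
    instrument = Multiple("dbonto:instrument")


player = next(MusicalArtist.filter_by(instrument="db:Guitar"))
print(player.birthname)   # Example Name
print(player.instrument)  # ['db:Piano', 'db:Guitar']
```

In this toy, assigning a plain class instead of a descriptor (as in `instrument = Subject`) makes the lookup of `.pred` fail with an AttributeError, which resembles the error reported in the original message.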


########################################
# your code plus:
from rdfalchemy import rdfMultiple


class MusicalArtist(rdfSubject):
    rdf_type = DBONTO.MusicalArtist
    birthname = rdfSingle(DBPROP.birthName)
    instrument = rdfMultiple(DBONTO.instrument)

guitar_players = MusicalArtist.filter_by(instrument=DB.Guitar)

time guitar_players = MusicalArtist.filter_by(instrument=DB.Guitar)
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00 s

time player = guitar_players.next()
CPU times: user 2.37 s, sys: 0.07 s, total: 2.44 s
Wall time: 5.94 s

player
Out[8]: MusicalArtist('<http://dbpedia.org/resource/Mauro_Scocco>')

player.instrument
Out[9]:
[rdfSubject('<http://dbpedia.org/resource/Piano>'),
 rdfSubject('<http://dbpedia.org/resource/Bass_guitar>'),
 rdfSubject('<http://dbpedia.org/resource/Keyboard_instrument>'),
 rdfSubject('<http://dbpedia.org/resource/Singing>'),
 rdfSubject('<http://dbpedia.org/resource/Singer>'),
 rdfSubject('<http://dbpedia.org/resource/Guitar>')]

###############
The first query response took about 6 seconds, and subsequent responses (calls to next()) took about 3/4 second each. A slow way to go.
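
The timings above (0.00 s to create guitar_players, then seconds for the first next()) are what one would expect from a generator: nothing runs until the first item is requested. A minimal Python 3 sketch of that behaviour follows. `slow_results` is a hypothetical stand-in for a remote query, with a sleep in place of the network round trip. Note that Python 3 spells the call `next(results)` rather than `results.next()`.

```python
import time


def slow_results(n, delay=0.05):
    """Hypothetical stand-in for a generator backed by a remote endpoint."""
    for i in range(n):
        time.sleep(delay)  # pretend network round trip
        yield i


start = time.perf_counter()
results = slow_results(3)  # creating the generator runs none of its body
create_time = time.perf_counter() - start

start = time.perf_counter()
first = next(results)  # the first "round trip" happens here
first_time = time.perf_counter() - start

print(f"create: {create_time:.4f}s, first next(): {first_time:.4f}s")
```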

--
Phil

Tatiana Al-Chueyr Martins

Jun 22, 2012, 10:08:48 PM
to rdfalch...@googlegroups.com
Hi Philip!

On 22 June 2012 19:20, Philip Cooper <philip...@openvest.com> wrote:
> On 6/22/12 1:19 PM, Tatiana Al-Chueyr wrote:
>> Hello Guys,
>>
>> I've been trying to "replace" a SPARQL query using RDFAlchemy's rdfSubject, but I didn't succeed.
> Probably not a good idea here.  Let the database engine optimize the query and only send back the relevant results.  With a SQL database you would do the same.


The idea behind using the objects is to improve mocks and testing. I understand the delay problem, but since I may cache results, it could be worth it. I will follow your advice and study the case carefully. Thanks!
 
> That being said, to help you get on the right path:
>
>    * a player has one birthname and could play multiple instruments so use rdfMultiple
>    * in the class definition, you use a descriptor (e.g. rdfSingle, rdfMultiple etc) on the RHS (see below)
>    * get_by expects to return one item (e.g. the predicate is like an sql primary key or an OWL inverseFunctionalProperty).  get_by is not being used in later versions of sqlalchemy but I kept it.  filter_by is more appropriate here



Thank you very much for your prompt reply and your guidance!
 
> ########################################
> # your code plus:
> from rdfalchemy import rdfMultiple
>
>
> class MusicalArtist(rdfSubject):
>     rdf_type = DBONTO.MusicalArtist
>     birthname = rdfSingle(DBPROP.birthName)
>     instrument = rdfMultiple(DBONTO.instrument)
>
> guitar_players = MusicalArtist.filter_by(instrument=DB.Guitar)
>
> time guitar_players = MusicalArtist.filter_by(instrument=DB.Guitar)
> CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
> Wall time: 0.00 s


Hmmm, so lazy evaluation was used? Until this point, no query has been submitted to the SPARQL endpoint?

Is there some way of printing what the query will look like?
 
> time player = guitar_players.next()
> CPU times: user 2.37 s, sys: 0.07 s, total: 2.44 s
> Wall time: 5.94 s
>
> player
> Out[8]: MusicalArtist('<http://dbpedia.org/resource/Mauro_Scocco>')
>
> player.instrument
> Out[9]:
> [rdfSubject('<http://dbpedia.org/resource/Piano>'),
>  rdfSubject('<http://dbpedia.org/resource/Bass_guitar>'),
>  rdfSubject('<http://dbpedia.org/resource/Keyboard_instrument>'),
>  rdfSubject('<http://dbpedia.org/resource/Singing>'),
>  rdfSubject('<http://dbpedia.org/resource/Singer>'),
>  rdfSubject('<http://dbpedia.org/resource/Guitar>')]
>
> ###############
> first query response was 6 seconds and subsequent responses (calls to next()) were about 3/4 second each. slow way to go.


Indeed, very slow. Anyway, thank you very much for the explanation. Extremely helpful. As I said, in some circumstances this delay might not be a problem. Let's see ;)

Regards,

Tatiana
 
> --
> Phil

--
Tatiana Al-Chueyr