Hg repository issues?


Allan Hollander

May 12, 2009, 3:03:20 PM
to rdflib-dev
Hi all -- I'm trying to download the development version of rdflib
from the Google source repository (I'd like to try a version of rdflib
with a PostgreSQL backend store) but when I attempt to browse the Hg
repository at http://code.google.com/p/rdflib/source/browse/ it just
lists no files. Does something need to be fixed with this repository
or how do I go about obtaining the latest source?

Thanks,

Allan Hollander
Information Center for the Environment, UC Davis

Daniel Krech

May 12, 2009, 3:13:35 PM
to rdfli...@googlegroups.com
Hi Allan,

We've not yet moved our svn history into Hg. I didn't realize all the
source tab pages in code.google have changed to reflect what's in Hg.
I'll bump doing the migration up in priority; hopefully I can do that
this evening. In the meantime, the svn repository is still available
at:

svn checkout http://rdflib.googlecode.com/svn/trunk/ rdflib-read-only

Cheers,
Daniel

Graham Higgins

May 17, 2009, 1:42:38 PM
to rdflib-dev
On May 12, 8:03 pm, Allan Hollander <numen...@magpienest.org> wrote:
> Hi all -- I'm trying to download the development version of rdflib
> from the Google source repository (I'd like to try a version of rdflib
> with a PostgreSQL backend store)

I've recently been playing with the PostgreSQL store option. I
experienced a couple of problems (using PyGres4-beta) ...

It's gabby for no apparent reason, and keeps nattering away on each
triple addition:

NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"kb_3b5e0fec61_namespace_binds_pkey" for table
"kb_3b5e0fec61_namespace_binds"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"kb_3b5e0fec61_identifiers_pkey" for table "kb_3b5e0fec61_identifiers"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"kb_3b5e0fec61_literals_pkey" for table "kb_3b5e0fec61_literals"
WARNING: there is no transaction in progress

and there's an explicit, hard-coded use of a MySQL-only keyword
(STRAIGHT_JOIN) in store/FOPLRelationalModel/
BinaryRelationPartition.py which kills any PostgreSQL query stone
dead.

A comment in the code for store/MySQL.py indicates that the PostgreSQL
component is to be separated out at some point. In the interim, I'm
using a workaround.

I use psycopg2 (as recommended for SQLAlchemy users). I changed the
MySQL.py import section (~line 1477), adding an import test for
psycopg2 and indenting the pgdb imports to suit:

try:
    import psycopg2
    def _connect(self, db=None):
        if db is None:
            db = self.config['db']
        return PostgreSQL.psycopg2.connect(
                user=self.config['user'],
                password=self.config['password'],
                database=db,
                host=self.config['host'],
                port=self.config['port'])
except ImportError:
    try:
        import pgdb
        [ ... ]
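
The same "try one driver, fall back to the next" idea can be sketched in a self-contained form. This is only an illustration, not the code in MySQL.py: the helper name first_available is hypothetical, and it simply returns the first module in a list that imports cleanly.

```python
import importlib


def first_available(module_names):
    """Return the first module in module_names that can be imported.

    Hypothetical helper for illustration; the order of the list
    expresses the preference (e.g. psycopg2 before pgdb).
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This driver is not installed; try the next candidate.
            continue
    raise ImportError("none of %s could be imported" % ", ".join(module_names))
```

Calling first_available(["psycopg2", "pgdb"]) would then yield whichever DB-API module is installed, preferring psycopg2.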

And (somewhat brutally) I inserted the following test in
store/FOPLRelationalModel/BinaryRelationPartition.py (at line 856):

if 'psycopg2' in str(cursor.__class__): query = query.replace('STRAIGHT_JOIN', '')

[if you're using PyGreSQL, the test is: if 'pgdb' in str(cursor.__class__) ]

The workaround is effective under Python 2.5.1 and Python 2.6.2 with
psycopg2-2.0.10.
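
For anyone wanting to try the same trick outside the rdflib source, here is a minimal standalone sketch of the idea. It is not the patch itself: the names POSTGRES_DRIVERS, driver_name and strip_straight_join are made up for illustration, and it assumes the cursor class reports a module name that starts with the driver's package name (psycopg2 or pgdb).

```python
import re

# Driver packages assumed, for this sketch, to talk to PostgreSQL.
POSTGRES_DRIVERS = ("psycopg2", "pgdb")

# MySQL-only optimizer hint, plus the whitespace that follows it.
_STRAIGHT_JOIN = re.compile(r"\bSTRAIGHT_JOIN\s+")


def driver_name(cursor):
    """Top-level package of the module that defines the cursor's class."""
    return type(cursor).__module__.split(".")[0]


def strip_straight_join(query, cursor):
    """Drop STRAIGHT_JOIN from query when cursor is a PostgreSQL cursor."""
    if driver_name(cursor) in POSTGRES_DRIVERS:
        return _STRAIGHT_JOIN.sub("", query)
    return query
```

Using a regular expression rather than a plain replace also removes the space left behind, so "SELECT STRAIGHT_JOIN x" becomes "SELECT x" instead of "SELECT  x". Queries sent through any other driver are returned unchanged.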

I'm now a happy PostgreSQL-using bunny.

HTH

Graham Higgins

John L. Clark

May 18, 2009, 2:55:08 PM
to rdfli...@googlegroups.com
Graham,

On Sun, May 17, 2009 at 1:42 PM, you wrote:
> I've recently been playing with the PostgreSQL store option. I
> experienced a couple of problems (using PyGres4-beta) ...

You're using RDFLib from Subversion trunk? Also, what version of
PostgreSQL are you using?

> and there's an explicit, hard-coded use of a MySQL-only keyword
> (STRAIGHT_JOIN) in store/FOPLRelationalModel/
> BinaryRelationPartition.py which kills any PostgreSQL query stone
> dead.

Can you post a bit more about the setup code you're using that's
emitting the 'STRAIGHT_JOIN'? The PostgreSQL subclass should override
that in all the important places.

Thanks,

John L. Clark

Graham Higgins

May 18, 2009, 8:37:34 PM
to rdflib-dev
On May 18, 7:55 pm, "John L. Clark" <john.l.cl...@gmail.com> wrote:

John,

> You're using RDFLib from Subversion trunk?  Also, what version of
> PostgreSQL are you using?

I'm using Python2.6, RDFLib from trunk - now tip :-) and PostgreSQL
version is 8.2.5

> Can you post a bit more about the setup code you're using that's
> emitting the 'STRAIGHT_JOIN'?  The PostgreSQL subclass should override
> that in all the important places.

I did look through the code and I tried explicitly setting the
instantiated store's select_modifier attrib to the empty string, to no
avail.

This is what I get:

bash-3.2$ python2.6
Python 2.6 (r26:66714, Dec 4 2008, 16:17:57)
[GCC 4.0.1 (Apple Inc. build 5484)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import rdflib
>>> from rdflib.graph import ConjunctiveGraph as Graph
>>> from rdflib import plugin
>>> from rdflib.store import Store, NO_STORE, VALID_STORE
>>> from rdflib.namespace import Namespace
>>> from rdflib.term import Literal, URIRef
>>> default_graph_uri = "http://rdflib.net/rdfstore"
>>> configString = "host=localhost,user=foo,password=baz,db=rdflib"
>>> store = plugin.get('PostgreSQL', Store)('rdfstore')
rdflib/store/AbstractSQLStore.py:5: DeprecationWarning: the sha module
is deprecated; use the hashlib module instead
import sha,sys, weakref
>>> rt = store.open(configString,create=False)
table kb_7b066eca61_relations Doesn't exist
table kb_7b066eca61_relations Doesn't exist
>>> if rt == NO_STORE:
...     store.open(configString,create=True)
... else:
...     assert rt == VALID_STORE,"There underlying store is corrupted"
...
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
AssertionError: There underlying store is corrupted
>>> store.open(configString,create=True)
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"kb_7b066eca61_namespace_binds_pkey" for table
"kb_7b066eca61_namespace_binds"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"kb_7b066eca61_identifiers_pkey" for table "kb_7b066eca61_identifiers"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"kb_7b066eca61_literals_pkey" for table "kb_7b066eca61_literals"
1
>>> graph = Graph(store, identifier = URIRef(default_graph_uri))
>>> print "Triples in graph before add: ", len(graph)
Triples in graph before add: 0
>>> rdflib = Namespace('http://rdflib.net/test/')
>>> graph.add((rdflib['pic:1'], rdflib['name'], Literal('Jane & Bob')))
>>> graph.add((rdflib['pic:2'], rdflib['name'], Literal('Squirrel in Tree')))
>>> graph.commit()
WARNING: there is no transaction in progress
>>> print "Triples in graph after add: ", len(graph)
Triples in graph after add: 2
>>> print graph.serialize()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "rdflib/graph.py", line 685, in serialize
serializer.serialize(stream, base=base, encoding=encoding, **args)
File "rdflib/syntax/serializers/XMLSerializer.py", line 50, in
serialize
bindings = list(self.__bindings())
File "rdflib/syntax/serializers/XMLSerializer.py", line 25, in
__bindings
for predicate in uniq(store.predicates()):
File "rdflib/util.py", line 27, in uniq
map(set.__setitem__, sequence, [])
File "rdflib/graph.py", line 463, in predicates
for s, p, o in self.triples((subject, None, object)):
File "rdflib/graph.py", line 905, in triples
for (s, p, o), cg in self.store.triples((s, p, o),
context=context):
File "rdflib/store/MySQL.py", line 1250, in triples
rt=PatternResolution
((subject,predicate,obj,context),c,self.partitions,fetchall=False)
File "rdflib/store/FOPLRelationalModel/BinaryRelationPartition.py",
line 873, in PatternResolution
cursor.execute(query,tuple(unionQueriesParams))
File "build/bdist.macosx-10.5-i386/egg/pgdb.py", line 168, in
execute
self.executemany(operation, (params,))
File "build/bdist.macosx-10.5-i386/egg/pgdb.py", line 189, in
executemany
raise DatabaseError, "error '%s' in '%s'" % ( msg, sql )
pg.DatabaseError: error 'ERROR: syntax error at or near "rt_subject"
LINE 1: (SELECT STRAIGHT_JOIN rt_subject.lexical as
subject,rt_subje...
^
' in '(SELECT STRAIGHT_JOIN rt_subject.lexical as
subject,rt_subject.term_type as subjectTermType,rt_predicate.lexical
as predicate,rt_predicate.term_type as
predicateTermType,rt_object.lexical as object,'L' as
objectTermType,rt_context.lexical as context,rt_context.term_type as
contextTermType,rt_data_type.lexical as
dataType,kb_7b066eca61_literalProperties.language as language FROM
kb_7b066eca61_literalProperties LEFT JOIN kb_7b066eca61_identifiers
rt_data_type ON (kb_7b066eca61_literalProperties.data_type =
rt_data_type.id) INNER JOIN kb_7b066eca61_identifiers rt_predicate ON
(kb_7b066eca61_literalProperties.predicate = rt_predicate.id AND
kb_7b066eca61_literalProperties.predicate_term =
rt_predicate.term_type) INNER JOIN kb_7b066eca61_identifiers
rt_context ON (kb_7b066eca61_literalProperties.context = rt_context.id
AND kb_7b066eca61_literalProperties.context_term =
rt_context.term_type) INNER JOIN kb_7b066eca61_literals rt_object ON
(kb_7b066eca61_literalProperties.object = rt_object.id) INNER JOIN
kb_7b066eca61_identifiers rt_subject ON
(kb_7b066eca61_literalProperties.subject = rt_subject.id AND
kb_7b066eca61_literalProperties.subject_term = rt_subject.term_type)
WHERE kb_7b066eca61_literalProperties.context_term != 'F') union all
(SELECT STRAIGHT_JOIN
rt_subject.lexical,rt_subject.term_type,rt_predicate.lexical,rt_predicate.term_type,rt_object.lexical,rt_object.term_type,rt_context.lexical,rt_context.term_type,NULL,NULL
FROM kb_7b066eca61_relations INNER JOIN kb_7b066eca61_identifiers
rt_predicate ON (kb_7b066eca61_relations.predicate = rt_predicate.id
AND kb_7b066eca61_relations.predicate_term = rt_predicate.term_type)
INNER JOIN kb_7b066eca61_identifiers rt_context ON
(kb_7b066eca61_relations.context = rt_context.id AND
kb_7b066eca61_relations.context_term = rt_context.term_type) INNER
JOIN kb_7b066eca61_identifiers rt_object ON
(kb_7b066eca61_relations.object = rt_object.id AND
kb_7b066eca61_relations.object_term = rt_object.term_type) INNER JOIN
kb_7b066eca61_identifiers rt_subject ON
(kb_7b066eca61_relations.subject = rt_subject.id AND
kb_7b066eca61_relations.subject_term = rt_subject.term_type) WHERE
kb_7b066eca61_relations.context_term != 'F') union all (SELECT
STRAIGHT_JOIN rt_subject.lexical,rt_subject.term_type,'http://
www.w3.org/1999/02/22-rdf-syntax-ns#type','U',rt_object.lexical,rt_object.term_type,rt_context.lexical,rt_context.term_type,NULL,NULL
FROM kb_7b066eca61_associativeBox INNER JOIN kb_7b066eca61_identifiers
rt_object ON (kb_7b066eca61_associativeBox.class = rt_object.id AND
kb_7b066eca61_associativeBox.class_term = rt_object.term_type) INNER
JOIN kb_7b066eca61_identifiers rt_context ON
(kb_7b066eca61_associativeBox.context = rt_context.id AND
kb_7b066eca61_associativeBox.context_term = rt_context.term_type)
INNER JOIN kb_7b066eca61_identifiers rt_subject ON
(kb_7b066eca61_associativeBox.member = rt_subject.id AND
kb_7b066eca61_associativeBox.member_term = rt_subject.term_type) WHERE
kb_7b066eca61_associativeBox.context_term != 'F') ORDER BY
subject,predicate,object'


Switching API to psycopg2 gives a smoother drive but the STRAIGHT_JOIN
is still there.

bash-3.2$ python2.6
Python 2.6 (r26:66714, Dec 4 2008, 16:17:57)
[GCC 4.0.1 (Apple Inc. build 5484)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import rdflib
>>> from rdflib.graph import ConjunctiveGraph as Graph
>>> from rdflib import plugin
>>> from rdflib.store import Store, NO_STORE, VALID_STORE
>>> from rdflib.namespace import Namespace
>>> from rdflib.term import Literal, URIRef
>>> default_graph_uri = "http://rdflib.net/rdfstore"
>>> configString = "host=localhost,user=httpd,password=,db=rdflib"
>>> store = plugin.get('PostgreSQL', Store)('rdfstore')
rdflib/store/AbstractSQLStore.py:5: DeprecationWarning: the sha module
is deprecated; use the hashlib module instead
import sha,sys, weakref
>>> rt = store.open(configString,create=False)
>>> if rt == NO_STORE:
...     store.open(configString,create=True)
... else:
...     assert rt == VALID_STORE,"There underlying store is corrupted"
...
>>> graph = Graph(store, identifier = URIRef(default_graph_uri))
>>> print "Triples in graph before add: ", len(graph)
Triples in graph before add: 2
>>> rdflib = Namespace('http://rdflib.net/test/')
>>> graph.add((rdflib['pic:1'], rdflib['name'], Literal('Jane & Bob')))
>>> graph.add((rdflib['pic:2'], rdflib['name'], Literal('Squirrel in Tree')))
>>> graph.commit()
>>> print "Triples in graph after add: ", len(graph)
Triples in graph after add: 4
>>> print graph.serialize()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "rdflib/graph.py", line 685, in serialize
serializer.serialize(stream, base=base, encoding=encoding, **args)
File "rdflib/syntax/serializers/XMLSerializer.py", line 50, in
serialize
bindings = list(self.__bindings())
File "rdflib/syntax/serializers/XMLSerializer.py", line 25, in
__bindings
for predicate in uniq(store.predicates()):
File "rdflib/util.py", line 27, in uniq
map(set.__setitem__, sequence, [])
File "rdflib/graph.py", line 463, in predicates
for s, p, o in self.triples((subject, None, object)):
File "rdflib/graph.py", line 905, in triples
for (s, p, o), cg in self.store.triples((s, p, o),
context=context):
File "rdflib/store/MySQL.py", line 1250, in triples
rt=PatternResolution
((subject,predicate,obj,context),c,self.partitions,fetchall=False)
File "rdflib/store/FOPLRelationalModel/BinaryRelationPartition.py",
line 873, in PatternResolution
cursor.execute(query,tuple(unionQueriesParams))
psycopg2.ProgrammingError: syntax error at or near "rt_subject"
LINE 1: (SELECT STRAIGHT_JOIN rt_subject.lexical as
subject,rt_subje...
^

It's a crude workaround but like as not it won't be needed for long
and the PostgreSQL back-end isn't official yet.

If there's a proper fix tho', that'd be useful.

Cheers,

Graham Higgins

John L. Clark

May 19, 2009, 2:00:41 PM
to rdfli...@googlegroups.com
On Mon, May 18, 2009 at 8:37 PM, Graham Higgins
<gjhi...@googlemail.com> wrote:
>> Can you post a bit more about the setup code you're using that's
>> emitting the 'STRAIGHT_JOIN'? The PostgreSQL subclass should override
>> that in all the important places.

> It's a crude workaround but like as not it won't be needed for long
> and the PostgreSQL back-end isn't official yet.
>
> If there's a proper fix tho', that'd be useful.

Yeah, you just unearthed a straight-up bug. That was one execution
path where the store's select_modifier was not properly propagated to
a utility function. I have attached a patch that should fix it, but I
haven't yet had a chance to actually test it.
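
To illustrate the kind of bug described here (this is a simplified sketch, not the contents of the attached patch), imagine a utility function that builds the SELECT and a store class that has to hand its own select_modifier down to it. The names build_select, MySQLLikeStore and PostgreSQLLikeStore are hypothetical.

```python
def build_select(columns, table, select_modifier=""):
    """Build a SELECT statement, inserting the modifier only when non-empty."""
    parts = ["SELECT"]
    if select_modifier:
        parts.append(select_modifier)
    parts.append(", ".join(columns))
    parts.append("FROM " + table)
    return " ".join(parts)


class MySQLLikeStore(object):
    # MySQL-specific optimizer hint.
    select_modifier = "STRAIGHT_JOIN"

    def triples_query(self):
        # The store passes its own modifier down to the utility function;
        # forgetting this argument is the sort of slip described above.
        return build_select(["subject", "predicate"], "relations",
                            self.select_modifier)


class PostgreSQLLikeStore(MySQLLikeStore):
    # PostgreSQL has no STRAIGHT_JOIN, so the subclass blanks it out.
    select_modifier = ""
```

With the modifier passed through, the PostgreSQL-style subclass emits "SELECT subject, predicate FROM relations", while the MySQL-style parent still emits the STRAIGHT_JOIN hint.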

Take care,

John

rdflib-straight-join_2009-05-19.diff

Graham Higgins

May 20, 2009, 1:33:18 PM
to rdflib-dev
On May 19, 7:00 pm, "John L. Clark" <john.l.cl...@gmail.com> wrote:

> Yeah, you just unearthed a straight-up bug. That was one execution
> path where the store's select_modifier was not properly propagated to
> a utility function. I have attached a patch that should fix it, but I
> haven't yet had a chance to actually test it.

Applied, tested (with my simple-minded read/write test) and working
for me. Thank you.

Cheers,

Graham.

Daniel Krech

May 20, 2009, 4:34:10 PM
to Daniel Krech, rdfli...@googlegroups.com
Thanks to the help of Graham we now have all of our svn history
imported into our Hg repository. I'd like to propose that we switch to
using our Hg repository moving forward... seeing as we kind of have
already.

I didn't realize when I requested that Hg be activated for rdflib it'd
essentially replace svn... as they made it sound it'd be in addition
to svn. Was planning on having some discussion before making the
switch. Hopefully we can all just move forward using Hg and focus on
pulling together a new release in the not so distant future.