Comment #3 on issue 4 by gjhigg...@gmail.com: Switch over to current rdflib
In response to Chimezie's post to the forum:
++ Excellent! I will take a look and try to get a sense of the effort
++ it would take to take this to its conclusion: switching back to
++ rdflib while maintaining the divergent module-changes and components
++ (such as the pure python parser, the Generic SPARQL Store,
++ the MySQL/SPARQL implementation, etc.). Do you have any sense of this?
I am able to report that all the above-referenced work has been restored,
refactored and recently merged back into the default branch of my clone
of rdfextras, ready for pushing to the "official" repos:
There is a Hudson CI build which tracks commits:
and which maintains reports of test runs, currently standing at 369 tests
with 4 failures and 13 skips (of known issues, mostly with SQL stores):
and (fwiw) coverage reports:
With respect to detail - the MySQL/SPARQL implementation is available as
rdfextras.sparql2sql and shows little or no difference in test results from
the extant default rdfextras SPARQL implementation.
Most of the stores have been recovered and refactored but I'm unsure of
what you mean by "the Generic SPARQL Store" - the recovered stores are in:
Whilst the key-value stores required little change beyond a mild
refactoring, the SQL stores are evincing problems when running tests that
involve contexts and Statements.
Many of the tests make assertions about the length of the graph but this
seems to be broken for contexts, as this Pdb interaction apparently
demonstrates (if the comment formatting screws up the layout, I'll
attach a text file):
python run_tests.py --pdb-failure
Running nose with: --pdb-failure
--attr=!performancetest --where=./ --with-doctest
-> raise self.failureException(msg)
-> assertion_func(first, second, msg=msg)
-> self.assertEquals(len(self.graph), oldLen + 1)
*** Exception: Can't split 'hates'
'\n<pizza> <hates> <tarek> .\n\n'
(Pdb) self.assertEquals(len([y for y in self.graph.triples((None, None, None))]), oldLen + 1)
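The workaround shown in the Pdb session above is to count the triples by
exhausting graph.triples() rather than trusting len(graph). Here is a minimal
sketch of that pattern; note that FakeGraph and triple_count are hypothetical
stand-ins for illustration, not part of rdflib:

```python
def triple_count(graph):
    """Count triples by exhausting graph.triples(...) rather than len(graph),
    since len() appears unreliable for context-aware stores."""
    return sum(1 for _ in graph.triples((None, None, None)))


class FakeGraph:
    """Illustrative stand-in exposing an rdflib-like triples() generator."""

    def __init__(self, triples):
        self._triples = list(triples)

    def triples(self, pattern):
        s, p, o = pattern
        for t in self._triples:
            # None acts as a wildcard, as in rdflib's triple patterns
            if all(q is None or q == v for q, v in zip((s, p, o), t)):
                yield t


g = FakeGraph([("tarek", "likes", "pizza"), ("pizza", "hates", "tarek")])
assert triple_count(g) == 2
```

In the test above, the assertion would become
`self.assertEquals(triple_count(self.graph), oldLen + 1)`, sidestepping the
broken `len()` for contexts.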
The failure to serialize the test statements as XML is rather inconvenient
and perhaps even a bug.
Still, even with the limitation of several significant test failures,
it is possible to run FuXi's test suite with rdflib 3.2 dev and the
"restoration" rdfextras clone.
Again, there is a Hudson CI build:
similarly tracking commits and maintaining reports of test runs, currently
standing at 87 tests with 31 failures
and (again, fwiw) test coverage
The complete console output is captured here:
For my own convenience, I adjusted matters so that I could run nose; its
--pdb and --pdb-failure options are extremely useful conveniences. The
existing test/suite.py seems to find 469 doctests; I can't explain the
difference as yet.
I can't detect any significant difference between the results of suite.py run
with FuXi+layercake and the same test run with refactored FuXi +
rdflib3/restoration rdfextras; the numbers of tests, passes and failures were
pretty much the same (to a casual inspection).
The overwhelming majority of the test failures would seem to be due to
case mismatches and other format mismatches, e.g.
( ex:Fire and ex:Water )
( ex:Fire AND ex:Water )
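If the mismatches really are only keyword case, a comparison helper along the
following lines might confirm it; this is a speculative sketch (the keyword
list and the normalise function are my guesses, not anything in FuXi):

```python
import re

# Candidate keywords whose case seems to vary between serialisations,
# e.g. "and" vs "AND" as seen in the failing tests. This list is a guess.
KEYWORDS = ("and", "or", "not", "some", "only", "value")

def normalise(s):
    """Lower-case bare keywords so that serialisations differing only in
    keyword case compare equal; prefixed names like ex:Fire are untouched
    because \\b requires the whole word to match."""
    pattern = r"\b(%s)\b" % "|".join(KEYWORDS)
    return re.sub(pattern, lambda m: m.group(1).lower(), s,
                  flags=re.IGNORECASE)

assert normalise("( ex:Fire AND ex:Water )") == normalise("( ex:Fire and ex:Water )")
```

Running the expected/actual pairs from the failing tests through such a
normaliser before comparing would show how many failures are purely cosmetic.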
If this were any domain other than RDF, I would readily opine that a fix
would appear to be trivial - but I've learned to be circumspect, even
with what seems obvious.
I have recently removed the rdflib2/rdflib3 import switching because FuXi
does not run with either rdflib-2.4.1 or rdflib-2.4.2, only with the
"layercake" fork. This is immediately apparent from the failure to import a
"parse" function from rdflib.sparql.parser, and then from the rdflib.OWL
module being missing completely from the 2.4.1/2.4.2 package (that's as far
as I got before the realisation settled in).