This causes a bit of a problem when using pip instead
of easy_install. pip is good for developing in place
without having to install the package: it just
manipulates the setuptools import machinery to point
at the checked-out source.
However, if you try to do:
% virtualenv testing
% . ./testing/bin/activate
(testing)% pip install -e hg+https://fuxi.googlecode.com/hg/#egg=fuxi
% python
>>> import FuXi
ImportError(...)
A workaround is to create a symlink (ln -s lib FuXi), but
I wonder whether it would be too much trouble to do a
% hg mv lib FuXi
and edit setup.py appropriately? Then this would work
seamlessly...
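
A minimal sketch of why the import fails, assuming that a develop-mode
install effectively just puts the checkout directory on sys.path. The
temporary directory stands in for the checkout, and the function names
(can_import, demo) are made up for illustration; nothing here is FuXi's
or pip's own code.

```python
import importlib
import os
import sys
import tempfile


def can_import(name, search_dir):
    """Return True if `name` is importable with `search_dir` on sys.path."""
    sys.path.insert(0, search_dir)
    importlib.invalidate_caches()  # forget any cached directory listing
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False
    finally:
        sys.path.remove(search_dir)
        sys.modules.pop(name, None)  # leave no trace of the demo package


def demo(package_name="FuXi"):
    """Show the import failing with a `lib` dir and working after a rename."""
    with tempfile.TemporaryDirectory() as checkout:
        pkg_dir = os.path.join(checkout, "lib")
        os.mkdir(pkg_dir)
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
            handle.write("")
        # The package directory is called "lib", so the name is not found.
        before = can_import(package_name, checkout)
        # Equivalent in effect to "hg mv lib FuXi" (or the symlink).
        os.rename(pkg_dir, os.path.join(checkout, package_name))
        after = can_import(package_name, checkout)
    return before, after
```

Calling demo() returns a pair of booleans: whether the import worked
before the rename and whether it worked after.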
Cheers,
-w
--
William Waites <wwa...@gmail.com>
Mob: +44 789 798 9965
Fax: +44 131 464 4948
On a related note, module naming does not follow PEP8 [1], because it
uses title casing instead of the lower casing recommended there.
However, changing that would break compatibility with code that
already depends on FuXi. Maybe FuXi shouldn't make the same mistakes
as the rdflib people did just to follow style conventions, as good as
that goal might be.
Just to be clear, my suggestion about renaming
the lib directory to FuXi to facilitate using pip
would not result in any api changes and I can't
see how it would break anything.
With rdflib, there are several problems that I can
see, and the renaming of modules is the least
of them -- import statements can be trivially
fixed.
More problematic is the removal of functionality,
back-end stores, SPARQL support, etc. This is
compounded by the decision not to use
setuptools. Not using setuptools means that
in order to find plugins that live in other packages
(like rdfextras) you have to reinvent something
like entrypoints. I don't see the usefulness in
doing that: setuptools is very common and does
that perfectly well.
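
For illustration, a sketch of entry-point based plugin discovery. It
uses importlib.metadata from the modern standard library, which reads
the same entry-point metadata that setuptools writes; the helper name
discover_plugins and the group name "rdf.plugins.store" in the usage
note are assumptions for the example, not something rdflib or rdfextras
is confirmed to define.

```python
from importlib import metadata


def discover_plugins(group):
    """Return {name: entry_point} for installed entry points in `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later: selectable collection of entry points.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict mapping group name to entry points.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}
```

A package such as rdfextras would then only need to declare its stores
under an agreed group in its setup.py, and the core could call something
like discover_plugins("rdf.plugins.store") and load each entry point on
demand with ep.load(), with no hardcoded registry.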
There are also some very useful things in the
layercake branch, chiefly the pure-python SPARQL
and the remote SPARQL store, that would really
like to live in rdfextras.
As for old code being cast in stone, there are
still improvements that could be made to either
branch. I'm thinking of the bnc (blank node
closure) and distinct_subjects/predicates/objects
methods on the ORDF variant of Graph that I
would really like to see pulled up into rdflib
proper (both branches!)
Finally, for me, the main advantage of arranging
and distributing my data using RDF as opposed
to, say, XML or JSON is the ability to do inferencing.
FuXi is probably the best RDF inference engine
in existence and it is a shame that there was not
more coordination in the great refactoring. This
is the main reason that while I maintain a certain
level of 3.0 compatibility in ORDF with an eye
to the future, I don't use it.
This is not all idle talk. I'm quite willing to help
improve rdflib 3.0 and update other packages
and extras to use it. I'm fortunate in that my
current work with the Open Knowledge Foundation
involves heavy use of rdflib and FuXi and I'm sure
I can justify spending some time on this, but I'm
hesitant because design decisions that I might
make (e.g. the use of setuptools and entrypoints)
might not be agreed with by the rdflib developers.
Cheers,
-w
--
William Waites <wwa...@gmail.com>
It also prevents simply doing PYTHONPATH=`pwd` ... I would be
curious to know what technical advantages you are talking about.
> It seems the reason you want this is to use setup.py develop, rather
> than pip itself. I strongly recommend you avoid doing so. setup.py
> develop (like much of setuptools) is an utter mess and if you use it
> long enough you will find the many quirks and frustrations
Quite so, "pip install -e" does, I believe, run "setup.py develop". I tend to
write and test in a very tight loop, enough so that having to
re-run "setup.py install" is an annoyance. I find develop mode
very helpful. I don't doubt that there are bugs, but they haven't
bitten me... In any event this is a question of personal convenience
and I don't mind adjusting my practices if necessary (a symlink
and setting PYTHONPATH if I must).
> which have now culminated in distutils2 and distribute,
> behind which Guido famously threw his support. I recommend developing
> to those in mind rather than setuptools.
The main thing I'm concerned with is entrypoints which
distribute has. I think they are crucial for something like
rdfextras to be viable without reinventing the wheel. I
don't mind using distribute over setuptools.
What I don't know is the rdflib attitude towards distribute.
Right now 3.0 uses old distutils and has a hardcoded
entrypoint-like dictionary thing in plugin.py...
Cheers,
-w
--
William Waites <wwa...@gmail.com>
On 25 Aug 2010, at 09:45, William Waites wrote:
> I'm quite willing to help improve rdflib 3.0 and
> update other packages and extras to use it.
There -may- be less work involved than has generally been anticipated...
This post carries FuXi-specific details that carry over from the last
of my recent three posts to rdflib-dev
(http://groups.google.com/group/rdflib-dev/msg/99a60eb68be3ebe0)...
In another strand of inquiry, I was hoping to compare FuXi+rdflib3
test output against FuXi+layercake test output (a "gold standard" is
useful in such circumstances). Unfortunately, I haven't yet managed
successfully to run the test suite for FuXi+layercake. This is because
(other) rdflib-generated exceptions terminate the test run after about
ten tests, but it is clear even from the truncated run that
FuXi+layercake passes considerably more tests than does FuXi+rdflib3.
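
As a rough sketch of the kind of gold-standard comparison meant here,
assuming each run has been reduced to a mapping from test name to a
pass/fail boolean. The function name compare_runs and the test names in
the usage note are hypothetical; this is not part of the FuXi test
suite.

```python
def compare_runs(gold, candidate):
    """Compare two {test_name: passed} mappings against each other.

    `gold` is the reference run, `candidate` is the run under scrutiny.
    Tests absent from `candidate` (for example because the run was cut
    short) are reported separately rather than counted as failures.
    """
    regressions = sorted(
        name for name, ok in gold.items()
        if ok and candidate.get(name) is False
    )
    fixes = sorted(
        name for name, ok in gold.items()
        if not ok and candidate.get(name) is True
    )
    missing = sorted(set(gold) - set(candidate))
    return {"regressions": regressions, "fixes": fixes, "missing": missing}
```

For example, compare_runs({"t1": True, "t2": False, "t3": True},
{"t1": False, "t2": True}) reports t1 as a regression, t2 as a fix and
t3 as missing.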
With FuXi+rdflib3, rendering the printed output in HTML eases the task
of interpreting the results (which I find confusing in terms of what
passes and what fails) but rendering restructured text is a PITA, so I
exchanged the code that outputs restructured text for code that
outputs XML (docbook 4.5) with a stylesheet PI:
http://bel-epa.com/area51/library/fuxitest.xml
I intend that this will be automatically updated whenever builds occur
(I just need to commit the hacked-about testOWL.py to trigger this
process).
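
A simplified sketch of emitting results as XML with a stylesheet
processing instruction, using only the standard library. The element
layout (an article with one para per test and a role attribute) and the
function name results_to_docbook are assumptions for illustration; this
is not the code in the hacked-about testOWL.py, and it omits the DocBook
DOCTYPE.

```python
from xml.dom import minidom


def results_to_docbook(results, stylesheet_href):
    """Serialize (test_name, passed) pairs as a small DocBook-style article."""
    impl = minidom.getDOMImplementation()
    doc = impl.createDocument(None, "article", None)

    # The stylesheet PI must come before the root element.
    pi = doc.createProcessingInstruction(
        "xml-stylesheet", 'type="text/xsl" href="%s"' % stylesheet_href
    )
    doc.insertBefore(pi, doc.documentElement)

    root = doc.documentElement
    title = doc.createElement("title")
    title.appendChild(doc.createTextNode("Test results"))
    root.appendChild(title)

    for name, passed in results:
        para = doc.createElement("para")
        # The role attribute gives the stylesheet something to style on.
        para.setAttribute("role", "pass" if passed else "fail")
        para.appendChild(doc.createTextNode(name))
        root.appendChild(para)

    return doc.toxml()
```

A browser that honours the xml-stylesheet instruction can then render
the file directly, which avoids a separate reStructuredText-to-HTML
step.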
What confuses me is that FuXi appears to (mostly) work, even on
complex tasks, yet the number of test failures would strongly suggest
otherwise. This leads me to speculate that perhaps part of the reason
for the extensive test failures lies with the test mechanism itself
and not altogether (if at all) with the FuXi code.
An example...
A few days ago Will posted advice of, and a link to, his Object
Description Mapper: http://packages.python.org/ordf/odm.html which
carries an extensive worked example of using FuXi_ordf for inferencing.
Ever the cargo culter (in this instance, it's a usefully direct method
of illuminating my local code landscape), I serially copied and pasted
from Will's example into TextMate.app, hitting the "run buffer" button
as I went and FuXi+rdflib3 tracked the example code execution
flawlessly.
On completion of the cargo culting of Will's example, the output from
FuXi+rdflib3 runs thus:
> <class '__main__.Country'> ['Scotland']
> ['Scotland']
> ['Russia']
> Time to build production rule (RDFLib 3.0.0): 7.39097595215e-05 second
I haven't yet had time to explore FuXi in any depth, so ATM I'm unable
to assess how much of FuXi is being exercised by Will's example.
In an attempt to exercise FuXi_rdflib3 a little further, I performed
the same copy'n'paste exercise with the command-line examples listed
on the FuXi wiki pages. I post the test script and the results here
and invite informed comments which might help guide me towards a
successful test run.
#!/bin/sh
export FUXI=/...path/to.../bin/FuXi
echo "Test 1"
${FUXI} \
--firstAnswer \
--debug \
--method=both \
--safety=loose \
--output=conflict \
--normalize \
--ns=ub=http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl# \
--ns=ex=http://www.Department0.University0.edu/ \
--why="ASK { ?class a ub:Course }" \
--dlp \
--ontology=http://www.lehigh.edu/%7Ezhp2/2004/0401/univ-bench.owl \
http://swat.cse.lehigh.edu/projects/lubm/University0_0.owl
echo "Test 2"
${FUXI} \
--ns=ex=http://www.agfa.com/w3c/euler/subclass# \
--why="ASK { ex:i a ex:A }" \
--debug \
--method=both \
--input-format=n3 \
--dlp \
http://www.agfa.com/w3c/euler/subclass.n3
echo "Test 3"
${FUXI} \
--ruleFacts \
--why="ASK { test:Ghent test:path test:Amsterdam }" \
--ns=test=http://www.w3.org/2002/03owlt/TransitiveProperty/premises001# \
--dlp \
--output=conflict \
--debug \
--method=both \
--strict=defaultDerived \
http://www.w3.org/2002/03owlt/TransitiveProperty/premises001
echo "Test 4"
${FUXI} \
--safety=loose \
--strictness=defaultDerived \
--idb=owl:sameAs \
--method=both \
--why="ASK { ex:subject1 owl:sameAs ex:subject2 }" \
--debug \
--ns=ex=http://www.w3.org/2002/03owlt/InverseFunctionalProperty/premises001# \
--pDSemantics \
--builtinTemplates=file:///...path/to.../RuleBuiltinSPARQLTemplates.n3 \
--dlp \
http://www.w3.org/2002/03owlt/InverseFunctionalProperty/premises001.rdf
Results (which I have edited for the purposes of presenting here):
Test 1
==============================================================================
Time to build production rule (RDFLib 3.0.0): 8.01086425781e-05 seconds
/.../fuxi-hg/FuXi/Rete/CommandLine.py:296: SyntaxWarning:
Ignoring unsafe rule (.*) safety = safetyNameMap[options.safety])
(Forall ( Exists _:oBYAMnLe43 ( takesCourse(?X oBYAMnLe43) ) :-
GraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe43 ( GraduateCourse(oBYAMnLe43) ) :-
GraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe52 ( worksFor(?X oBYAMnLe52) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe52 ( ResearchGroup(oBYAMnLe52) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe111 ( worksFor(?X oBYAMnLe111) ) :-
Faculty(?X) ))
(Forall ( Exists _:oBYAMnLe111 ( Organization(oBYAMnLe111) ) :-
Faculty(?X) ))
(Forall ( Exists _:oBYAMnLe124 ( worksFor(?X oBYAMnLe124) ) :-
AdministrativeStaff(?X) ))
(Forall ( Exists _:oBYAMnLe124 ( Organization(oBYAMnLe124) ) :-
AdministrativeStaff(?X) ))
(Forall ( Exists _:oBYAMnLe137 ( takesCourse(?X oBYAMnLe137) ) :-
UndergraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe137 ( Course(oBYAMnLe137) ) :-
UndergraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe150 ( takesCourse(?X oBYAMnLe150) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe150 ( Course(oBYAMnLe150) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe250 ( teachingAssistantOf(?X
oBYAMnLe250) ) :- TeachingAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe251 ( Course(oBYAMnLe251) ) :-
TeachingAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe271 ( headOf(?X oBYAMnLe271) ) :-
Director(?X) ))
(Forall ( Exists _:oBYAMnLe272 ( Program(oBYAMnLe272) ) :- Director(?X) ))
(Forall ( Exists _:oBYAMnLe292 ( headOf(?X oBYAMnLe292) ) :- Chair(?X) ))
(Forall ( Exists _:oBYAMnLe293 ( Department(oBYAMnLe293) ) :- Chair(?X) ))
(Forall ( Exists _:oBYAMnLe313 ( takesCourse(?X oBYAMnLe313) ) :-
Student(?X) ))
(Forall ( Exists _:oBYAMnLe314 ( Course(oBYAMnLe314) ) :- Student(?X) ))
(Forall ( Exists _:oBYAMnLe334 ( worksFor(?X oBYAMnLe334) ) :-
Employee(?X) ))
(Forall ( Exists _:oBYAMnLe335 ( Organization(oBYAMnLe335) ) :-
Employee(?X) ))
(Forall ( Exists _:oBYAMnLe355 ( headOf(?X oBYAMnLe355) ) :- Dean(?X) ))
(Forall ( Exists _:oBYAMnLe356 ( College(oBYAMnLe356) ) :- Dean(?X) ))
/.../fuxi-hg/FuXi/Rete/Magic.py:553: UserWarning:
predicate symbol of Publication(?X) is in both IDB and EDB. Marking as
base
predicate symbol of University(?oBYAMnLe482) is in both IDB and EDB.
Marking as base
predicate symbol of Course(?oBYAMnLe432) is in both IDB and EDB.
Marking as base
predicate symbol of TeachingAssistant(?X) is in both IDB and EDB.
Marking as base
predicate symbol of ResearchGroup(?X) is in both IDB and EDB. Marking
as base
predicate symbol of worksFor(?X ?Y) is in both IDB and EDB. Marking as
base
predicate symbol of subOrganizationOf(?X ?oBYAMnLe388) is in both IDB
and EDB. Marking as base
predicate symbol of memberOf(?oBYAMnLe382 ?X) is in both IDB and EDB.
Marking as base
Derived predicates (top-down)
[u'ub:Software', u'ub:Person', u'ub:Research', u'ub:Employee',
u'ub:Professor',
u'ub:Work', u'ub:Director', u'ub:Organization', u'ub:member',
u'ub:Article',
u'ub:Student', u'ub:AdministrativeStaff', u'ub:Dean',
u'ub:Faculty',
u'ub:degreeFrom', u'ub:hasAlumnus', u'ub:Schedule',
u'ub:Course', u'ub:Chair']
Magic seed fact (used in bottom-up evaluation) :Course_magic(?class)
Solving :Course(?class) {}
Processing rule :Course_f(?oBYAMnLe442) :- teacherOf(?X ?oBYAMnLe442)
Solving :teacherOf(?X ?oBYAMnLe442) {}
SELECT ?X ?oBYAMnLe442 { ?X ub:teacherOf ?oBYAMnLe442 }-> []
Time to reach answer
{?class: rdflib.term.URIRef('http://www.Department0.University0.edu/Course52')}
via top-down SPARQL sip strategy: 21.26288414 milli seconds
Exception RuntimeError: 'generator ignored GeneratorExit' in
<generator object collectAnswers at 0x2e12f08> ignored
reduction in size of program: 96.0396039604 (101 -> 4 clauses)
Derived predicates (bottom-up)
[u'ub:Software', u'ub:Person', u'ub:Research', u'ub:Employee',
u'ub:Professor',
u'ub:Work', u'ub:Director', u'ub:Organization', u'ub:member',
u'ub:Article',
u'ub:Student', u'ub:AdministrativeStaff', u'ub:Dean',
u'ub:Faculty',
u'ub:degreeFrom', u'ub:hasAlumnus', u'ub:Schedule',
u'ub:Course', u'ub:Chair']
Time to calculate closure on working memory: 493.834018707 milli
seconds
<Network: 4 rules, 8 nodes, 224 tokens in working memory, 105 inferred
tokens>
<TerminalNode (pass-thru):
CommonVariables: [?X, ?oBYAMnLe442] (0 in left, 128 in right memories)>
:Course_f(?oBYAMnLe442) :- teacherOf(?X ?oBYAMnLe442)
67 instanciations
<TerminalNode (pass-thru):
CommonVariables: [?X] (0 in left, 67 in right memories)>
:Course_f(?X) :- GraduateCourse(?X)
30 instanciations
<TerminalNode (pass-thru):
CommonVariables: [?X, ?oBYAMnLe432] (0 in left, 29 in right memories)>
:Course_f(?oBYAMnLe432) :- teachingAssistantOf(?X ?oBYAMnLe432)
8 instanciations
Test 2
==============================================================================
Time to build production rule (RDFLib 3.0.0): 7.70092010498e-05 seconds
/.../fuxi-hg/FuXi/Rete/Magic.py:553:
UserWarning: predicate symbol of ex:B(?X) is in both IDB and EDB.
Marking as base
Derived predicates (top-down) [u'ex:A']
Sideways Information Passing (sip) graph:
Magic seed fact (used in bottom-up evaluation) :A_magic(:i)
Solving :A(:i) {}
Processing rule :A_b(?X) :- ex:B(?X)
Solving :B(:i) {?X: rdflib.term.URIRef('http://www.agfa.com/w3c/euler/subclass#i')}
ASK { ex:i a ex:B } 1 apriori binding(s)-> True
Time to reach answer True via top-down SPARQL sip strategy:
6.30307197571 milli seconds
reduction in size of program: 50.0 (2 -> 1 clauses)
Derived predicates (bottom-up) [u'ex:A']
Time to calculate closure on working memory: 0.897884368896 milli
seconds
<Network: 1 rules, 4 nodes, 3 tokens in working memory, 1 inferred
tokens>
@prefix : <http://www.agfa.com/w3c/euler/subclass#> .
:i a :A .
Test 3
==============================================================================
Time to build production rule (RDFLib 3.0.0): 7.60555267334e-05 seconds
/.../fuxi-hg/FuXi/Rete/Magic.py:553: UserWarning:
predicate symbol of test:path(?X ?qWPXHvIJ20) is in both IDB and EDB.
Marking as derived
Traceback (most recent call last):
File "/.../FuXi", line 8, in <module>
load_entry_point('FuXi==1.0.dev', 'console_scripts', 'FuXi')()
File "/.../fuxi-hg/FuXi/Rete/CommandLine.py", line 416, in main
defaultDerivedPreds))
File "/.../fuxi-hg/FuXi/Rete/Magic.py", line 78, in
SetupDDLAndAdornProgram
adornedProgram = AdornProgram(factGraph,rules,GOALS,derivedPreds)
File "/.../fuxi-hg/FuXi/Rete/Magic.py", line 338, in AdornProgram
adornedRule=AdornRule(derivedPreds,clause,term)
File "/.../fuxi-hg/FuXi/Rete/Magic.py", line 262, in AdornRule
headArc = len(N)==1 and N[0] == GetOp(newHead)
File "/.../rdflib/collection.py", line 76, in __len__
assert item not in links,"There is a loop in the RDF list! (%s
has been processed before)"%item
AssertionError: There is a loop in the RDF list!
(http://www.w3.org/2002/03owlt/TransitiveProperty/premises001#path
has been processed before)
Test 4
==============================================================================
Time to build production rule (RDFLib 3.0.0): 0.000101089477539 seconds
/.../fuxi-hg/FuXi/Rete/CommandLine.py:419:
UserWarning: Unable to solve goal via ruleset
Derived predicates (top-down) []
Magic seed fact (used in bottom-up evaluation)
owl:sameAs_magic(:subject1 :subject2)
Solving owl:sameAs(:subject1 :subject2) {}
ASK { ex:subject1 owl:sameAs ex:subject2 }-> False
Time to reach answer False via top-down SPARQL sip strategy:
4.43291664124 milli seconds
Derived predicates (bottom-up) []
Time to calculate closure on working memory: 0.900030136108 milli
seconds
<Network: 11 rules, 39 nodes, 3 tokens in working memory, 0 inferred
tokens>
--
Cheers,
Graham
http://www.linkedin.com/in/ghiggins