pip and fuxi and lib vs FuXi

William Waites

Jul 28, 2010, 8:57:27 AM
to fuxi-di...@googlegroups.com
A bit of a problem with directory naming. In the FuXi
source, rather than the standard convention of having
the directory name the same as the package name it is
called "lib". So instead of "FuXi/Rete/Network.py" there
is "lib/Rete/Network.py"

This causes a bit of a problem when using pip instead
of easy_install. pip is good for doing development
in-place without having to install the package: it just
manipulates the setuptools import machinery to point
at the checked-out source.

However if you try to do:

% virtualenv testing
% . ./testing/bin/activate
(testing)% pip install -e hg+https://fuxi.googlecode.com/hg/#egg=fuxi
(testing)% python
>>> import FuXi
ImportError(...)

A workaround is to create a symlink (ln -s lib FuXi), but
I wonder whether it would be too much trouble to do a

% hg mv lib FuXi

and edit setup.py appropriately? Then this would work
seamlessly...
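
A minimal sketch of why the import fails and why the symlink helps, assuming the development install does little more than put the checkout directory on sys.path. "DemoPkg" is a hypothetical stand-in for FuXi so the sketch cannot collide with a real installation, and the checkout is a throwaway temporary directory:

```python
import importlib
import os
import sys
import tempfile

# Fake checkout whose package source lives in "lib", as in the FuXi tree.
checkout = tempfile.mkdtemp()
os.mkdir(os.path.join(checkout, "lib"))
with open(os.path.join(checkout, "lib", "__init__.py"), "w") as f:
    f.write("NAME = 'DemoPkg'\n")

# Assumption: this is roughly what an in-place install arranges.
sys.path.insert(0, checkout)


def can_import(name):
    """Return True if `name` can be imported right now."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# There is no directory called DemoPkg on the path yet, only "lib".
before = can_import("DemoPkg")

# The workaround from the message, i.e. "ln -s lib DemoPkg".
os.symlink("lib", os.path.join(checkout, "DemoPkg"))
after = can_import("DemoPkg")

print(before, after)
```

The import only succeeds once a directory entry with the package's own name exists on the path, which is what renaming lib would provide permanently.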

Cheers,
-w

--
William Waites <wwa...@gmail.com>
Mob: +44 789 798 9965
Fax: +44 131 464 4948

Augusto Herrmann

Aug 24, 2010, 2:14:30 PM
to fuxi-discussion
On a related note, the module names do not follow PEP8 [1], because they
use title casing instead of just lower casing as recommended there.
However, changing that would break compatibility with code that
already depends on FuXi. Maybe FuXi shouldn't make the same mistakes
as the rdflib people did just to follow style conventions, as good as
that goal might be.

[1] http://www.python.org/dev/peps/pep-0008/

Cheers,
Augusto Herrmann

Daniel Krech

Aug 24, 2010, 5:23:01 PM
to fuxi-di...@googlegroups.com
On Tue, Aug 24, 2010 at 2:14 PM, Augusto Herrmann <hell...@gmail.com> wrote:
> On a related note, module naming do not follow PEP8 [1], because they
> use title casing instead of just lower casing as recommended there.
> However, changing that would break compatibility with code that
> already depends on FuXi. Maybe FuXi shouldn't make the same mistakes
> as the rdflib people did just to follow style conventions, as good as
> that goal might be.

In rdflib's case we had a naming clash caused by using module names that matched the class names they contained, combined with imports like from rdflib.URIRef import URIRef. We chose to fix this to get rid of the problems and confusion it caused and to move forward. For those willing to live with such issues, and with the one William is running into, the code can forever be cast in stone.
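
One way such a clash can show up, as a minimal sketch. The package "clashdemo" is hypothetical and built in a temporary directory; it is not rdflib's actual layout, only the same pattern of a module named after the class it contains plus a re-export in the package:

```python
import importlib
import inspect
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "clashdemo")  # hypothetical package, not rdflib
os.mkdir(pkg)

# A module named after the single class it contains.
with open(os.path.join(pkg, "URIRef.py"), "w") as f:
    f.write("class URIRef(str):\n    pass\n")

# The package re-exports the class under the same name as the module.
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from clashdemo.URIRef import URIRef\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
import clashdemo

# The attribute clashdemo.URIRef now refers to the class ...
attr_is_class = inspect.isclass(clashdemo.URIRef)
# ... while the import system still holds a module under the same dotted name.
entry_is_module = inspect.ismodule(sys.modules["clashdemo.URIRef"])

print(attr_is_class, entry_is_module)
```

The same dotted name means a class in one context and a module in another, which is the kind of confusion that lowercase module names avoid.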

William Waites

Aug 25, 2010, 4:45:03 AM
to fuxi-di...@googlegroups.com, Daniel Krech
On 10-08-24 22:23, Daniel Krech wrote:
> In rdflib's case we had a naming clash that was caused by using module
> names that matched class names they contained combined with imports
> like from rdflib.URIRef import URIRef. We choose to fix this issue to
> get rid of the problems/confusion that the issue caused and move
> forward. For those willing to deal with such issues, and the issue
> William is running into, then the code can forever be cast in stone.

Just to be clear, my suggestion about renaming
the lib directory to FuXi to facilitate using pip
would not result in any api changes and I can't
see how it would break anything.

With rdflib, there are several problems that I can
see, and the renaming of modules is the least
of them -- import statements can be trivially
fixed.

More problematic is the removal of functionality,
back-end stores, SPARQL support, etc. This is
compounded by the decision not to use
setuptools. Not using setuptools means that
in order to find plugins that live in other packages
(like rdfextras) you have to reinvent something
like entrypoints. I don't see the usefulness in
doing that; setuptools is very common and does
that perfectly well.
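
To make the entry-point idea concrete, here is a minimal sketch of plugin discovery across installed packages. It assumes the modern standard-library importlib.metadata module rather than the setuptools pkg_resources API that was current when this was written, and the group name "demo.rdf.stores" is invented for illustration; it is not a group that rdflib or rdfextras is claimed to define:

```python
from importlib.metadata import entry_points


def discover(group):
    """Return {name: entry point} for every installed plugin in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions return an object with a select() method.
        found = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        found = eps.get(group, [])
    return {ep.name: ep for ep in found}


# "demo.rdf.stores" is a made-up group name used only for this sketch.
stores = discover("demo.rdf.stores")
print(sorted(stores))
```

Any installed distribution that declares an entry point in that group would appear in the dictionary without the host package knowing about it in advance; calling load() on an entry point imports the object it names.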

There are also some very useful things in the
layercake branch, chiefly the pure-python SPARQL
and the remote SPARQL store, that would really
like to live in rdfextras.

As for old code being cast in stone, there are
still improvements that could be made to either
branch. I'm thinking of the bnc (blank node
closure) and distinct_subjects/predicates/objects
methods on the ORDF variant of Graph that I
would really like to see pulled up into rdflib
proper (both branches!)
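
For readers unfamiliar with the terms, a rough sketch of the two ideas on plain tuples. This is not the ORDF implementation: it assumes "blank node closure" means the triples about a subject plus, recursively, the triples about any blank nodes they point to, and it represents blank nodes as strings starting with "_:" purely for illustration:

```python
def is_bnode(term):
    """Assumption for this sketch: blank nodes are strings like '_:b1'."""
    return term.startswith("_:")


def bnode_closure(triples, subject):
    """Collect triples about `subject`, following blank-node objects."""
    closure, seen, todo = [], set(), [subject]
    while todo:
        node = todo.pop()
        if node in seen:
            continue  # already expanded; also guards against cycles
        seen.add(node)
        for s, p, o in triples:
            if s == node:
                closure.append((s, p, o))
                if is_bnode(o):
                    todo.append(o)
    return closure


def distinct_subjects(triples):
    """Return each subject once, however many triples mention it."""
    return {s for s, _, _ in triples}


triples = [
    ("ex:scotland", "ex:capital", "_:b1"),
    ("_:b1", "ex:name", "Edinburgh"),
    ("ex:russia", "ex:name", "Russia"),
]
closure = bnode_closure(triples, "ex:scotland")
print(len(closure), sorted(distinct_subjects(triples)))
```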

Finally, for me, the main advantage to arranging
and distributing my data using RDF as opposed
to, say XML or JSON is the ability to do inferencing.
FuXi is probably the best RDF inference engine
in existence and it is a shame that there was not
more coordination in the great refactoring. This
is the main reason that while I maintain a certain
level of 3.0 compatibility in ORDF with an eye
to the future, I don't use it.

This is not all idle talk. I'm quite willing to help
improve rdflib 3.0 and update other packages
and extras to use it. I'm fortunate in that my
current work with the Open Knowledge Foundation
involves heavy use of rdflib and FuXi and I'm sure
I can justify spending some time on this, but I'm
hesitant because design decisions that I might
make (e.g. the use of setuptools and entrypoints)
might not be agreed with by the rdflib developers.

Cheers,
-w

--
William Waites <wwa...@gmail.com>

Uche Ogbuji

Aug 25, 2010, 9:42:34 AM
to fuxi-discussion
Just my 2 cents here:

I disagree that having the Python source module directory name the
same as the package name is standard convention. Yes, it is a
convention that became popular because of setuptools, and PJE's
stubborn refusal to support other conventions, but using "lib" is more
established, more congruent with more general conventions, and has a
few technical advantages of its own.

It seems the reason you want this is to use setup.py develop, rather
than pip itself. I strongly recommend you avoid doing so. setup.py
develop (like much of setuptools) is an utter mess and if you use it
long enough you will find the many quirks and frustrations I did
before I banned setuptools from any development process under which I
have control over a year ago. And contrary to some myth, you can use
pip just fine without setuptools support. Note: I'm far from the only
one with such complaints, and many developers, distributors, and more
have been constantly trying to organize non-broken replacements for
setuptools, which have now culminated in distutils2 and distribute,
behind which Guido famously threw his support. I recommend developing
to those in mind rather than setuptools.

--Uche

William Waites

Aug 25, 2010, 4:01:31 PM
to fuxi-di...@googlegroups.com
On 10-08-25 14:42, Uche Ogbuji wrote:
> but using "lib" is more
> established, more congruent with more general conventions, and has a
> few technical advantages of its own.

It also prevents simply doing PYTHONPATH=`pwd` ... I would be
curious to know what technical advantages you are talking about.

> It seems the reason you want this is to use setup.py develop, rather
> than pip itself. I strongly recommend you avoid doing so. setup.py
> develop (like much of setuptools) is an utter mess and if you use it
> long enough you will find the many quirks and frustrations

Quite so, "pip install -e" does, I believe, run "setup.py develop". I tend to
write and test in a very tight loop, enough so that having to
re-run "setup.py install" is an annoyance. I find develop mode
very helpful. I don't doubt that there are bugs but they haven't
bitten me... In any event this is a question of personal convenience
and I don't mind adjusting my practices if necessary (a symlink
and setting PYTHONPATH if I must)

> which have now culminated in distutils2 and distribute,
> behind which Guido famously threw his support. I recommend developing
> to those in mind rather than setuptools.

The main thing I'm concerned with is entrypoints which
distribute has. I think they are crucial for something like
rdfextras to be viable without reinventing the wheel. I
don't mind using distribute over setuptools.

What I don't know is the rdflib attitude towards distribute,
right now 3.0 uses old distutils and has a hardcoded
entrypoint-like dictionary thing in plugin.py...
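
For illustration, a hardcoded registry of that general shape might look like the sketch below. This is not rdflib's plugin.py: the registry contents and the get_plugin helper are hypothetical, and standard-library targets are used so the sketch runs anywhere:

```python
import importlib

# Hypothetical registry: plugin name -> (module path, attribute name).
# Every plugin has to be listed here by hand, inside the host package.
_REGISTRY = {
    "json-decoder": ("json", "JSONDecoder"),
    "csv-reader": ("csv", "reader"),
}


def get_plugin(name):
    """Import the module lazily and return the registered object."""
    module_path, attr = _REGISTRY[name]
    module = importlib.import_module(module_path)
    return getattr(module, attr)


decoder_cls = get_plugin("json-decoder")
print(decoder_cls().decode('{"a": 1}'))
```

The limitation is visible in the dictionary itself: a third-party package cannot add itself without the host's source being edited or the dictionary being mutated at import time.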

Cheers,
-w

--
William Waites <wwa...@gmail.com>

Graham Higgins

Aug 27, 2010, 2:33:05 PM
to fuxi-di...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 25 Aug 2010, at 09:45, William Waites wrote:
> I'm quite willing to help improve rdflib 3.0 and
> update other packages and extras to use it.


There -may- be less work involved than has generally been anticipated...

This post carries FuXi-specific details that carry over from the last
of my recent three posts to rdflib-dev
(http://groups.google.com/group/rdflib-dev/msg/99a60eb68be3ebe0)...

In another strand of inquiry, I was hoping to compare FuXi+rdflib3
test output against FuXi+layercake test output (a "gold standard" is
useful in such circumstances). Unfortunately, I haven't yet managed
successfully to run the test suite for FuXi+layercake. This is because
(other) rdflib-generated exceptions terminate the test run after about
ten tests --- but it is clear even from the truncated run that FuXi
+layercake passes considerably more tests than does FuXi+rdflib3.

With FuXi+rdflib3, rendering the printed output in HTML eases the task
of interpreting the results (which I find confusing in terms of what
passes and what fails) but rendering restructured text is a PITA, so I
exchanged the code that outputs restructured text for code that
outputs XML (docbook 4.5) with a stylesheet PI:

http://bel-epa.com/area51/library/fuxitest.xml

I intend that this will be automatically updated whenever builds occur
(I just need to commit the hacked-about testOWL.py to trigger this
process).

What confuses me is that FuXi appears to (mostly) work, even on
complex tasks, yet the number of test failures would strongly suggest
otherwise. This leads me to speculate that perhaps part of the reason
for the extensive test failures lies with the test mechanism itself
and not altogether (if at all) with the FuXi code.

An example...

A few days ago Will posted advice of, and a link to, his Object
Description Mapper: http://packages.python.org/ordf/odm.html which
carries an extensive worked example of using FuXi_ordf for inferencing.

Ever the cargo culter (in this instance, it's a usefully direct method
of illuminating my local code landscape), I serially copied and pasted
from Will's example into TextMate.app, hitting the "run buffer" button
as I went and FuXi+rdflib3 tracked the example code execution
flawlessly.

On completion of the cargo culting of Will's example, the output from
FuXi+rdflib3 runs thus:
> <class '__main__.Country'> ['Scotland']
> ['Scotland']
> ['Russia']
> Time to build production rule (RDFLib 3.0.0): 7.39097595215e-05 second

I haven't yet had time to explore FuXi in any depth, so ATM I'm unable
to assess how much of FuXi is being exercised by Will's example.

In an attempt to exercise FuXi_rdflib3 a little further, I performed
the same copy'n'paste exercise with the command-line examples listed
on the FuXi wiki pages. I post the test script and the results here
and invite informed comments which might help guide me towards a
successful test run.

#!/bin/sh
export FUXI=/...path/to.../bin/FuXi

echo "Test 1"

${FUXI} \
--firstAnswer \
--debug \
--method=both \
--safety=loose \
--output=conflict \
--normalize \
--ns=ub=http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl# \
--ns=ex=http://www.Department0.University0.edu/ \
--why="ASK { ?class a ub:Course }" \
--dlp \
--ontology=http://www.lehigh.edu/%7Ezhp2/2004/0401/univ-bench.owl \
http://swat.cse.lehigh.edu/projects/lubm/University0_0.owl

echo "Test 2"

${FUXI} \
--ns=ex=http://www.agfa.com/w3c/euler/subclass# \
--why="ASK { ex:i a ex:A }" \
--debug \
--method=both \
--input-format=n3 \
--dlp \
http://www.agfa.com/w3c/euler/subclass.n3

echo "Test 3"

${FUXI} \
--ruleFacts \
--why="ASK { test:Ghent test:path test:Amsterdam }" \
--ns=test=http://www.w3.org/2002/03owlt/TransitiveProperty/premises001# \
--dlp \
--output=conflict \
--debug \
--method=both \
--strict=defaultDerived \
http://www.w3.org/2002/03owlt/TransitiveProperty/premises001


echo "Test 4"

${FUXI} \
--safety=loose \
--strictness=defaultDerived \
--idb=owl:sameAs \
--method=both \
--why="ASK { ex:subject1 owl:sameAs ex:subject2 }" \
--debug \
--ns=ex=http://www.w3.org/2002/03owlt/InverseFunctionalProperty/premises001# \
--pDSemantics \
--builtinTemplates=file:///...path/to.../RuleBuiltinSPARQLTemplates.n3 \
--dlp \
http://www.w3.org/2002/03owlt/InverseFunctionalProperty/premises001.rdf

Results (which I have edited for the purposes of presenting here):

Test 1
========================================================================
Time to build production rule (RDFLib 3.0.0): 8.01086425781e-05 seconds
/.../fuxi-hg/FuXi/Rete/CommandLine.py:296: SyntaxWarning:
Ignoring unsafe rule (.*) safety = safetyNameMap[options.safety])
(Forall ( Exists _:oBYAMnLe43 ( takesCourse(?X oBYAMnLe43) ) :-
GraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe43 ( GraduateCourse(oBYAMnLe43) ) :-
GraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe52 ( worksFor(?X oBYAMnLe52) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe52 ( ResearchGroup(oBYAMnLe52) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe111 ( worksFor(?X oBYAMnLe111) ) :-
Faculty(?X) ))
(Forall ( Exists _:oBYAMnLe111 ( Organization(oBYAMnLe111) ) :-
Faculty(?X) ))
(Forall ( Exists _:oBYAMnLe124 ( worksFor(?X oBYAMnLe124) ) :-
AdministrativeStaff(?X) ))
(Forall ( Exists _:oBYAMnLe124 ( Organization(oBYAMnLe124) ) :-
AdministrativeStaff(?X) ))
(Forall ( Exists _:oBYAMnLe137 ( takesCourse(?X oBYAMnLe137) ) :-
UndergraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe137 ( Course(oBYAMnLe137) ) :-
UndergraduateStudent(?X) ))
(Forall ( Exists _:oBYAMnLe150 ( takesCourse(?X oBYAMnLe150) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe150 ( Course(oBYAMnLe150) ) :-
ResearchAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe250 ( teachingAssistantOf(?X
oBYAMnLe250) ) :- TeachingAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe251 ( Course(oBYAMnLe251) ) :-
TeachingAssistant(?X) ))
(Forall ( Exists _:oBYAMnLe271 ( headOf(?X oBYAMnLe271) ) :-
Director(?X) ))
(Forall ( Exists _:oBYAMnLe272 ( Program(oBYAMnLe272) ) :- Director(?X) ))
(Forall ( Exists _:oBYAMnLe292 ( headOf(?X oBYAMnLe292) ) :- Chair(?X) ))
(Forall ( Exists _:oBYAMnLe293 ( Department(oBYAMnLe293) ) :- Chair(?X) ))
(Forall ( Exists _:oBYAMnLe313 ( takesCourse(?X oBYAMnLe313) ) :-
Student(?X) ))
(Forall ( Exists _:oBYAMnLe314 ( Course(oBYAMnLe314) ) :- Student(?X) ))
(Forall ( Exists _:oBYAMnLe334 ( worksFor(?X oBYAMnLe334) ) :-
Employee(?X) ))
(Forall ( Exists _:oBYAMnLe335 ( Organization(oBYAMnLe335) ) :-
Employee(?X) ))
(Forall ( Exists _:oBYAMnLe355 ( headOf(?X oBYAMnLe355) ) :- Dean(?X) ))
(Forall ( Exists _:oBYAMnLe356 ( College(oBYAMnLe356) ) :- Dean(?X) ))

/.../fuxi-hg/FuXi/Rete/Magic.py:553: UserWarning:
predicate symbol of Publication(?X) is in both IDB and EDB. Marking as
base
predicate symbol of University(?oBYAMnLe482) is in both IDB and EDB.
Marking as base
predicate symbol of Course(?oBYAMnLe432) is in both IDB and EDB.
Marking as base
predicate symbol of TeachingAssistant(?X) is in both IDB and EDB.
Marking as base
predicate symbol of ResearchGroup(?X) is in both IDB and EDB. Marking
as base
predicate symbol of worksFor(?X ?Y) is in both IDB and EDB. Marking as
base
predicate symbol of subOrganizationOf(?X ?oBYAMnLe388) is in both IDB
and EDB. Marking as base
predicate symbol of memberOf(?oBYAMnLe382 ?X) is in both IDB and EDB.
Marking as base

Derived predicates (top-down)
[u'ub:Software', u'ub:Person', u'ub:Research', u'ub:Employee',
u'ub:Professor',
u'ub:Work', u'ub:Director', u'ub:Organization', u'ub:member',
u'ub:Article',
u'ub:Student', u'ub:AdministrativeStaff', u'ub:Dean',
u'ub:Faculty',
u'ub:degreeFrom', u'ub:hasAlumnus', u'ub:Schedule',
u'ub:Course', u'ub:Chair']

Magic seed fact (used in bottom-up evaluation) :Course_magic(?class)
Solving :Course(?class) {}
Processing rule :Course_f(?oBYAMnLe442) :- teacherOf(?X ?oBYAMnLe442)
Solving :teacherOf(?X ?oBYAMnLe442) {}

SELECT ?X ?oBYAMnLe442 { ?X ub:teacherOf ?oBYAMnLe442 }-> []

Time to reach answer
{?class: rdflib.term.URIRef('http://www.Department0.University0.edu/Course52')
}
via top-down SPARQL sip strategy: 21.26288414 milli seconds

Exception RuntimeError: 'generator ignored GeneratorExit' in
<generator object collectAnswers at 0x2e12f08> ignored

reduction in size of program: 96.0396039604 (101 -> 4 clauses)

Derived predicates (bottom-up)
[u'ub:Software', u'ub:Person', u'ub:Research', u'ub:Employee',
u'ub:Professor',
u'ub:Work', u'ub:Director', u'ub:Organization', u'ub:member',
u'ub:Article',
u'ub:Student', u'ub:AdministrativeStaff', u'ub:Dean',
u'ub:Faculty',
u'ub:degreeFrom', u'ub:hasAlumnus', u'ub:Schedule',
u'ub:Course', u'ub:Chair']

Time to calculate closure on working memory: 493.834018707 milli
seconds

<Network: 4 rules, 8 nodes, 224 tokens in working memory, 105 inferred
tokens>

<TerminalNode (pass-thru):
CommonVariables: [?X, ?oBYAMnLe442] (0 in left, 128 in right memories)>
:Course_f(?oBYAMnLe442) :- teacherOf(?X ?oBYAMnLe442)
67 instanciations

<TerminalNode (pass-thru):
CommonVariables: [?X] (0 in left, 67 in right memories)>
:Course_f(?X) :- GraduateCourse(?X)
30 instanciations

<TerminalNode (pass-thru):
CommonVariables: [?X, ?oBYAMnLe432] (0 in left, 29 in right memories)>
:Course_f(?oBYAMnLe432) :- teachingAssistantOf(?X ?oBYAMnLe432)
8 instanciations


Test 2
========================================================================
Time to build production rule (RDFLib 3.0.0): 7.70092010498e-05 seconds

/.../fuxi-hg/FuXi/Rete/Magic.py:553:
UserWarning: predicate symbol of ex:B(?X) is in both IDB and EDB.
Marking as base

Derived predicates (top-down) [u'ex:A']

Sideways Information Passing (sip) graph:
Magic seed fact (used in bottom-up evaluation) :A_magic(:i)
Solving :A(:i) {}
Processing rule :A_b(?X) :- ex:B(?X)
Solving :B(:i) {?X: rdflib.term.URIRef('http://www.agfa.com/w3c/euler/subclass#i')
}

ASK { ex:i a ex:B } 1 apriori binding(s)-> True

Time to reach answer True via top-down SPARQL sip strategy:
6.30307197571 milli seconds

reduction in size of program: 50.0 (2 -> 1 clauses)

Derived predicates (bottom-up) [u'ex:A']

Time to calculate closure on working memory: 0.897884368896 milli
seconds

<Network: 1 rules, 4 nodes, 3 tokens in working memory, 1 inferred
tokens>

@prefix : <http://www.agfa.com/w3c/euler/subclass#> .

:i a :A .


Test 3
========================================================================
Time to build production rule (RDFLib 3.0.0): 7.60555267334e-05 seconds
/.../fuxi-hg/FuXi/Rete/Magic.py:553: UserWarning:
predicate symbol of test:path(?X ?qWPXHvIJ20) is in both IDB and EDB.
Marking as derived

Traceback (most recent call last):
File "/.../FuXi", line 8, in <module>
load_entry_point('FuXi==1.0.dev', 'console_scripts', 'FuXi')()
File "/.../fuxi-hg/FuXi/Rete/CommandLine.py", line 416, in main
defaultDerivedPreds))
File "/.../fuxi-hg/FuXi/Rete/Magic.py", line 78, in
SetupDDLAndAdornProgram
adornedProgram = AdornProgram(factGraph,rules,GOALS,derivedPreds)
File "/.../fuxi-hg/FuXi/Rete/Magic.py", line 338, in AdornProgram
adornedRule=AdornRule(derivedPreds,clause,term)
File "/.../fuxi-hg/FuXi/Rete/Magic.py", line 262, in AdornRule
headArc = len(N)==1 and N[0] == GetOp(newHead)
File "/.../rdflib/collection.py", line 76, in __len__
assert item not in links,"There is a loop in the RDF list! (%s
has been processed before)"%item
AssertionError: There is a loop in the RDF list!
(http://www.w3.org/2002/03owlt/TransitiveProperty/premises001#path
has been processed before)

Test 4
========================================================================
Time to build production rule (RDFLib 3.0.0): 0.000101089477539 seconds
/.../fuxi-hg/FuXi/Rete/CommandLine.py:419:
UserWarning: Unable to solve goal via ruleset

Derived predicates (top-down) []

Magic seed fact (used in bottom-up evaluation)
owl:sameAs_magic(:subject1 :subject2)
Solving owl:sameAs(:subject1 :subject2) {}

ASK { ex:subject1 owl:sameAs ex:subject2 }-> False

Time to reach answer False via top-down SPARQL sip strategy:
4.43291664124 milli seconds

Derived predicates (bottom-up) []

Time to calculate closure on working memory: 0.900030136108 milli
seconds
<Network: 11 rules, 39 nodes, 3 tokens in working memory, 0 inferred
tokens>
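
As background to the AssertionError in Test 3: the assertion fires while the length of an RDF list is being computed by walking its rdf:rest links. The sketch below shows that kind of cycle check on a plain dictionary. It is not rdflib's code and does not diagnose why FuXi ends up treating the path property as a list head; the dictionary representation and the list_length helper are assumptions made for illustration:

```python
NIL = "rdf:nil"


def list_length(rest, head):
    """Count the cells in an rdf:rest chain, refusing to follow a cycle."""
    seen = set()
    node, count = head, 0
    while node != NIL:
        if node in seen:
            raise ValueError("loop in RDF list at %s" % node)
        seen.add(node)
        count += 1
        node = rest[node]  # follow the rdf:rest link to the next cell
    return count


# Each key is a list cell; each value is the cell its rdf:rest points to.
well_formed = {"_:a": "_:b", "_:b": NIL}
looping = {"_:a": "_:b", "_:b": "_:a"}

print(list_length(well_formed, "_:a"))
```

Calling list_length(looping, "_:a") raises ValueError instead of spinning forever, which is the same safeguard that the assertion in the traceback provides.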


- --
Cheers,

Graham

http://www.linkedin.com/in/ghiggins

-----BEGIN PGP SIGNATURE-----

iEYEARECAAYFAkx4BOIACgkQOsmLt1NhivwoQACfdqQ+MIaUkVooEipc9FhBmYrl
cVwAnAxde6qhi2+0V7N97d15MbNEBzoViQCVAgUBTHgE4lnrWVZ7aXD1AQK5RQQA
omDZ8DvP9Q3ksAjPnyzGUAWM6gZTxA9cAsqj5odm8QdvNmf0cKFeRqt+byG7k98d
RFwh202btgeNmIQb2UKANzXk/UeQ06FFqUkpjgwGJkWSZNFvpybM4NC7br2Sa7hE
tNF06Tx4au0sjyZDWqEWQjevXNc7JJaGRNdVcNYMbYk=
=H+UR
-----END PGP SIGNATURE-----
