FFS improvements


k.o.lu...@rdg.ac.uk

Jan 3, 2008, 12:02:02 PM
to MeAggregator
Hi All

I've spent the day looking at how the FFS could be improved.

There are two improvement needs (please add to the list as you see
fit):
- Speed issues, especially for future inferencing implementations,
when using Jena
- Memory consumption issues on desktop FFS

Starting with memory consumption:

This could easily be rectified by introducing a database to the system.
However, Jena's speed is even worse when using databases, and the job
seemed to be performed at query time. (Trust me, a 2 minute
inference job turned into a 10 minute one in some other work I did...)

We could choose shorter URIs, although this is only a short-term
solution (or, as I like to call them, a "peeing in your pants
solution" ;-) )
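
As a rough illustration of the shorter-URI idea, here is a minimal Python sketch (library independent, not Jena code). It replaces a long namespace with a short prefix and counts the characters held before and after. The namespace, the prefix "f" and the helper names compact and total_characters are all made up for the example; they are not the project's actual URIs or code.

```python
def compact(uri, prefixes):
    """Return `uri` with a known namespace replaced by its short prefix."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri  # no matching namespace: leave the URI unchanged


def total_characters(triples):
    """Count the characters held by all terms of all triples."""
    return sum(len(term) for triple in triples for term in triple)


# Placeholder namespace and prefix, invented for this example.
LONG_NS = "http://example.org/meaggregator/ffs/ontology#"
PREFIXES = {"f": LONG_NS}

triples = [
    (LONG_NS + "fileA", LONG_NS + "storedIn", LONG_NS + "folder1"),
    (LONG_NS + "fileB", LONG_NS + "storedIn", LONG_NS + "folder1"),
]
short_triples = [tuple(compact(t, PREFIXES) for t in triple) for triple in triples]

print(total_characters(triples), "characters before")
print(total_characters(short_triples), "characters after")
```

The saving is a fixed amount per term, so memory still grows in step with the number of triples, which is why this only buys time.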



Speed issues:

We could change to a different library, or make a bespoke system.

Lots of the speed problems come with OWL inferencing. We could change
the ontologies to simple rdfs, losing native transitivity, but then
create our own rdfs transitive definition using Euler style rule
creation. Jena would probably still be slow, and it seems more
difficult to control when the rules are applied to the
models.
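
To make the "own transitive definition" idea concrete, here is a minimal Python sketch of a single transitivity rule applied by naive forward chaining. It is not Euler's algorithm and not Jena's rule engine; the predicate name ffs:partOf and the function apply_transitive_rule are hypothetical, chosen only for illustration. The point is that the caller decides exactly when the rule runs.

```python
def apply_transitive_rule(triples, predicate):
    """Return the triples plus everything implied by transitivity of `predicate`.

    Rule: (a predicate b) and (b predicate c)  =>  (a predicate c).
    Naive forward chaining: repeat until no new triple is produced.
    """
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Only edges using the chosen predicate take part in the rule.
        edges = [(s, o) for (s, p, o) in closed if p == predicate]
        for a, b in edges:
            for b2, c in edges:
                if b == b2:
                    inferred = (a, predicate, c)
                    if inferred not in closed:
                        closed.add(inferred)
                        changed = True
    return closed


# Hypothetical predicate and data, invented for this example.
PART_OF = "ffs:partOf"
facts = {
    ("a", PART_OF, "b"),
    ("b", PART_OF, "c"),
    ("c", PART_OF, "d"),
    ("a", "ffs:label", "A"),  # unrelated triple, left untouched by the rule
}
closed = apply_transitive_rule(facts, PART_OF)
```

This version is quadratic in the number of matching edges per pass, so it is only meant to show the shape of the rule, not to be fast.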

Other rdf store libraries are available, e.g. Redland, SemWeb and
Sesame. None of them has OWL inferencing, hence the choice of
Jena in the first place, but they all have rdfs inferencing support,
and SemWeb has a Euler rule inferencing engine as well.

I've looked at SemWeb (a C# store supporting both Mono and Microsoft
.NET), and there the Euler inference is controlled completely
programmatically (unlike Jena, which just does it for you). It is also
a smaller library, which could be beneficial if bespoke changes are
necessary.

Redland is a C library, which I'm a bit hesitant about using. I must say
that it looks a bit weird using a non-object-oriented programming
language when doing ontology work...

Sesame is a lightweight (compared to Jena) Java rdf store.

So where does this leave us?
We could:
- keep Jena, change to rdfs, and see how it works.
- change to SemWeb now (or Sesame, although it would be nice to code
in something other than Java)
- Make something bespoke from first principles

Patrick Parslow

Jan 3, 2008, 1:13:31 PM
to meaggr...@googlegroups.com
Hi Karsten,

I am in favour of a quick and dirty home grown implementation
for comparison purposes, which I rather hope to get going in the
near-ish future. I have two partial starting points: one is a GUID-based
implementation in C# and the other is a database version. I think
they are worth exploring, but I am not convinced that they will
necessarily prove to be more than instructive.

I will have to think further about the issues - probably next
week.

Pat
PS Happy New Year.

Shirley Williams

Jan 3, 2008, 3:55:19 PM
to meaggr...@googlegroups.com
We need also to review the evaluations and ensure we are on track with
deliverables. I think we need to meet.

We also need to arrange for Pat and Karsten to do presentations on their PhD
work - I'll get some dates from Rachel (we need her there).

Happy New Year

Shirley

Patrick Parslow

Jan 4, 2008, 1:45:00 AM
to meaggr...@googlegroups.com
I don't think the situation has changed to any significant degree since
our last meeting, shortly before Christmas and the ensuing two week
break, although I could be wrong. I haven't seen any discussion on the
Private RSS issue, for instance.

I am not available next week until Tuesday, by the way.

Pat

Robert Powell Ashton

Jan 4, 2008, 8:22:40 AM
to meaggr...@googlegroups.com
I don't think any further discussion has been had yet, save for "it's becoming a standard that authentication for RSS can be done over HTTPS using standard authentication methods" - which will probably be the way we therefore do it.

With regards to memory consumption/speed and the like - are these concerns that important at this early stage of development? I'd have thought an ontological backend would always imply some level of overhead just because of what it is.
Do you have any concrete numbers about memory consumption or data crunching that would give an indication as to how far the current code would take us?

You've said all along that the backend could just be swapped out at any time, so is it really necessary before the other deliverables have been completed if it means slipping on deadlines?

Rob


Karsten Oster Lundqvist

Jan 4, 2008, 8:41:50 AM
to meaggr...@googlegroups.com

The issue here is rdfs!

If we want to change the ontology language, we ought to do it sooner rather than later, as the URIs would change. This change would improve speed tremendously, and it wouldn’t require too many changes if we stick with Jena. (Which is probably the most sensible thing to do at the moment.)

I haven’t got any O-style measurements of speed. The only measurements I have are from the TRACE work, which showed that OWL inference within Jena is very slow, especially if we want to use databases.
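
For gathering concrete numbers, a small harness like the Python sketch below shows the shape of the measurement: run a job at several input sizes and record elapsed time and peak memory. It uses only the standard library (time.perf_counter and tracemalloc) and measures Python objects, so a Jena run on the JVM would need its own timing and heap tools. The helper measure and the toy job build_chain are hypothetical stand-ins, not project code.

```python
import time
import tracemalloc


def measure(job, *args):
    """Run job(*args); return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    started = time.perf_counter()
    try:
        result = job(*args)
    finally:
        elapsed = time.perf_counter() - started
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def build_chain(n):
    """Toy stand-in for a load job: n triples forming one long chain."""
    return [(f"node{i}", "ffs:partOf", f"node{i + 1}") for i in range(n)]


if __name__ == "__main__":
    # Doubling the input size shows how time and memory scale.
    for size in (1000, 2000, 4000):
        triples, seconds, peak = measure(build_chain, size)
        print(size, "triples:", round(seconds, 4), "s,", peak, "bytes peak")
```

Comparing the rows as the size doubles gives a first indication of how far the current approach would scale.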

 

Karsten
