How to contribute to EXQuery Functions?

Jeroen van der Wal

May 16, 2009, 11:44:52 AM
to EXQuery Functions Developers
Dear all,

I think EXQuery is on the sweet spot of my problem: we're developing
an XQuery application and currently use eXist as our development
platform but are looking to MarkLogic for the production environment.
I came to the conclusion that I have to write wrapper functions which
call the vendor-specific functions (or emulate non-existing ones).

To prevent reinventing the wheel I would like to contribute on this
issue. I noticed that I can contribute use-cases but I'm more into
contributing to a codebase. Maybe someone can start a project on a VCS
(like Dan McCreary did with XRX: http://code.google.com/p/xrx/) and
add wrapper functions, to-do comments and documentation there?

Jeroen van der Wal

Adam Retter

May 16, 2009, 3:01:57 PM
to exquery-fu...@googlegroups.com
Jeroen,

What we really need to do is establish a common specification for the
functions and modules you need. Can you tell us more about which of the
functions you are using vary across the two implementations?

If we can pull a common approach together and nail it down into a
specification, we can then certainly write/host some implementations,
which at this point in time may well be wrapper functions. The idea is
that a vendor such as MarkLogic or eXist will eventually natively
implement the common EXPath / EXQuery functions which will enable us
to do away with wrapper functions.

At present the extension functions work is really happening at EXPath
(http://www.expath.org) because we figured that establishing them at
the XPath level wherever possible would lead to even greater
reusability spanning XPath, XSLT, XQuery, XProc etc. There may be
certain functions that are not suitable outside of an XQuery
environment and those will go into EXQuery extension modules.


Cheers Adam.

2009/5/16 Jeroen van der Wal <jer...@stromboli.it>:
--
Adam Retter

EXQuery Founder
t: +44 (0)7714 330069
e: adam....@exquery.org
w: www.exquery.org

Joe Wicentowski

May 17, 2009, 9:55:45 PM
to exquery-fu...@googlegroups.com
Hi guys,

Much like Jeroen, I am working in a cross-platform situation with
eXist and Mark Logic. Most of my code is in the former, and I will be
porting some common code to the latter and maintaining code on both
going forward.

One issue I hit in my first porting project was the basic difference
in the implementation of the notion of "collection". For eXist,
collection is a hierarchical directory-like structure, and all
documents belong to a collection (beginning at "/db"). For Mark
Logic, collection is more like an attribute of the document: Documents
in different directories can be assigned to the same collection, and
until set differently all documents are assigned to the default
collection. Mark Logic has both collection- and directory-specific
functions.
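
To make the difference concrete, here is a minimal sketch of what the Mark Logic side of a portable "all documents under this path" function might look like. The module namespace URI and the name coll:documents are made up for illustration, and treating xdmp:directory with a depth of "infinity" as the closest match for an eXist collection is an assumption, not something settled in this thread.

```xquery
xquery version "1.0-ml";

(: Hypothetical wrapper module; the namespace URI and the
   function name are assumptions made for this sketch. :)
module namespace coll = "http://example.org/ns/portable-collection";

(: Return every document stored at or below $path.
   Mark Logic directory URIs end in "/", so one is appended if missing. :)
declare function coll:documents($path as xs:string) as document-node()*
{
  let $dir := if (fn:ends-with($path, "/")) then $path else fn:concat($path, "/")
  return xdmp:directory($dir, "infinity")
};
```

An eXist module with the same namespace URI could presumably just return collection($path), since eXist's collection() already includes the documents of sub-collections.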

Given this basic difference in the meaning of collection, I foresee
some interesting challenges for exquery.

- Joe

p.s. I have heard that eXist plans to make collections
non-hierarchical -- perhaps more like what I describe here of Mark
Logic's collections.

Florent Georges

May 18, 2009, 3:19:23 PM
to exquery-fu...@googlegroups.com
Hi (or "dag"?) Jeroen,

> I think EXQuery is on the sweet spot of my problem: we're
> developing an XQuery application and currently use eXist as our
> development platform but are looking to MarkLogic for the
> production environment. I came to the conclusion that I have
> to write wrapper functions which call the vendor-specific
> functions (or emulate non-existing ones).

> To prevent reinventing the wheel I would like to contribute on
> this issue. I noticed that I can contribute use-cases but I'm
> more into contributing to a codebase. Maybe someone can start a
> project on a VCS (like Dan McCreary did with XRX:
> http://code.google.com/p/xrx/) and add wrapper functions, to-do
> comments and documentation there?

This is an ambitious project. For a few months now I have felt more
and more that XQuery (and in some respects XSLT) lacks a
standardized description of an application container in a server
environment.

If you would rather have independent, standardized modules of
functions, you can also have a look at EXPath, as Adam suggested.
Any help is welcome. If you want more information on how you
could get involved and propose extensions, please have a look at
the website <http://www.expath.org/> and send an email to the
mailing list with more details. I'll give you more information
on that topic there (I think it is more appropriate there than
here).

Regards,

--
Florent Georges
http://www.fgeorges.org/

Jeroen van der Wal

May 19, 2009, 6:16:03 AM
to exquery-fu...@googlegroups.com
Thanks Adam, Joe and Florent for clarifying things for me. My current issues are definitely EXPath related and I will join the discussion there. Thanks Joe for sharing your experiences; I think there's more to do before we have truly portable code than I expected.

A few days ago I got in contact with Eric Palmitesta, who wrote an XQuery framework based on the MVC pattern (http://spotdocs.scholarsportal.info/display/MarkLogic/XQMVC). I liked the simplicity and think such a framework could improve XQuery adoption, but it makes extensive use of MarkLogic functions, most of which can't be mapped to an eXist function (http://spreadsheets.google.com/ccc?key=ryLnFEx3Zk8DeHynzM8rQhg). My opinion is that eXist tries to be as close to the standards as possible and that the xdmp namespace of MarkLogic is a wrapper around their core API, which is great for speed but not for building applications on open standards.

Our deadline for delivering a demo application using XRX is close, so I will lower the priority of portability for now.

Jeroen van der Wal
Stromboli b.v.
+31 655 874050
jer...@stromboli.it

Adam Retter

May 19, 2009, 5:34:32 PM
to exquery-fu...@googlegroups.com, jer...@stromboli.it
> makes extensive use of MarkLogic functions, most of which can't be
> mapped to an eXist function
> (http://spreadsheets.google.com/ccc?key=ryLnFEx3Zk8DeHynzM8rQhg).

Actually, almost all of those can be mapped to equivalent eXist functions :-)
eXist has a number of extension modules that are not enabled by
default; these can be enabled through its conf.xml file.

I have updated your spreadsheet with the additional mappings of
MarkLogic functions to eXist functions - hope that helps.

Jeroen van der Wal

May 20, 2009, 12:15:14 PM
to Adam Retter, exquery-fu...@googlegroups.com
Thanks a lot Adam, I appreciate your effort. Would you agree to use the naming conventions in eXist as the standard? I have to check whether the update statement works in MarkLogic or whether I have to create a wrapper function for portability.

Cheers,

Jeroen

Joe Wicentowski

May 20, 2009, 5:02:55 PM
to exquery-fu...@googlegroups.com, jer...@stromboli.it
>> makes extensive use of MarkLogic functions, most of which can't be
>> mapped to an eXist function
>> (http://spreadsheets.google.com/ccc?key=ryLnFEx3Zk8DeHynzM8rQhg).
>
> Actually, almost all of those can be mapped to equivalent eXist functions :-)
> eXist has a number of extension modules that are not enabled by
> default; these can be enabled through its conf.xml file.

Interesting - and on further thought, it's possible to apply
collections in Mark Logic such that they duplicate the hierarchical
form of collections in eXist. Also, you could set up the Mark Logic
database so that the root of the database is "/db". So my earlier
point about the difference in definition/implementation of collections
may not be much of a stumbling block.
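
As a rough sketch of that idea, the following Mark Logic query assigns one document to a collection for each of its ancestor paths, so that fn:collection("/db/letters") would afterwards behave much like the eXist collection of the same name. The document URI and the helper local:ancestor-collections are hypothetical, and the document is assumed to exist in the database already.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: for "/db/letters/1865/doc.xml" this yields
   "/db", "/db/letters" and "/db/letters/1865". :)
declare function local:ancestor-collections($uri as xs:string) as xs:string*
{
  let $steps := fn:tokenize($uri, "/")[. ne ""]
  for $i in 1 to fn:count($steps) - 1
  return fn:concat("/", fn:string-join(fn:subsequence($steps, 1, $i), "/"))
};

(: Example URI only; replace with a document that exists in the database. :)
let $uri := "/db/letters/1865/doc.xml"
return xdmp:document-add-collections($uri, local:ancestor-collections($uri))
```

The trade-off is that the collection assignments have to be kept up to date whenever documents are added or moved, which eXist does implicitly.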

> I have updated your spreadsheet with the additional mappings of
> MarkLogic functions to eXist functions - hope that helps.

Neat!

In terms of mapping fulltext search functions/syntax, do you all think
it's best to wait for both implementations to support XQuery Update?

- Joe

Adam Retter

May 20, 2009, 5:03:23 PM
to Jeroen van der Wal, exquery-fu...@googlegroups.com
No, I do not think we can just use the eXist naming conventions as the
basis of any standard. The standard must be well informed by
identifying all the possible vendors' functions and coming to a
consensus about suitable common naming and functionality.


2009/5/20 Jeroen van der Wal <jer...@stromboli.it>:

Joe Wicentowski

May 20, 2009, 5:04:12 PM
to exquery-fu...@googlegroups.com
> In terms of mapping fulltext search functions/syntax, do you all think
> it's best to wait for both implementations to support XQuery Update?

Oops, I meant XQuery Full Text.

... And in terms of mapping XQuery Update, do you all think it's best
to wait for both implementations to support XQuery Update? :)

Adam Retter

May 20, 2009, 5:12:31 PM
to exquery-fu...@googlegroups.com
>> In terms of mapping fulltext search functions/syntax, do you all think
>> it's best to wait for both implementations to support XQuery Update?
>
> Oops, I meant XQuery Full Text.

It really depends... If you need such functions today and you need
them to be portable you could create a series of common encapsulation
functions. eXist provides full-text search operators such as &= and |=
(amongst others) and I am sure MarkLogic provides similar constructs.
However, if you are in no rush then XQuery Full Text will hopefully
solve/ease this particular issue of portability.


> ... And in terms of mapping XQuery Update, do you all think it's best
> to wait for both implementations to support XQuery Update?  :)

Again, the same points as above stand.
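
For updates, such an encapsulation function might be sketched as below for eXist. The module namespace URI and the name upd:insert-child are invented for this example; the body relies on eXist's own "update insert ... into" extension syntax, which appends the new node after the last child of the target.

```xquery
xquery version "1.0";

(: eXist implementation - update.xqm (hypothetical module;
   the namespace URI and function name are assumptions) :)
module namespace upd = "http://example.org/ns/portable-update";

(: Append $new as the last child of $parent, which must be a
   node stored in the database. :)
declare function upd:insert-child($parent as node(), $new as node()) as empty-sequence()
{
  update insert $new into $parent
};
```

A MarkLogic module with the same namespace URI and signature could presumably delegate to xdmp:node-insert-child($parent, $new), which also adds the node as the last child.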

Joe Wicentowski

May 20, 2009, 5:20:31 PM
to exquery-fu...@googlegroups.com
> It really depends... If you need such functions today and you need
> them to be portable you could create a series of common encapsulation
> functions. eXist provides full-text search operators such as &= and |=
> (amongst others) and I am sure MarkLogic provides similar constructs.

That makes sense. I think simple queries (i.e. those I use) will be
feasible; complex queries could be really tough. I don't believe ML
uses &=/|= operators, but rather nested functions consisting of
cts:search(), cts:element-query(), cts:attribute-query(),
cts:and-query, cts:or-query, etc. I'm certainly not the authority on
this - I've just dabbled.
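
For the simple case, a wrapper on the Mark Logic side might be sketched like this. The module namespace URI and the name ft:all-words are hypothetical; cts:contains is used here instead of cts:search because it accepts arbitrary nodes passed in as a parameter, which is a choice made for this sketch rather than an established convention.

```xquery
xquery version "1.0-ml";

(: Mark Logic implementation - fulltext.xqm (hypothetical module;
   the namespace URI and function name are assumptions) :)
module namespace ft = "http://example.org/ns/portable-fulltext";

(: Keep only the nodes that contain every word in $words. :)
declare function ft:all-words($nodes as node()*, $words as xs:string*) as node()*
{
  let $query := cts:and-query(for $w in $words return cts:word-query($w))
  return $nodes[cts:contains(., $query)]
};
```

The eXist counterpart could presumably use the operator Adam mentions, along the lines of $nodes[. &= string-join($words, " ")], assuming its full-text index is configured for the nodes in question.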

To do any of this function mapping, we'd need a way to ask the server
its identity. I think James Fuller raised this in his article on IBM
Developerworks (where he discussed the variation in random()
functions), but do we have a good way via XQuery to ask the server
whether we're using eXist or Mark Logic (or the other XQuery
implementations out there)?

Thanks,
Joe

Adam Retter

May 20, 2009, 5:34:18 PM
to exquery-fu...@googlegroups.com
> That makes sense.  I think simple queries (i.e. those I use) will be
> feasible; complex queries could be really tough.  I don't believe ML
> uses &=/|= operators, but rather nested functions consisting of
> cts:search(), cts:element-query(), cts:attribute-query(),
> cts:and-query, cts:or-query, etc.  I'm certainly not the authority on
> this - I've just dabbled.

For a number of indexing/searching features eXist also provides
callable functions; either way, these could be wrapped in wrapper
functions.

> To do any of this function mapping, we'd need a way to ask the server
> its identity.  I think James Fuller raised this in his article on IBM
> Developerworks (where he discussed the variation in random()
> functions), but do we have a good way via XQuery to ask the server
> whether we're using eXist or Mark Logic (or the other XQuery
> implementations out there)?

Well I wonder if you really need to know? Or rather the question is,
when do you need to know! - Perhaps only at application deployment
time.

Consider the following two fictitious modules. They won't compile
(yet), but you get the general idea:


(: eXist implementation - random.xqm :)
module namespace random = "http://www.expath.org/functions/random";

declare namespace math = "http://exist-db.org/xquery/math";

declare function random:number() as xs:integer
{
    math:random()
};


(: MarkLogic implementation - random.xqm :)
module namespace random = "http://www.expath.org/functions/random";

declare function random:number() as xs:integer
{
    xdmp:random()
};



Your query would then be something like -

xquery version "1.0";

import module namespace random =
"http://www.expath.org/functions/random" at "random.xqm";

(: ...some code :)

random:number()

(: ...some code :)


Now in this example providing you deploy the correct random.xqm module
(i.e. the MarkLogic one to MarkLogic and the eXist one to eXist) you
suddenly have much much more portable code and your implementation
specific code is in specific known modules.



> Thanks,
> Joe

Florent Georges

May 20, 2009, 5:43:46 PM
to exquery-fu...@googlegroups.com
2009/5/20 Joe Wicentowski wrote:

Hi Joe,

Thanks for your interest in EXQuery!

> To do any of this function mapping, we'd need a way to ask the server
> its identity.  I think James Fuller raised this in his article on IBM
> Developerworks (where he discussed the variation in random()
> functions), but do we have a good way via XQuery to ask the server
> whether we're using eXist or Mark Logic (or the other XQuery
> implementations out there)?

I think the solution is rather to isolate processor-defined features
and use them in a module that exposes the same interface whether it is
implemented for ML or eXist. Then the rest of the application,
including third-party library modules, uses them by importing the
module with a given namespace URI (the same on all processors).

The point here is to be able to include a library module only by its
namespace URI, without the use of the hint in the import statement:

import module namespace lib = "http://exquery.org/ns/library";

instead of (for instance):

import module namespace lib = "http://exquery.org/ns/library"
at "xmldb:exist:///db/exquery/library/library.xq";

The problem is that processors do want a hint. If I am right, eXist
has for a few months now made it possible to use the former by
configuring it properly in a config file, and Saxon 9.2, which should be
released soon, will have a brand new extension mechanism (I do not
know the details though).

I do not know the details for other processors; I am sure this is
possible on some, and not on others. But I think that's also one of
the goals of EXQuery: to show implementers what the real needs for
library writing are, beyond the scope of strict conformance, as well as
investigating best practices.

I think this problem of portable import statements is pretty clear:
we can only rely on the namespace URI and not on the hint, so the
mapping has to be set up outside of the query modules (so in a
processor-specific way: config files, etc.)

Florent Georges

May 20, 2009, 5:48:58 PM
to exquery-fu...@googlegroups.com
2009/5/20 Adam Retter wrote:

> Now in this example providing you deploy the correct random.xqm module
> (i.e. the MarkLogic one to MarkLogic and the eXist one to eXist) you
> suddenly have much much more portable code and your implementation
> specific code is in specific known modules.

Seems we have the same thoughts on that subject :-)

I would be interested to look at various processors to see how many
support importing modules through the use of their namespace URI
alone. I won't have the time to do so in the next two weeks as I
will be abroad, but I'll try to have a look after...

Joe Wicentowski

May 20, 2009, 6:32:48 PM
to exquery-fu...@googlegroups.com
>> To do any of this function mapping, we'd need a way to ask the server
>> its identity. I think James Fuller raised this in his article on IBM
>> Developerworks (where he discussed the variation in random()
>> functions), but do we have a good way via XQuery to ask the server
>> whether we're using eXist or Mark Logic (or the other XQuery
>> implementations out there)?
>
[Adam]

> Well I wonder if you really need to know? Or rather the question is,
> when do you need to know! - Perhaps only at application deployment
> time.

Interesting - I hadn't thought of it this way! I had pictured a
single exquery:random() function with a series of if/thens testing the
server environment and then executing implementation-specific
functions. But I suppose that would throw errors since the other
implementation's functions wouldn't work.

[Adam]


> Now in this example providing you deploy the correct random.xqm module
> (i.e. the MarkLogic one to MarkLogic and the eXist one to eXist) you
> suddenly have much much more portable code and your implementation
> specific code is in specific known modules.

...
[Florent]


> Seems we have the same thoughts on that subject :-)

Okay, I'm glad you guys are on the same page!

When you say "deploy the correct random.xqm module", do you picture
this involving renaming directories (i.e. rename modules-exist/ to
modules/ for eXist, and modules-marklogic/ to modules/ in the case of
Mark Logic)? I'm trying to picture how development/deployment would
work. It's a theoretical question at the moment, but it'd be helpful
for picturing the road forward.

[Florent]


> I think this problem of portable import statements is pretty clear:
> we can only rely on the namespace URI and not on the hint, so the
> mapping has to be set up outside of the query modules (so in a
> processor-specific way: config files, etc.)

Ah, I think I see where you're going. Don't rename directories, but
set configuration files -- to provide the processor with a hint as to
where to find its implementation-specific modules.

Thanks,
Joe
