"http://expath.org/ns/core"
"http://expath.org/ns/<module>"
We need a function:
core:function-available(AnyURI namespace, String name) as boolean
core:function-available(AnyURI namespace, String name, int arity) as boolean
core:function-available(QName function) as boolean
core:function-available(QName function, int arity) as boolean
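A minimal sketch of what these four signatures amount to, modeled in Python rather than XPath. The registry contents, the names function_available and function_available_qname, and the use of Clark notation ("{uri}local") for QNames are assumptions made for illustration only; they are not part of the proposal.

```python
# Hypothetical registry of implemented functions:
# (namespace URI, local name) -> set of supported arities.
REGISTRY = {
    ("http://expath.org/ns/core", "http-get"): {1},
    ("http://expath.org/ns/core", "function-available"): {1, 2, 3},
}


def parse_qname(clark):
    """Split a QName written in Clark notation, '{uri}local', into (uri, local)."""
    if not clark.startswith("{") or "}" not in clark:
        raise ValueError("expected '{namespace}local-name'")
    uri, local = clark[1:].split("}", 1)
    return uri, local


def function_available(namespace, name, arity=None):
    """True if the function is registered; when arity is given, it must match too."""
    arities = REGISTRY.get((namespace, name))
    if arities is None:
        return False
    return arity is None or arity in arities


def function_available_qname(qname, arity=None):
    """QName flavour of the same check."""
    return function_available(*parse_qname(qname), arity=arity)
```

With this registry, asking for http-get with arity 2 answers false while the arity-less check answers true, which is the distinction the arity parameter is meant to expose.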
Extending
We can add (because we have function-available())
* a function with a new name (as long as it makes sense in the current package)
* a function with the same name but a different arity than an existing one
We can create a new package
We can deprecate an existing function in a package through documentation
If we want to refactor completely, we have the following choices:
Having a new major branch
"http://v2.expath.org/ns/core"
or local branching
"http://expath.org/ns/core/v2"
or a new module
Xmlizer
Hi,
> versioning proposal
> "http://expath.org/ns/core"
> "http://expath.org/ns/<module>"
Yes, I agree with that, providing as you said that a module can
introduce a version number if absolutely necessary. I am not sure the
"core" ns should be treated specially, but who knows?
> core:function-available(AnyURI namespace, String name) as boolean
> core:function-available(AnyURI namespace, String name, int arity) as boolean
> core:function-available(QName function) as boolean
> core:function-available(QName function, int arity) as boolean
I think xs:string would be better than xs:anyURI (for the same
reasons F&O functions take strings for URIs.)
> Extending
> We can add (because we have function-available())
> * a function with a new name (as long as it makes sense in the current package)
> * a function with the same name but a different arity than an existing one
Those cases do not really need function-available() for versioning
purposes. If the implementation does not support them, there would be
a compilation error.
But if one has an alternative, as David said, it would maybe be
interesting for her to be able to detect it at compile-time. But I am
not sure how we could handle that in an elegant way. If we backport
XQuery pragmas, we'll generate compile-errors if the implementation
does not support them, and if we use something like (:# ... #:) we
could generate compile-time errors for legitimate XPath comments, or
not reach our goals if the implementation does not recognize it.
I wonder if this is not the responsibility of the host language to
ensure the existence of the function (name or arity,) and to take
adequate behavior if not (use-when in XSLT, pragma in XQuery, or a
separate call to function-available() if, say, in Java.)
So about namespace URIs: +1. About function-available(): +1. About
preventing compile-time errors for unsupported functions: yes, but
how?
Regards,
--
Florent Georges
http://www.fgeorges.org/
I think that will not quite be possible in XQuery. The spec says it's
a static, compile time error if a function is not available. E.g. in
case we have a newer http-get function that supports an additional
cookie parameter, this would not work:
  if (function-available("http://expath.org/ns/core", "http-get", 2)) then
    expath:http-get('http://www.example.com', $mycookie)
  else
    expath:http-get('http://www.example.com')
It might somehow work if we have higher order functions, and they
support dynamically instantiating functions:
  let $http-call as function() :=
    if (function-available("http://expath.org/ns/core", "http-get", 2)) then
      get-function('expath:http-get#2')
    else
      get-function('expath:http-get#1')
  return ...
... but with plain XQuery as it is now, such a dynamic fall back is
impossible. Maybe this would make a good case for a first extension
function.
Martin
Or rather util:eval($namespace, $functionname, $args...)?
Martin
Eh yes, I'd just rather see a dynamic function invocation function.
That seems less invasive, and we all know eval is evil.
util:invoke($namespace, $functionname, $args...)
That way you don't have to take care you get the syntax right, you
don't have to invoke a parser, and so on.
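To make the difference from eval concrete, here is a rough sketch in Python of name-based invocation with a fallback on arity. The function table, http_get_1 and fetch are hypothetical stand-ins; nothing here is an existing EXPath or vendor API. The point is only that the function is found by a table lookup on (namespace, name, number of arguments) and no expression text is ever parsed.

```python
NS = "http://expath.org/ns/core"


def http_get_1(url):
    # Hypothetical one-argument http-get; returns a description of the request.
    return ("GET", url, None)


# Hypothetical function table: (namespace URI, local name, arity) -> implementation.
FUNCTIONS = {(NS, "http-get", 1): http_get_1}


def function_available(namespace, name, arity):
    return (namespace, name, arity) in FUNCTIONS


def invoke(namespace, name, *args):
    """Look the function up by name and argument count, then call it."""
    try:
        fn = FUNCTIONS[(namespace, name, len(args))]
    except KeyError:
        raise LookupError(f"no function {{{namespace}}}{name}#{len(args)}") from None
    return fn(*args)


def fetch(url, cookie):
    """Prefer the two-argument http-get, fall back to the one-argument form."""
    if function_available(NS, "http-get", 2):
        return invoke(NS, "http-get", url, cookie)
    return invoke(NS, "http-get", url)
```

Because the choice between the two calls is made at run time, a missing two-argument form is never referenced statically, which is what the plain if/else version cannot avoid.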
Martin
>> > if ( util:function-available("http:send-request", 1) ) then
>> > util:eval("http:send-request(...)")
>
>> Or rather util:eval($namespace, $functionname, $args...)?
>
> I am not sure I understand; a dynamic evaluation function would be
> able to evaluate any XPath expression: util:eval("1 + 2").
Blowing the eXist trumpet for a second (sorry), we have something
similar already -
util:function() and util:call()
util:call(util:function(xs:QName("http:send-request"), 1), $args)
Documentation is available here -
http://demo.exist-db.org/exist/xquery/functions.xql
--
Adam Retter
EXQuery Founder
t: +44 (0)7714 330069
e: adam....@exquery.org
w: www.exquery.org
However the current proposal at
http://snelson.org.uk/~jpcs/higher-order-functions.html will not work
for the use case of function-available, as it cannot do dynamic
function calls.
About eval: to make this actually useful, we would need to use some
eval function that evaluates the whole query within the current
context, with namespaces and variables and all. I don't know about
other XQuery implementations, but in xDB this is not going to be easy,
as at run time we have already "forgotten" much of the needed
information and/or transformed it into a more execution-suited format.
I really have a strong dislike of concatenating program statements at
run time, and I think it doesn't make a difference whether we're in a
declarative or imperative programming language. The kinds of bugs, the
implementation overhead, and the security problems are all the same.
> This is of course a not terribly secure case - I see predicate as being a
> mechanism for converting something like a keyword set from a Search box on a
> browser into a much more secure query, but the idea is there nonetheless.
That is exactly part of the problem. Having written three different
XQuery parsers, I'd have true pity for someone trying to escape
literals to keep an XQuery injection attack away.
> I think we need to be careful about not getting so caught up in the notion
> that something that is bad in imperative languages (and there I would agree
> that eval() is evil) should also be seen as being bad in declarative ones. I'll
> probably be taken to task for this, but I'm not necessarily ready to abandon
> eval() without there being a rational reason to do so in the context of the
> language.
There are use cases for eval, granted. But I think our use case at
hand here - dynamically picking a function to call - is much simpler.
My rule of thumb is if you can get away without using eval, do so. And
I think this is such a case.
Martin
A great endorsement for eXist ;-) But hang on just a second... I agree
that there are a lot of good general purpose functions in there (I am
sure other implementations probably have many of them as well),
however there are several that are very eXist specific (locking,
indexing, optimisation, some of the eval stuff) or out of reach of a
pure XPath effort (just too complicated to make use of in XPath) and
more relevant to EXQuery.
Also the util module has suffered more than any module in eXist from
being a general purpose dump for anything that doesn't really fit into
its own module. Because we had no versioning (am I really starting
that conversation again?!?), we have never been able to remove or tidy
that module. Personally in maybe release 1.6 of eXist (1.4 will be out
soonish and odd versions are development branches), I want to do a
complete overhaul and reorganisation of the modules whilst culling
anything legacy. Yes it may break applications, but we are carrying a
lot of cruft around and it's getting to the point where it's hard for
developers to locate the function they need from the documentation -
actually EXPath and EXQuery modules could be the perfect excuse ;-)
> I was going to say I didn't see the point of such a function,
> the behavior of which could be provided by a general purpose
> util:eval() function or by first-class function objects. But
> actually, I guess that could be useful for that exact issue of
> detecting at runtime if a function is supported, without
> requiring the whole machinery for full evaluation or first-class
> object functions.
>
> Could someone comment on that?
I would always avoid full evaluation where possible.
As a developer with an eye on security I see util:eval() as the devil's
own work; for me this is right up there beside native language binding
in XQuery or XSLT. If I can provide constructs which are more limited
then I will always do so. I am not saying that it should not be
available, but in an implementation I am providing, such functionality
will be disabled by default, as it introduces too much risk.
I am sure we are all aware of SQL injection attacks; has anyone tried
an XQuery injection attack? I am sure it is possible. With util:eval()
enabled in something like an HTTP REST context, and people having easy
access to an eval function and using parameters from POST or GET, I
can see a lot of problems occurring.
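The injection risk can be shown with a toy example. The sketch below uses Python's own eval as a stand-in for util:eval(); the function names and the "stored"/"attack" values are invented for illustration and do not correspond to any real eXist or xDB API. The unsafe variant pastes a request parameter into the expression text, the other keeps it as a value.

```python
def check_by_eval(stored, user_input):
    # Unsafe: the request parameter becomes part of the expression text.
    # Builtins are disabled and only the variable 'stored' is in scope.
    query = "stored == '" + user_input + "'"
    return eval(query, {"__builtins__": {}}, {"stored": stored})


def check_by_argument(stored, user_input):
    # Safer: the parameter stays a value and never becomes expression text.
    return stored == user_input


# A parameter crafted to change the structure of the expression:
# the query text becomes  stored == '' or '1' == '1'
attack = "' or '1' == '1"
```

With the crafted parameter the eval-based check succeeds although the value does not match, while the argument-based check fails as it should. Escaping the literal correctly is exactly the burden that dynamic invocation with separate arguments avoids.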
Sure, such issues are perhaps a developer training issue, but I prefer
a more proactive approach: if I can prevent stupid mistakes from
happening then hopefully everyone's life becomes easier.
Perhaps I am mistaken, but I wonder if we are not overstepping the
scope of EXPath. EXPath is for XPath and should surely develop only
what is permissible in XPath?
> That is exactly part of the problem. Having written three different
> XQuery parsers, I'd have true pity for someone trying to escape
> literals to keep an XQuery injection attack away.
> There are use cases for eval, granted. But I think our use case at
> hand here - dynamically picking a function to call - is much simpler.
> My rule of thumb is if you can get away without using eval, do so. And
> I think this is such a case.
I completely agree with all of these statements, some excellent points.
I understand the purpose of the extension functions. But I have some
questions otherwise. When you start extending XPath as a language, are
you not really turning it more and more into XQuery (or what the next
version of XQuery is likely to include)? Why not just use XQuery
instead of XPath if you need language functionality that XPath does
not have?
> At present many people have identified at least one feature as badly
> needed: the support for a nested sequence -- a sequence that can have
> another sequence as an individual item.
Excuse my ignorance but I am not a pure XPath developer, my background
is really in XQuery. This may not be the place to ask, but for my sake
can you explain this need briefly - why a sequence of sequences rather
than a node set?
((1,2,3), (8,9,0)) =
<sequence><sequence><item>1</item><item>2</item><item>3</item></sequence><sequence><item>8</item><item>9</item><item>0</item></sequence></sequence>
> Some people (most notably Florent) have proposed other extensions to
> XPath (such as the Try ... Catch mechanism). While this seems a lot
> more tricky to implement in a functional setting, why not address the
> issue at all? If implemented properly, the benefits would be huge. A
> lot of people are asking how to determine if the document() function
> was successful or if there was timeout when consuming a remote
> service. I believe that having try... catch would provide a good
> mechanism to address such kind of issues.
try/catch would certainly be welcome in XQuery ;-)
> In just a few words, we agree that a couple of extension functions
> (one for a reference to a sequence and another for dereferencing a
> sequence) are sufficient for a very basic, crude implementation of
> nested sequences.
That's at least the first step. Once we have that, we could always
consider its drawbacks and new reasonable orientations :-)
> Excuse my ignorance but I am not a pure XPath developer, my background
> is really in XQuery. This may not be the place to ask, but for my sake
> can you explain this need briefly - why a sequence of sequences rather
> than a node set?
> ((1,2,3), (8,9,0)) =
> <sequence><sequence><item>1</item><item>2</item><item>3</item></sequence><sequence><item>8</item><item>9</item><item>0</item></sequence></sequence>
But then, you cannot have a complex data structure that preserves all
the properties of the nodes that you would like to reference from it,
for example their identity or their validation info. When you add
an item to a node, it is copied, and some properties of the resulting
node differ from those of the original one.
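As an illustration of that copy semantics, here is a small Python model. The Node class and add_child are invented for this sketch; resetting the type annotation to xs:untyped mirrors what element construction does under the "strip" construction mode, and is assumed here for simplicity.

```python
from dataclasses import dataclass, field, replace


@dataclass(eq=False)  # eq=False: two nodes are equal only if they are the same object
class Node:
    name: str
    type_annotation: str = "xs:untyped"
    children: list = field(default_factory=list)


def add_child(parent, child):
    """Mimic element construction: the child is copied and its type annotation reset."""
    copied = replace(child, type_annotation="xs:untyped", children=list(child.children))
    parent.children.append(copied)
    return copied


original = Node("price", type_annotation="xs:decimal")
wrapper = Node("sequence")
copied = add_child(wrapper, original)
```

After the call, the node inside the wrapper is a different object from the original and has lost its xs:decimal annotation, so a wrapper element cannot serve as a reference to the original node.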
> try/catch would certainly be welcome in XQuery ;-)
It is on its way:
http://www.w3.org/TR/2008/WD-xquery-11-20081203/#id-try-catch :-)
Sure but I think it is only a matter of time before HOF and try/catch
will make it into XQuery, we can assist it along the way of course. My
point was that XPath should not turn into XQuery. XPath fulfills a
very simple need (and does it very well) and if you need more
complexity then there is XQuery.
>> ((1,2,3), (8,9,0)) =
>> <sequence><sequence><item>1</item><item>2</item><item>3</item></sequence><sequence><item>8</item><item>9</item><item>0</item></sequence></sequence>
>>
>
> Because this is both more awkward and prohibitively expensive. (This
> is how a nested sequence *has to be modelled* at present in FXSL, and
> this is quite painful).
Whilst trying not to be unrealistic, I am wondering if this is not
a detail of the underlying implementation? Is it not the case that
if the implementation was more efficient at handling nodes then you
would not need a nested sequence?
If that is the case (it might not be) would it not be better to invest
time improving the performance of the underlying implementation?
>> Excuse my ignorance but I am not a pure XPath developer, my background
>> is really in XQuery. This may not be the place to ask, but for my sake
>> can you explain this need briefly - why a sequence of sequences rather
>> than a node set?
>> ((1,2,3), (8,9,0)) =
>> <sequence><sequence><item>1</item><item>2</item><item>3</item></sequence><sequence><item>8</item><item>9</item><item>0</item></sequence></sequence>
> But then, you cannot have a complex data structure that preserves all
> the properties of the nodes that you would like to reference from it,
> for example their identity or their validation info. When you add
> an item to a node, it is copied, and some properties of the resulting
> node differ from those of the original one.
Besides, of course, you cannot create nodes in XPath :-)
In eXist we do have a mechanism for marking functions as deprecated,
with some accompanying documentation - just as you suggested actually
;-) But we then encounter the problem of not knowing how long we
should wait before removing a function in its entirety.
> hash:set-value($hash-obj as ^hash^,$key as xs:string, $obj as xs:any*)
Do you mean the hash passed as parameter should be modified as a
side-effect of the function call?
So how would you create a nested sequence? Short of encoding such a sequence as a sequence of XML structures, each of which is an encoding of a sequence, I can't see how you'd do it.
First, about side-effects of an extension function, it would be
better to avoid them as much as possible. They introduce
unpredictability in code, and optimizers in existing processors
actually show problems with such extensions.
> So how would you create a nested sequence? Short of encoding such a sequence
> as a sequence of XML structures, each of which is an encoding of a
> sequence, I can't see how you'd do it.
By using what XSLT 2.0 calls an external object:
http://www.w3.org/TR/xslt20/#external-objects
This is an item(), but neither an atomic type nor a node(). Exactly
as the proposed function() item type. So the exact type is a black
box (and in plain, standard XPath 2.0 it can only be represented as
item() in an ItemType):
http://www.w3.org/TR/xpath20/#prod-xpath-ItemType
The user can only pass it to other functions defined as such by
the module (besides adding it to a sequence, for instance in a
variable.) For instance (let's abbreviate a nested sequence by ns,
even if it is not the best option in the XML world, but I don't have
enough imagination right now):
let $ns1 as item() := ns:make-ns((1, 2, 3))
let $ns2 as item() := ns:make-ns((5, 6, 7))
let $s as item()+ := ($ns1, 4, $ns2)
return
( count($s), ns:atomize($s[1]), $s[2], ns:atomize($s[3]) )
that would produce:
3, 1, 2, 3, 4, 5, 6, 7
3 is the length of $s (one nested sequence, one integer, and one
other ns.) Then 1, 2, 3 is the content of the first nested sequence
(that is, the first item of $s,) which must be explicitly atomized
before using its items. 4 is the second item of $s, the integer. And
finally 5, 6, 7 is the content of the last ns (the third item of $s.)
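The same walk-through can be mimicked in Python to check the arithmetic. This is only a model under stated assumptions: tuples stand in for XDM sequences (which flatten when combined), NestedSeq is an invented opaque item, and make_ns/atomize are Python stand-ins for the hypothetical ns:make-ns and ns:atomize. Note that Python indexes from 0 where XPath indexes from 1.

```python
class NestedSeq:
    """Opaque item wrapping a sequence; only make_ns and atomize look inside."""
    __slots__ = ("_items",)

    def __init__(self, items):
        self._items = tuple(items)


def seq(*parts):
    """Build a flat sequence: nested tuples are flattened, a NestedSeq stays one item."""
    flat = []
    for part in parts:
        if isinstance(part, tuple):
            flat.extend(seq(*part))
        else:
            flat.append(part)
    return tuple(flat)


def make_ns(items):
    """Wrap a sequence into a single opaque item."""
    return NestedSeq(seq(items))


def atomize(item):
    """Unwrap an opaque item back into its sequence."""
    if not isinstance(item, NestedSeq):
        raise TypeError("atomize expects a nested sequence item")
    return item._items


ns1 = make_ns((1, 2, 3))
ns2 = make_ns((5, 6, 7))
s = seq(ns1, 4, ns2)  # three items: one nested sequence, one integer, one nested sequence
result = seq(len(s), atomize(s[0]), s[1], atomize(s[2]))
```

Here result is the flat sequence 3, 1, 2, 3, 4, 5, 6, 7, matching the output described above, and nothing except make_ns and atomize ever needs to know what a NestedSeq contains.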
Regards,
Ouch - It would be my preference to avoid the use of Item at all
possible costs. Passing around an unknown object just feels very
wrong. We used to have a similar situation in eXist where we had a
type to represent a Java File Object and you could pass that in and out
of various functions - we have subsequently chased that out of the
extension functions almost entirely now.
Although that said, it would seem that Item is appropriate to this
situation if we really can't come up with anything better.
> The user can only pass it to other functions defined as such by
> the module (besides adding it to a sequence, for instance in a
> variable.)
> even if it is not the best option in the XML world, but I don't have
> enough imagination right now):
Me neither, will give it some thought tomorrow...
We can avoid it by just not speaking about it -- the same way this
topic is avoided in XPath/XQuery/XSLT by not speaking about what is
inside a variable that contains a sequence:
<xsl:variable name="mySeq" select="1 to 1000"/>
my:foo($mySeq)
What is contained in $mySeq and passed to my:foo() above is not
all the numbers from 1 to 1000, but rather a *reference* to the
sequence 1 to 1000.
We possibly would not even need an item() for this type if we cover it
well in expressions like:
(1, makeRef(2 to 10), 3)
or
subsequence(deRef($seq[3]), 2,1)
Well this is also obviously XPath
Xmlizer
I think in XQuery one has to use the "let" clause.
In pure XPath it is not possible at all to define a variable that
contains a sequence. This:
> for $myNum in (1 to 1000) return $myNum
is simply equivalent to:
1 to 1000
and this is not a definition of a variable :(
> I think one way to think about it here is that we are walking on the edge of
> formalizing something that has remained unformalized for a while - the
> underlying relationship between a namespace and an object definition in
> XSLT/XQuery. Right now, existing modules are pre-OOP - think of them as
> global static classes. This changes the moment we get into the modality
>
> let $oRef := ns:new($paramSeq...)
> let $result := ns:someFn($oRef,$paramSeq...)
>
There's absolutely nothing special in the above lines and in fact a
person who doesn't know English (doesn't know the meaning of "new")
and OOP will not see any difference between the above and:
let $myRef := ref($someSeq)
let $result := foo($myRef,$paramSeq...)
The fact that you *think* "new" means calling a constructor to create
a new instance of a type does not at all mean that this will *really*
take place in an actual implementation. In a particular
implementation there might be lazy evaluation, optimization and due to
various reasons creating a ref may actually be a no-op, as this ref
may already exist and be used for various other purposes.
> I think the question that needs to be determined is whether in fact this
> design pattern is one that EX* should endorse. It solves a number of
> problems, but it also introduces a number as well. For instance, even
The reasoning above proves that this is only an imaginary problem. It
would be good to warn any people that might be too influenced by OOP
that no such problem actually exists.
>
> Admittedly - the problem with the discussion here is that the inability to
> create variables in XPath means that any entity that IS created has to be
> anonymous:
>
> (for $myNum in (1 to 1000)[. mod 2 =0] return $myNum)
>
> creates two sequences (the entire sequence and (1 to 1000)), but because you
> can't create variables, they are simply expressions. However, there are
> still operations that can be defined upon the expressions overall.
Umm... it may be simpler, just:
(1 to 1000)[. mod 2 =0]
> The point I'm making is not in the functional refs but in the
> assumption that these are in a distinct namespace. That is to say, whether
> using new(), create(), instantiate(), ref() or whatever, there is a
> difference between:
>
> let $ref1 := ns1:ref()
>
> and
>
> let $ref1 := ns2:ref()
>
> Once you create the idea of a core object class that is tied to a given
> namespace implementation,
There is nothing that makes us associate a namespace with an object.
Haskell has its modules and a module is like a namespace. If you
import two modules that have functions with the same name you need to
specify some disambiguation (isn't this exactly so with the XQuery
modules?).
This still doesn't make a module an object.
>
> The problem that we face with XQuery/XPath modularization is that it IS
> conceptually possible to use an OOP like construction, the question is
> whether it is advisable.
As explained above, this is not "OOP-like", but just a good
modularization practice.
> The point to consider here is that this is an object description - an
> instantiator (or parser), an accessor, a serializer. As you indicate, the
> names are less relevant than the larger issue that this is an object, and as
> an object it will be treated in different ways than a library module of
> utility functions (global static methods) would.
You may *think* of it as an object description, but in fact it isn't.
People who don't know about OOP wouldn't agree to this being a
specification of an object and wouldn't care, or worse, may be
confused.
They will certainly point to this as a good modularization practice.
It is good to pray to God, no matter that this God is different for
different people :)
>
> The question I would have is whether this modality should be encouraged
> within EX*, or not. If so, then it suggests that we need to think about the
> mechanics of such a system and how they could be made consistent across
> platforms - otherwise, we continue to punt and run into a problem down the
> road when we have half a dozen different object (inconsistent) instantiation
> systems. This is why I bring it up.
I strongly believe that EXPath will better serve its mission if it
embraces a number of good design practices, one of which is
modularity. Not preaching a single religion will also be a definite
plus.
I'd be very, very careful in doing things like these.
If you add nested sequences to the type system of XPath, you will have
a lot of specification work to do. How do all the built in operators
work on nested sequences (=, eq, ..)? How do they interact with the
standard function library? Of course you could always say "they don't,
and it's an error", but that might not be very useful then.
Same goes for object orientation. XQuery is a functional language, so
side effects are forbidden. And that is for a very good reason - if
you drop this rule, you lose lazy evaluation and statement
reordering, which are really nice and really important optimization
techniques. The sql:connect extension would for example leak
connections in xDB as lazy evaluation would prevent the disconnect
call from happening.
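A rough model of that leak, in Python. The Lazy class and the connect/execute/disconnect functions are invented for this sketch and say nothing about how xDB or any sql extension actually works; they only show what happens when a side-effecting call is bound to a variable that the result never demands.

```python
class Lazy:
    """A let-binding whose expression runs only when its value is demanded."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value


def run_query(log):
    def connect():
        log.append("connect")
        return "conn"

    def execute(conn):
        log.append("execute")
        return ["row 1", "row 2"]

    def disconnect(conn):
        log.append("disconnect")
        return []

    conn = Lazy(connect)
    rows = Lazy(lambda: execute(conn.force()))
    closed = Lazy(lambda: disconnect(conn.force()))  # bound, but never demanded
    return rows.force()  # only the result expression is evaluated
```

Running run_query records "connect" and "execute" but never "disconnect": since the result does not depend on the disconnect binding, a lazy evaluator has no reason to run it, and the connection stays open.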
But object orientation is going to be hardly useful without objects
carrying around state. I know it's possible to have immutable objects
only, but 99% of your users will not expect it to work that way.
If you are extending the language this way, you had better know very
well what you're doing. Otherwise, you might end up with a variant of Java
with a really ugly syntax and some special XML processing
capabilities.
I know that quite a few people like to use XQuery as a general purpose
programming language, and of course that is where requests for nested
sequences and objects come from, but I'm really not sure if that is a
good idea. I can't quite see a way to extend the language to handle
these use cases without breaking it beyond repair in some areas.
Martin
> I especially do not want to reformulate and embed OO concepts at expath or
> exquery level; but I do want to have 'just enough' added to the language to
> make it a pragmatic tool.
My problem with a pragmatic tool is that pragmatism tends to break
certain kinds of scrutiny, and you might well end up with designing
language extensions that are extremely pragmatic, but totally break
certain use cases.
One of these use cases is certain optimizations, and that is what
XQuery was designed to do well in the first place. It's already hell
to optimize currently; I'd rather not see more complexity added there.
I'm particularly wary of the things that add side effects. They are
indeed very pragmatic, but IMHO totally break the language in very
important aspects. They do work in one certain implementation, but
that's clearly not enough, at least to me.
About application use cases: there are many interesting domains, and
certainly you can achieve a lot with XQuery. But I'd rather have a
programming environment that lets me fall back to Java (Python, Ruby,
...) or something else when necessary than shoe-horning lots of
features into XQuery.
It's very easy to break a language beyond any recognition by
pragmatically adding features - see PHP for an example. We shouldn't
do that. I'm all in favor of adding small things here and there that
enable certain well defined use cases (like HOF or dynamic function
invocation, better HTTP access and so on), but IMHO we should refrain
from significantly changing the language.
Martin
CC'ing exquery, even though I guess most people are on both lists anyways?
>> This is an item(), but neither an atomic type nor a node(). Exactly
>> as the proposed function() item type. So the exact type is a black
>> box (and in plain, standard XPath 2.0 can only be represented as
>> item() in an ItemType:
> Ouch - It would be my preference to avoid the use of Item at all
> possible costs.
I understand this concern. But on the other hand, that's really on
purpose here. We do not want to impose more restriction on this type
(like "it must be a node()," or "it must be a xs:string.") We just
want *exactly one* item. And one cannot do anything else than passing
it to some other functions of this module.
If we were defining a new version of XPath, we could add a new type,
sibling to node(). And provide an ItemType as "sequence-ref()" or
"nested-sequence()." But to only introduce a set of extension
functions, without changing the grammar, we can just stay with
"item()."
Furthermore, nested sequences will typically be used to build
heterogeneous sequences (where the SequenceType cannot really be more
precise than "item()+".) When you have a sequence of strings and
integers, you cannot state that fact precisely, only say that you can
have several items: item()+.
Of course, it must be carefully designed to be sure standard F&O are
properly handled (you could do almost nothing with this type except
with the functions explicitly designed to deal with it.)
The goal is to enable building complex data structures. But then,
it is the responsibility of the developer to explicitly handle
conversions between those structures and the traditional sequences of
XDM.
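To make the idea concrete, here is a minimal sketch in Python (not
XPath) of how such a module could behave. Everything in it is an
assumption made for illustration: the names seq(), nest(), unnest()
and Nested are hypothetical, an XDM sequence is modeled as a tuple
that flattens on concatenation, and a nested sequence is modeled as an
opaque wrapper that counts as exactly one item.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Nested:
    """Opaque item wrapping a whole sequence (hypothetical name)."""
    _items: Tuple[Any, ...]


def seq(*parts: Any) -> Tuple[Any, ...]:
    """Model XDM concatenation: tuples flatten, anything else is one item."""
    out = []
    for part in parts:
        if isinstance(part, tuple):
            out.extend(seq(*part))   # plain sequences never nest on their own
        else:
            out.append(part)         # atomic values and Nested stay one item
    return tuple(out)


def nest(sequence: Tuple[Any, ...]) -> Nested:
    """Explicitly wrap a sequence into a single opaque item."""
    return Nested(tuple(sequence))


def unnest(item: Any) -> Tuple[Any, ...]:
    """Explicitly unwrap; the only way to get at the wrapped items."""
    if not isinstance(item, Nested):
        raise TypeError("unnest() expects a nested-sequence item")
    return item._items


flat = seq((1, 2), ("a", "b"))               # pairing information is lost
pairs = seq(nest((1, "a")), nest((2, "b")))  # two items, each a pair
```

In this model the static type of pairs can only be described as
item()+, and the developer has to call unnest() on each item to get
back an ordinary sequence.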
> XPath and all its implementations already have the reference type, but
> they are not speaking about it openly.
I am a bit surprised to see this assertion on a quite regular
basis. In my very humble opinion, XPath does not have anything like
this. Indeed, implementations will not copy all of a sequence's values
if it is used in several places, but this is just an implementation
"detail." Sequences are defined by value, and do not have any
identity.
But I might be wrong.
> Michael Kay calls the reference with its proper name in a number of
> places in his books.
Do you have any precise reference (well, I mean pointer, well, you know :-p) ?
>> The problem that we face with XQuery/XPath modularization is
>> that it IS conceptually possible to use an OOP like
>> construction, the question is whether it is advisable.
> As explained above, this is not "OOP-like", but just a good
> modularization practice.
Yes, I agree with Dimitre. IMHO this is "just" encapsulation.
While this is one very important principle of object orientation,
this is not the only one, and this does not make something object
oriented by itself.
And I do think encapsulation is very valuable in X*, and that's
one reason I'd really like to have the ability to nest sequences:
to be able to define complex data structures, encapsulated by
function libraries.
> If you add nested sequences to the type system of XPath, you
> will have a lot of specification work to do. How do all the
> built in operators work on nested sequences (=, eq, ..)? How do
> they interact with the standard function library? Of course you
> could always say "they don't, and it's an error", but that
> might not be very useful then.
Funny, I do think this is exactly how it would be useful.
IMHO the goal is not to define black magic rules to decide
automagically when a sequence has to be nested or unnested, but
instead to provide a way to explicitly nest or unnest sequences.
In almost all other cases, an error should be thrown: with value
comparison operators (maybe also with general comparison ops, or
those could just return false,) with fn:data() and fn:string(), when
trying to add it to a tree, etc.
That would guarantee that there is no silent, unwanted result:
either the handling is explicit, or we get an error.
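The "explicit or error" rule can be sketched as follows. This is a
Python model, not XPath, and all names in it are hypothetical:
value_eq() stands in for the "eq" operator, string() for fn:string(),
and NestedSequence for the opaque item. It only illustrates the policy
that a nested item is rejected everywhere except by the function
designed to open it.

```python
class NestedSequenceError(TypeError):
    """Raised when a nested item reaches an operation that cannot handle it."""


class NestedSequence:
    """Opaque item holding a sequence (hypothetical model)."""

    def __init__(self, items):
        self._items = tuple(items)


def _reject_nested(op, *operands):
    # Shared guard: no operation silently looks inside a nested item.
    for operand in operands:
        if isinstance(operand, NestedSequence):
            raise NestedSequenceError(
                f"{op}: nested sequence must be unnested explicitly")


def value_eq(left, right):
    """Model of the 'eq' value comparison."""
    _reject_nested("eq", left, right)
    return left == right


def string(item):
    """Model of fn:string()."""
    _reject_nested("fn:string", item)
    return str(item)


def unnest(item):
    """The one function allowed to open a nested item."""
    if not isinstance(item, NestedSequence):
        raise NestedSequenceError("unnest: not a nested sequence")
    return item._items
```

With this policy, value_eq(NestedSequence([1]), 1) raises an error
instead of returning a guessed answer, while
value_eq(unnest(NestedSequence([1]))[0], 1) is an ordinary comparison.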
>>> Michael Kay calls the reference with its proper name in a
>>> number of places in his books.
>> Do you have any precise reference (well, I mean pointer, well,
>> you know :-p) ?
> There was another discussion on this topic not long ago, which
> you probably remember well, in the EXSLT list where I provided
> a number of citations (including page numbers) from a number of
> books by Dr. Kay. AFAIK his statements to the same effect can
> be found in his lists/newsgroups replies, too.
Thanks. I found for instance the following from EXSLT list
http://lists.fourthought.com/pipermail/exslt/2008-December/001782.html:
|> The <xsl:sequence> instruction ... is the only XSLT
|> instruction (with the exception of <xsl:perform-sort>) that
|> can return references to existing nodes, as distinct from
|> newly constructed nodes.
Well, in this case, I would say the important words are "as
distinct from newly constructed nodes"; "references" is used
mainly to contrast with that. You could equally say "that can
return existing nodes..." I am not sure this is enough to say
"XPath has the reference type."
Nodes are added to sequences by identity (aka by reference) but
I don't think that's the same thing as having a reference type.
Regards,
> it is not so important how you (or I) feel about that, what is
> important is *the fact* that references are *de facto* already there
> in XPath.
Sure. You just haven't convinced me yet about that fact :-)
Actually, I simply don't see the point of saying "there is a reference
type in XPath; it is not defined in the spec, you cannot use it, but
it is there." Or I have completely misunderstood you.
> A single node cannot be in two different items, at two different
> positions, at the same time, but its references can.
That's not what I understand from
http://www.w3.org/TR/xpath-datamodel/#sequences. Maybe, if we
continue this discussion, we should do so on XSL List?
Personally, I think true nested sequences should be the responsibility
of the W3C WG. However, if someone wishes to add an extension module
specification to EXPath in the meantime for something that imitates
the functionality of a nested sequence then they are more than
welcome. Please note that I said 'extension module': in my opinion
unofficial extensions to the XPath language itself will prove too
costly.
> Same goes for object orientation. XQuery is a functional language, so
> side effects are forbidden. And that is for a very good reason - if
> you drop this rule, you lose lazy evaluation and statement
> reordering, which are really nice and really important optimization
> techniques. The sql:connect extension would for example leak
> connections in xDB as lazy evaluation would prevent the disconnect
> call from happening.
I have no interest in seeing Object Orientation added. I completely
agree that side-effects in XQuery are a bad idea from the point of
view of lazy evaluation and optimization and I will always try to
avoid such things if possible.
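The connection leak described in the quoted paragraph can be
reproduced in a small model. The sketch below is Python, not XQuery,
and makes several assumptions: Lazy stands in for a lazily evaluated
"let" binding, and connect(), query() and disconnect() are
hypothetical stand-ins for an sql module, not the API of any real
product.

```python
class Lazy:
    """A binding whose expression runs only when its value is demanded."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value


open_connections = set()


def connect():
    handle = object()
    open_connections.add(handle)      # side effect: a connection is opened
    return handle


def query(handle):
    return ["row"]


def disconnect(handle):
    open_connections.discard(handle)  # side effect: the connection is closed


# Roughly: let $c := connect(), $r := query($c), $d := disconnect($c)
#          return $r
c = Lazy(connect)
r = Lazy(lambda: query(c.force()))
d = Lazy(lambda: disconnect(c.force()))  # bound, but never demanded

result = r.force()
leaked = len(open_connections)
```

Because nothing ever asks for the value of d, disconnect() never runs
and the connection stays open after the result has been returned.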
However, this is where I start to diverge from your view. These are
extension modules for doing things that the language was never
intended to do. In my experience people want these additions, which is
what has led me to this point today. If I need to provide some
functionality that can only be achieved with side effects, then I will
add that functionality. The assumption that I make is that it is only
where I have this functionality that the implementation will lose
optimal performance - is that a correct assumption? If so then that is
a price I am happy with.
With regard to the sql:connect() example I used, perhaps it could be
improved as far as side-effects are concerned. It is not actually the
sql module that I implemented for eXist. The sql module in eXist has
no close() function; it automatically cleans up open connections when
the XQuery finishes execution - more side effects ;-p
sql:execute($connectionString as xs:string, $username as xs:string,
$password as xs:string, $sql as xs:string) as element(sql:results)
Perhaps that would be more side-effect free? It would then be up to
the implementation whether to do that as an atomic side-effect free
operation, or whether to use some sort of sql connection pooling etc.
under the hood.
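As an illustration of the single-call shape, here is a minimal sketch
in Python using the standard sqlite3 module. It is an assumption-laden
stand-in, not the eXist module: the function name execute() is
hypothetical, SQLite replaces the connection string, and there is no
username or password because SQLite has none. The point is only that
the connection never escapes the call.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def execute(database, sql, params=()):
    """Open a connection, run one statement, return all rows, then close."""
    with closing(sqlite3.connect(database)) as conn:
        with conn:  # commit on success, roll back on error
            return conn.execute(sql, params).fetchall()


# Usage: every call is self-contained, so no handle is left to clean up.
with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "demo.db")
    execute(db, "CREATE TABLE t (n INTEGER)")
    execute(db, "INSERT INTO t VALUES (?)", (42,))
    rows = execute(db, "SELECT n FROM t")
```

The database itself is still modified, of course, but the caller never
holds a connection whose closing could be optimized away.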
> If you are extending the language this way, you better very well know
> what you're doing. Otherwise, you might end up with a variant of Java
> with a really ugly syntax and some special XML processing
> capabilities.
Yuck!
> I know that quite some people like to use XQuery as a general purpose
> programming language, and of course that is where requests for nested
> sequences and objects come from, but I'm really not sure if that is a
> good idea. I can't quite see a way to extend the language to handle
> these use cases without breaking it beyond repair in some areas.
I do a lot of web application development in XQuery, and personally I
haven't wanted nested sequences or objects yet. But isn't this about
XPath?
> sql:execute($connectionString as xs:string, $username as xs:string,
> $password as xs:string, $sql as xs:string) as element(sql:results)
> Perhaps that would be more side-effect free?
Well, a SQL extension shouldn't be totally free of side effects ;-)