self axis in XPath


Robin Green

Oct 30, 2012, 11:49:24 AM
to scale...@googlegroups.com
What is the syntax for the self axis in XPath? I am trying to select both by element name and attributes.

Chris Twiner

Oct 30, 2012, 3:45:47 PM
to scale...@googlegroups.com
Hi Robin,

The code below shows some of the ways you can stay within "self". I
prefer the last example as it uses self but also returns it.

Let me know if that isn't what you were after.

HTH,
Cheers,
Chris

import scales.xml._
import ScalesXml._
import scales.utils.{top, boolean}

import Functions._

val ns = Namespace("test:uri")
val nsa = Namespace("test:uri:attribs")
val nsp = nsa.prefixed("pre")

val builder =
  ns("Elem") /(
    ns("Child") /@ (nsa("pre", "attr1") -> "val1",
                    "attr2" -> "val2",
                    nsp("attr3") -> "val3"),
    "Mixed Content",
    ns("Child2") /( ns("Subchild") ~> "text" )
  )

val root = top(builder)

val pathsSinglePredicate =
  root.\*{
    p => localName(p) == "Child" &&
      boolean(p \@ nsp("attr3")) // could also be direct on the elem.attributes
  }

println("--- single predicate")
pathsSinglePredicate foreach println

val pathsCombinedPredicatesParent =
  root.\*(ns("Child")).
    \@( nsp("attr3") ). // still on Child matches
    \^ // need to come back up from the attribute

println("--- combined predicates Parent")
pathsCombinedPredicatesParent foreach println

val pathsCombinedPredicates =
  root.\*(ns("Child")).
    *(_.\@( nsp("attr3") )) // still on Child matches, but stays on it

println("--- combined predicates")
pathsCombinedPredicates foreach println

Robin Green

Oct 31, 2012, 7:05:44 AM
to scale...@googlegroups.com
Thanks Chris, that's exactly what I needed.

Unfortunately this was not very easy for me to figure out from the reference docs and the scaladocs. For example, the reference documentation has this translation table:

Scales supports the complete useful XPath axe:
  • ancestor (ancestor_::)
  • ancestor-or-self (ancestor_or_self::)
  • attribute (*@)
  • child (\ or \+ to expand XmlItems)
  • descendant (descendant_::)
  • descendant-or-self (descendant_or_self_::)
  • following (following_::)
  • following-sibling (following_sibling_::)
  • parent (\^)
  • preceding (preceding_::)
  • preceding-sibling (preceding_sibling_::)
  • self (.)

Firstly, which side is XPath and which side is your syntax? You don't make that clear. An expert might be able to tell by sight, but my XPath is a bit rusty (and I seem to remember that XPath itself has a short syntax and a verbose syntax, further complicating matters.) Also "." is the built-in Scala language operator for dereferencing, and "self" doesn't exist - well, my IDE says it does, but it's something completely different...
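For reference, in standard XPath 1.0 (independent of Scales) the verbose form of the self axis is `self::name`, and `.` is the abbreviation for `self::node()`. The following is a minimal sketch of selecting by element name and attribute, using the JDK's built-in `javax.xml.xpath` from Scala rather than the Scales DSL. The object name `SelfAxisDemo`, the helper `count`, and the small un-namespaced document are made up for illustration, loosely modelled on the example earlier in the thread.

```scala
import java.io.StringReader
import javax.xml.xpath.{XPathConstants, XPathFactory}
import org.w3c.dom.NodeList
import org.xml.sax.InputSource

object SelfAxisDemo {
  // Hypothetical sample document: two Child elements (one with attr2)
  // and one differently named element that also carries attr2.
  val xml: String =
    """<Elem><Child attr2="val2">a</Child><Child>b</Child><Other attr2="x"/></Elem>"""

  // Evaluates a standard XPath 1.0 expression and returns the number of matched nodes.
  def count(expr: String): Int = {
    val xpath = XPathFactory.newInstance().newXPath()
    val nodes = xpath
      .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET)
      .asInstanceOf[NodeList]
    nodes.getLength
  }

  def main(args: Array[String]): Unit = {
    // Attribute test alone matches Child and Other.
    println(count("/Elem/*[@attr2]"))                                  // 2
    // Self axis restricts by element name, then the attribute predicate applies.
    println(count("/Elem/*[self::Child][@attr2]"))                     // 1
    // Fully verbose spelling of the same selection.
    println(count("/Elem/child::*[self::Child and attribute::attr2]")) // 1
    // Abbreviated self step: "." stays on the selected Child.
    println(count("/Elem/Child[@attr2]/."))                            // 1
  }
}
```

In the translation table above, the names on the left (ancestor, child, self and so on) are the standard XPath axis names used in expressions like these.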

Then if we turn to the Scaladocs, it's hard to find the right class/trait, and if you do find an apparently relevant one:

/ is said to just "forward the current context". What does that mean?

*@ is said to be all immediate attributes (I didn't know what immediate meant, but now I think I've figured it out) and the overloaded method that I really needed, \@(pred: (AttributePath) ⇒ Boolean), isn't documented at all!

It may well be that some of these things are just standard XPath stuff, but if so, you should say so and preferably direct the reader to a suitable resource to read more.

Chris Twiner

Oct 31, 2012, 6:14:07 PM
to scale...@googlegroups.com
Thanks for the feedback Robin, it's very valuable to me. I've put a
lot of changes into the docs in the last releases, but without such
focussed feedback it's all guesswork as to what is most useful.

I've added a new issue #13 to track it for the 0.5 M1 release.

Robin Green

Jan 4, 2013, 10:40:07 AM
to scale...@googlegroups.com
Thanks for improving the documentation in response to my feedback.

However, I still don't understand what the difference is between \@ and *@. Looking at the code it looks like the \ means go down one level along the current axis, but then I don't understand why

path.\*("foo").*@("bar")

works (and it does work - this is what I have in some of my code that uses 0.4.4). Don't you have to go down one level from an element to get to its attributes, and if the *@ does that anyway, in what situation would you need \@ - or in other words in what situation would *@ and \@ behave differently?

Another question, while I'm at it: What's the difference between the types XPath and XmlPath? I think I've worked out that XPath[PT] is more general and any library methods I create in my own libraries should take XPath[PT]s as arguments instead of XmlPaths - is that right?

Chris Twiner

Jan 4, 2013, 1:00:40 PM
to scale...@googlegroups.com
On Fri, Jan 4, 2013 at 4:40 PM, Robin Green <gre...@gmail.com> wrote:
> Thanks for improving the documentation in response to my feedback.

More than welcome.

> However, I still don't understand what the difference is between \@ and *@.

There isn't a difference any more. At one stage \ always opened up
children, but that led to special-casing text nodes so that they
behaved differently from a simple predicate.

Honestly I'm still not super thrilled with that part of the DSL (\+
for example is a pet hate), but it's driven by the same E1/E2 problem.

>
> Another question, while I'm at it: What's the difference between the types
> XPath and XmlPath?

An XmlPath is any individual "node" in an XmlTree, basically a type
alias for the zipper.

XPath[PT]s are the representation of XPath over XmlPaths. The type
parameter is to allow using Vectors or indeed ImmutableArrays for the
results, but during tests List was the most performant default type.
Given the performance gains for ImmutableArrays it's something I'll
revisit.

The "over" bit is that for each step/axis the number of possible
matches can increase.

> I think I've worked out that XPath[PT] is more general
> and any library methods I create in my own libraries should take XPath[PT]s
> as arguments instead of XmlPaths - is that right?
>

That depends largely on what you want to do. If you are adding steps
to an XPath then XPath[PT] is what you should use. If you are just
using a particular part of a tree then XmlPath should be used, as it's
only a single node.

You can always convert from XPath to an Iterable of XmlPath, but if
performing additional filtering you are better off keeping XPath as it
won't attempt to perform document ordering/filtering. If document
ordering / uniqueness isn't important there are speed gains to be had
from using "raw" for large sets to convert from XPath to
Iterable[XmlPath].

That could sound terribly vague I realise :< but I hope it helps.