[racket] sxpath, txpath and accessors


Sanjeev K Sharma

Mar 1, 2015, 3:44:38 PM
to us...@racket-lang.org

I have tried to send this twice already; this is the third attempt. I hope no one receives it three times or gets pissed.

On this page:
~/.racket/6.1.1/pkgs/sxml/sxml/doc/sxml/sxpath.html

I read this about txpath:

"Like sxpath, but only accepts an XPath query in string form, using the standard XPath syntax.

Deprecated; use sxpath instead."

So I'd like to get the hang of sxpath, but I'm stuck on how to translate txpath syntax to sxpath syntax for axes and accessors.

After the example document below are some of the variations I tried.

#lang racket
(require sxml)
(require sxml/html)
(define doc (html->xexp
"<AAA>
<BBB>
<CCC/>
<www> www content <xxx/><www>
<zzz/>
</BBB>
<XXX>
<DDD>
<EEE/>
<FFF>
<HHH/>
<GGG>
<JJJ>content under jjj
<QQQ><lll/>content under qqq</qqq>
<rrr/>
</JJJ>
<JJJ/>
</GGG>
<HHH/>
</FFF>
</DDD>
</XXX>
<CCC>content in ccc
<DDD/>
</CCC>
</AAA>"))


((txpath"/aaa/xxx/preceding::*")doc);expect;'((zzz) ... (www "\n" " " (zzz) "\n" " "))))

((txpath"/aaa/xxx/preceding::zzz")doc);expect '((zzz))
((txpath"/aaa/xxx/preceding::www")doc);expect '((www "\n" " ... " (zzz) "\n" " ")))
((txpath"/aaa/xxx/preceding::www/following::lll")doc);expect '((lll) (lll))

;from the doc,
;lists in the s-expression syntax correspond to string concatenation in the txpath syntax.
;((sxpath'(aaa xxx ddd fff ggg jjj qqq))doc);expect '((qqq (lll) "content under qqq"))

((sxpath'(aaa xxx preceding))doc)

((sxpath'(aaa xxx preceding::www))doc)
((sxpath'(aaa xxx preceding::zzz))doc)
((sxpath'(aaa xxx preceding::bbb))doc)

((sxpath'(aaa xxx "preceding"))doc)
((sxpath'(aaa xxx "preceding::zzz"))doc);
((sxpath'(aaa xxx "preceding::www"))doc);

((sxpath'(aaa xxx "preceding::*"))doc)
;((sxpath "/aaa/xxx/preceding::*")doc); works
((sxpath'(aaa xxx preceding::*))doc)
((sxpath'(aaa (xxx (preceding::*))))doc)
((sxpath'(aaa (xxx (preceding::*))))doc)
((sxpath'(aaa (xxx (preceding::"*"))))doc)
((sxpath'(aaa (xxx "preceding::*")))doc)
((sxpath'(aaa (xxx ("preceding::*"))))doc)

((sxpath'(aaa xxx preceding::*))doc)
((sxpath'(aaa xxx preceding::*))doc)
((sxpath'(aaa xxx "preceding::*"))doc)
((sxpath'(aaa xxx ,"preceding::*"))doc)
((sxpath'(// aaa /xxx /"preceding::*"))doc)

;((sxpath'(//aaa /xxx/preceding"::*"))doc); error
;(sxml:preceding::*(sxpath'(aaa xxx))doc); error

((sxml:preceding 'eq) doc) ;#<procedure:...l/sxpath-ext.rkt:578:4>
((sxml:preceding(sxpath'(aaa xxx)))doc);#<procedure:...l/sxpath-ext.rkt:578:4>




I'd like to get the other axes and accessors working but now that I've spent some time on this I also REALLY want to know what my blind spot is here. Hope I didn't miss something glaringly obvious.

____________________
Racket Users list:
http://lists.racket-lang.org/users

Vincent St-Amour

Mar 2, 2015, 9:56:34 AM
to Sanjeev K Sharma, us...@racket-lang.org
I think this is what you're looking for:

((sxpath '(aaa xxx (sxml:preceding '(www)))) doc)

I recommend the short tutorial near the top of this document:

http://pkg-build.racket-lang.org/doc/sxml/sxpath.html

It doesn't discuss `sxml:preceding` specifically, but it shows similar
constructs.

Vincent




Sanjeev K Sharma

Mar 3, 2015, 7:00:29 AM
to us...@racket-lang.org
On Mon, Mar 02, 2015 at 09:52:53AM -0500, Vincent St-Amour wrote:
> I recommend the short tutorial near the top of this document:
>
> http://pkg-build.racket-lang.org/doc/sxml/sxpath.html
>

Thanks for the suggested solution;

that was exactly the document I referred to

I have not been able to produce a working sxpath example for any of the axes listed at the bottom of that page. I haven't found working examples elsewhere either, from Oleg's documents to other Google hits.

A simplified example (including your suggestion):

#lang racket (require sxml)(require sxml/html)
(define doc(html->xexp
"<AAA>
<BBB>
<CCC/>
<www> www content <xxx/><www>
<zzz/>
</BBB>
<XXX>
<DDD> content in ccc
</DDD>
</XXX>
</AAA>"))

((txpath"/aaa/xxx/preceding::ccc")doc)
; '((ccc))


((txpath"/aaa/xxx/preceding::www")doc)
; '((www "\n" " " (zzz) "\n" " ")
; (www
; " www content "
; (xxx)
; (www "\n" " " (zzz) "\n" " ")))
; '()

((sxpath '(aaa xxx (sxml:preceding '(www)))) doc)
; '()

((sxpath '(aaa xxx (sxml:preceding (www)))) doc)
; '()

((sxpath'(aaa xxx (preceding:"www")))doc)
; '()

((sxpath '(aaa xxx ((sxml::preceding "www")))) doc)
; '()

((sxml:preceding ((sxpath '(aaa xxx www))))doc)
; '()
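
One possible reason the quoted forms above return '() is that nothing inside a quoted list is evaluated, so sxpath never sees a procedure there. The sketch below uses plain Racket only (no sxml calls) to show what the third step of such a path actually is. It assumes that sxpath reads a list step headed by a symbol as a sub-path on element names, which is how its documented rewrite rules read; `my-step` is a hypothetical stand-in for a real converter procedure.

```racket
#lang racket

;; The third step of the quoted path is plain data: a list whose head is
;; the symbol sxml:preceding. Nothing inside a quoted list is evaluated.
(define quoted-path '(aaa xxx (sxml:preceding '(www))))
(define quoted-step (third quoted-path))
(symbol? (first quoted-step))      ; #t
(procedure? quoted-step)           ; #f

;; With quasiquote and unquote, the step is evaluated before sxpath sees it.
;; my-step is a hypothetical stand-in for a real converter procedure;
;; the rest argument lets it accept any number of extra arguments.
(define (my-step nodeset . _extra) nodeset)
(define unquoted-path `(aaa xxx ,my-step))
(procedure? (third unquoted-path)) ; #t
```

Under that assumption, the quoted path asks for child elements literally named sxml:preceding, of which the document has none.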

Vincent St-Amour

Mar 3, 2015, 10:30:33 AM
to Sanjeev K Sharma, us...@racket-lang.org
I have not been able to get those axes working either. I've looked
briefly at the implementation of `sxml:preceding`, and it's unclear to
me how it could even work.

Since `txpath` appears to work, I'd stick with that. I'm not sure why
it's deprecated in favor of `sxpath`. Since, AFAICT, we're just
packaging that library (i.e. not developing it) and its development
appears to have stopped, I don't expect `txpath` to go anywhere.

Vincent




John Clements

Mar 3, 2015, 12:16:21 PM
to Vincent St-Amour, Sanjeev K Sharma, us...@racket-lang.org
I think sxpath is not as broken as you think. It certainly *is* desperately short of examples, and I'll add this one when I'm done.

I wouldn't call this complete success, but after much head-scratching and doc-reading, I got this to produce the expected result:

(((sxml:preceding (ntype?? 'www)) doc) ((sxpath `(aaa xxx)) doc))
; =>

; '((www "\n" "     " (zzz) "\n" "  ")
;   (www " www content " (xxx) (www "\n" "     " (zzz) "\n" "  ")))


That is, sxml:preceding needs to know about the whole doc (reasonable, given that it's back-traversing), and it acts as a filter. Is there some way to turn this around so that the sxml:preceding node appears in the sxpath chain? There might be, yes. Also, why do I have to use 'ntype??' instead of sxpath? Not sure.  I think this library might gain a great deal from being a *teensy* bit less higher-order.

In the end, though... it's true that you might as well just use txpath.

(FYI, I'm planning to use this example in the docs, please let me know if that's not appropriate.)

Best,

John Clements
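
On the question of putting the axis inside the sxpath chain: one way that might work is to unquote a procedure into a quasiquoted path. This is a sketch, not a confirmed idiom. `preceding-step` is a hypothetical helper, not part of sxml, and the rest argument is there because the number of arguments sxpath passes to a procedure step is an assumption here. The document is a smaller, well-formed stand-in parsed with ssax:xml->sxml.

```racket
#lang racket
(require sxml)

;; A small well-formed document, parsed with ssax:xml->sxml so that
;; element names keep their case.
(define doc
  (ssax:xml->sxml
   (open-input-string
    "<AAA><BBB><CCC/><WWW>www content</WWW><ZZZ/></BBB><XXX><DDD/></XXX></AAA>")
   '()))

;; Hypothetical helper (not part of sxml): wraps the curried call
;; (((sxml:preceding pred) root) context) so it can serve as an sxpath step.
;; The rest argument absorbs any extra arguments sxpath may pass to
;; procedure steps (assumption).
(define (preceding-step root pred)
  (lambda (nodeset . _extra)
    (((sxml:preceding pred) root) nodeset)))

;; The form from the message above: select the context first, then apply
;; the axis to it.
(define direct
  (((sxml:preceding (ntype?? 'WWW)) doc) ((sxpath '(AAA XXX)) doc)))

;; The same query with the axis inside the sxpath chain. Quasiquote and
;; unquote are needed so the step is a procedure, not a quoted list.
(define chained
  ((sxpath `(AAA XXX ,(preceding-step doc (ntype?? 'WWW)))) doc))

direct
chained
```

The helper still closes over the whole document, so this only moves the call into the path; it does not remove the need to hand sxml:preceding the root.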

Sanjeev K Sharma

Mar 3, 2015, 12:51:11 PM
to John Clements, us...@racket-lang.org
Dang ... thanks so much.

I don't know how much time you saved me. This was the tack I was on:


#lang racket (require sxml);(require sxml/html)
(define doc(ssax:xml->sxml(open-input-string"<AAA>
<BBB>
<CCC/>
<WWW> www content <ttt/></WWW>
<zzz/>
</BBB>
<XXX>
<DDD>
<EEE/>
<FFF>
<HHH/>
<GGG>
<JJJ>content under jjj
<QQQ><lll/>content under qqq</QQQ>
<rrr/>
</JJJ>
<JJJ/>
</GGG>
<HHH/>
</FFF>
</DDD>
</XXX>
<CCC>content in ccc
<DDD/>
</CCC>
</AAA>") '()))
(sxml:element? doc)
(sxml:node? doc)
(sxml:element? ((txpath"/AAA/XXX/preceding::CCC")doc)); #f
(sxml:element?(car((txpath"/AAA/XXX/preceding::CCC")doc)))
(sxml:element?(car((sxpath'(AAA XXX))doc)))
(sxml:node? (car((sxpath'(AAA XXX))doc)))
(sxml:node?((sxpath'(AAA XXX))doc))

(( (sxml:preceding '*)doc)'(// XXX))
(( (sxml:preceding '*any*)doc)'(// XXX))
(( (sxml:preceding '*any*)(cdr doc))'(// XXX))
(( (sxml:preceding '*)(cdr doc))'(// XXX))
(( (sxml:preceding '*any*)(cadr doc))'(// XXX))
(( (sxml:preceding '*)(cadr doc))'(// XXX))

(( (sxml:preceding '*any*) doc) '(AAA XXX))
(( (sxml:preceding '*any*) doc) '(// XXX))
(( (sxml:preceding (ntype?? '*any*)) doc) '(// XXX))
(( (sxml:preceding (ntype?? '*text*)) doc) '("content"))
(( (sxml:preceding 'ntype??) doc)'(XXX))
(( (sxml:preceding ntype??) doc)'(// XXX))
(( (sxml:preceding '*) doc) '(// XXX))
(( (sxml:preceding (ntype?? '*)) doc) '(// XXX))

(( (sxml:preceding '*any*)doc)'(AAA XXX))
(( (sxml:preceding '*)doc)'(AAA XXX))
(( (sxml:preceding '*)doc)"//XXX")
;(( (sxml:preceding '*)doc)((txpath"//XXX"))) ; ERROR
(( (sxml:preceding '@)doc)'(AAA XXX))

(( (sxml:preceding '*any*)doc)"/AAA/XXX/")
(( (sxml:preceding '*any*)doc)"//XXX")

(( (sxml:preceding '(WWW)) doc)doc)
(( (sxml:preceding '(CCC)) doc)doc)

(( (sxml:preceding '*any*) doc)'CCC)
(( (sxml:preceding '*any*) doc)'(CCC))
(( (sxml:preceding '*any*) doc)"CCC")

(( (sxml:preceding '@) doc)'(CCC))
(( (sxml:preceding '*) doc)'(CCC))
(( (sxml:preceding '*) doc)'((CCC)))
(( (sxml:preceding '*) doc)"CCC")

(( (sxml:preceding ntype??) doc)'(CCC))
(( (sxml:preceding 'ntype??) doc)'(CCC))
(( (sxml:preceding (ntype?? '*)) doc)'(CCC))

(( (sxml:preceding equal?) doc)'CCC)
(( (sxml:preceding equal?) doc)'(CCC))
(( (sxml:preceding 'equal?) doc)'(CCC))
(( (sxml:preceding equal?) doc)"CCC")
(( (sxml:preceding equal?) doc)"WWW")
(( (sxml:preceding 'equal?) doc)'(WWW))

((txpath"/AAA/XXX/preceding::CCC")doc)
((txpath"/AAA/XXX/preceding::WWW")doc)



((sxpath '(AAA XXX (sxml:preceding '(WWW)))) doc)
((sxpath '(AAA XXX (sxml:preceding (WWW)))) doc)

((sxpath'(AAA XXX (preceding:"WWW")))doc)
((sxpath '(aaa xxx ((sxml::preceding "www")))) doc)
((sxpath '(AAA XXX ((sxml::preceding "WWW")))) doc)
((sxpath '(AAA XXX ((sxml::preceding "WWW")))) doc)

((ntype?? 'www)doc)
((sxpath '(aaa xxx ((sxml:preceding (ntype?? 'www))))) doc)

;((sxml:preceding ((sxpath '(aaa xxx www))))doc)

; ((sxpath '(aaa xxx ((sxml:preceding '(bbb))))) doc)
; ((sxpath '(aaa xxx ((sxml:preceding '(ccc))))) doc)
; ((sxpath '(aaa xxx ((sxml::preceding "www")))) doc)
;
; ((sxpath '(aaa xxx (sxml:preceding (www)))) doc)
;
; ;((sxml:preceding ((sxpath '(aaa xxx www)doc)
;
;
; ;((txpath"/aaa/xxx/fff/ddd/ancestor::")doc)
; ;((sxpath '(aaa xxx fff((sxml:ancestor)))) doc)
;
;
; (sxml:element? doc)
; ((ntype?? 'www)doc)
; ((ntype-names?? '(aaa ccc xxx))doc)
;
;
;
; ;((txpath"/aaa/xxx/preceding::ccc")doc)
; ((sxpath '(aaa xxx ((sxml:preceding '(ccc))))) doc)
; ;((sxpath '(aaa xxx (sxml:preceding (ntype?? 'ccc)))) doc)
; ((ntype?? 'ccc)doc)
;
; ;((sxpath '(aaa xxx :preceding '(ccc)) doc)
; ;((sxpath '((sxml:preceding '(aaa xxx ccc)))) doc)
; ;((sxml:preceding '(ccc) doc))
;
; ((sxpath '(aaa xxx '(sxml:preceding '(www)))) doc)
; ((sxpath '(aaa xxx (sxml:preceding '(www)))) doc)
;
;
;
; ; ((txpath"/aaa/xxx/preceding::www/following::lll")doc);expect '((lll) (lll))
; ; ((txpath"/aaa/xxx/preceding::www/following")doc)
; ; ((txpath"/aaa/xxx/preceding::ccc")doc)
;
;

Sanjeev Sharma

Mar 3, 2015, 12:59:40 PM
to us...@racket-lang.org
I had several blind spots. The last was not seeing that the sxpath context is the argument that

(sxml:... [])
gets applied to.

Sanjeev K Sharma

Mar 3, 2015, 1:04:50 PM
to us...@racket-lang.org
On Tue, Mar 03, 2015 at 09:14:42AM -0800, John Clements wrote:

> Also, why do I have to use 'ntype??'
> instead of sxpath? Not sure. I think this library might gain a great deal
> from being a *teensy* bit less higher-order.
>

I started using ntype?? there after realizing sxpath can't make predicates.

Am I wrong there?
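
A small sketch of the distinction as I understand it: ntype?? returns a predicate (node to boolean), while sxpath returns a converter (node to nodeset). Since the empty list counts as true in Racket, a converter passed where a predicate is expected would presumably accept every node. `path->pred` is a hypothetical adapter, not part of sxml.

```racket
#lang racket
(require sxml)

;; ntype?? builds a predicate: node -> boolean.
(define ccc? (ntype?? 'CCC))
(ccc? '(CCC))                       ; #t
(ccc? '(DDD))                       ; #f

;; sxpath builds a converter: node -> nodeset (a list of nodes).
(define select-ccc (sxpath '(CCC)))
(select-ccc '(AAA (CCC) (DDD)))     ; '((CCC))

;; An empty nodeset is still a true value in Racket, so a converter is
;; not usable as a predicate without adapting it.
(if '() 'truthy 'falsy)             ; 'truthy

;; Hypothetical adapter: a node passes when the path selects something.
(define (path->pred path)
  (let ([select (sxpath path)])
    (lambda (node) (pair? (select node)))))

((path->pred '(CCC)) '(AAA (CCC)))  ; #t
((path->pred '(CCC)) '(AAA (DDD)))  ; #f
```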