<ROOT><A><B1>test</B1><B2>"test qoute"</B2></A></ROOT>
(string->xexpr "<ROOT><A><B1>test</B1><B2>\"test qoute\"</B2></A></ROOT>")
'(ROOT () (A () (B1 () "test") (B2 () "\"" "test qoute" "\"")))
On Nov 21, 2019, at 9:51 AM, Kira <peacekee...@gmail.com> wrote:
I tested sxml, and it produces the right (for me) output.
I feel that this is the wrong behavior, because these " symbols are a direct part of one whole element's content and should be read as "\"test qoute\"".
Can someone please explain the reasoning behind this behavior, and can we change it? Perhaps it is important for some other Racket libs?
It is just totally counterintuitive to me, and it creates huge problems with even simple XML parsing. (I have basically been battling the XML lib all day already to do the simplest tasks.)
--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/df1e76f4-6459-4c12-b85e-f36c2d4bd13c%40googlegroups.com.
That hypothetical parser assembling the parsed representation *could*
then concatenate sequences of 2 or more contiguous strings representing
CDATA, but that could be expensive, and might not be needed. Consider
how large some XML and HTML documents can be, and how little information
out of them is sometimes needed (e.g., price scraper) --
performance-wise, the concatenation might be best left up to whatever
uses that parsed representation.
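If the consumer does want each run of contiguous strings joined, a minimal sketch of doing that after parsing might look like the following. The names merge-strings and merge-body are hypothetical helpers, not part of the xml library, and the sketch assumes every element carries an explicit attribute list, as in the output quoted in this thread.

```racket
#lang racket

;; merge-strings : XExpr -> XExpr
;; Hypothetical helper (not part of the xml library): joins each run of
;; adjacent strings in an element body into a single string.
;; Assumes every element has the shape (tag (attr ...) body ...).
(define (merge-strings x)
  (match x
    [(list (? symbol? tag) (? list? attrs) body ...)
     (list* tag attrs (merge-body (map merge-strings body)))]
    [_ x]))

;; merge-body : (Listof XExpr) -> (Listof XExpr)
;; Collapses neighbouring strings; everything else is kept in order.
(define (merge-body items)
  (match items
    [(list (? string? a) (? string? b) rest ...)
     (merge-body (cons (string-append a b) rest))]
    [(list a rest ...) (cons a (merge-body rest))]
    ['() '()]))

(merge-strings
 '(ROOT () (A () (B1 () "test") (B2 () "\"" "test qoute" "\""))))
;; => '(ROOT () (A () (B1 () "test") (B2 () "\"test qoute\"")))
```

This walks the whole tree, so on a large document it pays the concatenation cost everywhere, which is the trade-off described above; applying it only to the elements you actually need keeps that cost small.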
<ROOT>
<A>
<B1>test</B1>
<B2>"test qoute"</B2>
</A>
<A>
<B1>test2</B1>
<B2>"test "qoute2""</B2>
</A>
</ROOT>
(require xml xml/path) ; string->xexpr and se-path*/list
(define rawxml "<ROOT><A><B1>test</B1><B2>\"test qoute\"</B2></A><A><B1>test2</B1><B2>\"test \"qoute2\"\"</B2></A></ROOT>")
(define xexpr (string->xexpr rawxml))
(se-path*/list '(A) xexpr)
'((B1 () "test") (B2 () "\"" "test qoute" "\"") (B1 () "test2") (B2 () "\"" "test " "\"" "qoute2" "\"" "\""))
'("\"" "test qoute" "\"" "\"" "test " "\"" "qoute2" "\"" "\"")
(match xexpr
  [(list 'ROOT '()
         (list 'A '()
               (list 'B1 '() b1)
               (list 'B2 '() b2 __1)) __1)
   (list b1 b2)]
  [_ 'empty])
I would probably do this if I were you:
#lang racket
(require xml)
(struct data (b1 b2) #:transparent)
;; extract :: XExpr -> (Listof data?)
(define (extract xexpr)
  (match-define `(ROOT () ,xs ...) xexpr)
  (for/list ([e xs] #:when (list? e))
    (match-define `(A () ,b1 ,b2) e)
    (data b1 b2)))
(define raw-xml #<<EOF
<ROOT>
<A><B1>test</B1><B2>"test qoute"</B2></A>
<A><B1>test2</B1><B2>"test "qoute2""</B2></A>
</ROOT>
EOF
)
(extract (string->xexpr raw-xml))
Alternatively, you can use this hacky way:
(require xml/path) ; provides se-path*/list
;; extract :: XExpr -> (Listof data?)
(define (extract xexpr)
  (for/list ([slice (in-slice 2 (se-path*/list '(A) xexpr))])
    (apply data slice)))
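Both versions above store whole B1 and B2 elements in the struct. If what you want is the text of each element as one string (for example "\"test qoute\""), a possible variation is sketched below. The helpers text-of and extract-text are hypothetical names, and the input is written as a literal x-expression copied from the output shown earlier in the thread, so the sketch does not depend on how the reader splits strings.

```racket
#lang racket

(struct data (b1 b2) #:transparent)

;; text-of : XExpr -> String
;; Hypothetical helper: concatenates the string children of one element,
;; ignoring any non-string children.
;; Assumes the element has the shape (tag (attr ...) body ...).
(define (text-of elem)
  (match-define (list _tag _attrs body ...) elem)
  (string-append* (filter string? body)))

;; extract-text : XExpr -> (Listof data?)
;; Same traversal as extract, but stores joined text instead of elements.
(define (extract-text xexpr)
  (match-define `(ROOT () ,xs ...) xexpr)
  (for/list ([e xs] #:when (list? e))
    (match-define `(A () ,b1 ,b2) e)
    (data (text-of b1) (text-of b2))))

(extract-text
 '(ROOT ()
   (A () (B1 () "test") (B2 () "\"" "test qoute" "\""))
   (A () (B1 () "test2") (B2 () "\"" "test " "\"" "qoute2" "\"" "\""))))
;; => (list (data "test" "\"test qoute\"")
;;          (data "test2" "\"test \"qoute2\"\""))
```

Joining only inside text-of means the concatenation happens just for the elements you extract, not for the whole document.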