Can I obtain byte offsets from the streaming XML reader?

Adam Wojnakowski

Jul 4, 2022, 7:24:48 AM
to Woodstox User Mailing List
I am looking for a way to obtain the offsets of start and end element tags, so that I can later use them to efficiently read a portion of the XML data from the same file. I know that character offsets are available, but I can't find an easy way to get byte offsets into the underlying InputStream.
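
For context, the intended use of such offsets can be sketched with the JDK alone. This is only an illustration of the goal, assuming the byte offsets of an element are already known from somewhere; the class name `ByteRangeReader` and the method `readRange` are hypothetical and are not part of Woodstox or any other library.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads a known byte range out of a file.
final class ByteRangeReader {

    /** Returns the bytes in [start, end) of the file, seeking directly to start. */
    public static byte[] readRange(Path file, long start, long end) throws IOException {
        if (start < 0 || end < start || end - start > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("bad range: " + start + ".." + end);
        }
        byte[] buf = new byte[(int) (end - start)];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(start);     // jump to the start offset without reading the prefix
            raf.readFully(buf);  // throws EOFException if the range runs past the end
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".xml");
        try {
            Files.write(file, "<a><b>x</b></a>".getBytes(StandardCharsets.UTF_8));
            // Offsets 3 and 11 are assumed to be known already (e.g. from an index).
            byte[] slice = readRange(file, 3, 11);
            System.out.println(new String(slice, StandardCharsets.UTF_8)); // prints <b>x</b>
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The sketch only works if the offsets count bytes in the file, which is why character offsets alone are not enough here.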

Both getStartingByteOffset() and getEndingByteOffset() of LocationInfo return -1, even when I construct the reader from an InputStream.

Is there any support for tracking byte offsets of the current tag in the streaming XML reader?

Thanks in advance,
Adam

Tatu Saloranta

Jul 5, 2022, 2:06:14 PM
to Adam Wojnakowski, Woodstox User Mailing List
No, unfortunately this is not possible with Woodstox: all input will
be read using a Reader, so conversion from bytes to chars occurs
before any decoding/parsing. So only character offsets are available.
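
A minimal JDK-only sketch of why a character offset cannot simply be reused as a byte offset: once the input contains a multi-byte character, the two counts diverge. This is an illustration, not Woodstox code; the class `OffsetMismatch` and the method `utf8ByteOffset` are hypothetical names, and the input is assumed to be UTF-8 without a byte order mark.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of char offsets vs. UTF-8 byte offsets.
final class OffsetMismatch {

    /**
     * Number of UTF-8 bytes that precede the given char offset.
     * Assumes charOffset does not fall between the two halves of a surrogate pair.
     */
    public static int utf8ByteOffset(String text, int charOffset) {
        return text.substring(0, charOffset).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String xml = "<a>\u00e9</a><b/>";       // \u00e9 (e with acute) takes 2 bytes in UTF-8
        int charOffset = xml.indexOf("<b/>");   // position counted in Java chars
        int byteOffset = utf8ByteOffset(xml, charOffset);
        System.out.println("char offset: " + charOffset); // prints char offset: 8
        System.out.println("byte offset: " + byteOffset); // prints byte offset: 9
    }
}
```

Converting this way means re-encoding everything before the offset, so it is a workaround for small inputs rather than an efficient substitute for real byte offsets.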

Aalto parser (https://github.com/FasterXML/aalto-xml/) would provide
byte offsets, if it worked for your use case -- the main limitation
being that it does not support DTDs.

-+ Tatu +-

Adam Wojnakowski

Jul 5, 2022, 5:25:02 PM
to Woodstox User Mailing List
We may be fine without DTD. I will have a look at Aalto. Thanks a lot!

Tatu Saloranta

Jul 6, 2022, 11:42:45 AM
to Adam Wojnakowski, Woodstox User Mailing List
On Tue, Jul 5, 2022 at 2:25 PM Adam Wojnakowski <ada...@gmail.com> wrote:
>
> We may be fine without DTD. I will have a look at Aalto. Thanks a lot!

No problem!

-+ Tatu +-