
Variable : Can't have more than 600 lines? (Parsing XML with tDOM)


atomi...@gmail.com

Oct 12, 2017, 6:12:23 PM
I'm running into a bit of a weird problem. It looks like I cannot assign a variable with more than 600 lines (regardless of line length). It seems to bail part way through the assignment and start interpreting the excess lines as additional commands.

Stepping back a bit:
The reason I'm doing this is I'm using tDOM to parse a very large XML file. So following some of the examples, I first set a variable to be the contents of the XML file. e.g.

set XML "
<order number='1'>
<customer>John Doe</customer>
<phone>555-4321</phone>
<email>jd...@example.com</email>
<website/>
<parts>
<widget sku='XYZ123' />
<widget sku='ABC789' />
</parts>
</order>
"

This is the simple example on the Tcl.Tk wiki and when I replace it with my own simple example XML with a few items, it works fine. When I use the full blown XML which is hundreds of lines long, it fails. I finally figured out the issue was the 600 line limit, so I replaced all the LF and CR in the XML file so it is one giant line and everything worked fine.

This seems like a dumb way to do things and it isn't going to work when I go for the full-blown production XML which spans I think 100,000 lines or more.

Is there a better way to tackle this problem?
I'm using ActiveTcl 8.5.11 on a Windows 10 machine.

Robert Heller

Oct 12, 2017, 7:51:38 PM
At Thu, 12 Oct 2017 15:12:21 -0700 (PDT) atomi...@gmail.com wrote:

>
> I'm running into a bit of a weird problem. It looks like I cannot assign a
> variable with more than 600 lines (regardless of line length). It seems to
> bail part way through the assignment and start interpreting the excess lines
> as additional commands.
>
> Stepping back a bit:
> The reason I'm doing this is I'm using tDOM to parse a very large XML file.
> So following some of the examples, I first set a variable to be the contents
> of the XML file. e.g.
>
> set XML "
> <order number='1'>
> <customer>John Doe</customer>
> <phone>555-4321</phone>
> <email>jd...@example.com</email>
> <website/>
> <parts>
> <widget sku='XYZ123' />
> <widget sku='ABC789' />
> </parts>
> </order>
> "

Are you actually using double quotes? Did you try it with braces instead?

I'm wondering if a double quote actually occurs somewhere in the XML.
That would end the quoted string early and seriously break things.

Double quote is a legal XML character. Even though your example shows
single quotes (also legal), there might be a double quote somewhere in
there that you missed.
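
To make that failure mode concrete, here is a minimal sketch. The sample strings and variable names are made up for illustration and are not taken from the poster's file. A double quote inside a double-quoted word ends the word early, while braces keep the text literal:

```tcl
# Hypothetical one-line script: the attribute uses double quotes, so the
# word that starts with the first " ends right after "number=".
set script {set XML "<order number="1"/>"}

# Evaluating it raises an error instead of assigning the variable.
set failed [catch {eval $script} message]
puts "double quotes: failed=$failed ($message)"

# Inside braces nothing is substituted and quotes are ordinary characters.
set XML {<order number="1"/>}
puts "braces: $XML"
```

In a longer file the stray quote may be followed by whitespace or a newline, in which case Tcl can go on to treat the following text as further arguments or commands, which would match the symptom described. Braces only help as long as any braces inside the XML are balanced.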

>
> This is the simple example on the Tcl.Tk wiki and when I replace it with my
> own simple example XML with a few items, it works fine. When I use the full
> blown XML which is hundreds of lines long, it fails. I finally figured out
> the issue was the 600 line limit, so I replaced all the LF and CR in the
> XML file so it is one giant line and everything worked fine.
>
> This seems like a dumb way to do things and it isn't going to work when I go
> for the full-blown production XML which spans I think 100,000 lines or more.
>
> Is there a better way to tackle this problem?
> I'm using ActiveTcl 8.5.11 on Windows 10 machine
>

--
Robert Heller -- 978-544-6933
Deepwoods Software -- Custom Software Services
http://www.deepsoft.com/ -- Linux Administration Services
hel...@deepsoft.com -- Webhosting Services

Gerald Lester

Oct 12, 2017, 8:57:43 PM
I've successfully parsed documents that were way over 600 lines with tDOM.

Have you run your XML document through a validator?

--
+----------------------------------------------------------------------+
| Gerald W. Lester, President, KNG Consulting LLC |
| Email: Gerald...@kng-consulting.net |
+----------------------------------------------------------------------+

Christian Gollwitzer

Oct 13, 2017, 2:46:13 AM
On 13.10.17 at 00:12, atomi...@gmail.com wrote:
> The reason I'm doing this is I'm using tDOM to parse a very large XML file. So following some of the examples, I first set a variable to be the contents of the XML file. e.g.
>
> set XML "
> <order number='1'>
> <customer>John Doe</customer>
> <phone>555-4321</phone>
> <email>jd...@example.com</email>
> <website/>
> <parts>
> <widget sku='XYZ123' />
> <widget sku='ABC789' />
> </parts>
> </order>
> "

Don't do that. You already have the XML as a separate file, so instead of
pasting it into a Tcl script, use Tcl to read the XML file. E.g. with
fileutil from tcllib:

package require fileutil
set XML [fileutil::cat myfile.xml]

# now do your tDOM parsing here
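
If tcllib is not available, a plain-Tcl sketch along the same lines could look like this. The proc name slurpFile and the file name demo.xml are made up for illustration, and the demo writes its own small file so that it runs as-is:

```tcl
# Hypothetical helper: return the whole content of a file as one string.
proc slurpFile {path} {
    set fd [open $path r]
    set data [read $fd]
    close $fd
    return $data
}

# Create a small demo file in the current directory so the example is
# self-contained. The content is invented for illustration.
set path [file join [pwd] demo.xml]
set fd [open $path w]
puts $fd {<order number="1">}
puts $fd {  <customer>John Doe</customer>}
puts $fd {</order>}
close $fd

# The double quotes in the file are just data here; no Tcl quoting applies.
set XML [slurpFile $path]
puts "read [string length $XML] characters"

file delete $path
```

Because the text never passes through the Tcl parser as source code, quotes, brackets and dollar signs in the XML cannot break the assignment. Note that this plain read uses the channel's default encoding.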

> This is the simple example on the Tcl.Tk wiki and when I replace it with my own simple example XML with a few items, it works fine.

Well, that's the thing with simple examples: the authors liked that you
don't need to create a separate XML file to try their example out, but it
was never meant that you store the file verbatim in your code.

> I'm using ActiveTcl 8.5.11 on Windows 10 machine
>

Any reason you can't update to 8.6? 8.5 is very old now.

Christian

Rolf Ade

Oct 13, 2017, 7:06:26 AM
atomi...@gmail.com writes:

> I'm running into a bit of a weird problem. It looks like I cannot
> assign a variable with more than 600 lines (regardless of line
> length). It seems to bail part way through the assignment and start
> interpreting the excess lines as additional commands.

There is no such limit in Tcl. As others already pointed out, this is
probably a quoting error in the XML included literally in your Tcl
code.
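
A quick way to check this for yourself, as a sketch (the element names and the count of 100000 are arbitrary choices for illustration): build a string with far more than 600 lines at run time and count the lines it holds.

```tcl
# Build 100000 lines of made-up XML, plus an opening and a closing line.
set lines [list "<root>"]
for {set i 0} {$i < 100000} {incr i} {
    lappend lines "<item n='$i'/>"
}
lappend lines "</root>"

# Join everything into one multi-line string held by a single variable.
set XML [join $lines \n]

# Count the lines stored in the variable.
set count [llength [split $XML \n]]
puts "variable holds $count lines"
```

The variable ends up holding 100002 lines, so the number of lines in a value is not what makes the assignment fail.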
The simplest way will be

set doc [dom parse [::tDOM::xmlReadFile $path_to_your_XML_file]]

If your XML file is bigger than 2 GByte, this will fail. (Because there
is a hard limit: Tcl variables can't hold strings bigger than that.)

If this is the case for you then use this pattern:

set fd [::tDOM::xmlOpenFile $path_to_your_XML_file]
set doc [dom parse -channel $fd]
close $fd

With this (and on 64-bit OS / with 64-bit tcl) you're able to parse XML
files of any size, as long as your box has enough memory. The users of
an application of mine do this more or less every day.

The helper procs ::tDOM::xmlReadFile and ::tDOM::xmlOpenFile are more
robust to read XML data out of a file because they respect the encoding
of the XML file.
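
Put together, the two patterns might be tried out like this. It is only a sketch: the demo file name and its content are made up, and it assumes the tdom package is installed.

```tcl
package require tdom

# Write a small demo file (invented content) so the sketch runs as-is.
set path [file join [pwd] demo-order.xml]
set fd [open $path w]
fconfigure $fd -encoding utf-8
puts $fd {<?xml version="1.0" encoding="UTF-8"?>}
puts $fd {<order number="1"><parts>}
puts $fd {<widget sku="XYZ123"/><widget sku="ABC789"/>}
puts $fd {</parts></order>}
close $fd

# Variant 1: read the whole file into a string, then parse that string.
set doc [dom parse [::tDOM::xmlReadFile $path]]
set root [$doc documentElement]
set skus {}
foreach node [$root selectNodes {parts/widget}] {
    lappend skus [$node getAttribute sku]
}
puts "from string: $skus"
$doc delete

# Variant 2: hand tDOM an open channel and let it read the data itself.
set fd [::tDOM::xmlOpenFile $path]
set doc [dom parse -channel $fd]
close $fd
set rootName [[$doc documentElement] nodeName]
puts "from channel: $rootName"
$doc delete

file delete $path
```

Each document is released with `$doc delete` once it is no longer needed, which matters more as the files get larger.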

> I'm using ActiveTcl 8.5.11 on Windows 10 machine

Please consider updating to 8.6.7 and make sure you're using at least
tDOM 0.9.0

rolf