
How to set string to NULL (Hex value 00)


User Name

Mar 30, 2004, 11:52:54 PM
Hi,
I tried to set a string to NULL with the following statements, but I
always get c080. Does anybody know how to do it? TIA
set str [binary format h* 0]
puts $str

When I run the above statements and redirect the output to a file, the file
contains the bytes c080.

Gerald Lester

Mar 31, 2004, 12:18:22 AM
User Name wrote:

Don't use redirection of stdout (due to encodings); instead:
1) open a file
2) fconfigure to binary
3) puts to it
4) close the file

BTW, you could also just do:
set str \x00
or
set str \0
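
The four steps above might look like the following. This is only a minimal sketch: the file name nul.bin is a hypothetical choice, and the read-back at the end is added purely to check the result.

```tcl
# Sketch: write one NUL byte to a file with no encoding conversion.
# "nul.bin" is a hypothetical file name chosen for this example.
set str [binary format h* 0]            ;# a single byte with value 0x00

set fd [open nul.bin w]                 ;# 1) open a file
fconfigure $fd -translation binary      ;# 2) binary mode: no encoding or EOL translation
puts -nonewline $fd $str                ;# 3) write the byte, without a trailing newline
close $fd                               ;# 4) close the file

# Read the file back in binary mode and show its contents as hex digits.
set fd [open nul.bin r]
fconfigure $fd -translation binary
binary scan [read $fd] H* hex
close $fd
puts $hex
```

Note the -nonewline: a plain puts would append a newline byte after the NUL.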


--
+--------------------------------+---------------------------------------+
| Gerald W. Lester | "The man who fights for his ideals is |
| Gerald...@cox.net | the man who is alive." -- Cervantes |
+--------------------------------+---------------------------------------+

Gerald Lester

Mar 31, 2004, 12:21:27 AM
User Name wrote:

1) Please read http://wiki.tcl.tk/endekalogue

2) If after reading (1), you do not understand backslash substitution, go to
step (1)

Jeff Hobbs

Mar 31, 2004, 12:21:07 AM
User Name wrote:

If you are dealing with binary, make sure to do:
fconfigure $fileid -translation binary

--
Jeff Hobbs, The Tcl Guy
http://www.ActiveState.com/, a division of Sophos

Darren New

Mar 31, 2004, 12:32:03 AM
Gerald Lester wrote:
>> always get c080. Does anybody know how to do it? TIA
> 2) If after reading (1), you do not understand backslash substitution,

This looks more to me like a unicode encoding of ASCII NUL. It doesn't
look like it has anything to do with backslashes, but with encodings.


--
Darren New, San Diego CA USA (PST)
I am in geosynchronous orbit, supported by
a quantum photon exchange drive....

Michael Schlenker

Mar 31, 2004, 4:12:36 AM
User Name wrote:

Basically it is a bug in Tcl's encoding. c080 is used as the Tcl
internal code for NUL, so the usual C string functions can be used.
However, the use of non-shortest encodings was outlawed by the Unicode
Consortium a while ago
(mainly due to security problems of various vendors; one very prominent
case was the really really broken code in M$ IIS getting it horribly
wrong due to stupidity (checking access and then normalizing the path,
instead of normalizing and then checking)).

So your above code probably should work, as c080 was a valid NULL in
some UTF-8, but nowadays this is broken.

Michael

Don Porter

Mar 31, 2004, 9:49:02 AM
>> set str [binary format h* 0]
>> puts $str
>> When I ran the above statements, and redirect to a file. It has c080
>> values

What encoding did you want the file in? What is your system encoding?
If the answer to both is utf-8, then...

Michael Schlenker wrote:
> Basically it is a bug in Tcl's encoding. c080 is used as the Tcl
> internal code for NUL, so the usual C string functions can be used.

Otherwise, if you want to write binary data to stdout without any
encoding changes, be sure to configure for that:

fconfigure stdout -encoding binary -translation binary
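
As a rough sketch of that approach (assuming Tcl 8.x; the four sample bytes are borrowed from later in this thread, not from the original question): according to the fconfigure documentation, -translation binary also sets the channel encoding to binary, so the single option is used here.

```tcl
# Sketch: send raw bytes to stdout with no encoding or end-of-line conversion.
fconfigure stdout -translation binary

# Four sample bytes: "G", "P", 0x00, 0x03.
set bytes [binary format a2BB8 GP 0 00000011]

puts -nonewline stdout $bytes
flush stdout
```

Run as, for example, tclsh script.tcl > out.bin, the redirected file should then hold the bytes unchanged.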

--
| Don Porter Mathematical and Computational Sciences Division |
| donald...@nist.gov Information Technology Laboratory |
| http://math.nist.gov/~DPorter/ NIST |
|______________________________________________________________________|

Donal K. Fellows

Apr 1, 2004, 7:43:49 AM
Michael Schlenker wrote:
> So your above code probably should work, as c080 was a valid NULL in
> some UTF-8, but nowadays this is broken.

There was some discussion of this topic before Xmas between some UNICODE
people and some of the Core Team. We couldn't reach agreement over what
the right way forward was; their preferred solutions (which varied from
making the app exit immediately to substituting such sequences with the
UNICODE "unknown character sequence" character) would have broken far
too much existing code and data for our taste, and our preferred
solutions (which can be summed up largely by the IETF dictum "Be liberal
in what you accept and strict in what you generate") had them throwing
up their arms in horror. We did not see eye-to-eye... :^(

Donal.

Torsten Berg

Jan 28, 2023, 6:34:41 PM
Wow, this is an old discussion ... but this is the problem I seem to have with Tcl 8.6.12 ...

I need to build a BLOB for a field in an SQLite table. It should start with these four bytes:

byte[2] magic = 0x4750;
byte version;
byte flags;

So, the first one is ASCII "GP", the second one should be a zero as an "8-bit unsigned integer" and the third one is a byte with flags that is "00000011" (only the two right-most bits are set) in my case. What I do is

binary format a2BB8 GP 0 00000011

Looking at the hex representation of the BLOB, it looks like this: "47 50 c0 80 03"
I can see the correct first two bytes (the "GP") and the last byte (the flags) but the NULL comes out as "c080".

Even if I do

set BLOB \x47\x50\x00\x03

I get the same output.

So, how do I get the BLOB to look like this (hex representation): "47 50 00 03"

Christian Gollwitzer

Jan 29, 2023, 5:58:15 AM
On 29.01.23 at 00:34, Torsten Berg wrote:
> Wow, this is an old discussion ... but this is the problem I seem to have with Tcl 8.6.12 ...
>
> I need to build a BLOB for a field in an SQLite table. It should start with these four bytes:
>
> byte[2] magic = 0x4750;
> byte version;
> byte flags;
>
> So, the first one is ASCII "GP", the second one should be a zero as an "8-bit unsigned integer" and the third one is a byte with flags that is "00000011" (only the two right-most bits are set) in my case. What I do is
>
> binary format a2BB8 GP 0 00000011
>
> Looking at the hex representation of the BLOB, it looks like this: "47 50 c0 80 03"
> I can see the correct first two bytes (the "GP") and the last byte (the flags) but the NULL comes out as "c080".

This means that the string has been encoded in UTF-8. The problem is not
with "binary format"; it is the transition from the string to the
database. For example, if you were writing the content to a file, you
would need to do "fconfigure $fd -encoding binary -translation binary",
and what you see is akin to "fconfigure $fd -encoding utf8".

Hence, you need to check whether the database interface layer has an
option to pass binary contents.

Christian

Rich

Jan 29, 2023, 8:54:34 AM
Torsten Berg <be...@typoscriptics.de> wrote:
> Wow, this is an old discussion ... but this is the problem I seem to have with Tcl 8.6.12 ...
>
> I need to build a BLOB for a field in an SQLite table. It should start with these four bytes:
>
> byte[2] magic = 0x4750;
> byte version;
> byte flags;
>
> So, the first one is ASCII "GP", the second one should be a zero as
> an "8-bit unsigned integer" and the third one is a byte with flags
> that is "00000011" (only the two right-most bits are set) in my case.
> What I do is
>
> binary format a2BB8 GP 0 00000011
>
> Looking at the hex representation of the BLOB, it looks like this:
> "47 50 c0 80 03"
>
> I can see the correct first two bytes (the "GP") and the last byte
> (the flags) but the NULL comes out as "c080".
>
> Even if I do
>
> set BLOB \x47\x50\x00\x03
>
> I get the same output.
>
> So, how do I get the BLOB to look like this (hex representation): "47 50 00 03"

You have not stated how you are looking at the "hex representation".
If you ask for the 'hex' in the normal Tcl way, it appears to work
properly:

$ rlwrap tclsh
% set blob [binary format a2BB8 GP 0 00000011]
GP
% binary scan $blob H* hex
1
% set hex
47500003
%

And as Christian pointed out, the hex you quote is the UTF-8 encoding
of the binary blob. So it looks like you are gaining a UTF-8 encoding
of the blob somewhere.

Torsten Berg

Jan 29, 2023, 2:01:24 PM
Hi,

and thanks for your thoughts! They made me read the SQLite documentation again carefully for the Tcl binding and the 'eval' command. I found this sentence:

"If the $bigstring variable has both a string and a "bytearray" representation, then TCL inserts the value as a string. If it has only a "bytearray" representation, then the value is inserted as a BLOB. To force a value to be inserted as a BLOB even if it also has a text representation, use a "@" character to in place of the "$"."

That did the trick!
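
For anyone finding this later, here is a minimal sketch of the "@" form. The in-memory database, the table name t and the column name data are assumptions made up for this example; only the "@" syntax comes from the SQLite documentation quoted above.

```tcl
package require sqlite3

# In-memory database and table "t" are assumptions for this sketch.
sqlite3 db :memory:
db eval {CREATE TABLE t(data BLOB)}

set blob [binary format a2BB8 GP 0 00000011]

# @blob (instead of $blob) asks the binding to insert the value as a BLOB.
db eval {INSERT INTO t(data) VALUES(@blob)}

# hex() and typeof() are SQLite built-ins, used here to inspect the stored value.
set result [db eval {SELECT hex(data), typeof(data) FROM t}]
puts $result
db close
```

With the "@" prefix, the stored value should show up as a four-byte blob, 47500003 in hex, rather than as text containing c080.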