
Binary byte in Tcl gets prefixed by byte 0xC2 when accessed from a C++ function via a SWIG wrapper


jim.ch...@gmail.com

Jul 11, 2007, 2:59:54 AM
Hello,

I have a problem converting binary data into a character buffer
that is read by a C++ function through a wrapper generated by SWIG.

I create a variable 'b' in Tcl and store a binary value in it as a hex
string.
Then I run "% set t [binary format H* $b]" to convert it into actual
binary.
Then I pass it as an argument to the SWIG wrapper for my C++ function,
which just accepts a char* pointer and a number of bytes, and prints
the value of each byte.

In my C++ function I get 2 bytes instead of 1:
my original byte is prefixed by the byte 0xC2.

This also happens when I pass a binary string: every byte is
prefixed by 0xC2.

Can someone tell me where the problem is and how to solve it?
From Tcl, it looks fine to me.
Is the SWIG wrapper doing some internal conversion?


########### See the script below ############
% load example.dll

% MyClass myObj
_e04c0a01_p_MyClass
% set b b0
b0
% set t [binary format H* $b]
°
% myObj WRAP_TestFn $t 1
[3268] WRAP_TestFn pData: 0xc2

% myObj WRAP_TestFn $t 2
[3268] WRAP_TestFn pData: 0xc2 0xb0

%
########### End ############

########### C++ code ############

void WRAP_TestFn(char* pData, unsigned int uiBufSize);

########### End ############

bill...@alum.mit.edu

Jul 11, 2007, 5:44:10 AM

0xC2 0xB0 is the UTF-8 encoding of the character U+00B0 "degree sign".
You are creating a single character with code point 0xB0. When this
character is converted to a UTF-8 string, it turns into two bytes,
because every code point from U+0080 to U+07FF takes two bytes in UTF-8.
Are you trying to pass text to your C++ function or raw binary data?
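
The two bytes can be reproduced by hand. The following is a minimal sketch of the UTF-8 encoding rule for code points up to U+FFFF; `encodeUtf8` is a hypothetical helper name chosen for illustration and is not part of Tcl or SWIG.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a Tcl or SWIG function): encode one Unicode
// code point as UTF-8. Only code points up to U+FFFF are handled, which
// is enough to illustrate the one-byte-becomes-two effect.
std::string encodeUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 0xxxxxxx: ASCII passes through as a single byte.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 110xxxxx 10xxxxxx: two bytes for U+0080..U+07FF.
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 1110xxxx 10xxxxxx 10xxxxxx: three bytes for U+0800..U+FFFF.
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encodeUtf8(0xB0)` yields the bytes 0xC2 0xB0, matching the output in the original post, while any value below 0x80 stays a single byte.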

David Gravereaux

Jul 11, 2007, 10:28:58 AM
To add to what Bill said, see the docs for Tcl_UtfToExternalDString()
http://www.tcl.tk/man/tcl8.4/TclLib/Encoding.htm

Or what might be more pertinent in your case could be Tcl_GetByteArrayFromObj()
http://www.tcl.tk/man/tcl8.4/TclLib/ByteArrObj.htm

Instead of passing a char* to your C++ code, you'd pass the object pointer
(Tcl_Obj*) and grab the byte array from that object.

--
How many boards would the Mongols hoard if the Mongol hordes got bored?
-- Calvin


pal...@yahoo.com

Jul 11, 2007, 7:15:54 PM

The built-in SWIG typemaps for the char* type treat the Tcl object as
a UTF-8 string, hence the behaviour you see.

To indicate that the Tcl object contains binary data, you need a typemap,
something like the following (not tested):

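%typemap(in) (char *BINDATA, unsigned int BINLEN) {
    /* Tcl_GetByteArrayFromObj returns unsigned char* and takes an int* for the length */
    int len = 0;
    $1 = (char *) Tcl_GetByteArrayFromObj($input, &len);
    $2 = (unsigned int) len;
}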

Then declare your function as
void WRAP_TestFn(char* BINDATA, unsigned int BINLEN);

Note that the parameter names MUST MATCH the typemap above, otherwise
SWIG will fall back to the standard char* (UTF-8) treatment.

If you have multiple functions that take this kind of parameter pair, you
can of course reuse the same typemap definition for them.

/Ashok
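
To check whether a typemap like the one above delivers raw bytes, the C++ side can dump the buffer it receives. The body of WRAP_TestFn is not shown in the thread, so the following is only a sketch of such a dump; `hexDump` is a hypothetical helper name, and the output format is modelled on the log lines in the original post.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original post): format each byte of
// a buffer as 0xNN, separated by single spaces.
std::string hexDump(const char* data, std::size_t size) {
    std::string out;
    char buf[8];
    for (std::size_t i = 0; i < size; ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended
        // on platforms where plain char is signed.
        const unsigned value = static_cast<unsigned char>(data[i]);
        std::snprintf(buf, sizeof buf, "0x%02x", value);
        if (i != 0) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

If the byte array arrives intact, a one-byte buffer built from "b0" should dump as a single 0xb0 rather than 0xc2 0xb0. Note also that a multi-argument typemap consumes a single Tcl argument for both C++ parameters, so the call from Tcl would presumably become `myObj WRAP_TestFn $t`, with the length taken from the byte array.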
