how to communicate unsigned char* to Java

Ananya

May 20, 2007, 8:30:00 PM

I am calling a Java native executable from a C++ program.

I would like to communicate the red, green, and blue values of an image,
which are unsigned chars in my C++ program (varying between 0 and 255) to my
Java program.

Well I first convert my unsigned char* to char*, but chars in C++ are 1 byte
and chars in Java are 2 bytes.

What's a fast and correct way to make this communication?

Thanks for looking at this!

Ananya

May 21, 2007, 11:54:01 AM

I convert the unsigned char* to char* in C++ using reinterpret_cast<char*>.

But when I receive the char ch in Java, neither
    int i = ((int)(ch)>>8) & 255    (reversing the order of bytes)
nor
    int i = (int)(ch) & 255
seems to give the same value as the value of the unsigned char in C++.

What's a correct and fast way to make this communication?

Thanks for your time!

Alexander Nickolov

May 21, 2007, 1:44:07 PM

I'm pretty certain you must _not_ use a string on the Java side.
Java strings are UTF16 encoded, whereas you have binary
data not suitable for a string. On the C++ side don't use any
string functions either - char* is not reserved for strings only -
it's used for binary data as well.

--
=====================================
Alexander Nickolov
Microsoft MVP [VC], MCSD
email: agnic...@mvps.org
MVP VC FAQ: http://vcfaq.mvps.org
=====================================

Ananya

May 21, 2007, 10:47:00 PM

Can't I just use char* in C++ and String[] string in Java and
    byte[] bytes = string.getBytes()
which gets the lower bytes of the two bytes?

Ananya

May 22, 2007, 5:43:02 AM

Actually, I did find out that getBytes does not get the lower byte, and even
worse, the Java characters don't seem to allow arbitrary values. So it looks
like I have to figure out how to read my file in Java as bytes.
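
A minimal Java sketch of why getBytes() does not simply hand back "the lower
byte". It assumes the text is encoded as UTF-8 (getBytes() without an
argument uses the platform default charset instead), and the class name
GetBytesDemo is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // Encodes one Java char as UTF-8 and returns the bytes that come out.
    static byte[] encodeUtf8(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] low = encodeUtf8((char) 110);   // ASCII range: a single byte
        byte[] high = encodeUtf8((char) 200);  // above 127: two bytes in UTF-8
        System.out.println(low.length + " byte(s) for 110, "
                + high.length + " byte(s) for 200");
    }
}
```

Values up to 127 survive as one byte, but anything above that is expanded by
the encoder, so the byte count and the byte values no longer match the
original data.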

Ali

May 22, 2007, 9:56:54 AM


Hello,
Well! why don't you just write it with char* and read it via
java string. Now write some funny code to elaborate those strings to
meaningful values;-)

ali

Günter Prossliner

May 22, 2007, 11:18:54 AM

Hi

> Well! why don't you just write it with char* and read it via
> java string.

This will not work, although it _may_ work on some bunch of data.

Because: Strings in Java are Unicode (UTF-16), and not every sequence of
bytes is valid Unicode. Unicode implementations may substitute the illegal
chars with something else, or throw an error.

> Now write some funny code to elaborate those strings to
> meaningful values;-)

Converting random binary data into a Unicode string and converting this
Unicode string back to binary data will not produce the same data again in
many cases.

While in a non-Unicode world strings could be used to carry some binary
data (although it was never good coding practice), these techniques will
not work properly when you have Unicode data types and parsers.

You just have to use the byte data type to read arbitrary binary data in
Java.
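
The round-trip problem described above can be sketched in a few lines of
Java. This is only an illustration under stated assumptions: the bytes are
decoded as UTF-8 (the charset in play depends on how the String is built),
and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class StringRoundTrip {
    // Decodes raw bytes as UTF-8 text, then encodes that text back to bytes.
    static byte[] throughUtf8String(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 255 (0xFF) can never appear in well-formed UTF-8.
        byte[] pixels = { (byte) 110, (byte) 255, (byte) 0 };
        byte[] back = throughUtf8String(pixels);
        System.out.println("in:  " + Arrays.toString(pixels));
        System.out.println("out: " + Arrays.toString(back));
        System.out.println("identical: " + Arrays.equals(pixels, back));
    }
}
```

The String constructor silently replaces the malformed byte with U+FFFD, so
the data that comes back is longer than, and different from, what went in.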

GP

Alexander Nickolov

May 22, 2007, 1:24:05 PM

I suspect Ali meant to use Base64 encoding... That's quite
inefficient of course, but perhaps the OP doesn't care about
performance at all - just so as not to bother with advanced
Java techniques...
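
For reference, a hedged sketch of what a Base64 transport would look like on
the Java side. It assumes Java 8 or later (java.util.Base64 did not exist
when this thread was written), and the class name Base64Transport is
hypothetical.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64Transport {
    // Turns arbitrary bytes into plain ASCII text (4 characters per 3 bytes).
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Recovers the original bytes from the Base64 text.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] pixel = { (byte) 110, (byte) 200, (byte) 255 };
        String text = encode(pixel);
        System.out.println(text + " -> " + Arrays.toString(decode(text)));
    }
}
```

The round trip is lossless because the text only ever contains ASCII
characters, at the price of roughly one third more data.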


Ananya

May 22, 2007, 3:46:00 PM

Ok, in C++ I made a file of char*.

How exactly do I read that in Java as arbitrary byte data?

Thanks for your patience with me!

Ananya

May 22, 2007, 7:13:01 PM

Well, what are the advanced Java techniques?

Or how at least could I read the correct byte value in Java?

Thanks in advance!

Günter Prossliner

May 23, 2007, 9:13:42 AM

Hi!

> Or how at least could I read the correct byte value in Java?

http://www.infosys.tuwien.ac.at/teaching/courses/WebEngineering/References/java/docs/api/java/io/InputStream.html
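
A small sketch of how the InputStream API linked above is typically used to
read raw bytes. The helper name readAllUnsigned is made up, and an in-memory
stream stands in for the real file so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class UnsignedStreamReader {
    // Reads until end of stream and returns every byte as an int in 0..255.
    static int[] readAllUnsigned(InputStream in) {
        int[] values = new int[16];
        int n = 0;
        try {
            int b;
            while ((b = in.read()) != -1) {   // read() yields 0..255, or -1 at the end
                if (n == values.length) {
                    values = Arrays.copyOf(values, n * 2);
                }
                values[n++] = b;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.copyOf(values, n);
    }

    public static void main(String[] args) {
        byte[] sample = { 110, (byte) 200, (byte) 255 };
        int[] values = readAllUnsigned(new ByteArrayInputStream(sample));
        System.out.println(Arrays.toString(values));
    }
}
```

For a real file the same method would be handed a FileInputStream, ideally
wrapped in a BufferedInputStream, since single-byte reads on an unbuffered
file stream are slow.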

GP

Ali

May 23, 2007, 12:10:26 PM

Snip from GP:

> This will not work, although it _may_ work on some bunch of data.

And what is the data limit of this working solution? i mean the max
size when this tech. will stop working.

Snip from GP:

>> Now write some funny code to elaborate those strings to
>> meaningful values;-)

>Converting random binary data into a Unicode-String and converting this
>Unicode-String back to binary data will not produce the same data again in
>many cases.

A char is a byte, and a char pointer is the size of the underlying
processor's execution register (a.k.a. the accumulator register). And as
you people said, Java uses UTF-16 for strings, which is indeed 16 bits
(2 bytes) for every single character that it reads from common
file systems (ramfs, NTFS, etc.).

Honestly, I've never been through the code for getbyte (IIRC a Java
function) as pointed out by GP, but I'm sure it would leave the lower
byte (or, say, the higher byte) unusable when it comes to 8-bit (a byte,
UTF-8) data. So the point is that you are supposed to put 8-bit data
in a 16-bit place; is it that hard to do? Oh, yeah, go for libraries
that might save your immediate code, but I can't say they will never
add the latency of library call overhead too.

ali

Ananya

May 23, 2007, 11:56:00 PM

Thanks for all your input!

Actually, at this point I think I figured out the code in Java for reading
the bytes:
    pixelFile = new FileInputStream("pixels.dat");
    pixelData = new DataInputStream(pixelFile);
    pixelByte = new byte[3*pixcount];
    try
    {
        pixelData.read(pixelByte);
        ...
    }
    catch (IOException e)
    {
    }
and I know how to make the bytes unsigned integers by using & 0xff.
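
The & 0xff step mentioned above is needed because Java's byte type is signed
(-128..127). A minimal sketch, with a hypothetical class name:

```java
class UnsignedBytes {
    // Widens a signed Java byte to the 0..255 value the C++ side wrote.
    static int toUnsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte stored = (byte) 200;                 // the bit pattern written for 200
        System.out.println(stored);               // the signed view of those bits
        System.out.println(toUnsigned(stored));   // the original unsigned value
    }
}
```

On Java 8 and later, Byte.toUnsignedInt(b) does the same thing.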

But I still haven't figured out the code in C++ for writing the bytes.
What I have so far is:
    unsigned char *pixels = ...
    char *charpixels = reinterpret_cast<char*>(pixels);
    FileStream file = new FileStream(S"pixels.dat", FileMode::Create,
                                     FileAccess::Write);
    BinaryWriter binary = new BinaryWriter(file);
    binary.write(charpixels, pixcount);
    binary.close();

However, my compiler doesn't recognize FileStream and BinaryWriter,
and when I try:
using namespace System::IO;
the compiler says that a namespace with this name does not exist.

Is my code correct, and what "using" or "include" do I need?
By the way, I copied new FileStream(S"pixels.dat",...) from somewhere and I
don't know what the "S" means?

I posted the question "how to write a binary file" also in the vc.language
forum at:
http://msdn.microsoft.com/newsgroups/default.aspx?&lang=en&cr=US&guid=&sloc=en-us&dg=microsoft.public.vc.language&p=1&tid=662db087-82f1-4ab4-8802-f56c94c8d967

Thanks for reading this.

Ananya

May 24, 2007, 12:32:01 AM

Oh, and I also tried Ali's suggestion at my "how to wait for socket
communications " question, which was for my C++ code:

    fstream file_op("c:\\Dev\\AnanyaCurves0.new\\pixels.dat", ios::in);
    while(file_op >> charpixels)
        cout << charpixels;
    file_op.close();

but the compiler didn't understand "cout", and I have a feeling that I have
a better chance of Java understanding the file if I use BinaryWriter.

So please answer my previous question. Thanks!

Günter Prossliner

May 24, 2007, 4:25:09 AM

Hello Ali!

> Snip from GP:
>> This will not work, although it _may_ work on some bunch of data.
>
> And what is the data limit of this working solution? i mean the max
> size when this tech. will stop working.

It is not about size. It will stop working when specific byte sequences
that are not valid Unicode occur within the binary data.

When you use ANSI strings to read binary data:
1. every byte stands on its own
2. every possible byte value has a corresponding character in ANSI

==> you may read any binary data as an ANSI string. The "string" is just a
representation of the underlying binary data, so you can convert between
them without losing information. Although in languages which support an
explicit binary type (as opposed to C++, which uses 'char' for both
characters and binary data), this was never good coding practice.
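
To illustrate the single-byte case in Java: the sketch below uses ISO-8859-1
as a stand-in for an "ANSI" encoding, because it maps every byte value
0..255 to exactly one character. Whether a particular Windows code page
behaves the same way is not assumed here, and the class name is
hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Decodes bytes as ISO-8859-1 text and encodes the text back again.
    static byte[] throughLatin1String(byte[] raw) {
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Builds an array containing every possible byte value once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        System.out.println("lossless: " + Arrays.equals(all, throughLatin1String(all)));
    }
}
```

This works only because of the one-byte-per-character mapping; it is still a
detour compared with reading a byte[] directly.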

In contrast, with Unicode:
1. characters are encoded in more than one byte (when using UTF-16)
2. NOT every possible combination of values is a valid Unicode
code point.

What happens when a Unicode parser detects an invalid byte sequence?
* Throw an error (in this case you will not be able to read the data at
all)
* Substitute the invalid sequence with something that may be SIMILAR (and
in this case you will not get the same binary data back when you interpret
this string as binary data).
* Skip the invalid character (needless to say, you will lose data)
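
The three reactions listed above correspond to the error actions a Java
CharsetDecoder can be configured with. A sketch, assuming UTF-8 as the
encoding being parsed; the class name and the choice to return null on
error are illustrative only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class DecodePolicies {
    // Decodes bytes as UTF-8 with the given policy for malformed input.
    // Returns null when the decoder reports an error (policy REPORT).
    static String decode(byte[] raw, CodingErrorAction policy) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(policy)
                    .onUnmappableCharacter(policy)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] raw = { 110, (byte) 0xFF, 110 };   // 0xFF is never valid in UTF-8
        System.out.println("report:  " + decode(raw, CodingErrorAction.REPORT));
        System.out.println("replace: " + decode(raw, CodingErrorAction.REPLACE));
        System.out.println("ignore:  " + decode(raw, CodingErrorAction.IGNORE));
    }
}
```

REPORT fails outright, REPLACE substitutes U+FFFD, and IGNORE drops the
offending byte; in none of the three cases do the original three bytes
survive.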

Just check out the Unicode specification for details.

see: http://www.unicode.org/reports/tr22/ (a document specifying an
XML format for Unicode exchange).

======================================================
1.1 Illegal and Unassigned Codes

- The sequence is illegal. There are two variants of this.
First is where the sequence is incomplete. For example,

* 0xA3 is incomplete in CP950.
Unless followed by another byte of the right form, it is illegal.

* 0xD800 is incomplete in Unicode.
Unless followed by another value of the right form, it is illegal.

* 0xDC00 is incomplete in Unicode.
Unless preceded by another value of the right form, it is illegal.

The second variant is where the sequence is complete, but explicitly
illegal. For example,
0xFFFF is illegal in Unicode. This value can never occur in valid Unicode
text, and will never be assigned.

- The source sequence represents a valid code point, but is unassigned (aka
undefined). This sequence may be given an assignment in some future
(evolved) version of the character encoding.
For example,
* 0xA3 0xBF is unassigned in CP950, as of 1999.
* 0x0EDE is unassigned in Unicode, V3.0

- The source sequence is assigned, but unmappable: there is no
corresponding code point in the target encoding to accurately represent the
source sequence.
For example,

The long dash is assigned in Unicode, but cannot be mapped to ISO-8859-1.

In the case of illegal source sequences, a conversion routine will typically
provide the following options:

* stop (or throw an exception)
in particular, stopping is commonly used by higher level character encoding
schemes, such as ISO 2022 conversions, to know when to stop converting into
one encoding and pick another to convert to.
in either case, the information as to length of the bad sequence should be
available and the conversion should be resumable (after the caller handles
the bad sequence).

* skip the source sequence
while this is commonly an option, it can also hide corruption problems in
the source text.

* map to a substitution character
such as the Unicode U+FFFD REPLACEMENT CHARACTER.
======================================================

see: http://unicode.org/faq/utf_bom.html#40
======================================================
Q: Are there any 16-bit values that are invalid?

A: The two values 0xFFFE and 0xFFFF as well as the 32 values from 0xFDD0 to
0xFDEF represent noncharacters. They are invalid in interchange, but may be
freely used internal to an implementation. Unpaired surrogates are invalid
as well, i.e. any value in the range 0xD800 to 0xDBFF not followed by a
value in the range 0xDC00 to 0xDFFF, or any value in the range 0xDC00 to
0xDFFF not preceded by a value in the range 0xD800 to 0xDBFF. [AF]

======================================================
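
A hedged Java illustration of the unpaired-surrogate case from the FAQ entry
above. It assumes the String is encoded to UTF-8 with String.getBytes, which
substitutes a replacement byte instead of failing; the class name is made
up.

```java
import java.nio.charset.StandardCharsets;

class LoneSurrogate {
    // Encodes the text as UTF-8.
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String lone = String.valueOf((char) 0xD800);   // high surrogate with no partner
        String pair = "\uD83D\uDE00";                  // a valid surrogate pair
        System.out.println("lone surrogate -> " + encodeUtf8(lone).length + " byte(s)");
        System.out.println("valid pair     -> " + encodeUtf8(pair).length + " byte(s)");
    }
}
```

The lone surrogate comes out as a single '?' byte, so a 16-bit value that
happened to fall in the surrogate range cannot be recovered afterwards.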

GP

Ananya

May 24, 2007, 5:51:02 AM

Well, I am trying to get away from reading the C++ characters as Java
characters. I am trying to write a file of bytes in C++ and read it as a
file of bytes in Java.

Well, I was googling for some code, and even though I asked for C++, it
seems I ended up with some C# code, which I cannot use.

So now I tried the following C++ code, which is similar to what Ali
suggested originally:

    unsigned char *pixels = ...
    char *charpixels = reinterpret_cast<char*>(pixels);

    fstream file;
    file.open("chars.dat", ios::out);
    file.write(charpixels, count);

Now the code compiled, but it doesn't seem to write the correct bytes.

Well, unsigned chars of red, green, blue, all equal to 110, in C++, end up
in Java as 0 after making the bytes unsigned integers by using & 0xff.

Thanks for your patience with me.

"Ananya" wrote:

> I am trying to create a file of bytes from my character array
> char* chars with length int count.
>
> I wrote the code:
> FileStream file = new FileStream(S"chars.dat", FileMode::Create,
> FileAccess::Write);
> BinaryWriter binary = new BinaryWriter(file);
> binary.write(chars, count);
> binary.close();
>
> My compiler doesn't recognize FileStream and BinaryWriter.
> I tried:
> using namespace System::IO;
> but it doesn't recognize that either?
>
> Is my code correct, and what "using" or "include" do I need?
> By the way, in new FileStream(S"chars.dat", ...), what does the "S" mean?
>
> Thanks for reading this.

Günter Prossliner

May 24, 2007, 6:37:17 AM

Hello Ananya!

> Well, I am trying to get away from reading the C++ characters as Java
> characters. I am trying to write a file of bytes in C++ and read it
> as a file of bytes in Java.

I know. I've posted this explanation for Ali.

> Well, I was googling for some code, and even though I asked for
> C++, it seems I ended up with some C# code, which I cannot use.

If you mean this code...

    FileStream file = new FileStream(S"pixels.dat", FileMode::Create,
                                     FileAccess::Write);
    BinaryWriter binary = new BinaryWriter(file);
    binary.write(charpixels, pixcount);
    binary.close();

... it is neither C# nor C++ (Managed C++ or C++/CLI). It will not compile
at all.

For example, if you use C++/CLI you would have to write "FileStream^ file =
gcnew FileStream(...)"; if you use C++ with Managed Extensions, the syntax
would be "FileStream* file = new FileStream(...)" (a __gc pointer). The same
goes for BinaryWriter. "Write" and "Close" have to be Pascal-case.

> So now I tried the following C++ code, which is similar to what Ali
> suggested originally:

> unsigned char *pixels = ...
> char *charpixels = reinterpret_cast<char*>(pixels);

    unsigned char *pixels = ...

    fstream file("C:\\test.dat", ios::out|ios::binary);
    file.write(reinterpret_cast<char*>(pixels), pixcount);
    file.close();

GP

Ananya

May 24, 2007, 7:44:00 AM

Ok, my code actually was:

    unsigned char *pixels = ...
    char *charpixels = reinterpret_cast<char*>(pixels);

    fstream file;
    file.open("C:\\Dev\\AnanyaCurves\\pixels.dat", ios::out | ios::binary);
    file.write(charpixels, pixcount);

But making it:

    unsigned char *pixels = ...

    fstream file;
    file.open("C:\\test.dat", ios::out|ios::binary);
    file.write(reinterpret_cast<char*>(pixels), pixcount);

does not help. The output file test.dat still stays empty.

Thanks for your patience!

Ananya

May 24, 2007, 9:08:01 AM

Well, I had to say file.close() after writing the file so that the file is
actually written. (Otherwise it was only written after exiting the method
inside which the writing was done.)
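
The same effect shows up on the Java side whenever output is buffered: data
sits in memory until the stream is flushed or closed. A minimal sketch,
assuming Java 8 or later; the class name and the temporary file are
illustrative and not part of the original program.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class WriteAndClose {
    // Writes the bytes through a buffered stream. try-with-resources closes
    // the stream, which flushes the buffer to disk, and only then is the
    // file length measured.
    static long writeAll(File target, byte[] data) {
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(target))) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.length();
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("pixels", ".dat");
        long written = writeAll(tmp, new byte[] { 110, 110, 110 });
        System.out.println("bytes on disk after close: " + written);
        tmp.delete();
    }
}
```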

Ananya

May 24, 2007, 9:48:02 AM

And it is really fast! Thanks so much, Ali, for suggesting to communicate
with simple file I/O between C++ and Java, and not to worry about fancy
memory-mapped files!

Ali

May 25, 2007, 5:58:47 AM

Glad to hear that you have solved your problem;-)

Cheers,
ali

Ali

May 25, 2007, 6:03:36 AM

> see: http://www.unicode.org/reports/tr22/ (a document specifying an

GP! Thanks for the valuable thoughts; I enjoyed reading that. Well, I
guess this seems to be another discussion like OO versus structural
programming, or good programming versus bad ;-)

ali

Ananya

May 27, 2007, 8:03:02 PM

Let me just make a summary of how to communicate unsigned char* from C++ to
Java, which enabled me to communicate the r, g, b values of a picture:

In Java I wrote the code:
    // this array holds the packed r, g, b values; count = width*height
    int[] pix = new int[count];
    byte[] pixelByte = new byte[3*count];
    try
    {
        FileInputStream pixelFile = new FileInputStream("C:\\pixelvalues.dat");
        DataInputStream pixelData = new DataInputStream(pixelFile);
        // readFully keeps reading until the array is full; a plain read()
        // may return after fewer bytes
        pixelData.readFully(pixelByte);
        pixelData.close();
        // pixelsw is the row width in pixels (assumed equal to width)
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int red = (int)pixelByte[y*pixelsw + x] & 0xff;
                int green = (int)pixelByte[count + y*pixelsw + x] & 0xff;
                int blue = (int)pixelByte[2*count + y*pixelsw + x] & 0xff;
                pix[y*pixelsw + x] = 0xff000000 | ((red<<16) + (green<<8) + blue);
            }
        }
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }

and in C++ I wrote the code:
    unsigned char *pixels = ... //the length of this array is 3*count
    const char *charpixels = reinterpret_cast<char*>(pixels);
    fstream file;
    file.open("C:\\pixelvalues.dat", ios::out | ios::binary);
    file.write(charpixels, 3*count);
    file.close();

Thanks to everyone for participating in this discussion!
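
For readers who want the Java side of the summary as a self-contained unit,
here is a hedged sketch. It assumes the planar layout used above (all red
bytes, then all green, then all blue) and a row width equal to width; the
class name, method names and the path parameter are hypothetical.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class PlanarRgbReader {
    // Packs planar R, G, B bytes into 0xAARRGGBB ints with full alpha.
    static int[] toArgb(byte[] planes, int width, int height) {
        int count = width * height;
        if (planes.length < 3 * count) {
            throw new IllegalArgumentException("need " + (3 * count) + " bytes");
        }
        int[] pix = new int[count];
        for (int i = 0; i < count; i++) {
            int red = planes[i] & 0xff;               // & 0xff undoes the sign extension
            int green = planes[count + i] & 0xff;
            int blue = planes[2 * count + i] & 0xff;
            pix[i] = 0xff000000 | (red << 16) | (green << 8) | blue;
        }
        return pix;
    }

    // Reads exactly 3*width*height bytes from the file and packs them.
    static int[] readArgb(String path, int width, int height) {
        byte[] planes = new byte[3 * width * height];
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            in.readFully(planes);   // fails with EOFException if the file is too short
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toArgb(planes, width, height);
    }

    public static void main(String[] args) {
        // One pixel with r=110, g=200, b=0, kept in memory so no file is needed.
        int[] pix = toArgb(new byte[] { 110, (byte) 200, 0 }, 1, 1);
        System.out.println(Integer.toHexString(pix[0]));
    }
}
```

An int[] in this packed form is what, for example,
java.awt.image.BufferedImage.setRGB expects for TYPE_INT_ARGB images.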