Hexadecimal representation of objects in console

364 views
Skip to first unread message

michaelm

unread,
Jan 4, 2010, 7:09:28 PM1/4/10
to H2 Database
Hi,

If I create a table in h2 linked to a postgresql/postgis database and
I select the geometry field (type Geometry in postgresql/postgis, type
Object in h2), the console displays the hexadecimal representation of
the hexadecimal representation of the original data.
Exemple :
0106... in postgresql (hexadecimal values)
30313036... in h2

I could retrieve the original binary data with this special function :
HEXTORAW(REGEXP_REPLACE(GEOMETRIE, '[0-9A-F][0-9A-F]', '00$0')

Is there any special reason to have this behaviour (conversion ?), or
a simpler way to keep original binary data as is ?

Thanks, and keep the excellent work.

Michaël

Thomas Mueller

unread,
Jan 6, 2010, 1:12:04 PM1/6/10
to h2-da...@googlegroups.com
Hi,

That's strange. What would be the easiest way for me to reproduce this?

Regards,
Thomas

> --
>
> You received this message because you are subscribed to the Google Groups "H2 Database" group.
> To post to this group, send email to h2-da...@googlegroups.com.
> To unsubscribe from this group, send email to h2-database...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/h2-database?hl=en.
>
>
>

michaelm

unread,
Jan 6, 2010, 5:45:48 PM1/6/10
to H2 Database
Hi,

Sorry, I cannot reproduce it without using postgresql + postgis (I'm
using postgresql-8.4-701.jdbc3.jar)

With a normal bytea type, everything seems ok
for exemple :
create table mytable(myfield bytea); // postgresql
insert into mytable VALUES('Thomas'::bytea); // postgresql
--> pgadmin displays 'Thomas' (I'm not sure why bytea is converted to
string)
--> h2 displays '54686f6d6173' (which is fine)

Now if you can test with postgis,

in postgresql :
create table mytable(myfield geometry);
insert into mytable VALUES(geomfromtext('POINT(0 0)'));
select myfield from mytable3; // displays
"010100000000000000000000000000000000000000"

in h2
CREATE LINKED TABLE LT ('org.postgresql.Driver', 'jdbc:postgresql://
mybase', 'postgres', 'postgres', 'mytable')
SELECT * FROM LT; //
"303130313030303030303030303030303030303030303030303030303030303030303030303030303030"

Here is a link about how postgresql output binary data into
hexadecimal strings (http://developer.postgresql.org/pgdocs/postgres/
datatype-binary.html#AEN4978)
And another about a similar problem, but quite old (http://
postgis.refractions.net/pipermail/postgis-users/2002-July/001235.html)

May be related to how jdbc driver output an unknown datatype or how h2
drivers read an h2 datatype.

Hope that helps

Michaël

On 6 jan, 19:12, Thomas Mueller <thomas.tom.muel...@gmail.com> wrote:
> Hi,
>
> That's strange. What would be the easiest way for me to reproduce this?
>
> Regards,
> Thomas
>

Thomas Mueller

unread,
Jan 9, 2010, 9:25:51 AM1/9/10
to h2-da...@googlegroups.com
Hi,

> --> pgadmin displays 'Thomas' (I'm not sure why bytea is converted to
> string)
> --> h2 displays '54686f6d6173' (which is fine)

The problem is that the H2 Console calls ResultSet.getString(..). It
looks like PostgreSQL converts binary data to a String in this case,
while H2 converts binary data to the hex representation.

To convert binary data in the UTF-8 format to a String in H2, use
UTF8TOSTRING, see also
http://www.h2database.com/html/functions.html#utf8tostring

I don't know if this will work in all cases (I don't know if the
binary is UTF-8). If it doesn't work, you might want to create a Java
function within H2. This is quite easy, see
http://www.h2database.com/html/grammar.html#create_alias - and use
something like:  return new String(byte[] bytes, String charsetName).

Regards,
Thomas

michaelm

unread,
Jan 9, 2010, 3:38:31 PM1/9/10
to H2 Database
Hi,

> The problem is that the H2 Console calls ResultSet.getString(..). It
> looks like PostgreSQL converts binary data to a String in this case,
> while H2 converts binary data to the hex representation.

Sorry, I try to make it clear as I think that something does not work
properly (don't know if it is in h2, postgresql jdbc driver or
postgresql)
My data IS binary data, not ascii nor UTF-8 nor any character stream.
As far as I understand, postgresql already converts this binary data
into a hexadecimal string representation when you get it from the jdbc
driver, and h2 console converts again this hexadecimal string as if it
was binary, into its hexadecimal representation.
One has to know what is the datatype of the returned data and if it is
returned as a binary stream or as a hexadecimal string stream.
If it is binary, everything should work properly, but it is
hexadecimal, it should NOT be interpreted as if it was binary data.

Hope my analyse makes sense.

>
> To convert binary data in the UTF-8 format to a String in H2, use
> UTF8TOSTRING, see alsohttp://www.h2database.com/html/functions.html#utf8tostring

Actually, this binary data has to be interpreted by a Java function as
it is stored in the WKB format (well-known-binary).
I can achieve it, but at the price of a conversion from hexadecimal to
hexadecimal to binary, which, I hope, can be avoided.

Thanks for your help

Michaël

>
> I don't know if this will work in all cases (I don't know if the
> binary is UTF-8). If it doesn't work, you might want to create a Java

> function within H2. This is quite easy, seehttp://www.h2database.com/html/grammar.html#create_alias- and use

Dario Fassi

unread,
Jan 10, 2010, 3:43:19 PM1/10/10
to h2-da...@googlegroups.com
Hi,
I think that this may be a Postgresql particularity about how it handle bytea data type.

A few points to check:

1) In pgsql conversion's behavior is controlled by SET CLIENT ENCONDING    (http://www.postgresql.org/docs/8.4/static/multibyte.html )

2) In pgsql with standard JDBC driver bytea data output handle printable and non-printable characters in different ways  ( http://www.postgresql.org/docs/current/static/datatype-binary.html )
Bytea octets are sometimes escaped when output. In general, each "non-printable" octet is converted into its equivalent three-digit octal value and preceded by one backslash. Most "printable" octets are represented by their standard representation in the client character set. The octet with decimal value 92 (backslash) is doubled in the output. Details are in Table 8-8.

3) Postgis has an extension to the postrgres JDBC driver ( http://postgis.refractions.net/download/postgis-1.5.0SVN.jar )  to properly manage postgis specific data types.
     It's posible that you don't have postgis.jar in H2 classpath (plus the standard postgresql JDBC driver) when H2 connect the linked table.

I suspect that when H2 console read the BINARY field for output make a printable string from it as an hex representation , but  don't have any way to know if portgres/postgis has previously done other translation at JDBC level. 

I hope this help.
Regards,
Dario.

El 09/01/10 17:38, michaelm escribió:

michaelm

unread,
Jan 11, 2010, 8:02:24 PM1/11/10
to H2 Database
Hi,

Thank you for your input,

> 1) In pgsql conversion's behavior is controlled by SET CLIENT
> ENCONDING    (http://www.postgresql.org/docs/8.4/static/multibyte.html)

I don't think the problem is related to character encoding. Original
data is binary data, pg output is a hexadecimal string and H2 outputs
is another hexadecimal string.

> 2) In pgsql with standard JDBC driver bytea data output handle printable
> and non-printable characters in different ways  (http://www.postgresql.org/docs/current/static/datatype-binary.html)
>
> > Bytea octets are sometimes escaped when output. In general, each
> > "non-printable" octet is converted into its equivalent three-digit
> > octal value and preceded by one backslash. Most "printable" octets are
> > represented by their standard representation in the client character
> > set. The octet with decimal value 92 (backslash) is doubled in the
> > output. Details are in Table 8-8

> > <http://www.postgresql.org/docs/current/static/datatype-binary.html#DA...>.

Actually, if I read a postgresql bytea field, H2 displays its exact
hexadecimal representation, which is what is expected I think.
Problem arises with pg/postgis geometry datatype which, I think is
managed as a UDT data type in PostgreSQL and converted into a OTHER
data_type in h2.
Postgresql output this data as a hexadecimal string, but maybe H2 has
no way to guess that, and it converts it again as if it was binary
data, hence the problem.
At least that's what I understand.

>
> 3) Postgis has an extension to the postrgres JDBC driver (http://postgis.refractions.net/download/postgis-1.5.0SVN.jar)  to


> properly manage postgis specific data types.
>      It's posible that you don't have postgis.jar in H2 classpath (plus
> the standard postgresql JDBC driver) when H2 connect the linked table.

Good suggestion. I tried to add this jar in the classpath, but without
success. I think there is no code in postgres driver nor in h2 code to
use this third library.
I think postgis.jar offers a way to read geometry from a postgresql
geometry data, but I've already code to convert bytes into geometry.
The problem is that my method parameter must be the original byte
array, not a byte array representing the hexadecimal digits of the
original byte array.
As far as I understand the problem, the only way to solve it would be
to assume that for such a UDT, the value returned by postgresql is a
hexadecimal string of the binary data and not the data itself, so that
H2 do not process it as if it was the original binary data.
But this assumption needs to be confirmed

> I suspect that when H2 console read the BINARY field for output make a
> printable string from it as an hex representation , but  don't have any
> way to know if portgres/postgis has previously done other translation at
> JDBC level.

Yes, I think this is the point.
I just hope there is a way to know (convention or documentation about
how this kind of data is held by postgresql and jdbc).

Thanks for help

Michaël

> >> function within H2. This is quite easy, seehttp://www.h2database.com/html/grammar.html#create_alias-and use

Dario Fassi

unread,
Jan 11, 2010, 9:01:32 PM1/11/10
to h2-da...@googlegroups.com
Michael,
If I understand well what you are trying to do, you have created a linked table to postgis
and you do:   "select  binary_field from linked_table "  as input of  UDF or DML in H2 database, isn't ?

Then what matter how H2 console output that field ?
or  binary data stored/handled in H2 database is different from the original raw binary ?

If you need this byte array as input to some UDF or java method you are done , and  H2 Console don't play any role here, all is in your program.

If your problem is how to recover the exact binary data stored in pgsql to be stored in an H2 database , you can try this to avoid any unwanted conversion:

1) make a direct jdbc query to pgsql and get the resultset Metadata to see with what data type pg/postgis report the "geometry" field.
2) If that type is any other than BINARY , make a cast in your query or create a view to cast that field to BINARY (or blob)
3) Read that field with   InputStream ResultSet.getBinaryStream(columnIndex)  or   rs.getBlog()   ;
4) Dump the input stream to check the raw data

You can avoid use linked table too , and use other direct jdbc connection to pgsql for read raw binary in such way and use it as you need. 

regards,
Dario.

michaelm

unread,
Jan 12, 2010, 2:56:04 AM1/12/10
to H2 Database
Hi,

Point 2 of your last suggestion is a great idea.
If I create my linked table with a cast from postgresql, binary data
is read in a standard way (I mean as binary data)
CREATE LINKED TABLE LT('org.postgresql.Driver',
'jdbc:postgresql:mybase', '', '', '(select myfield::bytea from
mytable)');

This is a little overhead compared with the conversion I had to
achieve before to get original binary data from its hexadecimal
representation.

FYI, here after are more details about what I try to achieve.

1) I'm testing h2 to see how well it can manage geographic data
2) I get problems when I tried to spatialize h2 with the third party
library hat_box.
3) The easiest way for me to populate h2 with test data is from
postgresql.
4) To use hatbox (library creating a spatial index from a field in wkb
binary format) or jts (library able to perform geometric operations on
a wkb binary field), I have to get geometry in a well formatted wkb
binary field.

I still think that the way this UDF field (postgis geometry) is
displayed in h2 in a way that is error prone because original data is
doubly converted.
I understand that one conversion is done by postgresql and the other
by h2, so that it is difficult to avoid until we are sure how
postgresql handle this kind od data in a general case.

Last solution you suggested (casting the geometry field into bytea) is
acceptable for my use case.

Thanks for your patience and your help

Michaël


On 12 jan, 03:01, Dario Fassi <dfa...@gmail.com> wrote:
> Michael,
> If I understand well what you are trying to do, you have created a
> linked table to postgis
> and you do:   "select  binary_field from linked_table "  as input of
> UDF or DML in H2 database, isn't ?
>
> Then what matter how H2 console output that field ?
> or  binary data stored/handled in H2 database is different from the
> original raw binary ?
>
> If you need this byte array as input to some UDF or java method you are
> done , and  H2 Console don't play any role here, all is in your program.
>
> If your problem is how to recover the exact binary data stored in pgsql
> to be stored in an H2 database , you can try this to avoid any unwanted
> conversion:
>
> 1) make a direct jdbc query to pgsql and get the resultset Metadata to
> see with what data type pg/postgis report the "geometry" field.
> 2) If that type is any other than BINARY , make a cast in your query or
> create a view to cast that field to BINARY (or blob)
> 3) Read that field with   |InputStream

> <http://java.sun.com/j2se/1.5.0/docs/api/java/io/InputStream.html>|
> ResultSet.
> <http://java.sun.com/j2se/1.5.0/docs/api/java/sql/ResultSet.html#getBi...>|*getBinaryStream
> <http://java.sun.com/j2se/1.5.0/docs/api/java/sql/ResultSet.html#getBi...>*(columnIndex)|

Reply all
Reply to author
Forward
0 new messages