ResultSetMetaData.getColumnDisplaySize returns 2147483647 ?

DGPickett

unread,

Feb 25, 2009, 11:33:36 AM2/25/09

to David....@nasdaqomx.com

Selecting a constant string or concatenated columns from postgres
using jisql (xigole), I get heap overflow.

select 'X' AS XXX from . . . .
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.AbstractStringBuilder.<init>
(AbstractStringBuilder.java:45)
at java.lang.StringBuffer.<init>(StringBuffer.java:91)
at com.xigole.util.sql.Jisql.formatLabel(Jisql.java:720)
at com.xigole.util.sql.Jisql.doIsql(Jisql.java:352)
at com.xigole.util.sql.Jisql.run(Jisql.java:225)
at com.xigole.util.sql.Jisql.main(Jisql.java:207)
0

This discussion seems to say this is a postgres JDBC bug:

http://forums.hotjoe.com/posts/list/315.page

I tried going back to the last rev 7 JAR, and these queries work,
albeit with 65K column widths. Something is seriously wrong in this
method, if it cannot size a three byte label or a one byte constant
string.

Kris Jurka

unread,

Feb 25, 2009, 7:11:50 PM2/25/09

to

On Wed, 25 Feb 2009, DGPickett wrote:

> Selecting a constant string or concatenated columns from postgres
> using jisql (xigole), I get heap overflow.
>
> select 'X' AS XXX from . . . .

While in this case it's obvious that 'X' has length 1, the server isn't
telling the driver that. Even if it did in this case, there are clearly
cases the server cannot know the output length. Consider:

SELECT repeat('X', (random() * 100)::int);

So regardless of the simple case you've shown, we must be able to handle
unknown length data. Previously we did that by returning -1 which broke
the rowset implementation which insisted that the value by >= 0. So we
moved to Integer.MAX_VALUE which seems to mostly work for people.

This thread details how we intend to solve the similar problem of
getPrecision for Crystal Reports users, so we could likely apply the same
sort of changes for display size as well.

http://archives.postgresql.org/pgsql-jdbc/2008-12/msg00018.php

> This discussion seems to say this is a postgres JDBC bug:
>
> http://forums.hotjoe.com/posts/list/315.page
>

That's not how I read the discussion. The thread says that h2 does the
same thing and that the author has provided a workaround to allow the user
to provide a custom formatter to handle data types with problems.

Kris Jurka

--
Sent via pgsql-jdbc mailing list (pgsql...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

DGPickett

unread,

Feb 26, 2009, 11:54:11 AM2/26/09

to

I feel it runs contrary to the purpose of JDBC to make clients have
postgres-specific code for this response. The common JDBC
speccification does not support returning int max (or anything else)
for "I don't know."

It seems unreasonable that this implementation cannot deal with these
trivial cases. I am not sure whether the fault lies with the JDBC
implementation or the underlying RDBMS server, that we cannot
determine that "X" is 1 byte wide, or a||b is (width of a) + (width of
b) bytes wide. Perhaps the requirement to support result metadata did
not percolate down into the server or TCP interface code far enough.
Can someone illuminate us on this?

Best,

David

Kris Jurka

unread,

Feb 26, 2009, 1:18:55 PM2/26/09

to

On Thu, 26 Feb 2009, DGPickett wrote:

> I feel it runs contrary to the purpose of JDBC to make clients have
> postgres-specific code for this response. The common JDBC
> speccification does not support returning int max (or anything else)
> for "I don't know."

Exactly. Since the spec doesn't say what to do, what do you want the JDBC
driver to do? Throw an exception? That's not helpful at all. The JDBC
team has come up with a solution that works for most clients. If you have
a concrete suggestion as to how this case should be handled that will work
for everyone, then we'd be glad to hear it. Just wishing this case didn't
exist isn't helpful.

> It seems unreasonable that this implementation cannot deal with these
> trivial cases. I am not sure whether the fault lies with the JDBC
> implementation or the underlying RDBMS server, that we cannot

> determine that "X" is 1 byte wide...

The server is responsible for determining the type and length of data, not
the JDBC driver. The server tells the JDBC driver that it is unknown and
the driver has to do something with it. You are still ignoring the fact
that there must be cases where the server cannot possibly know the length
of a piece of data even if it was capable of of handling the simpler
cases.

DGPickett

unread,

Feb 26, 2009, 2:53:10 PM2/26/09

to

On Feb 26, 1:18 pm, bo...@ejurka.com (Kris Jurka) wrote:

> The server is responsible for determining the type and length of data, not
> the JDBC driver. The server tells the JDBC driver that it is unknown and
> the driver has to do something with it. You are still ignoring the fact
> that there must be cases where the server cannot possibly know the length
> of a piece of data even if it was capable of of handling the simpler
> cases.
>
> Kris Jurka
>
> --

> Sent via pgsql-jdbc mailing list (pgsql-j...@postgresql.org)

> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-jdbc

Well, can the JDBC team raise this issue with the postgres server
team, that many relatively obvious cases are being marked 'width
unknown'?

Kris Jurka

unread,

Feb 26, 2009, 3:54:46 PM2/26/09

to

On Thu, 26 Feb 2009, DGPickett wrote:

> Well, can the JDBC team raise this issue with the postgres server
> team, that many relatively obvious cases are being marked 'width
> unknown'?
>

They are aware. See this previous discussion regarding the same problem
with numeric types:

http://archives.postgresql.org/pgsql-jdbc/2008-05/msg00021.php

You can take it up on pgsql-hackers if you want, but unless you have
something conrete to add, I wouldn't bother.

Kris Jurka

--
Sent via pgsql-jdbc mailing list (pgsql...@postgresql.org)

DGPickett

unread,

Feb 27, 2009, 12:41:11 PM2/27/09

to

On Feb 26, 3:54 pm, bo...@ejurka.com (Kris Jurka) wrote:
> On Thu, 26 Feb 2009, DGPickett wrote:
> > Well, can the JDBC team raise this issue with the postgres server
> > team, that many relatively obvious cases are being marked 'width
> > unknown'?
>
> They are aware. See this previous discussion regarding the same problem
> with numeric types:
>
> http://archives.postgresql.org/pgsql-jdbc/2008-05/msg00021.php
>
> You can take it up on pgsql-hackers if you want, but unless you have
> something conrete to add, I wouldn't bother.
>
> Kris Jurka
>
> --

> Sent via pgsql-jdbc mailing list (pgsql-j...@postgresql.org)

> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-jdbc

Yes, I see it comes up over and over:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00428.php

The discussions about char versus byte were nice, but the method name
says "Display" and the api doc says "characters", so if you have
Nvarchar(20) in a 16 bit unicode context, the storage needs 40 bytes
(plus a length) and the right return is 20. Nobody expects or wants
the gratuitous expense of dynamic sizing, just the obvious max so they
can typeset text output. JAVA hides the code for wide characters, and
we do not want it bring it back into the light.

Somehow, doing what the real money RDBMS do, the de-facto standard, is
hard for postgres dudes, or they have more important concerns. It
makes the product seem shabby by comparison. Usually, that is enough
to motivate frantic coding, to make open source appear just as good.

Oliver Jowett

unread,

Mar 2, 2009, 4:10:30 AM3/2/09

to

DGPickett wrote:

> Yes, I see it comes up over and over:
>
> http://archives.postgresql.org/pgsql-hackers/2007-06/msg00428.php
>
> The discussions about char versus byte were nice, but the method name
> says "Display" and the api doc says "characters", so if you have
> Nvarchar(20) in a 16 bit unicode context, the storage needs 40 bytes
> (plus a length) and the right return is 20. Nobody expects or wants
> the gratuitous expense of dynamic sizing, just the obvious max so they
> can typeset text output. JAVA hides the code for wide characters, and
> we do not want it bring it back into the light.

Except that the backend is not written in Java, and neither are all the
other frontends.

To some extent, this is a case of the JDBC interface not being flexible
enough about the metadata (or often not even documenting what it's meant
to mean!) - so we are stuck trying to provide the best fit we can from
the data the server gives us.

Asking the server to conform to JDBC's particular idea of how the
metadata should look, at the expense of other interfaces, seems a bit
unreasonable.

If you have a better idea about how the JDBC driver can map the
information coming from the backend to the JDBC metadata interfaces, a
patch would be great. Or you could change the calling code to
bounds-check the metadata results before it goes off and tries to
allocate the entire heap for one column in one row ..

-O

--
Sent via pgsql-jdbc mailing list (pgsql...@postgresql.org)

DGPickett

unread,

Mar 2, 2009, 11:25:53 AM3/2/09

to

On Mar 2, 4:10 am, oli...@opencloud.com (Oliver Jowett) wrote:
> Or you could change the calling code to
> bounds-check the metadata results before it goes off and tries to
> allocate the entire heap for one column in one row ..

If you have to change calling code to access Postgres, it violates the
original promise of JDBC: use me and be portable. If you want an 'I
don't know' response built into this API method, sell it to the JDBC
community, but until then, it is a requirement. Most RDBMS have not
stumbled when delivering on the requirement, so selling it as a
hardship seems difficult. As the real estate zoning lawyers say,
"This is, at best, a self-imposed hardship."

Re: Character / Byte, I know that in addition to 1, 2 and 4 byte wide
characters of several families, we have these variable width creatures
like UTF-8. RDBMS developers can hardly be happy about complications
of this development, but to stay on the road to being more and more
application-facing, they should provide storage in characters for any
sort of character, even if the underlying storage must be overly
generous or variable to accommodate this. JAVA and JDBC are just
leading the way. When this call is made, they should be able to say
how many characters are configured for the result set column. Just as
when smallint + int is typed up to int (bigint would be safer) to hold
the result and all the smallint values are cast up, when "X" || zzz (a
column varchar(10)) is parsed, the result should be typed varchar(11)
or char(11). The RDBMS should provide this metadata from its parsing
process to JDBC.

Hannu Krosing

unread,

Mar 2, 2009, 5:17:28 PM3/2/09

to

On Fri, 2009-02-27 at 09:41 -0800, DGPickett wrote:

> Somehow, doing what the real money RDBMS do, the de-facto standard,

Could you describe, what do they do, especially for cases of really
unknown sizes ?

One solution would be to fetch all data into a buffer and actually
measure the longest field size.

> is
> hard for postgres dudes, or they have more important concerns. It
> makes the product seem shabby by comparison. Usually, that is enough
> to motivate frantic coding, to make open source appear just as good.

You are most welcome to "code frantically" :)

PostgreSQL "dudes" in general are not in business of appearing just as
good. Rather we are trying to produce the most advanced database
available, at least for values of "most advanced" that matter to us.

--
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training

--
Sent via pgsql-jdbc mailing list (pgsql...@postgresql.org)

Oliver Jowett

unread,

Mar 2, 2009, 7:02:51 PM3/2/09

to

DGPickett wrote:

> If you have to change calling code to access Postgres, it violates the
> original promise of JDBC: use me and be portable.

JDBC in general failed that requirement a long time ago, I'm afraid.
Whenever I'm writing a JDBC client, I invariably end up with a strategy
object that's customized for each RDBMS target. About the only thing
that is truely portable is the overall infrastructure of
Connection/PreparedStatement/ResultSet etc.

Anyway, if JDBC is meant to be a portable adaptor to existing RDBMSes,
then it should provide a superset of existing capabilities that degrades
nicely when some are not available on a particular RDBMS. It
demonstrably doesn't do that in this case.

You seem to consider JDBC as an end in itself, rather than just another
interface target to adapt an existing DB to. I don't think it's
reasonable or expected for the existing core server to jump through
hoops and change behavior solely for the sake of what the JDBC interface
happens to want. Obviously, if there's some extra data that's easy for
the server to provide, that's a good thing - and in fact we got a lot of
extra metadata in the v3 protocol specifically to help out JDBC. But it
does not seem to be easy (or even universally possible) for the server
to give us the metadata in the limited form that the JDBC interface
expects here.

Again, I ask: Do you have a better suggestion for how the JDBC driver
should handle this case, given the data it gets from the server?

Kris Jurka

unread,

Mar 2, 2009, 7:05:58 PM3/2/09

to

On Tue, 3 Mar 2009, Hannu Krosing wrote:

> Could you describe, what do they do, especially for cases of really
> unknown sizes ?
>
> One solution would be to fetch all data into a buffer and actually
> measure the longest field size.

That's not always an option. Consider the case of
PreparedStatement.getMetaData(), it must be able to return metadata
without executing the statement.

Kris Jurka

Віталій Тимчишин

unread,

Mar 3, 2009, 8:57:17 AM3/3/09

to

2009/3/2 DGPickett <DGPi...@aol.com>

On Mar 2, 4:10 am, oli...@opencloud.com (Oliver Jowett) wrote:
> Or you could change the calling code to
> bounds-check the metadata results before it goes off and tries to
> allocate the entire heap for one column in one row ..

If you have to change calling code to access Postgres, it violates the
original promise of JDBC: use me and be portable. If you want an 'I
don't know' response built into this API method, sell it to the JDBC
community, but until then, it is a requirement. Most RDBMS have not
stumbled when delivering on the requirement, so selling it as a
hardship seems difficult. As the real estate zoning lawyers say,
"This is, at best, a self-imposed hardship."

I'd say it works fairly well here. It is asked a maximum, it don't know - so it tells theoretical maximum possible and that is not precise, but correct value. And the jisql does have a bug - it tries to allocate memory equal to column maximum width. Imagine a varchar(1000000000) column I may use to store long descriptions without limits (or equal PostgreSQL "text" type). My values never larger then, say, 4KB, and jisql would try to allocate 1000000000 size buffer, that is stupid and would lead to problems. It should have some maximum limit (say, 1MB), because even for varchar(many) column, it usually do not have such a long values.