Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Getting the size of a result set

1,625 views
Skip to first unread message

Jon Skeet

unread,
Mar 2, 2001, 9:42:28 AM3/2/01
to
Hi everyone,

I'm not new to Java, but I'm extremely new to JDBC, SQL, and the whole
database "thang". I've been tasked with writing a class to implement our
own data interface, but accessing a JDBC database. One of the methods I
have to implement is (on a ResultSet-type object) getCount() which
should return the number of rows in the results.

I've been trying to figure out a way of doing this using JDBC, and I
haven't got very far yet. As this is only a prototype at the moment, I'm
probably okay to use any standard JDBC extensions etc if necessary,
although obviously I'd like to keep it as simple as possible. I'd hoped
that ResultSetMetaData would have something in it, or that I could
specify an OUT parameter in a PreparedStatement which could contain the
count. (I realise I can use OUT parameters with a CallableStatement, but
I can't force our users to add appropriate CallableStatements,
unfortunately.) I can get the count if I don't specify anything else,
but I really need the rest of the data as well...

I realise this may be an impossible mission, but I thought it would be
worth asking.

--
Jon Skeet - sk...@pobox.com
http://www.pobox.com/~skeet

Guenther Grau

unread,
Mar 2, 2001, 12:06:51 PM3/2/01
to
Hi,

Jon Skeet wrote:
> I'm not new to Java, but I'm extremely new to JDBC, SQL, and the whole
> database "thang". I've been tasked with writing a class to implement our
> own data interface, but accessing a JDBC database. One of the methods I
> have to implement is (on a ResultSet-type object) getCount() which
> should return the number of rows in the results.

Using jdbc-2.0, if you have a scrollable resultSet, you could do

resultSet.last();
resultSet.getRow();

If you do not have a scrollable cursor you have to go through it
line by line and count the results. Another feasable approach is
having the database return the number of rows in a seperate out
parameter, but you seem not to be able to do that (or have the
db-developers do that).

[...]


> count. (I realise I can use OUT parameters with a CallableStatement, but
> I can't force our users to add appropriate CallableStatements,
> unfortunately.) I can get the count if I don't specify anything else,
> but I really need the rest of the data as well...

Then why not make two database calls?

Guenther

Jon Skeet

unread,
Mar 2, 2001, 11:29:02 AM3/2/01
to
Guenther Grau <Guenth...@ubs.com> wrote:
> Jon Skeet wrote:
> > I'm not new to Java, but I'm extremely new to JDBC, SQL, and the whole
> > database "thang". I've been tasked with writing a class to implement our
> > own data interface, but accessing a JDBC database. One of the methods I
> > have to implement is (on a ResultSet-type object) getCount() which
> > should return the number of rows in the results.
>
> Using jdbc-2.0, if you have a scrollable resultSet, you could do
>
> resultSet.last();
> resultSet.getRow();
>
> If you do not have a scrollable cursor you have to go through it
> line by line and count the results.

Right. Am I right in saying that's likely to kill performance to a large
extent? Does the performance hit depend on the type of driver I'm using?

> Another feasable approach is
> having the database return the number of rows in a seperate out
> parameter, but you seem not to be able to do that (or have the
> db-developers do that).

Unfortunately not - especially as this will probably end up being a very
dynamic product, and we may not even know all the types of query
beforehand :(



> [...]
> > count. (I realise I can use OUT parameters with a CallableStatement, but
> > I can't force our users to add appropriate CallableStatements,
> > unfortunately.) I can get the count if I don't specify anything else,
> > but I really need the rest of the data as well...
>
> Then why not make two database calls?

In case the data changes in-between, for one thing. It may be feasible
though...

Jon Skeet

unread,
Mar 2, 2001, 11:32:12 AM3/2/01
to
Jon Skeet <sk...@pobox.com> wrote:
> > > count. (I realise I can use OUT parameters with a CallableStatement, but
> > > I can't force our users to add appropriate CallableStatements,
> > > unfortunately.) I can get the count if I don't specify anything else,
> > > but I really need the rest of the data as well...
> >
> > Then why not make two database calls?
>
> In case the data changes in-between, for one thing. It may be feasible
> though...

Just a thought - presumably I could use transaction support to make sure
the data I see doesn't change between queries, even if the underlying
database has?

This would also have a performance hit as well though, at a guess.

I'm sure there's a good reason why this fairly obviously useful piece of
information isn't provided - I'll just have to wait 'til I know more
about databases to find it out :)

AD

unread,
Mar 2, 2001, 7:27:37 PM3/2/01
to
How about storing the resultset in a Vector or a ArrayList and close the
resultset.

getCount() would then be equivalent to size(). This is a workaround I
normally use.

This is of course if you can do with a disconnected resultset and do not
want a updatable resultset (as of JDBC 2.0).

Using transactions will have a performance hit.

AD


"Jon Skeet" <sk...@pobox.com> wrote in message
news:MPG.1509db65b...@10.1.1.51...

bhasso

unread,
Mar 2, 2001, 11:50:50 PM3/2/01
to
I have code at work that returns the result set and other
information. I would tell you now, but i have not looked at
that code in so long that i do not remember. if you can wait
for a possiable answer, e-mail me at bha...@carlson.com monday
and i will look

Jon Skeet

unread,
Mar 3, 2001, 3:15:13 AM3/3/01
to
AD <a...@pobox.com> wrote:
> How about storing the resultset in a Vector or a ArrayList and close the
> resultset.
>
> getCount() would then be equivalent to size(). This is a workaround I
> normally use.

Unfortunately I may well be dealing with huge amounts of data - in which
case the client wants to know the size, but will probably only read
about ten items.



> This is of course if you can do with a disconnected resultset and do not
> want a updatable resultset (as of JDBC 2.0).
>
> Using transactions will have a performance hit.

Yup - it seems that whatever I do I'm going to get a performance hit :(

I'll try just going to the last row, finding the row number, then going
back to the beginning. With any luck that shouldn't be *too* bad.

Thanks for all the suggestions though :)

Valentin Todorov

unread,
Mar 8, 2001, 12:02:04 PM3/8/01
to
Try this:

long getCount(Connection con, String table){
Statement stmt = null;
ResultSet rs = null;
int count = 0;
try{
stmt = con.createStatement();
rs = stmt.executeQuery("SELECT COUNT(*) FROM "+table);
if(rs.next())
count = rs.getInt(1);
}catch(SQLException e) {
// error handling
}finally{
// close rs &stmt
}
}

HTH
Valentin


Jon Skeet schrieb:

Jon Skeet

unread,
Mar 8, 2001, 12:25:05 PM3/8/01
to
Valentin Todorov <valentin...@sea.ericsson.se> wrote:
> Try this:
>
> long getCount(Connection con, String table){
> Statement stmt = null;
> ResultSet rs = null;
> int count = 0;
> try{
> stmt = con.createStatement();
> rs = stmt.executeQuery("SELECT COUNT(*) FROM "+table);
> if(rs.next())
> count = rs.getInt(1);
> }catch(SQLException e) {
> // error handling
> }finally{
> // close rs &stmt
> }
> }

Unfortunately that doesn't let me see the actual data itself - I could
do with both the data itself and the size of the data I'm looking at.

pa...@quidditic.co.uk

unread,
Mar 14, 2001, 6:05:43 PM3/14/01
to
Not having the Java Documentation handy, but:

Java 2 ONLY:

ResultSet has two functions last(), and getRow(). I think a combination of
these should provide the ResultSet "length" as required.
last() - move "pointer"/"cursor" to the last row in the ResultSet
getRow() - return the current row in the ResultSet

If memory serves, this reflects the current ResultSet, and so would reflect
changes caused by insertRow() and deleteRow()

I hope this helps. [Please let me know how you get on].

Cheers, Paul


Jon Skeet <sk...@pobox.com> wrote in message

news:MPG.1511d0ca9...@10.1.1.51...

crg

unread,
Mar 15, 2001, 7:26:12 AM3/15/01
to
Could you do a count query first, then go get the result set?


"Ruud de Koter" <ruud_d...@hp.com> wrote in message
news:3AB0712B...@hp.com...
> Hi Paul,


>
> pa...@quidditic.co.uk wrote:
> >
> > Not having the Java Documentation handy, but:
> >
> > Java 2 ONLY:
> >
> > ResultSet has two functions last(), and getRow(). I think a combination
of
> > these should provide the ResultSet "length" as required.
> > last() - move "pointer"/"cursor" to the last row in the ResultSet
> > getRow() - return the current row in the ResultSet
> >
>

> Right, that is pretty easy. The point is, at least with the
implementations of
> ResultSet that I have used (Oracle oci), it is terribly slow. Think 3
seconds
> for finding uit the ResultSet contains 10,000 rows, on a 733 MHz Pentium.
>
> Anybody got better results?
>
> Regards,
>
> Ruud de Koter.

Ruud de Koter

unread,
Mar 15, 2001, 2:37:15 AM3/15/01
to
Hi Paul,

pa...@quidditic.co.uk wrote:
>
> Not having the Java Documentation handy, but:
>
> Java 2 ONLY:
>
> ResultSet has two functions last(), and getRow(). I think a combination of
> these should provide the ResultSet "length" as required.
> last() - move "pointer"/"cursor" to the last row in the ResultSet
> getRow() - return the current row in the ResultSet
>

Right, that is pretty easy. The point is, at least with the implementations of


ResultSet that I have used (Oracle oci), it is terribly slow. Think 3 seconds
for finding uit the ResultSet contains 10,000 rows, on a 733 MHz Pentium.

Anybody got better results?

Regards,

Ruud de Koter.

Jon Skeet

unread,
Mar 15, 2001, 7:42:30 AM3/15/01
to
crg <craig.gould@*REMOVE.THIS*bt.com> wrote:
> Could you do a count query first, then go get the result set?

Yes, but then the data could change between the two queries, unless I do
it as a transaction, which would be expensive in itself.

Hartmut

unread,
Mar 15, 2001, 9:13:16 AM3/15/01
to
Ok, now you say more clear, what you want.

a snapshot, but not over the wire,
that would be a select into a new table. This new table is used
then only by your application. So you don't need any
transactions to deal with this one.
You do one "select count(*)" and tell it the user. Ok.

There's no need to configure a ScrollBar, if you use JTable in a
JScrollBar. Filling the table model will fill the table and
adjust the size of the scrollbar.

How shall the user decide which records he wants?
You obviously will give him some of the columns in a JTable, I
think. You identify them by a key.
So you are able to let the user choose which records he wants
with their full data, so you retrieve only those from your "new"
table - all the time the same unchanging snapshot.

If you want, you do a refresh of the "new" table.

This I would call reasonable - of course you tell your user,
that it's a snapshot.

Good?

Regards,
Hartmut


Jon Skeet

unread,
Mar 15, 2001, 9:14:29 AM3/15/01
to
Hartmut <%48artmutNO%4S...@sun.partner.remarq.com.invalid> wrote:

> But if you - normal situation - have many updates and deletes,
> what's the use of knowing the exact count of the rows? Next
> moment it has changed!

Ah, but the ResultSet itself doesn't necessarily change. What I want to
do is ask the database for a snapshot of the results of a certain query,
which I *may* need all the results of, or I may just need some. There's
no point in transferring *all* the data over the wire (even if it's only
a virtual wire) if only some of it is needed, but I need to know how
much *potential* data there is.

Imagine a scrollable user interface - I want to know how big the
snapshot is in order to size the scrollbar.

I don't mind getting slightly out-of-date information, but what I *do*
mind is the being either a performance hit due to capturing all the
information within Java, or being told I have a certain amount of data
available and then having more or less than that.

Sorry, I still think this is a reasonable thing to ask.

Hartmut

unread,
Mar 15, 2001, 8:40:56 AM3/15/01
to
If you really need all record, then retrieve them all.

If you need only the count and then maybe only some records,
then retrieve only the count and those records you need.

But sorry,

I don't see the sense between the number of records and their
retrival!

If you think it must be at any case, and you have very less
users and changing accesses on your database, then choose the
transaction solution.

But if you - normal situation - have many updates and deletes,
what's the use of knowing the exact count of the rows? Next
moment it has changed!

Why do you want to make the user believe that he is the only one
dealing with that data? A second is a long time in a database.
That's just the reality - show him!

Why not simply say to the user:
Now there are 100.000 records in the table (customers in the
company, dollars on the account - you will speak like the user,
clear).
Now - now there are only / even ...
You want to see the records that ...
Ok, here are them: ...
You want to see more / others / count them again ...
Ok, User, all reasonable things, that can be done with your
data, can be done by my program.
But:
You, User, are not the owner of this database. There are a lot
of other users, too. So don't be astonished if things change
faster as you think.
Wouldn't your user agree?

Technique is for the use of the user. And you don't need more
technique than the user needs.

Do you agree?

Regards,
Hartmut


Jon Skeet

unread,
Mar 15, 2001, 11:53:03 AM3/15/01
to
Hartmut <%48artmutNO%4S...@sun.partner.remarq.com.invalid> wrote:
> Ok, now you say more clear, what you want.

> a snapshot, but not over the wire,
> that would be a select into a new table. This new table is used
> then only by your application. So you don't need any
> transactions to deal with this one.
> You do one "select count(*)" and tell it the user. Ok.

Create a new table for *every* request? Sounds expensive, when I could
be dealing with over a hundred of these requests per second...

Isn't "getting a snapshot" exactly what specifying a ResultSet to be
TYPE_SCROLL_INSENSITIVE does?



> There's no need to configure a ScrollBar, if you use JTable in a
> JScrollBar. Filling the table model will fill the table and
> adjust the size of the scrollbar.

I'm not actually doing a GUI at all - it was an entirely theoretical
example to show how it's useful to know the size of the data without
downloading it all. Note that in filling the table model you have to
show *all* of the data, which could be an extremely expensive operation.

Jon Skeet

unread,
Mar 16, 2001, 6:53:47 AM3/16/01
to
Per Schröder <p...@nospam.mimer.se> wrote:
> I do sympathize (sp?) with your wish, but the RDBMS or driver may not
> actually have that information readily available.

<snip> Thanks for the most reasonable explanation I've read of why this
may be tricky. I'll probably allow the constructor of my class to say
whether or not to try to get the count using last() and accept the
performance hit or to not bother returning the size.

Per Schröder

unread,
Mar 16, 2001, 7:02:11 AM3/16/01
to
Jon Skeet <sk...@pobox.com> wrote in
<MPG.151adea1a...@10.1.1.51>:

>I don't mind getting slightly out-of-date information, but what I *do*
>mind is the being either a performance hit due to capturing all the
>information within Java, or being told I have a certain amount of data
>available and then having more or less than that.
>
>Sorry, I still think this is a reasonable thing to ask.
>

I do sympathize (sp?) with your wish, but the RDBMS or driver may not

actually have that information readily available.

In many situations, a ResultSet is not what it sounds like; a data
structure containing your result. Most of the time it is actually a query
plan, or a program. To get the next row (or couple of rows), the RDBMS
executes the query plan/program. In order to know how many rows there will
be, the RDBMS may have to actually execute the plan until it runs out of
data. This is where the performance hit comes into play.

To avoid doing this yourself, you could use a scrollable ResultSet if it is
available for your driver and your query type. Do rs.last() followed by
rs.getRow(). When you do this the driver/RDBMS *may* be able to get the row
count without actually fetching all data. It all depends on RDBMS
capabilities and how the data is actually stored and the type of query. In
some cases the RDBMS *have* to get/buffer all data internally to get at the
last row, inducing a performance hit.

Buffering data in this way may lead to other transactional problems which I
don't want to discuss here.

My advice is that you try to structure your applications so that you don't
have to know the size of the ResultSet.

/Per Schröder
http://www.mimer.com

Hartmut

unread,
Mar 19, 2001, 5:10:46 AM3/19/01
to
Jon,

I think you haven't yet got my main point.

Your user shouldn't decide to accept the performance hit,
because this would concern all the other (thousands?) users of
that database server, too.

Glad to see that Per is aggreeing with me:


structure your applications so that you don't have to know the
size of the ResultSet.

Sorry, that my more practical advice wasn't reasonable enough
for you.

Nethertheless,
good luck!

Hartmut


Jon Skeet

unread,
Mar 19, 2001, 5:38:14 AM3/19/01
to
Hartmut <%48artmutNO%4S...@sun.partner.remarq.com.invalid> wrote:
> I think you haven't yet got my main point.
>
> Your user shouldn't decide to accept the performance hit,
> because this would concern all the other (thousands?) users of
> that database server, too.
>
> Glad to see that Per is aggreeing with me:
> structure your applications so that you don't have to know the
> size of the ResultSet.

It's all very well *saying* that (and indeed our application can *cope*
with not knowing the size of the ResultSet) but our application
certainly *benefits* from knowing the size, as then more accurate
information can be given to the user. If that information *is* given, it
needs to be accurate (for parts where it's not directly fed back to a
user, but to other parts of the program). What wasn't explained before
Per's post was *why* this seemingly easy piece of information wasn't
available.

Lee Fesperman

unread,
Mar 19, 2001, 7:33:15 PM3/19/01
to
Jon Skeet wrote:
>
> It's all very well *saying* that (and indeed our application can *cope*
> with not knowing the size of the ResultSet) but our application
> certainly *benefits* from knowing the size, as then more accurate
> information can be given to the user. If that information *is* given, it
> needs to be accurate (for parts where it's not directly fed back to a
> user, but to other parts of the program). What wasn't explained before
> Per's post was *why* this seemingly easy piece of information wasn't
> available.

That's because the 'why' is very complex. One reason is because it depends on the
specific driver and backend implementations, which the vendors generally consider
proprietary. Another is JDBC is a standardized layer on top of wildly divergent native
interfaces (also, not always publicly available).

Given all of that, "select count(*) from table" is the most portable ... that is, if the
backend supports count(*) (this unfortunately can't be determined with MetaData, except
by using SQL support level info, which isn't reliable.)

--
Lee Fesperman, FFE Software, Inc. (http://www.firstsql.com)
===================================================================
* Check out Database Debunkings (http://www.firstsql.com/dbdebunk/)
* "The Forum Where Database Matters Are Set Straight"

jagil...@gmail.com

unread,
Feb 15, 2016, 12:15:31 PM2/15/16
to
Hi, if you want to make a count of rows is better always use COUNT(1) than
COUNT(*). Because the * implies project all the atributes and the 1 avoid project the attributes.

Arne Vajhøj

unread,
Feb 15, 2016, 12:53:16 PM2/15/16
to
On 2/15/2016 12:15 PM, jagil...@gmail.com wrote:
> El jueves, 8 de marzo de 2001, 18:02:04 (UTC+1), Valentin Todorov escribió:
>> Try this:

> Hi, if you want to make a count of rows is better always use COUNT(1) than
> COUNT(*). Because the * implies project all the atributes and the 1 avoid project the attributes.

1) The thread you are replying to is 15 years old.

2) Your advice is *not* universally accepted. Most databases seems to
do the same for the two.

Arne


Rhino

unread,
Oct 16, 2016, 9:49:48 PM10/16/16
to
I realize this is months late but, in case anyone reads this later, I
would add a caution about this technique. While this technique would get
you the count of the rows in the table at the point in time where you
executed the query, that count *could* change at any time unless the
table was locked against updates. Therefore, if you have in mind some
kind of processing where you first get a count of rows and then do an
insert, update or delete operation that depends on that row count being
accurate, remember that it will only be accurate if no inserts, updates
or deletes have taken place between the time you counted the rows and
the time you do the updates/deletes/inserts.

In my view, it's far better to write the logic using cursors where you
build a result set that gets all the rows that qualify to be updated or
deleted, regardless of the number, and then, while the rows are still
locked by the cursor, do the updates or deletes (or, conceivably
inserts). The cursor ensures that the rows you want to work with are
locked so they can't change between the time you read them and the time
you change them.


--
Rhino

Arne Vajhøj

unread,
Oct 19, 2016, 11:50:58 PM10/19/16
to
On 10/16/2016 9:49 PM, Rhino wrote:
> On 2016-02-15 12:15 PM, jagil...@gmail.com wrote:
>> El jueves, 8 de marzo de 2001, 18:02:04 (UTC+1), Valentin Todorov
>> escribió:
>>> Jon Skeet schrieb:
>>>> I'm not new to Java, but I'm extremely new to JDBC, SQL, and the whole
>>>> database "thang". I've been tasked with writing a class to implement
>>>> our
>>>> own data interface, but accessing a JDBC database. One of the methods I
>>>> have to implement is (on a ResultSet-type object) getCount() which
>>>> should return the number of rows in the results.
>>>>
>>>> I've been trying to figure out a way of doing this using JDBC, and I
>>>> haven't got very far yet. As this is only a prototype at the moment,
>>>> I'm
>>>> probably okay to use any standard JDBC extensions etc if necessary,
>>>> although obviously I'd like to keep it as simple as possible. I'd hoped
>>>> that ResultSetMetaData would have something in it, or that I could
>>>> specify an OUT parameter in a PreparedStatement which could contain the
>>>> count. (I realise I can use OUT parameters with a CallableStatement,
>>>> but
>>>> I can't force our users to add appropriate CallableStatements,
>>>> unfortunately.) I can get the count if I don't specify anything else,
>>>> but I really need the rest of the data as well...
>>> Try this:
>>>
>>> long getCount(Connection con, String table){
>>> Statement stmt = null;
>>> ResultSet rs = null;
>>> int count = 0;
>>> try{
>>> stmt = con.createStatement();
>>> rs = stmt.executeQuery("SELECT COUNT(*) FROM "+table);
>>> if(rs.next())
>>> count = rs.getInt(1);
>>> }catch(SQLException e) {
>>> // error handling
>>> }finally{
>>> // close rs &stmt
>>> }
>>> }
>> Hi, if you want to make a count of rows is better always use COUNT(1)
>> than
>> COUNT(*). Because the * implies project all the atributes and the 1
>> avoid project the attributes.

First there is no reason to believe that the database would treat
COUNT(*) and COUNT(1) differently as they have the same semantics.
SELECT * and SELECT 1 behaves differently, but that does not by magic
apply to other contexts using the same lexical elements.

Second there are lots of tests on the internet showing that for
specific databases and versions there are no difference.

> I realize this is months late but, in case anyone reads this later, I
> would add a caution about this technique. While this technique would get
> you the count of the rows in the table at the point in time where you
> executed the query, that count *could* change at any time unless the
> table was locked against updates. Therefore, if you have in mind some
> kind of processing where you first get a count of rows and then do an
> insert, update or delete operation that depends on that row count being
> accurate, remember that it will only be accurate if no inserts, updates
> or deletes have taken place between the time you counted the rows and
> the time you do the updates/deletes/inserts.
>
> In my view, it's far better to write the logic using cursors where you
> build a result set that gets all the rows that qualify to be updated or
> deleted, regardless of the number, and then, while the rows are still
> locked by the cursor, do the updates or deletes (or, conceivably
> inserts). The cursor ensures that the rows you want to work with are
> locked so they can't change between the time you read them and the time
> you change them.

Behavior for both COUNT and CURSOR depends on the database, the SQL
and especially transaction isolation level.

Arne




0 new messages