I'm not new to Java, but I'm extremely new to JDBC, SQL, and the whole
database "thang". I've been tasked with writing a class to implement our
own data interface, but accessing a JDBC database. One of the methods I
have to implement is (on a ResultSet-type object) getCount() which
should return the number of rows in the results.
I've been trying to figure out a way of doing this using JDBC, and I
haven't got very far yet. As this is only a prototype at the moment, I'm
probably okay to use any standard JDBC extensions etc if necessary,
although obviously I'd like to keep it as simple as possible. I'd hoped
that ResultSetMetaData would have something in it, or that I could
specify an OUT parameter in a PreparedStatement which could contain the
count. (I realise I can use OUT parameters with a CallableStatement, but
I can't force our users to add appropriate CallableStatements,
unfortunately.) I can get the count if I don't specify anything else,
but I really need the rest of the data as well...
I realise this may be an impossible mission, but I thought it would be
worth asking.
--
Jon Skeet - sk...@pobox.com
http://www.pobox.com/~skeet
Jon Skeet wrote:
> I'm not new to Java, but I'm extremely new to JDBC, SQL, and the whole
> database "thang". I've been tasked with writing a class to implement our
> own data interface, but accessing a JDBC database. One of the methods I
> have to implement is (on a ResultSet-type object) getCount() which
> should return the number of rows in the results.
Using jdbc-2.0, if you have a scrollable resultSet, you could do
resultSet.last();
resultSet.getRow();
If you do not have a scrollable cursor you have to go through it
line by line and count the results. Another feasable approach is
having the database return the number of rows in a seperate out
parameter, but you seem not to be able to do that (or have the
db-developers do that).
[...]
> count. (I realise I can use OUT parameters with a CallableStatement, but
> I can't force our users to add appropriate CallableStatements,
> unfortunately.) I can get the count if I don't specify anything else,
> but I really need the rest of the data as well...
Then why not make two database calls?
Guenther
Right. Am I right in saying that's likely to kill performance to a large
extent? Does the performance hit depend on the type of driver I'm using?
> Another feasable approach is
> having the database return the number of rows in a seperate out
> parameter, but you seem not to be able to do that (or have the
> db-developers do that).
Unfortunately not - especially as this will probably end up being a very
dynamic product, and we may not even know all the types of query
beforehand :(
> [...]
> > count. (I realise I can use OUT parameters with a CallableStatement, but
> > I can't force our users to add appropriate CallableStatements,
> > unfortunately.) I can get the count if I don't specify anything else,
> > but I really need the rest of the data as well...
>
> Then why not make two database calls?
In case the data changes in-between, for one thing. It may be feasible
though...
Just a thought - presumably I could use transaction support to make sure
the data I see doesn't change between queries, even if the underlying
database has?
This would also have a performance hit as well though, at a guess.
I'm sure there's a good reason why this fairly obviously useful piece of
information isn't provided - I'll just have to wait 'til I know more
about databases to find it out :)
getCount() would then be equivalent to size(). This is a workaround I
normally use.
This is of course if you can do with a disconnected resultset and do not
want a updatable resultset (as of JDBC 2.0).
Using transactions will have a performance hit.
AD
"Jon Skeet" <sk...@pobox.com> wrote in message
news:MPG.1509db65b...@10.1.1.51...
Unfortunately I may well be dealing with huge amounts of data - in which
case the client wants to know the size, but will probably only read
about ten items.
> This is of course if you can do with a disconnected resultset and do not
> want a updatable resultset (as of JDBC 2.0).
>
> Using transactions will have a performance hit.
Yup - it seems that whatever I do I'm going to get a performance hit :(
I'll try just going to the last row, finding the row number, then going
back to the beginning. With any luck that shouldn't be *too* bad.
Thanks for all the suggestions though :)
long getCount(Connection con, String table){
Statement stmt = null;
ResultSet rs = null;
int count = 0;
try{
stmt = con.createStatement();
rs = stmt.executeQuery("SELECT COUNT(*) FROM "+table);
if(rs.next())
count = rs.getInt(1);
}catch(SQLException e) {
// error handling
}finally{
// close rs &stmt
}
}
HTH
Valentin
Jon Skeet schrieb:
Unfortunately that doesn't let me see the actual data itself - I could
do with both the data itself and the size of the data I'm looking at.
Java 2 ONLY:
ResultSet has two functions last(), and getRow(). I think a combination of
these should provide the ResultSet "length" as required.
last() - move "pointer"/"cursor" to the last row in the ResultSet
getRow() - return the current row in the ResultSet
If memory serves, this reflects the current ResultSet, and so would reflect
changes caused by insertRow() and deleteRow()
I hope this helps. [Please let me know how you get on].
Cheers, Paul
Jon Skeet <sk...@pobox.com> wrote in message
news:MPG.1511d0ca9...@10.1.1.51...
"Ruud de Koter" <ruud_d...@hp.com> wrote in message
news:3AB0712B...@hp.com...
> Hi Paul,
>
> pa...@quidditic.co.uk wrote:
> >
> > Not having the Java Documentation handy, but:
> >
> > Java 2 ONLY:
> >
> > ResultSet has two functions last(), and getRow(). I think a combination
of
> > these should provide the ResultSet "length" as required.
> > last() - move "pointer"/"cursor" to the last row in the ResultSet
> > getRow() - return the current row in the ResultSet
> >
>
> Right, that is pretty easy. The point is, at least with the
implementations of
> ResultSet that I have used (Oracle oci), it is terribly slow. Think 3
seconds
> for finding uit the ResultSet contains 10,000 rows, on a 733 MHz Pentium.
>
> Anybody got better results?
>
> Regards,
>
> Ruud de Koter.
pa...@quidditic.co.uk wrote:
>
> Not having the Java Documentation handy, but:
>
> Java 2 ONLY:
>
> ResultSet has two functions last(), and getRow(). I think a combination of
> these should provide the ResultSet "length" as required.
> last() - move "pointer"/"cursor" to the last row in the ResultSet
> getRow() - return the current row in the ResultSet
>
Right, that is pretty easy. The point is, at least with the implementations of
ResultSet that I have used (Oracle oci), it is terribly slow. Think 3 seconds
for finding uit the ResultSet contains 10,000 rows, on a 733 MHz Pentium.
Anybody got better results?
Regards,
Ruud de Koter.
Yes, but then the data could change between the two queries, unless I do
it as a transaction, which would be expensive in itself.
a snapshot, but not over the wire,
that would be a select into a new table. This new table is used
then only by your application. So you don't need any
transactions to deal with this one.
You do one "select count(*)" and tell it the user. Ok.
There's no need to configure a ScrollBar, if you use JTable in a
JScrollBar. Filling the table model will fill the table and
adjust the size of the scrollbar.
How shall the user decide which records he wants?
You obviously will give him some of the columns in a JTable, I
think. You identify them by a key.
So you are able to let the user choose which records he wants
with their full data, so you retrieve only those from your "new"
table - all the time the same unchanging snapshot.
If you want, you do a refresh of the "new" table.
This I would call reasonable - of course you tell your user,
that it's a snapshot.
Good?
Regards,
Hartmut
Ah, but the ResultSet itself doesn't necessarily change. What I want to
do is ask the database for a snapshot of the results of a certain query,
which I *may* need all the results of, or I may just need some. There's
no point in transferring *all* the data over the wire (even if it's only
a virtual wire) if only some of it is needed, but I need to know how
much *potential* data there is.
Imagine a scrollable user interface - I want to know how big the
snapshot is in order to size the scrollbar.
I don't mind getting slightly out-of-date information, but what I *do*
mind is the being either a performance hit due to capturing all the
information within Java, or being told I have a certain amount of data
available and then having more or less than that.
Sorry, I still think this is a reasonable thing to ask.
If you need only the count and then maybe only some records,
then retrieve only the count and those records you need.
But sorry,
I don't see the sense between the number of records and their
retrival!
If you think it must be at any case, and you have very less
users and changing accesses on your database, then choose the
transaction solution.
But if you - normal situation - have many updates and deletes,
what's the use of knowing the exact count of the rows? Next
moment it has changed!
Why do you want to make the user believe that he is the only one
dealing with that data? A second is a long time in a database.
That's just the reality - show him!
Why not simply say to the user:
Now there are 100.000 records in the table (customers in the
company, dollars on the account - you will speak like the user,
clear).
Now - now there are only / even ...
You want to see the records that ...
Ok, here are them: ...
You want to see more / others / count them again ...
Ok, User, all reasonable things, that can be done with your
data, can be done by my program.
But:
You, User, are not the owner of this database. There are a lot
of other users, too. So don't be astonished if things change
faster as you think.
Wouldn't your user agree?
Technique is for the use of the user. And you don't need more
technique than the user needs.
Do you agree?
Regards,
Hartmut
> a snapshot, but not over the wire,
> that would be a select into a new table. This new table is used
> then only by your application. So you don't need any
> transactions to deal with this one.
> You do one "select count(*)" and tell it the user. Ok.
Create a new table for *every* request? Sounds expensive, when I could
be dealing with over a hundred of these requests per second...
Isn't "getting a snapshot" exactly what specifying a ResultSet to be
TYPE_SCROLL_INSENSITIVE does?
> There's no need to configure a ScrollBar, if you use JTable in a
> JScrollBar. Filling the table model will fill the table and
> adjust the size of the scrollbar.
I'm not actually doing a GUI at all - it was an entirely theoretical
example to show how it's useful to know the size of the data without
downloading it all. Note that in filling the table model you have to
show *all* of the data, which could be an extremely expensive operation.
<snip> Thanks for the most reasonable explanation I've read of why this
may be tricky. I'll probably allow the constructor of my class to say
whether or not to try to get the count using last() and accept the
performance hit or to not bother returning the size.
I do sympathize (sp?) with your wish, but the RDBMS or driver may not
actually have that information readily available.
In many situations, a ResultSet is not what it sounds like; a data
structure containing your result. Most of the time it is actually a query
plan, or a program. To get the next row (or couple of rows), the RDBMS
executes the query plan/program. In order to know how many rows there will
be, the RDBMS may have to actually execute the plan until it runs out of
data. This is where the performance hit comes into play.
To avoid doing this yourself, you could use a scrollable ResultSet if it is
available for your driver and your query type. Do rs.last() followed by
rs.getRow(). When you do this the driver/RDBMS *may* be able to get the row
count without actually fetching all data. It all depends on RDBMS
capabilities and how the data is actually stored and the type of query. In
some cases the RDBMS *have* to get/buffer all data internally to get at the
last row, inducing a performance hit.
Buffering data in this way may lead to other transactional problems which I
don't want to discuss here.
My advice is that you try to structure your applications so that you don't
have to know the size of the ResultSet.
/Per Schröder
http://www.mimer.com
I think you haven't yet got my main point.
Your user shouldn't decide to accept the performance hit,
because this would concern all the other (thousands?) users of
that database server, too.
Glad to see that Per is aggreeing with me:
structure your applications so that you don't have to know the
size of the ResultSet.
Sorry, that my more practical advice wasn't reasonable enough
for you.
Nethertheless,
good luck!
Hartmut
It's all very well *saying* that (and indeed our application can *cope*
with not knowing the size of the ResultSet) but our application
certainly *benefits* from knowing the size, as then more accurate
information can be given to the user. If that information *is* given, it
needs to be accurate (for parts where it's not directly fed back to a
user, but to other parts of the program). What wasn't explained before
Per's post was *why* this seemingly easy piece of information wasn't
available.
That's because the 'why' is very complex. One reason is because it depends on the
specific driver and backend implementations, which the vendors generally consider
proprietary. Another is JDBC is a standardized layer on top of wildly divergent native
interfaces (also, not always publicly available).
Given all of that, "select count(*) from table" is the most portable ... that is, if the
backend supports count(*) (this unfortunately can't be determined with MetaData, except
by using SQL support level info, which isn't reliable.)
--
Lee Fesperman, FFE Software, Inc. (http://www.firstsql.com)
===================================================================
* Check out Database Debunkings (http://www.firstsql.com/dbdebunk/)
* "The Forum Where Database Matters Are Set Straight"