Re: Retrieving row count from a select query

2,006 views
Skip to first unread message

Lukas Eder

unread,
Jan 3, 2013, 2:32:59 PM1/3/13
to jooq...@googlegroups.com
Hello,

What you're trying to do is currently not possible through the jOOQ API, as jOOQ doesn't expose the SelectQuery's underlying state. Changing these APIs to expose more internal state is on the roadmap:

Some comments about your attempts:

 * Convert query to string
 * Create a new query from the string (basically create a copy)
 * Call query.getSelect().add(Factory.count()) --> seems to have no effect

This doesn't work, as getSelect() returns a copy of the underlying state
 
 * Create a copy as above
 * Call query.getQuery().addSelect(Factory.count())
 * While this does work with an empty select() call it does not work with select(<fields>) as the existing items cannot be removed

This doesn't work as you're modifying the underlying state "for good"

Right now, I see these options for you - none of which are really very nice:

- Use reflection to access the SelectQuery's underlying state
- Patch jOOQ and expose some internals
- Re-create the same SelectQuery twice, once per query

Another, better option is to use Factory.countOver() - the COUNT(*) OVER() window function - along with your regular query if your database supports window functions. This would prevent the round trip of executing two queries

Cheers
Lukas

digulla

unread,
Feb 14, 2013, 11:44:45 AM2/14/13
to jooq...@googlegroups.com, isomet...@googlemail.com
Since the https://github.com/jOOQ/jOOQ/issues/1492 won't make it into 3.0, how about a simple "asCount()" method that returns a new query with anything in the select list removed and Count(1/*) added?

On Friday, January 4, 2013 2:14:15 PM UTC+1, isomet...@googlemail.com wrote:
Okay, thanks for your reply. I will wait for jooq 3.0 then ..

Lukas Eder

unread,
Feb 14, 2013, 12:41:08 PM2/14/13
to jooq...@googlegroups.com, isomet...@googlemail.com, digulla
> Since the https://github.com/jOOQ/jOOQ/issues/1492 won't make it into 3.0,

Yes, I will send a mail with some directions in the next couple of
days as I'm about to release RC1 for jOOQ 3.0...

> how about a simple "asCount()" method that returns a new query with anything
> in the select list removed and Count(1/*) added?

;-) how to put this...
I "feel" that this kind of method would need deprecation right after
introduction.

More seriously: I'd rather get these things right by cleanly
separating the DSL API from the Model API where the model will allow
for arbitrary manipulation...

Cheers
Lukas

2013/2/14 digulla <adig...@gmail.com>:

digulla

unread,
Feb 15, 2013, 4:17:51 AM2/15/13
to jooq...@googlegroups.com, isomet...@googlemail.com, digulla
On Thursday, February 14, 2013 6:41:08 PM UTC+1, Lukas Eder wrote:

;-) how to put this... 
I "feel" that this kind of method would need deprecation right after
introduction.

I agree but how much code would that be? 4-5 lines? Also, it would be easy to keep around (it shouldn't break future versions or keep you from changing other important parts of the API) plus it's a singular, very useful feature since the "turn complete query into select found" is something many apps need to do.

So in this special case, my feeling is that the ROI is positive.

Regards,

A. Digulla

Lukas Eder

unread,
Feb 15, 2013, 5:01:41 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger, digulla
>> ;-) how to put this...
>> I "feel" that this kind of method would need deprecation right after
>> introduction.
>
> I agree but how much code would that be? 4-5 lines? Also, it would be easy
> to keep around (it shouldn't break future versions or keep you from changing
> other important parts of the API)

Hmm, 4-5 lines of implementation is not the problem. The problem is an
API that is tricky to maintain, because the initial requirements are
fuzzy, so far. Already, the Select.getSelect() method shouldn't be
there.

> plus it's a singular, very useful feature
> since the "turn complete query into select found" is something many apps
> need to do.

I agree, it can be useful.

OK, let's think about this more generally, then. What users often need
is a way to "patch" the projection (<select list>) of an otherwise
"stable" SELECT statement. I.e. given that...

<query specification> ::=
SELECT [ <set quantifier> ] <select list> <table expression>

<table expression> ::=
<from clause>
[ <where clause> ]
[ <group by clause> ]
[ <having clause> ]

... then, in the short run, being able to modify the <select list>
adds most immediate value.

In the long run, I would like to expose all of these syntax elements
in a "Model" with a couple of nice foreseeable properties:
- Model is a QueryPart (can render, bind)
- Model is Serializable
- Model is Cloneable (in some way)
- Model is probably mutable
- Model exposes its component Models (<select list>, <table
expression>) in two modes:

1. "as-is", i.e. in a manipulable way
2. "as-will-be-rendered", i.e. in a non-manipulable way

This is registered as #2198:
https://github.com/jOOQ/jOOQ/issues/2198

In the short run, however, short of having Cloneable models, there is
only ever one <query specification> instance to operate on. No matter
how we want to express the <select list>, e.g.

- SELECT a, b, c
- SELECT *,
- SELECT count(*),
- SELECT 1

It cannot be done without modifying the original query, which is
already possible today, actually. This works for me:

------------------------------------------------
Select<?> select =
create().select(TBook_ID(), TBook_TITLE())
.from(TBook());
System.out.println(select);

select.getSelect().add(TBook_AUTHOR_ID());
select.getSelect().remove(TBook_ID());
System.out.println(select);
------------------------------------------------

The above prints:
select "PUBLIC"."T_BOOK"."ID", "PUBLIC"."T_BOOK"."TITLE" from "PUBLIC"."T_BOOK"
select "PUBLIC"."T_BOOK"."TITLE", "PUBLIC"."T_BOOK"."AUTHOR_ID" from
"PUBLIC"."T_BOOK"

The OP's problem resides in the fact that they have provided jOOQ with
an empty <select list>:
Select<?> query = jooq.select(<empty/unknown>).from(table)....<unknown>

Unfortunately, the getSelect() method is abused for two purposes:
1. Expose the internal <select list>
2. Provide a constant <select list> derived from the <table
expression>, in case there is no internal <select list> (i.e. it is
empty)

I think, with this in mind, there is no need for further action right now?

Cheers
Lukas

Lukas Eder

unread,
Feb 15, 2013, 5:03:27 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger
Martin,

A short summary of the previous mail, responding to the original problem

> What I tried so far is this:
> * Convert query to string
> * Create a new query from the string (basically create a copy)
> * Call query.getSelect().add(Factory.count()) --> seems to have no effect

The last should work, if you don't create a select statement with an
empty initial projection. Instead of:

Select<?> query = jooq.select(<empty/unknown>).from(table)....<unknown>

Do this:

Select<?> query = jooq.selectOne().from(table)....<unknown>

You should be able to modify the projection through query.getSelect(), then

digulla

unread,
Feb 15, 2013, 6:26:38 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger, digulla
In the short run, however, short of having Cloneable models...

Ah, that was the crucial piece of information that I was missing: You can't easily clone the model. I thought you could because all other parts of jOOQ I looked at can do this.

So another approach would be to add 

int Query.fetchCount()

to the API. The implementation should build SQL which ignores anything between "select" and "from". The implementation of this would be in the dialect.

I'm not sure if that always gives the same results when you use grouping and similar things, though. Maybe you can throw an error when the SQL contains dangerous elements.

Regards,

A. Digulla

Lukas Eder

unread,
Feb 15, 2013, 7:47:55 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger, digulla
>> In the short run, however, short of having Cloneable models...
>
> Ah, that was the crucial piece of information that I was missing: You can't
> easily clone the model.

Unfortunately, no. Of course, you can always clone QueryParts by
serialising / deserialising them, but that seems a bit over the top.

> I thought you could because all other parts of jOOQ
> I looked at can do this.

Hmm, none of them are Cloneable. What did you have in mind?

> So another approach would be to add
>
> int Query.fetchCount()
>
> to the API. The implementation should build SQL which ignores anything
> between "select" and "from". The implementation of this would be in the
> dialect.

I guess I could live with such a solution. It would have to be put on
org.jooq.Select, though, not on Query/ResultQuery. The latter could
wrap plain SQL queries or non-SELECT queries, where patching the
projection doesn't make sense.

And also, Factory.fetchCount(Select<?>) could be useful. In that
event, jOOQ's internals could take care of not rendering the
projection and avoiding variable binding originating from the
projection.

> I'm not sure if that always gives the same results when you use grouping and
> similar things, though. Maybe you can throw an error when the SQL contains
> dangerous elements.

There are still open questions. The projection has an implicit effect
on the table expression in many cases, e.g.:

- When the DISTINCT keyword is applied
- When UNIONs are involved
- When Oracle PIVOT tables are used
- When some sorts of Oracle hints are applied
- When ORDER BY clauses reference column indexes from the projection
- When CUBRID / Sybase window function ORDER BY clauses reference
column indexes from the projection
- When paging is applied and simulated with ROW_NUMBER() calculations
- When paging is applied with the TOP clause

Maybe it would be sufficient to document this in the Javadoc... I'll
register #2200 for this:
https://github.com/jOOQ/jOOQ/issues/2200

Thanks for sharing your ideas.

Cheers
Lukas

Lukas Eder

unread,
Feb 15, 2013, 7:53:38 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger, digulla
> Maybe it would be sufficient to document this in the Javadoc... I'll
> register #2200 for this:
> https://github.com/jOOQ/jOOQ/issues/2200

This probably makes it into jOOQ 3.0 and 2.7...

digulla

unread,
Feb 15, 2013, 8:18:46 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger, digulla
On Friday, February 15, 2013 1:47:55 PM UTC+1, Lukas Eder wrote:

> I thought you could because all other parts of jOOQ
> I looked at can do this.

Hmm, none of them are Cloneable. What did you have in mind?

You use copy constructors in Factory and in the configuration.

Regards,

A. Digulla 

Lukas Eder

unread,
Feb 15, 2013, 8:22:26 AM2/15/13
to jooq...@googlegroups.com
>> > I thought you could because all other parts of jOOQ
>> > I looked at can do this.
>>
>> Hmm, none of them are Cloneable. What did you have in mind?
>
> You use copy constructors in Factory and in the configuration.

Heh, OK, the Factory is "cloneable". But all the QueryParts aren't :-)
Many QueryParts are heavily nested (e.g. Conditions). Adding true
clone support might be quite a feature

Cheers
Lukas

Durchholz, Joachim

unread,
Feb 15, 2013, 10:08:03 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger, digulla
>> So another approach would be to add
>>
>> int Query.fetchCount()
>>
>> to the API. The implementation should build SQL which ignores anything
>> between "select" and "from". The implementation of this would be in
>> the dialect.
>
> I guess I could live with such a solution.

One of the recurring requirements that I have is a paged display with a result size count, so varying a query from "give me the result columns" to "just give me the result count" is something that I could make use of.

I suspect I'm not alone with this use case.

Aside note: I'm not sure whether "fetch" is the right terminology here, it is ringing some unappropriate overtones for me (in the sense of "page fetching" and "fetch limits"). I'd have designed it as a query transformer that takes a query and returns its SELECT COUNT(*) variation (so it can be reused, prepared with values, etc).
Not knowing the API that this is going into, I may be totally off the mark though. (I still haven't found the time to migrate to Jooq :-((( so my knowledge is very limited, and to be taken as an outsider's perspective.)

Lukas Eder

unread,
Feb 15, 2013, 10:21:35 AM2/15/13
to jooq...@googlegroups.com, Martin Steiger, digulla, Joachim
>>> So another approach would be to add
>>>
>>> int Query.fetchCount()
>>>
>>> to the API. The implementation should build SQL which ignores anything
>>> between "select" and "from". The implementation of this would be in
>>> the dialect.
>>
>> I guess I could live with such a solution.
>
> One of the recurring requirements that I have is a paged display with a result size count, so varying a query from "give me the result columns" to "just give me the result count" is something that I could make use of.
>
> I suspect I'm not alone with this use case.

You're certainly not alone with this. However, beware that it is
always better to use window functions, where available: COUNT(*)
OVER(). Executing the same query twice for paging can be quite costly
in the DB.

> Aside note: I'm not sure whether "fetch" is the right terminology here, it is ringing some unappropriate overtones for me (in the sense of "page fetching" and "fetch limits"). I'd have designed it as a query transformer that takes a query and returns its SELECT COUNT(*) variation (so it can be reused, prepared with values, etc).

"Fetch" is a well-established term in the jOOQ API:
http://www.jooq.org/doc/2.6/manual/sql-execution/fetching/

So, "fetchCount" is the best choice here, as the Select variation is
actually executed and a Result is fetched.

The term "FETCH" is also well-established in the SQL standard. See
§13.3 <fetch statement> from
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

It is used to fetch data from a cursor. Other terms such as the ones
you mentioned, are probably derived from this.

Cheers
Lukas

Lukas Eder

unread,
Feb 16, 2013, 8:08:03 AM2/16/13
to jooq...@googlegroups.com
Unfortuantely, the projected Executor.fetchCount() query is non-trivial to implement. The side-effects of patching the projection are too risky. Compared to that, explicitly modifying the projection seems more reasonable.
I will postpone implementation to a later release.

2013/2/15 Lukas Eder <lukas...@gmail.com>

Lukas Eder

unread,
Feb 17, 2013, 11:07:48 AM2/17/13
to jooq...@googlegroups.com
Check out this sample query (also on http://sqlfiddle.com/#!4/d41d8/7433)

WITH t(id, title) AS (
  SELECT 1, '1984'         FROM DUAL UNION ALL
  SELECT 2, 'Animal Farm'  FROM DUAL UNION ALL
  SELECT 3, 'O Alquimista' FROM DUAL UNION ALL
  SELECT 4, 'Brida'        FROM DUAL
)
SELECT t.id, t.title, COUNT(*) OVER()
FROM t
WHERE t.title LIKE '%a%'

It features parts of the jOOQ T_BOOK table. The actual query is to select ID and TITLE where TITLE contains at least one 'a' character. Instead of running this query twice in order to fetch the number of results that would be obtained by the [title like '%a%'] predicate, you can just add this window function to the regular projection. The number of results is thus added to every record:

ID TITLECOUNT(*)OVER()
2Animal Farm 3
3 O Alquimista 3
4Brida 3
jOOQ supports this window function as count().over(). As in the SQL standard, all jOOQ aggregate functions can be turned into window functions using the OVER() clause:

Of course, this might not always be implemented in all SQL dialects. Among the databases supported by jOOQ, these databases support window functions:

- CUBRID
- DB2
- Postgres
- Oracle
- SQL Server
- Sybase SQL Anywhere

More information can be found here:

Cheers
Lukas

2013/2/17 <isomet...@googlemail.com>
Thanks for the update!

How would a query with the windowing function Count(*) Over() look like? Does it somehow provide the total number of rows before you actually start processing the records?

thanks!

--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jooq-user+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Durchholz, Joachim

unread,
Feb 18, 2013, 3:53:32 AM2/18/13
to jooq...@googlegroups.com
Thanks for the feedback, good to know.

Lukas Eder

unread,
Feb 19, 2013, 4:48:28 AM2/19/13
to jooq...@googlegroups.com, Martin Steiger
2013/2/19 <rithom...@gmail.com>:
> I maybe missing something but would it not be possible to handle this by
> turning the original SQL select into a sub select so
>
> select * from table
>
> ends up as
>
> select count(*)
> from ( select * from table )
>
> This way all the edge cases caused by things like DISTINCT and UNION operate
> correctly as long as the DB engine can handle them within a sub-select. It
> also deals with the issue of DBs that do not have the window functions.
>
> I am saying this without any understanding of how you hold a query within
> JOOQ so do not know if you can defind such a wrapping at will.

I must've had tomatoes on my eyes. Trying to patch rendering and
binding and all that... Sheesh ;-)

This will indeed "get rid" of most of the corner cases, as it isn't
really a complex query transformation. Here are some corner-cases that
would persist:

- The FOR UPDATE clauses. But that can be left as the user's
responsibility, not to pass a SELECT .. FOR UPDATE to this new
Executor.fetchCount(Select<?>)

And here is a new corner-case:

- "SELECT 1 FROM ..." Some databases cannot handle unnamed columns in
subqueries. E.g. SQL Server can't run SELECT count(*) FROM (SELECT 1).
It would have to be changed to SELECT count(*) FROM (SELECT 1 [dummy])

I wonder, though, if all optimisers can handle the transformation
efficiently. I'm not afraid for Oracle, but are these the same queries
in MySQL?

> select * from table
>
> select count(*)
> from ( select * from table )

Anyway, thanks a lot for the input :-) so, this will still make it
easily into jOOQ 3.0-RC2 and 2.7

Cheers
Lukas

Lukas Eder

unread,
Feb 19, 2013, 4:53:18 AM2/19/13
to jooq...@googlegroups.com, Martin Steiger
2013/2/19 Lukas Eder <lukas...@gmail.com>:
> 2013/2/19 <rithom...@gmail.com>:
>> I maybe missing something but would it not be possible to handle this by
>> turning the original SQL select into a sub select so
>>
>> select * from table
>>
>> ends up as
>>
>> select count(*)
>> from ( select * from table )
>>
>> This way all the edge cases caused by things like DISTINCT and UNION operate
>> correctly as long as the DB engine can handle them within a sub-select. It
>> also deals with the issue of DBs that do not have the window functions.
>>
>> I am saying this without any understanding of how you hold a query within
>> JOOQ so do not know if you can defind such a wrapping at will.
>
> I must've had tomatoes on my eyes. Trying to patch rendering and
> binding and all that... Sheesh ;-)
>
> This will indeed "get rid" of most of the corner cases, as it isn't
> really a complex query transformation. Here are some corner-cases that
> would persist:
>
> - The FOR UPDATE clauses. But that can be left as the user's
> responsibility, not to pass a SELECT .. FOR UPDATE to this new
> Executor.fetchCount(Select<?>)
>
> And here is a new corner-case:
>
> - "SELECT 1 FROM ..." Some databases cannot handle unnamed columns in
> subqueries. E.g. SQL Server can't run SELECT count(*) FROM (SELECT 1).
> It would have to be changed to SELECT count(*) FROM (SELECT 1 [dummy])

... of course, in most situations, users will want to avoid passing a
previously paged query to this new method. The COUNT(*) value should
probably be applied on an unpaged query, in order to calculate the
number of available pages... I guess that jOOQ shouldn't do any magic
here, and leave the query untouched in this case.

Jörg Schönfisch

unread,
Feb 19, 2013, 5:26:06 AM2/19/13
to jooq...@googlegroups.com, Martin Steiger
On Tuesday, February 19, 2013 10:48:28 AM UTC+1, Lukas Eder wrote:
And here is a new corner-case: 

- "SELECT 1 FROM ..." Some databases cannot handle unnamed columns in
subqueries. E.g. SQL Server can't run SELECT count(*) FROM (SELECT 1).
It would have to be changed to SELECT count(*) FROM (SELECT 1 [dummy])

I've never worked with SQLServer and have none at hand to test this, but would COUNT(1) make it work in this case? 

Lukas Eder

unread,
Feb 19, 2013, 5:36:40 AM2/19/13
to jooq...@googlegroups.com, Martin Steiger
>> And here is a new corner-case:
>>
>> - "SELECT 1 FROM ..." Some databases cannot handle unnamed columns in
>> subqueries. E.g. SQL Server can't run SELECT count(*) FROM (SELECT 1).
>> It would have to be changed to SELECT count(*) FROM (SELECT 1 [dummy])
>
>
> I've never worked with SQLServer and have none at hand to test this, but
> would COUNT(1) make it work in this case?

No, this is a general syntax "issue" in SQL Server. This query here:

select * from (select 1, 2) t

Causes the same issue. In this case, solutions would be any of these:

-- Renaming columns in the subquery
select * from (select 1 a, 2 b) t

-- Using derived column lists
select * from (select 1, 2) t(a, b)

This corner-case isn't really related to the COUNT(*) operation but to
unnamed columns in subqueries:
https://github.com/jOOQ/jOOQ/issues/579

jOOQ already auto-generates the missing table name [t] in the above
example, as providing names for derived tables is required by many
databases (e.g. including MySQL, SQL Server). With jOOQ 3.0's support
for derived column lists, #579 can finally be fixed also for SQL
Server

Roger Thomas

unread,
Feb 19, 2013, 2:53:16 PM2/19/13
to jooq...@googlegroups.com, Martin Steiger

Anyway, thanks a lot for the input :-) so, this will still make it
easily into jOOQ 3.0-RC2 and 2.7  

No problem, glad to have helped rather than complicated an issue for once :)

>  I wonder, though, if all optimisers can handle the transformation
>  efficiently. I'm not afraid for Oracle, but are these the same queries
>  in MySQL?

I think its fair to say that a good number will execute the whole sub query into a temp table and then count up the results, but thats not as bad as it first seems

  1) At least it did not happen across the wire between the server and the client.

  2) If the query is then executed to get the results and the server has enough RAM to cache all the pages from the first run the second run will be a lot faster.

Doing counts on certain DBs is costly anywhere so its useful for you to offer the feature, but I would not worry to much about how it ends up being handled, just think about how bad it can get in an JPA based environment.
Reply all
Reply to author
Forward
0 new messages