GROUP_CONCAT in SQL Server

1,296 views
Skip to first unread message

Ben Hood

unread,
Aug 4, 2013, 7:50:16 PM8/4/13
to jooq...@googlegroups.com
Hi Lukas,

Using this 3rd party package (http://groupconcat.codeplex.com/) to
provide GROUP_CONCAT on SQL Server seems to provide more flexibility
that using the builtin FOR XML PATH (as referred to by
https://github.com/jOOQ/jOOQ/issues/1277).

I was wondering if there is a way to get the LISTAGG generator to be
dialect aware and generate the necessary syntax for SQL Server, i.e.
dbo.GROUP_CONCAT(COLUMN_NAME). Or is this something that is not likely
to be very portable (looking beyond the issue of having to assume that
this 3rd package is actually installed)?

Cheers,

Ben

Lukas Eder

unread,
Aug 5, 2013, 2:37:15 AM8/5/13
to jooq...@googlegroups.com
Hi Ben,

2013/8/5 Ben Hood <0x6e...@gmail.com>

Hi Lukas,

Using this 3rd party package (http://groupconcat.codeplex.com/) to
provide GROUP_CONCAT on SQL Server seems to provide more flexibility
that using the builtin FOR XML PATH (as referred to by
https://github.com/jOOQ/jOOQ/issues/1277).

Yes, this LISTAGG() emulation idea using SQL Server's STUFF is a bit funky.

I was wondering if there is a way to get the LISTAGG generator to be
dialect aware and generate the necessary syntax for SQL Server, i.e.
dbo.GROUP_CONCAT(COLUMN_NAME). Or is this something that is not likely
to be very portable (looking beyond the issue of having to assume that
this 3rd package is actually installed)?

I prefer not to rely on such a third-party extension that might be installed in SQL Server (or not). Think about the various other "useful" functions that are present in 1-2 databases but missing in the remaining 12, supported by jOOQ.

Of course, you can create your own function, using plain SQL:

public static Field<String> groupConcat(Field<String> column) {
    return DSL.field("dbo.GROUP_CONCAT({0})", 
        String.class, column);
}

public static Field<String> groupConcat(Field<String> column, String delimiter) {
    return DSL.field("dbo.GROUP_CONCAT({0}, {1})", 
        String.class, column, val(delimiter));
}

Note, jOOQ should probably support code generation for custom aggregate functions. This is currently only supported in Oracle. I have added a feature request for this:

Ben Hood

unread,
Aug 5, 2013, 5:04:15 AM8/5/13
to jooq...@googlegroups.com, jooq...@googlegroups.com
Hey Lukas,

On Aug 5, 2013, at 7:37, Lukas Eder <lukas...@gmail.com> wrote:
Yes, this LISTAGG() emulation idea using SQL Server's STUFF is a bit funky.


The specific issue I have with it is that it doesn't behave like other aggregation functions - you can't aggregate a single column, since it concatenates each entire row of the projection. This makes it difficult to compose with subqueries. I would say that this technique is more of a result wrapping function that can be made to look like an aggregation for relatively simple use cases.

I prefer not to rely on such a third-party extension that might be installed in SQL Server (or not). Think about the various other "useful" functions that are present in 1-2 databases but missing in the remaining 12, supported by jOOQ.

Sorry, I expressed myself badly - I wasn't suggesting that JOOQ should implement this 3rd party extension out of the box (for the very reason you make above).

I was wondering whether there was a way for application level code to extend the groupConcat() function based on the dialect, rather than having to wire in a special case DSL field. This would increase the reuse of my query building blocks.

Of course, you can create your own function, using plain SQL:

public static Field<String> groupConcat(Field<String> column) {
    return DSL.field("dbo.GROUP_CONCAT({0})", 
        String.class, column);
}

public static Field<String> groupConcat(Field<String> column, String delimiter) {
    return DSL.field("dbo.GROUP_CONCAT({0}, {1})", 
        String.class, column, val(delimiter));
}

Yes, this is exactly what I have got right now. Basically I've wired in the dialect to all query building blocks so that I can do the dialect specific stuff. For example, I've also created an MD5 function in the dbo namespace. It works well enough for now, but I was wondering there is a neater way to do this kind of thing.


Note, jOOQ should probably support code generation for custom aggregate functions. This is currently only supported in Oracle. I have added a feature request for this:

I guess what I'm trying to do is to monkey patch the groupConcat() and redirect it to the custom aggregate function for SQL Server. If I was trying to get groupConcat() to be transparent across dialects, I probably wouldn't want to overload groupConcat(), instead I would use a separate function name (although the 3rd party function is actually called GROUP_CONCAT).

The joys of "why isn't feature X supported in Y database" :-) 

Cheers,

Ben

Lukas Eder

unread,
Aug 6, 2013, 4:18:51 AM8/6/13
to jooq...@googlegroups.com
Hi Ben,

2013/8/5 Ben Hood <0x6e...@gmail.com>
Hey Lukas,

On Aug 5, 2013, at 7:37, Lukas Eder <lukas...@gmail.com> wrote:
Yes, this LISTAGG() emulation idea using SQL Server's STUFF is a bit funky.


The specific issue I have with it is that it doesn't behave like other aggregation functions - you can't aggregate a single column, since it concatenates each entire row of the projection. This makes it difficult to compose with subqueries. I would say that this technique is more of a result wrapping function that can be made to look like an aggregation for relatively simple use cases.

You're probably right with this. I haven't evaluated the depths of the STUFF solution yet, as this is really a low-prio improvement. But it appeared to be quite a bit of a hack to me, at the time. Similar to jOOQ's RPAD/LPAD emulation for SQLite:
I prefer not to rely on such a third-party extension that might be installed in SQL Server (or not). Think about the various other "useful" functions that are present in 1-2 databases but missing in the remaining 12, supported by jOOQ.

Sorry, I expressed myself badly - I wasn't suggesting that JOOQ should implement this 3rd party extension out of the box (for the very reason you make above).

I was wondering whether there was a way for application level code to extend the groupConcat() function based on the dialect, rather than having to wire in a special case DSL field. This would increase the reuse of my query building blocks.

I see. Yes, this kind of thing is on the roadmap:

It's certainly worth thinking about these things, in order to support arbitrary SQL dialects more easily. My first evaluations of this functionality showed, however, that it will be quite hard to implement the full jOOQ SQL feature support in an external file. It might still be good enough for tweaking, though.

Another option is to think about this new idea I've had, recently, in the context of a customer requirement:

This issue will allow for providing a custom SQL rendering and variable binding SPI implementation. The original requirement was to be able to dynamically append predicates to WHERE clauses in the event of access to any given table X. Of course, this functionality could be used to re-write jOOQ's SQL generation for 1-2 types of API elements, such as LISTAGG().
Of course, you can create your own function, using plain SQL:

public static Field<String> groupConcat(Field<String> column) {
    return DSL.field("dbo.GROUP_CONCAT({0})", 
        String.class, column);
}

public static Field<String> groupConcat(Field<String> column, String delimiter) {
    return DSL.field("dbo.GROUP_CONCAT({0}, {1})", 
        String.class, column, val(delimiter));
}

Yes, this is exactly what I have got right now. Basically I've wired in the dialect to all query building blocks so that I can do the dialect specific stuff. For example, I've also created an MD5 function in the dbo namespace. It works well enough for now, but I was wondering there is a neater way to do this kind of thing.

I'm always open to suggestions.
 
Note, jOOQ should probably support code generation for custom aggregate functions. This is currently only supported in Oracle. I have added a feature request for this:

I guess what I'm trying to do is to monkey patch the groupConcat() and redirect it to the custom aggregate function for SQL Server. If I was trying to get groupConcat() to be transparent across dialects, I probably wouldn't want to overload groupConcat(), instead I would use a separate function name (although the 3rd party function is actually called GROUP_CONCAT).

I have a strong feeling that the LISTAGG syntax might make it into the SQL standard. Not necessarily the function name, but the ordered aggregate clause:

    WITHIN GROUP (ORDER BY ...)

The joys of "why isn't feature X supported in Y database" :-) 

Yes. When you check out the jOOQ code with respect to SQL standardisation, it's crazy how many different ways of doing the same exist... 

Cheers
Lukas

Ben Hood

unread,
Aug 7, 2013, 6:27:04 PM8/7/13
to jooq...@googlegroups.com
Hi Lukas,

On Tue, Aug 6, 2013 at 9:18 AM, Lukas Eder <lukas...@gmail.com> wrote:
> You're probably right with this. I haven't evaluated the depths of the STUFF
> solution yet, as this is really a low-prio improvement. But it appeared to
> be quite a bit of a hack to me, at the time. Similar to jOOQ's RPAD/LPAD
> emulation for SQLite:
> http://blog.jooq.org/2012/07/19/funky-string-function-simulation-in-sqlite

Cool - I like the way people find ways around the idiosyncrasies of
all of the different DBs out there :-)

> I see. Yes, this kind of thing is on the roadmap:
> https://github.com/jOOQ/jOOQ/issues/2337
>
> It's certainly worth thinking about these things, in order to support
> arbitrary SQL dialects more easily. My first evaluations of this
> functionality showed, however, that it will be quite hard to implement the
> full jOOQ SQL feature support in an external file. It might still be good
> enough for tweaking, though.

That sounds like a good start.

> Another option is to think about this new idea I've had, recently, in the
> context of a customer requirement:
> https://github.com/jOOQ/jOOQ/issues/2665
>
> This issue will allow for providing a custom SQL rendering and variable
> binding SPI implementation. The original requirement was to be able to
> dynamically append predicates to WHERE clauses in the event of access to any
> given table X. Of course, this functionality could be used to re-write
> jOOQ's SQL generation for 1-2 types of API elements, such as LISTAGG().

This sounds a bit more powerful than the external file approach. Not
that I claim to know anything about design, but this approach gives me
the impression that I can selectively decorate the render phase
according to my custom dialect.

>> Note, jOOQ should probably support code generation for custom aggregate
>> functions. This is currently only supported in Oracle. I have added a
>> feature request for this:
>> https://github.com/jOOQ/jOOQ/issues/2677
>>
>>
>> I guess what I'm trying to do is to monkey patch the groupConcat() and
>> redirect it to the custom aggregate function for SQL Server. If I was trying
>> to get groupConcat() to be transparent across dialects, I probably wouldn't
>> want to overload groupConcat(), instead I would use a separate function name
>> (although the 3rd party function is actually called GROUP_CONCAT).
>
>
> I have a strong feeling that the LISTAGG syntax might make it into the SQL
> standard. Not necessarily the function name, but the ordered aggregate
> clause:
>
> WITHIN GROUP (ORDER BY ...)

OK, so what I've ended up having to do is implement my own custom
aggregation function in the CLR that implements the WITHIN GROUP
(ORDER BY ...) semantics that Oracle provides out of the box.
Basically ordered set functions are not supported in SQL Server - see
here: https://connect.microsoft.com/SQLServer/feedback/details/728969.

So where I've ended up from a JOOQ perspective is that I can easily
create a DSL field object that will render the correct function name,
so all is good with JOOQ for now :-)

In addition, I just wanted to say that the SQL generation that JOOQ
has allowed me to build has been invaluable in maintaining ports of
non-trivial queries across Oracle, MySQL, Postgres and SQL Server - so
thanks for such a great library :-)

Cheers,

Ben

Lukas Eder

unread,
Aug 8, 2013, 4:25:55 AM8/8/13
to jooq...@googlegroups.com

2013/8/8 Ben Hood <0x6e...@gmail.com>

Hi Lukas,

On Tue, Aug 6, 2013 at 9:18 AM, Lukas Eder <lukas...@gmail.com> wrote:
> You're probably right with this. I haven't evaluated the depths of the STUFF
> solution yet, as this is really a low-prio improvement. But it appeared to
> be quite a bit of a hack to me, at the time. Similar to jOOQ's RPAD/LPAD
> emulation for SQLite:
> http://blog.jooq.org/2012/07/19/funky-string-function-simulation-in-sqlite

Cool - I like the way people find ways around the idiosyncrasies of
all of the different DBs out there :-)

Yeah. Give them a Stack Overflow bounty and they'll do the craziest stuff! ;-)
 

> I see. Yes, this kind of thing is on the roadmap:
> https://github.com/jOOQ/jOOQ/issues/2337
>
> It's certainly worth thinking about these things, in order to support
> arbitrary SQL dialects more easily. My first evaluations of this
> functionality showed, however, that it will be quite hard to implement the
> full jOOQ SQL feature support in an external file. It might still be good
> enough for tweaking, though.

That sounds like a good start.

> Another option is to think about this new idea I've had, recently, in the
> context of a customer requirement:
> https://github.com/jOOQ/jOOQ/issues/2665
>
> This issue will allow for providing a custom SQL rendering and variable
> binding SPI implementation. The original requirement was to be able to
> dynamically append predicates to WHERE clauses in the event of access to any
> given table X. Of course, this functionality could be used to re-write
> jOOQ's SQL generation for 1-2 types of API elements, such as LISTAGG().

This sounds a bit more powerful than the external file approach. Not
that I claim to know anything about design, but this approach gives me
the impression that I can selectively decorate the render phase
according to my custom dialect.

Yes, that impression is justified - at least in the long run.
 
>> Note, jOOQ should probably support code generation for custom aggregate
>> functions. This is currently only supported in Oracle. I have added a
>> feature request for this:
>> https://github.com/jOOQ/jOOQ/issues/2677
>>
>>
>> I guess what I'm trying to do is to monkey patch the groupConcat() and
>> redirect it to the custom aggregate function for SQL Server. If I was trying
>> to get groupConcat() to be transparent across dialects, I probably wouldn't
>> want to overload groupConcat(), instead I would use a separate function name
>> (although the 3rd party function is actually called GROUP_CONCAT).
>
>
> I have a strong feeling that the LISTAGG syntax might make it into the SQL
> standard. Not necessarily the function name, but the ordered aggregate
> clause:
>
>     WITHIN GROUP (ORDER BY ...)

OK, so what I've ended up having to do is implement my own custom
aggregation function in the CLR that implements the WITHIN GROUP
(ORDER BY ...) semantics that Oracle provides out of the box.
Basically ordered set functions are not supported in SQL Server - see
here: https://connect.microsoft.com/SQLServer/feedback/details/728969.

"Ordered set functions", that's the formally correct name - good to know. I just checked in my SQL:2008 draft specs. It says

10.9 <aggregate function>

[...]

<ordered set function> ::=
    <hypothetical set function>
  | <inverse distribution function>

<hypothetical set function> ::=
    <rank function type> <left paren>
        <hypothetical set function value expression list> <right paren>
    <within group specification>

<within group specification> ::=
    WITHIN GROUP <left paren> ORDER BY <sort specification list> <right paren>

<inverse distribution function> ::=
    <inverse distribution function type> <left paren>
        <inverse distribution function argument> <right paren>
    <within group specification>

So, in the long run, this syntax variant is clearly better than MySQL's GROUP_CONCAT() function.

So where I've ended up from a JOOQ perspective is that I can easily
create a DSL field object that will render the correct function name,
so all is good with JOOQ for now :-)

In addition, I just wanted to say that the SQL generation that JOOQ
has allowed me to build has been invaluable in maintaining ports of
non-trivial queries across Oracle, MySQL, Postgres and SQL Server - so
thanks for such a great library :-)

Great, good to know!

Cheers
Lukas
Reply all
Reply to author
Forward
0 new messages