Grouping sets rewrite and MPP-awareness

38 views
Skip to first unread message

Heikki Linnakangas

unread,
Aug 17, 2018, 3:12:23 AM8/17/18
to Greenplum Developers
Grouping Sets rewrite - done so far
-----------------------------------

Last week, I took a stab at replacing GPDB's historical GROUPING SETS
implementation with the one from PostgreSQL 9.5. Why? Because the GPDB
implementation has a number of issues, ranging from assertion failures
to incorrect results. See
https://github.com/greenplum-db/gpdb/issues?q=is%3Aissue+is%3Aopen+label%3A%22Grouping+Sets%22+

The result of that was this PR:
https://github.com/greenplum-db/gpdb/pull/5440. There is still some ORCA
work needed, and there are still a few regression failures, but I
believe they're just a matter of fixing the expected output. It was
relatively straightforward; I first ripped out all the existing grouping
sets code, and then cherry-picked the necessary upstream commits. It works.

However, it's got a few performance issues:

1. It's not very MPP-aware. If you do "ROLLUP(a, b)", it will generate a
Gather node to bring all the data to the QD node, and perform the
grouping there.

2. It only supports sorted aggregates. No hashing.

I'm particularly worried about 1, the lack of MPP-awareness. I'm afraid
users would be very unhappy with that. In the last few days, I've been
thinking about how to fix the performance regressions.

Plan A: merge/cherry-pick stuff
-------------------------------

There are two relevant features in upstream:

1. Parallel aggregate support was added in PostgreSQL 9.6. It would
provide the infrastructure needed for two-stage aggregates. A parallel
aggregate on a single node, with separate Partial and Finalize stages,
is very similar to a two-stage aggregate across nodes.

2. Hashing support was added in PostgreSQL v10.

I considered cherry-picking those two features from PostgreSQL v10.
However, it gets very complicated, very quickly. The partial aggregate
support was added after the so-called Upper Planner Pathification work
in PostgreSQL 9.6. That completely revamped the GROUP BY planning in
PostgreSQL. Backporting that seems infeasible. It might be possible to
cherry-pick just the partial aggregate support without the Upper Planner
Pathification, but it requires some patchwork.

Even if we do that, the upstream Partial aggregate code is quite
different from the current GPDB implementation, especially in the
planner. Partial aggregates are used with regular GROUP BY queries, even
without grouping sets. So replacing the GPDB multi-stage aggragate
implementation with upstream's would involve a big refactoring of the
code in cdbgroup.c. And trying to support two different flavors of
multi-stage aggregates at the same time would be quite schizophrenic.

From a developer's point of view, I think the best marching order would be:

1. Commit the GROUPING SETS re-implementation we have in PR 5440 as it
is, and accept the performance issues for now.

2. Merge whole of GPDB all the way through PostgreSQL 9.6. Once we hit
9.6, as a temporary fix, remove cdbgroup.c. That would cause a massive
performance degradation to all GROUP BY queries, losing the ability to
create MPP-aware plans.

3. Re-implement the logic in cdbgroup.c, now that we have Upper Planner
Pathification, and Parallel Aggregates. That brings back the performance
on GROUP BY queries.

4. Re-implement the MPP-awareness to grouping sets using the Partial
Aggregate code. This brings back the performance we lost in the GROUPING
SETs rewrite in step 1.

5. Cherry-pick the support for hashing with grouping sets from PostgrSQL
v10 (or continue the merge from version 9.6 to 10 as whole).

In other words, we'd take on a technical mortgage in step 1, and pay it
back in steps 2-5.


From a Product Manager's point of view, there's a little problem with
this plan: We would like to release GPDB6 based on PostgreSQL 9.2, or
maybe 9.3 or 9.4. Not 9.6 or 10. If we release GPDB 6 between steps 1
and 2, grouping sets will have a significant performance regression,
compared to GPDB 5.


Plan B
------

Another approach would be to not pick the upstream Partial aggregate
support, but instead, add a bunch of new code to the grouping sets
planner that uses the current GPDB multi-stage aggregate implementation.
so the plan would be:

1. Commit the GROUPING SETS re-implementation we have in PR 5440 as it
is, and accept the performance issues for now.

2. Make that MPP-aware, by adding some code similar to current code in
cdbgroup.c.


We could perhaps reuse some of the old grouping sets code and the code
in cdbgroup.c, or the old grouping sets implementation, for this, but
it's not straightforward. It's going to be a fair amount of refactoring
and new code. And once we reach 9.6 in the PostgreSQL merge, we'll need
to rip out and replace it all anyway, and follow the steps 2-5 above.
The upside with this approach is that we have a good chance of getting
this into GPDB6.

Thoughts?

- Heikki

Robert Eckhardt

unread,
Aug 17, 2018, 9:35:39 AM8/17/18
to Heikki Linnakangas, Greenplum Developers
On Fri, Aug 17, 2018 at 3:12 AM, Heikki Linnakangas <hlin...@iki.fi> wrote:

[SNIP]

>
> Thoughts?

Option 3: do nothing in the 6 release, deal with the known issues as
we are doing in 5 right now and then just get this work instream with
the merges for 7.

My initial thoughts right now are to put this on hold until we have
finished the cleanup work we are doing. Get the 9.3 merge iteration
started and have a few more folks dig in and share educated opinions
based on the WIP PR you created.

-- Rob

>
> - Heikki
>
>

Robert Eckhardt

unread,
Aug 17, 2018, 11:43:38 AM8/17/18
to Heikki Linnakangas, Greenplum Developers
On Fri, Aug 17, 2018 at 9:35 AM, Robert Eckhardt <reck...@pivotal.io> wrote:
> On Fri, Aug 17, 2018 at 3:12 AM, Heikki Linnakangas <hlin...@iki.fi> wrote:
>
> [SNIP]
>
>>
>> Thoughts?
>
> Option 3: do nothing in the 6 release, deal with the known issues as
> we are doing in 5 right now and then just get this work instream with
> the merges for 7.

It was pointed out to me that I was unclear here. By do nothing I mean
hold off on backporting ANY GROUPING SETS changes and use the current
implementation as it exists in 5.x.

-- Rob

Ivan Novick

unread,
Aug 17, 2018, 11:47:09 AM8/17/18
to Robert Eckhardt, Heikki Linnakangas, Greenplum Developers
On Fri, Aug 17, 2018 at 8:43 AM, Robert Eckhardt <reck...@pivotal.io> wrote:
On Fri, Aug 17, 2018 at 9:35 AM, Robert Eckhardt <reck...@pivotal.io> wrote:
> On Fri, Aug 17, 2018 at 3:12 AM, Heikki Linnakangas <hlin...@iki.fi> wrote:
>
> [SNIP]
>
>>
>> Thoughts?
>
> Option 3: do nothing in the 6 release, deal with the known issues as
> we are doing in 5 right now and then just get this work instream with
> the merges for 7.

It was pointed out to me that I was unclear here. By do nothing I mean
hold off on backporting ANY GROUPING SETS changes and use the current
implementation as it exists in 5.x.

That is also aligned with my previous point in the other thread, which is lets try to keep MASTER equally fast and production ready as 5_STABLE, and try to improve MASTER as much as possible holistically in chunks that make it better than 5_STABLE without going the reverse temporarily, or at least for as short as time possible.  This is for obvious practical reasons of trying to keep MASTER as shippable as possibly can be done.


Venkatesh Raghavan

unread,
Aug 17, 2018, 12:26:17 PM8/17/18
to Ivan Novick, Robert Eckhardt, Heikki Linnakangas, Greenplum Developers
I am very happy we are having this discussion early. I agree with Ivan's and Rob's view. Window operation is widely and large scale "known" regression need to be addressed before we commit into master. Regressions will deter people from upgrading and prolong the life of GPDB5.

How much of an effort is there to fix the regressions?

V

Venkatesh Raghavan

unread,
Aug 17, 2018, 2:38:51 PM8/17/18
to Ivan Novick, Robert Eckhardt, Heikki Linnakangas, Greenplum Developers
Correction, Grouping Sets not Windows. 

Window merge did bring in some executor changes that impacts performance, which can we will need to address before we release GPDB6.

V

Yandong Yao

unread,
Aug 22, 2018, 11:19:10 PM8/22/18
to Venkatesh Raghavan, Ivan Novick, Robert Eckhardt, Heikki Linnakangas, Greenplum Developers
Would second on Plan B for 1) making future merge easier as it is closer to upstream,  2) fix 10 issues related with group sets including some incorrect result. https://github.com/greenplum-db/gpdb/issues?q=is%3Aissue+is%3Aopen+label%3A%22Grouping+Sets%22+
--
Best Regards,
Yandong
Reply all
Reply to author
Forward
0 new messages