PostgreSQL 9.0 merge, breakdown into iterations

Heikki Linnakangas

Nov 30, 2017, 4:42:52 PM

to Greenplum Developers

As the PostgreSQL 8.4 merge nears completion, it's time to start looking
ahead at PostgreSQL 9.0.

Throughout the PostgreSQL 9.0 development cycle, the community made an
alpha release about every two months, after each commitfest. They're
essentially snapshots of the 'master' branch at that point in time, but
they are conveniently spaced out: merging one or two alpha releases per
iteration would make for a pretty good pace.

What's coming up? The big features in PostgreSQL 9.0 were Hot Standby
and Streaming Replication. But aside from those, there were a few other
things that are noteworthy, from a merging point of view.

ALPHA1
------

* EXPLAIN COSTS OFF option (commit d4382c4ae7)

Allows leaving out cost estimates from EXPLAIN output. That makes the
output fairly repeatable across platforms and configurations, and it
won't change because of minor tweaks to the cost model. PostgreSQL takes
advantage of that in its regression tests; with COSTS OFF, you can record
EXPLAIN output in the expected output, if you want to check that you get
a certain plan for a query. Our gpdiff.pl masks out the costs too, but it
masks out a lot more, and when a test fails because of a plan change, the
output is pretty hard to read. We could improve our tests by switching
over to COSTS OFF.
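
To make the contrast concrete, here is a minimal Python sketch of cost
masking done as a post-processing step. It is not how gpdiff.pl is
implemented; the function name mask_costs, the regular expression, and the
sample plan text are assumptions made up for illustration. With
EXPLAIN (COSTS OFF), the server leaves the estimates out in the first
place, so the expected output needs no masking of this kind.

```python
import re

# Matches the estimate annotation printed after a plan node, e.g.
# "  (cost=0.00..35.50 rows=2550 width=4)". Assumed format, for illustration.
COST_RE = re.compile(r"\s*\(cost=\d+\.\d+\.\.\d+\.\d+ rows=\d+ width=\d+\)")


def mask_costs(plan_text: str) -> str:
    """Strip cost/rows/width estimates from EXPLAIN output, line by line."""
    return "\n".join(COST_RE.sub("", line) for line in plan_text.splitlines())


# Hypothetical EXPLAIN output; the numbers are invented.
with_costs = (
    "Sort  (cost=179.78..186.16 rows=2550 width=4)\n"
    "  Sort Key: a\n"
    "  ->  Seq Scan on t  (cost=0.00..35.50 rows=2550 width=4)"
)

print(mask_costs(with_costs))
```

The masked text keeps only the plan shape (node types and sort keys),
which is roughly the part a plan-stability test cares about.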

ALPHA2
------

* Split the processing of INSERT/UPDATE/DELETE to new executor node
type, ModifyTable.

The ModifyTable node is very similar to GPDB's DML node that you get
with ORCA, so we should be able to reduce our diff footprint nicely with
this.

Also, we have currently disabled RETURNING in GPDB. I'd really like to
re-enable that, for the sake of completeness. There's no fundamental
reason GPDB couldn't support it. But I've put off tackling that until
this upstream refactoring, because it touches much of the same code and
I think it'll be easier to make RETURNING work again after this commit.
So after this commit, we should attempt fixing RETURNING, as a little
side-project.

ALPHA3
------

* Remove -w (--ignore-all-space) option from pg_regress's diff calls
(ce3153fa93)

This isn't very exciting, but we will need to fix the whitespace in the
expected output of all the GPDB-added tests, to cope with this. We
should do that as a separate PR, before we reach this point in the
merge. Otherwise we'll have massive but boring expected output
changes mixed in with the real changes in the iteration.
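
One way to find the affected tests ahead of time is to look for
expected/actual pairs that match only when whitespace is ignored. The
following Python sketch approximates that check. The function names and
sample strings are hypothetical, and the comparison is a simplification
of what diff -w does (it compares line by line and does not handle
end-of-file newline differences the way diff does).

```python
def strip_all_whitespace(line: str) -> str:
    """Approximate diff -w: compare a line with all whitespace removed."""
    return "".join(line.split())


def differs_only_in_whitespace(expected: str, actual: str) -> bool:
    """True if the texts differ, but only in whitespace within lines."""
    if expected == actual:
        return False  # identical; nothing to fix
    exp_lines = expected.splitlines()
    act_lines = actual.splitlines()
    return len(exp_lines) == len(act_lines) and all(
        strip_all_whitespace(e) == strip_all_whitespace(a)
        for e, a in zip(exp_lines, act_lines)
    )


# Hypothetical expected and actual test output; they differ only in padding.
expected = " a | b \n---+---\n 1 | 2\n"
actual = " a | b\n---+---\n 1 |  2\n"

print(differs_only_in_whitespace(expected, actual))
```

Files flagged this way pass today only because of -w, so they are the
ones whose expected output would need regenerating in the separate PR.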

* Hot Standby

This is where Hot Standby arrives. I'm not sure what it means for GPDB.
You can't use the feature as such on a GPDB cluster, but maybe we can
build something exciting using it in the future. But all the refactoring
and other changes related to this will land in the repository in any
case. We had better have finished the WAL replication work, and gotten
rid of file replication and persistent tables stuff by then, or we will
be in a world of hurt merging this!

ALPHA4
------

* Streaming Replication

We've already backported most of this, because we already use streaming
replication for mirroring the master node. So this isn't very exciting,
feature-wise, but I'm sure there will be some cleanup work to do, to
disentangle the code that we had cherry-picked and backported earlier
from what we'll be merging.

ALPHA5
------

Nothing exciting that I can see.

BETA3
-----

No exciting commits that I can see. This is where the REL9_0_STABLE
branch was created, and 9.1 development started on the 'master' branch.

All in all, this looks pretty straightforward. The big stumbling block
is getting rid of file mirroring and persistent tables stuff, but once
that's done, the 9.0 merge should be plain sailing.

- Heikki

Xin Zhang

Nov 30, 2017, 5:27:19 PM

to Heikki Linnakangas, Greenplum Developers

Thanks a lot, Heikki, for the roadmap. I cannot wait to get hot standby, so that the mirrors can actually support some read-only queries. Looking forward to the 9.0 merge.

--

Thanks,

Shin

Venkatesh Raghavan

Nov 30, 2017, 5:46:19 PM

to Xin Zhang, Heikki Linnakangas, Greenplum Developers

Cool!

Ivan Novick

Nov 30, 2017, 5:49:30 PM

to Xin Zhang, Heikki Linnakangas, Greenplum Developers

+1 and thanks all

On Thu, Nov 30, 2017 at 2:27 PM, Xin Zhang <xzh...@pivotal.io> wrote:

--

Ivan Novick, Product Manager Pivotal Greenplum

ino...@pivotal.io -- (Mobile) 408-230-6491

https://www.youtube.com/GreenplumDatabase

Jasper Li

Nov 30, 2017, 6:12:17 PM

to Ivan Novick, Xin Zhang, Heikki Linnakangas, Greenplum Developers

Great job! Thanks!

Best wishes

Jasper

--

Yandong Yao

Dec 4, 2017, 7:32:52 AM

to Jasper Li, Ivan Novick, Xin Zhang, Heikki Linnakangas, Greenplum Developers

Great work indeed! Looking forward to 9.x.

Once we have hot standby, there may be an opportunity to make master/standby failover faster and less manual.

Regards,

Yandong

--

Best Regards,

Yandong

Ashwin Agrawal

Dec 4, 2017, 11:29:17 AM

to Yandong Yao, Jasper Li, Ivan Novick, Xin Zhang, Heikki Linnakangas, Greenplum Developers

On Mon, Dec 4, 2017 at 4:32 AM, Yandong Yao <yy...@pivotal.io> wrote:

> Great work indeed! Looking forward to 9.x.
>
> Once we have hot standby, there may be an opportunity to make master/standby failover faster and less manual.

I don't think hot standby has any connection to that; if it did, we would have enabled it when streaming replication was backported.

On a side note, even enabling hot standby for the GPDB master is not as straightforward as it is in Postgres, because we are not a single node, and the master and standby still share the same segment data. So we can't easily enable hot standby: we need to worry about locks and handle them properly. For example, we cannot have a situation where a read is performed from the standby while a drop is being executed from the master. That's the kind of problem single-node Postgres doesn't suffer from.

Yandong Yao

Dec 6, 2017, 8:48:07 AM

to Ashwin Agrawal, Jasper Li, Ivan Novick, Xin Zhang, Heikki Linnakangas, Greenplum Developers

So you mean that if the master is dropping a table while the standby is reading from it, some segments could work while other segments report a "table not found" error?

--

Best Regards,

Yandong
