Isolation levels supported in Greenplum

231 views
Skip to first unread message

Heikki Linnakangas

unread,
Nov 11, 2015, 3:21:02 AM11/11/15
to Greenplum Developers
Hi,

Greenplum accepts isolation level SERIALIZABLE:

postgres=# begin isolation level serializable;
BEGIN

But throws an error on REPEATABLE READ:

postgres=# begin isolation level repeatable read;
ERROR: Greenplum Database does not support REPEATABLE READ
transactions. (variable.c:570)

What is the logic behind this? It seems insane. SERIALIZABLE is a more
strict mode than REPEATABLE READ; if you support SERIALIZABLE, you can
always upgrade REPEATABLE READ to SERIALIAZABLE to satisfy the SQL
spec's requirements.

Of course, Greenplum's SERIALIZABLE mode is not truly serializable, like
it wasn't in PostgreSQL either, until we got the "SSI" feature in 9.1.
Before that SERIALIZABLE and REPEATABLE READ were treated the same, and
what you actually got was something called "snapshot isolation", which
satisfies the requirements for REPEATABLE READ, but not SERIALIZABLE.

I think we need to do something about this in Greenplum. It certainly
doesn't make sense to accept SERIALIZABLE but reject REPEATABLE READ.
And when we catch up with PostgreSQL 9.1, we'll have to decide how this
all works with SSI. That also depends on whether we can implement SSI
over the whole cluster, or if we'll just leave it unimplemented. If we
can't implement SSI over the cluster, then we should reject SERIALIZABLE
but allow REPEATABLE READ.

I looked at the history of this check in variable.c, and it looks like
in the very beginning, back in 2006, both REPEATABLE READ and
SERIALIZABLE were forbidden. The SERIALIZABLE restriction was removed in
2007 a commit with message "Distributed Snapshots and DTM Re-Work (merge
from global_snapshot branch)". My guess is that after that commit, we do
support Snapshot Isolation, and that commit should've removed the
restriction for REPEATABLE READ as well.

Thoughts? I'm inclined to remove the restriction on REPEATABLE READ.

- Heikki

Ivan Novick

unread,
Nov 11, 2015, 11:22:02 AM11/11/15
to Heikki Linnakangas, Greenplum Developers
Ashwin A and Asim P will certainly have a comment on this when there timezone comes online.

Cheers,
Ivan

Ashwin Agrawal

unread,
Nov 11, 2015, 1:06:16 PM11/11/15
to Ivan Novick, Heikki Linnakangas, Greenplum Developers

Isolation level was discussed in near past and consensus was to stop supporting SERIALIZABLE mode given the sweet spot of gpdb. Extremely few not even handfull of customers using the same really using for the purpose its intended but just due to legacy tools / scripts. There are known cases in which SERIALIZABLE mode doesn't work, plus even causes data loss or system down scenarios.

So, practically we have two options validate out all the scenarios / functionality with SERIALIZABLE and fix them or stop supporting the same. Since not much usecase is seen stop supporting seems to be the choice. How to disable support can be decided, options are 
  • give ERROR when used
  • just issue WARNING and act as noop, which means treat as READ COMMITTED to avoid need for rewrite of tools / scripts used by customers. 
  • allow its usage under some GUC only for certain operations which we know it works like some tools catalog check which itself uses SERIALIZABLE 

Álvaro Hernández Tortosa

unread,
Nov 11, 2015, 2:37:00 PM11/11/15
to gpdb...@greenplum.org

    If I may add to the debate, having REPEATABLE READ isolation level is something really important for many, us (ToroDB) included. SERIALIZABLE would be great, but it is not common need. So PostgreSQL's pre-9.1 SERIALIZABLE which is not such but greater than REPEATABLE READ is probably good enough.

    However, only if it works correctly across the cluster, of course. I'd test it with Jepsen (https://github.com/aphyr/jepsen) or similar tools, to prove it is not broken.

    My 2 cents,

    Álvaro


-- 
Álvaro Hernández Tortosa


-----------
8Kdata

--
You received this message because you are subscribed to the Google Groups "Greenplum Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gpdb-dev+u...@greenplum.org.

Andreas Scherbaum

unread,
Nov 11, 2015, 5:02:47 PM11/11/15
to Ashwin Agrawal, Ivan Novick, Heikki Linnakangas, Greenplum Developers

With the intention to make GPDB more OLTP-friendly, it might worth exploring the 4th option, which is fixing the issue. But it might come with performance costs.

For what it's worth, just raising a warning and keep going (option 2) is not something I expect from a database.



Regards,
Andreas

--

Andreas Scherbaum

Principal Software Engineer

GoPivotal Deutschland GmbH


Hauptverwaltung und Sitz: Am Kronberger Hang 2a, 65824 Schwalbach/Ts., Deutschland

Amtsgericht Königstein im Taunus, HRB 8433

Geschäftsführer: Andrew Michael Cohen, Paul Thomas Dacier

Asim Praveen

unread,
Nov 11, 2015, 6:01:30 PM11/11/15
to Álvaro Hernández Tortosa, Greenplum Developers


On Wed, Nov 11, 2015 at 11:36 AM, Álvaro Hernández Tortosa <a...@8kdata.com> wrote:
>
>
>     If I may add to the debate, having REPEATABLE READ isolation level is something really important for many, us (ToroDB) included. SERIALIZABLE would be great, but it is not common need. So PostgreSQL's pre-9.1 SERIALIZABLE which is not such but greater than REPEATABLE READ is probably good enough.
>

Just curious, how does ToroDB depend on repeatable read isolation level?  Greenplum's catalog checker (gpcheckcat) that Ashwin alluded to, compares catalog tables on master with those on segments.  Serializable isolation level is used to guarantee that the tool reads the same catalog contents on mater and segments, even if transactions commit after it started reading.

Prior to when SSI was introduced, did PostgreSQL differentiate between repeatable read and serializable isolation levels?  If yes, can someone please elaborate how?

Asim

Álvaro Hernández Tortosa

unread,
Nov 11, 2015, 7:05:54 PM11/11/15
to Asim Praveen, Greenplum Developers

On 12/11/15 00:01, Asim Praveen wrote:


On Wed, Nov 11, 2015 at 11:36 AM, Álvaro Hernández Tortosa <a...@8kdata.com> wrote:
>
>
>     If I may add to the debate, having REPEATABLE READ isolation level is something really important for many, us (ToroDB) included. SERIALIZABLE would be great, but it is not common need. So PostgreSQL's pre-9.1 SERIALIZABLE which is not such but greater than REPEATABLE READ is probably good enough.
>

Just curious, how does ToroDB depend on repeatable read isolation level?

    As you know, we implement MongoDB's wire protocol. This protocol is session-less. When you issue a query and not all the results fit in the response, you retain a cursor reference. Subsequent queries may use this cursor to fetch more results. The problem is that these queries to fetch more data, being the protocol session-less, cannot be tied to a user-facing tx. In order for us to provide consistent reads, we open an "internal" cursor reading off a REPEATABLE READ READONLY tx and use that cursor for subsequent MongoDB fetches. READ COMMITTED is not enough for us as we might see new data committed before each new "fetch more" MongoDB wire protocol message.


Greenplum's catalog checker (gpcheckcat) that Ashwin alluded to, compares catalog tables on master with those on segments.  Serializable isolation level is used to guarantee that the tool reads the same catalog contents on mater and segments, even if transactions commit after it started reading.

    If either RR or Serializable are effectively provided and they work, we are good with either. But I understood that as of today RR is not provided and Serializable may not be perfectly implemented, is that right?



Prior to when SSI was introduced, did PostgreSQL differentiate between repeatable read and serializable isolation levels?  If yes, can someone please elaborate how?

    Up to 9.0, RR and Serializable were the same (http://www.postgresql.org/docs/9.0/static/transaction-iso.html). Only on 9.1 onwards PostgreSQL started having a "true" serializable level, and the previous Serializable was downgraded to RR, as it wasn't a truly Serializable level:

" Note: Prior to PostgreSQL version 9.1, a request for the Serializable transaction isolation level provided exactly the same behavior described here. To retain the legacy Serializable behavior, Repeatable Read should now be requested."

http://www.postgresql.org/docs/9.4/static/transaction-iso.html

    Cheers,

Jon

unread,
Nov 12, 2015, 3:38:16 PM11/12/15
to Greenplum Developers
Most customers are using AO (appendonly) tables now because the files are smaller and you can use column orientation and/or compression.  Starting with 4.3, you can UPDATE and DELETE AO tables but we don't support serializable transactions either. 

Example:

gpdb=# create table foo (id int, fname text) with (appendonly=true) distributed by (id);
CREATE TABLE
Time: 30.124 ms
gpdb=# insert into foo select i, 'jon_' || i from generate_series(1, 100) as i;
INSERT 0 100
Time: 121.938 ms
gpdb=# begin transaction isolation level serializable;
BEGIN
Time: 1.304 ms
gpdb=# update foo set fname = upper(fname) where id = 1;
ERROR:  Updates on append-only tables are not supported in serializable transactions.  (seg0 gpdbvm43.localdomain:50000 pid=2934)

But I've been using GP since 2007 and I've never heard of anyone wanting to use serializable transactions.  

Asim Praveen

unread,
Nov 14, 2015, 6:58:16 PM11/14/15
to Álvaro Hernández Tortosa, Greenplum Developers
On Wed, Nov 11, 2015 at 4:05 PM, Álvaro Hernández Tortosa <a...@8kdata.com> wrote:

    As you know, we implement MongoDB's wire protocol. This protocol is session-less. When you issue a query and not all the results fit in the response, you retain a cursor reference. Subsequent queries may use this cursor to fetch more results. The problem is that these queries to fetch more data, being the protocol session-less, cannot be tied to a user-facing tx. In order for us to provide consistent reads, we open an "internal" cursor reading off a REPEATABLE READ READONLY tx and use that cursor for subsequent MongoDB fetches. READ COMMITTED is not enough for us as we might see new data committed before each new "fetch more" MongoDB wire protocol message.

Cursors in Greenplum behave similar to repeatable read transactions.  E.g. let transaction T1 open a cursor as "declare c cursor for select * from r1;".  While T1 fetches from cursor c, another transaction, T2 may insert / update / delete tuples from table r1.  But T1 continues to see the version of r1 when it was started.  I just verified that the same behavior is exhibited in PostgreSQL 9.4.

Isn't this behavior sufficient for your use case?  Or am I missing something obvious?

    If either RR or Serializable are effectively provided and they work, we are good with either. But I understood that as of today RR is not provided and Serializable may not be perfectly implemented, is that right?

That is correct.  But please note that Greenplum's serializable isolation mode is not broken outrightly.  E.g. read only serializable transactions behave as expected.

" Note: Prior to PostgreSQL version 9.1, a request for the Serializable transaction isolation level provided exactly the same behavior described here. To retain the legacy Serializable behavior, Repeatable Read should now be requested."

http://www.postgresql.org/docs/9.4/static/transaction-iso.html


Thank you for the confirmation.

Asim

Álvaro Hernández Tortosa

unread,
Nov 15, 2015, 5:28:50 AM11/15/15
to Asim Praveen, Greenplum Developers

On 15/11/15 00:58, Asim Praveen wrote:
On Wed, Nov 11, 2015 at 4:05 PM, Álvaro Hernández Tortosa <a...@8kdata.com> wrote:

    As you know, we implement MongoDB's wire protocol. This protocol is session-less. When you issue a query and not all the results fit in the response, you retain a cursor reference. Subsequent queries may use this cursor to fetch more results. The problem is that these queries to fetch more data, being the protocol session-less, cannot be tied to a user-facing tx. In order for us to provide consistent reads, we open an "internal" cursor reading off a REPEATABLE READ READONLY tx and use that cursor for subsequent MongoDB fetches. READ COMMITTED is not enough for us as we might see new data committed before each new "fetch more" MongoDB wire protocol message.

Cursors in Greenplum behave similar to repeatable read transactions.  E.g. let transaction T1 open a cursor as "declare c cursor for select * from r1;".  While T1 fetches from cursor c, another transaction, T2 may insert / update / delete tuples from table r1.  But T1 continues to see the version of r1 when it was started.  I just verified that the same behavior is exhibited in PostgreSQL 9.4.

Isn't this behavior sufficient for your use case?  Or am I missing something obvious?

    Yes, you are right, but it's not sufficient for us, as we need to potentially query more than one table to return a document. So we need a stable view of the database.



    If either RR or Serializable are effectively provided and they work, we are good with either. But I understood that as of today RR is not provided and Serializable may not be perfectly implemented, is that right?

That is correct.  But please note that Greenplum's serializable isolation mode is not broken outrightly.  E.g. read only serializable transactions behave as expected.

    That's more than enough then! We only need this for read operations. For write operations, read committed tx are enough :)


    Thank you!
Reply all
Reply to author
Forward
0 new messages