Client and server encodings across QD and QE


Heikki Linnakangas

Jan 8, 2019, 3:41:02 AM1/8/19
to Greenplum Developers
Hi,

I started to look into the regression failure with the multibyte
regression tests in src/test/mb, reported as
https://github.com/greenplum-db/gpdb/issues/5241. The root cause of the
failure is confusion on client_encoding between QD and QEs.

Currently, we don't do anything special with client_encoding in QE
processes, so its value is set like any other GUC. In most cases, it
means that the QE processes have the same client_encoding setting as the
QD process. That's problematic.

If you look at the whole pipeline from the user to the QE process, we
have a situation like this:

                        QD                                QE
client <==> client_enc <-> server_enc <==> client_enc <-> server_enc


When a query enters the system via libpq, from the left, it is first
converted from client_encoding to server encoding, in the QD process. QD
plans the query, and dispatches it to the QE processes.

You might think that things go wrong right there, because the QE will
try to perform the client->server encoding conversion again. But it
works, because the QD->QE dispatching is performed using the special 'M'
message type, and the code in tcop/postgres.c that handles that, doesn't
perform encoding conversion. So the query is expected to already be in
server encoding, which is correct. Transferring the query result back to
the client also works, because the results from QEs to QD are not sent
via the libpq connection. They are sent via the interconnect, which also
doesn't do any encoding conversions. So all that communication between
the QD and QEs works in the server encoding.

Except for a few things:

1. COPY TO is one exception: the data is converted to the client
encoding already in the QE. That's mostly handled correctly, except in
the case exposed in github issue #5241, where the QE's client_encoding
setting is different from the QD's.

2. ERROR processing. If an error happens in a QE, the QE converts it
from server to client encoding, and sends it back to the QD. The QD
incorrectly assumes that it's in the server encoding, and will try to
convert it again. As a result, you get e.g. this:

postgres=# create function raise_error(t text) returns void as $$
begin
-- Unicode code point 196 is "Latin Capital Letter A with Diaeresis".
raise 'raise_error called on %. Here''s a funny character: %', t,
chr(196);
end;
$$ language plpgsql;
CREATE FUNCTION
postgres=# set client_encoding=latin1;
SET
postgres=# select raise_error(t) from enctest;
FATAL: Unexpected internal error (elog.c:259)
DETAIL: FailedAssertion("!(pg_verifymbstr(*str, len, ((bool) 1)))",
File: "elog.c", Line: 259)
HINT: Process 32757 will wait for gp_debug_linger=120 seconds before
termination.
Note that its locks and other resources will not be released until then.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
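The assertion failure above can be modeled in a few lines of Python (a sketch, not Greenplum code): the QE converts the message to the client encoding, and the QD's validation of the bytes as server-encoded UTF-8 then fails, much like pg_verifymbstr() does:

```python
# Server encoding UTF-8, client_encoding LATIN1, as in the psql session above.
error_msg = "Here's a funny character: " + chr(196)  # 'Ä', server-encoded in the QE
wire_bytes = error_msg.encode("latin-1")             # QE: server->client conversion

# The QD wrongly assumes the bytes are in the server encoding and validates them:
try:
    wire_bytes.decode("utf-8")    # rough analogue of pg_verifymbstr()
    valid = True
except UnicodeDecodeError:
    valid = False                 # -> the FailedAssertion in elog.c
```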


3. Internal queries dispatched directly with CdbDispatchCommand()
function. They don't use the interconnect, and send the results from QEs
back to the QD via libpq. Most such queries return ASCII-only data, like
tuple counts, so the encoding doesn't matter. But there's one exception:
ANALYZE collects the sample from the segments using
CdbDispatchCommand(). So you can get invalidly encoded data in pg_stats.


How to fix this? I think there are two possible solutions:

1. Teach the QD that things coming from QEs have already been converted
to client encoding. This isn't as simple as it might seem at first
glance. You could forward any errors to the client as is, as they're
already in the client's encoding, but then we'd need to keep track of
which errors originated from a QE and don't need encoding conversion,
and which do. Also, if an error is caught e.g. with a PL/pgSQL EXCEPTION
block, rather than sent to the client, then we'd need to convert it back
from client_encoding to server encoding. We'll also need to fix the case
exposed in github issue #5241, where the client_encodings in QD and QE
currently go out of sync.

2. Always set client_encoding to match server encoding in QEs. This
seems simpler, but there are a couple of little complications with this
approach, too. First, in COPY TO, we should still perform the encoding
conversion in the QE, so that will need a special case. That's just a
simple matter of programming, though, because you can already specify a
non-default encoding in the COPY options, so we can make use of that. A
thornier case is in the NOTICE processing. Currently, we install a libpq
NOTICE processor in the QD->QE libpq connections, which just forwards
the NOTICE to the client, without encoding conversion. It runs as a
libpq callback function, so it cannot do anything that ereport()s, so it
is not safe to do encoding conversion there. We'll need to rewrite that
so that the NOTICE processor just stashes away NOTICE message, and we
convert and send it back to the client later, when we're out of the
callback function.
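The rewritten NOTICE path could look roughly like this Python sketch (names and shapes are hypothetical, not the actual Greenplum API): the callback only appends the raw bytes, and the conversion, which is allowed to fail, happens later, outside the callback:

```python
stashed_notices = []

def notice_processor(raw: bytes) -> None:
    # Runs as a libpq callback: must not raise, so only stash the bytes.
    stashed_notices.append(raw)

def forward_stashed_notices(client_encoding: str) -> list:
    # Called once we're out of the callback: safe to convert (and to ereport()).
    forwarded = []
    while stashed_notices:
        raw = stashed_notices.pop(0)
        text = raw.decode("utf-8")                      # bytes are server-encoded
        forwarded.append(text.encode(client_encoding))  # server->client conversion
    return forwarded

notice_processor("NOTICE: funny character: Ä".encode("utf-8"))
result = forward_stashed_notices("latin-1")
```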


So, I'm leaning towards solution 2: always set client_encoding to match
server encoding (or to SQL_ASCII) in the QE processes. Anyone see a
problem with that?

- Heikki

Ming Li

Jan 8, 2019, 4:22:37 AM1/8/19
to Heikki Linnakangas, Greenplum Developers
Hi Heikki,

Is it possible to:
1) Keep the client_encoding GUC value on the QE the same as on the QD, i.e. it refers to the real user client encoding, so that we can still use it if needed (like in COPY).
2) In the QD, when sending to the remote client, check the libpq is_internal_gpdb_conn(); if true, we don't need to convert the encoding. We can skip the encoding conversion in any case where the DestRemote is the QD.

Thanks.

Adam Lee

Jan 8, 2019, 9:39:30 PM1/8/19
to Heikki Linnakangas, Greenplum Developers, Jerome Yang, Hubert Zhang
On Tue, Jan 08, 2019 at 10:40:57AM +0200, Heikki Linnakangas wrote:
> ...
> 2. Always set client_encoding to match server encoding in QEs. This seems
> simpler, but there are a couple of little complications with this approach,
> too. First, in COPY TO, we should still perform the encoding conversion in
> the QE, so that will need a special case. That's just a simple matter of
> programming, though, because you can already specify a non-default encoding
> in the COPY options, so we can make use of that. A thornier case is in the
> NOTICE processing. Currently, we install a libpq NOTICE processor in the
> QD->QE libpq connections, which just forwards the NOTICE to the client,
> without encoding conversion. It runs as a libpq callback function, so it
> cannot do anything that ereport()s, so it is not safe to do encoding
> conversion there. We'll need to rewrite that so that the NOTICE processor
> just stashes away NOTICE message, and we convert and send it back to the
> client later, when we're out of the callback function.
>
> So, I'm leaning towards solution 2: always set client_encoding to match
> server encoding (or to SQL_ASCII) in the QE processes. Anyone see a problem
> with that?
>
> - Heikki

If I get this right, your concern is that there might be some other
code "sending" messages from the QE to the user client relying on
"client_encoding".

I don't see any other than COPY and error handling. Looping in more
people who are working on gptext or PL.

--
Adam Lee

Heikki Linnakangas

Jan 9, 2019, 3:56:48 AM1/9/19
to Adam Lee, Greenplum Developers, Jerome Yang, Hubert Zhang
On 09/01/2019 04:39, Adam Lee wrote:
> On Tue, Jan 08, 2019 at 10:40:57AM +0200, Heikki Linnakangas wrote:
>> ...
>> 2. Always set client_encoding to match server encoding in QEs. This seems
>> simpler, but there are a couple of little complications with this approach,
>> too. First, in COPY TO, we should still perform the encoding conversion in
>> the QE, so that will need a special case. That's just a simple matter of
>> programming, though, because you can already specify a non-default encoding
>> in the COPY options, so we can make use of that. A thornier case is in the
>> NOTICE processing. Currently, we install a libpq NOTICE processor in the
>> QD->QE libpq connections, which just forwards the NOTICE to the client,
>> without encoding conversion. It runs as a libpq callback function, so it
>> cannot do anything that ereport()s, so it is not safe to do encoding
>> conversion there. We'll need to rewrite that so that the NOTICE processor
>> just stashes away NOTICE message, and we convert and send it back to the
>> client later, when we're out of the callback function.
>>
>> So, I'm leaning towards solution 2: always set client_encoding to match
>> server encoding (or to SQL_ASCII) in the QE processes. Anyone see a problem
>> with that?
>
> If I get this right, your concern is that there might be some other
> code "sending" messages from the QE to the user client relying on
> "client_encoding".

Right.

- Heikki

Heikki Linnakangas

Jan 9, 2019, 3:59:55 AM1/9/19
to Ming Li, Greenplum Developers
On 08/01/2019 11:22, Ming Li wrote:
> Hi Heikki,
>
> Is it possible to:
> 1) Keep the client_encoding GUC value on the QE the same as on the QD,
> i.e. it refers to the real user client encoding, so that we can still
> use it if needed (like in COPY).
> 2) In the QD, when sending to the remote client, check the libpq
> is_internal_gpdb_conn(); if true, we don't need to convert the encoding.
> We can skip the encoding conversion in any case where the DestRemote is
> the QD.

It's hard to keep track of which messages originally came from the QE,
and which were produced within the QD. Also, sometimes an error message
is produced in the QE, but is caught there, instead of transmitting it
to the user. In such cases, we'd need to convert it back from the client
encoding to the server encoding, which is silly performance-wise, and
makes things more complicated.

We could do all that, but it gets very complicated, and I'm sure we'd
miss something.

- Heikki

Ming Li

Jan 9, 2019, 4:29:05 AM1/9/19
to Heikki Linnakangas, Greenplum Developers
Hi Heikki,

Maybe my comment in the previous email was not clear; I'm suggesting a new approach:

1) The libpq connections between the QD and QEs don't need any encoding conversion. They always use the value of server_enc as the libpq client_enc setting if the libpq connection is is_internal_gpdb_conn().

                        QD                       QE
client <==> client_enc <-> server_enc  <==>  server_enc

2) The client_enc GUC already refers to the outer GPDB libpq connection.

We just need to hard-code the libpq client code and its server processing code: if the libpq connection is is_internal_gpdb_conn(), skip the encoding conversion.


Heikki Linnakangas

Jan 9, 2019, 4:46:18 AM1/9/19
to Ming Li, Greenplum Developers
On 09/01/2019 11:28, Ming Li wrote:
> Hi Heikki,
>
> Maybe my comment in the previous email was not clear; I'm suggesting a
> new approach:
>
> 1) The libpq connections between the QD and QEs don't need any encoding
> conversion. They always use the value of server_enc as the libpq
> client_enc setting if the libpq connection is is_internal_gpdb_conn().
>
>                         QD                       QE
> client <==> client_enc <-> server_enc  <==>  server_enc
>
> 2) The client_enc GUC already refers to the outer GPDB libpq connection.
>
> We just need to hard-code the libpq client code and its server
> processing code: if the libpq connection is is_internal_gpdb_conn(),
> skip the encoding conversion.

Oh, I see. So in QEs, set client_encoding GUC like today, but don't
actually perform any conversions, as if it was set the same as
server_encoding? Seems doable. I'm not sure if it's better than setting
client_encoding to server_encoding directly.

Another variant of that idea is to always set client_encoding the same
as server_encoding in QEs, but add another GUC, "client_encoding_in_qd"
or something, to indicate the client_encoding used by the user's
connection to QD. COPY could look at that, instead of client_encoding.

I'll think about it. All of these variants are essentially the same, in
that QEs don't do conversions between client and server encodings
(except for COPY TO). The difference is in how that is represented in
the GUCs.

- Heikki

Ming Li

Jan 9, 2019, 4:56:39 AM1/9/19
to Heikki Linnakangas, Greenplum Developers
Exactly.

If we add a new GUC, it will be easy to misunderstand, and the client_encoding GUC would then mean different things in different scenarios, which is error-prone. So I would like to keep the GUC to a single meaning in every scenario.


Heikki Linnakangas

Jan 9, 2019, 8:03:41 AM1/9/19
to Ming Li, Greenplum Developers
On 09/01/2019 11:56, Ming Li wrote:
> Exactly.
>
> If we add a new GUC, it will be easy to misunderstand, and the
> client_encoding GUC would then mean different things in different
> scenarios, which is error-prone. So I would like to keep the GUC to a
> single meaning in every scenario.

It is not really clear what client_encoding should mean in the QE.

For example, let's imagine that the database's encoding is UTF-8, and
the user connects with psql, with client_encoding=LATIN1. Would you set
QE's client_encoding to LATIN1 or UTF-8? You can argue that it should be
set to LATIN1, because that's the client_encoding that the user
specified. But from the QE process' and QD->QE libpq connection's point
of view, that's not right, because that libpq connection is in fact
using UTF-8 for all the communication.

I'm leaning towards setting client_encoding to UTF-8 in the QEs. I think
that requires the least changes, as that leads to correct behavior in
most backend code. The only exception is COPY TO, which then needs
special handling to specify the encoding explicitly when the COPY TO
command is dispatched from the QD to the QEs.
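For that COPY TO special case, the dispatched command could carry the user's encoding explicitly, along the lines of this hypothetical Python sketch (the rewrite function and statement shape are illustrative only, not the actual dispatch code):

```python
def add_copy_encoding(copy_stmt: str, qd_client_encoding: str) -> str:
    # The QD knows the user's client_encoding; with this scheme the QEs no
    # longer do, so the dispatched COPY TO statement names it explicitly.
    return "%s WITH (ENCODING '%s')" % (copy_stmt, qd_client_encoding)

dispatched = add_copy_encoding("COPY mytable TO STDOUT", "LATIN1")
```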

- Heikki