Potential Bug in GPDB Custom Formatter Logic

108 views
Skip to first unread message

Ashuka Xue

unread,
Feb 15, 2023, 5:27:58 PM2/15/23
to gpdb...@greenplum.org

Hi GPDB Developers, 

Users have reported that when they were inserting data into GPDB from PXF and the PXF service encountered an error, the query would complete successfully with a partial load of data from the external data source. See PXF-898 for more details. The PXF team has been looking into this issue and it looks like this behavior was the result of two bugs:

1. When the PXF service is reading from an external data source and sending the data back to GPDB, it uses HTTP chunked transfer encoding to stream the data. In older versions of the embedded web server the PXF service uses (Apache Tomcat), when an error was reported the streaming response would stop writing data, write an error response, and then close the connection in such a way that from an HTTP protocol point of view, the transfer was completed successfully. The means that PXF's custom GPDB protocol does not see any errors and therefore doesn't report an error.

In later versions of Apache Tomcat, the behavior was changed to close the connection without properly terminating the chunked transfer encoding. PXF has since upgraded to using this newer version of Tomcat and this results in newer versions of PXF detecting the error in the custom GPDB protocol and raising an error. The end result is that the transaction is ended.

2. There is a second bug in GPDB's fileam.c with regards to `FORMATTER_GET_SAW_EOF` that prevented PXF's custom formatter from detecting the error. This issue was also recently reported via the Greenplum Users mailing list (see message EOF in Custom Formatter for External Table) and the reporter comes to more or less the same conclusion:

> Using FORMATTER_GET_SAW_EOF never allows to see a true flag, thus I think there is a misunderstanding how it works

For some context, PXF C code contains 2 parts. The first is the C side protocol that connects to the PXF Java service and streams data into GPDB. The second part is the custom formatter. This takes the data streamed in and parses it into tuples for GPDB to read. The two work in tandem to continuously load and parse a data buffer. Most other custom protocols also work in this manner.

GPDB calls into the formatter code and the PXF extension protocol using `externalgettup_custom`. This function contains a double `while`-loop to handle the incoming data stream and the formatter. There are also two important variables to track in this function:
* `scan->raw_buf_done` - true if either there is no more data to read or the formatter has handled all the data currently in the buffer, i.e: the formatter can do no more processing on the data in the buffer
* `scan->fs_state->fe_eof` - true if EOF has been seen

We go into the outer-`while`-loop if `!(raw_buf_done == true && fe_eof == true)`. This means that if either `raw_buf_done == false` or `fe_eof == false` or both are false. Once we go into the outer-`while`-loop, we call into `external_getdata` if `raw_buf_done == true`. This function calls the protocol to get more data if needed and sets `fe_eof` accordingly. If the protocol returned 0 bytes, `fe_eof` is set to true. If data is loaded into the buffer, `raw_buf_done` is set to false.

We go into the inner-`while`-loop when `raw_buf_done == false` at which point we call the custom formatter. If the formatter returns `FORMATTER_NEEDS_MORE_DATA`, we set `raw_buf_done = true` as we have read all the current data in the buffer and exit the inner-`while`-loop.

Let's say that the incoming data stream ended unexpectedly and the call into the protocol to get more data returned 0 bytes. Then we would have `raw_buf_done = true` and `fe_eof = true`. In this scenario, the formatter has called for more data but the conditions for both `while` loops are such that custom formatter will not be called again and instead we report `WARNING` with the message `unexpected end of file`.

The key takeaway is that the custom formatter function was never called _after_ the EOF was detected. This prevents any error handling in the custom formatter from detecting the case where there is still unformatted data but EOF has been seen (e.g, PXF's custom formatter). If the transport (in this case, HTTP chunked transfer encoding) completes without a _transport_-level error, there is no way for the _application_-level to detect partial data transfers.

It is also interesting that the comment block states:

> The formatter needs more data, but we have reached EOF. This is an error.

but reports at the `WARNING` level.

PR 5391 refactored the code inside `externalgettup_custom` to change the reporting from `ERROR` to `WARNING`. This PR also modified the `while`-loops so that the custom formatter function is no longer called _after_ EOF is seen. Is this change in behavior expected/intentional? 

We could change the report from `WARNING` back to `ERROR` since partial data loading seems especially problematic, but this might be considered a breaking change in behavior. Alternatively, ensuring the custom formatter function is called at least once _after_ the EOF is seen would allow custom formatters to detect and handle the case where there is still some unprocessed data but EOF has been seen.

An in-depth step through of a local reproduction of the issue can be found below:

1. The function `externalgettup_custom` is called with
    * `scan->raw_buf_done = false`
    * `scan->fs_pstate->fe_eof = false`
2. The condition of the outer-`while`-loop (`!(scan->raw_buf_done && pstate->fe_eof)`) is `true`, so execution proceeds into the loop body.
3. The condition of the `if`-statement (`scan->raw_buf_done`) is `false`, the execution skips over the body of the statement.
4. The condition of the inner-`while`-loop (`!scan->raw_buf_done`) is `true`, so execution proceeds into the loop body, where the custom formatter function is invoked (line 1001).
5. The formatter function returns without an error, having set `formatter->fmt_notification = FMT_NEED_MORE_DATA` to indicate that it needs more data to form a tuple from the data buffer. This causes `scan->raw_buf_done` to be set to `true` and execution to continue with the next iteration of the inner-`while`-loop.
6. The condition of the inner-`while`-loop is now `false`, so execution continues with next statement after the loop body which is the end of the outer-`while`-loop.
7. The condition of the outer-loop is still `true` (since `pstate->fe_eof = false`), so execution proceeds into the loop body.
8. The condition of the `if`-statement is `true`, so execution proceeds into the statement body.
9. The function `external_getdata` returns `0` and the inner-`if`-statement (lines 967--971) are skipped, meaning `scan-raw_buf_done` is still `true`. This also has caused `pstate->fe_eof` to be set to `true`.
    * url_feof

10. The condition of the inner-`while`-loop is `false`, so execution continues with the next iteration of the outer-`while`-loop.
11. The condition of the outer-`while`-loop is now `false` (`scan->raw_buf_done = true` and `pstate->fe_eof = true`), so execution continues with the next statement after the outer-loop.
12. The `if`-statement at the end of `externalgettup_custom` detects that there is still unformatted data that has not been processed by the formatter but that we have reached EOF. A warning is reported and the scan is ended. 

A diagram of the function can be found here: externalgettup_custom diagram 1.jpg 

Thanks, 
Ashuka. Bradford and Yiming from the PXF Team

Ashuka Xue

unread,
Feb 15, 2023, 8:25:54 PM2/15/23
to gpdb...@greenplum.org, Ashuka Xue
Hi all, 

It seems the link for the diagram was not shared properly. Please use this link instead: externalgettup_custom diagram 1.jpg

Thanks, 
Ashuka

From: 'Ashuka Xue' via Greenplum Developers <gpdb...@greenplum.org>
Sent: Wednesday, February 15, 2023 2:27 PM
To: gpdb...@greenplum.org <gpdb...@greenplum.org>
Subject: Potential Bug in GPDB Custom Formatter Logic
 
!! External Email
!! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.

Rui Zhao

unread,
Feb 16, 2023, 7:55:23 AM2/16/23
to Greenplum Developers, Ashuka Xue
Hi Ashuka,

The picture shows the real procedure of function externalgettup_custom(). I think it is a good material to learn about the custom formatter processing.

I think there are two ways to avoid this potential bug:
1. Append an end of line to the output.
That's what S3 protocol currently doing to work around this bug. If the data file "text_in" is like :
`hexdump text_in -c
0000000   d   e   d   e   d   d  \n   d   e   d   e   d   d   e   f
0000010   f   e   f   f   e   f  \n   d   e   d   e   d   e   d   e  \n
0000020   d   d   d  \n   d   d   d
0000027`
wc text_in
 4  6 39 text_in
The file contains 39 char, and the last char is not '\n'.
Then when create the external table like this, it could handle the last line"ddd"
```create external table test_s3custom_formatter(name text) LOCATION('s3://s3-us-west-2.amazonaws.com/s3test.pivotal.io/regress/oneline/ config=/home/gpadmin/workspace/6_greenplum/s3.key') format 'custom' (formatter='text_in');
select * from test_s3custom_formatter ;
      name
-----------------
 dededd
 dededdef feffef
 dededede
 ddd
 ddd
(5 rows)```
Why s3 could handle the last line is that on the first loop, external_getdata returned 39 which contains all the text content, then after it out put the first 4lines, it entered the FMT_NEED_MORE_DATA block, then "raw_buf_done" is set to true, and then external_getdata is called for the second time, this time, S3 protocol detects the file ends without eol, so it returns 1, the data is "\n". So in following "ddd" could be handled and formatted into a tuple. Then, the last loop will be called, external_getdata returned 0. This time, the "pstate->fe_eof" is set. Then all of the loop ends.

This solution works around most of the cases, and if any connection error happened, S3 protocol would throw an error. In this way, if the last line missing any column, they could be handled and recorded in error log files.

2. Set the EOF flag at the first reading at the end of the file.
This solution is used in file and gpfdist protocol, because in both url_file_fread() and url_curl_fread(), there is one reading loop which tries to read the expected length, but if the returned length is less than the expected length, then the EOF flag is set. So for file and gpfdist protocol, it could detects the EOF as earliest timing which could avoid the bug in most of the cases except when the expecting length is the buffer length and the last returning length is equal to the buffer length, then in this case, the bug would be triggered, but in a very very little probability.

But this solution does not work for custom protocol, that's why S3 protocol uses the 1 work around to avoid this bug.
If we try to use the 2nd way, we need to add one eof flag inside ExtProtocolData struct to let custom protocol set the EOF at the earliest timing. And the url_custom_feof() would be modified to check the EOF flag in ExtProtocolData struct.

I hope the first way could work for PXF because it need to modify the code of url_custom and there is still possibility to trigger the bug in a very very little probability.


Thanks
Rui Zhao

Ashuka Xue

unread,
Feb 21, 2023, 9:13:29 PM2/21/23
to Greenplum Developers, Rui Zhao, Ashuka Xue
Hi Rui Zhao, 

1. For this workaround, if I understood you correctly, the s3 protocol was able to receive data after the call FMT_NEED_MORE_DATA and thus able to detect that the file ends without EOL. 

However, in our situation, the FMT_NEED_MORE_DATA call was unable to receive any more data because the connection was ended abruptly. In this scenario, pstate->fe_eof would be set (and thus FORMATTER_GET_SAW_EOF would be true) and we would not be able to enter our protocol to handle the situation when the file ends abruptly. 

In this scenario, we would ideally call the formatter function once more so that the protocol can handle the situation where FORMATTER_GET_SAW_EOF is true 

2. As you mentioned, the second work around you provided does not work with custom protocol. If we were to continue with the fix you recommended here (EOF flag inside ExtProtocolData), we would need to modify the code inside GPDB, correct? 

Thanks, 
Ashuka

Rui Zhao

unread,
Feb 21, 2023, 11:29:06 PM2/21/23
to Greenplum Developers, Ashuka Xue, Rui Zhao
Hi Ashuka,


> 1. For this workaround, if I understood you correctly, the s3 protocol was able to receive data after the call FMT_NEED_MORE_DATA and thus able to detect that the file ends without EOL. 
> 2. As you mentioned, the second work around you provided does not work with custom protocol. If we were to continue with the fix you recommended here (EOF flag inside ExtProtocolData), we would need to modify the code inside GPDB, correct? 

Correct!

But I am not quite familiar with PXF implementations(Maybe we could sync more detailed implements about PXF if needed), in my understanding, if any connection was ended abruptly, there would be errlog be called, because this kind of error cannot be handled by sreh. And I think the custom protocol should set the eof and detect connection error in a proper way. It is not easy to modify the code logic in function externalgettup_custom() because it may affect all custom formatters. 

Thanks,
Rui Zhao

Ashuka Xue

unread,
Mar 15, 2023, 4:07:31 PM3/15/23
to gpdb...@greenplum.org
Hi GPDB Devs,

For those interested, commits were pushed to GPDB main and 6X_STABLE with a fix for the custom formatter logic. 

It allows the formatter to handle cases where an EOF has been seen. 

Thanks, 
Ashuka

From: Ashuka Xue <ax...@vmware.com>
Sent: Wednesday, February 15, 2023 5:25 PM
To: gpdb...@greenplum.org <gpdb...@greenplum.org>; Ashuka Xue <ax...@vmware.com>
Subject: Re: Potential Bug in GPDB Custom Formatter Logic
 
Reply all
Reply to author
Forward
0 new messages