Performance issue on dispatching debug_query_string


Ning Yu

Oct 8, 2018, 11:17:39 PM
to Greenplum Developers
Hi,

debug_query_string is a useful variable for developers to find out the current query with gdb; it's recorded on the master and dispatched to the segments.  One problem with it is that if the SQL is parsed into multiple statements, debug_query_string is dispatched repeatedly for each statement, which can be very expensive in some cases.

One example is creating a partitioned table:

    create table t (c1 int, c2 int)
    partition by range(c2) (start(0) end(1000) every(1));

The stmts list returned by transformCreateStmt() contains 2001 substmts, so debug_query_string is dispatched 2001 times.  As the SQL itself is small, the total execution time is only 7s on a default demo cluster (master branch, --enable-debug, --enable-cassert, CFLAGS="-g -O3").  But what if we add a large amount of comments to the SQL?

    create table t (c1 int, c2 int)
    /* comment */
    /* ... */
    /* comment */
    partition by range(c2) (start(0) end(1000) every(1));

In my experiment, with 10MB of comments the execution time becomes 35s, because the 10MB query is dispatched 2001 times, which is about 20GB of data to dispatch.  If we dispatch an empty string instead of debug_query_string, the execution time drops to 8s.

In the real world nobody is likely to put The Lord of the Rings as a comment in a SQL statement, but there is still a chance for a CREATE TABLE statement to be large enough on its own:

    create table t (c1 int, c2 int, ..., c100 int)
    with (appendonly=true, compresslevel=5, orientation=column, compresstype=zlib)
    partition by range(c2) (
        start(0) end(1) every(1) with (tablename='p1', appendonly=true, compresslevel=5, orientation=column, compresstype=zlib)
            column c1 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
            column c2 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
            ...
            column c100 encoding (compresslevel=5, compresstype=zlib, blocksize=32768),
        start(1) end(2) every(1) with (tablename='p2', appendonly=true, compresslevel=5, orientation=column, compresstype=zlib)
            column c1 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
            column c2 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
            ...
            column c100 encoding (compresslevel=5, compresstype=zlib, blocksize=32768),
         ...
        start(999) end(1000) every(1) with (tablename='p1000', appendonly=true, compresslevel=5, orientation=column, compresstype=zlib)
            column c1 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
            column c2 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
            ...
            column c100 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
    );

Execution times for it:
- by dispatching debug_query_string: 65s;
- by dispatching empty string: 50s;

On a more powerful server, a more complicated version of the above SQL gives these results:
- by dispatching debug_query_string: 160s;
- by dispatching empty string: 30s;

We are considering optimizing this; there are several options:

0. never dispatch debug_query_string;
1. only dispatch debug_query_string in debug build;
2. only dispatch debug_query_string if it's small enough (and count of substmts is small enough);
3. only dispatch debug_query_string on the first substmt and provide some way to let segments reuse it in other substmts;
4. provide a GUC to control whether to dispatch it;
5. etc.;

Without debug_query_string it's still possible to find the session id and command id from a segment core dump (gp_session_id, gp_command_count), and thus find the SQL in the master logs.

Do you think it's worthwhile to make such an optimization?  Any input is welcome.

Thanks
Ning


Robert Eckhardt

Oct 8, 2018, 11:29:06 PM
to Ning Yu, Greenplum Developers
On Mon, Oct 8, 2018, 11:17 PM Ning Yu <n...@pivotal.io> wrote:
Hi,

debug_query_string is a useful variable for developers to find out the current query with gdb; it's recorded on the master and dispatched to the segments.  One problem with it is that if the SQL is parsed into multiple statements, debug_query_string is dispatched repeatedly for each statement, which can be very expensive in some cases.
[snip] 

We are considering optimizing this; there are several options:

0. never dispatch debug_query_string;
1. only dispatch debug_query_string in debug build;
2. only dispatch debug_query_string if it's small enough (and count of substmts is small enough);
3. only dispatch debug_query_string on the first substmt and provide some way to let segments reuse it in other substmts;
4. provide a GUC to control whether to dispatch it;
5. etc.;

Without debug_query_string it's still possible to find the session id and command id from a segment core dump (gp_session_id, gp_command_count), and thus find the SQL in the master logs.

Based on the name alone and your brief description, this doesn't seem like something we should do outside of a debug build.

Can you see a use case a user might have for this? 

Rob 

Ning Yu

Oct 9, 2018, 3:28:20 AM
to Robert Eckhardt, Greenplum Developers
Hi Rob,

Most consumers of debug_query_string use it to record the current query in log messages.  The only user-facing interface to debug_query_string is the UDF current_query(), but it's usually executed on the QD, so it still sees the correct debug_query_string; even if the user executes something like `select current_query() from gp_dist_random('gp_id');` it still gets the correct values, because debug_query_string on the segments is always dispatched from the QD.  On the QD there are several kinds of dispatch methods; here is a list of the build functions for them:

```
cdbdisp_buildCommandQueryParms(const char *strCommand, ...)
{
...
    /* the caller passes the command text explicitly */
    pQueryParms->strCommand = strCommand;
...
}

cdbdisp_buildUtilityQueryParms(struct Node *stmt, ...)
{
...
    /* utility statements (DDL) reuse debug_query_string directly */
    pQueryParms->strCommand = debug_query_string;
...
}

cdbdisp_buildPlanQueryParms(struct QueryDesc *queryDesc, ...)
{
...
    /* planned statements (SELECT/DML) use the query's own source text */
    pQueryParms->strCommand = queryDesc->sourceText;
...
}
```

debug_query_string on the segments thus has different sources on the QD for DDL and DML.  What triggers the performance issue is CREATE TABLE ... PARTITION BY, which goes through `cdbdisp_buildUtilityQueryParms()`; SELECT queries are not affected.  Whether we could pass an empty string as `strCommand` is a separate topic.

So in general I believe that if we only hack cdbdisp_buildUtilityQueryParms(), it should not affect end users.
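
For illustration only, here is roughly what such a change could look like, in the style of the snippets above; `truncate_debug_query_string()` is a hypothetical helper made up for this sketch, not an existing GPDB function, and this is not an actual patch:

```
cdbdisp_buildUtilityQueryParms(struct Node *stmt, ...)
{
...
    /*
     * Sketch: dispatch only a bounded prefix of the query text (or an empty
     * string) for utility statements, leaving the SELECT/DML path in
     * cdbdisp_buildPlanQueryParms() untouched.
     */
    pQueryParms->strCommand = truncate_debug_query_string(debug_query_string);
...
}
```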

Best Regards
Ning

Asim R P

Oct 9, 2018, 1:25:46 PM
to n...@pivotal.io, Greenplum Developers
Good finding.

On Mon, Oct 8, 2018 at 8:17 PM Ning Yu <n...@pivotal.io> wrote:
>
> 2. only dispatch debug_query_string if it's small enough (and count of substmts is small enough);

I support this option. How about dispatching always but truncating if
it's longer than a threshold?

Venkatesh Raghavan

Oct 9, 2018, 1:40:50 PM
to Asim Praveen, Ning Yu, Greenplum Developers
How easy is it to strip out the comments? Wouldn't that be a good next step?

V

Ashwin Agrawal

Oct 9, 2018, 1:45:32 PM
to Venkatesh Raghavan, Asim Praveen, Ning Yu, Greenplum Developers

On Tue, Oct 9, 2018 at 10:40 AM Venkatesh Raghavan <vrag...@pivotal.io> wrote:
How easy is it to strip out the comments? Wouldn't that be a good next step?

I don't think the problem is only with comments. As mentioned at the start of the thread, the DDL itself can be bulky. I also think the problem is not limited to partitioned tables; functions and the like can suffer from the same problem.

Ning Yu

Oct 9, 2018, 8:31:02 PM
to Asim R P, Greenplum Developers
Yes, personally I also vote for truncating.

Ning Yu

Oct 9, 2018, 8:50:14 PM
to Venkatesh Raghavan, Asim Praveen, Greenplum Developers
As Ashwin explained, comments are not the only trigger of the issue.  But stripping out comments is indeed one of the potential next steps; some of the others are:

- for the last DDL example I put in the thread we could:
  - merge the partition by subclauses
  - release some temp memory: we noticed that the generated stmts are about 200MB, but there is an extra 1GB of memory leaked during the generation, which might cause OOM

Paul Guo

Oct 9, 2018, 9:10:50 PM
to Ning Yu, Asim Praveen, Greenplum Developers
Yes, a truncated debug_query_string for the QEs seems like a better idea.

By the way, from a performance perspective, I'm wondering if we should also start to think about compression for the dispatched data. Personally I'm curious about the compression ratio of that data and the CPU vs. network I/O balance.

Ning Yu <n...@pivotal.io> wrote on Wed, Oct 10, 2018 at 8:31 AM:

Ashwin Agrawal

Oct 9, 2018, 9:48:31 PM
to Paul Guo, Ning Yu, Asim Praveen, Greenplum Developers
On Tue, Oct 9, 2018 at 6:10 PM Paul Guo <pau...@gmail.com> wrote:
Yes, a truncated debug_query_string for the QEs seems like a better idea.

By the way, from a performance perspective, I'm wondering if we should also start to think about compression for the dispatched data. Personally I'm curious about the compression ratio of that data and the CPU vs. network I/O balance.

If this turns into something that needs compression and decompression, then I feel we should definitely reconsider whether it's worth it and find alternatives.

Ning Yu

Oct 9, 2018, 9:49:17 PM
to Paul Guo, Asim Praveen, Greenplum Developers
On Wed, Oct 10, 2018 at 9:10 AM, Paul Guo <pau...@gmail.com> wrote:
Yes, a truncated debug_query_string for the QEs seems like a better idea.

By the way, from a performance perspective, I'm wondering if we should also start to think about compression for the dispatched data. Personally I'm curious about the compression ratio of that data and the CPU vs. network I/O balance.

If I understand correctly, the serialized nodes are already compressed (see serializeNode()).  On the other hand `pQueryParms->strCommand` is not compressed.  In our experiments we dispatched a compressed debug_query_string and uncompressed it on the segments: a 5MB SQL statement was compressed down to 64KB, but as there were 4000 sub-stmts that still means 5MB*4000=20GB of data to compress and uncompress (the compression itself can be cached), so CPU became the new bottleneck and the overall performance was even worse.

Suppose we truncate debug_query_string to 1KB; then even if it's dispatched 4000 times it's only about 4MB, which should not be a big problem.
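
To make the idea concrete, a minimal self-contained sketch of such a truncation helper follows; the function name, the 1KB limit, and the use of plain malloc() here are assumptions for illustration (a real patch inside GPDB would likely use palloc and be multi-byte aware), not the actual implementation:

```
#include <stdlib.h>
#include <string.h>

/* Illustrative cap: dispatch at most this many bytes of the query text. */
#define QUERY_STRING_TRUNCATE_SIZE 1024

/*
 * Return a copy of the query text capped at QUERY_STRING_TRUNCATE_SIZE bytes,
 * so that even 4000 dispatches of the same statement carry only a few MB.
 */
static char *
truncate_debug_query_string(const char *query)
{
    size_t  len;
    char   *copy;

    if (query == NULL)
        return NULL;

    len = strlen(query);
    if (len > QUERY_STRING_TRUNCATE_SIZE)
        len = QUERY_STRING_TRUNCATE_SIZE;   /* keep only the head of the query */

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, query, len);
    copy[len] = '\0';
    return copy;
}
```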
 

Ning Yu <n...@pivotal.io> wrote on Wed, Oct 10, 2018 at 8:31 AM:
Yes, personally I also vote for truncating.

On Wed, Oct 10, 2018 at 1:25 AM, Asim R P <apra...@pivotal.io> wrote:
Good finding.

On Mon, Oct 8, 2018 at 8:17 PM Ning Yu <n...@pivotal.io> wrote:
>
> 2. only dispatch debug_query_string if it's small enough (and count of substmts is small enough);

I support this option.  How about dispatching always but truncating if
it's longer than a threshold?


Ning Yu

Oct 9, 2018, 9:51:40 PM
to Ashwin Agrawal, Paul Guo, Asim Praveen, Greenplum Developers
That's true; we'd better move compression and decompression to a separate topic.

Ivan Novick

Oct 9, 2018, 10:00:57 PM
to Ning Yu, Greenplum Developers
I think the correct option is 4, provide a GUC to control whether to dispatch it, and we should change the default of the GUC to NOT dispatch it.  If someone needs to debug something they can change the GUC value to restore the old behavior.  I really don't see a case for dispatching it under normal conditions, but someone might want to use it for debugging without a recompile.

Makes sense?
--
Ivan Novick, Product Manager Pivotal Greenplum

Simon Gao

Oct 9, 2018, 10:27:43 PM
to Ivan Novick, Ning Yu, Greenplum Developers
When debugging a problem, developers can find the full SQL in the master log without recompiling.  We reviewed the existing GUCs and didn't see a proper one to control this, and we want to avoid introducing a new GUC just for this specific purpose.

Ivan Novick

Oct 9, 2018, 10:29:25 PM
to Simon Gao, Ning Yu, Greenplum Developers
On Wed, Oct 10, 2018 at 3:27 AM Simon Gao <sg...@pivotal.io> wrote:
When debugging a problem, developers can find the full SQL in the master log without recompiling.  We reviewed the existing GUCs and didn't see a proper one to control this, and we want to avoid introducing a new GUC just for this specific purpose.

Why avoid a new GUC?  The performance gain is so HUGE that it's worth adding one.  The only other good option is to just never make that debug info available on the segments and get the gain without adding the GUC, but I am not sure whether developers will ever need this info.

Paul Guo

Oct 9, 2018, 11:29:39 PM
to ino...@pivotal.io, SIMON GAO, Ning Yu, Greenplum Developers
With a truncated debug_query_string on the QEs, users can search the QD log to find the full query string, which should be sufficient for debugging.  Besides, it is not always convenient to switch the GUC and re-check, since some scenarios may not be reproducible.


Ivan Novick <ino...@pivotal.io> wrote on Wed, Oct 10, 2018 at 10:29 AM:

Ning Yu

Oct 10, 2018, 12:17:13 AM
to Ivan Novick, Simon Gao, Greenplum Developers
On Wed, Oct 10, 2018 at 10:29 AM, Ivan Novick <ino...@pivotal.io> wrote:


On Wed, Oct 10, 2018 at 3:27 AM Simon Gao <sg...@pivotal.io> wrote:
When debugging a problem, developers can find the full SQL in the master log without recompiling.  We reviewed the existing GUCs and didn't see a proper one to control this, and we want to avoid introducing a new GUC just for this specific purpose.

Why avoid a new GUC?  The performance gain is so HUGE that it's worth adding one.  The only other good option is to just never make that debug info available on the segments and get the gain without adding the GUC, but I am not sure whether developers will ever need this info.

The main purpose of debug_query_string is to let us know the current query.  Without debug_query_string we can still find the current query from the session id and command id, as long as the queries are logged on the master.  Suppose the queries are not logged: then we might need to turn on such a GUC to dispatch debug_query_string and wait for a reproduction; but we could just as well increase the log level on the master to log the queries, so once the issue reproduces we would also know the query.  In that sense we do not need a new GUC to control whether to dispatch debug_query_string.

On the other hand, instead of never dispatching debug_query_string we think dispatching a truncated one is better.  Usually we do not care about the entire query; we care about the query type: is it a CREATE TABLE? is it partitioned? etc.  A truncated query should be enough to provide this key information most of the time.

Ning Yu

Oct 10, 2018, 12:28:53 AM
to Ivan Novick, Simon Gao, Greenplum Developers
BTW, suppose we dispatch a truncated debug_query_string; what's actually affected?

For DMLs such as SELECT, INSERT, UPDATE and DELETE: no change at all; the complete debug_query_string is available on both the master and the segments.

For DDLs such as CREATE, ALTER and DROP: the complete debug_query_string is still available on the master, while a truncated debug_query_string is available on the segments.

 

Heikki Linnakangas

Oct 10, 2018, 1:27:36 AM
to Ning Yu, Ivan Novick, Simon Gao, Greenplum Developers
On 10/10/2018 07:28, Ning Yu wrote:
> BTW, suppose we dispatch a truncated debug_query_string; what's
> actually affected?
>
> For DMLs such as SELECT, INSERT, UPDATE and DELETE: no change at all;
> the complete debug_query_string is available on both the master and the segments.
>
> For DDLs such as CREATE, ALTER and DROP: the complete debug_query_string is
> still available on the master, while a truncated debug_query_string is
> available on the segments.

It seems strange that SELECT, DML commands and DDLs would be treated
differently.

Lots of good discussion. I couldn't decide which parts to reply to, so I
just collected my thoughts as a list:

* The elephant in the room is that a user's CREATE TABLE command can be
expanded to thousands of commands between the QD and QEs, if the table
is partitioned. If we could refactor that so that we'd dispatch all the
sub-commands as one batch, that would be nice. Aside from the
debug_query_string effect, it would reduce the number of round-trips.

* I object to removing debug_query_string completely, and to having a
GUC for it. debug_query_string is a very valuable debugging aid. When I
attach a debugger to a core dump or a hung backend, that's the first
thing I look at. If we have a GUC for it, it's in practice never going
to be there when you need it the most.

* As Ning Yu explained, the main purpose of debug_query_string is to
identify the current query. With the query string, you can look for the
place where the query is issued in the application or script, to get
more context. Truncating the string is OK for that purpose, but removing
comments is a bad idea, because often the comments are even more useful
for finding the origin of the query.

* debug_query_string also shows up in the pg_stat_activity view. In the
master, you'd have it in any case, even if we omit it in the segments.
But it'd be nice to have it available in the segments, too, if you
connect in utility mode to a segment.

* It also appears in the logs, with log_statement. Usually you'd only
set it on the master, but again it would be nice if it worked on the
segments, too.

* I think it might also show up in the context information of some
errors. I'm not sure we ever emit such errors from segments; the errors
that typically show the position in the original query are syntax errors
and the like, which are caught in the QD already. But still.

I think truncating the string to e.g. 10 kB is reasonable.

- Heikki

Ivan Novick

Oct 10, 2018, 2:10:48 AM
to Heikki Linnakangas, Ning Yu, Simon Gao, Greenplum Developers
On Wed, Oct 10, 2018 at 6:27 AM Heikki Linnakangas <hlinna...@pivotal.io> wrote:
On 10/10/2018 07:28, Ning Yu wrote:

Lots of good discussion. I couldn't decide which parts to reply to, so I
just collected my thoughts as a list:

* The elephant in the room is that a user's CREATE TABLE command can be
expanded to thousands of commands between the QD and QEs, if the table
is partitioned. If we could refactor that so that we'd dispatch all the
sub-commands as one batch, that would be nice. Aside from the
debug_query_string effect, it would reduce the number of round-trips.

That sounds interesting, with potential for great gains.
 
* I object to removing debug_query_string completely, and to having a
GUC for it. debug_query_string is a very valuable debugging aid. When I
attach a debugger to a core dump or a hung backend, that's the first
thing I look at. If we have a GUC for it, it's in practice never going
to be there when you need it the most.

Whatever method helps you guys to debug, I am fine with it, as long as we can also get the primary performance gain on a system with default settings.  In other words, as long as having the debugging on by default carries a reasonable performance penalty (a few percent, not a few hundred percent like it is now for these use cases).
 
Cheers,
Ivan

Ning Yu

Oct 10, 2018, 2:29:37 AM
to Ivan Novick, Heikki Linnakangas, Simon Gao, Greenplum Developers
Now it looks like we have reached a decision to dispatch a truncated version of debug_query_string.  I will work out a PR for it.

Ning Yu

Oct 10, 2018, 3:09:24 AM
to Ivan Novick, Heikki Linnakangas, Simon Gao, Greenplum Developers

Please help review it, thanks in advance.

Oak Barrett

Oct 10, 2018, 10:44:30 AM
to n...@pivotal.io, Ivan Novick, Heikki Linnakangas, Simon Gao, gpdb...@greenplum.org
Appreciate all the effort...  
This is a big pain point for customers restoring tables with many partitions.  
oak



Oak Barrett

Oct 11, 2018, 4:22:27 PM
to n...@pivotal.io, Ivan Novick, Heikki Linnakangas, Simon Gao, gpdb...@greenplum.org
This will tremendously decrease the time it takes for users to restore databases with partitioned tables.

My quick test: 

Test Case:
------------------------------
- 2 Host Virtual cluster / 4 segments per host
- 1 partitioned table with 1461 child partitions
- original CREATE TABLE statement: 139 lines & 12 kB file size
- gpbackup generated CREATE TABLE statement: 191,525 lines & 18 MB file size

on 5.11.0:
--------------------------------
original CREATE: 66.37 seconds
gpbackup CREATE: 423.98 seconds

on 5.11.1+dev.51.g23ae137e54:
--------------------------------
original CREATE: 77.16 seconds
gpbackup CREATE: 78.52 seconds


Thanks,
oak



Ashwin Agrawal

Oct 11, 2018, 8:20:29 PM
to Oak Barrett, Ning Yu, Ivan Novick, Heikki Linnakangas, Simon Gao, Greenplum Developers

On Thu, Oct 11, 2018 at 1:22 PM Oak Barrett <obar...@pivotal.io> wrote:
This will tremendously decrease the time it takes for users to restore databases with partitioned tables.

My quick test: 

Test Case:
------------------------------
- 2 Host Virtual cluster / 4 segments per host
- 1 partitioned table with 1461 child partitions
- original CREATE TABLE statement: 139 lines & 12 kB file size
- gpbackup generated CREATE TABLE statement: 191,525 lines & 18 MB file size

on 5.11.0:
--------------------------------
original CREATE: 66.37 seconds
gpbackup CREATE: 423.98 seconds

on 5.11.1+dev.51.g23ae137e54:
--------------------------------
original CREATE: 77.16 seconds
gpbackup CREATE: 78.52 seconds

Nice!!!

Surprised we didn't realize and evaluate earlier what was causing the slow-down, or that difference. Not sure why the original CREATE TABLE on 5.11.1+dev didn't perform at the same speed as, or even better than, 5.11.0.

Ning Yu

Oct 11, 2018, 8:33:32 PM
to Oak Barrett, Ivan Novick, Heikki Linnakangas, Simon Gao, Greenplum Developers
On Fri, Oct 12, 2018 at 4:22 AM, Oak Barrett <obar...@pivotal.io> wrote:
This will tremendously decrease the time it takes for users to restore databases with partitioned tables.

My quick test: 

Test Case:
------------------------------
- 2 Host Virtual cluster / 4 segments per host
- 1 partitioned table with 1461 child partitions
- original CREATE TABLE statement: 139 lines & 12 kB file size
- gpbackup generated CREATE TABLE statement: 191,525 lines & 18 MB file size

on 5.11.0:
--------------------------------
original CREATE: 66.37 seconds
gpbackup CREATE: 423.98 seconds

on 5.11.1+dev.51.g23ae137e54:
--------------------------------
original CREATE: 77.16 seconds
gpbackup CREATE: 78.52 seconds

Cool, thanks for the tests!

Ning Yu

Oct 11, 2018, 8:47:19 PM
to Ashwin Agrawal, Oak Barrett, Ivan Novick, Heikki Linnakangas, Simon Gao, Greenplum Developers
I'm also curious about the performance change for the original command.  If there really is a performance regression caused by some commits, maybe we could run a git bisect to locate them.  I would like to work on this next week.

Ivan Novick

Oct 12, 2018, 10:14:09 AM
to Ashwin Agrawal, Oak Barrett, Ning Yu, Heikki Linnakangas, Simon Gao, Greenplum Developers
On Fri, Oct 12, 2018 at 2:20 AM Ashwin Agrawal <aagr...@pivotal.io> wrote:
On Thu, Oct 11, 2018 at 1:22 PM Oak Barrett <obar...@pivotal.io> wrote:
This will tremendously decrease the time it takes for user to restore their databases with partitioned tables. 

Nice!!!

Surprised we didn't realize and evaluate earlier what was causing the slow-down, or that difference. Not sure why the original CREATE TABLE on 5.11.1+dev didn't perform at the same speed as, or even better than, 5.11.0.

Let's get some more slow workloads and profile them! ;)
