Jira (PDB-5645) Abandon queries when client disconnects

3 views
Skip to first unread message

Rob Browning (Jira)

unread,
May 17, 2023, 4:43:03 PM5/17/23
to puppe...@googlegroups.com
Rob Browning created an issue
 
PuppetDB / Task PDB-5645
Abandon queries when client disconnects
Issue Type: Task Task
Assignee: Unassigned
Components: PuppetDB
Created: 2023/05/17 1:42 PM
Priority: Normal Normal
Reporter: Rob Browning
Add Comment Add Comment
 
This message was sent by Atlassian Jira (v8.20.11#820011-sha1:0629dd8)
Atlassian logo

Rob Browning (Jira)

unread,
May 17, 2023, 4:44:03 PM5/17/23
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Testing revealed that pdb streaming queries definitely do continue running for a while after client disconnection.  Long after the client disconnected, jetty would continue to call the {{query-eng/generated-stream}} read to stream the encoded json bytes which in turn keep the database query running.  The current assumption is that the reads are caused by buffering before the next network write.  A stack trace from the read reveals that there's a gzip handler, but since (at least from a cursory investigation) that handler appears to rely on a Deflate instance with a 512 byte buffer (fixed?), it's probably not the key reason for the continuing reads.

In any case, after further investigation, we didn't find a straightforward way to detect a client disconnection in the ring or tk-jetty9 world, but since the jetty/servlet response handler is given a {{ServletResponse}} object, which for jetty is an {{org.eclipse.jetty.server.Response}}, I wondered if that might provide access to the client connection.  It does.

Jetty (9 at least) handles the connection via a chain of "interceptors" attached to the {{HttpOutput}}, which is itself attached to the {{Response}}.  For a trivial jetty app there's just one interceptor, while pdb includes others (the gzip "layer" at least).  The jetty {{HttpOutput}} Interceptor comments say that the last link in the chain will be an HttpChannel, and for pdb, that's currently an {{HttpChannelOverHttp}} instance, and that allows direct access to the jdk {{SocketChannel}} connected to the client.

Fortunately the client channel is also in non-blocking mode, and so even though the client isn't (and shouldn't be) sending any data, an attempt to read from the client channel will return 0 as long as the client is connected and -1 after it disconnects (verified via curl against both the simple test app and pdb itself).  It's then straightforward to add a check in pdb's {{query-eng/body-stream}}'s {{stream-rows}} that will cause pdb to abandon the query whenever the read indicates that the client is gone.

This approach relies on some assumptions:

  - The jetty transport is non-blocking.

  - The read from the client always returns -1 after a disconnect.

  - By the time pdb is streaming, it's acceptable to attempt to read bytes from the client, i.e. either the client won't be sending any bytes, or it's OK to discard them.

It would also require a change to trapperkeeper-webserver-jetty9, because ring handlers (both in ring itself and in tk-jetty9) don't provide the response instance to the handler, just the request map.  One option would be to add an {{:include-response?}} option to the {{add-ring-handler}} service method.  When true the handler call would become {{(handle request-map response)}} instead of {{(handle request-map)}}.

Finally, here are a couple of draft prs (with no tests) that sketch all this out, though the pdb-side relies on a {{\*pdb-response\*}} binding that presumably wouldn't be included in the end.  I just used it to avoid having to rework pdb's handler(s) for the tests:

  https://github.com/puppetlabs/trapperkeeper-webserver-jetty9/pull/250
  https://github.com/puppetlabs/puppetdb/pull/3813
  

Steve Axthelm (Jira)

unread,
May 31, 2023, 1:40:01 PM5/31/23
to puppe...@googlegroups.com
Steve Axthelm commented on Task PDB-5645
 
Re: Abandon queries when client disconnects

Rob Browning, just saw your PR against tk-j9. Not sure if you saw, but there's work in flight around moving to Jetty 10. What stream would these PDB changes be targeted at?

This message was sent by Atlassian Jira (v8.20.21#820021-sha1:38274c8)
Atlassian logo

Rob Browning (Jira)

unread,
May 31, 2023, 2:09:01 PM5/31/23
to puppe...@googlegroups.com
Rob Browning assigned an issue to Rob Browning
 
Change By: Rob Browning
Assignee: Rob Browning

Rob Browning (Jira)

unread,
May 31, 2023, 2:09:02 PM5/31/23
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Sprint: Skeletor 06/07/2023
Issue Type: Task Improvement

Cas Donoghue (Jira)

unread,
Jun 8, 2023, 2:10:01 PM6/8/23
to puppe...@googlegroups.com
Cas Donoghue updated an issue
Change By: Cas Donoghue
Sprint: Skeletor 06/07/2023 , Skeletor 06/22/2023

Cas Donoghue (Jira)

unread,
Jun 21, 2023, 2:11:02 PM6/21/23
to puppe...@googlegroups.com
Cas Donoghue updated an issue
Change By: Cas Donoghue
Sprint: Skeletor 06/07/2023, Skeletor 06/22/2023 , Skeletor 07/05/2023
Reply all
Reply to author
Forward
0 new messages