[PATCH 0/5] deflater: tiny optimizations + 1 new feature

37 views
Skip to first unread message

Eric Wong

unread,
Jun 28, 2017, 10:19:17 PM6/28/17
to rack-...@googlegroups.com
I mainly wanted to to implement "sync: false" [PATCH 3/5] to
avoid flushing the buffer for every chunk. This would allow my
applications/middleware to work with smaller chunks of memory
internally, yet still achieve good compression.

The default flush behavior remains unchanged, of course, since
I'd rather be inefficient than break existing apps.

Of course, I saw a few other small things along the way.

1/5 is a hopefully obvious cleanup
2/5 makes me sleep better at night

3/5 see above

4/5 test only, split from 5/5 in case 5/5 is rejected
5/5 War on Time, part 1 :)

The following changes since commit 0362a54dba92626582d42f3343c209b7cdb7e713:

Partially reverting 8a7a142d (2017-06-28 13:20:52 -0700)

are available in the git repository at:

git://80x24.org/rack deflater

for you to fetch changes up to 69a2a195d749fdc2c04451688f3569bd5ce24c73:

deflater: reduce live Time objects (2017-06-29 02:04:50 +0000)

----------------------------------------------------------------
Eric Wong (5):
deflater: rely on frozen_string_literal
deflater: avoid wasting an ivar slot on @closed
deflater: support "sync: false" option
deflater: additional mtime tests
deflater: reduce live Time objects

lib/rack/deflater.rb | 30 +++++++++++++++++-------------
test/spec_deflater.rb | 45 ++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 61 insertions(+), 14 deletions(-)

--
EW

Eric Wong

unread,
Jun 28, 2017, 10:19:18 PM6/28/17
to rack-...@googlegroups.com
This improves readability of our code and can allow us to
generate less garbage in Ruby 2.3+ for places where we didn't
already use constants. We can also avoid the old constant
lookups (and associated inline cache overhead) this way.
---
lib/rack/deflater.rb | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/rack/deflater.rb b/lib/rack/deflater.rb
index 46d5b20..9b798ba 100644
--- a/lib/rack/deflater.rb
+++ b/lib/rack/deflater.rb
@@ -1,3 +1,4 @@
+# frozen_string_literal: true
require "zlib"
require "time" # for Time.httpdate
require 'rack/utils'
@@ -52,7 +53,7 @@ module Rack
case encoding
when "gzip"
headers['Content-Encoding'] = "gzip"
- headers.delete(CONTENT_LENGTH)
+ headers.delete('Content-Length')
mtime = headers.key?("Last-Modified") ?
Time.httpdate(headers["Last-Modified"]) : Time.now
[status, headers, GzipStream.new(body, mtime)]
@@ -61,7 +62,7 @@ module Rack
when nil
message = "An acceptable encoding for the requested resource #{request.fullpath} could not be found."
bp = Rack::BodyProxy.new([message]) { body.close if body.respond_to?(:close) }
- [406, {CONTENT_TYPE => "text/plain", CONTENT_LENGTH => message.length.to_s}, bp]
+ [406, {'Content-Type' => "text/plain", 'Content-Length' => message.length.to_s}, bp]
end
end

@@ -102,13 +103,13 @@ module Rack
# Skip compressing empty entity body responses and responses with
# no-transform set.
if Utils::STATUS_WITH_NO_ENTITY_BODY.include?(status) ||
- headers[CACHE_CONTROL].to_s =~ /\bno-transform\b/ ||
+ headers['Cache-Control'].to_s =~ /\bno-transform\b/ ||
(headers['Content-Encoding'] && headers['Content-Encoding'] !~ /\bidentity\b/)
return false
end

# Skip if @compressible_types are given and does not include request's content type
- return false if @compressible_types && !(headers.has_key?(CONTENT_TYPE) && @compressible_types.include?(headers[CONTENT_TYPE][/[^;]*/]))
+ return false if @compressible_types && !(headers.has_key?('Content-Type') && @compressible_types.include?(headers['Content-Type'][/[^;]*/]))

# Skip if @condition lambda is given and evaluates to false
return false if @condition && !@condition.call(env, status, headers, body)
--
EW

Eric Wong

unread,
Jun 28, 2017, 10:19:20 PM6/28/17
to rack-...@googlegroups.com
We can merely set the @body to nil ensure we do not call close
on the @body, twice. Saving an ivar slot can save 8 bytes
per object at minimum, and this makes me feel more comfortable
about using another ivar for the next commit.
---
lib/rack/deflater.rb | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/lib/rack/deflater.rb b/lib/rack/deflater.rb
index 9b798ba..d575adf 100644
--- a/lib/rack/deflater.rb
+++ b/lib/rack/deflater.rb
@@ -70,7 +70,6 @@ module Rack
def initialize(body, mtime)
@body = body
@mtime = mtime
- @closed = false
end

def each(&block)
@@ -91,9 +90,8 @@ module Rack
end

def close
- return if @closed
- @closed = true
@body.close if @body.respond_to?(:close)
+ @body = nil
end
end

--
EW

Eric Wong

unread,
Jun 28, 2017, 10:19:23 PM6/28/17
to rack-...@googlegroups.com
Flushing after after very flush is great for real-time apps.
However, flushing is inefficient when apps use Rack::Response
to generate many small writes (e.g. Rack::Lobster).

Allow users to disable the default "sync: true" behavior to
reduce bandwidth usage, otherwise using Rack::Deflater can lead
to using more bandwidth than without it.
---
lib/rack/deflater.rb | 11 ++++++++---
test/spec_deflater.rb | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/lib/rack/deflater.rb b/lib/rack/deflater.rb
index d575adf..821f708 100644
--- a/lib/rack/deflater.rb
+++ b/lib/rack/deflater.rb
@@ -24,11 +24,15 @@ module Rack
# 'if' - a lambda enabling / disabling deflation based on returned boolean value
# e.g use Rack::Deflater, :if => lambda { |env, status, headers, body| body.map(&:bytesize).reduce(0, :+) > 512 }
# 'include' - a list of content types that should be compressed
+ # 'sync' - Flushing after every chunk reduces latency for
+ # time-sensitive streaming applications, but hurts
+ # compression and throughput. Defaults to `true'.
def initialize(app, options = {})
@app = app

@condition = options[:if]
@compressible_types = options[:include]
+ @sync = options[:sync] == false ? false : true
end

def call(env)
@@ -56,7 +60,7 @@ module Rack
headers.delete('Content-Length')
mtime = headers.key?("Last-Modified") ?
Time.httpdate(headers["Last-Modified"]) : Time.now
- [status, headers, GzipStream.new(body, mtime)]
+ [status, headers, GzipStream.new(body, mtime, @sync)]
when "identity"
[status, headers, body]
when nil
@@ -67,7 +71,8 @@ module Rack
end

class GzipStream
- def initialize(body, mtime)
+ def initialize(body, mtime, sync)
+ @sync = sync
@body = body
@mtime = mtime
end
@@ -78,7 +83,7 @@ module Rack
gzip.mtime = @mtime
@body.each { |part|
gzip.write(part)
- gzip.flush
+ gzip.flush if @sync
}
ensure
gzip.close
diff --git a/test/spec_deflater.rb b/test/spec_deflater.rb
index 0f27c85..410a143 100644
--- a/test/spec_deflater.rb
+++ b/test/spec_deflater.rb
@@ -372,4 +372,38 @@ describe Rack::Deflater do

verify(200, response, 'gzip', options)
end
+
+ it 'will honor sync: false to avoid unnecessary flushing' do
+ app_body = Object.new
+ class << app_body
+ def each
+ (0..20).each { |i| yield "hello\n".freeze }
+ end
+ end
+
+ options = {
+ 'deflater_options' => { :sync => false },
+ 'app_body' => app_body,
+ 'skip_body_verify' => true,
+ }
+ verify(200, app_body, deflate_or_gzip, options) do |status, headers, body|
+ headers.must_equal({
+ 'Content-Encoding' => 'gzip',
+ 'Vary' => 'Accept-Encoding',
+ 'Content-Type' => 'text/plain'
+ })
+
+ buf = ''
+ raw_bytes = 0
+ inflater = auto_inflater
+ body.each do |part|
+ raw_bytes += part.bytesize
+ buf << inflater.inflate(part)
+ end
+ buf << inflater.finish
+ expect = "hello\n" * 21
+ buf.must_equal expect
+ raw_bytes.must_be(:<, expect.bytesize)
+ end
+ end
end
--
EW

Eric Wong

unread,
Jun 28, 2017, 10:19:25 PM6/28/17
to rack-...@googlegroups.com
The next commit will reduce long-lived Time objects. Regardless
of whether that commit is acceptable, having extra tests for
existing mtime behavior cannot hurt.

For testing responses with the Last-Modified header, setting a
random date in the past should make failure to preserve mtime
in the gzip header more apparent.
---
test/spec_deflater.rb | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/test/spec_deflater.rb b/test/spec_deflater.rb
index 410a143..4a337ca 100644
--- a/test/spec_deflater.rb
+++ b/test/spec_deflater.rb
@@ -44,6 +44,8 @@ describe Rack::Deflater do
[accept_encoding, accept_encoding.dup]
end

+ start = Time.now.to_i
+
# build response
status, headers, body = build_response(
options['app_status'] || expected_status,
@@ -67,6 +69,13 @@ describe Rack::Deflater do
when 'gzip'
io = StringIO.new(body_text)
gz = Zlib::GzipReader.new(io)
+ mtime = gz.mtime.to_i
+ if last_mod = headers['Last-Modified']
+ Time.httpdate(last_mod).to_i.must_equal mtime
+ else
+ mtime.must_be(:<=, Time.now.to_i)
+ mtime.must_be(:>=, start.to_i)
+ end
tmp = gz.read
gz.close
tmp
@@ -243,7 +252,7 @@ describe Rack::Deflater do
end

it 'handle gzip response with Last-Modified header' do
- last_modified = Time.now.httpdate
+ last_modified = Time.at(123).httpdate
options = {
'response_headers' => {
'Content-Type' => 'text/plain',
--
EW

Eric Wong

unread,
Jun 28, 2017, 10:19:28 PM6/28/17
to rack-...@googlegroups.com
Only create a Time object if the Last-Modified header exists,
(immediately use the Integer result of Time#to_). Otherwise, we
can rely on the default behavior of Zlib::GzipWriter of setting
the mtime to the current time.

While we're at it, avoid repeating the hash lookup for
Last-Modified.

This results in an improvement from 11.0k to 11.4k iterations/sec
with the following script:

require 'benchmark/ips'
require 'rack/mock'
req = Rack::MockRequest.env_for('', 'HTTP_ACCEPT_ENCODING' => 'gzip')
response = [200, {}, []]
deflater = Rack::Deflater.new(lambda { |env| response })
Benchmark.ips do |x|
x.report('deflater') do
s, h, b = deflater.call(req)
b.each { |buf| }
b.close
end
end
---
lib/rack/deflater.rb | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/rack/deflater.rb b/lib/rack/deflater.rb
index 821f708..0d0d021 100644
--- a/lib/rack/deflater.rb
+++ b/lib/rack/deflater.rb
@@ -58,8 +58,8 @@ module Rack
when "gzip"
headers['Content-Encoding'] = "gzip"
headers.delete('Content-Length')
- mtime = headers.key?("Last-Modified") ?
- Time.httpdate(headers["Last-Modified"]) : Time.now
+ mtime = headers["Last-Modified"]
+ mtime = Time.httpdate(mtime).to_i if mtime
[status, headers, GzipStream.new(body, mtime, @sync)]
when "identity"
[status, headers, body]
@@ -80,7 +80,7 @@ module Rack
def each(&block)
@writer = block
gzip =::Zlib::GzipWriter.new(self)
- gzip.mtime = @mtime
+ gzip.mtime = @mtime if @mtime
@body.each { |part|
gzip.write(part)
gzip.flush if @sync
--
EW

Eric Wong

unread,
Jul 23, 2017, 9:34:06 PM7/23/17
to rack-...@googlegroups.com
Ping on this series, I might be working on a Rack app which will
depend on "sync: false" for best results

* [PATCH 0/5] deflater: tiny optimizations + 1 new feature
https://public-inbox.org/rack-devel/201706290219...@80x24.org/


And also a minor common_logger tweak:

* [PATCH] common_logger: rely on monotonic clock
https://public-inbox.org/rack-devel/20170629023056.GA22961@untitled/


Thanks.

Rafael Mendonça França

unread,
Jul 24, 2017, 10:43:35 PM7/24/17
to Rack Development, e...@80x24.org
Thank you Eric. Both series of patches were applied on master on b76ca575c5dfc08366c26f7cb577fdc14d879f8a.


On Sunday, July 23, 2017 at 9:34:06 PM UTC-4, Eric Wong wrote:
Ping on this series, I might be working on a Rack app which will
depend on "sync: false" for best results

* [PATCH 0/5] deflater: tiny optimizations + 1 new feature
Reply all
Reply to author
Forward
0 new messages