code review 8179043: net/textproto: reduce allocations in ReadMIMEHeader (issue 8179043)

203 views
Skip to first unread message

brad...@golang.org

unread,
Mar 29, 2013, 5:58:38 PM3/29/13
to golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
Reviewers: golang-dev1,

Message:
Hello golan...@googlegroups.com,

I'd like you to review this change to
https://go.googlecode.com/hg/


Description:
net/textproto: reduce allocations in ReadMIMEHeader

ReadMIMEHeader is used by net/http, net/mail, and
mime/multipart.

Don't do so many small allocations. Calculate up front
how much we'll probably need.

benchmark old ns/op new ns/op delta
BenchmarkReadMIMEHeader 9469 8249 -12.88%

benchmark old allocs new allocs delta
BenchmarkReadMIMEHeader 23 14 -39.13%

benchmark old bytes new bytes delta
BenchmarkReadMIMEHeader 1708 1344 -21.31%

Please review this at https://codereview.appspot.com/8179043/

Affected files:
M src/pkg/net/textproto/reader.go
M src/pkg/net/textproto/reader_test.go


Index: src/pkg/net/textproto/reader.go
===================================================================
--- a/src/pkg/net/textproto/reader.go
+++ b/src/pkg/net/textproto/reader.go
@@ -11,6 +11,7 @@
"io/ioutil"
"strconv"
"strings"
+ "unsafe"
)

// BUG(rsc): To let callers manage exposure to denial of service
@@ -456,7 +457,32 @@
// }
//
func (r *Reader) ReadMIMEHeader() (MIMEHeader, error) {
- m := make(MIMEHeader, 4)
+ // Avoid lots of small slice allocations later by allocating
+ // one large one ahead of time. If this isn't big enough
+ // later, we allocate small ones.
+ var strs []string
+ var hint int // guess of number of keys in map
+
+ // Try to determine the 'hint' size.
+ r.R.Peek(1) // force a buffer load if empty
+ if s := r.R.Buffered(); s > 0 {
+ peek, _ := r.R.Peek(s)
+ for len(peek) > 0 {
+ i := bytes.IndexByte(peek, '\n')
+ if i < 3 {
+ // Not present (-1) or found within the next few bytes,
+ // implying we're at the end ("\r\n\r\n" or "\n\n")
+ break
+ }
+ hint++
+ peek = peek[i+1:]
+ }
+ if hint > 0 {
+ strs = make([]string, hint)
+ }
+ }
+
+ m := make(MIMEHeader, hint)
for {
kv, err := r.readContinuedLineSlice()
if len(kv) == 0 {
@@ -483,7 +509,16 @@
}
value := string(kv[i:])

- m[key] = append(m[key], value)
+ vv := m[key]
+ if vv == nil && len(strs) > 0 {
+ // More than likely this will be a single-element key.
+ // Most headers aren't multi-valued.
+ vv, strs = strs[:1], strs[1:]
+ vv[0] = value
+ m[key] = cappedStringSlice(vv)
+ } else {
+ m[key] = append(vv, value)
+ }

if err != nil {
return m, err
@@ -491,6 +526,18 @@
}
}

+func cappedStringSlice(s []string) []string {
+ // header is a copy of reflect.SliceHeader
+ type header struct {
+ Data uintptr
+ Len int
+ Cap int
+ }
+ sh := (*header)(unsafe.Pointer(&s))
+ sh.Cap = sh.Len
+ return s
+}
+
// CanonicalMIMEHeaderKey returns the canonical format of the
// MIME header key s. The canonicalization converts the first
// letter and any letter following a hyphen to upper case;
Index: src/pkg/net/textproto/reader_test.go
===================================================================
--- a/src/pkg/net/textproto/reader_test.go
+++ b/src/pkg/net/textproto/reader_test.go
@@ -264,6 +264,13 @@
}
}

+func TestCappedStringSlice(t *testing.T) {
+ // verify our slice header copy doesn't get out of sync with
reflect/runtime.
+ if c := cappedStringSlice([]string{"foo"}); len(c) != 1 || cap(c) != 1 ||
c[0] != "foo" {
+ t.Fatal()
+ }
+}
+
var clientHeaders = strings.Replace(`Host: golang.org
Connection: keep-alive
Cache-Control: max-age=0
@@ -290,6 +297,7 @@
`, "\n", "\r\n", -1)

func BenchmarkReadMIMEHeader(b *testing.B) {
+ b.ReportAllocs()
var buf bytes.Buffer
br := bufio.NewReader(&buf)
r := NewReader(br)


r...@golang.org

unread,
Mar 29, 2013, 6:18:05 PM3/29/13
to brad...@golang.org, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
this sets a precedent i'd rather leave unset.


https://codereview.appspot.com/8179043/diff/4001/src/pkg/net/textproto/reader.go
File src/pkg/net/textproto/reader.go (right):

https://codereview.appspot.com/8179043/diff/4001/src/pkg/net/textproto/reader.go#newcode14
src/pkg/net/textproto/reader.go:14: "unsafe"
if you do this it's a problem on app engine, isn't it? i'd much prefer
not to use unsafe for speed like this. if you absolutely insist, put
the unsafe piece in a separate build-tagged file.

https://codereview.appspot.com/8179043/

Brad Fitzpatrick

unread,
Mar 29, 2013, 6:25:17 PM3/29/13
to brad...@golang.org, golan...@googlegroups.com, r...@golang.org, re...@codereview-hr.appspotmail.com
App Engine trusts the standard library.

Do you still want it in a separate file?

I'm of the mindset that,
-- Go is a language about giving you control over memory.
-- The standard library is allowed to be fast, so others can use it and assume it's fast and not have to play their own games.
-- I'd rather we play games than everybody else play games (or, say, have the net/http package provide a hook for users to supply their own alternate reader functions. See also: the rejected CL for crypto sha1/rc4 alternate implementation hooks)
-- If we don't do stuff like this, we don't see the pain points of the language (e.g. Issue 1642) and we'll forget to fix it later.

This wouldn't even be objectionable if the language let you control the capacity of a slice (Issue 1642), of which only the syntax is really contentious.

I can

a) abandon this
b) move unsafe bits to a separate file (tagged what? appengine? unnecessary)
c) submit as-is.

?

Brad Fitzpatrick

unread,
Mar 29, 2013, 6:42:45 PM3/29/13
to brad...@golang.org, golan...@googlegroups.com, r...@golang.org, re...@codereview-hr.appspotmail.com
Stand by.

Moving it to a separate file.

brad...@golang.org

unread,
Mar 29, 2013, 6:54:28 PM3/29/13
to golan...@googlegroups.com, r...@golang.org, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com

Brad Fitzpatrick

unread,
Mar 29, 2013, 6:55:34 PM3/29/13
to golan...@googlegroups.com, r...@golang.org, re...@codereview-hr.appspotmail.com
PTAL

Now it'll involve less paperwork.

r...@golang.org

unread,
Mar 29, 2013, 7:07:24 PM3/29/13
to brad...@golang.org, golan...@googlegroups.com, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
I believe encoding/gob is the only standard package that uses unsafe to
speed up allocations, and I wish I hadn't stepped into that water. I do
not sanction doing this in the standard library.

Please abandon this CL or find a way to improve the allocation strategy
without putting details about the runtime's internal data structures
into a MIME parser (!).

https://codereview.appspot.com/8179043/

Brad Fitzpatrick

unread,
Mar 29, 2013, 7:12:56 PM3/29/13
to brad...@golang.org, golan...@googlegroups.com, r...@golang.org, re...@codereview-hr.appspotmail.com
Maybe I can use gob instead.

brad...@golang.org

unread,
Mar 29, 2013, 7:16:13 PM3/29/13
to golan...@googlegroups.com, r...@golang.org, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com

ia...@golang.org

unread,
Mar 29, 2013, 7:21:27 PM3/29/13
to brad...@golang.org, golan...@googlegroups.com, r...@golang.org, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
Let's not work around a language/runtime problem by importing unsafe
(unless we're desperate).


https://codereview.appspot.com/8179043/

Brad Fitzpatrick

unread,
Mar 29, 2013, 7:31:31 PM3/29/13
to brad...@golang.org, golan...@googlegroups.com, r...@golang.org, ia...@golang.org, re...@codereview-hr.appspotmail.com
I moved my whining to Issue 1642.

I was hoping this CL would be more acceptable than my previous one in that it added no new public API and we could just change the one unsafe function to use slice = slice[off:len:cap] or something later, if that ever happens.
Reply all
Reply to author
Forward
0 new messages