Ann: pgx v2 - extremely high performance PostgreSQL driver in pure Go

Jack Christensen

Jul 15, 2014, 10:28:00 AM
to golang-nuts
pgx v2 is a pure Go PostgreSQL adapter that focuses on performance and
PostgreSQL specific features. pgx supports the standard database/sql interface
and its own native interface. The native interface offers more performance and
PostgreSQL specific features.

pgx is usually 10% to 20% faster than pq when using database/sql. When using
pgx's native interface it can be close to twice as fast as pq, though around
50% faster is more typical. Below are sample benchmarks for selecting a single
row with 8 columns (3 int4, 3 varchar, 1 date, 1 timestamptz):

BenchmarkPgxNativeSelectSingleRowPrepared   500000       31148 ns/op       217 B/op        5 allocs/op
BenchmarkPgxStdlibSelectSingleRowPrepared   200000       39617 ns/op      1056 B/op       26 allocs/op
BenchmarkPqSelectSingleRowPrepared          200000       47033 ns/op      1374 B/op       37 allocs/op

Not only does pgx execute substantially faster, it also uses significantly
less memory and performs fewer allocations.

Here are some of the features of pgx:

* Native interface for greater performance, designed to closely resemble database/sql for ease of use
* Transaction isolation level control
* Listen / notify
* Complete control over TLS
* Custom type support including binary encodings for maximum performance
* Leveled logging support
* Connection pool with AfterConnect callback for setting connection settings or preparing statements
* Access to native pgx connection through database/sql to optimize hot paths for pgx while retaining database/sql compatibility
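
To make that concrete, here is a minimal sketch of the native interface
(connection settings and the example table are placeholders; see the README
for the authoritative API):

package main

import (
	"fmt"
	"log"

	"github.com/jackc/pgx"
)

func main() {
	// Placeholder connection settings; adjust for your environment.
	conn, err := pgx.Connect(pgx.ConnConfig{Host: "localhost", User: "pgx_test", Database: "pgx_test"})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	var name string
	var weight int64
	// QueryRow/Scan mirror database/sql, but decode PostgreSQL's wire
	// format directly into the destination variables.
	err = conn.QueryRow("select name, weight from widgets where id = $1", 42).Scan(&name, &weight)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(name, weight)
}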


Check it out at: https://github.com/jackc/pgx

You can get the benchmark suite at:
https://github.com/jackc/go_db_bench

Jack

Rodrigo Kochenburger

Jul 15, 2014, 12:27:32 PM
to Jack Christensen, golang-nuts
Nice work. Can you explain why the database/sql interface makes it slower, and whether there is any work that can be done to make it faster (both in Go's stdlib and in your package)?

- RK



Nicolas Hillegeer

Jul 15, 2014, 5:19:37 PM
to golan...@googlegroups.com
If these results hold up, all I can say is good job!

Have you compared it to https://github.com/go-pg/pg? It doesn't follow database/sql either (it uses bits and pieces) and
is also rumored to perform better than lib/pq.

If you haven't, would you mind including it in your benchmark?

Taru Karttunen

Jul 16, 2014, 3:43:56 AM
to Jack Christensen, golang-nuts
On 15.07 09:27, Jack Christensen wrote:
> pgx v2 is a pure Go PostgreSQL adapter that focuses on performance and
> PostgreSQL specific features. pgx supports the standard database/sql
> interface and its own native interface. The native interface offers more
> performance and PostgreSQL specific features.

This looks very nice. Do you have plans to support fancy PostgreSQL
types like arrays and jsonb?

- Taru Karttunen

Jack Christensen

Jul 16, 2014, 8:35:52 AM
to Nicolas Hillegeer, golan...@googlegroups.com


Nicolas Hillegeer wrote:
If these results hold up, all I can say is good job!

Have you compared it to https://github.com/go-pg/pg? It doesn't follow database/sql either (it uses bits and pieces) and
is also rumored to perform better than lib/pq.

If you haven't, would you mind including it in your benchmark?
Done.

In general it is a bit slower than pgx and a bit faster than pq.

You can see the full results here: https://gist.github.com/jackc/c402b42244d3390f26c6

Jack

Jack Christensen

Jul 16, 2014, 8:37:59 AM
to Taru Karttunen, golang-nuts
Yes, but maybe as extensions or examples rather than part of the core package. The reason is that there are multiple ways to map those types to Go types, and I'm not sure one size will fit all.
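
As a sketch of what such an extension might look like (nothing like this is
in the package yet; the table and column names are made up), a json value
could be read as text and decoded with encoding/json:

package main

import (
	"encoding/json"

	"github.com/jackc/pgx"
)

// decodePayload reads a json column as text and unmarshals it into a map.
// "events" and "payload" are illustrative names.
func decodePayload(conn *pgx.Conn, id int32) (map[string]interface{}, error) {
	var raw string
	if err := conn.QueryRow("select payload from events where id = $1", id).Scan(&raw); err != nil {
		return nil, err
	}
	var payload map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &payload); err != nil {
		return nil, err
	}
	return payload, nil
}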

Jack


Jack Christensen

Jul 16, 2014, 8:42:48 AM
to Rodrigo Kochenburger, golang-nuts
Rodrigo Kochenburger wrote:
Nice work. Can you explain why the database/sql interface makes it slower, and whether there is any work that can be done to make it faster (both in Go's stdlib and in your package)?
I'm not entirely sure. A lot of the performance optimization I did in pgx native was removing extra memory allocations. There are some copies and allocations going on between pgx native and the pgx stdlib adapter, and I'm pretty sure there is a little more that can be squeezed out of there.

As far as the standard library itself goes, there do seem to be a lot of memory allocations, so I strongly suspect there is room for more optimization there, but I haven't looked at it in any detail.
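
For reference, the B/op and allocs/op numbers come from Go's standard
benchmark machinery. A stripped-down version of the kind of benchmark
go_db_bench runs might look like this (the DSN, the table, and the assumption
that the stdlib adapter registers a "pgx" driver name are placeholders to
check against the README):

package bench

import (
	"database/sql"
	"testing"

	_ "github.com/jackc/pgx/stdlib" // assumed to register the "pgx" driver
)

func BenchmarkSelectSingleRowPrepared(b *testing.B) {
	// Placeholder DSN; point it at a local test database.
	db, err := sql.Open("pgx", "postgres://localhost/pgx_test")
	if err != nil {
		b.Fatal(err)
	}
	defer db.Close()

	stmt, err := db.Prepare("select id, name from widgets where id = $1")
	if err != nil {
		b.Fatal(err)
	}
	defer stmt.Close()

	b.ReportAllocs() // produces the B/op and allocs/op columns
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var id int32
		var name string
		if err := stmt.QueryRow(1).Scan(&id, &name); err != nil {
			b.Fatal(err)
		}
	}
}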

Jack

Harry B

Sep 27, 2014, 2:17:13 PM
to golan...@googlegroups.com, div...@gmail.com
Jack,

I tried pgx and its direct interface and got about a 20% improvement. My use case was 50k+ qps with a trivial query, where the overhead of the library is somewhat significant compared to the Postgres query run time (most likely the data is cached in memory). However, I have a few suggestions and some feedback.

1. The row.Scan() API ties the choice between []byte and string to the underlying Postgres types 'text' and 'bytea'. I don't think the standard sql API does this. See https://github.com/jackc/pgx/blob/master/query.go#L211

That is, I want to pass a *[]byte and avoid converting back and forth between string and []byte. I see the code reads a []byte and converts it to a string here: https://github.com/jackc/pgx/blob/master/msg_reader.go#L155 . I can get the API to call readByte() instead, but that requires the column to be bytea. See https://github.com/jackc/pgx/blob/master/values.go#L1006

Ideally, reading into a []byte shouldn't require the underlying column type to be bytea, which is rarely the case. Text and varchar are more common AFAIK.

2. It would be nice to avoid the allocation at https://github.com/jackc/pgx/blob/master/msg_reader.go#L194 if the caller could pass in an already-allocated buffer (from a sync.Pool, maybe). But I don't know Go well enough to be sure that is the right approach to reducing allocations.

Thank you again for making the library. Let me know your thoughts on these points.

--
Harry

Jack Christensen

Sep 27, 2014, 5:34:41 PM
to Harry B, golan...@googlegroups.com, div...@gmail.com


Harry B wrote:
Jack,

I tried pgx and its direct interface and got about a 20% improvement. My use case was 50k+ qps with a trivial query, where the overhead of the library is somewhat significant compared to the Postgres query run time (most likely the data is cached in memory). However, I have a few suggestions and some feedback.

1. The row.Scan() API ties the choice between []byte and string to the underlying Postgres types 'text' and 'bytea'. I don't think the standard sql API does this. See https://github.com/jackc/pgx/blob/master/query.go#L211

That is, I want to pass a *[]byte and avoid converting back and forth between string and []byte. I see the code reads a []byte and converts it to a string here: https://github.com/jackc/pgx/blob/master/msg_reader.go#L155 . I can get the API to call readByte() instead, but that requires the column to be bytea. See https://github.com/jackc/pgx/blob/master/values.go#L1006

Ideally, reading into a []byte shouldn't require the underlying column type to be bytea, which is rarely the case. Text and varchar are more common AFAIK.
Good call on this. Fixed in https://github.com/jackc/pgx/commit/4e51ff728f70d61cd17173b1206166ea192a4441. I got a small but measurable performance improvement in one of my apps by converting from string to []byte for JSON responses from PostgreSQL.
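
As a hedged sketch of the kind of hot path that benefits (the function and
names here are illustrative, not part of pgx):

package main

import (
	"net/http"

	"github.com/jackc/pgx"
)

// serveDoc writes a JSON document without the intermediate string.
// "documents" and "body" are illustrative names.
func serveDoc(conn *pgx.Conn, w http.ResponseWriter, id int32) error {
	var body []byte // scans directly from a text/json column after the fix
	if err := conn.QueryRow("select body from documents where id = $1", id).Scan(&body); err != nil {
		return err
	}
	w.Header().Set("Content-Type", "application/json")
	_, err := w.Write(body)
	return err
}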


2. It would be nice to avoid the allocation at https://github.com/jackc/pgx/blob/master/msg_reader.go#L194 if the caller could pass in an already-allocated buffer (from a sync.Pool, maybe). But I don't know Go well enough to be sure that is the right approach to reducing allocations.
This only gets called when Scan is passed a *[]byte argument, and it expects to return a new slice. I suppose we could do something like this:

type PreallocatedBytes []byte

buf := make([]byte, 1024)

conn.QueryRow("...").Scan((*PreallocatedBytes)(&buf))

Scan would treat *PreallocatedBytes differently than *[]byte and would copy data into it instead of creating a new slice. It could raise an error if the scanned value is too big, or just allocate if necessary.
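
The caller side, paired with a sync.Pool as you suggested, might look
something like this (PreallocatedBytes is still hypothetical; nothing in pgx
implements it):

package main

import (
	"sync"

	"github.com/jackc/pgx"
)

// PreallocatedBytes is the hypothetical Scan destination discussed above.
type PreallocatedBytes []byte

var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, 1024) },
}

// readBody scans into a pooled buffer instead of allocating per row.
func readBody(conn *pgx.Conn, id int32) error {
	buf := bufPool.Get().([]byte)
	defer bufPool.Put(buf)

	// Under the proposal, Scan would copy into buf's backing array,
	// allocating only if the value doesn't fit.
	return conn.QueryRow("select body from documents where id = $1", id).Scan((*PreallocatedBytes)(&buf))
}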

I think this might work. Is this the type of thing you were thinking of?

Jack


Thank you again for making the library. Let me know your thoughts on these points.

--
Harry



Harry B

Sep 30, 2014, 4:28:39 PM
to golan...@googlegroups.com, harry...@gmail.com, div...@gmail.com


On Saturday, September 27, 2014 2:34:41 PM UTC-7, Jack Christensen wrote:
2. It would be nice to avoid the allocation at https://github.com/jackc/pgx/blob/master/msg_reader.go#L194 if the caller could pass in an already-allocated buffer (from a sync.Pool, maybe). But I don't know Go well enough to be sure that is the right approach to reducing allocations.
This only gets called when Scan is passed a *[]byte argument, and it expects to return a new slice. I suppose we could do something like this:

type PreallocatedBytes []byte

buf := make([]byte, 1024)

conn.QueryRow("...").Scan((*PreallocatedBytes)(&buf))

Scan would treat *PreallocatedBytes differently than *[]byte and would copy data into it instead of creating a new slice. It could raise an error if the scanned value is too big, or just allocate if necessary.

I think this might work. Is this the type of thing you were thinking of?


The database/sql Scan() method behaves differently based on whether the argument is []byte, *[]byte, or something else (like a string). Can we do it without defining a new type?

That said, I am still trying to really understand the database/sql behavior difference between *[]byte and []byte :) Maybe some Go guru should chime in here on what the interface should be. I'm afraid I'm kind of a newbie and probably not the right person to decide what the ideal interface should be.

I will try your other patch / update and see what it does for my program.

Thanks
--
Harry

Jack Christensen

Oct 3, 2014, 2:58:17 PM
to Harry B, golan...@googlegroups.com, div...@gmail.com


Harry B wrote:


On Saturday, September 27, 2014 2:34:41 PM UTC-7, Jack Christensen wrote:
2. It would be nice to avoid the allocation at https://github.com/jackc/pgx/blob/master/msg_reader.go#L194 if the caller could pass in an already-allocated buffer (from a sync.Pool, maybe). But I don't know Go well enough to be sure that is the right approach to reducing allocations.
This only gets called when Scan is passed a *[]byte argument, and it expects to return a new slice. I suppose we could do something like this:

type PreallocatedBytes []byte

buf := make([]byte, 1024)

conn.QueryRow("...").Scan((*PreallocatedBytes)(&buf))

Scan would treat *PreallocatedBytes differently than *[]byte and would copy data into it instead of creating a new slice. It could raise an error if the scanned value is too big, or just allocate if necessary.

I think this might work. Is this the type of thing you were thinking of?


The database/sql Scan() method behaves differently based on whether the argument is []byte, *[]byte, or something else (like a string). Can we do it without defining a new type?
I don't think database/sql's Scan() actually does that. I just ran a test: if I pass a []byte instead of a *[]byte I get this error: "sql: Scan error on column index 0: destination not a pointer".

There is precedent for defining a type to enable different behavior on Scan. Check out http://golang.org/pkg/database/sql/#RawBytes.
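
For reference, a minimal RawBytes example; the scanned bytes are only valid
until the next call to Next, Scan, or Close:

package main

import (
	"database/sql"
)

// totalNameLen shows sql.RawBytes: Scan hands back the driver's buffer
// without copying, so nothing is allocated per row for the value itself.
func totalNameLen(db *sql.DB) (int, error) {
	rows, err := db.Query("select name from widgets")
	if err != nil {
		return 0, err
	}
	defer rows.Close()

	total := 0
	var raw sql.RawBytes
	for rows.Next() {
		if err := rows.Scan(&raw); err != nil {
			return 0, err
		}
		total += len(raw) // use raw before advancing the cursor
	}
	return total, rows.Err()
}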

That said, I am still trying to really understand the database/sql behavior difference between *[]byte and []byte :) Maybe some Go guru should chime in here on what the interface should be. I'm afraid I'm kind of a newbie and probably not the right person to decide what the ideal interface should be.

I will try your other patch / update and see what it does for my program.

Thanks
--
Harry

Harry B

Oct 21, 2014, 3:41:04 PM
to golan...@googlegroups.com, harry...@gmail.com, div...@gmail.com
Jack,

Sorry I didn't respond earlier, but your suggestion of using a new type seems good. Please go ahead.

I have a pull request for a different change on the write path. I think your earlier patch (bytea vs. text) only covered the select/read path.

https://github.com/jackc/pgx/pull/41

I haven't done pull requests before; let me know if this is a good way to go about it.

Thanks
--
Harry