Continuing discussion on the roadmap for 0.3, result handling

11 views
Skip to first unread message

Phil Bayfield

unread,
Jan 10, 2011, 8:14:26 AM1/10/11
to GoMySQL
This stems from the original issue thread on Github:
https://github.com/Philio/GoMySQL/issues#issue/36

First steps I've taken is to clean up the instantiation and connection
code. See https://github.com/Philio/GoMySQL/blob/dev/mysql.go

The MySQL struct is now called Client, plus we have 3 new functions,
NewClient, DialTCP, DialUnix. The idea here is to simplify and fall in
line with the "Go way of thinking", which I think this does quite
nicely.

I've also added a bunch of empty functions which will include some
additional functionality (really it's wrapping what can be done with
queries or existing functions in a nice way) e.g. transactions, using
prepare to directly prepare a statement from an SQL statement with a
single method call.

Main thing that needs discussion, finalising a plan on is the best way
to handle packets.

Possible ways to improve/build on the simple retrieve and process
methodology of 0.2:

1. Separate the read/write/parse process in a way that we would read
the whole packet, attempt to process it then act, see original post on
Github.

2. Adopt the C API method, see https://github.com/Philio/GoMySQL/issues#issue/36/comment/657628

3. Some other way, perhaps we could also incorporate multi-threading/
go routines to improve efficiency.

Tom Lee

unread,
Jan 10, 2011, 8:51:50 AM1/10/11
to gom...@googlegroups.com
Regarding packet processing:

I'd like to see packets handled as a stream such that a call to e.g.
"nextPacket(reader)" returns the next packet on the reader, irrespective
of type. This would fit in perfectly with your "read header, read entire
packet" approach too.

If we could make something like this work, it would probably greatly
simplify stuff like stmt.getExecuteResult() in 0.2.x

The only thing I see stopping this is the apparent overlap between
things like Result and Field packets -- I might just be half asleep, but
it seems to me whether a packet is a Result or a Field seems to depend
upon the context (i.e. "have we received a field EOF yet?").

The approach that makes sense to me at a glance is to read all the field
packets immediately after the result packet and treat them as *part* of
the result packet. This would somewhat bastardize what is meant by a
"packet", but it would keep the packet processing stateless and thus
eliminate the need for all that big-switch, state-checking horribleness
in 0.2.

Implementation for nextPacket(r) would thus be something like:

1. Read header from r
2. Read full packet body from r
3. Does first byte indicate OK packet? Process & return OK packet.
4. Does first byte indicate error packet? Process & return error packet.
5. Does first byte indicate result packet? Process packet to determine
field count. Process fields to EOF packet. Return result packet.
6. <other packet types Sleepy Tom has forgotten>
N. Unknown packet -> error.

This also provides a natural solution for the store/use problem (my
"priority 2"): Upon seeing a result packet, code calling nextPacket()
can make the assumption that a set of rows are available and can call
e.g. nextRowPacket() all the way up to the next EOF. Whether that's done
all at once (ala mysql_store_result) or leave it to a result set to
return each row one-by-one (ala mysql_use_result) is left to the caller.
If we were to go this way, "Result" might best be an interface so that
we can provide the two different implementations seamlessly.

Happy to provide example code if that helps, but right now I need some
sleep. :) Happy to hear any input.

Cheers,
Tom

On 11/01/11 00:14, Phil Bayfield wrote:
> This stems from the original issue thread on Github:
> https://github.com/Philio/GoMySQL/issues#issue/36
>
> First steps I've taken is to clean up the instantiation and connection

> code. Seehttps://github.com/Philio/GoMySQL/blob/dev/mysql.go


>
> The MySQL struct is now called Client, plus we have 3 new functions,
> NewClient, DialTCP, DialUnix. The idea here is to simplify and fall in
> line with the "Go way of thinking", which I think this does quite
> nicely.
>
> I've also added a bunch of empty functions which will include some
> additional functionality (really it's wrapping what can be done with
> queries or existing functions in a nice way) e.g. transactions, using
> prepare to directly prepare a statement from an SQL statement with a
> single method call.
>
> Main thing that needs discussion, finalising a plan on is the best way
> to handle packets.
>
> Possible ways to improve/build on the simple retrieve and process
> methodology of 0.2:
>
> 1. Separate the read/write/parse process in a way that we would read
> the whole packet, attempt to process it then act, see original post on
> Github.
>

> 2. Adopt the C API method, seehttps://github.com/Philio/GoMySQL/issues#issue/36/comment/657628


>
> 3. Some other way, perhaps we could also incorporate multi-threading/
> go routines to improve efficiency.


--
Tom Lee
http://tomlee.co

ph 0450 112 893


Phil Bayfield

unread,
Jan 10, 2011, 11:32:46 AM1/10/11
to gom...@googlegroups.com
On 10 January 2011 13:51, Tom Lee <goo...@tomlee.co> wrote:
Regarding packet processing:

I'd like to see packets handled as a stream such that a call to e.g. "nextPacket(reader)" returns the next packet on the reader, irrespective of type. This would fit in perfectly with your "read header, read entire packet" approach too.

If we could make something like this work, it would probably greatly simplify stuff like stmt.getExecuteResult() in 0.2.x

The only thing I see stopping this is the apparent overlap between things like Result and Field packets -- I might just be half asleep, but it seems to me whether a packet is a Result or a Field seems to depend upon the context (i.e. "have we received a field EOF yet?").

It's a bit of a pain in the arse in places, I think some of the prepared statement stuff is the worst.
Also there are some errors in MySQL docs that I've discovered while investigating previous bugs, but we can determine the packet type one way or another. 

The approach that makes sense to me at a glance is to read all the field packets immediately after the result packet and treat them as *part* of the result packet. This would somewhat bastardize what is meant by a "packet", but it would keep the packet processing stateless and thus eliminate the need for all that big-switch, state-checking horribleness in 0.2.

Implementation for nextPacket(r) would thus be something like:

1. Read header from r
2. Read full packet body from r
3. Does first byte indicate OK packet? Process & return OK packet.
4. Does first byte indicate error packet? Process & return error packet.
5. Does first byte indicate result packet? Process packet to determine field count. Process fields to EOF packet. Return result packet.
6. <other packet types Sleepy Tom has forgotten>
N. Unknown packet -> error.

This also provides a natural solution for the store/use problem (my "priority 2"): Upon seeing a result packet, code calling nextPacket() can make the assumption that a set of rows are available and can call e.g. nextRowPacket() all the way up to the next EOF. Whether that's done all at once (ala mysql_store_result) or leave it to a result set to return each row one-by-one (ala mysql_use_result) is left to the caller. If we were to go this way, "Result" might best be an interface so that we can provide the two different implementations seamlessly.


If we do the store/use way it might make more sense to do something like....

Read header
Read first packet in sequence
Check packet type
OK -> Send OK back
Error -> Send error back
Result set -> Do nothing but indicate that a result is waiting
Store result -> Save the result
Free result -> discard the result [if we've received it, clean up, if not, discard/ignore it]

I think this is kinda how the C API functions.

Phil Bayfield

unread,
Jan 11, 2011, 2:20:19 AM1/11/11
to GoMySQL
This is how it works in C:

After invoking mysql_query() or mysql_real_query(), you must call
mysql_store_result() or mysql_use_result() for every statement that
successfully produces a result set (SELECT, SHOW, DESCRIBE, EXPLAIN,
CHECK TABLE, and so forth). You must also call mysql_free_result()
after you are done with the result set.

http://dev.mysql.com/doc/refman/5.5/en/mysql-store-result.html

Phil Bayfield

unread,
Jan 11, 2011, 2:34:27 AM1/11/11
to GoMySQL
This is the work flow for the C API:

mysql_query("select * from blah")

This returns an integer success value.

mysql_store_result() -> Retrieves the entire result and returns a
pointer to it. (What we do now automatically.)
mysql_use_result() -> Also returns a pointer to the result also but
does not retrieve it yet, each row is retrieved one by one as and when
requested by the client.

So we could have something like:

func (client *Client) Query(sql string) (resCode int, err os.Error) {}

Here we would send the query command to the server, read the first
result packet (and probably store it for later) and return a response
to indicate that the query was successful, resCode could in fact
probably just be a boolean success value.

func (client *Client) StoreResult() (res *Result, err os.Error) {}

Here we would retrieve and process the entire result set and return a
pointer to it.

func (client *Client) UseResult() (res *Result, err os.Error) {}

Here we would retrieve the first few params up until a field EOF
packet is received, but leave the row data for now. Row retrieval
would be initiated every time we request a row from the result.

func (client *Client) FreeResult() {}

Remove the current result from memory, if there are unreceived packets
"on the wire" we need to remove them.

Tom Lee

unread,
Jan 11, 2011, 4:53:27 AM1/11/11
to gom...@googlegroups.com

On 11/01/11 18:34, Phil Bayfield wrote:
This is the work flow for the C API:

mysql_query("select * from blah")

This returns an integer success value.

mysql_store_result() -> Retrieves the entire result and returns a
pointer to it. (What we do now automatically.)
mysql_use_result() -> Also returns a pointer to the result also but
does not retrieve it yet, each row is retrieved one by one as and when
requested by the client.

So we could have something like:

func (client *Client) Query(sql string) (resCode int, err os.Error) {}


I'd suggest that resCode wasn't necessary i.e.:
func (c *Client) Query(sql string) os.Error
Seems more idiomatic to simply look at the error than rely on a status code and an error. What do you think?


Here we would send the query command to the server, read the first
result packet (and probably store it for later) and return a response
to indicate that the query was successful, resCode could in fact
probably just be a boolean success value.

func (client *Client) StoreResult() (res *Result, err os.Error) {}

Here we would retrieve and process the entire result set and return a
pointer to it.

func (client *Client) UseResult() (res *Result, err os.Error) {}

Here we would retrieve the first few params up until a field EOF
packet is received, but leave the row data for now. Row retrieval
would be initiated every time we request a row from the result.


Sounds good, although I'd suggest that Result should be an interface rather than a struct. Then we could provide the two implementations as e.g. ClientSideResult or ServerSideResult structs without mixing up the implementation details of Use/Store.


func (client *Client) FreeResult() {}

Remove the current result from memory, if there are unreceived packets
"on the wire" we need to remove them.

I think this should belong to the Result. Once Client has returned a Result, the Result should be responsible for itself IMHO.

With all that out of the way, a quick discussion of some options for the result stuff:

Option A

This is essentially the proposal you described in your email:
_, err := client.Query("SELECT * FROM blah")
if err != nil {
  // ...
}
res, err := client.UseResult() // or client.StoreResult()
if err != nil {
  // ...
}
defer client.FreeResult()
// ...

Looks reasonable, but puts all the result set management burden on the Client instance.

Option B

Usage of Client with my suggestions in place would look something like this when executing a query:

err := client.Query("SELECT * FROM blah")
if err != nil {
  // ...
}
res, err := client.UseResult() // or client.StoreResult()
if err != nil {
  // ...
}
defer res.Free()
// ...
Again, looks reasonable -- but maybe still some room for improvement.

Option C

Move responsibility for all Result related stuff into the Result itself once Query has been called.
res, err := client.Query("SELECT * FROM blah")
if err != nil {
  // handle error
}
defer res.Free()
res.Use() // if the user doesn't explicitly call Use() or Store() prior to requesting a row, default to something sane
// ...
Looks nice, but might complicate the implementation due to implementation details being slightly different depending on whether you call Use() or Store().

Lastly, I have a slight concern that it's possible to forget to call UseResult()/StoreResult() for both Option A and B, which might leave junk on the wire. We may need to weave some magic to discard "old" result sets prior to sending & receiving additional packets.

Any preferences/thoughts?

Cheers,
Tom

Phil Bayfield

unread,
Jan 11, 2011, 5:28:04 AM1/11/11
to gom...@googlegroups.com
On 11 January 2011 09:53, Tom Lee <goo...@tomlee.co> wrote:

On 11/01/11 18:34, Phil Bayfield wrote:
This is the work flow for the C API:

mysql_query("select * from blah")

This returns an integer success value.

mysql_store_result() -> Retrieves the entire result and returns a
pointer to it. (What we do now automatically.)
mysql_use_result() -> Also returns a pointer to the result also but
does not retrieve it yet, each row is retrieved one by one as and when
requested by the client.

So we could have something like:

func (client *Client) Query(sql string) (resCode int, err os.Error) {}


I'd suggest that resCode wasn't necessary i.e.:
func (c *Client) Query(sql string) os.Error
Seems more idiomatic to simply look at the error than rely on a status code and an error. What do you think?


Yep, now I've woken up this morning with the help of some Red Bull that makes mu more sense!
This is actually the C way:

void mysql_free_result(MYSQL_RES *result)

Being totally procedural, I forgot to check what it was actually doing earlier.

Having read the above also agree with this it should be Result.Free() not Client.FreeResult()
 

Option C

Move responsibility for all Result related stuff into the Result itself once Query has been called.
res, err := client.Query("SELECT * FROM blah")
if err != nil {
  // handle error
}
defer res.Free()
res.Use() // if the user doesn't explicitly call Use() or Store() prior to requesting a row, default to something sane
// ...
Looks nice, but might complicate the implementation due to implementation details being slightly different depending on whether you call Use() or Store().

This looks a bit wrong to me tbh. If we already have a result then we wouldn't need to use it.

Noticed in most of the MySQL libraries (in other languages) that this is always called on the client/mysql object, C as an example again:

MYSQL_RES *mysql_use_result(MYSQL *mysql)
 

Lastly, I have a slight concern that it's possible to forget to call UseResult()/StoreResult() for both Option A and B, which might leave junk on the wire. We may need to weave some magic to discard "old" result sets prior to sending & receiving additional packets.

 
This could be a problem, we need to check and remove any data left over.

I suspect what we'll be doing is using a pointer to the client within the result object anyway, like:

func (res *Result) fetchRow() *Row {
    if res.loadDataFromWire {
        res.client.loadRow()
... etc

Basically when we get a row EOF that means the end of the result set, we need to check for additional result sets, not sure if there is a better way than checking if there is any more received data, think this is the only way.

We'll also need for this...

func (client *Client) MoreResults() os.Error {} // Check if more results are available
func (client *Client) NextResult() os.Error {} // Move to the next result set (this explains it well: http://dev.mysql.com/doc/refman/5.5/en/mysql-next-result.html)

So you would do something like...

res := client.UseResult()
....
res.Free()

if client.MoreResults() {
    client.NextResult()
    res = client.UseResult()
}
 
Any preferences/thoughts?

Cheers,
Tom


-- 
Tom Lee
http://tomlee.co

ph 0450 112 893

--
You received this message because you are subscribed to the Google Groups "GoMySQL" group.
To post to this group, send email to gom...@googlegroups.com.
To unsubscribe from this group, send email to gomysql+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/gomysql?hl=en.

Tom Lee

unread,
Jan 11, 2011, 5:43:03 AM1/11/11
to gom...@googlegroups.com
On 11/01/11 21:28, Phil Bayfield wrote:
We'll also need for this...

func (client *Client) MoreResults() os.Error {} // Check if more results are available
func (client *Client) NextResult() os.Error {} // Move to the next result set (this explains it well: http://dev.mysql.com/doc/refman/5.5/en/mysql-next-result.html)

So you would do something like...

res := client.UseResult()
....
res.Free()

if client.MoreResults() {
    client.NextResult()
    res = client.UseResult()
}
 


Lost me a bit here -- I'm assuming something like this would supersede e.g. MultiQuery()?

Phil Bayfield

unread,
Jan 11, 2011, 5:45:57 AM1/11/11
to gom...@googlegroups.com
Yep, plus I think it is possible to receive multiple result sets under certain circumstances from a normal query too - Read something somewhere in some API docs but not sure where. 
-- 
Tom Lee
http://tomlee.co

ph 0450 112 893

Phil Bayfield

unread,
Jan 11, 2011, 7:09:32 AM1/11/11
to gom...@googlegroups.com
There are a few multi threading issues to consider here as well.

Have just copy/pasted this from the manual as it all applies:

  • Two threads can't send a query to the MySQL server at the same time on the same connection. In particular, you have to ensure that between calls to mysql_query() and mysql_store_result() no other thread is using the same connection.

  • Many threads can access different result sets that are retrieved with mysql_store_result().

  • If you use mysql_use_result(), you must ensure that no other thread is using the same connection until the result set is closed. However, it really is best for threaded clients that share the same connection to usemysql_store_result().

  • If you want to use multiple threads on the same connection, you must have a mutex lock around your pair ofmysql_query() and mysql_store_result() calls. Once mysql_store_result() is ready, the lock can be released and other threads may query the same connection.

  • If you use POSIX threads, you can use pthread_mutex_lock() and pthread_mutex_unlock() to establish and release a mutex lock.

Micha? Jarco

unread,
Jan 11, 2011, 8:02:39 AM1/11/11
to gom...@googlegroups.com
Hi!
I've just joined to Your group, because I'm writing kind of orm in Go and I'm looking for go-mysql driver :-) 
Recently I was playing with go-pgsql, and there way of result handling i more like B, but instead of Free() there is Close() method.

Phil Bayfield

unread,
Jan 11, 2011, 8:10:44 AM1/11/11
to gom...@googlegroups.com
Hi Micha

--
You received this message because you are subscribed to the Google Groups "GoMySQL" group.
To post to this group, send email to gom...@googlegroups.com.
To unsubscribe from this group, send email to gomysql+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/gomysql?hl=en.

Welcome to the group :)

I took a (very quick) look at go-pgsql, the result stuff specifically https://github.com/lxn/go-pgsql/blob/master/resultset.go

Looks almost exactly what we are discussing actually thanks for the info, might be useful!

Phil.

Phil Bayfield

unread,
Jan 11, 2011, 8:12:36 AM1/11/11
to gom...@googlegroups.com
Oh also, regarding the naming conventions, we're sort of a cross between MySQL and Go standards on this one, Postgres is probably a bit different again.

Tom Lee

unread,
Jan 11, 2011, 8:19:16 AM1/11/11
to gom...@googlegroups.com
Welcome aboard :)

Regarding Close() vs. Free() ... this might keep the API consistent for
connections & results. I know I've used APIs in the past where there's
confusion about whether I should be calling close, free, disconnect or
dispose :) Not too bothered either way though in the case of results.

A broader question in my mind that's come up as part of this discussion
is whether we want to essentially "Goify" the C API or if we try to nice
it up.

I think 0.2 leaned towards the latter, but I'm +0 for either ... I'll
just be wrapping it all up in go-dbi anyway ;)

Cheers,
Tom

> --
> You received this message because you are subscribed to the Google
> Groups "GoMySQL" group.
> To post to this group, send email to gom...@googlegroups.com.
> To unsubscribe from this group, send email to
> gomysql+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/gomysql?hl=en.

Phil Bayfield

unread,
Jan 11, 2011, 8:27:04 AM1/11/11
to gom...@googlegroups.com
I always consider Close() in the context of a connection.

What it is actually doing is removing our result data from memory (and/or buffer storing raw data) so personally think Free() is much more appropriate.

Micha? Jarco

unread,
Jan 11, 2011, 9:27:54 AM1/11/11
to gom...@googlegroups.com
I don't mind of whatever the name is/will be :-) 

But I like the mimic of go-psql, to be more precise it's option D (mixed b and c):

res, err = client.Query("SELECT * FROM table")
if err!= nil {
// do stuff 
}
res.Close() //Free() whatever

I think it's worth to mention that go-pgsql have this "cursor" behavior. In GoMySQL AFAIR you fetch everything that comes in the response. Go-pgsql gives "client programmer" chance to decide if he want to get (means parse and convert to a proper types) all results and rows or just one or none. I like it that way. For me as a ORM programmer it means quite big efficiency bonus in some cases.

Phil Bayfield

unread,
Jan 11, 2011, 9:35:45 AM1/11/11
to gom...@googlegroups.com
On 11 January 2011 14:27, Micha? Jarco <mja...@gmail.com> wrote:
I don't mind of whatever the name is/will be :-) 

But I like the mimic of go-psql, to be more precise it's option D (mixed b and c):

res, err = client.Query("SELECT * FROM table")
if err!= nil {
// do stuff 
}
res.Close() //Free() whatever

This is quite similar to the way we're looking to do it.
 

I think it's worth to mention that go-pgsql have this "cursor" behavior. In GoMySQL AFAIR you fetch everything that comes in the response. Go-pgsql gives "client programmer" chance to decide if he want to get (means parse and convert to a proper types) all results and rows or just one or none. I like it that way. For me as a ORM programmer it means quite big efficiency bonus in some cases.

This is an interesting idea, something we could definitely consider adding to the library.

I'm not familiar with the Postgres protocol, so I'll explain why we do what we do with MySQL.

For queries, MySQL sends all row data as strings, currently we have some basic conversion to convert ints and floats to corresponding Go types but everything else remains as strings.

For prepared statements, MySQL sends all row data in their raw binary form, so in this instance we convert directly from binary to the equivalent native go type.

Arguably, it would be consistent if both functions returned equivalent results.

Tom Lee

unread,
Jan 11, 2011, 9:39:40 AM1/11/11
to gom...@googlegroups.com
That's an interesting point: if I understand correctly, you're suggesting we should always treat query results as the Use() case, and leave users to decide whether they want to pull everything back at once?

Effectively just the opposite of what 0.2 currently does.

Hmm. I think I like this idea!

Cheers,
Tom
--
You received this message because you are subscribed to the Google Groups "GoMySQL" group.
To post to this group, send email to gom...@googlegroups.com.
To unsubscribe from this group, send email to gomysql+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/gomysql?hl=en.

Phil Bayfield

unread,
Jan 11, 2011, 9:46:13 AM1/11/11
to gom...@googlegroups.com
On 11 January 2011 14:39, Tom Lee <goo...@tomlee.co> wrote:
That's an interesting point: if I understand correctly, you're suggesting we should always treat query results as the Use() case, and leave users to decide whether they want to pull everything back at once?


If we implement UseResult and StoreResult then we get the best of both worlds really, allowing the programmer to choose either.

Phil Bayfield

unread,
Jan 15, 2011, 7:25:45 AM1/15/11
to GoMySQL
I posted an updated Client structure the other day but then got side
tracked with another project

Latest structure is here: https://github.com/Philio/GoMySQL/blob/dev/mysql.go

Plan is to start looking at the packet side of things over the weekend.
Reply all
Reply to author
Forward
0 new messages