Logging HTTP Requests with Channels, Go routines, and concurrency!


nrajc...@gmail.com

Feb 20, 2014, 7:09:51 AM
to golan...@googlegroups.com
Hello fellow golang enthusiasts,

I was recently given the task of creating a Go script for a company I work for, and I am having trouble, to say the least. I am completely new to Go but am very familiar with C concepts as well as Python, so I've been picking it up fast. The script, conceptually, is pretty straightforward, and I am hoping to leverage all the goodness of Go to create it. I am not looking for someone to do my job for me; I would just like to be pointed in the right direction. Examples are always great, though.

The script goes as follows:

- Accept an incoming HTTP Request.
 
This is trivial of course. 
 
- Create an exact copy of the request and change the URL to point at a different host:port combination. This includes anything that may be passed via POST and/or GET.

For example, if the incoming request was http://127.0.0.1:8080/my/url/path?things=ofcourse&key=123, then the replica request would look like http://127.0.0.1:8888/my/url/path?things=ofcourse&key=123. IMPORTANT: The copied request MUST also include any POST or GET data.

 I don't think I'm having any problems with this step. Again at first glance it seems rather trivial and my initial tests have been positive. Please let me know if I am mistaken and it's not as simple as this:
 
var rcopy *http.Request = r                
newRequest, err := http.NewRequest(rcopy.Method, "http://127.0.0.1:8888"+rcopy.RequestURI, rcopy.Body)

Will this give me an exact copy of the old request just with a new url path? Do I need to use r.ParseForm() anywhere?

- These next three will ideally all be goroutines (I think?) so that they can process their jobs faster and keep CPU usage as low as possible while still handling many requests in a short period of time:

  • Use a goroutine to make an HTTP request with our newRequest variable (the copied version of the request). I will need to log the response status (200, 404, etc.), so it has to use channels to transmit this status back to another section of the code.
  • Log all the data from the initial request as well as the return status from the previous bullet. By this I mean I want to log all GET/POST data from the original request, together with the return status from the first bullet. This means I need to wait for the newRequest to be processed and return a response before the log step is complete.
    •  I am currently using a channel to send the response to a simple function that then logs the info into a file.
    •  Right now the log writes to disk EVERY time. If I'm not mistaken, there is a better way? I've seen several tutorials using buffers that looked promising.
  • All the data that I log to disk I also wish to INSERT into a MongoDB collection. The collection would have a name and value for each POST/GET variable/value as well as any custom values I add myself (e.g. timestamp).
    •  Similarly to the log function, this one will just put the data into a variable (struct? map? slices?) and create a new entry in MongoDB. The data will also include the response status from the first bullet, so I will have to manipulate channels again here.
Currently I am doing both the system log and the mongo insert in the same routine as they use practically the same variables. I also parse out the variables (from r.Form) in here too versus earlier in the script. This works but I am almost 100% sure it is not the best way. I've read a handful of articles that talk about padded buffers and multiple channels but none of them are quite what I need.

And that's pretty much it. The script needs to perform those THREE tasks as fast as possible and with minimal CPU usage/overhead; this is why I am using Go in the first place.  If anyone could possibly give me some pseudocode or a brief design breakdown for a task like this I would greatly appreciate it. Anything helps because as I said before, I'm just a beginner in Go.

But I can't wait to learn everything there is to offer! I think it is a phenomenal language and am enjoying myself very much.

Best regards and thank you in advance

Matt Silverlock

Feb 20, 2014, 10:46:27 AM
to golan...@googlegroups.com, nrajc...@gmail.com
https://github.com/gorilla/handlers should help with the logging part of things.

Matt Harden

Feb 24, 2014, 10:05:17 AM
to nrajc...@gmail.com, golang-nuts
On Thu, Feb 20, 2014 at 6:09 AM, <nrajc...@gmail.com> wrote:
> Hello fellow golang enthusiasts,
>
> I was recently given the task of creating a Go script for a company I work for, and I am having trouble, to say the least. I am completely new to Go but am very familiar with C concepts as well as Python, so I've been picking it up fast. The script, conceptually, is pretty straightforward, and I am hoping to leverage all the goodness of Go to create it. I am not looking for someone to do my job for me; I would just like to be pointed in the right direction. Examples are always great, though.
>
> The script goes as follows:
>
> - Accept an incoming HTTP Request.
>
> This is trivial of course.
>
> - Create an exact copy of the request and change the URL to point at a different host:port combination. This includes anything that may be passed via POST and/or GET.
>
> For example, if the incoming request was http://127.0.0.1:8080/my/url/path?things=ofcourse&key=123, then the replica request would look like http://127.0.0.1:8888/my/url/path?things=ofcourse&key=123. IMPORTANT: The copied request MUST also include any POST or GET data.
>
> I don't think I'm having any problems with this step. Again at first glance it seems rather trivial and my initial tests have been positive. Please let me know if I am mistaken and it's not as simple as this:
>
> var rcopy *http.Request = r
> newRequest, err := http.NewRequest(rcopy.Method, "http://127.0.0.1:8888"+rcopy.RequestURI, rcopy.Body)

There is no need to create rcopy. It's not a copy of the original request, just a copy of the pointer. If you are going to use the Body more than "once", you will want to wrap it with an io.TeeReader or something similar. Also see httputil.DumpRequest.
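To make the io.TeeReader suggestion concrete, here is a minimal sketch. The helper name forwardAndCapture is made up for illustration, and io.ReadAll stands in for whatever would actually consume the body (normally the HTTP client sending the forwarded request).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// forwardAndCapture (hypothetical helper) reads a body exactly once while
// keeping a second copy for logging. io.ReadAll plays the role of the
// HTTP client that would normally drain the body.
func forwardAndCapture(body string) (forwarded, captured string, err error) {
	src := strings.NewReader(body)
	var logCopy bytes.Buffer

	// Every byte read through tee is also written into logCopy.
	tee := io.TeeReader(src, &logCopy)

	sent, err := io.ReadAll(tee)
	if err != nil {
		return "", "", err
	}
	return string(sent), logCopy.String(), nil
}

func main() {
	forwarded, captured, err := forwardAndCapture("things=ofcourse&key=123")
	if err != nil {
		panic(err)
	}
	fmt.Println("forwarded:", forwarded)
	fmt.Println("captured: ", captured)
}
```

The captured copy is only complete once the consumer has read the body to the end, so log it after the forwarded request has finished.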
 
> Will this give me an exact copy of the old request just with a new url path? Do I need to use r.ParseForm() anywhere?
 
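Mostly. The method, path, query string, and body carry over, but http.NewRequest starts with empty headers, so copy r.Header (Content-Type in particular, for POST data) onto the new request. No need for ParseForm() unless you need to, you know, Parse a Form. :-) Note that ParseForm() reads a POST body, so calling it before forwarding would leave nothing to send.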
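As a rough sketch of building the replica request, the following keeps the method, path, query string, body, and headers. The names newForwardRequest and demo are hypothetical, and httptest.NewRequest is used only to fake an incoming request for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newForwardRequest (hypothetical helper) builds an outgoing request aimed
// at target, e.g. "http://127.0.0.1:8888", from the incoming request r.
func newForwardRequest(r *http.Request, target string) (*http.Request, error) {
	out, err := http.NewRequest(r.Method, target+r.URL.RequestURI(), r.Body)
	if err != nil {
		return nil, err
	}
	// http.NewRequest starts with empty headers, so copy them explicitly.
	for k, vv := range r.Header {
		out.Header[k] = append([]string(nil), vv...)
	}
	// r.Body is an opaque reader here, so carry the declared length over.
	out.ContentLength = r.ContentLength
	return out, nil
}

// demo fakes an incoming POST and reports where the replica would be sent.
func demo() (url, contentType string, err error) {
	in := httptest.NewRequest("POST",
		"http://127.0.0.1:8080/my/url/path?things=ofcourse&key=123",
		strings.NewReader("a=1"))
	in.Header.Set("Content-Type", "application/x-www-form-urlencoded")

	out, err := newForwardRequest(in, "http://127.0.0.1:8888")
	if err != nil {
		return "", "", err
	}
	return out.URL.String(), out.Header.Get("Content-Type"), nil
}

func main() {
	url, ct, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(url)
	fmt.Println(ct)
}
```

Whether every header should be forwarded verbatim depends on the backend; this sketch simply copies them all.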

> - These next three will ideally all be goroutines (I think?) so that they can process their jobs faster and keep CPU usage as low as possible while still handling many requests in a short period of time:

I'm not sure you need separate goroutines for these. http.Server will launch a new goroutine for every request. It looks like each step below needs information from the prior step, so it makes sense to run them in sequence in a single goroutine.
>   • Use a goroutine to make an HTTP request with our newRequest variable (the copied version of the request). I will need to log the response status (200, 404, etc.), so it has to use channels to transmit this status back to another section of the code.
No need for channels. Just hold on to the response in a variable and use it in the next sequential section of code. Don't use goroutines and channels unless you need concurrency. In this case the actions are sequential, so just run them sequentially.

resp, err := http.DefaultClient.Do(req)
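A small sketch of the sequential version: the status is just a return value, with no channel involved. forwardStatus and demo are illustrative names, and the httptest server is a throwaway stand-in for the real backend on port 8888.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// forwardStatus (hypothetical helper) sends req and returns the backend's
// status code, which the caller can log in the very next statement.
func forwardStatus(client *http.Client, req *http.Request) (int, error) {
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// demo starts a stand-in backend that always answers with code and
// returns the status that forwardStatus observed.
func demo(code int) (int, error) {
	backend := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			w.WriteHeader(code)
		}))
	defer backend.Close()

	req, err := http.NewRequest("GET", backend.URL+"/my/url/path?key=123", nil)
	if err != nil {
		return 0, err
	}
	return forwardStatus(http.DefaultClient, req)
}

func main() {
	status, err := demo(http.StatusNotFound)
	if err != nil {
		panic(err)
	}
	fmt.Println("backend status:", status)
}
```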
>   • Log all the data from the initial request as well as the return status from the previous bullet. By this I mean I want to log all GET/POST data from the original request, together with the return status from the first bullet. This means I need to wait for the newRequest to be processed and return a response before the log step is complete.
>     •  I am currently using a channel to send the response to a simple function that then logs the info into a file.
>     •  Right now the log writes to disk EVERY time. If I'm not mistaken, there is a better way? I've seen several tutorials using buffers that looked promising.
The bufio package. Also see httputil.DumpResponse.
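One way the bufio suggestion might look in practice is sketched below. The bufferedLog type and its methods are made up for this example, and a bytes.Buffer stands in for the log file so that the effect of buffering is visible.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"sync"
)

// bufferedLog (hypothetical type) batches log lines in memory and writes
// them to the destination only when the buffer fills or Flush is called.
type bufferedLog struct {
	mu sync.Mutex // handlers run concurrently, so guard the writer
	w  *bufio.Writer
}

func newBufferedLog(dst io.Writer, size int) *bufferedLog {
	return &bufferedLog{w: bufio.NewWriterSize(dst, size)}
}

// Printf appends one formatted entry to the in-memory buffer.
func (l *bufferedLog) Printf(format string, args ...interface{}) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	_, err := fmt.Fprintf(l.w, format, args...)
	return err
}

// Flush pushes everything buffered so far to the destination.
func (l *bufferedLog) Flush() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.w.Flush()
}

// demo reports how many bytes reached the destination before and after Flush.
func demo() (before, after int, err error) {
	var dst bytes.Buffer // stands in for an *os.File
	l := newBufferedLog(&dst, 4096)
	if err = l.Printf("GET /my/url/path status=%d\n", 200); err != nil {
		return 0, 0, err
	}
	before = dst.Len()
	if err = l.Flush(); err != nil {
		return 0, 0, err
	}
	return before, dst.Len(), nil
}

func main() {
	before, after, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Printf("bytes written before flush: %d, after flush: %d\n", before, after)
}
```

The trade-off is that anything still in the buffer is lost if the process dies, so a real program would flush on a timer and at shutdown.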
>   • All the data that I log to disk I also wish to INSERT into a MongoDB collection. The collection would have a name and value for each POST/GET variable/value as well as any custom values I add myself (e.g. timestamp).
>     •  Similarly to the log function, this one will just put the data into a variable (struct? map? slices?) and create a new entry in MongoDB. The data will also include the response status from the first bullet, so I will have to manipulate channels again here.
I would just use httputil.Dump{Request,Response}, mentioned above, and store the returned []byte values directly into the collection. I would use the mgo package for this.
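If you do want one name/value pair per GET/POST variable rather than the raw dump, a record could be shaped roughly as below. The logRecord type, its field names, and newLogRecord are assumptions made for this sketch; the actual insert is left as a comment because it needs a live MongoDB session.

```go
package main

import (
	"fmt"
	"net/url"
	"time"
)

// logRecord is a hypothetical document shape: one entry per request,
// holding the parsed variables plus custom values such as a timestamp.
type logRecord struct {
	Timestamp time.Time
	Status    int                 // response status from the forwarded request
	Params    map[string][]string // GET/POST variables by name
}

// newLogRecord copies the form values so the record does not alias the
// request's own maps and slices.
func newLogRecord(form url.Values, status int, now time.Time) logRecord {
	params := make(map[string][]string, len(form))
	for k, vv := range form {
		params[k] = append([]string(nil), vv...)
	}
	return logRecord{Timestamp: now, Status: status, Params: params}
}

// demo parses a sample query string and builds a record from it.
func demo() (key string, status int, err error) {
	form, err := url.ParseQuery("things=ofcourse&key=123")
	if err != nil {
		return "", 0, err
	}
	rec := newLogRecord(form, 200, time.Now())
	// With mgo, the record would then be stored with something like
	// collection.Insert(rec), where collection is assumed to be an
	// *mgo.Collection opened elsewhere.
	return rec.Params["key"][0], rec.Status, nil
}

func main() {
	key, status, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("key:", key, "status:", status)
}
```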
 
> Currently I am doing both the system log and the mongo insert in the same routine as they use practically the same variables. I also parse out the variables (from r.Form) in here too versus earlier in the script. This works but I am almost 100% sure it is not the best way. I've read a handful of articles that talk about padded buffers and multiple channels but none of them are quite what I need.

Why do you think there's a better way?
 
> And that's pretty much it. The script needs to perform those THREE tasks as fast as possible and with minimal CPU usage/overhead; this is why I am using Go in the first place.  If anyone could possibly give me some pseudocode or a brief design breakdown for a task like this I would greatly appreciate it. Anything helps because as I said before, I'm just a beginner in Go.

Go can handle those tasks pretty much as fast as possible and with minimal CPU usage. You're not doing anything CPU intensive here, so network and disk I/O should dominate. If you find they do not, then profile your code and find the bottleneck, and improve that. Don't try to optimize by hand before you have data on where optimizations are needed. Remember the http package will create a goroutine for each incoming request, and they will all run concurrently. If you create goroutines unnecessarily, you will only slow things down - they are lightweight, but I just don't see a reason to use them in the tasks above.
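For the profiling step, one common option is the standard net/http/pprof package. The sketch below only checks that its handlers are registered on the default mux; pprofIndexStatus is an illustrative name, and the listen address in the comment is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // side effect: registers /debug/pprof/ handlers on http.DefaultServeMux
)

// pprofIndexStatus asks the default mux for the pprof index page and
// returns the HTTP status, without opening a network port.
func pprofIndexStatus() int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/debug/pprof/", nil)
	http.DefaultServeMux.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("pprof index status:", pprofIndexStatus())

	// In the real program, the profiles could be exposed on a private
	// address (assumed here) alongside the proxy:
	//   go http.ListenAndServe("127.0.0.1:6060", nil)
	// and a CPU profile collected under load with:
	//   go tool pprof http://127.0.0.1:6060/debug/pprof/profile
}
```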

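HTTP request handler (not tested; error handling is minimal):

// httpdump holds the raw request and response. The fields are exported
// because mgo's bson encoder only stores exported fields.
type httpdump struct {
    Req  []byte
    Resp []byte
}

// collection is assumed to be a package-level *mgo.Collection set up elsewhere.
func handler(w http.ResponseWriter, req *http.Request) {
    var dump httpdump
    var err error

    // Grab a dump of the incoming request (true: dump the body also).
    // DumpRequest restores req.Body, so it can still be forwarded.
    dump.Req, err = httputil.DumpRequest(req, true)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    // Send the request to our server and retrieve the response.
    // A server-side request has no scheme or host in its URL, and the
    // client refuses requests that still have RequestURI set.
    req.URL.Scheme = "http"
    req.URL.Host = "127.0.0.1:8888"
    req.RequestURI = ""
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadGateway)
        return
    }
    defer resp.Body.Close()

    // Grab a dump of the response (true: dump the body also)
    dump.Resp, err = httputil.DumpResponse(resp, true)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    // Copy the response back to the client
    from, to := resp.Header, w.Header()
    for k, v := range from {
        to[k] = v
    }
    w.WriteHeader(resp.StatusCode) // WriteHeader returns no value
    if _, err = io.Copy(w, resp.Body); err != nil { // destination first
        log.Printf("copying response: %v", err)
    }

    log.Printf("%s\n%s\n", dump.Req, dump.Resp)
    if err = collection.Insert(dump); err != nil {
        log.Printf("mongo insert: %v", err)
    }
}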
