Enqueuing http-request's response body as the response body of the current request

Marcin Kulik

May 8, 2013, 5:37:39 AM
to alep...@googlegroups.com
I'm trying to write an HTTP proxy that forwards each request to the proper upstream host based on the Host header.
Right now the upstream host is hardcoded to localhost:3000.
What I want to achieve is to pass the response body channel from the http-request
as the response body of the current request. I have the following code:

    (ns clj-frontal.core
      (:gen-class))

    (use 'lamina.core 'aleph.http 'aleph.formats)

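    (defn get-upstream-url [request]
      (let [host "localhost:3000"
            query (request :query-string)]
        ;; :query-string has no leading "?" and is nil when the request has no query
        (str "http://" host (request :uri) (when query (str "?" query)))))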

    (defn hello-world [ch request]
      (let [method (request :request-method)
            headers (request :headers)
            body (request :body)
            url (get-upstream-url request)]
        (on-realized (http-request {:method method, :url url, :headers headers :body body})
          (fn [response]
            (let [status (response :status)
                  headers (response :headers)
                  body (response :body)]
              (prn body)
              (enqueue ch {:status status :headers headers :body body})
              ))
          #(prn %))))

    (defn -main [& args]
      (start-http-server hello-world {:port 8000}))

When I issue a request to localhost:8000 it prints "<== [ … ]", so I can see that "body" is a channel.
It also sends the headers correctly:

    HTTP/1.1 200 OK
    Cache-Control: max-age=0, private, must-revalidate
    Connection: close
    Content-Encoding: gzip
    Content-Type: text/html; charset=utf-8
    Date: Wed, 08 May 2013 08:38:37 GMT
    Server: aleph/0.3.0
    Server: thin 1.5.0 codename Knife
    Transfer-Encoding: chunked
    ...

But it doesn't send the body - curl waits for it and times out.

To debug this, I tried to print the body channel by adding the following line just before the enqueue:

    (doseq [s (channel->lazy-seq body)] (prn s))

But it hangs on it without printing anything. Apparently channel->lazy-seq returns a blocking sequence, but I don't know why it doesn't print anything.

So I used receive-all to print the body channel contents in an async way:

    (receive-all body-channel #(prn %))

I got a series of buffers like the following:

    #<BigEndianHeapChannelBuffer BigEndianHeapChannelBuffer(ridx=0, widx=406, cap=406)>
    #<BigEndianHeapChannelBuffer BigEndianHeapChannelBuffer(ridx=0, widx=2048, cap=2048)>
    #<BigEndianHeapChannelBuffer BigEndianHeapChannelBuffer(ridx=0, widx=2048, cap=2048)>
    #<BigEndianHeapChannelBuffer BigEndianHeapChannelBuffer(ridx=0, widx=3072, cap=3072)>
    #<BigEndianHeapChannelBuffer BigEndianHeapChannelBuffer(ridx=0, widx=3067, cap=3067)>

So there's some response coming in. I used the bytes->string function to map the channel like this:

    (let [body-channel (map* bytes->string body)]
      (receive-all body-channel #(prn %)))

It nicely prints the incoming response to the terminal, but it still blocks when passed to the response as a body.

Also, I found out that when I create a new channel, enqueue some strings on it, close it, and pass it as the response body, it works fine:

    (let [resp-body (channel)]
      (enqueue resp-body "foo")
      (enqueue resp-body "bar")
      (close resp-body)
      (enqueue ch {:status status :headers headers :body resp-body}))

If I don't close the above channel, it hangs. So the key thing here is that the channel needs to be closed before it is passed as the response body, and I'm not sure why.
In my case (and in the general case of streaming an infinite amount of data) closing the channel doesn't make sense.

Am I missing something obvious here? Thanks!

Zach Tellman

May 8, 2013, 11:48:29 AM
to alep...@googlegroups.com
Hi Marcin,

This is just a guess, but curl doesn't flush the body until there's a newline or the request is complete.  If your chunks don't have newlines, then it will seem as if nothing's coming in, when really curl is just holding onto them until it can flush.

As to why channel->lazy-seq doesn't print anything: you shouldn't perform blocking operations inside of an Aleph handler.  If you wrap it in (future ...), I expect it will work fine.
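
A minimal sketch of the (future ...) suggestion, for debugging only. The helper name `print-body-async` is made up for illustration, and `body` is assumed to be the response body channel from the handler in the original post. As far as I understand, realizing the lazy sequence consumes messages from the channel, so a body printed this way should not also be used as the response body.

```clojure
(use 'lamina.core)

(defn print-body-async
  "Prints every message from `body` on a separate thread, so that the
   blocking sequence never stalls the thread running the Aleph handler.
   Hypothetical helper, not part of Lamina or Aleph."
  [body]
  (future
    ;; channel->lazy-seq blocks while waiting for the next message
    (doseq [chunk (channel->lazy-seq body)]
      (prn chunk))))
```

The future only finishes once the channel is closed and fully read; until then it simply waits in the background.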

If the newline thing isn't your issue, let me know and I'll take a closer look.

Zach




Marcin Kulik

May 8, 2013, 12:44:22 PM
to alep...@googlegroups.com
Hey Zach!

Thanks for the quick response. Well, it turned out that passing the result channel directly to enqueue was working (to some degree).
I was using the httpie tool instead of curl for some of my tests, and httpie was buffering the whole response in order to syntax-highlight the HTML.
When I use curl, it correctly displays the whole response (it has newlines, so that was not the issue, by the way). However, because aleph sends a "transfer-encoding: chunked" header and doesn't close the socket after sending everything from the channel, curl waits for more data. Isn't the result channel from http-request closed once it has delivered all the data?

Marcin Kulik

May 8, 2013, 2:17:41 PM
to alep...@googlegroups.com
Following up on the issue:

I found out that the channel fails to close only for some requests. One example is a locally running Rails application: none of its responses trigger closing of the channel. The other example is "python -m SimpleHTTPServer": 200 responses trigger closing of the channel, 404 responses don't.

When I request the Rails app directly, I get the following headers back (printed by curl):

    < HTTP/1.1 200 OK
    < Content-Type: text/html; charset=utf-8
    < X-UA-Compatible: IE=Edge
    < ETag: "226cb0c03d5a32e653384c8294eb6bd5"
    < Cache-Control: max-age=0, private, must-revalidate
    < Set-Cookie: .....; path=/; HttpOnly
    < X-Request-Id: cfc7de3299b851158b4b67588fdf500d
    < X-Runtime: 0.153090
    < Connection: close
    < Server: thin 1.5.0 codename Knife

When I request Python's server directly (with a valid path), I get:

    * HTTP 1.0, assume close after body
    < HTTP/1.0 200 OK
    < Server: SimpleHTTP/0.6 Python/2.7.4
    < Date: Wed, 08 May 2013 18:10:25 GMT
    < Content-type: text/html; charset=UTF-8
    < Content-Length: 504

When I request Python's server directly (with a bad path), I get:

    * HTTP 1.0, assume close after body
    < HTTP/1.0 404 File not found
    < Server: SimpleHTTP/0.6 Python/2.7.4
    < Date: Wed, 08 May 2013 18:12:51 GMT
    < Content-Type: text/html
    < Connection: close

So it seems that in both cases where the channel isn't closed, the headers from upstream are missing Content-Length, so aleph doesn't really know when to stop reading. It seems that curl/httpie close the connection themselves in this case.
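
To make the observation concrete, here is a rough sketch in plain Clojure of how a client can work out where a response body ends. The function name `body-framing` is hypothetical, the header map is assumed to use lower-case string keys, and responses that never carry a body (HEAD, 204, 304) are ignored for brevity. It only models the general HTTP rules and says nothing about how aleph implements them.

```clojure
(require '[clojure.string :as string])

(defn body-framing
  "Returns how the end of a response body is signalled:
   :chunked        - a terminating zero-length chunk
   :content-length - a fixed number of bytes
   :until-close    - the body runs until the server closes the connection"
  [headers]
  (let [transfer-encoding (get headers "transfer-encoding")]
    (cond
      (and transfer-encoding
           (.contains (string/lower-case transfer-encoding) "chunked")) :chunked
      (contains? headers "content-length") :content-length
      :else :until-close)))

;; Python's 200 response above
(body-framing {"content-type" "text/html; charset=UTF-8", "content-length" "504"})
;; => :content-length

;; Python's 404 response above (the Rails response has the same shape)
(body-framing {"content-type" "text/html", "connection" "close"})
;; => :until-close
```

Both problem responses fall into the :until-close case, where the only end-of-body signal is the upstream server closing its connection.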

Marcin Kulik

May 8, 2013, 2:30:38 PM
to alep...@googlegroups.com
And curl/httpie don't close the connection themselves when requesting via my aleph-based proxy, because aleph returns a "Transfer-Encoding: chunked" header.
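
For reference, a small sketch in plain Clojure of the chunked wire format, which shows why the client keeps waiting. The names `encode-chunk`, `last-chunk` and `encode-body` are made up for illustration, and the `closed?` flag stands in for "the body channel has been closed". This models the HTTP/1.1 framing only, not aleph's actual implementation.

```clojure
(defn encode-chunk
  "Frames one string as an HTTP/1.1 chunk: hex byte count, CRLF, data, CRLF."
  [^String s]
  (let [n (count (.getBytes s "UTF-8"))]
    (str (Integer/toHexString n) "\r\n" s "\r\n")))

;; The zero-length chunk that tells the client the body is complete.
(def last-chunk "0\r\n\r\n")

(defn encode-body
  "Encodes the chunks sent so far; the terminator is added only once the
   source is closed. Empty strings are skipped, since a zero-length chunk
   would be read as the terminator."
  [chunks closed?]
  (str (apply str (map encode-chunk (remove empty? chunks)))
       (when closed? last-chunk)))

(encode-body ["foo" "bar"] true)
;; => "3\r\nfoo\r\n3\r\nbar\r\n0\r\n\r\n"

(encode-body ["foo" "bar"] false)
;; => "3\r\nfoo\r\n3\r\nbar\r\n"
```

In the second case the client has received all the data but no terminating chunk, so it cannot tell that the response is finished.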

Any ideas how to solve this issue?

Zach Tellman

May 8, 2013, 4:37:36 PM
to alep...@googlegroups.com
I'm a little confused, here.  You're creating a proxy, and just returning the response from the http-request you made.  If the body of that response isn't closing, it's because the server you're proxying to hasn't closed it.  Do you see the same behavior when you directly request from the servers, without the proxy?

Marcin Kulik

May 9, 2013, 5:45:13 AM
to alep...@googlegroups.com
Hey Zach,

Here's an asciicast showing this all in action: http://ascii.io/a/3099

The first two requests go directly to Python's SimpleHTTPServer; one returns 200, the other 404. Note that in both cases Python's server closes the connection.
The next two requests go via the aleph proxy; again one returns 200, the other 404. But in the 200 case there's a "connection: keep-alive" header (why?), while in the 404 case there are "connection: close" and "transfer-encoding: chunked".

Zach Tellman

May 23, 2013, 4:59:03 PM
to alep...@googlegroups.com
Hi Marcin,

Sorry for not having replied earlier, I completely lost track of this.  I believe the issue here is one of HTTP 1.0 vs 1.1.  Certain default behaviors (specifically those around keep-alive) changed between the two versions.  If the source server is explicit about keep-alive, this should be resolved.  The Ring spec doesn't expose the HTTP version (since this isn't useful except in the proxy use-case), but I think it may be useful to expose a :keep-alive? key in HTTP responses.
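
To illustrate the difference in defaults, here is a rough sketch in plain Clojure. The function name `keep-alive?` and its arguments are made up for illustration and are not part of Aleph or Ring; it also assumes the Connection header holds a single token rather than a comma-separated list.

```clojure
(require '[clojure.string :as string])

(defn keep-alive?
  "Decides whether a connection stays open after a message.
   An explicit Connection header wins; otherwise HTTP/1.1 defaults to
   keep-alive and HTTP/1.0 defaults to close."
  [http-version connection-header]
  (let [token (when connection-header
                (string/lower-case (string/trim connection-header)))]
    (cond
      (= token "close")      false
      (= token "keep-alive") true
      :else                  (= http-version "1.1"))))

(keep-alive? "1.0" nil)          ;; => false
(keep-alive? "1.1" nil)          ;; => true
(keep-alive? "1.0" "Keep-Alive") ;; => true
(keep-alive? "1.1" "close")      ;; => false
```

A proxy that drops the upstream HTTP version loses the first argument, which is why an explicit header from the source server removes the ambiguity.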

If you have any further questions, let me know.

Zach