Multipart HTTP requests

Christopher Lemmer Webber

Jan 29, 2019, 4:17:56 AM
to Racket Users
I'm looking to do multipart HTTP requests in Racket, though it looks
like there's no support at the moment.

I thought I might add a utility using the net/http-client library,
starting with making an adjusted http-conn-send! function. However, the
http-conn-(host/port/etc) struct accessors aren't made available, so it
appears I can't build a utility that uses the same http-conn struct.

Any thoughts on how I should move forward? Has anyone else written a
multipart library I don't know about, for instance?

- cwebb

Philip McGrath

Jan 29, 2019, 4:35:03 AM
to Christopher Lemmer Webber, Racket Users
I don't think there's a multipart-writing library yet, and it would be a great thing to have.

I've written little multipart-writing functions for a small proxy server built on `http-sendrecv/url` and for sending email using `net/sendmail` with html and text/plain alternatives. I'm happy to share code if it would be helpful, but nothing I've written is especially robust—for example, I just use a boundary that I know won't appear in legitimate input in my cases—which is why I haven't posted it publicly yet.

One thought I've had about making a library is that it would be nice if the writing API corresponded in some sensible way to the API from `net/mime` (which only handles parsing).

-Philip



Christopher Lemmer Webber

Jan 29, 2019, 8:06:26 AM
to Philip McGrath, Racket Users
I see... yeah, I thought about going this route but the reason it didn't
seem particularly "robust" to me is that I would have to read the entire
object into memory at once before passing it in as bytes. IMO it would
be better to have the option to provide ports with file data so the data
could be streamed in without reading the whole data into memory.
However that would, I still think, require being able to access the
http-conn struct accessors at the very least.

At any rate, if you don't mind, send what you have! I'm in the middle
of a hackathon and it might help me deliver some cool things faster :)

Please do mark your work with a libre license if you upload it so I can
safely incorporate it :)

Philip McGrath writes:

> I don't think there's a multipart-writing library yet, and it would be a
> great thing to have.
>
> I've written little multipart-writing functions for a small proxy server
> built on `http-sendrecv/url` and for sending email using `net/sendmail`
> with html and text/plain alternatives. I'm happy to share code if it would
> be helpful, but nothing I've written is especially robust—for example, I
> just use a boundary that I know won't appear in legitimate input in my
> cases—which is why I haven't posted it publicly yet.
>
> One thought I've had about making a library is that it would be nice if the
> writing API corresponded in some sensible way to the API from `net/mime
> <https://docs.racket-lang.org/net/mime.html>` (which only handles parsing).

Matthew Flatt

Jan 29, 2019, 8:11:55 AM
to Christopher Lemmer Webber, Racket Users
At Tue, 29 Jan 2019 04:17:53 -0500, Christopher Lemmer Webber wrote:
> Any thoughts on how I should move forward? Has anyone else written a
> multipart library I don't know about, for instance?

Is that the same as MIME's multipart as used in email? If so, see

http://docs.racket-lang.org/net/mime.html

Hendrik Boom

Jan 29, 2019, 9:57:41 AM
to Racket Users
On Tue, Jan 29, 2019 at 04:34:49AM -0500, Philip McGrath wrote:
> I don't think there's a multipart-writing library yet, and it would be a
> great thing to have.
>
> I've written little multipart-writing functions for a small proxy server
> built on `http-sendrecv/url` and for sending email using `net/sendmail`
> with html and text/plain alternatives. I'm happy to share code if it would
> be helpful, but nothing I've written is especially robust—for example, I
> just use a boundary that I know won't appear in legitimate input in my
> cases—which is why I haven't posted it publicly yet.

You could do what a lot of revision control repositories do -- use a
cryptographically random boundary marker. It's a provably wrong method,
but a long enough one is unlikely to run into much trouble.

I think it's an ugly solution, but ...

-- hendrik
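
A minimal sketch of the random-boundary idea, for the case where every part is already in memory as a byte string. The names `fresh-boundary` and `safe-boundary` and the "racket-" prefix are made up for illustration. The collision check is an addition that only works for in-memory parts; data streamed from a port cannot be scanned ahead of time, so there the random marker alone has to do.

```racket
#lang racket/base
(require racket/random   ; crypto-random-bytes
         file/sha1)      ; bytes->hex-string

;; Hypothetical helper: 16 random bytes rendered as 32 hex digits.
;; The result is 39 characters, within the 70-character limit on boundaries.
(define (fresh-boundary)
  (string-append "racket-" (bytes->hex-string (crypto-random-bytes 16))))

;; Hypothetical helper: draw boundaries until one is found that does not
;; occur in any of the given byte-string parts.
(define (safe-boundary parts)
  (define candidate (fresh-boundary))
  (define needle (regexp-quote (string->bytes/utf-8 candidate)))
  (if (for/or ([part (in-list parts)])
        (regexp-match? needle part))
      (safe-boundary parts)
      candidate))

(define example-boundary (safe-boundary (list #"first part" #"second part")))
```

With 128 random bits a retry should essentially never happen; the loop only makes the "provably wrong" case explicit.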

George Neuner

Jan 29, 2019, 3:22:40 PM
to Matthew Flatt, racket users
Multipart HTTP is similar but not exactly the same as MIME.

Maybe I'm missing something, but I don't see where the net library lets
you encode a multipart MIME message - it seems only able to parse them.
I had a need to send multipart email some time ago and ended up encoding
messages manually because I couldn't find anything then either.  MIME
certainly isn't difficult - it's just tedious (the format is recursive,
2 level even when "simple", and parts can contain parts).

Multipart HTTP is simpler - AFAIK, it does not support recursive parts.

George

Neil Van Dyke

Jan 29, 2019, 3:46:21 PM
to racket users
HTTP multipart reading is pretty easy to implement.  You just want to be
careful about space&time performance -- not only because you might want
lots-and-lots of this processing going on on each server (and the costs
add up), but because the individual parts are quite often large (for the
purposes to which people often put multipart).

(Side note: This seems like one of the many essential library things
that might quickly shake out of a startup with strong Racketeers who
contribute back high-quality open source packages for their
non-trade-secret modules.  Many of these packages *could* also be done
by a hobbyist volunteer, but a startup figuring out real-world
function/interface/performance requirements, and needing to do it and
make it work, might be more likely to produce what's actually needed
by other practitioners.  A few Racket startups doing this, and maybe you
get an effect like the story of the person who, to design footpaths
through a university quad, instead planted grass, and let the
grass-trampling by students getting between classes determine where to
place good footpaths.)

Jon Zeppieri

Jan 29, 2019, 10:02:10 PM
to Christopher Lemmer Webber, Racket Users
On Tue, Jan 29, 2019 at 4:17 AM Christopher Lemmer Webber <cwe...@dustycloud.org> wrote:

> Any thoughts on how I should move forward?

I think that using a `data-procedure/c` of a particular sort should allow you to implement this without needing access to the struct internals or needing to read everything into memory at once.

(Though, it would be a bit nicer if the write proc allowed you to specify an offset and length into the string/byte string.)

Something like:
===
#lang racket/base

(require net/http-client
         file/sha1
         racket/random)

(define (http-conn-send/multipart! hc uri multipart-body
                                   #:version [version #"1.1"]
                                   #:method [method #"POST"]
                                   #:close? [close? #f]
                                   #:headers [headers '()]
                                   #:content-decode [decodes '(gzip)]
                                   #:boundary [boundary (random-boundary)])
  (define content-type-header
    (string-append
     "Content-Type: multipart/form-data; boundary="
     boundary))
      
  (http-conn-send! hc uri
                   #:version version
                   #:method method
                   #:close? close?
                   #:headers (cons content-type-header headers)
                   #:content-decode decodes
                   #:data (multipart-body->data-proc boundary multipart-body)))

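;; Backslash-escape double quotes and backslashes so that s can be
;; placed inside a quoted header parameter value.
(define (mime-escape s)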
  (regexp-replace* #rx"[\"\\]" s "\\\\\\0"))

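;; Part procedure for a plain-text form field. Note that field-value is
;; written as-is, without escaping.
(define (make-string-part field-name field-value)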
  (λ (write-chunk boundary)
    (write-chunk
     (format
      (string-append "--~a\r\n"
                     "Content-Disposition: form-data; name=\"~a\"\r\n"
                     "Content-Type: text/plain; charset=utf-8\r\n"
                     "\r\n"
                     "~a\r\n")
      boundary
      (mime-escape field-name)
      field-value))))


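;; Part procedure for a file upload. The contents of the input port `in`
;; are streamed through write-chunk in blocks of up to 4096 bytes.
(define (make-file-part field-name file-name content-type in)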
  (λ (write-chunk boundary)
    (write-chunk
     (format
      (string-append "--~a\r\n"
                     "Content-Disposition: form-data; name=\"~a\"; filename=\"~a\"\r\n"
                     "Content-Type: ~a\r\n"
                     "\r\n")
      boundary
      (mime-escape field-name)
      (mime-escape file-name)
      content-type))

    (define buffer (make-bytes 4096))
    (let loop ([n (read-bytes-avail! buffer in)])
      (cond
        [(eof-object? n)
         n]
        [else
         (write-chunk (subbytes buffer 0 n))
         (loop (read-bytes-avail! buffer in))]))

    (write-chunk "\r\n")))

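;; Combine the part procedures into a single data-procedure/c value that
;; writes each part in order, followed by the closing delimiter.
(define (multipart-body->data-proc boundary parts)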
  (λ (write-chunk)
    (for ([part parts])
      (part write-chunk boundary))
    (write-chunk (format "--~a--\r\n" boundary))))
      
(define (random-boundary)
  (string-append
   "--------------------------"
   (bytes->hex-string
    (crypto-random-bytes 8))))
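
To see what this style of part procedure puts on the wire without talking to a server, one can call the data procedure with a `write-chunk` that collects into a byte string. The sketch below is self-contained and uses simplified stand-ins (`string-part`, `body->data-proc`, `render`, all hypothetical names) rather than the definitions above; in particular it does no escaping and fixes the boundary to "XYZ" purely for readability.

```racket
#lang racket/base

;; Simplified stand-in for a string part: no escaping, no Content-Type header.
(define ((string-part name value) write-chunk boundary)
  (write-chunk
   (format "--~a\r\nContent-Disposition: form-data; name=\"~a\"\r\n\r\n~a\r\n"
           boundary name value)))

;; Each part is written in order, then the closing delimiter "--boundary--".
(define ((body->data-proc boundary parts) write-chunk)
  (for ([part (in-list parts)])
    (part write-chunk boundary))
  (write-chunk (format "--~a--\r\n" boundary)))

;; Run a data procedure against an in-memory collector instead of a socket.
(define (render data-proc)
  (define out (open-output-bytes))
  (data-proc (λ (chunk)
               (if (string? chunk)
                   (write-string chunk out)
                   (write-bytes chunk out))))
  (get-output-bytes out))

(define wire
  (render (body->data-proc "XYZ" (list (string-part "title" "hello")))))
```

The same trick should work for inspecting the output of `multipart-body->data-proc` above, since it also only needs a one-argument `write-chunk`.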



Philip McGrath

Jan 29, 2019, 11:13:12 PM
to Christopher Lemmer Webber, Racket Users
I've put up the code I mentioned for email-sending and a proxy server at https://github.com/LiberalArtist/multipart-writing-examples. As noted, these are not general-purpose solutions to either of those problems—I know of a bunch of cases I don't cover, and I basically only have to use these with trusted input—so caveat utor! But hopefully they're useful. I'd be very happy to contribute to this effort, as well: a proper multipart writing library would go a long way to getting some of the little bits of code I've produced into good enough shape to post as a package, enough so that I've thought about tackling it myself.

On Tue, Jan 29, 2019 at 8:06 AM Christopher Lemmer Webber <cwe...@dustycloud.org> wrote:
> I see... yeah, I thought about going this route but the reason it didn't
> seem particularly "robust" to me is that I would have to read the entire
> object into memory at once before passing it in as bytes.  IMO it would
> be better to have the option to provide ports with file data so the data
> could be streamed in without reading the whole data into memory.

This is one of the things you'd definitely want in a library that I haven't dealt with, because in all of my cases I already have all the data in memory anyway.

> However that would, I still think, require being able to access the
> http-conn struct accessors at the very least.

I guess I would think about the code to do the writing separately from the specific target you would be writing to, though certainly you would want a good story for how to use it with the existing net libraries.

Maybe you've already seen this, but most of the functions in this family support supplying post data as a `data-procedure/c`, which lets you stream using chunked content transfer encoding. I think the most explicit documentation for that is under `http-conn-send!`.

On Tue, Jan 29, 2019 at 3:22 PM George Neuner <gneu...@comcast.net> wrote:
> Multipart HTTP is similar but not exactly the same as MIME.
>
> Multipart HTTP is simpler - AFAIK, it does not support recursive parts.

I believe `multipart/form-data` is a restricted subset of MIME: at least, RFC 7578 says that "The media type multipart/form-data follows the model of multipart MIME data streams as specified in Section 5.1 of RFC 2046; changes are noted in this document."

I hope that a single writing library would be able to handle both `multipart/form-data` and MIME for email, but some of the restrictions on `multipart/form-data` have implications for the design of a writer. In particular, in an email context, you might decide that it's up to the library whether to use a Content-Transfer-Encoding: an implementation could then decide to put everything in quoted-printable encoding so that it could reliably use a boundary containing, say, "=_", which can never appear in quoted-printable-encoded data. However, this strategy doesn't work for `multipart/form-data`, because RFC 7578 §4.7 says that senders "SHOULD NOT" use Content-Transfer-Encoding (even though the RFC itself uses quoted-printable encoding in an example in section 4.5)—I learned this after sending quoted-printable encoding provoked a null pointer error in a real server I was talking to.

Of course, you can also send normal, email-style MIME over HTTP by using an appropriate content type header (`multipart/alternative`, `multipart/mixed`, etc.), just as you can send `application/json`.

-Philip
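
A small sketch of the quoted-printable observation, for the email case only (not `multipart/form-data`, per the RFC 7578 restriction mentioned above). It uses `qp-encode` from `net/qp`; the boundary string and the sample payload are made up for illustration. Because quoted-printable writes a literal "=" as an escape sequence, the two-character sequence "=_" should not survive encoding, even when the raw data contains it.

```racket
#lang racket/base
(require net/qp)   ; qp-encode

;; Hypothetical boundary containing "=_".
(define boundary "=_example_boundary_0001")

;; Raw payload that deliberately contains the boundary text.
(define raw
  (string->bytes/utf-8 (string-append "before --" boundary " after")))

(define encoded (qp-encode raw))

;; #t if the "=_" marker still occurs somewhere in the given bytes.
(define (contains-marker? bs)
  (regexp-match? #rx#"=_" bs))
```

Here `(contains-marker? raw)` is true while `(contains-marker? encoded)` should be false. A boundary containing "=" would also need to be quoted in the Content-Type header parameter.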

Christopher Lemmer Webber

Jan 30, 2019, 6:43:54 AM
to Philip McGrath, Racket Users
Philip McGrath writes:

> I've put up the code I mentioned for email-sending and a proxy server at
> https://github.com/LiberalArtist/multipart-writing-examples As noted, these
> are not general-purpose solutions to either of those problems—I know of a
> bunch of cases I don't cover, and I basically only have to use these with
> trusted input—so caveat utor! But hopefully they're useful. I'd be very
> happy to contribute to this effort, as well: a proper multipart writing
> library would go a long way to getting some of the little bits of code I've
> produced into good enough shape to post as a package, enough so that I've
> thought about tackling it myself.

Thank you! While not perfect, still probably helpful for my purpose :)

> Maybe you've already seen this, but most of the functions in this family
> support supplying post data as a `data-procedure/c
> <https://docs.racket-lang.org/net/http-client.html#(def._((lib._net%2Fhttp-client..rkt)._data-procedure%2Fc))>`,
> which lets you stream using chunked content transfer encoding. I think the
> most explicit documentation for that is under `http-conn-send!
> <https://docs.racket-lang.org/net/http-client.html#%28def._%28%28lib._net%2Fhttp-client..rkt%29._http-conn-send%21%29%29>`.

I have seen this but admittedly I do not understand how to use it.
Maybe it is a failure on my part to understand the contract
descriptions.

Could someone supply an example of use? I'd really appreciate it.
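
A minimal sketch of a `data-procedure/c` value, going by the documented contract: it is a procedure that receives a one-argument `write-chunk` procedure and calls it once per piece of data (a string or byte string). The names `port->data-proc` and `collect` are hypothetical, and the network call at the end is left as a comment with a placeholder host.

```racket
#lang racket/base

;; Build a data procedure that streams an input port in blocks of up to
;; 4096 bytes, so the whole payload never has to be in memory at once.
(define ((port->data-proc in) write-chunk)
  (define buffer (make-bytes 4096))
  (let loop ()
    (define n (read-bytes-avail! buffer in))
    (unless (eof-object? n)
      (write-chunk (subbytes buffer 0 n))
      (loop))))

;; Offline check: collect the chunks into a byte string instead of
;; sending them over a connection.
(define (collect data-proc)
  (define out (open-output-bytes))
  (data-proc (λ (chunk) (write-bytes chunk out)))
  (get-output-bytes out))

(define collected
  (collect (port->data-proc (open-input-bytes #"hello, world"))))

;; Hypothetical network use (host and path are placeholders), with
;; net/http-client required and `in` an open input port:
;;   (http-sendrecv "example.com" "/upload"
;;                  #:method #"POST"
;;                  #:data (port->data-proc in))
```

When a procedure is supplied as `#:data`, the library sends the body with chunked transfer encoding, as noted earlier in the thread.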

Christopher Lemmer Webber

Jan 30, 2019, 6:46:06 AM
to Jon Zeppieri, Racket Users
Oh, this is useful! I didn't see this when asking for data-procedure/c
examples earlier on the list :)

BTW do you mind specifying a license for your code above so I might
incorporate parts of it into my work? Maybe either LGPL or Apache v2?

Jon Zeppieri

Jan 30, 2019, 10:44:45 AM
to Christopher Lemmer Webber, Racket Users
Okay, this applies to the code above:

Copyright (c) 2019 Jon Zeppieri

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.