handling multipart/form-data in vert.x


Neeme Praks

Apr 16, 2012, 4:44:45 PM
to vert.x
Hi,

I'm playing around with vert.x and using it to implement a hobby
project of mine.
BTW, pretty cool framework/platform - "node.js for the rest of us
(Java devs)" :-)

In my project, I need to upload a file to the web server. On client
side I created a web form, to POST the file as multipart/form-data to
the server. Then I used sample code from the "upload" example to
handle the file upload on the server (save it in a file, for now).
However, this results in the following stream on the server:
------WebKitFormBoundaryBacAOGLREnA3aBb1
Content-Disposition: form-data; name="datafile"; filename="test.txt"
Content-Type: text/plain

123456

------WebKitFormBoundaryBacAOGLREnA3aBb1--

As far as I can see, I need some code to reconstruct the original
stream (e.g. Commons FileUpload). Briefly glancing through the
documentation I could not find an immediate "vert.x way" of handling
this.

Should I integrate Commons FileUpload to my project or is there some
other way to handle this?
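
For reference, the framing shown above can be undone by hand for small, text-only bodies. The following is only a minimal sketch: `MultipartSketch`, `Part`, `boundaryOf` and `parse` are names made up for illustration (they are not part of vert.x or Commons FileUpload), and the code assumes the whole body is already in memory as a string with CRLF line endings. Large or binary uploads need a streaming parser instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative in-memory splitter for a small, text-only multipart/form-data body. */
final class MultipartSketch {

    /** One decoded part: its raw header block and its body. */
    static final class Part {
        final String headers;
        final String body;
        Part(String headers, String body) { this.headers = headers; this.body = body; }
    }

    /** Pulls the boundary token out of a Content-Type header value. */
    static String boundaryOf(String contentType) {
        int at = contentType.indexOf("boundary=");
        if (at < 0) throw new IllegalArgumentException("no boundary parameter");
        String token = contentType.substring(at + "boundary=".length()).trim();
        int semicolon = token.indexOf(';');
        if (semicolon >= 0) token = token.substring(0, semicolon).trim();
        if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
            token = token.substring(1, token.length() - 1); // the boundary may be quoted
        }
        return token;
    }

    /** Splits a complete body into parts; each delimiter line is "--" + boundary. */
    static List<Part> parse(String body, String boundary) {
        String delimiter = "--" + boundary;
        List<Part> parts = new ArrayList<>();
        int pos = body.indexOf(delimiter);
        while (pos >= 0) {
            int start = pos + delimiter.length();
            if (body.startsWith("--", start)) break;     // closing delimiter reached
            start += 2;                                   // skip CRLF after the delimiter line
            int next = body.indexOf("\r\n" + delimiter, start);
            if (next < 0) throw new IllegalArgumentException("unterminated part");
            String section = body.substring(start, next);
            int blank = section.indexOf("\r\n\r\n");     // headers end at the first empty line
            if (blank < 0) throw new IllegalArgumentException("part has no header block");
            parts.add(new Part(section.substring(0, blank), section.substring(blank + 4)));
            pos = next + 2;                               // index of the next delimiter
        }
        return parts;
    }
}
```

Note that the boundary in the Content-Type header is two dashes shorter than the delimiter lines in the body, which is why `parse` prepends "--".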

Rgds,
Neeme

Tim Fox

Apr 16, 2012, 5:07:31 PM
to ve...@googlegroups.com
Hi Neeme,

Can you post your server side code? I'll take a look tomorrow morning

Tim Fox

Apr 17, 2012, 4:26:09 AM
to ve...@googlegroups.com
I had a look at this.

POST data isn't always multipart form-data, and vert.x doesn't make any assumptions about what it is; it just gives you whatever the client sent. To vert.x, the POST data is just bytes.

In this case the client sent some multipart form-data so that's what you got :)

Since multipart form-data is probably going to be a popular kind of data we can add some functionality to vert.x which will do the parsing for you.

Can you open a github issue?

Cheers

Gary Russell

Apr 17, 2012, 8:49:06 AM
to ve...@googlegroups.com
On a related note, I was playing with the TCP 'NetServer' yesterday, and noticed that the handler gets raw packets containing whatever the tcp stack has assembled at the time the event fires...

vertx run org.vertx.java.examples.echo.EchoClient -cp classes
vert.x-core-thread-0 Net client sending: hello0
vert.x-core-thread-0 Net client sending: hello1
vert.x-core-thread-0 Net client sending: hello2
vert.x-core-thread-0 Net client sending: hello3
vert.x-core-thread-0 Net client sending: hello4
vert.x-core-thread-0 Net client sending: hello5
vert.x-core-thread-0 Net client sending: hello6
vert.x-core-thread-0 Net client sending: hello7
vert.x-core-thread-0 Net client sending: hello8
vert.x-core-thread-0 Net client sending: hello9
vert.x-core-thread-0 Net client receiving: he
vert.x-core-thread-0 Net client receiving: llo0
hello1

vert.x-core-thread-0 Net client receiving: hello2
hello3
hello4
hello5
hello6
hello7
hello8
hello9

In Spring Integration we provide a number of pluggable serializers/deserializers (aka codecs) that apply structure to the TCP stream; message demarcation occurs at a pretty low level so the app only deals with reassembled "messages". For vert.x, do you feel this functionality belongs in the handler?

Just that if you are considering adding a parser for multi-part form data, maybe you could generalize it into a pluggable strategy for any stream?

We currently provide 3 out of the box implementations in Spring Integration

<data><cr><lf>
<stx><data><etx>
<length><data>   (where length is a 1, 2 or 4 byte binary length in network byte order).

The first two are only suitable for text; the last one is the most efficient (requires no parsing for a delimiter) and works with binary data. Users can plug in their own implementations for any specialized decoding.
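
To make the third format concrete, here is a rough sketch of a `<length><data>` codec with a 4-byte length in network byte order. It is not Spring Integration's or vert.x's implementation; `LengthPrefixedCodec` and its methods are invented for this example, and it relies only on `java.nio.ByteBuffer`, whose default byte order is big-endian.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Illustrative codec for <4-byte big-endian length><data> frames. */
final class LengthPrefixedCodec {

    /** Prepends the payload length as a 4-byte integer in network byte order. */
    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /**
     * Extracts every complete frame from the buffer. A trailing partial frame is
     * left unread so the caller can retry once more bytes have arrived.
     */
    static List<byte[]> decode(ByteBuffer in) {
        List<byte[]> frames = new ArrayList<>();
        while (in.remaining() >= 4) {
            in.mark();                                  // remember where this frame starts
            int length = in.getInt();
            if (length < 0) throw new IllegalStateException("negative frame length");
            if (in.remaining() < length) {
                in.reset();                             // body incomplete: rewind to the header
                break;
            }
            byte[] frame = new byte[length];
            in.get(frame);
            frames.add(frame);
        }
        return frames;
    }
}
```

No scanning for a delimiter is needed, which is why this format also works for binary payloads.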

Gary

Tim Fox

Apr 17, 2012, 9:32:48 AM
to ve...@googlegroups.com
Hi Gary, comments below


On Tuesday, April 17, 2012 1:49:06 PM UTC+1, Gary Russell wrote:
> On a related note, I was playing with the TCP 'NetServer' yesterday, and noticed that the handler gets raw packets containing whatever the tcp stack has assembled at the time the event fires...

Yes, NetServer is a low level service so this is by design
 


> In Spring Integration we provide a number of pluggable serializers/deserializers (aka codecs) that apply structure to the TCP stream; message demarcation occurs at a pretty low level so the app only deals with reassembled "messages". For vert.x, do you feel this functionality belongs in the handler?

vert.x already contains functionality which allows you to take an ordered stream of bytes and split out meaningful "records" depending on what the protocol is.

Take a look at the RecordParser class.

There are a few examples which use this, e.g. the pubsub example, which demonstrates a simple pubsub server that uses a \n-delimited text protocol: https://github.com/purplefox/vert.x/blob/master/src/examples/javascript/pubsub/pubsub_server.js

The idea with the record parser is simple: you tell it what kind of records to expect (e.g. fixed size or delimited), then you feed it the raw byte stream as it arrives, and it spits out fully formed records.


The parser supports both fixed length and delimited protocols or even hybrid protocols where you might have say a couple of fixed length fields followed by some delimited fields.
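
To illustrate the idea (this is not the RecordParser source), here is a small stand-alone sketch of delimited splitting. `DelimitedSplitter` is a hypothetical class written for this post; it buffers bytes across chunks and emits a record each time the delimiter shows up, so the split "he" / "llo0" from the echo output earlier in the thread comes out as a single "hello0".

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

/** Illustrative splitter: reassembles delimiter-terminated records from arbitrary chunks. */
final class DelimitedSplitter {
    private final byte delimiter;
    private final Consumer<String> output;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    DelimitedSplitter(byte delimiter, Consumer<String> output) {
        this.delimiter = delimiter;
        this.output = output;
    }

    /** Feed one chunk exactly as it arrived; chunk boundaries carry no meaning. */
    void handle(byte[] chunk) {
        for (byte b : chunk) {
            if (b == delimiter) {
                // Delimiter seen: everything buffered so far is one complete record.
                output.accept(new String(pending.toByteArray(), StandardCharsets.UTF_8));
                pending.reset();
            } else {
                pending.write(b);
            }
        }
    }
}
```

Bytes after the last delimiter stay buffered until a later chunk completes the record.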

 

> Just that if you are considering adding a parser for multi-part form data, maybe you could generalize it into a pluggable strategy for any stream?

Cool. I will take a look, but Netty contains a multipart form parser in the master branch, which should end up in the 3.4.0 branch before long, so hopefully we can just use that :)

Neeme Praks

Apr 17, 2012, 11:02:27 AM
to vert.x
Opened an issue: https://github.com/purplefox/vert.x/issues/88

Any idea when I can expect this? I'm just trying to decide whether I
should integrate Commons FileUpload for now.
Or, if you provide enough guidance, I might be able to port it from
Netty myself?

Thanks!

Neeme Praks

Apr 17, 2012, 11:35:44 AM
to vert.x
Or, maybe you can provide some pointers on how to upload a file from a
browser without sending it encoded as "multipart/form-data"?
I'm not too familiar with the recent advances in web browsers.

Gary Russell

Apr 17, 2012, 12:04:49 PM
to ve...@googlegroups.com
I think the RecordParser needs more options.

A very common legacy integration format is variable-length, non-delimited data where the length is contained in the first 1, 2 or 4 bytes. As I said earlier, this is very efficient (requires no parsing) and can handle binary data.

It appears the RecordParser only supports delimited or fixed length messages.

Another common format is <stx>...<etx> or <soh>...<eot> or <soh>...<sot>...<etx>...<eot>

Finally, I think having the parser itself be a 'Handler' is rather restrictive; if I want to plug in a different protocol, I have to craft a new java class. I think it would be better to have a generic ParsingHandler that can be configured with a pluggable protocol decoder. That way, I can choose my protocol at runtime; of course, in a Spring app, we'd just inject the decoder implementation as part of the configuration.

My 2c.

Gary

Tim Fox

Apr 17, 2012, 12:22:02 PM
to ve...@googlegroups.com
On 17/04/12 17:04, Gary Russell wrote:
> I think the RecordParser needs more options.
>
> A very common legacy integration format is a variable length,
> non-delimited data where the variable length is contained in the first
> 1, 2 or 4 bytes. As I said below, this is very efficient (requires no
> parsing) and can handle binary data.
>
> It appears the RecordParser only supports delimited or fixed length
> messages.

The RecordParser can handle the case you describe (length prefixed), in
fact we use it for parsing the vert.x cluster messages which are passed
between nodes and are length prefixed.

https://github.com/purplefox/vert.x/blob/master/src/main/java/org/vertx/java/core/eventbus/impl/DefaultEventBus.java

As I mentioned previously the record parser can switch between delimited
and fixed length, and the size and the delimiter without creating a new
instance, this means you can parse complex protocols which have a
mixture of fixed length and delimited fields.

BTW length prefixed is not actually a new type in itself, it's just
fixed length.

E.g. if you have a protocol where the first 4 bytes is the length and
the next X bytes is the record, then you can think of that as two fixed
length records:

1) Of length 4, which holds a value representing the length L of the data
2) Of length L

So it can still be parsed by a fixed length parser as long as you can
change the record length between reads (which you can with the vert.x one)
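
A rough sketch of that two-step approach, for illustration only: `SwitchingFixedParser` below is a hypothetical stand-alone class, not the vert.x RecordParser, and it assumes a 4-byte big-endian length. The record callback changes the expected size after every record, alternating between the 4-byte header and the L-byte body.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.function.Consumer;

/** Illustrative fixed-size parser whose record size can be changed between records. */
final class SwitchingFixedParser {
    private byte[] pending = new byte[0];
    private int size;
    private Consumer<byte[]> output = record -> { };

    SwitchingFixedParser(int initialSize) { fixedSizeMode(initialSize); }

    void setOutput(Consumer<byte[]> output) { this.output = output; }

    /** Sets the size of the next record; may be called from inside the output callback. */
    void fixedSizeMode(int newSize) {
        if (newSize <= 0) throw new IllegalArgumentException("size must be positive");
        this.size = newSize;
    }

    void handle(byte[] chunk) {
        byte[] joined = Arrays.copyOf(pending, pending.length + chunk.length);
        System.arraycopy(chunk, 0, joined, pending.length, chunk.length);
        pending = joined;
        while (pending.length >= size) {
            byte[] record = Arrays.copyOfRange(pending, 0, size);
            pending = Arrays.copyOfRange(pending, size, pending.length);
            output.accept(record);          // the callback may change `size` for the next pass
        }
    }

    /** Wires the parser up as a reader for <4-byte length><data> frames. */
    static SwitchingFixedParser lengthPrefixed(Consumer<byte[]> frames) {
        SwitchingFixedParser parser = new SwitchingFixedParser(4);
        boolean[] readingLength = {true};
        parser.setOutput(record -> {
            if (readingLength[0]) {
                int length = ByteBuffer.wrap(record).getInt();   // record 1: the length L
                if (length == 0) { frames.accept(new byte[0]); return; }
                parser.fixedSizeMode(length);
            } else {
                frames.accept(record);                           // record 2: L bytes of data
                parser.fixedSizeMode(4);
            }
            readingLength[0] = !readingLength[0];
        });
        return parser;
    }
}
```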

>
> Another common format is <stx>...<etx> or <soh>...<eot> or
> <soh>...<sot>...<etx>...<eot>

Parsing markup is a whole different kettle of fish. That's not within
scope of the vert.x record parser. For that I would recommend a specific
parser for the markup.


>
> Finally, I think having the parser itself be a 'Handler' is rather
> restrictive; if I want to plug in a different protocol, I have to
> craft a new java class.

...or Groovy class, or Ruby, or JS.

But you still have to describe the protocol format somewhere. If it's
not in a class or script where is it? In xml?


--
Tim Fox

Vert.x - effortless polyglot asynchronous application development
http://vertx.io
twitter:@timfox

Gary Russell

Apr 17, 2012, 12:57:09 PM
to ve...@googlegroups.com

This is not markup: <stx> is ASCII code 0x02, <etx> is 0x03, and so on. It's
simple ASCII data bracketed by control codes.

>>
>> Finally, I think having the parser itself be a 'Handler' is rather
>> restrictive; if I want to plug in a different protocol, I have to
>> craft a new java class.
> ...or Groovy class, or Ruby, or JS.
>
> But you still have to describe the protocol format somewhere. If it's
> not in a class or script where is it? In xml?
>

Yes, but let's say I want two versions of your PubSubServer, one with
delimited, one with fixed; I'd need two complete servers

public void start() {
    vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
        public void handle(final NetSocket socket) {
            socket.dataHandler(RecordParser.newDelimited("\n", new Handler<Buffer>() {
                public void handle(Buffer frame) {
                    ...

public void start() {
    vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
        public void handle(final NetSocket socket) {
            socket.dataHandler(RecordParser.newFixed(42, new Handler<Buffer>() {
                public void handle(Buffer frame) {
                    ...


I'd prefer to be able to plug in the protocol, with something like

private Parser parser; // actual implementation injected during initialization

public void start() {
    vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
        public void handle(final NetSocket socket) {
            socket.dataHandler(PubSubServer.this.parser.setHandler(new Handler<Buffer>() {
                public void handle(Buffer frame) {
                    ...


public interface Parser extends Handler<Buffer> {

    Handler<Buffer> setHandler(Handler<Buffer> handler);
}

public static class CustomParser implements Parser {

    private Handler<Buffer> handler;

    public Handler<Buffer> setHandler(Handler<Buffer> handler) {
        this.handler = handler;
        return this;
    }

    ...

}


It would be nice if all handlers had a

Handler<E> setHandler(Handler<E> handler);

method.

That way, I could assemble the server's call stack from previously
instantiated (injected) objects rather than using hard-wired constructors.

Tim Fox

Apr 17, 2012, 1:11:35 PM
to ve...@googlegroups.com

Ok, so it's just delimited.

>
>>>
>>> Finally, I think having the parser itself be a 'Handler' is rather
>>> restrictive; if I want to plug in a different protocol, I have to
>>> craft a new java class.
>> ...or Groovy class, or Ruby, or JS.
>>
>> But you still have to describe the protocol format somewhere. If it's
>> not in a class or script where is it? In xml?
>>
>
> Yes, but let's say I want two versions of your PubSubServer, one with
> delimited, one with fixed; I'd need two complete servers
>
> public void start() {
> vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
> public void handle(final NetSocket socket) {
> socket.dataHandler(RecordParser.newDelimited("\n", new
> Handler<Buffer>() {
> public void handle(Buffer frame) {
> ...
>
> public void start() {
> vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
> public void handle(final NetSocket socket) {
> socket.dataHandler(RecordParser.newFixed(42, new
> Handler<Buffer>() {
> public void handle(Buffer frame) {
> ...

Why would you need two servers?


>
>
> I'd prefer to be able to plug in the protocol, with something like
>
> private Parser parser; // actual implementation injected during
> initialization
>
> public void start() {
> vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
> public void handle(final NetSocket socket) {
> socket.dataHandler(PubSubServer.this.parser.setHandler(new
> Handler<Buffer>() {
> public void handle(Buffer frame) {
> ...
>
>
> public interface Parser extends Handler<Buffer> {
>
> Handler<Buffer> setHandler(Handler<Buffer> handler);
> }
>
> public static class CustomParser implements Parser {
>
> private Handler<Buffer> handler;
>
> public Handler<Buffer> setHandler(Handler<Buffer> handler) {
> this.handler = handler;
> return this;
> }
>
> ...
>
> }
>
>
> It would be nice if all handlers had a
>
> Handler<E> setHandler(Handler<E> handler);
>
> method.
>
> That way, I could assemble the server's call stack from previously
> instantiated (injected) objects rather than using hard-wired
> constructors.

To be honest I'm not sure what you're getting at here. I guess my
parsing must be faulty ;)

Gary Russell

Apr 17, 2012, 1:48:47 PM
to ve...@googlegroups.com

...except the beginning delimiter (STX etc) is not part of the data. A
message that doesn't start with STX is invalid and should be rejected.
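
To show what I mean, here is a rough stand-alone sketch (hypothetical class name, not Spring Integration's actual deserializer). It differs from plain end-delimited parsing in that the start code is checked and then stripped, and anything that arrives outside an STX...ETX bracket is rejected.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

/** Illustrative decoder for ASCII messages bracketed by STX (0x02) and ETX (0x03). */
final class StxEtxDecoder {
    static final byte STX = 0x02;
    static final byte ETX = 0x03;

    private final Consumer<String> output;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private boolean inMessage = false;

    StxEtxDecoder(Consumer<String> output) { this.output = output; }

    void handle(byte[] chunk) {
        for (byte b : chunk) {
            if (!inMessage) {
                // Between messages the only legal byte is STX; it is not part of the data.
                if (b != STX) throw new IllegalArgumentException("message must start with STX");
                inMessage = true;
            } else if (b == ETX) {
                output.accept(new String(pending.toByteArray(), StandardCharsets.US_ASCII));
                pending.reset();
                inMessage = false;
            } else {
                pending.write(b);
            }
        }
    }
}
```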

>
>>
>>>>
>>>> Finally, I think having the parser itself be a 'Handler' is rather
>>>> restrictive; if I want to plug in a different protocol, I have to
>>>> craft a new java class.
>>> ...or Groovy class, or Ruby, or JS.
>>>
>>> But you still have to describe the protocol format somewhere. If
>>> it's not in a class or script where is it? In xml?
>>>
>>
>> Yes, but let's say I want two versions of your PubSubServer, one with
>> delimited, one with fixed; I'd need two complete servers
>>
>> public void start() {
>> vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
>> public void handle(final NetSocket socket) {
>> socket.dataHandler(RecordParser.newDelimited("\n", new
>> Handler<Buffer>() {
>> public void handle(Buffer frame) {
>> ...
>>
>> public void start() {
>> vertx.createNetServer().connectHandler(new Handler<NetSocket>() {
>> public void handle(final NetSocket socket) {
>> socket.dataHandler(RecordParser.newFixed(42, new
>> Handler<Buffer>() {
>> public void handle(Buffer frame) {
>> ...
>
> Why would you need two servers?

How else would you configure one to use a delimited parser and another
instance *of the same server* to use a fixed-length parser? The fact that
you are wiring the handlers together in code precludes swapping out a
handler for a different one without writing more code. I prefer a more
loosely coupled approach.

It was a quick attempt to show how you could wire pre-instantiated
handlers together, rather than new... new... new...

It's Inversion of Control - instead of the server instantiating
particular implementations of handlers, the actual implementations are
provided to the server during initialization. Spring provides Inversion
of Control via Dependency Injection; another IoC technique is to use the
factory pattern, where the user delegates to a factory to get an
implementation of an interface, rather than instantiating it directly
itself.

Again; it's all about loose coupling.
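
Here is a compact, self-contained sketch of the wiring I have in mind. Everything in it is a stand-in written for this post: `Handler` is a local copy of the one-method interface so the code compiles without vert.x, chunks are plain strings instead of Buffers, and `DelimitedParser`, `FixedParser` and `FrameCollector` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Local stand-in for the vert.x Handler interface, so the sketch compiles on its own. */
interface Handler<E> { void handle(E event); }

/** The pluggable decoder: consumes raw chunks, forwards complete frames. */
interface Parser extends Handler<String> {
    Handler<String> setHandler(Handler<String> handler);
}

/** Emits one frame per '\n'-terminated line. */
final class DelimitedParser implements Parser {
    private final StringBuilder pending = new StringBuilder();
    private Handler<String> handler;

    public Handler<String> setHandler(Handler<String> handler) { this.handler = handler; return this; }

    public void handle(String chunk) {
        pending.append(chunk);
        int end;
        while ((end = pending.indexOf("\n")) >= 0) {
            handler.handle(pending.substring(0, end));
            pending.delete(0, end + 1);
        }
    }
}

/** Emits one frame per `size` characters. */
final class FixedParser implements Parser {
    private final int size;
    private final StringBuilder pending = new StringBuilder();
    private Handler<String> handler;

    FixedParser(int size) { this.size = size; }

    public Handler<String> setHandler(Handler<String> handler) { this.handler = handler; return this; }

    public void handle(String chunk) {
        pending.append(chunk);
        while (pending.length() >= size) {
            handler.handle(pending.substring(0, size));
            pending.delete(0, size);
        }
    }
}

/** "Server" logic written once; whoever constructs it decides the framing. */
final class FrameCollector {
    final List<String> frames = new ArrayList<>();
    private final Handler<String> dataHandler;

    FrameCollector(Parser parser) {
        // The parser is injected; this class never names a concrete implementation.
        this.dataHandler = parser.setHandler(frames::add);
    }

    void onData(String chunk) { dataHandler.handle(chunk); }
}
```

`new FrameCollector(new DelimitedParser())` and `new FrameCollector(new FixedParser(42))` then share all of the server code, and a container such as Spring could supply the constructor argument.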

Tim Fox

Apr 17, 2012, 4:41:03 PM
to ve...@googlegroups.com
Even so, that shouldn't be a problem for RecordParser.
I don't get it. There are many ways of reusing the code and allowing the parser to be configured without duplicating the server code: inheritance, delegation, etc.
I don't see that anything in the vert.x api precludes you from injecting your parser if that's what you want to do.

Tim Fox

Apr 17, 2012, 4:43:46 PM
to ve...@googlegroups.com
I would go for commons fileupload for now, since I can't make any guarantees when this functionality will arrive. Hopefully won't be long though.

If you like though you could try porting the Netty codec into vert.x. Would be a nice little project.

expert china

May 11, 2012, 11:38:08 AM
to ve...@googlegroups.com
File upload is indispensable, but it is not on the upcoming work list!

On Wednesday, April 18, 2012 at 4:43:46 AM UTC+8, Tim Fox wrote:

Tim Fox

May 12, 2012, 4:09:13 AM
to ve...@googlegroups.com
I've added it.

Now we just need a volunteer to implement it :)

expert china

Jun 8, 2012, 4:56:49 AM
to ve...@googlegroups.com
Netty can handle Commons FileUpload, so why can't vert.x?

On Wednesday, April 18, 2012 at 4:43:46 AM UTC+8, Tim Fox wrote:

Pid

Jun 8, 2012, 3:43:07 PM
to ve...@googlegroups.com
On 08/06/2012 09:56, expert china wrote:
> Netty can handle commons fileupload , why vertx can not?

Are you telling us that you've tried it and there is a problem?

If so, what did you try (please provide a code example) and what was the
result?


p

expert china

Jun 8, 2012, 11:09:45 PM
to ve...@googlegroups.com
No, I have not tried it.

What I mean is that vert.x is built on top of Netty but does not reuse Netty's functionality, which puzzles me.

On Saturday, June 9, 2012 at 3:43:07 AM UTC+8, Pid wrote:

Tim Fox

Jun 11, 2012, 7:19:53 AM
to ve...@googlegroups.com
That functionality didn't exist when we built the Http API.

But in any case, there is already a feature request for it.

Feel free to send a pull request :)

castarco

Jul 30, 2012, 7:45:15 AM
to ve...@googlegroups.com
Hello,


On Monday, June 11, 2012 13:19:53 UTC+2, Tim Fox wrote:
That functionality didn't exist when we built the Http API.

But in any case, there is already a feature request for it.

Feel free to send a pull request :)


What's the state of this feature? The Bananity development team is very interested in it. If the task is still on the to-do list, we'll start implementing it.

Are there any suggestions that we should take into account?

Thanks in advance :) .

Tim Fox

Jul 30, 2012, 7:54:51 AM
to ve...@googlegroups.com
The current pull request https://github.com/vert-x/vert.x/pull/235 is on the right track but is not complete.

1) I am not sure I like the idea of putting this functionality directly into the core API
2) The current pull request is Java-centric. To be complete, the functionality needs to be usable from *all* of the languages that Vert.x supports
3) Needs documentation too

Actually, in general 2) and 3) are the most common reasons why I haven't merged pull requests.

It's very common to receive pull requests which only implement the feature in, say, Java, and provide no documentation. It's unlikely that I'll merge requests like these.

If you want stuff to be merged quickly a) Provide a *complete* implementation b) Provide docs

Tim Fox

Jul 30, 2012, 7:59:51 AM
to ve...@googlegroups.com
I will take another look at this today, and see how we can kick start progress again, with a view to getting it into master asap

castarco

Jul 30, 2012, 10:16:36 AM
to ve...@googlegroups.com


On Monday, July 30, 2012 13:54:51 UTC+2, Tim Fox wrote:
The current pull request https://github.com/vert-x/vert.x/pull/235 is on the right track but is not complete.

1) I am not sure I like the idea of putting this functionality directly into the core API

We have no preferences about this, if we can help moving the code into a module just tell us :) .
 
2) The current pull request is Java-centric. To complete the functionality would need it usable via *all* of the languages that Vert.x supports

Is there documentation explaining how to write the bindings for the other supported languages? If not, where can we find examples that show the pattern?
 
3) Needs documentation too

+1
 

Tim Fox

Jul 30, 2012, 10:59:48 AM
to ve...@googlegroups.com


On Monday, July 30, 2012 3:16:36 PM UTC+1, castarco wrote:


On Monday, July 30, 2012 13:54:51 UTC+2, Tim Fox wrote:
The current pull request https://github.com/vert-x/vert.x/pull/235 is on the right track but is not complete.

1) I am not sure I like the idea of putting this functionality directly into the core API

We have no preferences about this, if we can help moving the code into a module just tell us :) .

I don't think it needs to go into a module, I just don't like the idea of complicating the current HTTP API.

I would envision the post decoder as just a handler which gets set on the current HTTP server, and which implements Handler<HttpServerRequest>
 
2) The current pull request is Java-centric. To complete the functionality would need it usable via *all* of the languages that Vert.x supports

There are documentation explaining how to do the bindings of the other supported languages? In case the answer is not, where we can find examples to view the pattern?

Pick a language, and take a look at how it interacts with Java.

castarco

Jul 30, 2012, 11:34:11 AM
to ve...@googlegroups.com


I don't think it needs to go into a module, I just don't like the idea of complicating the current HTTP API.

I would envision the post decoder as just a handler which gets set on the current HTTP server, and which implements Handler<HttpServerRequest>


Yes, that's a better idea than complicating the current HTTP API. So, would this handler be only for multipart/form-data requests, or would it distinguish basic POST requests from multipart requests? The logic to identify the request type could live inside vert.x or in "verticle space", but I think it's preferable to implement it in vert.x.

Cheers.

castarco

Jul 30, 2012, 1:03:42 PM
to ve...@googlegroups.com
Hi again,



I don't think it needs to go into a module, I just don't like the idea of complicating the current HTTP API.

I would envision the post decoder as just a handler which gets set on the current HTTP server, and which implements Handler<HttpServerRequest>


What do you think about the following proposal?

- Extending HttpServerRequest with a new class HttpPostServerRequest with these new methods (this is a draft):
  • SomeAssociativeMapType<someBytesStreamType> getFilesList ()
  • SomeAssociativeMapType<String> getFieldsList()

Cheers.

Tim Fox

Jul 30, 2012, 1:14:09 PM
to ve...@googlegroups.com
That won't really work since a form post can contain many file uploads and they can be very large (i.e. more than can fit in available RAM), so you can't load them all up first before calling getFilesList().

What we need is an async API that calls you. Actually the API in the pull request is not bad, it's just that it's too closely integrated with the HTTP api.
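
As a purely hypothetical illustration of the "API that calls you" shape (this is not the API from the pull request, and none of these names exist in vert.x), a decoder could push each part to the application in pieces, so that no upload ever has to be held in memory in full:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical callback contract: the decoder calls you as each piece arrives. */
interface UploadHandler {
    void partStart(String fieldName, String filename); // headers of a new part were parsed
    void partData(byte[] chunk);                       // next slice of the current part's body
    void partEnd();                                    // the current part is complete
}

/** Example consumer: tracks each upload's size without retaining any file content. */
final class SizeTrackingHandler implements UploadHandler {
    final Map<String, Long> sizes = new LinkedHashMap<>();
    private String current;

    public void partStart(String fieldName, String filename) {
        current = filename;
        sizes.put(current, 0L);
    }

    public void partData(byte[] chunk) {
        // A real handler would write the chunk to disk here and then let it go.
        sizes.merge(current, (long) chunk.length, Long::sum);
    }

    public void partEnd() {
        current = null;
    }
}
```

Memory use is then bounded by the chunk size rather than by the size of the uploads.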
 

Cheers.

castarco

Aug 4, 2012, 12:48:13 PM
to ve...@googlegroups.com
Hello again,

I've been trying to understand the multipart handling implementation (in https://github.com/vert-x/vert.x/pull/235) so that I can use it and document it (as you said, documentation is important), but I've run into many problems.

I've used the provided JavaScript example as a guide to using the code from the pull request, and there is a point where I get lost every time I try: the seventh line of the first example:
 var pump = new vertx.Pump(data, file.getWriteStream());

The Pump constructor (in Java) requires data to implement the ReadStream interface. In my code, data is an object of type HttpAttribute, which extends HttpData but does not implement the ReadStream interface. I've been looking for other ways to obtain the data stream and read it, but I'm completely lost.

Another question that comes to mind is whether there is a way to establish a size limit on the incoming POST stream or on the GET route.
There isn't an obvious documented method for implementing these security restrictions.

Thanks in advance! :) I'm hoping to understand a little more of the vert.x internals so I can start helping; for now I'll accept my limitations and continue studying the code.

Tim Fox

Aug 10, 2012, 4:07:36 AM
to ve...@googlegroups.com
Hopefully I will have time to look at this next week, and lend a hand