Hi Francesco
Thanks for your response. I had already come across the links you refer to. However, my files are rather small PDFs, almost all of them just one or two pages and usually smaller than 2MB. Compared to the 100MB mentioned in one of the other threads, I did not consider these to be "large" payloads. So I guess I was wrong, and one of my questions would therefore be: where is the size limit as far as the event bus is concerned, i.e. what is considered a "large" payload? Obviously, my example project is also missing a size limit check for the uploaded files...
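For illustration, the missing size check could look roughly like the sketch below. It is plain Java and independent of Vert.x; the class name `UploadPolicy`, the 2MB limit and the restriction to `application/pdf` are assumptions made for this example, not something taken from the project. In a Vert.x Web handler, the two arguments would presumably come from the `FileUpload` object (`contentType()` and `size()`), and `BodyHandler.setBodyLimit(...)` could cap the request body even earlier.

```java
import java.util.Locale;

/** Sketch of an upload policy check; limit and content type are assumptions. */
final class UploadPolicy {

    /** Assumed upper bound for a single upload: 2 MB. */
    static final long MAX_BYTES = 2L * 1024 * 1024;

    private UploadPolicy() {
    }

    /** Returns true only for non-empty PDF uploads that do not exceed MAX_BYTES. */
    static boolean accepts(String contentType, long sizeBytes) {
        if (contentType == null || sizeBytes <= 0 || sizeBytes > MAX_BYTES) {
            return false;
        }
        // Drop parameters such as "; charset=..." and compare case-insensitively.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("application/pdf");
    }
}
```

A rejected upload would then be answered with an error status (413 seems the natural choice for oversized files) before anything is handed on for processing.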
Strictly speaking, there is no need to pass the files through the event bus. In theory, I could just as well process them as soon as they are received by the Web server. However, with future scaling in mind, I decided to decompose my application into separate components (not that I ever expect to really need that, but I always consider a proper design a plus ;-) ).

One building block is the Web server, which serves the GUI components (currently planned to be Angular) as well as the API calls used by the GUI. However, I intended to offload the processing of the API calls to dedicated verticles using the Web API service "pattern", so that in the future they could even run on different computing nodes. One verticle implements the user management tasks, the other one the document processing/management.

The initial document processing might be time-intensive, so I did not want to do it in the Web server verticle. It will extract the text from the PDF (directly or using OCR) and store it in a database. The file itself shall be kept on persistent storage. Furthermore, files can be renamed, searched by content, assigned tags, grouped in "dossiers", ... (kind of a simple app to archive/organize day-to-day paperwork; and yes, I know there are professional solutions for that, but most of them are too complex for my purposes. And hey, using this as a little project to get some hands-on experience with Vert.x is fun :-) ).

Now, since the document "manipulation" (i.e. renaming, searching, tagging, etc.) uses the Web API service to forward the calls to the document processing verticle, I wanted to handle the file uploads the same way as the other calls to that verticle, for the sake of simplicity and consistency (and to have a "clean" architecture and a proper separation of concerns). In the end, the upload is also part of the "API". -> See "component architecture" below...
                                                     +----------+  user
                                                     |   user   |  directory
                                         +---------> | service  | +--------->
+----------------+                       |           | verticle |
|  +----------+  |                       |           +----------+
|  |   API    |+------------------------>|
|  +----------+  | dispatch API calls    |
|                | using Web API service |           +----------+
|      Web       |                       |           |   doc    |
|     server     |                       +---------> | service  | +--------->
|    verticle    |                                   | verticle |  database/
|                |                                   +----------+  file storage
|  +----------+  |
|  | "static" |  |
|  |  files   |  |
|  +----------+  |
+----------------+
However, it looks like I have to handle the uploads separately from the other API calls. What would you recommend? Another HTTP-based API between the Web server and the doc processing verticle? Alternatively, I could have the Web server store the file in an object store and hand over the reference through the event bus to the doc verticle, which would then retrieve the file again for processing (though this seems to be a bit of an overkill). Last but not least, I am also considering pointing the upload directory of the Web server to some shared storage (GlusterFS, Ceph?). Then I could just hand over the filename through the event bus and access the file directly from the doc verticle...
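To make the last option a bit more concrete, here is a minimal sketch in plain Java (no Vert.x involved). The class `SharedUploadStore` and its methods are hypothetical names made up for this example; the root directory is assumed to be a path that both verticles can reach, e.g. a mount of the shared storage. Only the short generated name returned by `store(...)` would be sent over the event bus, never the file content.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Sketch: pass a file name between components instead of the file bytes. */
final class SharedUploadStore {

    // Assumed to be a directory that both the Web server and the doc verticle can access.
    private final Path root;

    SharedUploadStore(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    /** Convenience for experiments: a store backed by a fresh temporary directory. */
    static SharedUploadStore inTempDir() {
        try {
            return new SharedUploadStore(Files.createTempDirectory("uploads"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Web server side: persist the upload and return the name to put on the event bus. */
    String store(byte[] content) {
        String storedName = UUID.randomUUID() + ".pdf";
        try {
            Files.write(root.resolve(storedName), content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return storedName;
    }

    /** Doc verticle side: resolve the received name and read the file for processing. */
    byte[] load(String storedName) {
        Path file = root.resolve(storedName).normalize();
        if (!file.startsWith(root)) {
            // Reject names such as "../x" that would escape the shared directory.
            throw new IllegalArgumentException("invalid stored name: " + storedName);
        }
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The object store variant would have the same shape, with `store` and `load` talking to the object store instead of the file system. Note that the file I/O here is blocking, so inside a verticle it would have to run on a worker thread or be replaced by the asynchronous file system API.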
Thanks again for your input...
Marcial