I've recently been working on a refactor of how multipart file uploads
are handled. This refactor allows you to specify a custom storage
function, so you can store file uploads on different systems. For
example, you might want to have uploaded files sent directly to
Amazon's S3 service.
The problem is what to do with uploaded files we don't want to keep.
Some storage functions will store uploaded files in memory, and those
files will therefore be garbage collected automatically. But other storage functions
will store uploaded files in persistent databases, or on the
file-system.
Currently Ring stores uploaded files in temporary files, which are
then garbage-collected by a background thread. But not all
applications will want this particular behaviour; for instance, Google
App Engine won't allow you to keep a background thread around like
this. Other cloud servers may have similar restrictions.
My feeling is that there's no one right way to handle this; it depends
on your server architecture, and what storage function you want to
use. I'm therefore thinking that cleaning up old uploaded files should
not be the responsibility of the wrap-multipart-params middleware.
Thoughts?
- James
If anyone can think of a sensible default, I'm all ears :)
- James
I was thinking about providing a function to remove temporary files
that have been unmodified for a certain length of time. This could
then be incorporated into a background thread by the user. e.g.
  (use '[ring.middleware.multipart-params.temp-file :only (delete-old-files)])

  ;; Every five minutes, delete temp files left unmodified for over a day.
  (future
    (while true
      (delete-old-files (* 60 60 24))
      (Thread/sleep (* 1000 60 5))))
But this doesn't seem like the sort of thing that should be
incorporated into the multipart middleware by default. Or should it?
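To make the idea concrete, here is a minimal sketch of what such a function might look like. It is only an illustration, not an existing Ring function: the explicit `dir` argument and the `old-file?` helper are assumptions added for the example, and the expiry is assumed to be in seconds.

```clojure
(import '[java.io File])

(defn old-file?
  "True if f was last modified more than expiry-secs seconds before now-ms.
  Hypothetical helper, named here only for illustration."
  [^File f expiry-secs now-ms]
  (< (.lastModified f) (- now-ms (* 1000 expiry-secs))))

(defn delete-old-files
  "Deletes regular files directly inside dir that have been unmodified for
  more than expiry-secs seconds. Returns the number of files deleted.
  Assumption: takes the directory explicitly rather than tracking temp files."
  [^File dir expiry-secs]
  (let [now (System/currentTimeMillis)]
    (count
      (filter (fn [^File f]
                (and (.isFile f)
                     (old-file? f expiry-secs now)
                     (.delete f)))       ; .delete returns true on success
              (.listFiles dir)))))       ; nil (missing dir) counts as zero files
```

A user who is allowed background threads could call this from a `future` loop; one who is not could call it from a scheduled request or cron-style task instead.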
- James
Hi James,
I stumbled upon this thread while searching the web for a way to reject large requests in Ring.
AFAIK, this has not yet been implemented in Ring core (looking at https://github.com/ring-clojure/ring/pull/98), so I am on my own to implement it.
Here is what I have tried so far on Immutant 2:
- Pulling the InputStream out of the request and inspecting how large the content is (calling the .available method and trying to read bytes), which returns -1.
- Reading :content-length from the request map, which the wiki says is deprecated.
I understand that the behavior of the internal InputStream is up to the implementation of the underlying web server, but
I would greatly appreciate it if you could shed some light on the topic (pointers, advice, redirects...)
Regards,
Ikuru
P.S. Your work inspires me all of the time!