Is there any facility to send big files?


Michel Desmoulin

Jan 3, 2015, 3:54:39 PM
to autob...@googlegroups.com
Since WAMP is a natural fit for distributed systems, it seems natural to use it to balance big tasks such as video encoding, natural language processing and such. These tasks need material to process, often in the form of a file of hundreds of MB, sometimes more.

You can't really send a 1 GB file with an RPC call; that would bloat the RAM.

But a streaming protocol such as WebRTC on top of an async lib screams for easy data streaming.

Is there a facility to do that?

Could we add this as a primitive?

For example, a client declares:

    class App(ApplicationSession):

        def onJoin(self, details):

            def process_file(path, parameters_sent):
                # do stuff with the file
                pass

            # If no endpoint is provided, the data will go through Crossbar,
            # but you may want to avoid overloading it and bypass it.
            self.pipe('video.encoding', process_file, endpoint='ws://0.0.0.0:9999')



The client would then do:

    self.stream('video.encoding', '/path/to/file', ['param1'], onchunk)

Crossbar would send back the endpoint address, and the client would then start sending data, which the ApplicationSession would process automatically.

Once downloaded, process_file gets called with a path to the temporary file, which is deleted automatically once the function returns.

self.stream would return a promise, with then() called if the file is downloaded and error() called if the download has been interrupted. onchunk would provide a hook to process each file chunk while it is being sent. A similar parameter could be available on the server.

I know you can already do it manually using a streaming WebSocket, but having it as a primitive would make things so much simpler. Sending files between machines is still a pain in 2015. So many things to deal with: retries, failure callbacks, progress callbacks, etc.

Tobias Oberstein

Jan 4, 2015, 4:00:33 AM
to autob...@googlegroups.com
On 03.01.2015 at 21:54, Michel Desmoulin wrote:
> Since WAMP is a natural fit for distributed systems, it seems natural to
> use it to balance big tasks such as video encoding, natural language
> processing and such. These tasks need material to process, often in the
> form of a file of hundreds of MB, sometimes more.
>
> You can't really send a 1 GB file with an RPC call; that would bloat the RAM.

There were other posts asking for this (mass data) .. can't find them now.
Somehow this seems to be of interest to people ;) There are multiple
challenges with this.

WAMP was definitely not designed for real-time media transmission (VoIP
and such). The requirements are vastly different: dropping data to keep
up with network rates is OK there, for example. This is definitely out
of scope for WAMP, and as you note, the Web has something here: WebRTC.
If you need real-time _media_, go with WebRTC.

Note that WebRTC requires signaling before establishing media channels ..
this is something WAMP can do.

Now regarding mass data transfers like file transfer: in general, good
old HTTP does a decent job of this.

But this (transfer of "static" mass data over WAMP) is something we can
address (not completely out of scope, though not the primary focus of WAMP).

Here is one idea:

Download:

Say you have a WAMP procedure

"com.example.get_file"

That procedure can produce _progressive results_, returning a sequence of
chunks (of, say, 4KB each) to the Caller.
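
A minimal sketch of the Callee side of this download idea, using only the Python standard library. The `progress` argument is a hypothetical stand-in for whatever progressive-result callback a WAMP library hands to the Callee; the name `get_file` mirrors the procedure URI above, and the return value (total bytes sent) is an assumption about what the final result could be.

```python
import os
import tempfile

CHUNK_SIZE = 4 * 1024  # 4KB, the chunk size suggested above


def get_file(path, progress, chunk_size=CHUNK_SIZE):
    """Send the file at `path` chunk by chunk through `progress`.

    `progress` is a hypothetical stand-in for the progressive-result
    callback of a WAMP library. The total number of bytes sent is
    returned and would serve as the final result of the call.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            progress(chunk)
            total += len(chunk)
    return total


def demo():
    # Write a 10,000 byte file, then "download" it into a list of chunks.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 10000)
    try:
        received = []
        total = get_file(tmp.name, received.append)
    finally:
        os.remove(tmp.name)
    return total, [len(c) for c in received]
```

Only one chunk is held in memory at a time, which is the point of not returning the whole file as a single RPC result.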

Upload:

3 WAMP procedures

"com.example.create_file"
"com.example.append_to_file"
"com.example.finish_file"

The first returns a "handle" (say, just an opaque integer ID) to a temp
file on the Callee. The second takes that handle in the call and lets
you append chunks (e.g. 4KB) to the file previously created. The third,
finish_file, then turns the temp file into a "regular file".

Only the original Caller of create_file can append to it. If the
Caller dies before it calls finish_file, the temp file is automatically
removed. etc etc
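
The three upload procedures could be backed by Callee-side state along these lines. This is a sketch under assumptions, not an existing API: the class name `UploadStore`, the `caller_id` argument (standing in for the WAMP session ID of the Caller) and the `caller_gone` hook (standing in for whatever notification the Callee gets when a Caller disconnects) are all hypothetical.

```python
import itertools
import os
import shutil
import tempfile


class UploadStore:
    """Callee-side state for create_file / append_to_file / finish_file."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._uploads = {}  # handle -> (caller_id, temp file path)

    def create_file(self, caller_id):
        # Create an empty temp file and return an opaque integer handle.
        fd, path = tempfile.mkstemp()
        os.close(fd)
        handle = next(self._next_handle)
        self._uploads[handle] = (caller_id, path)
        return handle

    def _path_for(self, caller_id, handle):
        # Only the Caller that created the file may touch it.
        owner, path = self._uploads[handle]
        if owner != caller_id:
            raise PermissionError("handle belongs to another caller")
        return path

    def append_to_file(self, caller_id, handle, chunk):
        with open(self._path_for(caller_id, handle), "ab") as f:
            f.write(chunk)

    def finish_file(self, caller_id, handle, dest):
        # Turn the temp file into a regular file at `dest`.
        path = self._path_for(caller_id, handle)
        shutil.move(path, dest)
        del self._uploads[handle]

    def caller_gone(self, caller_id):
        # Remove unfinished temp files of a Caller that went away.
        for handle, (owner, path) in list(self._uploads.items()):
            if owner == caller_id:
                os.remove(path)
                del self._uploads[handle]
```

Each method would be registered under the matching procedure URI, with `caller_id` filled in from the call details rather than passed by the Caller.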

==

The challenge with the above approach is more of this sort: not doing the
4KB chunk "send progressive result" / "append" calls in tight loops, which
would effectively block everything else and swamp the WAMP session's
transport ("head of line blocking").

Plus: not swamping the outgoing TCP when the receiver can't keep up. In
Twisted this needs the "producer/consumer" pattern .. I have never tried
doing WAMP using Twisted "producer/consumer".
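
The backpressure concern can be illustrated with a bounded queue between sender and receiver. Note that this sketch swaps in the standard library's asyncio instead of Twisted's producer/consumer interfaces, and the function names and the `max_pending` parameter are made up for illustration. In a real session the consumer would write each chunk to the transport instead of appending it to a list.

```python
import asyncio

CHUNK_SIZE = 4 * 1024  # 4KB chunks, as in the discussion above


async def produce(data, queue, chunk_size=CHUNK_SIZE):
    """Cut `data` into chunks and put them on the bounded queue."""
    for start in range(0, len(data), chunk_size):
        # put() suspends while the queue is full, so the sender can never
        # get more than `queue.maxsize` chunks ahead of the receiver.
        await queue.put(data[start:start + chunk_size])
    await queue.put(None)  # sentinel: no more chunks


async def consume(queue, sink):
    """Take chunks off the queue until the sentinel arrives."""
    while True:
        chunk = await queue.get()
        if chunk is None:
            return
        sink.append(chunk)
        # Stand-in for a network write; yields control to other tasks.
        await asyncio.sleep(0)


async def transfer(data, max_pending=4):
    """Move `data` from producer to consumer with bounded buffering."""
    queue = asyncio.Queue(maxsize=max_pending)
    sink = []
    await asyncio.gather(produce(data, queue), consume(queue, sink))
    return b"".join(sink)
```

Because the producer waits whenever `max_pending` chunks are outstanding, memory use stays bounded and other tasks on the event loop get a turn, which is what a tight send loop would prevent.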

>
> But a streaming protocol such as WebRTC on top of an async lib screams
> for easy data streaming.

WebRTC is for vastly different requirements: losing stuff is "ok"
(dropped audio frames). Which wouldn't be ok for file transfer ..

>
> Is there a facility to do that?

Not really

>
> Could we add this as a primitive?

You mean: file transfer as a WAMP protocol primitive? Mmh. I'm not sure
this would fit into the design or be a good idea.

File transfer: there is FTP, HTTP, BitTorrent, ..

Why would I want WAMP?

>
> For example, a client declares:
>
>     class App(ApplicationSession):
>
>         def onJoin(self, details):
>
>             def process_file(path, parameters_sent):
>                 # do stuff with the file
>                 pass
>
>             # If no endpoint is provided, the data will go through Crossbar,
>             # but you may want to avoid overloading it and bypass it.
>             self.pipe('video.encoding', process_file, endpoint='ws://0.0.0.0:9999')
>
> The client would then do:
>
>     self.stream('video.encoding', '/path/to/file', ['param1'], onchunk)
>
> Crossbar would send back the endpoint address, and the client would
> then start sending data, which the ApplicationSession would process
> automatically.
>
> Once downloaded, process_file gets called with a path to the
> temporary file, which is deleted automatically once the function
> returns.
>
> self.stream would return a promise, with then() called if the file is
> downloaded and error() called if the download has been interrupted.
> onchunk would provide a hook to process each file chunk while it is
> being sent. A similar parameter could be available on the server.

I think this is quite close to what I outlined above. Should be doable
today. To prove it, someone would have to actually do it ;)

Cheers,
/Tobias


Michel Desmoulin

Jan 4, 2015, 5:46:24 AM
to autob...@googlegroups.com

It doesn't need to be WAMP, but WAMP can do the signaling nicely: you don't need to know the endpoint in advance, you can just ask for a registered endpoint by name. Plus, WebSocket seems like a good fit for streaming data. You already have both embedded, so it seems a natural fit.
 

> I think this is quite close to what I outlined above. Should be doable
> today. To prove it, someone would have to actually do it ;)

The eternal problem.
 
