Streaming with Railo


Arthur Blake

Jun 25, 2012, 11:06:59 AM
to ra...@googlegroups.com
Has anyone done streaming with Railo?  Basically, I have a custom JSON POST API built on top of Railo for third parties to integrate with our business, and I'd like to stream the output, as it can get quite large with base64-embedded files, etc.  Using the built-in facilities that come with ACF/Railo, it seems you have to build the entire request and response in memory before flushing it out.
This can be a big problem with memory usage with any decent kind of volume.

I have done quite a bit of Java and Servlets programming in the past, so I could write my own servlet or servlet filter to do this and hand off to Railo at the right points, but I was wondering whether there is some built-in way to do streaming in Railo that would let me avoid dropping down into Java to do that work.

Any advice or comments?

AJ Mercer

Jun 25, 2012, 8:31:17 PM
to ra...@googlegroups.com
<tease>
Does not help you today, but keep an eye out for Railo 4 (Apollo)
Micha has some cool ReST stuff baking in the oven ;-)
</tease>

Arthur Blake

Jun 26, 2012, 8:04:54 AM
to ra...@googlegroups.com
On Monday, June 25, 2012 8:31:17 PM UTC-4, AJ Mercer wrote:
> <tease>
> Does not help you today, but keep an eye out for Railo 4 (Apollo)
> Micha has some cool ReST stuff baking in the oven ;-)
> </tease>

I definitely am keeping an eye out for Railo 4!  Not sure how the REST stuff helps with streaming HTTP requests and responses, though.
Anyhow, if nobody has any good insights, I guess I will go the Java servlet route.

Jean Moniatte

Jun 26, 2012, 10:51:51 AM
to ra...@googlegroups.com

Probably a very wrong answer, but doesn't cfflush allow you to do this?

Thanks,
Jean

Arthur Blake

Jun 26, 2012, 12:35:49 PM
to ra...@googlegroups.com
On Tuesday, June 26, 2012 10:51:51 AM UTC-4, Jean Moniatte wrote:

> Probably a very wrong answer, but doesn't cfflush allow you to do this?
>
> Thanks,
> Jean

cfflush flushes any pending buffered data back to the client.  But you bring up a good point: if I am writing a response back, it is probably streamed to some extent.
The problem is streaming the request in.  I have no idea how to do that in CF.

Example:  Imagine a web service is sending me an HTTP POST where the body has 100MB of data in it.
I want to have the raw content streamed to me.
Using ACF or Railo, the only way I know to get the POST data is getHttpRequestData().content, which would grab that entire 100MB and shove it into a single variable all at once.
Instead I want to have a consumer that reads it chunk by chunk so that there is no more than a small amount in memory at a time, instead of all 100MB at once!
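
For illustration, a chunk-by-chunk consumer of this kind might look like the following Java sketch. It assumes the request body is available as a plain InputStream (for example from HttpServletRequest.getInputStream() in a servlet); the class and method names are hypothetical and nothing here is Railo-specific.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: not part of Railo or the servlet API.
final class ChunkedCopy {
    private ChunkedCopy() {}

    /**
     * Copies everything from in to out through one fixed-size buffer,
     * so at most bufferSize bytes of the body are held in memory at a time.
     * Returns the total number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // read() returns -1 at end of stream; otherwise the number of bytes read
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

With a 4 KB or 8 KB buffer, a 100MB POST body would pass through the consumer without ever being held in a single variable, provided the container has not already buffered the whole request itself.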

Alan Holden

Jun 26, 2012, 12:51:28 PM
to ra...@googlegroups.com
There's an assumption here that Railo would in fact shove the entire
100MB into RAM and/or manipulate it like one huge block at once - that
there is no better blob management already "under the hood" here.

I don't know any better myself either, but I would think that one would
need to confirm that this really is an issue - before expending lots of
work to correct for it.

Have you actually coded this scenario on a test machine? Can the Railo
folk confirm your fears?

My thoughts...

Al

On 6/26/2012 9:35 AM, Arthur Blake wrote:
(snipped)

Peter Boughton

Jun 26, 2012, 1:04:09 PM
to ra...@googlegroups.com
In Railo, the cfcontent tag lets you specify a range attribute, to
enable partial/resumable downloads:

<cfcontent range=true file="100Megs.data" />

I don't know precisely how it works, but presumably streaming would
make use of this, and thus it would probably send data in smaller than
100MB chunks?

Arthur Blake

Jun 26, 2012, 1:31:49 PM
to ra...@googlegroups.com
On Tuesday, June 26, 2012 12:51:28 PM UTC-4, Alan Holden wrote:
> There's an assumption here that Railo would in fact shove the entire
> 100MB into RAM and/or manipulate it like one huge block at once - that
> there is no better blob management already "under the hood" here.

Yes, but it is a correct assumption.  If you have an API call that says "give me all the content", how could it give you incomplete content?
Of course, it could die trying if the content is so large that the infrastructure isn't designed to hold that much data in a variable.

Regarding cfcontent: that looks like it is for output only.  I don't think output will be a problem; I just write dribs of data as they become available and it should stream for me (at least I think it should...).  I think the bigger problem is input. There is probably some way to do it, or if I could access the underlying HttpServletRequest object (assuming Railo isn't pre-loading the entire request), I could do everything I need...

Alan Holden

Jun 26, 2012, 2:44:59 PM
to ra...@googlegroups.com
I agree. Certainly the CFML object would ultimately provide the complete variable at the interface.

I was postulating that - behind the scenes - the engine or platform might already be using a disk cache or something to "stream" ginormous objects naturally, rather than load them entirely into RAM - and commit a predictable suicide - due to a programming instruction like CFFILE or CFCONTENT.

Now I'm familiar with servers which chug and die because their entire resources are consumed. But would a Railo engine croak just because a variable points to something larger than allocated RAM? That was my question or theory.
Al

On 6/26/2012 10:31 AM, Arthur Blake wrote:
> On Tuesday, June 26, 2012 12:51:28 PM UTC-4, Alan Holden wrote:
>> There's an assumption here that Railo would in fact shove the entire
>> 100MB into RAM and/or manipulate it like one huge block at once - that
>> there is no better blob management already "under the hood" here.
>
> Yes, but it is a correct assumption.  If you have an API call that says "give me all the content", how could it give you incomplete content?
> Of course, it could die trying if the content is so large that the infrastructure isn't designed to hold that much data in a variable.

(snipped)

Denny

Jun 26, 2012, 3:32:40 PM
to ra...@googlegroups.com
On 6/26/12 11:31 AM, Arthur Blake wrote:
...
> That looks like it is for output only. I don't think output will be a
> problem, I just write dribs of data as it is available and it should stream
> for me (at least I think it should...) I think the bigger problem is
> input. There is probably some way to do it, or if I could access the
> underlying HttpServletRequest object (assuming Railo isn't pre-loading the
> entire request) , I could do everything I need...

I think getPageContext().getHttpServletRequest() would get you that.
Maybe you could just grab the stream (but IIRC it isn't that easy).

Can you give a little more info about how/what you're doing? I suspect
there's probably a better way, depending on your scenario. Are you
using a webserver in front of your app server? How exactly are you
posting the files? Is this for mobile stuff (I saw you had base64 in
the mix)? There might be a better way than base64, if so.

As Al said, nobody would be able to do gig+ uploads if the chunked
transfers and whatnot weren't handled "under the hood" (I suspect at a
lower level than Railo itself even, but I dunno). If you're doing
custom stuff, is it possible you're bypassing tech that would normally
be used for file transfers?

Are you trying to solve a real problem, or is this theoretical work? If
it's a real problem, can you create a testcase for folks to play with?
It might be the way you're handling variables, vs. the actual uploading
of the data, etc..

If it's a Railo limitation, we'll address it, one way or another. :)

:Denny

--
Railo Technologies: getrailo.com Professional Open Source
Skype: valliantster (505)510.1336 de...@getrailo.com
GnuPG-FP: DDEB 16E1 EF43 DCFD 0AEE 5CD0 964B B7B0 1C22 CB62


Arthur Blake

Jun 26, 2012, 3:58:08 PM
to ra...@googlegroups.com
None of this is theoretical; the whole interface has been in production for almost 2 years.  Essentially, it's a B2B interface (two-way JSON POST over HTTPS) for transferring documents, notes, etc. between businesses.

On the protocol that is in place, we normally transfer files by reference (by passing a URI to the document that has a session key as part of the URI to secure it), instead of embedding it with base64 in the JSON message.

That tends to work out quite smoothly because we sidestep the whole issue entirely because the JSON messages are quite small, and we take advantage of CFHTTP to download the document directly into a file as a separate request.

But we are working with a new partner that is insisting on transferring files to us as embedded base64 content (because in some contexts they can only post out to us and cannot provide a web server for downloading the resources separately; in other words, it's a lot easier for them), so I am looking into upgrading what we have to handle this case more robustly.

The normal problem with transferring a document embedded in JSON (or XML for that matter) as base64 encoded is that if you implement it the naive way, you end up storing the same document in memory 3 times as you are decoding it:

1. When the entire request is received. (getHttpRequestData().content)
2. When the JSON is deserialized into a struct (DeserializeJSON function)
3. When the base64 for each embedded file is decoded into binary (ToBinary function)

Couple that with multiple large documents in a single request and multiple simultaneous requests, and you end up with a slow interface that gets OutOfMemoryErrors quite frequently (been there, done that!)
ColdFusion/Railo does not provide a streaming mechanism (that I know of) for any of the 3 parts outlined above. So I am looking into streaming the whole incoming request into a streaming JSON deserializer (like Jackson, which works similarly to SAX XML processing), picking out the base64 parts as they are encountered (with some kind of streaming base64 Java library), and streaming them into actual files on disk rather than into memory.
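
As a rough illustration of the base64 step only, the following Java sketch decodes a stream of base64 text straight into a file without holding the whole document in memory. It assumes the base64 characters have already been isolated as an InputStream (the JSON parsing is not shown), and it uses java.util.Base64, which ships with Java 8 and later; on an older JVM a library class such as Apache Commons Codec's Base64InputStream would be needed instead. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper: streams base64 text into a binary file on disk.
final class Base64ToFile {
    private Base64ToFile() {}

    /**
     * Decodes base64 characters read from base64Text and writes the binary
     * result to target. Only one small buffer is in memory at a time.
     * Returns the number of decoded bytes written.
     */
    static long decodeToFile(InputStream base64Text, Path target) throws IOException {
        // wrap() decodes lazily as bytes are read from the returned stream.
        // The basic decoder rejects line breaks; Base64.getMimeDecoder()
        // would be the choice if the sender wraps lines.
        try (InputStream decoded = Base64.getDecoder().wrap(base64Text);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            while ((n = decoded.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```

This covers only the third of the three copies listed above; the request body and the JSON text would still need a streaming reader in front of it for the overall memory footprint to stay small.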

This, if properly implemented, would be quite robust, fast, and lean, and would not suffer from any of the memory problems mentioned.

Arthur Blake

Jun 26, 2012, 5:18:19 PM
to ra...@googlegroups.com
I think I am really beyond what Railo can do easily. Yes, I could probably make it work with a lot of ingenuity and shelling out to Java for 90% of it...

What I am now leaning towards is writing the bulk of it in Java as a new servlet that acts as a proxy in front of Railo, that deserializes the initial request and breaks it down into manageable pieces for CF. Then it proxies the smaller JSON payload with the files as references to either temp files on the server or URIs on a smaller internal web server.

So the initial request would chop the unwieldy JSON into a smaller JSON that has the base64 parts nicely decoded and converted to URIs or files that are then looked up by reference by the CF/Railo processor.

That seems like it would be easier to do than try and have Railo or Java handle the whole thing, given that all the business logic is already nicely in place under Railo...

Thoughts and suggestions would still be welcome...

Alan Holden

Jun 26, 2012, 5:22:12 PM
to ra...@googlegroups.com
Ah, I can see from your original post that you mentioned base 64; but
just assumed that to be a content type of the files you needed to stream
(i.e. they're already converted before you had to deal with them).

Well, the first thought I had was: "how the hell are THEY doing it?" I
hope that they have a proof of concept in place, rather than running
into the same roadblock after you've done all your work, only to say
'never mind' later on...

So, do any ONE of these three processes lag or fail for lack of memory
on their own? For example, when both the input and output are files?

Now I'm just spitballin'. Probably not much help...

Al

Arthur Blake

Jun 26, 2012, 5:33:35 PM
to ra...@googlegroups.com
They are a large company and are already doing it with many other vendors (albeit with XML, not JSON.)
This is actually not rocket science.  It's not that hard.  What is hard for me is figuring out how to stream every last bit of it in Railo so that it's not such a memory hog.
In Java it's easy because there are so many classes for dealing with streams.

Arthur Blake

Jun 26, 2012, 5:38:00 PM
to ra...@googlegroups.com
Sorry, didn't answer your other question:

"So, do any ONE of these three processes lag or fail for lack of memory 
on their own? For example, when both the input and output are files?"

They certainly can, it all depends on the size of the files coming over.  The fact that there are 3 overlapping in one request just compounds the problem greatly.

Denny

Jun 26, 2012, 9:57:54 PM
to ra...@googlegroups.com
I wonder if this could work:

http://commons.apache.org/fileupload/streaming.html

Are you using raw message body or are you doing multipart/form?

websockets? ;)

It sounds like you have control of both ends, if you're using custom
AJAX, and yet it also seems like the client has existing infrastructure
you cannot change (XML+b64 soap-like deal?).

Do you have control of both ends? Or are they wanting to point their
existing "upload" stuff at your API?

Eh. A quick experiment (33 lines or so in "meat"), which seems to work
-- I don't know if it's really intercepting the request -- untested
beyond a simple text file:

https://github.com/denuno/flyingbase64

If you test it out, I'd be curious to know if it's really doing what you
wanted to do. Regardless, it's kinda nifty, maybe useful for something
else at some point.

Jochem van Dieten

Jun 27, 2012, 8:59:11 AM
to ra...@googlegroups.com
On Tue, Jun 26, 2012 at 11:18 PM, Arthur Blake wrote:
> What I am now leaning towards is writing the bulk of it in Java as a new
> servlet that acts as a proxy in front of Railo, that deserializes the
> initial request and breaks it down into manageable pieces for CF.

If you are going to do programming anyway, why not just support one of
the existing internet standards that offer some relief or even a
solution for this problem? Off the top of my head we have DIME, MTOM
and swaRef as credible alternatives to base64 embedded binary. Or even
some do-it-yourself mechanism would work, you already have a POST so
it is easy to make it multipart and have them put the binaries in a
different part from the JSON.

Jochem

--
Jochem van Dieten
http://jochem.vandieten.net/

Arthur Blake

Jun 27, 2012, 12:45:19 PM
to ra...@googlegroups.com
Thanks for your suggestions.

Serge Droganov

Jun 27, 2012, 3:41:10 PM
to ra...@googlegroups.com
Hello Arthur, Can't you stream with a front-end? Why do you need to load the app?

On Wed, Jun 27, 2012 at 8:45 PM, Arthur Blake <arthur...@gmail.com> wrote:
> Thanks for your suggestions.


Arthur Blake

Jun 27, 2012, 3:49:23 PM
to ra...@googlegroups.com
On Wednesday, June 27, 2012 3:41:10 PM UTC-4, Serge Droganov wrote:
> Hello Arthur, Can't you stream with a front-end? Why do you need to load the app?

I don't understand your question. What specifically do you mean by "front-end", and what do you mean by "app"?
The solution I've outlined (having a Java Servlet in front of Railo) is a front-end of sorts...
 

Serge Droganov

Jun 27, 2012, 4:27:42 PM
to ra...@googlegroups.com
It's not clear to me why you want to output content with Railo. You can use nginx (or whatever) to proxy requests to Railo, then apply some logic and redirect nginx to the content file. I guess it's better to use a solution that is designed for streaming.

Denny

Jun 27, 2012, 5:46:42 PM
to ra...@googlegroups.com
On 6/27/12 2:27 PM, Serge Droganov wrote:
> It's not clear to me why do you want to output content with railo. You can
> use nginx (or whatever) to proxy requests to railo, then make some logic
> and redirect nginx to the content file. I guess it's better to use a
> solution that is designed for streaming.

I think it's a two-pronged deal. One "prong" is the "in house" one,
controlled at both ends, and the other is the "externally controlled"
prong, which is about consuming a 3rd party format (XML/base64) without
having to go "outside" the existing application, and without monster
memory consumption increases- thus the use of the inbound stream.

(I love the "passing by reference", aspect of the existing app, BTW!)

I think Arthur was talking about both aspects, and thus the confusion
over what parts were "external" requirements.

I'm not 100% on that interpretation, and I haven't reread the thread,
but that's the story in my mind ATM. :)

While it's perhaps funny that UUencode-type stuff is still rampant in
the wild (I'm with Jochem (who is a hip cat BTW) on binary) it sounds
like that's one of the externally defined constraints, for the XML stuff
at least. I think.

All I know, is that something that didn't work when I tried it years ago
(intercepting the stream) at least /seems/ to work now. Woohoo!

Arthur Blake

Jun 28, 2012, 2:50:41 PM
to ra...@googlegroups.com
Hey guys, I agree with you that passing things around in base64 is somewhat inane.
Unfortunately, I often have to cooperate and work with standards in my industry that it seems most everyone has bought into.
I am in control of some of it and not in control of other parts of it.
Anyway, I think I have enough information to get the thing working well.
Thanks everyone for all the great suggestions and ideas.

Alan Holden

Jun 28, 2012, 3:10:42 PM
to ra...@googlegroups.com
I hope you can take the time to post back with your solution.

Al

On 6/28/2012 11:50 AM, Arthur Blake wrote:
(snipped)

Arthur Blake

Jun 29, 2012, 8:06:25 AM
to ra...@googlegroups.com
On Thursday, June 28, 2012 3:10:42 PM UTC-4, Alan Holden wrote:
> I hope you can take the time to post back with your solution.
>
> Al

Will do!