htmlfile hack problem


ufo

Nov 4, 2007, 4:49:12 PM
to Orbited Discussion
Hi,

After a week of trying to keep the htmlfile object from disconnecting
after N JavaScript calls, I finally found the solution used in the
Orbited JavaScript code. I thought the problem was solved, but it turned
out that it is not. When shifting elements off the array, we are
supposed to reset the array globally, i.e. unset all of its elements one
by one until the array is empty, and set the .length property to 0.
That is what happens inside the "s" function after the loop, but the
next time "s" is called, dq.length has a strange value: it counts all
the new elements that arrived after the first shift, plus the number of
elements already shifted. In other words, it looks as if changing
dq.length inside "s", whether implicitly or explicitly, has no effect on
the .length property of the global data_queue array. It is also
interesting that the elements themselves are unset after shifting and
their values are null, but if you alert(dq) before the loop in "s" it
shows ,,,,,,,,,,new value 1, new value 2, new value 3 ..., where the run
of commas stands for the already unset elements. The result is that
event_cb is called many more times than it should be.

My current solution is basically a hack:

function s() {
    while (dq.length > 0) {
        var kk = dq.shift();
        if (kk)
            event_cb(kk);
    }
}
I do not call my callback function if the element is empty. But maybe
there is a more elegant solution out there already?

And another question: why is the iframe transport commented out?
// // Otherwise use the iframe
// this.transport = "iframe";
// return;

Michael Carter

Nov 5, 2007, 3:47:17 AM11/5/07
to orbite...@googlegroups.com
Hi peregar (ufo),

I'll look into this problem. We may not have noticed it because our example applications ignore null events. But we never noticed a performance impact when sending tens of thousands of events with our current scheme.

That said, we simply need to change the implementation of the shift command and things will be fine. The solution to the problem is the queue and local polling. The particular implementation of that is mostly a detail. We'll do further testing, and if necessary we'll change our particular implementation.

Thanks,

Michael Carter

Lame Vi

Nov 5, 2007, 5:03:50 AM11/5/07
to orbite...@googlegroups.com
Hi Michael!

Thank you for fast response.

I actually ran into a performance impact because of this behaviour. The
scenario is the following:
- I flush events from the server 10 times a second (for testing) and
check for them in the browser at the same interval. At the beginning the
code doesn't have much to do, but later, when the events number in the
thousands, checking each of them for 'undefined' takes a lot of CPU.

Say, after the first minute of work we already have 600 entries being
checked every 1/10th of a second, when only one of them is actually a
new event.

I tried to set .length = 0 inside the "s" function but didn't succeed,
as the old length came back the next time "s" was called.


And the question about iframe solution - why is it commented in
orbited.js code?


Ufo.

Jacob Rus

Nov 5, 2007, 5:54:34 AM11/5/07
to orbite...@googlegroups.com
Lame Vi (peregar) wrote:
> And the question about iframe solution - why is it commented in
> orbited.js code?

Because at the moment it isn't needed by any modern browsers, and our
"fallback" is the xhr streaming transport (for Firefox/Safari), with
Opera and IE getting their own special transports.

-Jacob

Lame Vi

Nov 5, 2007, 6:19:36 AM11/5/07
to orbite...@googlegroups.com
Thanks for answering, Jacob. But the XHR solution is only valid for same domain server communication, as far as I can imagine. The iframe solution would be useful if browser page and the event server are on different domains. Is it correct?

Jacob Rus

Nov 5, 2007, 2:41:41 PM11/5/07
to orbite...@googlegroups.com
Lame Vi (peregar) wrote:
> Thanks for answering, Jacob. But the XHR solution is only valid for same
> domain server communication, as far as I can imagine. The iframe solution
> would be useful if browser page and the event server are on different
> domains. Is it correct?

Yeah, the orbited.js code needs to be changed so that the xhr request
is done from within an iframe with a different domain. We'll
certainly fix that before the next Orbited release, but feel free to
make the change in your own local copy.

-Jacob

Michael Carter

Nov 5, 2007, 2:53:58 PM11/5/07
to orbite...@googlegroups.com

All of these solutions will work cross-port or cross-subdomain. Some additional hacks are needed to get xhr working cross-subdomain, like putting the requests in an iframe and switching document.domain. But none of these methods will work completely cross-domain. Your only chance there is long polling jsonp.
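
For readers who have not seen long-polling JSONP, here is a minimal sketch of the idea. It is not Orbited code: the `jsonpPoll` name, the `callback` query parameter and the injected `doc` and `scope` arguments are assumptions made for illustration, and the server is assumed to answer each request with a script that calls the named function once an event is available.

```javascript
// Hypothetical helper, not part of Orbited. `doc` and `scope` are passed in
// (normally `document` and `window`) so the sketch has no hidden globals.
var jsonpCounter = 0;

function jsonpPoll(doc, scope, baseUrl, onMessage) {
  var name = "jsonpCallback" + (++jsonpCounter);
  var head = doc.getElementsByTagName("head")[0];
  var script = doc.createElement("script");

  // Assumption: the server replies with `<name>(<json payload>);`
  scope[name] = function (data) {
    scope[name] = undefined;   // one-shot callback
    head.removeChild(script);  // drop the finished request
    onMessage(data);
    jsonpPoll(doc, scope, baseUrl, onMessage); // start the next poll
  };

  var separator = baseUrl.indexOf("?") === -1 ? "?" : "&";
  script.src = baseUrl + separator + "callback=" + name;
  head.appendChild(script);
  return name;
}
```

In a page this would be started with something like `jsonpPoll(document, window, url, handleEvent)`, where `url` and `handleEvent` are your own. Because the request is a plain script include, it is not subject to the same-domain restriction that applies to XHR.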


As to your earlier question, it sounds like it's not a big problem to fix the bug you're talking about. We could keep a counter for the position in the array of the last event payload and ignore all the preceding nulls.

We'll try fixing this sometime in the next couple of days.
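
A rough sketch of what the counter idea could look like, under the assumption that the queue is only ever appended to. The names `makeCursorQueue`, `push` and `drain` are made up for this example and are not the actual orbited.js implementation.

```javascript
// Hypothetical sketch of the counter approach; names are illustrative only.
function makeCursorQueue(onEvent) {
  var items = [];
  var next = 0; // index of the first entry that has not been delivered yet

  return {
    push: function (value) {
      items[items.length] = value; // always append, never leave holes
    },
    drain: function () {
      var delivered = 0;
      while (next < items.length) {
        var value = items[next];
        items[next] = null; // release the payload, keep the slot
        next++;
        if (value !== null && value !== undefined) {
          onEvent(value);
          delivered++;
        }
      }
      return delivered; // number of callbacks made in this pass
    }
  };
}
```

Each call to `drain` only looks at entries added since the previous call, so the cost per tick no longer grows with the total number of events. The array itself still grows by one null slot per event, which a long-lived page may want to reset from time to time.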

-Michael Carter


ufo

Nov 5, 2007, 4:45:12 PM11/5/07
to Orbited Discussion
Oh

I finally found the root of the problem with my array. I was populating
the array from the server by sending the elements one by one with an
incrementing index, like this (using PHP):

<script>arr[<?= $i++ ?>]=data</script>

which in turn created all those empty elements I reported. Now
I'm setting each element as arr[arr.length]=data and no more
extra elements occur.
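
The effect can be reproduced with a plain array, without htmlfile at all. The sketch below is only an illustration: `drain` stands in for the "s" loop and `checksForLastEvent` is a made-up helper that counts how many `shift()` calls the client needs for the most recent event.

```javascript
// `drain` mirrors the "s" loop: shift until the queue is empty.
function drain(queue) {
  var checks = 0;
  while (queue.length > 0) {
    queue.shift();
    checks++;
  }
  return checks;
}

function checksForLastEvent(eventCount, useServerIndex) {
  var queue = [];
  var checks = 0;
  for (var i = 0; i < eventCount; i++) {
    if (useServerIndex) {
      queue[i] = "event " + i;            // like arr[<?= $i++ ?>] = data
    } else {
      queue[queue.length] = "event " + i; // like arr[arr.length] = data
    }
    checks = drain(queue); // the client empties the queue on every tick
  }
  return checks;
}

checksForLastEvent(600, true);  // 600: 599 holes plus the one real event
checksForLastEvent(600, false); // 1
```

Writing to index `i` of an array that has already been emptied sets its length to `i + 1`, so every earlier index comes back as a hole. That matches the 600 checks per tick after one minute at 10 events per second.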

Sorry for the false bug report.


P.S. Which is the best way for Safari to keep a permanent connection?
Does it support streaming XHR as well as Firefox does?

ufo

Nov 5, 2007, 5:30:58 PM11/5/07
to Orbited Discussion
Yohoho,

Something really weird is going on here.

It looks like I found a way to avoid fetching the data from an array in
a loop in the htmlfile hack, and to trigger events directly from the
script coming from the server, as it is supposed to work.

I noticed that when I alert something before assigning the next
array element coming from the server, the data transfer doesn't stop
after the N-th event. I passed a reference to my event_cb function from
my main document to the iframe

c.parentWindow.updateData = updateData;

and then assigned it to a local variable at the beginning of the
streaming document

<html><head><title>f</title></head><?= str_repeat(' ', 1024) ?><body>
<script type="text/javascript">var updateData = parent.updateData;</script>

But the main hack is that the main document MUST have a ticker; even an
empty ticker does the trick. After I put the iframe into my main
document I set an interval for an empty function, and it doesn't even
matter how frequent it is.

c.parentWindow.updateData = updateData;
c.iframediv.innerHTML = "<iframe id='ifr' src='" + src + "&arr'></iframe>";

setInterval( function () {}, 10000)

Now there is no limit on the number of DOM operations inside the
streaming document, and it can trigger the events itself by directly
calling the function from the main document.

Is there any explanation for this?


And, I guess, the old used <script> elements need to be removed from the
streaming document, as they will leak memory in IE.

Jacob Rus

Nov 7, 2007, 6:06:17 PM11/7/07
to orbite...@googlegroups.com

I'm confused about exactly what you're describing. Do you mind
pasting more complete code somewhere?

--Jacob

Lame Vi

Nov 8, 2007, 5:39:01 AM11/8/07
to orbite...@googlegroups.com

> I'm confused about exactly what you're describing. Do you mind
> pasting more complete code somewhere?
>
> --Jacob
>
Sorry if I was not clear enough. The discovery made me too excited at
the time of writing.

From the very beginning:

I found an article that explains how to implement a forever iframe in IE.
http://alex.dojotoolkit.org/?p=538
Then, when I tried this technique, I ran into the same problem that
Michael describes here:
http://cometdaily.com/2007/10/25/http-streaming-and-internet-explorer/
I went to check his implementation and fetched orbited.js.
There I found the solution with the array. Then I started to play with it
and found a solution.

Maybe the code itself explains it better.

This is my code that makes a connection iframe inside "htmlfile" object:

var c = new ActiveXObject("htmlfile");
c.open();
c.write("<html><head><title>f<\/title><\/head><body>");
c.write("<script>document.domain = '" + dom + "'<\/s" + "cript>");
c.write("<\/body><\/html>");
c.close();
c.div = c.createElement("div");
c.appendChild(c.div);
c.parentWindow.updateData = updateData; // make the callback available to the iframe
c.div.innerHTML = "<iframe id='ifr' src='" + src + "'><\/iframe>";

setInterval( function () {}, 10000); // this eliminates the DOM usage limit inside the iframe

And this is the code that comes as streaming html into this created iframe:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html><head><title>f</title></head><body>


<script type="text/javascript">var updateData = parent.updateData;</script>

<script type="text/javascript">updateData('event 1');</script>
...
<script type="text/javascript">updateData('event 2');</script>
...
<script type="text/javascript">updateData('event 3');</script>
...
<script type="text/javascript">updateData('event 4');</script>

Notice that there is no periodic function that checks the array from
inside the iframe and shifts its values: the streaming HTML calls the
trigger function itself, and it is executed as soon as it arrives in
the browser document.
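
As an aside, here is a sketch of how a server could build each of these chunks. It is written in JavaScript rather than the PHP used above, `scriptChunk` is a made-up helper, and it assumes `JSON.stringify` is available on the server. The point it illustrates is that the payload has to be escaped before it is placed inside a `<script>` element.

```javascript
// Hypothetical server-side helper; not part of Orbited or Meteor.
function scriptChunk(callbackName, payload) {
  // The JSON text is used as the JavaScript argument. "<" is escaped so
  // that a payload containing "</script>" cannot end the element early.
  var literal = JSON.stringify(payload).replace(/</g, "\\u003c");
  return '<script type="text/javascript">' + callbackName +
         "(" + literal + ");</script>\n";
}

scriptChunk("updateData", "event 1");
// '<script type="text/javascript">updateData("event 1");</script>\n'
```

The server would write one such chunk per event and flush it immediately, so the browser executes the call as soon as the chunk arrives.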

trib

Nov 8, 2007, 9:41:25 AM11/8/07
to Orbited Discussion
Hi chaps,

I was a bit slow on the uptake here when Michael asked me about this
issue, because I've not seen it happen in Meteor, but that's probably
because it doesn't seem to affect us. Meteor uses an inline call to a
function in the parent frame and bridges the function into the iframe
as the first command that the streaming page executes. I don't know
if that prevents the HTMLFile limitation from kicking in, but I've
been watching my starscape demo at meteorserver.org via Charles
(http://www.xk72.com/charles/) and I don't see this issue.

This is one of the connections coming from my HTMLFile iframe at the
moment:

Status:               Receiving response body...
Response Code:        200 OK
Protocol:             HTTP/1.1
Method:               GET
Content-Type:         text/html; charset=utf-8
Request Start Time:   08/11/07 14:20:05
Response Duration:    218.27 sec
Latency:              0 ms
Speed:                0.02 KB/s
Response Speed:       0.02 KB/s
Request Header Size:  524 bytes
Response Header Size: 185 bytes
Response Size:        3.01 KB (3078 bytes)
Total Size:           3.70 KB (3787 bytes)

So it's been going for almost 4 minutes and has sent 110 events like
these:

<script>p(6,"demo","{ip:'10.0.0.3',path:'%2F',respcode:200}");</script>
<script>p(-1,"");</script>
<script>p(7,"demo","{ip:'10.0.0.1',path:'%2Fserver-docs%2F',respcode:200}");</script>
<script>p(-1,"");</script>
<script>p(8,"demo","{ip:'10.0.0.8',path:'%2Fdemo%2F',respcode:200}");</script>
<script>p(-1,"");</script>

The p() function is simply a bridge to the Meteor.process function in
the parent frame. I am however still having the lingering HTMLFile
problem - you navigate to another page, and Charles still shows both
connections continuing to receive data, even though theoretically one
of them doesn't exist in the browser anymore. Damn difficult to debug
as there's no page from which to extract any debug. Is anyone else
seeing this?

Cheers.

Andrew

Michael Carter

Nov 8, 2007, 5:55:15 PM11/8/07
to orbite...@googlegroups.com

I've been investigating this problem further, and I believe that both the dom manipulation limit and the lingering HTMLFile problem are related. Here is what I think:

The reason has to do with the way that Internet Explorer handles garbage collection. For some reason, creating a setInterval on the parent window will create a reference to the htmlfile. But putting that htmlfile object in the parent dom doesn't create that reference. Without a reference to the htmlfile, the garbage collection process will eventually just delete the page, closing the connection right in the middle of javascript execution.

For whatever reason, the garbage collection procedure is linked to the number of JavaScript executions, in particular DOM manipulations. (I'm sure it's more complicated than that, but I think this explanation will suffice for our purposes.) So what I've found out is that causing a garbage collection to occur with no reference to the htmlfile (no setInterval running) will cause the htmlfile to immediately cease its JavaScript rendering. To trigger this, just execute "CollectGarbage();". So if you create an htmlfile and attach a callback function, and in that callback function you handle the event data and end by executing "CollectGarbage();", then after a single execution the connection from the htmlfile will die.

I believe that the reason the connections linger, as per Andrew's problem, is that even after the page navigation happens, the htmlfile isn't actually garbage collected for a while, at least until it exceeds the limit that was causing my initial problem. None of the DOM manipulations count against the garbage collection limit until after the page navigation occurs, and then the countdown starts from ~ 50 DOM manipulations.

At this point, Andrew, since I've been having trouble replicating your problem with my experiments using orbited (probably b/c orbited explicitly closes the old connection in most use-cases), you should add a call to CollectGarbage(); with each event, or even a setInterval inside of the htmlfile iframe source that calls CollectGarbage every 0.5 seconds or something. If this solves your problem, then we need to also figure out the performance implications of incurring a garbage collection.

I am going to look for a more obvious way of creating an explicit reference to the htmlfile besides using the setInterval in the parent window (no idea why that creates the reference.) Hopefully after some more investigating we can actually get this problem fixed.

-Michael Carter

trib

Nov 9, 2007, 3:27:02 AM11/9/07
to Orbited Discussion
> The reason has to do with the way that Internet Explorer handles garbage
> collection. For some reason, creating a setInterval on the parent window
> will create a reference to the htmlfile. But putting that htmlfile object in
> the parent dom doesn't create that reference. Without a reference to the
> htmlfile, the garbage collection process will eventually just delete the
> page, closing the connection right in the middle of javascript execution.

This sounds great as a hypothesis, but I can't see where I'm creating
a reference to the HTMLFile - this is where I create it:

this.transferDoc.open();
this.transferDoc.write("<html><script>");
this.transferDoc.write("document.domain=\""+(document.domain)+"\";");
this.transferDoc.write("</"+"script></html>");
this.transferDoc.parentWindow.Meteor = Meteor;
this.transferDoc.close();

So other than a difference in syntax, this is basically the way
Orbited does it, except I have no setInterval - the events are passed
up from the iframe in the HTMLFile by the iframe calling a function in
the parent via a bridge (the process() function in the parent is
mapped to p() in the iframe):

ifr.p = this.instances[instid].process.bind(this.instances[instid]);

So shouldn't this setup be symptomatic of the manipulation limit?

>
> For whatever reason, the garbage collection procedure is linked to the
> number of javascript executions, in particular dom manipulations. (I'm sure
> its more complicated than that, but i think this explanation will suffice
> for our purposes.) So what I've found out is that causing a garbage
> collection to occur with no reference to the htmlfile (no setInterval
> running) will cause the htmlfile to immediately cease its javascript
> rendering. In order to cause this to happen just execute "CollectGarbage();"
> So if you create an htmlfile and attach a callback function, and in that
> callback function you handle the event data and end by executing
> "CollectGarbage();" then after a single execution the connection from the
> htmlfile will die.

I'll give that a go today.

>
> I believe that the reason the connections linger, as per Andrew's problem,
> is that even after the page navigation happens, the htmlfile isn't actually
> garbage collected for a while, at least until it execeeds the limit that was
> causing my initial problem. None of the dom manipulations count against the
> garbage collection limit until after the page navigation occurs, and then
> the countdown starts from ~ 50 dom manipulations.

To clarify, I can navigate to another page and then get over a hundred
events on the orphaned connection. I've only ever seen a server
disconnect at this point - because Meteor server has a connection time
limit, and disconnects automatically after a number of minutes to
mitigate exactly this problem. I'll set the timeout really high and
see if the client ever disconnects.

> I've been having trouble replicating your
> problem with my experiments using orbited (probably b/c orbited explicitly
> closes the old connection in most use-cases)

How do you explicitly close the connection on a page navigation? You
don't seem to be hooked into window.onunload, which is the only way
I've managed to get around this so far. Or does the server kill a
connection if a new one connects from the same client?

Michael Carter

Nov 9, 2007, 3:35:22 AM11/9/07
to orbite...@googlegroups.com
Okay, I've figured out the problems and here are some solutions.

This is a garbage collection problem. Our original code looked like this:

var transferDoc = new ActiveXObject("htmlfile"); // !?!
transferDoc.open();
transferDoc.write("<html>");
transferDoc.write("<script>document.domain='" + document.domain + "';</script>");
transferDoc.write("</html>");
transferDoc.close();
var ifrDiv = transferDoc.createElement("div");
transferDoc.body.appendChild(ifrDiv);
ifrDiv.innerHTML = "<iframe src='"+url+"'></iframe>";

So here's the problem: We don't hold on to a reference to transferDoc. So the garbage collector deletes it the first chance it gets. And the garbage collector operates not on a time interval but on an execution interval, so we were able to do a little bit of work before it was destroyed. This led to many frameworks thinking they had a working version when it would fall apart under real use.

So the fix as proposed by Ufo is to add this line to the bottom of the above code:


setInterval( function () {}, 10000)

And this will indeed fix this problem. The reason is that we are passing an anonymous function to setInterval, and in the creation of that function a closure is formed. The closure holds references to all of the variables present in the scope of the anonymous function, which includes transferDoc. So the reason this works has nothing to do with setInterval and everything to do with creating a closure. Using the following line instead of ufo's would fix the problem equally:

function() { }

This will also create a closure around the reference to transferDoc which will keep the garbage collector at bay. (Strangely, we don't even have to keep a reference to the anonymous function.)

But now we have another problem: how do we actually delete the ActiveXObject("htmlfile") object when we navigate away? Because we are defining an anonymous function, we have no way of getting back at the closure and removing it. This isn't a problem in any browser except IE, and only when a closure is formed around an ActiveX object (or possibly any non-JS (DOM) element). Using Ufo's fix doesn't help us, because even if we were to delete the timer we created, the anonymous function's closure would remain. And also, strangely, deleting the anonymous function doesn't solve the problem. (It still sticks around... I wonder why we keep that in memory? I recall reading that IE has some problem with properly garbage collecting closures that reference ActiveX objects. Any insight is welcome.)

So the solution is stupidly obvious once you understand the problem: hold on to a reference to the transferDoc variable. In orbited here is the new code we use: (with the addition of a single line)

Orbited = {
  ...
 
  connect_htmlfile: function() { 
    var url = this.location + '?user=' + this.user;
    url += "&session=" + this.session + "&transport=iframe";
    var transferDoc = new ActiveXObject("htmlfile"); // !?!
    transferDoc.open();
    transferDoc.write("<html>");
    transferDoc.write("<script>document.domain='" + document.domain + "';</script>");
    transferDoc.write ("</html>");
    transferDoc.parentWindow.Orbited = this;
    transferDoc.close();
    var ifrDiv = transferDoc.createElement("div");
    transferDoc.body.appendChild(ifrDiv);
    ifrDiv.innerHTML = "<iframe src='"+url+"'></iframe>";
    this.transferDoc = transferDoc
  }
 
  ...
}
 
Now let's say that we want to close the connection. We just need to kill that reference:

Orbited.transferDoc = null;

Or alternatively just navigate away from the page, or reload the page.

But this doesn't account for the additional events the htmlfile iframe will receive even after we have no references to it. That's because it's not immediately garbage collected. Normally this isn't a problem for random DOM elements, but in this case it means that the connection to our comet server persists past when we expect it to. The solution is to manually initiate garbage collection.

Orbited.transferDoc = null;
CollectGarbage();


So Andrew, I assume that you are either holding on to a reference explicitly, creating a closure around the original reference, or both, and that is why you haven't run into the problem that others have. You should make sure that you don't create any closures around scopes that contain a reference to transferDoc, or you won't be able to delete the htmlfile when you want. Other than that, try throwing in an onunload: Meteor.transferDoc = null; CollectGarbage(); and see where it gets you.
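
Put together, the cleanup could be wrapped in a small helper like the sketch below. `releaseTransferDoc` is a made-up name, `holder` is whatever object keeps the htmlfile reference, and `scope` is assumed to be the window. `CollectGarbage()` exists only in IE's JScript, so the helper checks for it before calling.

```javascript
// Hypothetical helper; `holder` keeps the htmlfile reference (for example
// Orbited or Meteor) and `scope` is normally `window`.
function releaseTransferDoc(holder, scope) {
  holder.transferDoc = null; // drop the reference we hold
  // Assumption: CollectGarbage is reachable as a property of `scope` in IE.
  if (scope && scope.CollectGarbage) {
    scope.CollectGarbage(); // force collection so the connection closes now
    return true;
  }
  return false; // not IE: nothing to force
}
```

It would be hooked up with something like `window.onunload = function () { releaseTransferDoc(Orbited, window); };`. This only helps if no closure elsewhere still captures transferDoc, as described above.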

I'll probably write this up as an article in the coming week or two, so email me any other insight so we come up with the best solution.


Michael Carter

Nov 9, 2007, 3:46:36 AM11/9/07
to orbite...@googlegroups.com
I missed this email before my last reply so I'll address it in the context of my previous email

On Nov 9, 2007 12:27 AM, trib <andrew...@gmail.com > wrote:

> The reason has to do with the way that Internet Explorer handles garbage
> collection. For some reason, creating a setInterval on the parent window
> will create a reference to the htmlfile. But putting that htmlfile object in
> the parent dom doesn't create that reference. Without a reference to the
> htmlfile, the garbage collection process will eventually just delete the
> page, closing the connection right in the middle of javascript execution.

This sounds great as a hypothesis, but I can't see where I'm creating
a reference to the HTMLFile - this is where I create it:

this.transferDoc.open();
this.transferDoc.write("<html><script>");
this.transferDoc.write("document.domain=\""+( document.domain)+"\";");

this.transferDoc.write("</"+"script></html>");
this.transferDoc.parentWindow.Meteor = Meteor;
this.transferDoc.close();

So other than a difference in syntax, this is basically the way
Orbited does it, except I have no setInterval - the events are passed
up from the iframe in the HTMLFile by the iframe calling a function in
the parent via a bridge (the process() function in the parent is
mapped to p() in the iframe):

ifr.p = this.instances[instid].process.bind(this.instances[instid]);

So shouldn't this setup be symptomatic of the manipulation limit?
 
My last email explained this. You keep a reference directly to the transferDoc so you're okay.

>
> I believe that the reason the connections linger, as per Andrew's problem,
> is that even after the page navigation happens, the htmlfile isn't actually
> garbage collected for a while, at least until it execeeds the limit that was
> causing my initial problem. None of the dom manipulations count against the
> garbage collection limit until after the page navigation occurs, and then
> the countdown starts from ~ 50 dom manipulations.

To clarify, I can navigate to another page and then get over a hundred
events on the orphaned connection.  I've only ever seen a server
disconnect at this point - because Meteor server has a connection time
limit, and disconnects automatically after a number of minutes to
mitigate exactly this problem.  I'll set the timeout really high and
see if the client ever disconnects.

My best guess then is that you have some anonymous function creating a closure over the reference to transferDoc. Possibly you are holding on to the reference in some other way. I honestly don't know enough about how IE handles references to be able to tell what exactly could be going on behind the scenes. Let's see what effect explicit garbage collection has.
 

> I've been having trouble replicating your
> problem with my experiments using orbited (probably b/c orbited explicitly
> closes the old connection in most use-cases)

How do you explicitly close the connection on a page navigation?  You
don't seem to be hooked into window.onunload, which is the only way
I've managed to get around this so far.  Or does the server kill a
connection if a new one connects from the same client?
 
When a user connects a second time to Orbited with the same connection key the server kills the old one, at least for the streaming transports. This still lends itself to the problem of navigating to another Orbited location (meaning a different connection key) and it not killing the previous connection.
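
To make the behaviour concrete, here is a sketch of a registry that closes the old connection when the same key connects again. It is not Orbited's server code: `makeConnectionRegistry` and `register` are made-up names, and a connection is assumed to be any object with a `close()` method.

```javascript
// Hypothetical server-side sketch; not Orbited's actual implementation.
function makeConnectionRegistry() {
  var byKey = {};
  var hasOwn = Object.prototype.hasOwnProperty;

  return {
    // Store `connection` under `key`, closing any previous connection
    // that was registered with the same key. Returns the old one or null.
    register: function (key, connection) {
      var old = hasOwn.call(byKey, key) ? byKey[key] : null;
      if (old && old !== connection) {
        old.close(); // kill the stale streaming connection
      }
      byKey[key] = connection;
      return old;
    }
  };
}
```

A connection registered under a different key is left alone, which is exactly the remaining case described above: navigating to another Orbited location does not kill the previous connection.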

trib

Nov 9, 2007, 5:53:19 AM11/9/07
to Orbited Discussion

> My last email explained this. You keep a reference directly to the
> transferDoc so you're okay.

All suddenly becomes clear. Something told me the setInterval
solution wasn't quite addressing the root of the problem.
