Great! We could definitely use some profiling and optimization of
Orbited's bottlenecks. So far the emphasis has been on making it work
properly, so there is surely quite a bit of low-hanging fruit, like
that example you found. We've also just started building up a test
suite, to make sure that all the edge cases work as they should, and
we don't accidentally introduce regressions. Both of these (tests and
optimization) will hopefully continue to improve as Orbited matures,
given that the concept and API of Orbited should be fairly stable post
0.5.
> My app uses a substantial amount of socket data, and I'm very
> impressed with how well Orbited handles it. I suspect that my app is
> testing performance limits much more than most, and I will gladly
> contribute more javascript optimizations like this one if you guys are
> able to accept community contributions easily right now.
Glad to hear it! We definitely want to accept your contributions. I
think there's a CLA somewhere (I'll let Michael answer about that),
and we'd be glad to give you commit access to the repository if you
want to contribute more than a couple of patches.
If you don't mind my asking, what kind of substantial amount of socket
data are you sending around?
Cheers,
Jacob
Michael Carter wrote:
> I came across this Unicode encoder/decoder:
> http://www.webtoolkit.info/javascript-utf8.html -- I have two
> outstanding questions about it though.
>
> 1) is it faster / how much faster than the current version
> 2) What the heck is the license?
>
> I think it may be significantly faster than our current version, but I
> don't know.
I've added a simple unit test (it might be flawed, so please do check!)
at http://orbited.org/changeset/523. It uses native browser functions
(unescape, encodeURIComponent, etc.; I remembered this from [1]) to
implement encode/decode, and it seems to be significantly faster than
what we currently have, e.g. for encoding:
ff3: 21 vs 4 ; 28 vs 3 (500 byte ascii string); 55 vs 7 (1000 byte ascii
string).
sf3: 27 vs 10; 29 vs 8 (500 byte ascii string); 57 vs 15 (1000 byte
ascii string).
ie7: 123 vs 17; 1421 vs 11 (500 byte ascii string); timeout vs 21 (1000
byte ascii string).
We can directly use the encode function, but not the decode (orbited
needs to handle partial utf-8 strings when using a non-binary socket).
Feel free to improve those tests and send patches!
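
For anyone who has not seen the trick from [1], here is a minimal sketch
of the native-function approach. The names utf8Encode and utf8Decode are
only illustrative and are not necessarily what the changeset uses. A
"byte string" here means a JavaScript string holding one character per
UTF-8 byte, each with a char code between 0 and 255.

```javascript
// Encode a JavaScript string into a byte string: one character per
// UTF-8 byte, each with a char code in the range 0-255.
// (utf8Encode is a hypothetical name used for illustration.)
function utf8Encode(text) {
  // encodeURIComponent writes non-ASCII characters as UTF-8 %XX escapes;
  // unescape then turns every %XX into a single character with that code.
  return unescape(encodeURIComponent(text));
}

// Decode a COMPLETE byte string back into a JavaScript string.
// Throws URIError when the bytes are not well-formed UTF-8, which
// includes a multi-byte sequence that has been cut short.
// (utf8Decode is a hypothetical name used for illustration.)
function utf8Decode(bytes) {
  // escape rewrites each byte as %XX (leaving some ASCII untouched);
  // decodeURIComponent then reads the escapes back as UTF-8.
  return decodeURIComponent(escape(bytes));
}
```

Note that utf8Decode throws on a truncated sequence instead of telling
the caller how many bytes are missing, which is why it cannot be used
as-is on a stream that is read in pieces.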
Best regards,
Rui Lopes
[1]
http://ecmanaut.blogspot.com/2006/07/encoding-decoding-utf8-in-javascript.html
http://js.io/trac/browser/trunk/protocols/stomp/stomp.js?rev=31#L56
Maybe you can help improve it?
> It would be great for performance if I didn't have to do that part in
> javascript, but instead had a config option in the daemon to auto-
> frame the data based on the given delimiter. That way Orbited gives
> the javascript a valid data set every time.
It could be done in the daemon, but I'm not sure about the performance
improvement. If it has some impact on the server, I'd rather do it on
the client, because these days the normal desktop client is quite
powerful (and there are more clients than servers).
Feel free to come up with profiling proof and patches! :-)
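
To make the client-side option concrete, here is a rough sketch of
delimiter framing done in JavaScript. DelimiterFramer, feed and onFrame
are hypothetical names, not part of the Orbited API; the idea is only to
buffer incoming data and hand complete frames to a callback.

```javascript
// Collects incoming data and calls onFrame once per complete frame.
// (DelimiterFramer is a hypothetical helper, not an Orbited class.)
function DelimiterFramer(delimiter, onFrame) {
  this.delimiter = delimiter; // e.g. "\n"
  this.onFrame = onFrame;     // called with each frame, delimiter removed
  this.buffer = "";           // data received after the last delimiter
}

// Feed one chunk of received data, e.g. from a socket read callback.
DelimiterFramer.prototype.feed = function (data) {
  this.buffer += data;
  var index;
  // Emit every frame that is now complete; keep the remainder buffered.
  while ((index = this.buffer.indexOf(this.delimiter)) !== -1) {
    var frame = this.buffer.slice(0, index);
    this.buffer = this.buffer.slice(index + this.delimiter.length);
    this.onFrame(frame);
  }
};
```

Anything after the last delimiter stays in the buffer until a later
chunk completes it, so the callback only ever sees whole frames.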
> Would that also allow use
> of the browser's decode, or did I misunderstand the context of your
> partial string problem?
>
You understood it correctly, and that would work.
Just to make sure: the problem we want to avoid arises when a single
Unicode code point is encoded as more than one byte. That is not
necessarily the same problem as splitting the input at line boundaries,
because a single line might arrive over more than one call to onread,
with the first bytes of a code point in one onread and the rest in
later ones (using the browser decode, we don't seem to be able to
detect these cases).
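
One possible way around this, sketched below, is to hold back the bytes
of an unfinished code point and only pass complete sequences to the
browser decode. This is an illustration under assumptions, not what
Orbited currently does: PartialUtf8Decoder, feed and
incompleteTailLength are made-up names, and each chunk is assumed to be
a byte string (one character per byte, char codes 0-255).

```javascript
// Return how many trailing bytes of `bytes` belong to a UTF-8 sequence
// that has started but is not yet complete (0 if the end is complete).
function incompleteTailLength(bytes) {
  // A UTF-8 sequence is at most 4 bytes long, so its lead byte must be
  // among the last 4 bytes if the sequence is still open.
  for (var back = 1; back <= 4 && back <= bytes.length; back++) {
    var code = bytes.charCodeAt(bytes.length - back);
    if ((code & 0xC0) === 0x80) continue; // continuation byte, look further back
    // Found an ASCII or lead byte; work out how long its sequence should be.
    var needed = code >= 0xF0 ? 4 : code >= 0xE0 ? 3 : code >= 0xC0 ? 2 : 1;
    return needed > back ? back : 0;
  }
  return 0; // no lead byte found; let the decode step report bad input
}

// Stateful decoder: keeps an unfinished sequence until the next chunk.
// (PartialUtf8Decoder is a hypothetical helper, not an Orbited class.)
function PartialUtf8Decoder() {
  this.pending = ""; // bytes held back from earlier chunks
}

// Feed one chunk of bytes; returns the text that could be decoded so far.
PartialUtf8Decoder.prototype.feed = function (chunk) {
  var bytes = this.pending + chunk;
  var tail = incompleteTailLength(bytes);
  this.pending = bytes.slice(bytes.length - tail);
  var complete = bytes.slice(0, bytes.length - tail);
  // Only complete sequences reach the native decode, so a split code
  // point no longer makes it throw. Malformed input still throws URIError.
  return decodeURIComponent(escape(complete));
};
```

With one decoder per connection, each onread would call feed with the
new data and use whatever text comes back, which may be an empty string
when the chunk ends in the middle of a code point.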
Best regards,
Rui Lopes