[binary] First proposition


ondras

Feb 5, 2009, 3:41:57 AM
to serverjs
Hi,

I created a first draft of my proposed Binary object -
https://wiki.mozilla.org/ServerJS/Binary .

Please have a look at the interface and/or implementation and let me
know what is missing and what you propose to change. I hope this
will be useful when performing file reading/writing as well as in other
scenarios.


Ondrej

Ionut Gabriel Stan

Feb 5, 2009, 4:12:00 AM
to serv...@googlegroups.com
Hi guys,

I haven't introduced myself in the introductions thread as I didn't know
if I'd participate in discussions. Anyway, I was thinking we can get
inspiration from several implementations regarding binary handling of
data. The implementations I'm talking about are:

1. Adobe AIR's ByteArray [1]
2. Mozilla's nsIBinaryInputStream [2]
3. Mozilla's nsIBinaryOutputStream [3]

One may notice that the three implementations resemble each other closely, so I
think an SSJS implementation should follow the same line.

And by the way, it would be nice to drop naming conventions like
getLength()/setLength() as both SpiderMonkey and V8 (AFAIK) provide
getter/setter structures.
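
A minimal sketch of the difference, assuming a hypothetical `Binary` wrapper (the constructor and the `_bytes` field are illustrative and not taken from the wiki draft). It uses ES5's `Object.defineProperty`, which postdates this thread; a native module would define the accessor through the engine's own API instead:

```javascript
// Hypothetical Binary wrapper; the name and constructor are illustrative only.
function Binary(bytes) {
  this._bytes = bytes.slice(); // private copy of the byte values
}

// Expose the size as a read-only property instead of a getLength() method.
Object.defineProperty(Binary.prototype, "length", {
  get: function () { return this._bytes.length; },
  enumerable: true
});

var data = new Binary([104, 105]);
console.log(data.length); // 2
```

Callers then read `data.length` exactly as they would on a String or an Array.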


Cheers,
Ionut

[1] http://help.adobe.com/en_US/AIR/1.1/jslr/index.html
http://help.adobe.com/en_US/AIR/1.1/jslr/flash/utils/ByteArray.html
[2] https://developer.mozilla.org/En/NsIBinaryInputStream
[3] https://developer.mozilla.org/En/NsIBinaryOutputStream

Ondrej Zara

Feb 5, 2009, 4:31:54 AM
to serv...@googlegroups.com
> 1. Adobe AIR's ByteArray [1]
> 2. Mozilla's nsIBinaryInputStream [2]
> 3. Mozilla's nsIBinaryOutputStream [3]
>
> One may notice that the three implementations resemble pretty much so I
> think maybe a SSJS implementation should follow the same line.
>

I am not sure whether these implementations follow the same goal as I
do: they are stream oriented, bound (possibly) to some underlying
"data provider". My structure, on the other hand, is just a storage
container: simple wrapper around an array of bytes.

> And by the way, it would be nice to drop naming conventions like
> getLength()/setLength() as both SpiderMonkey and V8 (AFAIK) provide
> getter/setter structures.
>

Good idea - does anybody know what the situation is with our other
targets (Rhino, JSCore)?


Ondrej

Tom Robinson

Feb 5, 2009, 5:25:44 AM
to serv...@googlegroups.com

On Feb 5, 2009, at 1:31 AM, Ondrej Zara wrote:

>> And by the way, it would be nice to drop naming conventions like
>> getLength()/setLength() as both SpiderMonkey and V8 (AFAIK) provide
>> getter/setter structures.
>
> Good idea - does anybody know what is the situation with our other
> targets (Rhino, JSCore) ?

As long as you're on the "native" side, then property getter/setters should be possible in all the engines, right? I imagine the binary / IO stream stuff will be mostly native modules.

-Tom

frank

Feb 5, 2009, 7:10:03 AM
to serverjs
So that developers can reuse skills they already have, IMO, I second AIR's way.

Davey Waterson

Feb 5, 2009, 8:05:05 AM
to serverjs
Rhino/SpiderMonkey have getters/setters; I think it's doable on all the
engines, and it makes for a much cleaner API.

I think the cleanliness is so much better.

I like the basic API as proposed. I'd likely add SHA256/SHA384/SHA512
and CRC32 and an optional mimetype property, so that something that
gets passed the binary object can deal with it more easily. Along
the same lines, I think some kind of toJSON might be cool, where the
mimetype and the base64 data are wrapped up in JSON.
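
One possible shape for such a `toJSON`, as a rough sketch only. The property names `mimetype` and `data`, the default MIME type, and the use of Node.js's `Buffer` for the Base64 step are all assumptions made for illustration:

```javascript
// Hypothetical shape; the property names "mimetype" and "data" are assumptions.
function Binary(bytes, mimetype) {
  this.bytes = Uint8Array.from(bytes);
  this.mimetype = mimetype || "application/octet-stream";
}

// JSON.stringify() calls toJSON() automatically when it is present.
Binary.prototype.toJSON = function () {
  return {
    mimetype: this.mimetype,
    // Node.js Buffer is assumed to be available for the Base64 conversion.
    data: Buffer.from(this.bytes).toString("base64")
  };
};

var png = new Binary([0x89, 0x50, 0x4e, 0x47], "image/png");
console.log(JSON.stringify(png)); // {"mimetype":"image/png","data":"iVBORw=="}
```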

Robert Schultz

Feb 5, 2009, 8:52:37 AM
to serverjs
I think that the base64, sha1 and md5 functions should all be removed
and made into their own classes.

I'd rather have a Base64 class with .encode() and .decode()
You could pass a String or a Binary data type to .encode() and it
would do its thing.

If I have a String in the future that has base64 data or I want to md5
or sha it, it seems to make more sense that I can just Base64.decode()
it rather than having to go through a Binary object.

Plus this will allow us to have more advanced options for base64, sha1
and md5 encoding/decoding in the future if it's in its own class.


I guess you could keep the base64, sha1 and md5 functions here as
'shortcuts' but they should certainly just call out to the Base64,
SHA1 and MD5 classes themselves.
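
A sketch of that layering, under stated assumptions: the `Base64` object, the array-of-numbers byte representation, and the reliance on Node.js's `Buffer` are all hypothetical choices, not part of the proposal. The point is only that the shortcut on `Binary` contains no encoding logic of its own:

```javascript
// Stand-alone codec; every name here is hypothetical.
var Base64 = {
  // Array of numbers 0-255 -> Base64 text.
  encode: function (bytes) {
    return Buffer.from(bytes).toString("base64");
  },
  // Base64 text -> array of numbers 0-255.
  decode: function (text) {
    return Array.from(Buffer.from(text, "base64"));
  }
};

function Binary(bytes) {
  this.bytes = bytes.slice();
}

// Shortcut that only delegates to the codec.
Binary.prototype.base64encode = function () {
  return Base64.encode(this.bytes);
};

var hello = new Binary([104, 101, 108, 108, 111]); // "hello" in ASCII
console.log(hello.base64encode()); // aGVsbG8=
console.log(Base64.decode("aGVsbG8=")); // [ 104, 101, 108, 108, 111 ]
```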

What do you think?

Tom Robinson

Feb 5, 2009, 8:57:36 AM
to serv...@googlegroups.com

I agree, though you should also be able to pass in streams (namely
Files) so you can hash large files without loading the whole thing
into memory. That might be why Davey suggested they be on the File
object (?)
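
To illustrate why stream input matters, here is a small sketch of incremental hashing. It uses Node.js's built-in `crypto` module purely as a stand-in (nothing in this thread specifies a hashing API), and `hashChunks` is a hypothetical helper. Feeding the data in pieces gives the same digest as feeding it all at once, so a file never has to be held in memory whole:

```javascript
var crypto = require("crypto"); // Node.js built-in module, used as a stand-in

// Feed chunks to the hash one at a time, so memory use is bounded by chunk size.
function hashChunks(chunks, algorithm) {
  var hash = crypto.createHash(algorithm);
  chunks.forEach(function (chunk) { hash.update(chunk); });
  return hash.digest("hex");
}

var whole = hashChunks([Buffer.from("hello world")], "sha1");
var pieces = hashChunks([Buffer.from("hello "), Buffer.from("world")], "sha1");
console.log(whole === pieces); // true
```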

-Tom

Robert Schultz

Feb 5, 2009, 9:04:11 AM
to serverjs
On Feb 5, 8:57 am, Tom Robinson <tlrobin...@gmail.com> wrote:
> I agree, though you should also be able to pass in streams (namely  
> Files) so you can hash large files without loading the whole thing  
> into memory. That might be why Davey suggested they be on the File  
> object (?)
>
> -Tom

I think it would be more convenient if Base64.encode would accept
String, File or Binary and handle it appropriately.
I'd also say that having File.base64encode() as a 'shortcut' to the
Base64.encode method also sounds like a great idea.

If both were available (with the actual Base64 encoding code living
in the Base64 class, of course), then the developer's life using these
APIs is made a lot easier.

Davey Waterson

Feb 5, 2009, 9:04:36 AM
to serverjs
indeed, i've just been adding them to the Jaxer File api.

but the idea of an encodings module works for me, though it would make
Binary depend on that module in order to get the toJSON method working

something like

var jsonContent = myBinaryObject.toJSON('MD5');

but the potential worry is Binary depends on Encodings which depends
on File, that may be fine, although i'd like to avoid the situation
where we get import explosions

Davey Waterson

Feb 5, 2009, 9:05:47 AM
to serverjs
+1 on multiple encoding modules

Ionut Gabriel Stan

Feb 5, 2009, 9:11:01 AM
to serv...@googlegroups.com


On 2/5/2009 11:31, Ondrej Zara wrote:
>> 1. Adobe AIR's ByteArray [1]
>> 2. Mozilla's nsIBinaryInputStream [2]
>> 3. Mozilla's nsIBinaryOutputStream [3]
>>
>> One may notice that the three implementations resemble pretty much so I
>> think maybe a SSJS implementation should follow the same line.
>>
>
> I am not sure whether these implementations follow the same goal as I
> do: they are stream oriented, bound (possibly) to some underlying
> "data provider". My structure, on the other hand, is just a storage
> container: simple wrapper around an array of bytes.

I know what you mean, but the AIR's ByteArray object is exactly what you
said you want, an array of bytes. What I like about it, thus my
suggestion, is that it has separate methods for reading/writing 16 or
32-bit integers or floats. Mozilla's implementation does that too, and I
think I like their naming conventions more than AIR's.

Now, while we're at this I'd like to propose two other methods for those
of us who're not afraid of working with data at the byte level: a *pack/unpack*
pair of methods. Perl, PHP and Python already have this pair, so I
believe it would be nice to have them too.
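
A rough sketch of what such a pair could look like, assuming a tiny subset of the Perl/PHP format codes (`C`, `n`, `N`). The function signatures are hypothetical (Perl's `pack` takes a value list rather than an array), and the typed arrays and `DataView` used here did not exist when this thread was written:

```javascript
// Byte widths for a small subset of Perl/PHP pack codes:
// C = unsigned 8-bit, n = unsigned 16-bit big-endian, N = unsigned 32-bit big-endian.
var WIDTHS = { C: 1, n: 2, N: 4 };

function totalSize(format) {
  var size = 0;
  for (var i = 0; i < format.length; i++) {
    if (!WIDTHS[format[i]]) throw new Error("unsupported code: " + format[i]);
    size += WIDTHS[format[i]];
  }
  return size;
}

function pack(format, values) {
  var view = new DataView(new ArrayBuffer(totalSize(format)));
  var offset = 0;
  for (var i = 0; i < format.length; i++) {
    var code = format[i];
    if (code === "C") view.setUint8(offset, values[i]);
    else if (code === "n") view.setUint16(offset, values[i]); // DataView defaults to big-endian
    else view.setUint32(offset, values[i]);
    offset += WIDTHS[code];
  }
  return new Uint8Array(view.buffer);
}

function unpack(format, bytes) {
  if (bytes.length < totalSize(format)) throw new Error("not enough bytes");
  var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  var out = [];
  var offset = 0;
  for (var i = 0; i < format.length; i++) {
    var code = format[i];
    if (code === "C") out.push(view.getUint8(offset));
    else if (code === "n") out.push(view.getUint16(offset));
    else out.push(view.getUint32(offset));
    offset += WIDTHS[code];
  }
  return out;
}

var packed = pack("CnN", [1, 258, 16909060]);
console.log(Array.from(packed)); // [ 1, 1, 2, 1, 2, 3, 4 ]
console.log(unpack("CnN", packed)); // [ 1, 258, 16909060 ]
```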



>> And by the way, it would be nice to drop naming conventions like
>> getLength()/setLength() as both SpiderMonkey and V8 (AFAIK) provide
>> getter/setter structures.
>>
>
> Good idea - does anybody know what is the situation with our other
> targets (Rhino, JSCore) ?
>
>
> Ondrej
>
>
>> Cheers,
>> Ionut
>>
>> [1] http://help.adobe.com/en_US/AIR/1.1/jslr/index.html
>> http://help.adobe.com/en_US/AIR/1.1/jslr/flash/utils/ByteArray.html
>> [2] https://developer.mozilla.org/En/NsIBinaryInputStream
>> [3] https://developer.mozilla.org/En/NsIBinaryOutputStream
>>
>>
>> On 2/5/2009 10:41, ondras wrote:
>>> Hi,
>>>
>>> I created a first draft of my proposed Binary object -
>>> https://wiki.mozilla.org/ServerJS/Binary .
>>>
>>> Please have a look at the interface and/or implementation and let me
>>> know what is missing and what do you propose to change. I hope this
>>> will be useful when performin file reading/writing as well in other
>>> scenarios.
>>>
>>>
>>> Ondrej
>>>
>>>
>
> >
>

--
Ionut G. Stan
I'm under construction | http://igstan.blogspot.com/

Starr Horne

Feb 5, 2009, 9:13:24 AM
to serv...@googlegroups.com
> I think the cleanliness is so much better.

I like the cleanliness of automatic getters and setters as well. And they
should be fine for, say, networking code that will only be on the server.

But one of the main reasons we're all interested in SSJS is the possibility
of sharing code between the browser and the server.

If this binary object was implemented in javascript, why shouldn't it be usable
on both the client and server?

Starr

Kris Zyp

Feb 5, 2009, 9:13:44 AM
to serv...@googlegroups.com

Ondrej Zara wrote:
> [snip]


>> And by the way, it would be nice to drop naming conventions like
>> getLength()/setLength() as both SpiderMonkey and V8 (AFAIK) provide
>> getter/setter structures.
>>
>>
>
> Good idea - does anybody know what is the situation with our other
> targets (Rhino, JSCore) ?
>

All engines have getters and setters at the native level (including
Rhino, JSCore, and even JScript), they must in order to provide the host
objects in the browser (many of which rely on getter/setters). I would
really prefer to keep this more array-like and use the length property
and allow access to the bytes through numerical index access:
binary[3] // gets the 4th byte
binary.length // returns the number of bytes
binary[2] = 22 // puts 22 in the third byte slot

(NodeList could be a good model for this).
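
The behaviour described above can be imitated in script as a rough sketch. This is an assumption-laden illustration: `createBinary` is a hypothetical factory, and `Proxy` and `Uint8Array` are later additions to the language that were not available to the engines discussed here, where a native host object would intercept property access instead:

```javascript
// Hypothetical factory returning an array-like view over a fixed-size byte store.
function createBinary(size) {
  var bytes = new Uint8Array(size);
  var isIndex = function (key) {
    return typeof key === "string" && /^\d+$/.test(key);
  };
  return new Proxy({}, {
    get: function (target, key) {
      if (key === "length") return bytes.length;
      if (isIndex(key)) return bytes[Number(key)];
      return undefined;
    },
    set: function (target, key, value) {
      if (!isIndex(key) || Number(key) >= bytes.length) return false; // reject other writes
      bytes[Number(key)] = value & 0xff; // keep only the low 8 bits
      return true;
    }
  });
}

var binary = createBinary(4);
binary[2] = 22;              // puts 22 in the third byte slot
console.log(binary[2]);      // 22
console.log(binary[3]);      // 0
console.log(binary.length);  // 4
```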

Thanks,
Kris

Ondrej Zara

Feb 5, 2009, 9:27:56 AM
to serv...@googlegroups.com
Yes, this would be nice. I am not sure how to do this in V8 (or if it
actually is possible in V8) though :-(

O.

Ondrej Zara

Feb 5, 2009, 9:33:31 AM
to serv...@googlegroups.com
> I'd rather have a Base64 class with .encode() and .decode()
> You could pass a String or a Binary data type to .encode() and it
> would do it's thing.
>

I am not a huge fan of this. Please note that both Base64 and MD5/SHA1
are _not_ defined on "string"; these functions are defined for an
array of bytes. From the coder's view, there is no bijection between a
JS string and the corresponding byte structure: this conversion must
be performed under certain assumptions (for instance, that the string
can be decomposed into bytes using the UTF-8 standard).

What I am trying to say: it is not generally possible to compute a
unique hash for "žščřďťň" (letters very common in my language),
because there are numerous ways to represent these in binary. A more
semantic approach is to first convert a string to binary (at this
phase, an encoding should be supplied!) and then perform the hash
computation (on the binary object).
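
A small sketch of the point, with Node.js's `Buffer` assumed as a stand-in for the proposed Binary type. The same seven characters produce different byte sequences under UTF-8 and UTF-16LE, so any hash or Base64 encoding computed over those bytes would differ as well:

```javascript
// The seven characters from the post, written as escapes so the source file's
// own encoding cannot affect the result.
var text = "\u017E\u0161\u010D\u0159\u010F\u0165\u0148";

// Node.js Buffer is assumed here as a stand-in for the proposed Binary type.
var asUtf8 = Buffer.from(text, "utf8");
var asUtf16 = Buffer.from(text, "utf16le");

console.log(asUtf8.toString("hex"));
console.log(asUtf16.toString("hex"));
console.log(asUtf8.equals(asUtf16)); // false
```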


Ondrej

Davey Waterson

Feb 5, 2009, 9:41:41 AM
to serverjs
> But one of the main reasons we're all interested in SSJS is the possibility
> of sharing code between the browser and the server.

I don't think that's the case at all; it's about common server-side
APIs. Trying to do something like this and work with, say, IE5 would be
a complete nightmare. I think we are more concerned with having things
that work like 'privileged' JavaScript, e.g. FF plugins.

I agree some of the APIs may be client runnable though, but
sacrificing ourselves on the altar of browser compatibility is a
really horrible thought for me. Client use is a nice plus but largely
secondary.

> If this binary object was implemented in javascript, why shouldn't it be usable
> on both the client and server?

Basically, it can't be if the features we want need privileged access. That may or may
not be the case for Binary.

Kris Zyp

Feb 5, 2009, 9:46:52 AM
to serv...@googlegroups.com

They built a browser on it. Chrome uses getters and setters extensively
for its host objects.
Kris

Kris Zyp

Feb 5, 2009, 9:56:08 AM
to serv...@googlegroups.com
You can't just implement binary in JavaScript for the browser; the
browser environment doesn't expose any binary data (at least in the
sense we are discussing: data from files, and so forth). However, there
have been efforts to expose binary data in the browser. I am not sure if
this has been listed in the prior art, but it might be worth looking at:
http://dev.w3.org/2006/webapi/FileUpload/publish/FileUpload.xhtml#Blob-if
(Support for this has been dropped, so we don't need to follow it).
Kris


> Starr
>
> >
>

Davey Waterson

Feb 5, 2009, 10:07:49 AM
to serverjs
I think maybe there's some confusion between the Module package being
able to support loading modules client-side and the actual module
running on both the client and the server. I see module loading as a useful
pattern on both sides, so a common API is great.

var $ = require('jquery').jQuery

would be neat (and would actually work in Jaxer as we have the real
DOM, but we're an odd one from that perspective)

for me it's

+1 module support on client
-1 server modules lobotomized to run on the client

In Jaxer we have a Jaxer.isOnServer property that can be checked and
behavior adjusted accordingly; this lets us have code that doesn't
depend on the server where needed for sharing client-side. I usually
hate it when I have to deal with that, as it drops me back into the
multi-browser lowest-common-denominator syntax and abilities.

I hope that great features serverside would eventually be avail on the
client, but it's a security nightmare if nothing else, and I can't see
browser vendors racing to give us file system or socket access. (HTML5
provides some of this of course)

Ondrej Zara

Feb 5, 2009, 10:14:34 AM
to serv...@googlegroups.com
>>> All engines have getters and setters at the native level (including
>>> Rhino, JSCore, and even JScript), they must in order to provide the host
>>> objects in the browser (many of which rely on getter/setters). I would
>>> really prefer to keep this more array-like and use the length property
>>> and allow access to the bytes through numerical index access:
>>> binary[3] // gets the 4th byte
>>> binary.length // returns the number of bytes
>>> binary[2] = 22 // puts 22 in the third byte slot
>>>
>>>
>>
>> Yes, this would be nice. I am not sure how to do this in V8 (or if it
>> actually is possible in V8) though :-(
>>
>>
> They built a browser on it. Chrome uses getters and setters extensively
> for it's host objects.

Well, I _know_ that getters and setters are possible. I was thinking
about an old issue of "subclassing an array" (with all its bells &
whistles), because the Binary datatype essentially *is* an array (with
some additions). This is something I am uncertain of.


O.


> Kris
>
> >
>

Ionut Gabriel Stan

Feb 5, 2009, 10:14:04 AM
to serv...@googlegroups.com
Actually, starting with FF 3 we're able to read local files by means of
the file input element. There's also an active W3C draft[1] regarding
this, which is edited by a Mozilla employee.

On my blog (link in signature) there's an exhaustive tutorial about AJAX
file uploads in FF3 (<- shameless plug :D). The binary methods are
File.getAsBinary() and XMLHttpRequest.sendAsBinary(). Anyway, I do NOT
advocate for supporting a certain API in the browsers. All I want is JS
on the server.

[1] http://dev.w3.org/2006/webapi/FileUpload/publish/FileUpload.html

Starr Horne

Feb 5, 2009, 10:16:46 AM
to serv...@googlegroups.com
> > But one of the main reasons we're all interested in SSJS is the possibility
> > of sharing code between the browser and the server.
>
> I dont think that's the case at all, it's about common serverside

Ok, let me restate that. This is the reason that I'm interested in SSJS. I think
that to make JS a really great tool for creating rich applications, we need
server-level tools and a standard library.

And granted, privileged code will never run in the browser. And as Kris mentioned,
maybe you can't efficiently implement certain parts of the library in javascript.

I just wanted to bring up the subject now because - even if we're only making
certain pieces of the library available to the browser - we're going to need to work
that out ahead of time.

And nobody said anything about IE5. As long as you're not doing DOM stuff, it's
not a big deal to make JS code work across all modern browsers.

SH

Kris Zyp

Feb 5, 2009, 10:24:13 AM
to serv...@googlegroups.com

Ionut Gabriel Stan wrote:
> Actually, starting with FF 3 we're able to read local files by means of
> the file input element. There's also an active W3C draft[1] regarding
> this, which is edited by a Mozilla employee.
>
> On my blog (link in signature) there's an exhaustive tutorial about AJAX
> file uploads in FF3 (<- shameless plug :D). The binary methods are
> File.getAsBinary() and XMLHttpRequest.sendAsBinary(). Anyway, I do NOT
> advocate for supporting a certain API in the browsers. All I want is JS
> on the server.
>
> [1] http://dev.w3.org/2006/webapi/FileUpload/publish/FileUpload.html
>

That's the same link I shared (just in a different format).
getAsBinary() returns the binary data as a hex-encoded string. I think
the point of this whole exercise was to provide something better than
strings, dealing with binary data through hex-encoded strings is pretty
clunky. Blob was an attempt at that (in your/our link).
Kris

Wes Garland

Feb 5, 2009, 10:29:51 AM
to serv...@googlegroups.com
> Well, I _know_ that getters and setters are possible. I was thinking
> about an old issue of "subclassing an array" (with all its bells &
> whistles), because the Binary datatype essentialy *is* an array (with
> some additions). This is something I am uncertain of.

I don't think you want to subclass an array to make this happen. JS
arrays are too "heavy" to represent binary data streams.

You're much better off malloc'ing a bunch of RAM, hooking the JS resolver
for properties on your byte array (IF they are to be used singly), and
de-referencing and returning as a JS String on demand. Recall
myBytes[4] is roughly equivalent to myBytes.4.

But most of the time you won't be looking at individual bytes. So
you're going to want a meaningful .toString method. And of course,
whatever File object gets spec'd out will want to know about this
class.

I don't like things like getLength / setLength -- those should be
length getters and setters. Also, how is setting the length
meaningful? What if you make it bigger?

Adding String.toBinary is not really necessary IMHO. It won't get
called during type promotion, so why not use a cleaner syntax like

var myBytes = new Binary("hello");

I definitely don't think base64 encoding/decoding belongs here. That
should be in another class that understands String and Binary. md5
and sha should be in a crypto lib.

Binary.push, pop, shift and unshift are probably not useful enough to
standardize on now, although I suppose they wouldn't hurt. If we're
doing those, we should also do slice and splice.

getByte, setByte should be done with array brackets, I think. If not,
they should be "byteAt()" to mirror String.charAt(). Note that since
strings are immutable, there is no String.setByteAt() either.
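
A minimal sketch of the constructor-plus-`byteAt()` shape suggested here. The choice of UTF-8 for turning the string into bytes is an assumption (the post does not name an encoding), and `TextEncoder`/`TextDecoder` are modern stand-ins for whatever a native implementation would use:

```javascript
// Hypothetical constructor: accepts a string and stores its bytes.
function Binary(text) {
  this._bytes = new TextEncoder().encode(text); // UTF-8 is an assumption
}

// Mirrors String.prototype.charAt(): read-only access by position.
Binary.prototype.byteAt = function (index) {
  return this._bytes[index];
};

// A meaningful toString(), decoding with the same assumed encoding.
Binary.prototype.toString = function () {
  return new TextDecoder("utf-8").decode(this._bytes);
};

var myBytes = new Binary("hello");
console.log(myBytes.byteAt(0)); // 104
console.log(String(myBytes));   // hello
```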

Wes
--
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

Ondrej Zara

Feb 5, 2009, 10:41:13 AM
to serv...@googlegroups.com
> I don't like things like getLength / setLength -- those should be
> length getters and setters. Also, how is setting the length
> meaningful? What if you make it bigger?
>

I have not proposed any setLength(). Not sure how this got into the
thread; I can assure you that nothing like this is in my proposal
(which I encourage you to look at prior to discussing it .) ).

> Adding String.toBinary is not really necessary IMHO. It won't get
> called during type promotion, so why not use a cleaner syntax like
>
> var myBytes = new Binary("hello");
>
> I definitely don't think base64 encoding/decoding belongs here. That
> should be in another class that understands String and Binary. md5
> and sha should be in a crypto lib.
>

I have already expressed my negative feeling about this a few posts
before: http://groups.google.com/group/serverjs/msg/0243f943fffb542e


Ondrej

Evandro Myller

Feb 5, 2009, 11:20:16 AM
to serv...@googlegroups.com
I really don't like the Base64's .encode/.decode methods. They're not ugly, but they depart from the language's patterns.
I mean, currently, every object has something like a .toString and .toSource. So, I'd expect methods like .toBase64, etc.
If the String constructor had some method like .convert, for example, Base64.encode would make sense, IMHO.

cheers
--
E. Myller ( www.emyller.net )

mob

Feb 5, 2009, 11:53:31 AM
to serverjs

>
> Good idea - does anybody know what is the situation with our other
> targets (Rhino, JSCore) ?
>

Here is the Ejscript ByteArray. Similar to the Adobe approach:

http://www.ejscript.org/products/ejs/doc/api/gen/ejscript/index.html

Michael

mob

Feb 5, 2009, 11:55:29 AM
to serverjs
> I agree, though you should also be able to pass in streams (namely  
> Files) so you can hash large files without loading the whole thing  
> into memory. That might be why Davey suggested they be on the File  
> object (?)
>

Agree. We found that to do File and other APIs right, we definitely
needed a byte array.
We started off simple, but we kept finding we needed a Streams-based
approach, and adding that to the ByteArray simplified a lot of other things.

Michael

mob

Feb 5, 2009, 11:57:09 AM
to serverjs
> All engines have getters and setters at the native level (including
> Rhino, JSCore, and even JScript), they must in order to provide the host
> objects in the browser (many of which rely on getter/setters). I would
> really prefer to keep this more array-like and use the length property
> and allow access to the bytes through numerical index access:
> binary[3] // gets the 4th byte
> binary.length // returns the number of bytes
> binary[2] = 22  // puts 22 in the third byte slot

That is nice. Ejscript does that and provides Streams read/write into
and out of the byte array.

Michael

Scott Christopher

Feb 5, 2009, 6:07:35 PM
to serv...@googlegroups.com
ondras wrote:
> I created a first draft of my proposed Binary object -
> https://wiki.mozilla.org/ServerJS/Binary .
>
> Please have a look at the interface and/or implementation and let me
> know what is missing and what do you propose to change. I hope this
> will be useful when performin file reading/writing as well in other
> scenarios.


/**
* Decodes Encodes the data in Base64
* @returns {string}
*/
Binary.prototype.base64encode = function() {};

---

Is the above supposed to be as below (or not exist at all)?

/**
* Decodes the data in Base64
* @returns {Binary}
*/
Binary.prototype.base64decode = function() {};


Regards,
Scott Christopher

Ondrej Zara

Feb 6, 2009, 1:54:51 AM
to serv...@googlegroups.com

Possibly. If "base64encode" converts Binary to String, it seems
logical that the inverse function converts String to Binary. Specifically,
that

var a = "anyString";
a.toBinary().base64encode().base64decode().toString() == a

:-)

On the other hand, one can add base64decode to binary as well.

O.

Mike Samuel

Feb 6, 2009, 2:36:27 PM
to serverjs


On Feb 5, 2:25 am, Tom Robinson <tlrobin...@gmail.com> wrote:
> On Feb 5, 2009, at 1:31 AM, Ondrej Zara wrote:
>
>
>
> >> And by the way, it would be nice to drop naming conventions like
> >> getLength()/setLength() as both SpiderMonkey and V8 (AFAIK) provide
> >> getter/setter structures.
>
> > Good idea - does anybody know what is the situation with our other
> > targets (Rhino, JSCore) ?

Rhino allows definition of getters and setters using the contextually
reserved get and set keywords, as in ({ get n() { return 42; } }.n). I
don't recall whether Rhino supports __defineGetter__/__defineSetter__.


> As long as you're on the "native" side, then property getter/setters
> should be possible in all the engines, right? I imagine the binary /
> IO stream stuff will be mostly native modules.
>
> -Tom

Ross Boucher

Feb 7, 2009, 11:16:42 AM
to serv...@googlegroups.com
On Feb 5, 2009, at 6:52 AM, Robert Schultz wrote:

>
> I think that the base64, sha1 and md5 functions should all be removed
> and made into their own classes.


>
> I'd rather have a Base64 class with .encode() and .decode()
> You could pass a String or a Binary data type to .encode() and it
> would do it's thing.
>

> If I have a String in the future that has base64 data or I want to md5
> or sha it, it seems to make more sense that I can just Base64.decode()
> it rather than having to go through a Binary object.

I dislike an API that takes "whatever" for a type and then converts on
the fly. It's messy, because you can't read through your code and
immediately understand what kind of data you are working with. It also
leads to messy, convoluted implementations.

I'd rather see encode() and encodeBinary() or something to that
effect. I'm OK with those being properties of a Base64 Object though.
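
A sketch of that split, with each entry point accepting exactly one input type. Only the names `encode()` and `encodeBinary()` come from the suggestion above; the type checks, the use of `Uint8Array` for raw bytes, the UTF-8 assumption for strings, and Node.js's `Buffer` are illustrative choices:

```javascript
var Base64 = {
  // Accepts only strings; the text is converted to UTF-8 bytes first (an assumption).
  encode: function (text) {
    if (typeof text !== "string") throw new TypeError("encode() expects a string");
    return Buffer.from(text, "utf8").toString("base64");
  },
  // Accepts only raw bytes.
  encodeBinary: function (bytes) {
    if (!(bytes instanceof Uint8Array)) throw new TypeError("encodeBinary() expects a Uint8Array");
    return Buffer.from(bytes).toString("base64");
  }
};

console.log(Base64.encode("hi"));                             // aGk=
console.log(Base64.encodeBinary(new Uint8Array([104, 105]))); // aGk=
```

With this shape the call site itself says which kind of data is being handled, and passing the wrong type fails loudly instead of being converted silently.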
