--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
I've been hoping someone would be willing to do that. Go for it!
Kris
blob.byteAt(index);
A) Returns a blob of length=1
B) Returns a Number (byte) representing the byte.
Note that either way blob.valueAt(index) will still return a blob of
length=1. B) deviates from the pattern I had in mind, where
str.valueAt(idx) == str.charAt(idx) and thus blob.byteAt(idx) ==
blob.valueAt(idx), but it doesn't break abstract code, since abstract
code uses .valueAt. Note that B) basically means that
blob.byteAt(idx) == blob.integerAt(idx).
Buffer in what module:
A) Buffer inside require('binary');
B) Buffer inside require('io');
Ashb noted Buffer might fit better in an io module than a buffer module.
Also note that I thought a bit more about flexibility than all-out
strict portability. I have two or three points inside that proposal that
accept that implementations may implement something extra, or differently
from other implementations, as long as the spec also specifies a method
of doing the same thing portably.
Namely:
* Implementations may make Blob and Buffer globals, as long as
require('binary') always works for portability.
* Implementations may optionally support blob[idx]; i.e. ideally, if your
js interpreter supports the non-ECMA str[idx], you should support
blob[idx] as well.
Should I pull that kind of stuff out?
Also ashb says buf.clear(off, len); doesn't completely make sense,
should I replace that with something else... buf.fill(off, len, seq); ?
* the commitment to genericity, the notion that Blob and String can be
interchangeable for some algorithms.
* to that end, I like the idea of making .valueAt the generic of
.charAt and .byteAt.
* generally, cutting back on features. a lot of primitive operations
can be written in terms of just memcopy.
* that Binary/C and Binary/B are very similar in spirit. I don't
think it escapes its stated purpose of divorcing Buffer from the
ideology of ByteArray, or Blob from ByteString, and I think this is a
good thing™.
* that this proposal makes explicit what methods are meant to be
generic between types. We could do more with this trend.
Adjustments I would make:
* integerAt and floatAt beg the inclusion of a lot of unpacking
functionality, like blobAt, pascalBlobAt, nullTerminatedBlobAt,
nullTerminatedStringAt(offset, charset), pascalStringAt and such, that
I feel ought to be deferred to a higher architectural layer, like a
"struct" module for unpacking opaque data from Blobs and Buffers.
* add a memcopy routine to Buffer, like buffer.copy(source, begin,
end, [sourceBegin]).
* favor (begin, end) for ranges, like slice notation, over (begin, length).
* we could probably get rid of reverse without consequence, while
we're cutting back to basics.
* BUT, if we require [] operator and .length, almost every function in
Array.prototype would work in Buffer.prototype. The only exceptions
are .concat and one other that escapes me at the moment.
* make ByteBuffer and StringBuffer separate types, or omit String
buffering, to simplify both usage and implementation.
In response to .clear, I think we should support:
* buffer.clear([begin=0, [end=length]])
* buffer.fill(begin, end, [value=0])
* buffer.copy(source, [begin, [end, [sourceBegin]]])
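
A minimal sketch of how those three signatures could behave, assuming (begin, end) ranges throughout. ByteBufferSketch is a hypothetical stand-in backed by a Uint8Array, not the proposal's Buffer and not any existing implementation:

```javascript
// Hypothetical stand-in for the proposal's Buffer (not Node's Buffer).
class ByteBufferSketch {
  constructor(length) {
    this.bytes = new Uint8Array(length);
  }

  get length() {
    return this.bytes.length;
  }

  // fill(begin, end, [value=0]): set every byte in [begin, end) to value.
  fill(begin, end, value = 0) {
    this.bytes.fill(value, begin, end);
    return this;
  }

  // clear([begin=0, [end=length]]): zero every byte in [begin, end).
  clear(begin = 0, end = this.length) {
    return this.fill(begin, end, 0);
  }

  // copy(source, [begin, [end, [sourceBegin]]]): memcopy from another
  // ByteBufferSketch into [begin, end), reading from sourceBegin onward.
  copy(source, begin = 0, end = this.length, sourceBegin = 0) {
    const room = Math.min(end, this.length) - begin;
    const available = source.length - sourceBegin;
    const count = Math.max(0, Math.min(room, available));
    if (count > 0) {
      this.bytes.set(source.bytes.subarray(sourceBegin, sourceBegin + count), begin);
    }
    return this;
  }
}

const a = new ByteBufferSketch(6).fill(0, 6, 0xff);
const b = new ByteBufferSketch(4).fill(0, 4, 7);
a.copy(b, 1, 3); // overwrite a[1] and a[2] with b[0] and b[1]
a.clear(4);      // zero a[4] and a[5]
console.log(Array.from(a.bytes)); // [ 255, 7, 7, 255, 0, 0 ]
```

With these in place, clear is just fill with value 0, which fits the "cut back to basics" direction.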
My hands: blob.byteAt(offset)
A) return a Blob of length 1
Add something like:
blob.codeAt(offset) = blob.byteCodeAt(offset)
string.codeAt(offset) = string.charCodeAt(offset)
To parallel:
blob.valueAt(offset) = blob.byteAt(offset)
string.valueAt(offset) = string.charAt(offset)
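
A rough sketch of that parallel, to show what generic code over both types might look like. BlobSketch is a hypothetical stand-in, and the free functions valueAt and codeAt stand in for the proposed String methods so that no built-in prototype has to be patched:

```javascript
// Hypothetical stand-in for the proposal's Blob.
class BlobSketch {
  constructor(bytes) {
    this.bytes = Uint8Array.from(bytes);
  }
  get length() {
    return this.bytes.length;
  }
  // Option A from the show of hands: byteAt returns a Blob of length 1.
  byteAt(offset) {
    return new BlobSketch(this.bytes.subarray(offset, offset + 1));
  }
  byteCodeAt(offset) {
    return this.bytes[offset];
  }
  valueAt(offset) {
    return this.byteAt(offset);
  }
  codeAt(offset) {
    return this.byteCodeAt(offset);
  }
}

// Stand-ins for string.valueAt / string.codeAt (assumed names).
const valueAt = (data, offset) =>
  typeof data === 'string' ? data.charAt(offset) : data.valueAt(offset);
const codeAt = (data, offset) =>
  typeof data === 'string' ? data.charCodeAt(offset) : data.codeAt(offset);

// A generic algorithm that never asks whether it holds text or bytes.
function checksum(data) {
  let sum = 0;
  for (let i = 0; i < data.length; i++) {
    sum = (sum + codeAt(data, i)) % 256;
  }
  return sum;
}

console.log(checksum('AB'));                     // 131
console.log(checksum(new BlobSketch([65, 66]))); // 131
console.log(valueAt(new BlobSketch([1, 2, 3]), 1).length); // 1
```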
My hands: Buffer in what module:
A) Buffer inside require('binary') or
C) Buffer and Blob added to global
Buffer isn't actually a Stream. We have ByteIO and StringIO in
Narwhal's "io" module; they support the same API as file streams,
which are very different than the Buffer API. Buffer *is* actually a
byte array with a subset of the Array interface. I think this is
healthy.
I do not think we should leave unspecified whether Buffer and Blob are
available in the "binary" module or as free variables, or whether []
may or must be supported. Strict portability is a good thing. It's
hard to see that here when creating a narrow specification results in
so much argument, but our users will thank us. I'm indifferent about
whether we do these objects in modules, globals, or both.
I'm in favor of requiring [] to be supported. This will make our
lives difficult in Narwhal on Rhino, but in the long term, I'm hoping
that these types would be supported by Rhino out of the box. Until
then we would probably be partially non-compliant, but that's okay in
the short term.
I'd also like to hear what people think about Blob vs ByteString and
Buffer vs ByteArray. As far as I'm concerned, Binary/B and Binary/C
could use either pair of names without loss of accuracy.
I also think that we should pay attention to making generic routines
that make the Blob and Buffer interfaces interchangeable for some
algorithms too.
Kris
This was fine up until I hit a big gotcha. p.write* would write to the
Buffer, not to the Stream. To write to a stream you would have to create
a new Buffer/Pack pair, write to it, read that into a String (even
though jslibs had a Blob, a lot of its binary methods extracted from a
generic String instead), then write that to the stream.
That's made me a little paranoid about middlemen when it comes to
dealing with the source of binary data, and the interface for reading
and writing binary data.
Some time after that is around when I thought "Why Pack, why not just
put the binary reading methods on the instance we already have?"
> * add a memcopy routine to Buffer, like buffer.copy(source, begin,
> end, [sourceBegin]).
>
Hmmm... random thought...
buffer.append(buffer2);
Implied memcopy when you pass something like a Buffer to one of the
buffer methods, instead of valueOf then call?
> * favor (begin, end) for ranges, like slice notation, over (begin, length).
>
Ah great... I forgot slice used begin,end. I was thinking every method
except str.substring used begin,length ranges.
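
For reference, a quick check of which standard ECMAScript methods use which convention. Only built-ins are used here; substr is a legacy method but is still widely available:

```javascript
const s = 'abcdef';
const bySlice = s.slice(1, 3);         // (begin, end)
const bySubstring = s.substring(1, 3); // (begin, end)
const bySubstr = s.substr(1, 3);       // (begin, length), legacy

const arr = ['a', 'b', 'c', 'd', 'e', 'f'];
const arrSlice = arr.slice(1, 3);      // (begin, end), non-mutating
const removed = arr.splice(1, 3);      // (start, deleteCount), mutates arr

console.log(bySlice, bySubstring, bySubstr); // bc bc bcd
console.log(arrSlice, removed, arr);
```

So slice and substring both take (begin, end), while substr and splice take a start plus a count.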
> * we could probably get rid of reverse without consequence, while
> we're cutting back to basics.
>
Sure. I added it because A) java.lang.StringBuffer had it, and B) it is
probably fastest when done by the implementation rather than in user code.
Other than that, if we pull it out it'll probably be just another one of
my stdlib extensions, like how Wrench.js adds .reverse() to String.
> * BUT, if we require [] operator and .length, almost every function in
> Array.prototype would work in Buffer.prototype. The only exceptions
> are .concat and one other that escapes me at the moment.
>
.join?
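
A small experiment on that point, using a plain array-like object (integer keys plus .length) as a stand-in for a Buffer that supports [] and .length. This only shows how the standard Array.prototype functions treat such an object; it says nothing about what the spec should require:

```javascript
// Array-like stand-in for a Buffer: integer keys plus length.
const bufLike = { 0: 3, 1: 1, 2: 2, length: 3 };

// These are generic: they only need [] and .length on the receiver.
const doubled = Array.prototype.map.call(bufLike, (b) => b * 2);
const joined = Array.prototype.join.call(bufLike, '-');
const position = Array.prototype.indexOf.call(bufLike, 2);

// reverse works in place on the array-like itself.
Array.prototype.reverse.call(bufLike);

// concat does not spread a non-array receiver; it adds it as ONE element.
const concatenated = Array.prototype.concat.call(bufLike, [9]);

console.log(doubled);             // [ 6, 2, 4 ]
console.log(joined);              // 3-1-2
console.log(position);            // 2
console.log(bufLike[0]);          // 2 (after the in-place reverse)
console.log(concatenated.length); // 2
```

Note that map returns a plain Array rather than a Buffer, so methods that build a new collection would still need Buffer-specific versions.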
> * make ByteBuffer and StringBuffer separate types, or omit String
> buffering, to simplify both usage and implementation.
>
From what I've been through in Java, it looks like writing something
that works abstractly on char[]/byte[] could probably be done using an
interface or an abstract class and using that from within NativeBuffer
to abstract the data manipulation actions.
I did leave splitting Buffer into two types or not up for discussion. I
suppose I'm happy as long as there is a simple idiom that'll let someone
create one of either type abstractly.
I had two idioms for that:
(Note: My Stream class has a .text property on it, which is a boolean
indicating if the stream is text or binary, thus indicating whether
.read() will return String or Blob)
function(stream) {
    var buf = new Buffer;
    buf.text = stream.text;
    ...
    return buf.valueOf();
}
And another:
function (absDataA, absDataB, absDataC) {
    var buf = new Buffer(absDataA.constructor);
    buf.append(absDataB);
    buf.append(absDataA);
    buf.append(absDataC);
    return buf.valueOf();
}
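
To make the second idiom concrete, here is a runnable sketch under some assumptions: BlobSketch and BufferSketch are hypothetical stand-ins, the buffer keeps the constructor it was given, and valueOf() returns a value of that same kind. The function name sandwich is made up for the example:

```javascript
// Hypothetical stand-in for Blob: just a list of byte values.
class BlobSketch {
  constructor(bytes = []) {
    this.bytes = Array.from(bytes);
  }
}

// Hypothetical stand-in for Buffer, parameterized by content constructor.
class BufferSketch {
  constructor(Content = String) {
    this.Content = Content;
    this.parts = [];
  }
  append(data) {
    if (data.constructor !== this.Content) {
      throw new TypeError('mismatched content type');
    }
    this.parts.push(data);
    return this;
  }
  valueOf() {
    if (this.Content === String) return this.parts.join('');
    return new BlobSketch(this.parts.flatMap((part) => part.bytes));
  }
}

// Same shape as the idiom above: works for strings or blobs alike.
function sandwich(absDataA, absDataB, absDataC) {
  const buf = new BufferSketch(absDataA.constructor);
  buf.append(absDataB);
  buf.append(absDataA);
  buf.append(absDataC);
  return buf.valueOf();
}

console.log(sandwich('a', 'b', 'c')); // bac
console.log(sandwich(new BlobSketch([1]), new BlobSketch([2]), new BlobSketch([3])).bytes); // [ 2, 1, 3 ]
```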
I actually have an example of something written in MonkeyScript meant to
use that (some of MonkeyScript Lite is theoretical implementation where
the stuff it depends on has not been written yet)
http://github.com/dantman/monkeyscript.lite/blob/master/src/bananas/io/Stream.js
Scroll down to Stream.prototype.yank
There were two main reasons I left the ambiguity with Blob being global
or in a module. I suppose one doesn't matter (I had thoughts about
mini-embedded languages which might want a way to handle binary data,
but had nothing to do with the rest of ServerJS like require()).
I consider Blob to be a counterpart to String: another native type, just
like String (it's basically just a String for binary data). Other than
an extreme desire to avoid initializing absolutely anything extra, I see
no reason for Blob not to be a global sitting alongside String.
In fact, I believe there was a topic about ECMA looking at what we come
up with for binary and perhaps standardizing it sometime in the future?
Do I recall that right?
If that ever did happen, Blob would become a native global. In that
case, if we had been sticking with require('binary').Blob, my little
ambiguity would come into existence even if we had decided to reject it
and keep Blob inside a binary module only: the javascript interpreters
would now be the ones implementing Blob, and to be compatible with old
code we'd have to write a binary module that does what I listed in that
spec: `exports.Blob = Blob;`.
Leaving that ambiguity and making portable code use require('binary')
was my compromise to those who still thought binary should be secluded
inside of a module.
"Whether they are made global or not [...] require('binary'); must
return an object containing Blob and Buffer as keys..."
ie: If Blob was in the binary module, then require('binary') would
require that module. If Blob was a global, then require('binary') would
return an object with Blob inside of it. Thus code using
require('binary') would be completely portable.
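
A toy model of the two kinds of implementation, to show why code that goes through require('binary') does not care which one it runs on. Everything here is a stand-in: the "hosts" are plain objects with a fake require, and BlobSketch and BufferSketch are empty placeholder classes:

```javascript
// Placeholder types; the real Blob and Buffer are whatever the host provides.
class BlobSketch {}
class BufferSketch {}

// Implementation A: the types live only inside the binary module.
function makeModuleOnlyHost() {
  const modules = { binary: { Blob: BlobSketch, Buffer: BufferSketch } };
  return { globals: {}, require: (id) => modules[id] };
}

// Implementation B: the types are globals, and the binary module is the
// one-line shim from the spec text: exports.Blob = Blob; (same for Buffer).
function makeGlobalHost() {
  const globals = { Blob: BlobSketch, Buffer: BufferSketch };
  const modules = { binary: { Blob: globals.Blob, Buffer: globals.Buffer } };
  return { globals, require: (id) => modules[id] };
}

// Portable code only ever touches require('binary').
function portableCode(host) {
  const { Blob, Buffer } = host.require('binary');
  return new Blob() instanceof BlobSketch && typeof Buffer === 'function';
}

console.log(portableCode(makeModuleOnlyHost())); // true
console.log(portableCode(makeGlobalHost()));     // true
```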
My intention is to make Blob a global inside of MonkeyScript; I see no
reason to do anything other than that. If that's not an option in the
spec, then global.Blob will be a non-standard extension to ServerJS, just
like str[i] is a non-standard extension to ECMA.
A little side note while I'm on the topic.
For those that don't like the idea of loading Blob all the time.
Wouldn't it work to just define a global getter for Blob which when used
would initialize Blob, delete the getter, set the Blob global, and
return it?
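
A sketch of that lazy getter using Object.defineProperty. The object named host stands in for the global object so the example does not touch any real global, and BlobSketch is a placeholder for whatever the expensive initialization would produce:

```javascript
const host = {};   // stand-in for the global object
let initCount = 0; // counts how often initialization actually runs

Object.defineProperty(host, 'Blob', {
  configurable: true, // required so the getter can remove itself
  enumerable: false,
  get() {
    initCount++;
    class BlobSketch {} // imagine the costly setup happening here
    delete host.Blob;        // delete the getter
    host.Blob = BlobSketch;  // set the plain global
    return BlobSketch;       // and return it
  },
});

const first = host.Blob;  // triggers initialization
const second = host.Blob; // plain property read, no getter involved
console.log(first === second, initCount); // true 1
```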
> I'm in favor of requiring [] to be supported. This will make our
> lives difficult in Narwhal on Rhino, but in the long term, I'm hoping
> that these types would be supported by Rhino out of the box. Until
> then we would probably be partially non-compliant, but that's okay in
> the short term.
>
MonkeyScript Lite is MIT. You could swipe NativeBlob and NativeBuffer
out of it if you want.
git submodules, a common repo, and a jar can work nicely for
collaboration too.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
I've started a show of hands page:
https://wiki.mozilla.org/ServerJS/Binary/C/Show_of_hands
I've fixed up the proposal a bit:
- Added the .codeAt, ... proposal
- Fixed slice
- Moved the unpacking related stuff into a subsection that notes it's
not part of the proposal (reference material for when we write a spec
for unpacking)
- Added buf.fill... I'll add detail later.
I'll fix Buffer in a bit. I'll probably set it up as Buffer,
StringBuffer, and BlobBuffer. StringBuffer and BlobBuffer will be the
individual implementations. Instances of both should work with `buf
instanceof Buffer` and provide a .text boolean. new Buffer will accept
String or Blob themselves, a string or blob, or an object with a .text
property. (Thus if streams support .text, then var buf = new
Buffer(stream); will return a buffer of the appropriate type.)
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
I'm thinking of dropping the .text boolean idiom. I have a better one.
I'll assume .type for now; the discussion I want is about what the best
name for the property is.
"foo".type === String
Blob([0,0,0]).type === Blob
stringbuffer.type === String
blobbuffer.type === Blob
textstream.type === String
binarystream.type === Blob
You can see where I'm going with this?
The idea for this idiom is to have a property on instances of things
like string, blob, streams, buffers, etc... which all fall into the
"Text, or Binary?" category.
They will return either the basic "String" function, or the "Blob" function.
This is an extension to that `new Buffer("foo".constructor);` idiom.
This way `new Buffer(stream.type);` will create an instance of either
StringBuffer or BlobBuffer, depending on whether the stream will return
strings or blobs from .read.
My only question, is what should the property be named? type,
primitive... primitiveType, .typeOf, ...?
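
A sketch of how the .type dispatch could work, assuming the property keeps the name .type. All class names ending in Sketch are hypothetical, the two stream objects are bare placeholders, and contentTypeOf is a made-up helper that covers plain strings without patching String.prototype:

```javascript
// Hypothetical stand-in for Blob.
class BlobSketch {
  get type() {
    return BlobSketch;
  }
}

// Resolve "text or binary?" from a constructor, a string, or a .type holder.
function contentTypeOf(source) {
  if (source === String || source === BlobSketch) return source;
  if (typeof source === 'string') return String;
  return source.type; // blobs, buffers and streams carry .type
}

// new BufferSketch(x) hands back the matching concrete buffer.
class BufferSketch {
  constructor(source = String) {
    if (new.target === BufferSketch) {
      return contentTypeOf(source) === String
        ? new StringBufferSketch()
        : new BlobBufferSketch();
    }
  }
}

class StringBufferSketch extends BufferSketch {
  get type() {
    return String;
  }
}

class BlobBufferSketch extends BufferSketch {
  get type() {
    return BlobSketch;
  }
}

const textStream = { type: String };       // placeholder text stream
const binaryStream = { type: BlobSketch }; // placeholder binary stream

const textBuf = new BufferSketch(textStream.type);
const binBuf = new BufferSketch(binaryStream);
console.log(textBuf instanceof StringBufferSketch, textBuf instanceof BufferSketch); // true true
console.log(binBuf instanceof BlobBufferSketch, binBuf.type === BlobSketch);         // true true
```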
Now I do. This is similar to the C++ idiom with collection::iterator
generics, which are necessary since declaring an iterator requires
knowledge of the contained type…in C++. I don't need to see an
example to believe they would be useful in JavaScript for other
generic algorithms that operate on different types of buffers and
streams. To that end, I would not resist introducing a member like
this to both buffers and streams.
I think "type" is misleading: it implies the collection's own type,
not the type of its content. Generally, I like to use TitleCase
for constructors, whether they're members or globals, but
".constructor" is not consistent with my idiom, so take it or leave
it. I would pick one of "Content" or "Value", falling back to
"contentConstructor" or "contentPrototype" &c if none of those are
palatable, or "content", "value" if none of those were even palatable.
I considered "Unit" and "Element", but those would convey the idea of
"Character" and "Byte" instead of "String" and "Blob" respectively,
which would not be applicable.
Summarily, I think we should support these idioms:
var content = anyBufferOrStream.Content(); [1]
anyBufferOrStream.Content().anyGenericMethod(…);
anyBufferOrStream.Content.anyGenericClassProperty…;
anyBufferOrStream.Content.prototype.anyGenericMethod.apply(anyBufferOrStream, …);
By requiring Streams and Buffers to include a "Content" attribute that
is exactly either "String" or "Blob".
Kris Kowal
[1] a nuance here: we can't support a generic "new
anyBufferOrStream.Content()" because in some cases "Content" will be
"String", in which case "new String()" would have boxing semantics. I
don't believe we've discussed whether "new" will be necessary for
"Blob" construction, but in this case, "new" must not be necessary to
work generically for the "String" case. To this end, we would either
have to make the "new" optional in "new Blob()" or we would need to
make anyBufferOrStream.Content() internally construct a "Blob()", and
set its prototype attribute to that of Blob.prototype. I think that
the former solution is more elegant.
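
The boxing behaviour of String is standard and easy to check. The second half sketches the "make new optional" route for a Blob-like constructor; BlobSketch and makeContent are hypothetical names used only for illustration:

```javascript
// Standard behaviour: new String boxes, a plain call does not.
const boxed = new String('abc');
const plain = String('abc');
console.log(typeof boxed, typeof plain);       // object string
console.log(boxed === 'abc', plain === 'abc'); // false true

// Hypothetical Blob-like constructor where "new" is optional.
function BlobSketch(bytes) {
  if (!new.target) return new BlobSketch(bytes);
  this.bytes = Array.from(bytes || []);
}

// Generic construction then works without "new" for either Content.
function makeContent(Content, seed) {
  return Content(seed);
}

const text = makeContent(String, 'abc');
const blob = makeContent(BlobSketch, [1, 2]);
console.log(typeof text, blob instanceof BlobSketch); // string true
```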
Cept this part of the code:
var buf = new Buffer();
buf.text = this.text;
Would actually be under this new idiom:
var buf = new Buffer(this.type);
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Heh, I didn't consider the idioms of using generics from the
constructor, I was just using it since it was a real nice representation
of String and Blob types.