In the last few days, I've busily implemented Binary/B for Flusspferd
(http://flusspferd.org/). AFAIR, the only missing parts are everything
involving charsets (Encoding module needed).
So, I would propose the following extensions that I've already implemented:
On both ByteString and ByteArray, as in the spec on ByteString:
* indexOf
* lastIndexOf
* byteAt
* split
On ByteArray, as described in DMO* for Array:
* filter
* forEach
* every
* some
* map (extension: allows you to return an Array/ByteArray/ByteString in
addition to numbers in the callback, with the effect that the returned
ByteArray might have a different length)
* reduce
* reduceRight
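
A minimal sketch of the map extension described above, assuming a plain Array of numbers as a stand-in for ByteArray. The name mapBytes is hypothetical and is not taken from Flusspferd or from the spec; it only illustrates how an array-like return value from the callback can change the length of the result.

```javascript
// Hypothetical helper: a plain Array of numbers stands in for ByteArray.
function mapBytes(bytes, block, context) {
  var result = [];
  for (var i = 0; i < bytes.length; i++) {
    var value = block.call(context, bytes[i], i, bytes);
    if (typeof value === "number") {
      // A number contributes exactly one byte.
      result.push(value & 0xff);
    } else {
      // An array-like value contributes all of its elements,
      // so the result may be longer or shorter than the input.
      for (var j = 0; j < value.length; j++) {
        result.push(value[j] & 0xff);
      }
    }
  }
  return result;
}

// Replace every zero byte with the two bytes 0x5c 0x30.
var escaped = mapBytes([1, 0, 2], function (b) {
  return b === 0 ? [0x5c, 0x30] : b;
});
```

Here escaped has four elements although the input had three.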
On ByteArray:
* append [like push, but also allows Array/ByteArray/ByteString in
addition to numbers]
* prepend [like unshift, but also allows Array/ByteArray/ByteString in
addition to numbers]
* erase(begin[, end]) [like slice, but deleting, returns length]
* replace(begin, end, values/ByteStrings/ByteArrays/Arrays...) [begin,
end like slice, returns length]
* insert(position, values/ByteStrings/ByteArrays/Arrays...)
* count [like every/some, but returns the total count]
A few notes about split:
* Each returned value has a "delimiter" property, set to true if the
value is a delimiter and false otherwise. (Possibly interesting when
you use includeDelimiter...)
* Delimiters returned when includeDelimiter is specified do not count
toward the count parameter.
Thank you!
Aristid
*) https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference
Sure, most of these are lucid. Can you describe the behavior of "split"?
> On ByteArray, as described in DMO* for Array:
>
> * filter
> * forEach
> * every
> * some
> * map (extension: allows you to return an Array/ByteArray/ByteString in
> addition to numbers in the callback, with the effect that the returned
> ByteArray might have a different length)
> * reduce
> * reduceRight
Sounds good. I can see why the map extension would be useful,
although I'm generally reluctant to create such "type" special cases.
I say "ok" for now, but I think we should look for an alternate idiom.
> On ByteArray:
>
> * append [like push, but also allows Array/ByteArray/ByteString in
> addition to numbers]
The append idiom is push.apply().
> * prepend [like unshift, but also allows Array/ByteArray/ByteString in
> addition to numbers]
.unshift.apply()
> * erase(begin[, end]) [like slice, but deleting, returns length]
.length = 0
> * replace(begin, end, values/ByteStrings/ByteArrays/Arrays...) [begin,
> end like slice, returns length]
.splice(begin, end, replacement) -> replaced (with its .length)
> * insert(position, values/ByteStrings/ByteArrays/Arrays...)
> * count [like every/some, but returns the total count]
.splice(at, at, replacement) -> replaced
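
For reference, a sketch of how the standard Array idioms named above cover the proposed methods, using a plain Array as a stand-in for ByteArray. One detail to keep in mind: the second argument of Array.prototype.splice is a delete count, not an end index, so a begin/end pair has to be converted.

```javascript
var buffer = [1, 2, 3, 4, 5];

// append: push every element of another array.
Array.prototype.push.apply(buffer, [6, 7]);

// prepend: unshift every element of another array.
Array.prototype.unshift.apply(buffer, [0]);

// erase(begin, end): splice expects a delete count, so pass end - begin.
var begin = 2, end = 4;
var removed = buffer.splice(begin, end - begin);

// replace(begin, end, values): remove two elements at 0, insert three.
var replaced = buffer.splice.apply(buffer, [0, 2].concat([9, 9, 9]));

// insert(position, values): delete nothing, insert at position 1.
buffer.splice.apply(buffer, [1, 0].concat([8]));
```

Each splice call allocates and returns an array of the removed elements, which is the temporary object that the erase proposal wants to avoid.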
> A few notes about split:
>
> * The returned values have the property "delimiter" set to true/false,
> if it is a delimiter. (Possibly interesting when you use
> includeDelimiter...)
> * The delimiters returned when includeDelimiter is specified, are not
> counted with regard to the count parameter.
I'm not a fan of providing or consuming hidden state on primitive
objects; I think it'll lead to difficult bugs.
Thanks for the ideas; the points that receive no objections in the
next day or so I think you should add to the wiki.
Kris Kowal
Yeah, there is the misfortune that it may not be possible for binary
types to be array-like on some platforms.
> In my implementation that's actually no problem because all returned
> elements, including the delimiters, are freshly created. So there is
> no state involved at all.
I'm not anticipating shared state or GC bugs, but rather methods
behaving differently depending on delimiter state carried around but
not shown. Perhaps this could be addressed by showing the attached
delimiter in toString() and toSource(). That being said, an attached
delimiter is not a concern I would like to have when working with a
ByteString or ByteArray. Though, I'll confess that I don't have a
clear idea about what this is used for and how.
> I think creating a Binary just to be thrown away is a waste of
> resources.
Given your arguments about wasteful allocation and not repeating the
mistakes of JavaScript past, I'll buy:
* erase(begin, end)
I think that "replace" should be subsumed by "erase(begin, end,
replacement)" since "replace" has a different meaning for Strings.
"append" is Python's word for "push", so I think we might consider a
different name like "extend", which is Python's word for what you've
described. "prepend" hasn't had the fortune of making it into the
English dictionary yet, but it certainly would be the opposite of
"append", therefore a synonym for "unshift", so I think we should
consider the opposite of "extend" or a pair of opposite words like
"extendLeft" and "extendRight".
And I've already bought without argument:
* ByteArray().indexOf() :Number
* ByteArray().lastIndexOf() :Number
* ByteArray().byteAt()
* ByteArray().forEach(block, [context])
* ByteArray().map(block, [context]) :Array
* ByteArray().reduce(block, [context]) :Number
* ByteArray().reduceRight(block, [context]) :Number
I put ByteArray().byteAt() on the Wiki some time ago, but realized I
hadn't explicated the return type. I'm proposing ByteString for both
ByteArray and ByteString. I also support cross implementing charAt as
an alias on both types.
I'm abstaining on the issue of split, join, delimiters, and the map
extension for lack of expertise, but I hope someone will want to discuss
the issues further.
Would you mind adding a link to this discussion to the wiki?
Kris Kowal
Oh, I think that all the ones in the latest ES spec for Array should
be added to ByteArray for sure. How does count() differ from .length?
Kris Kowal
[1]: http://draft.monkeyscript.org/api/_std/Blob.html
[2]: http://draft.monkeyscript.org/git/api/Blob.txt
>> I'm abstaining on the issue of split, join, delimiters, and the map
>> extension for lack of expertise, but I hope someone will want to discuss
>> the issues further.
>>
>> Would you mind adding a link to this discussion to the wiki?
>>
>
> Done already.
>
> There are some methods that you've neither accepted nor criticised,
> what about them? :-) (Specifically: "every", "some", "count" and maybe
> others. Both "every" and "some" are in JS 1.6, "count" was added by
> me.)
>
>
>> Kris Kowal
>>
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Okay, I see. Would it not be implicit that every other element of the
returned Array is a delimiter? Even if there were two adjacent
delimiters, would we not interpolate a ByteString([]) between them?
If that's the case, I don't think the delimiter flag would be useful
(apart from a very small convenience), and even then it's likely to be
an inconvenience as once that ByteString is given to someone else, the
.delimiter flag would be noise.
Sorry for being obtuse on this one; there's no precedent for a split
routine that includes delimiters. I just think that someone else
should vouch support for it before it gets in.
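
To make the interpolation question concrete, here is a sketch of a split that includes delimiters, assuming plain Arrays of numbers in place of ByteString and a single byte value as the delimiter. The name splitWithDelimiters is hypothetical; it is not the split from the proposal and ignores the count parameter.

```javascript
// Hypothetical sketch: Arrays of numbers stand in for ByteString.
function splitWithDelimiters(bytes, delimiter) {
  var parts = [];
  var current = [];
  for (var i = 0; i < bytes.length; i++) {
    if (bytes[i] === delimiter) {
      parts.push(current);      // even index: data, possibly empty
      parts.push([delimiter]);  // odd index: the delimiter itself
      current = [];
    } else {
      current.push(bytes[i]);
    }
  }
  parts.push(current);          // trailing data, possibly empty
  return parts;
}

// Two adjacent delimiters produce an empty element between them.
var parts = splitWithDelimiters([1, 0, 0, 2], 0);
// parts is [[1], [0], [], [0], [2]]
```

With this layout the position alone says whether an element is a delimiter (index % 2 === 1), so no extra flag is needed.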
> How about "displace"? Just found that word in the dictionary and I
> think it would make sense. :-)
Yeah, that'd do fine, even though there's certainly no precedent.
Let's run with it.
>
> I'd be fine with "extendLeft" and "extendRight".
>
Let's put it on the books 'til someone objects.
> charAt on both? What would it do on ByteArray? Return a ByteString? Or
> a ByteArray?
Well, if byteAt is on ByteArray to be orthogonal with ByteString, it
stands to reason that charAt, being equivalent, should be on ByteArray
as well. That's a slippery slope though; I'm sure we could make a
case for having mutual support for each other's interfaces on
ByteArray and ByteString. That muddies the distinction between them
though. Being as ByteArray is for buffers and ByteString is for most
everything else, I don't think we need to cross implement too much. I
am certain that byteAt() should always return a ByteString() no matter
which type it's attached to. If you're thinking that there needs to
be a method that returns the number at a given offset in the
ByteArray, I agree, and think it should have a different name, like
numberAt, or 'get'.
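
A small sketch of the distinction, assuming a Uint8Array as a stand-in for both binary types. The free functions byteAt and get are illustrative only; the thread proposes them as methods.

```javascript
// byteAt: always a one-byte binary value, mirroring String's charAt.
function byteAt(bytes, offset) {
  return bytes.slice(offset, offset + 1);
}

// get: the plain number stored at the offset.
function get(bytes, offset) {
  return bytes[offset];
}

var data = new Uint8Array([72, 105]);
var one = byteAt(data, 1); // a Uint8Array holding the single byte 105
var n = get(data, 1);      // the number 105
```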
Kris Kowal
integerAt, intAt, or numberAt would all be acceptable in my opinion.
You're right that charAt implies a *character* (I presume you meant
character instead of byte) and therefore possibly a multi-byte number.
That being said, that's not how it's used, which is far more the
concern for me in terms of reusability and migration. I would very
much like to be able to pass a ByteString to any generic algorithm
that would accept String.
I'm not however concerned about passing ByteArrays to String
algorithms, but Aristid is right that there needs to be some method
for getting single numbers out of the byte array.
Kris Kowal
numberAt wouldn't really work. jslibs' Pack has readInt, readReal,
though going along with other naming and what most people in JS would
expect integer and float would probably be the better names. When
reading an integer or float I expect to be able to define the size, and
in the context of an integer whether it is signed or unsigned, and
whether it's in network byte order or not. If I didn't have that kind of
control I would have utterly failed at trying to implement FastCGI in JS.
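
As an illustration of that kind of control, here is a sketch built on the DataView API of current JavaScript engines, with a Uint8Array standing in for ByteArray. The readInteger name and its parameter order are assumptions for this example, not jslibs' Pack interface.

```javascript
// Hypothetical reader: size in bytes (1, 2 or 4), signedness and
// byte order are all chosen by the caller.
function readInteger(bytes, offset, size, signed, littleEndian) {
  var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  switch (size) {
    case 1:
      return signed ? view.getInt8(offset) : view.getUint8(offset);
    case 2:
      return signed ? view.getInt16(offset, littleEndian)
                    : view.getUint16(offset, littleEndian);
    case 4:
      return signed ? view.getInt32(offset, littleEndian)
                    : view.getUint32(offset, littleEndian);
    default:
      throw new RangeError("unsupported integer size: " + size);
  }
}

var record = new Uint8Array([0x01, 0x02, 0xff, 0xfe]);
var a = readInteger(record, 0, 2, false, false); // big-endian unsigned: 258
var b = readInteger(record, 2, 2, true, false);  // big-endian signed: -2
var c = readInteger(record, 0, 2, false, true);  // little-endian unsigned: 513
```

Network byte order is big-endian, which corresponds to passing false for littleEndian here.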
It differs in taking a predicate. ByteArray([1,2,3,4,5]).count(function(x) { return
x % 2; }) == 3. As per Javascript convention, the block gets three
parameters: element, index, array.
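
A sketch of that behavior as a free function over a plain Array, which stands in for ByteArray here. The optional context argument is an assumption borrowed from the other iteration methods.

```javascript
// Count the elements for which the block returns a truthy value.
function count(bytes, block, context) {
  var total = 0;
  for (var i = 0; i < bytes.length; i++) {
    if (block.call(context, bytes[i], i, bytes)) {
      total++;
    }
  }
  return total;
}

var odd = count([1, 2, 3, 4, 5], function (x) { return x % 2; }); // 3
```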
It's worth noting that this is exactly the kind of idiom that is
easily implemented in Javascript code "on top" of an underlying native
implementation you could probably call _NativeByteArray.
heh, sorry.
Treating native code as a behind-the-scenes thing, then wrapping a pile
of JS around it, has been the zone I've been in for quite a while.
Both Banana, and even MonkeyScript itself follow an idiom of putting
everything defined natively into a _native object variable inside local
scope where it can't be seen from outside then wrapping JS around it. In
fact _native is basically the only special thing defined when
monkeyscript.js is jumped into, where it quickly runs an anonymous
function to enclose _native and then deletes it from global view.
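
A rough sketch of that enclosure idiom, not MonkeyScript's actual source. The contents of _native and the exported byteLength wrapper are made up for the example, and the first assignment only simulates what the host would do before the script runs.

```javascript
var root = typeof globalThis !== "undefined" ? globalThis : this;

// Stand-in for the host: it exposes its native bindings as a global.
root._native = {
  byteLength: function (bytes) { return bytes.length; }
};

(function () {
  // Capture the native object in local scope...
  var _native = root._native;
  // ...then remove it from global view.
  delete root._native;

  // JS wrappers close over _native; outside code cannot reach it.
  root.byteLength = function (bytes) {
    return _native.byteLength(bytes);
  };
})();
```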
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]