New binary syntax proposal


Tony Arcieri

Dec 15, 2009, 1:21:48 AM
to Reia Mailing List
I am wanting to move away from Erlang's << and >> delimiters for binaries.

The main reason I would like to do this is to implement all of the bitwise operations found in the C family of languages, which in this case includes Ruby.

Binaries themselves are a great reason to have a standard syntax for bitwise operations.  Erlang, like no other language I've ever used, makes working with binary data a snap, and I would like Reia to be even better (if possible) by supporting standard C-style bitwise operations.
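
As an illustration of the C-style operators in question, here is a minimal sketch in Python (used only as a neutral illustration language, since the Reia syntax is still under discussion). The helper names `pack_u16` and `unpack_u16` are hypothetical and not part of any proposal in this thread.

```python
def pack_u16(value: int) -> bytes:
    """Split a 16-bit unsigned integer into two big-endian bytes with >> and &."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    return bytes([(value >> 8) & 0xFF, value & 0xFF])


def unpack_u16(data: bytes) -> int:
    """Rebuild the integer from two big-endian bytes with << and |."""
    if len(data) != 2:
        raise ValueError("expected exactly two bytes")
    return (data[0] << 8) | data[1]


packed = pack_u16(0x1234)
restored = unpack_u16(packed)
```

Note that `<<` and `>>` appear here as infix shift operators, which is the role that would compete with Erlang's use of the same tokens as binary delimiters.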

Currently, the main token I'm eyeing is $, which is unused in Reia because Reia does not have global variables.  So, some possible syntax:

$"bitstring", equivalent to Erlang's <<"bitstring">>
$[1,2,3] for octet lists, equivalent to Erlang's <<1,2,3>>

Previously I was mulling something like:

%b"bitstring"
%b[1,2,3]

although I find this uglier than the $ syntax.

What do you think?

--
Tony Arcieri
Medioh! A Kudelski Brand

Abhinav Saxena

Dec 15, 2009, 2:23:04 AM
to re...@googlegroups.com
My first email to the group :-) IMHO using %b is much better as it will make the code more readable and will be in sync with other languages. My 2 cents.

--
Thanks,
Abhinav
http://twitter.com/abhinav



--

You received this message because you are subscribed to the Google Groups "Reia" group.
To post to this group, send email to re...@googlegroups.com.
To unsubscribe from this group, send email to reia+uns...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/reia?hl=en.

Yang Bin Kwok

Dec 15, 2009, 5:32:55 AM
to re...@googlegroups.com
Me too :) I personally don't feel much for either syntax, but I'm wondering if other encodings should be supported, for example, BASE64?

Also, are there other considerations, e.g., from a parsing perspective?

Cheers,
yb

Jeff Bragg

Dec 15, 2009, 11:18:26 AM
to re...@googlegroups.com
Code readability is a highly subjective area.  I find $ to be less noisy than %b, probably because I only have to remember one token instead of two.  It's also not clear to me how either option would make it "in sync with other languages" in terms of the bit syntax (though generally enriching its bitwise manipulation capabilities definitely does).  Also just my (in this case contrasting) $0.02.

Phil Pirozhkov

Dec 15, 2009, 11:00:21 AM
to re...@googlegroups.com
It would be possible to allow in-code data in any available encoding,
but let's be rational and focus on what's really widespread and useful.
Other things can always be added later on.

Binary syntax is cool and very convenient, but its target tasks are
digital line switching, bit flags, encoders/decoders and so on, which
aren't very widespread across the internet community.

So if it comes to voting for a direction, I'm voting for a class system
(and would vote for a prototype class system if I had another vote).

BR,
Phil

-----Original Message-----
From: Yang Bin Kwok <yan...@fragnetics.com>
To: re...@googlegroups.com
Date: Tue, 15 Dec 2009 18:32:55 +0800
Subject: Re: [reia] New binary syntax proposal

> Me too :) I personally don't feel much for either syntax, but I'm wondering
> if other encodings should be supported, for example, BASE64?
>
> Also, are there other considerations, e.g., from a parsing perspective?
>
> Cheers,
> yb

Chad DePue

Dec 15, 2009, 11:58:08 AM
to re...@googlegroups.com
you mean your %b0.02   :)

Tony - would you want to save $ for regex group matches ala ruby/perl? 

Tony Arcieri

Dec 15, 2009, 12:19:01 PM
to re...@googlegroups.com
On Tue, Dec 15, 2009 at 9:58 AM, Chad DePue <ch...@inakanetworks.com> wrote:
> Tony - would you want to save $ for regex group matches ala ruby/perl?

No, not only do I find that side effecty to an icky degree, but it's something I think can be solved a lot better by pattern matching, especially with named capture groups:

/(?<foo>\w+) (?<bar>\d+) (?<baz>\w+)/ = "asdf 1234 jkl"

The intended behavior here would be to bind the variables "foo", "bar", and "baz" to what their named capture groups match respectively.
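
For comparison, a minimal sketch of the same idea in Python, where named groups are spelled `(?P<name>...)` and the captures have to be pulled out of the match object explicitly instead of being bound by the match itself. This only illustrates the intended outcome; it is not how Reia would implement it.

```python
import re

# Same pattern as above, using Python's (?P<name>...) spelling for named groups.
PATTERN = re.compile(r"(?P<foo>\w+) (?P<bar>\d+) (?P<baz>\w+)")

match = PATTERN.fullmatch("asdf 1234 jkl")
if match is None:
    raise ValueError("input did not match the pattern")

# groupdict() returns the captured text keyed by group name.
captures = match.groupdict()
foo, bar, baz = captures["foo"], captures["bar"], captures["baz"]
```

The proposal would fold the last two steps into the match: a successful match binds `foo`, `bar`, and `baz` directly.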

candlerb

Dec 16, 2009, 2:11:43 PM
to Reia
On Dec 15, 6:21 am, Tony Arcieri <t...@medioh.com> wrote:
> I am wanting to move away from Erlang's << and >> delimiters for binaries.

Can I make a wacky suggestion: what if you were to use double quotes
for binaries, and something else for lists of characters?

ISTM that if binaries had been in Erlang earlier, they could be used
almost everywhere that 'lists of characters' are currently used. In
couchdb, for example, they are used as property keys in their internal
JSON representation. And in R12+ they are very efficient to
concatenate (unlike lists of characters).

You could then have an alternative syntax for lists of characters,
e.g. %l{abc} short for [?a, ?b, ?c]

Counter-arguments:

* lots of Erlang standard libraries expect lists of characters
* if you are dealing with UTF8 then lists of characters give you the
codepoints
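
The second counter-argument can be made concrete with a small sketch in Python (used here purely as an illustration language): a list of characters carries one integer per code point, while a UTF-8 binary carries the encoded bytes, so the two differ in length as soon as a non-ASCII character appears. The sample text is an arbitrary choice.

```python
text = "h\u00e9llo"  # "hello" with an e-acute as the second character

# A "list of characters" in the Erlang sense: one integer code point per character.
codepoints = [ord(ch) for ch in text]

# A binary holds the encoded bytes; in UTF-8 the e-acute takes two bytes.
utf8_bytes = text.encode("utf-8")

# Both views round-trip back to the same text.
from_codepoints = "".join(chr(cp) for cp in codepoints)
from_bytes = utf8_bytes.decode("utf-8")
```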

> The main reason I would like to do this is to implement all of the bitwise
> operations found in the C family of languages, which in this case includes
> Ruby.

Aside: I'm not sure that these are mutually exclusive. << and >> are
infix operators; when would they be confused with the syntax around
binaries?

But in any case they are cumbersome in this role, for an application
which makes wide use of binaries.

Tony Arcieri

Dec 16, 2009, 2:34:11 PM
to re...@googlegroups.com
On Wed, Dec 16, 2009 at 12:11 PM, candlerb <b.ca...@pobox.com> wrote:
> Can I make a wacky suggestion: what if you were to use double quotes
> for binaries, and something else for lists of characters?

The binary syntax is more complex than mere strings, and I want to support the full capabilities of binary pattern matching present in Erlang, e.g.:

%b[x:16, y, z/binary] = bin
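
To spell out what such a pattern does, here is a rough Python equivalent, assuming the Erlang defaults for integer segments (unsigned, big-endian): `x` takes the first 16 bits, `y` the next byte, and `z` whatever remains. The function name `match_header` is hypothetical.

```python
import struct


def match_header(data: bytes):
    """Destructure data as a 16-bit big-endian integer, one byte, and the rest."""
    if len(data) < 3:
        raise ValueError("need at least 3 bytes to match x:16, y")
    # ">HB" = big-endian, unsigned 16-bit integer followed by an unsigned byte.
    x, y = struct.unpack_from(">HB", data, 0)
    z = data[3:]  # z/binary: everything after the first three bytes
    return x, y, z


x, y, z = match_header(b"\x01\x02\x03rest")
```

What takes a format string plus a slice here is a single pattern in the bit syntax, which is why a plain string-literal notation would not be enough.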

That said, I am definitely up for alternative syntax proposals for binaries if anyone else has them.

Double quoted strings are going to be just that: strings.  I am probably going to implement strings as immutable objects represented internally as iolists.  I would like to support a real "String" class intended for string-like behaviors, none of this strings-as-lists or strings-as-binaries crap.

> Counter-arguments:
>
> * lots of Erlang standard libraries expect lists of characters

Many Erlang APIs require iolists as arguments.  However, yes, a conversion will be needed from a Reia String to a list or iolist (and back again).  Dealing with this in the previous version of Reia was somewhat ob  
 
> Aside: I'm not sure that these are mutually exclusive. << and >> are
> infix operators; when would they be confused with the syntax around
> binaries?

Admittedly it might very well be possible to use << and >> for both purposes.  I think it would be confusing, though.

candlerb

Dec 17, 2009, 5:55:54 AM
to Reia
> I am probably
> going to implement strings as immutable objects represented internally as
> iolists.

And an iolist is a list of (characters and/or binaries and/or
iolists), correct?

I see. In that case, a string literal like "abc" might be internalised
as {string, [<<"abc">>], 3}

That's nice. It's easy to turn interpolated strings into this form
too.
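
A minimal sketch of that representation in Python, assuming the third tuple element is a cached byte length (for "abc" the byte and character counts coincide, so the source does not settle which is meant). `StringRep`, `literal`, and `interpolate` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class StringRep(NamedTuple):
    """Hypothetical stand-in for the {string, IoList, Length} tuple."""
    parts: list   # nested list of bytes objects (an iolist-like structure)
    length: int   # cached byte length (assumption: bytes, not characters)


def literal(text: str) -> StringRep:
    """Internalise a string literal as a one-element iolist plus its length."""
    data = text.encode("utf-8")
    return StringRep([data], len(data))


def interpolate(*pieces) -> StringRep:
    """Combine StringRep values and arbitrary objects into one string."""
    parts, total = [], 0
    for piece in pieces:
        if not isinstance(piece, StringRep):
            piece = literal(str(piece))  # stand-in for calling to_s on the value
        parts.append(piece.parts)        # nest the existing list; no bytes copied
        total += piece.length
    return StringRep(parts, total)


greeting = interpolate(literal("x = "), 42)
```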

> The binary syntax is more complex than mere strings, and I want to support
> the full capabilities of binary pattern matching present in Erlang

Understood - you need another 'grouping' construct for dealing
natively with binaries.

I can't think of anything better than $[...]. It looks list-like, and
$ was used in BASIC for strings :-)

For simple literals, perhaps you could allow

mybin = $["abc"]

to be shortened to

mybin = $"abc"

You could then use either $[] or $"" for an empty binary.

Tony Arcieri

Dec 17, 2009, 4:12:36 PM
to re...@googlegroups.com
On Thu, Dec 17, 2009 at 3:55 AM, candlerb <b.ca...@pobox.com> wrote:
> And an iolist is a list of (characters and/or binaries and/or
> iolists), correct?

Correct.  iolists are nice as they have good overall performance characteristics.  For example, appending to them is constant time:

"foo"
["foo", "bar"]
[["foo", "bar"], "baz"]

The code for flattening them out to a single binary is written in C "for speed!" and many APIs work on iolists directly.
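
The two properties above can be sketched in Python, modelling an iolist as nested lists of bytes objects and integers in 0..255. This is an illustration of the idea only, not Erlang's C implementation; `append` and `flatten` are hypothetical helper names.

```python
def append(iolist, item):
    """Append by wrapping both in a new two-element list; nothing is copied."""
    return [iolist, item]


def flatten(iolist) -> bytes:
    """Flatten nested lists of bytes and ints (0..255) into one bytes object."""
    out = bytearray()
    stack = [iolist]
    while stack:
        node = stack.pop()
        if isinstance(node, (bytes, bytearray)):
            out += node
        elif isinstance(node, int):
            out.append(node)  # raises ValueError if outside 0..255
        elif isinstance(node, list):
            # Push children in reverse so they are popped in original order.
            stack.extend(reversed(node))
        else:
            raise TypeError(f"not a valid iolist element: {node!r}")
    return bytes(out)


step1 = b"foo"
step2 = append(step1, b"bar")   # [b"foo", b"bar"]
step3 = append(step2, b"baz")   # [[b"foo", b"bar"], b"baz"]
flat = flatten(step3)
```

Each `append` builds the same nesting as the three lines above, and the cost of producing contiguous bytes is deferred to a single `flatten` at the end.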
 
> I see. In that case, a string literal like "abc" might be internalised
> as {string, [<<"abc">>], 3}

Yep... interesting you stuck the length in there AOT... not sure if I'll end up doing that or not.
 
> That's nice. It's easy to turn interpolated strings into this form
> too.

Indeed, as you can simply dump the return value of any interpolated code directly into the iolist (or perhaps invoke to_s or the like on it first).
 
> For simple literals, perhaps you could allow
>
>  mybin = $["abc"]
>
> to be shortened to
>
>  mybin = $"abc"

Yep, that was part of my original proposal :)


Tony Arcieri

Jan 13, 2010, 10:05:04 PM
to re...@googlegroups.com
I noticed Efene is using <[ ... ]> as the delimiters for binaries.  I think that's a pretty nifty syntax.

What does everyone think about that?

Lucian

Mar 6, 2010, 5:32:12 PM
to Reia
How about b'abc' or b"abc", like Python 3?

Tony Arcieri

Mar 7, 2010, 3:00:58 PM
to re...@googlegroups.com
That works for string literals, but Erlang's syntax for binaries is a lot more powerful and includes pattern matching components as well, e.g.


<[x:16, y, z/binary]> = bin

