[erlang-questions] implementing Forex Trader in Erlang

Shahrdad Shadab

unread,

May 17, 2012, 12:19:42 PM5/17/12

to erlang-q...@erlang.org

Hi Erlangers

I have to implement a foreign exchange trader software for a financial firm.
They need fast response and high availability. They also need parallel real time feed from multiple brokers.
As I did my preliminary investigations,It seems most of the tools on the field are implemented in Microsoft technologies like .net or com/OLE.
I believe (although I might be wrong) the broker servers are DDE servers.
I decided to use Erlang/otp. Does anyone know if something like this has done before in Erlang.
Also I appreciate any guidance on how to get feed from trader servers.

Thanks a lot
Best regards
Shahrdad

--
Software Architect & Computer Scientist

Michael Truog

unread,

May 17, 2012, 6:04:05 PM5/17/12

to Shahrdad Shadab, erlang-q...@erlang.org

Here is the architecture of something similar:
http://www.connamara.com/solutions-hftframework.html


_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Richard O'Keefe

unread,

May 18, 2012, 12:38:42 AM5/18/12

to Tom Parker, erlang-q...@erlang.org

On 18/05/2012, at 12:21 PM, Tom Parker wrote:
>
> But here is where I get stuck.
>
> Why not use the record format? ... (Bear with me, I believe the issues in that reuse can be addressed.)

BECAUSE THE RECORD FORMAT ALREADY EXISTS!

It's that simple. Back when Eiffel was undergoing major changes,
I proposed something I called the "diarthognathus principle".
When you want to evolve from supporting technique X to supporting
technique Y, you had *better* evolve through a stage where you
support *both* techniques.

And that means that there has to be a period when frames have begun
to work and record syntax still works *EXACTLY* the way it does now.

Frames and records are not inter-operable (records are just a
stylised use of tuples, while frames are a new completely distinct
type).

> Issue #1 with the ~ syntax) Today's Erlang developer, if using records, has already provided a form of that information. It already has a key->value mapping.

And it must continue to work *as is* without breaking anything.

But there is one thing more.
Suppose you go around mechanically translating

#id0{id1=E1, ..., idn=En}
to
<id0{id1~E1, ..., idn~En}>

which is a fairly straightforward task.

What happens? The code immediately stops working. Because the two forms
of construction DO NOT HAVE THE SAME SEMANTICS.

A frame construction says "here is ALL the information about this frame."

A record construction says "here is SOME information about this record,
pick up anything I've left out from the -record declaration."

Because these things have different semantics they MUST have different
syntax.

>
> Issue #2 with the ~ syntax) The format proposed for extraction from a frame, with a tilde, is ~name(Frame). (okay, a little bike shed paint :) ) This reverses the standard ordering of items from Major to minor.

I am tempted to use a very rude word here.

THERE IS NO STANDARD ORDER!!!

It may still be true that there are more lines of COBOL than any other language.
In COBOL, it's <field> of <record>.
In Algol 68, it's <field> of <record>.

predecessor count OF node -:= 1

seems to me a perfectly natural way to decrement the number of predecessors of
some node.

For a functional language, note that
a field selector is semantically just an overloaded function.

In the Pop family of languages, record selectors are just functions (well,
technically doublets) and are called exactly like functions.
In Lisp, selectors are called exactly like functions.
In ML-family languages, if f is a field, #f is the function that extracts
that field (and yes, that's where I got the syntax).

Yes, in Smalltalk the analogue of field selection is "object field",
but that's how Smalltalk writes unary functions: "-42 abs".

The syntax is neither novel nor unnatural unless you take an
extremely parochial view of programming languages.

> We are all already familiar with the record format being a key:value mapping, and that's what a frame would look like:
> P = #point{x=4,y=5}.
> or as part of the frame proposal, if you prefer the anonymous definition vs. a declared format, we can use (from erlson):
> P = #{x = 4, y = 5}.

It's not a matter of preference, not having declarations is what frames are ABOUT.

> The big question:
> Q: How do I know this #point is a frame and not a record?
> A: Well, we have 2 ways of knowing:
> (1) A language requirement would be imposed that #{ is always an "anonymous frame" since it is not a valid record.
> (2) The namespace defined for -record(ATOM, {...}) and -frame(ATOM,{...}) would be shared (compiler - well, technically the preprocessor - would halt with an error if a record and a frame with the same ATOM identifier were included in the module to be compiled.) Since the preprocessor destroys records, the only #atom{ the compiler would ever see would be frames - it is never ambiguous to the compiler.

No no no no NO! There can be *type* specifications that declare some type to be
implemented as a frame of a particular shape, but there are no frame declarations
as such. NOT having them is one of the major design goals!

>
> Comments on your questions from section 4 of your document in comparing solutions:
> Q1: Frames answer changes to Yes, this actually forces an increase in intelligence of the preprocessor.

That is utterly unacceptable. The frame proposal *EXISTS* in order to have
something vaguely recordy that makes NO use of the preprocessor whatsoever.

> Q4: This would alleviate the impact on Dialyzer as it already understands record formats (should make it a less challenging change)

No, it wouldn't. I keep coming back to the same key point:
frames and records have DIFFERENT SEMANTICS.
Syntactic processing is not a challenge.
The challenge in extending the Dialyzer to handle frames is
in handling the SEMANTICS.

> Q9: If the frame is a declared frame, then the frames answer is enhanced to be able to catch this issue at compile time, otherwise as before

Frame declarations do not exist. By intent, there is no such animal
as "a declared frame". Adding frame declarations is like adding
huge paddle wheels to a hot air balloon and still expecting it to fly.

One of the things that frames are meant for is to communicate
structured values from one context to another context WITHOUT
there being any declaration shared by both.

> Q10: If the frame is a declared frame, then the frames answer is enhanced to be able to catch this issue at compile time, otherwise as before

It is an important feature of the design that there are no
frame declarations and there is NO preprocessor involvement
in the processing of frames whatsoever.

> Q11. Frames now take on the unnecessary recompile situation of records (if the record is a declared frame)

Isn't that a clue that there is something wrong with the idea of
-frame declarations?

> Q20: I believe this would change to make detection easier, but am not totally sure.

If you want records, you know where to find them.

Records are *not* going to go away. They have their virtues
and their vices. Frames are *different* virtues and vices.
Trying to make one look like the other is about as kind to
programmers as trying to make numbers and strings look the same.
It is seductively attractive, but causes more trouble than it's
worth in practice. (Speaking as someone who maintains an AWK
implementation and tried to write an AWK compiler.)
>
> Additional Q&A:
> Q: So this forces declaration of a frame if I want an atom between # and {
> A: That is my thought. It would look like:
> -frame(point, {x, y}).

But the frames proposal already allows <name{ fields }>.
Without any declaration.

We know what is the right way to catch type errors and typos.
That's what the Dialyzer does, and does very well.

Speaking as someone who has tried to maintain other people's
code, if I see #point{x = 1, y = 2} I want to *KNOW* that
that is a record, not anything else at all. I don't want
it to maybe be a record or maybe something else, possibly
depending on a file that isn't open in my editor.

You are asking for a "feature" that I regard as active
malevolence towards the programmer.

> Q: How does extraction change?
> A: Extraction with the record-style format ... and is major to minor:
> Frame#format.member

Having to repeat the record name is one of the most hated aspects of
the present -record syntax. I am having no success understanding
why anyone would *WANT* to imitate that abomination.

As for major to minor, I have never perceived anything major or minor
in field access. Some languages put the field first, some put it second.
BIG DEAL. Languages are different. Get over it. They don't all have
to look like Java.

I skipped the rest of the message.

The frame proposal exists precisely in order to have something that
can do the same job that records do but in a DIFFERENT way. Because
the semantics is different, programmers need to know what they are
dealing with, so it is an incredibly bad idea to use the same syntax
for both. It doesn't matter so very much what the syntax _is_ as
long as it is clearly _different_. Frames "fit" into Erlang data
structures in a way that records never did; _all_ the information
that exists about a frame is right there in it without consulting
any declaration. In particular, changing a declaration in one
field cannot change the meaning of a frame construction in another.
Adding -frame declarations isn't "raising" the proposal but "razing" it.

Ilyushonak Barys

unread,

May 18, 2012, 2:05:09 AM5/18/12

to Michael Truog, Shahrdad Shadab, erlang-q...@erlang.org

It looks likes that this guys use Erlang only for messaging middleware. The server components, which connected to exchanges, are implemented in c++.

_______________________________________________________

The information contained in this message may be privileged and conf idential and protected from disclosure. If you are not the original intended recipient, you are hereby notified that any review, retransmission, dissemination, or other use of, or taking of any action in reliance upon, this information is prohibited. If you have received this communication in error, please notify the sender immediately by replying to this message and delete it from your computer. Thank you for your cooperation. Troika Dialog, Russia.

If you need assistance please contact our Contact Center (+7495) 258 0500 or go to www.troika.ru/eng/Contacts/system.wbp

o...@cs.otago.ac.nz

unread,

May 18, 2012, 12:42:24 PM5/18/12

to erlang-q...@erlang.org

If I trimmed the frames proposal down to just say 10 pages,
presenting the design goals (and non goals), the basic semantics
of the data structure, how it can be implemented in a BEAM-like
architecture, and some measurements from my mini-BEAM, is that
something that might be of interest for the next Erlang conference?

Garrett Smith

unread,

May 18, 2012, 4:57:00 PM5/18/12

to o...@cs.otago.ac.nz, erlang-q...@erlang.org

On Fri, May 18, 2012 at 11:42 AM, <o...@cs.otago.ac.nz> wrote:
> If I trimmed the frames proposal down to just say 10 pages,
> presenting the design goals (and non goals), the basic semantics
> of the data structure, how it can be implemented in a BEAM-like
> architecture, and some measurements from my mini-BEAM, is that
> something that might be of interest for the next Erlang conference?

I apologize, this is somewhat off topic and not an answer to your question...

But does anyone know where the Frames proposal stands vis-a-vis the
OTP team's work on hashes?

My reference point is Kenneth's presentation last year:

http://www.erlang-factory.com/upload/presentations/468/EUC_Hashes2011.pdf

Garrett

Kostis Sagonas

unread,

May 19, 2012, 11:26:01 AM5/19/12

to erlang-q...@erlang.org

On 05/18/2012 02:21 AM, Tom Parker wrote:
> ....
>
> Where I think we would vehemently agree: I expect Erlang to be robust.
> That's why I'm even here. An issue with records where you can compile
> two files (using the same .hrl) and end up with a result that (a)
> compiles, (b) doesn't produce an error and (c) produces the wrong
> answer... is a serious issue. It distinctly shows the shortcomings of
> records.

Not really related to the frame discussion, but I would like to point
out that the above statement is wrong. The situation you describe does
not show shortcomings of records; instead it shows shortcomings of
programming with .hrl files and without an appropriate 'make'-like
utility to track dependencies between files.

Records are not without problems, but IMO this is not one of them. This
is a problem of using .hrl files and choosing to program in a way which
is not disciplined and thus error prone. There is a very simple way of
avoiding this problem that IMO is a very nice way of programming: Use
records as abstract data types and have them in a single module that
exports appropriate getters and setters for manipulating fields of the
record. If, for whatever reason, you do not like this way of using
records and want the ability to perform pattern matching and field
extraction in more than one module as you do today with records, better
make sure you use an appropriate Makefile (or equivalent) for compiling
your application. It's not the fault of record syntax if you do not use
such a mechanism! (Makefiles are technology of the 70's after all...)

Kostis

Matthew Evans

unread,

May 19, 2012, 2:32:34 PM5/19/12

to kos...@cs.ntua.gr, erlang-q...@erlang.org

Agreed,

We are making use of Ulf's exprecs module for this precise reason.

Matt

> Date: Sat, 19 May 2012 17:26:01 +0200
> From: kos...@cs.ntua.gr
> To: erlang-q...@erlang.org
> Subject: [erlang-questions] Robustness problems when using records [WAS: Re: Question/Alternative on Frames Proposal [Warning: Long]]

Ulf Wiger

unread,

May 20, 2012, 4:07:36 AM5/20/12

to Tom Parker, erlang-q...@erlang.org

On 19 May 2012, at 22:29, Tom Parker wrote:

Therefore, I continue to assert my statement below is correct as written. Records are the issue, and it is the destruction of the name information before it reaches the VM that is the issue.

Actually, I agree with Kostis: the biggest problem is the reliance on

.hrl files and the preprocessor for sharing code between modules.

You can put a local function definition in a .hrl file as well, and it

has been done too. Changes to a record usually trigger badmatch

or function_clause exceptions if not all modules are recompiled

and replaced in a coordinated way. This is a problem, for sure,

but if you make subtle semantic changes to a function in a .hrl

file, or to a macro (which can expand to pretty much anything),

then *anything* can happen - including a large set of extremely

subtle errors that are hard to achieve with records.

So putting a function definition in a .hrl file can cause some

really subtle problems. This obviously doesn't mean that functions

are fundamentally flawed, but yet an example that reliance on

preprocessors is a dangerous thing.

The worst thing you can do with records - and it's indeed a bad

thing - is to rename attributes and use the same positions for

entirely different data (no, worst would be the *same* data type

but with a different interpretation). Luckily any half-decent

programmer will understand enough not to do this.

Richard O'Keefe has many times stated that a key goal is to

remove the reliance on the pre-processor.

If you think about it, Erlang has no facility for sharing structured

data types between modules, except for code - as Kostis mentioned

- and pattern matching on the primitive elements of the structure.

You can provide accessor functions that are exported from the

same place where the data structure is managed, or you can

use data structures that are as shallow as to be easy to inspect

through pattern-matching. If this is not enough, you can pass along

property lists, which are simple, yet flexible enough to convey just

about anything - however cannot be quickly inspected with

pattern-matching. So people trade safety for speed.

…or rely on convention, which actually works very well.

Remember that records are little more than syntactic sugar.

They do not introduce a data type that would offer anything more

than the above. You could get a similar effect by introducing a

series of home-cooked macros, expanding to operations and

patterns on a tuple structure, and then let multiple modules

rely on them through a .hrl file.

I even have some old code lying around from the time before

records. It had a lot of code like this:

person(name, {person,X,_,_}) -> X;

person(age, {person,_,X,_}) -> X;

person(address, {person,_,_,X}) -> X.

person(name, X, {person,A,B,C}) -> {person,X,B,C};

person(age, X, {person,A,B,C}) -> {person,A,X,C};

person(addr, X, {person,A,B,C}) -> {person,A,B,X}.

A bit fiddly to create, but easy to verify through visual inspection.

Of course, you could put that in a .hrl file and use as a 'shrared

data type'. It would suffer from roughly the same problems as

records.

The obvious limitations with records were the subject of animated

discussion even before they were added to the language. You

wouldn't recall this unless you were a member of a pretty small

group of people debating this on a mailing list inside Ericsson, long

before the Open Source release.

Even then it was agreed that the proper way to do it was to introduce

a new data type. As my memory serves, the problem with that was

that JAM didn't have any more type tags to use. A redesign of the

memory management structure in the VM would be required in order

to add new data types. In light of this, records were seen as a

cost-efficient way to solve a common problem of copy-paste

programming.

Later on, I believe it was the HiPE team that proposed a new tagging

scheme and wrote some working code to demonstrate. By the time

it was integrated into Erlang, records had become ubiquitous, and

could not be easily replaced.

If my recollection is faulty, the old-timers can correct me. I recall this

playing out right after I joined Ericsson in 1996.

(The new tagging scheme was introduced in R7B in 2000, and is described

in Mikael Petterson's PhD thesis: http://www.it.uu.se/research/publications/reports/2000-029/)

One can argue that it's taken a long time to correct this issue, and

add a structured data type with all the goodness of records and

none of their drawbacks. The reasons:

- Erlang, even taking this into account, is a darn good language for

writing complex, robust and scalable systems.

- Convention actually works pretty well in practice.

- The OTP team wanted to get it right the second time.

- Backwards compatibility is considered extremely important in the

Erlang world (exactly because we don't want to break the

complex, robust and scalable systems already written in Erlang)

BR,

Ulf W

Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.

http://feuerlabs.com

eigenfunction

unread,

May 20, 2012, 8:21:09 AM5/20/12

to erlang-q...@erlang.org

> There is a very simple way of
> avoiding this problem that IMO is a very nice way of programming: Use
> records as abstract data types and have them in a single module that
> exports appropriate getters and setters for manipulating fields of the
> record. If, for whatever reason, you do not like this way of using
> records and want the ability to perform pattern matching and field
> extraction in more than one module as you do today with records, better
> make sure you use an appropriate Makefile (or equivalent) for compiling
> your application. It's not the fault of record syntax if you do not use
> such a mechanism! (Makefiles are technology of the 70's after all...)

Would you care to elaborate with a simple example in the context of a
gen_server?
I definitely feel guilty of what you are describing.

Nicholas Frechette

unread,

May 20, 2012, 10:24:38 AM5/20/12

to Ilyushonak Barys, erlang-q...@erlang.org

Hi,
At home, I built a system to do just that. Erlang worked great to
build the prototype and I was able to quickly iterate and get
something stable running. In fact, it has been running for 2 years now
without downtime. I also fell in love with erlang during this time.
Unfortunately, it also became clear at some point that it would have a
hard time to scale beyond a certain size. Things like modifying large
datasets in memory (order book management, historical data management)
very quickly are not strong suits of erlang. You can probably scale to
a professional suitable level with the proper time/resources however
for me, it wasn't exactly worth it. I already have my GUI application
in C# and seeing how I build high availability applications with tight
cpu/memory constraints every day at work in C++, C# proved a better
choice for me even if only because I know when the need arises, I'll
be able to optimize it easily to a level I'm comfortable with. In
erlang, I would have had to resort to using drivers written in C++ for
the critical bits or entire processes built in C++. C# will also allow
me to leverage my functional prog. skills with F# if the need arises
:)

Basically, that was it for me. I worked in erlang 1+ year building it,
I had a great time doing it but ultimately changed my mind despite the
fact it'll take me 1 year to rewrite the erlang bits in C#. You could
easily build everything you need in erlang, there is no doubt there
(and fairly quickly too). Whether you can reach cpu budgets and memory
budgets you set yourself will largely depend on how motivated you are
with erlang and how much you are willing to hybridize your system
(either with drivers or other processes). Using it only for message
processing seems reasonable but a bit strange if your solution is in a
different language. In C# for example, NServiceBus would probably be
an equally good solution. I'm sure java must have a stable solution
for messaging as well. Rolling out your own isn't all that hard
either, the Disruptor guys sorta did it.

That's my story and my 2 cents. Note that it wasn't a professional
project, only a hobby.
Cheers,
Nicholas

Richard O'Keefe

unread,

May 20, 2012, 6:42:56 PM5/20/12

to Garrett Smith, erlang-q...@erlang.org

On 19/05/2012, at 8:57 AM, Garrett Smith wrote:
>
> I apologize, this is somewhat off topic and not an answer to your question...
>
> But does anyone know where the Frames proposal stands vis-a-vis the
> OTP team's work on hashes?

I'm not sure what you are asking. The paper that you mentioned made it
quite clear that they were NOT experimenting with frames or anything in
the same area of design space.

Jan Burse

unread,

May 21, 2012, 4:53:18 AM5/21/12

to o...@cs.otago.ac.nz, erlang-q...@erlang.org

Richard O'Keefe schrieb:

>
> On 19/05/2012, at 8:57 AM, Garrett Smith wrote:
>>
>> I apologize, this is somewhat off topic and not an answer to your question...
>>
>> But does anyone know where the Frames proposal stands vis-a-vis the
>> OTP team's work on hashes?
>
> I'm not sure what you are asking. The paper that you mentioned made it
> quite clear that they were NOT experimenting with frames or anything in
> the same area of design space.
>

Hi Richard,

If I remember well, your proposal mentions some benchmarks
with monomorphic caching. You also explained the SmallTalk
history of caching methods. So basically I suppose someone
is doing the following:

if (cache == null || cache.base != base)
cache = lookup(base,key); /* costly */
return cache;

Since last week I have been experimenting with polymorphic
caching. And it seems that is practically same speed as
monomorphic caching, but has the additional advantage of
not trashing the cache.

Trashing of monomorphic cache happens when the base
is not constant for a call site. When trashing happens
time spent goes drastically up. I observed for some
test case of mine:

No Trashing: 1400 ms
Trashing: 2200 ms

So I did a very simple polymorphic cache consisting of a
a single linked list. Its adding new entries at the end
of the list if it didn't find an entry with matching
base. Its not that bad:

Monomorphic: 1400 ms
Polymoprhic: 1450 ms

I am not using some locking. Its just overwriting the
the last link with a new entry, so concurrently it might
do double work sporadically, but the result should never
be wrong. But it requires GC of the eventually lost record.

Something done for Erlang records?

Bye

Garrett Smith

unread,

May 21, 2012, 8:15:40 AM5/21/12

to Richard O'Keefe, erlang-q...@erlang.org

On Sun, May 20, 2012 at 5:42 PM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>
> On 19/05/2012, at 8:57 AM, Garrett Smith wrote:
>>
>> I apologize, this is somewhat off topic and not an answer to your question...
>>
>> But does anyone know where the Frames proposal stands vis-a-vis the
>> OTP team's work on hashes?
>
> I'm not sure what you are asking. The paper that you mentioned made it
> quite clear that they were NOT experimenting with frames or anything in
> the same area of design space.

Richard, I'm not seeing what you're seeing.

That's a PDF of presentation slides from 2011 User Conf-- it lists
"frames" in the title and as a input to their process.

I read the document as a status update on their work, listing some
performance results from various data structures.

The "Conclusion and way forward" slide seems open ended.

In any event, is not relevant to any discussion here what the OTP team
is actually working on? Is this is a blip of interest in a decade old
proposal that has no shot of getting into the language?

Garrett

Kenneth Lundin

unread,

May 21, 2012, 8:54:19 AM5/21/12

to Garrett Smith, erlang-q...@erlang.org

On Mon, May 21, 2012 at 2:15 PM, Garrett Smith <g...@rre.tt> wrote:

On Sun, May 20, 2012 at 5:42 PM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>
> On 19/05/2012, at 8:57 AM, Garrett Smith wrote:
>>
>> I apologize, this is somewhat off topic and not an answer to your question...
>>
>> But does anyone know where the Frames proposal stands vis-a-vis the
>> OTP team's work on hashes?
>
> I'm not sure what you are asking. The paper that you mentioned made it
> quite clear that they were NOT experimenting with frames or anything in
> the same area of design space.

Richard, I'm not seeing what you're seeing.

That's a PDF of presentation slides from 2011 User Conf-- it lists
"frames" in the title and as a input to their process.

I read the document as a status update on their work, listing some
performance results from various data structures.

The "Conclusion and way forward" slide seems open ended.

In any event, is not relevant to any discussion here what the OTP team
is actually working on? Is this is a blip of interest in a decade old
proposal that has no shot of getting into the language?

Garrett

The conclusion from my talk at EUC November 2011 was like this:

Conclusions and way forward
› New better records
– optionally named, possibly declared
– few named fields

– many instances
› ------- and --------------------------------------------------------------------------
› Hashmaps
– many ”keys”
– few instances
› are two different things (but similar)
› we want both
› HAMT looks really promising for Hash Maps
› but it might be hard to combine with one single representation

› We have to decide what to address first
› Probably some experimental implementation released during 2012

I think this clearly indicates that we intended to work on both "better records" and "hashes".

Now I can as a fact tell you that we are addressing both.

We are prototyping an implementation of "hashes", or whatever we will call them, and we are also prototyping a reference implementation of the frames proposal.

We think that the frames proposal is interesting and think we need an implementation so we can evaluate performance and compare it with records. After the evaluation we will come to a decision and will also give more details about possible changes/restrictions/additions we want to make compared with the current frames proposal. What we find most interesting with the frames proposal is the well thought semantics.

/Kenneth , Erlang/OTP, Ericsson

Richard O'Keefe

unread,

May 21, 2012, 6:42:53 PM5/21/12

to Garrett Smith, erlang-q...@erlang.org

On 22/05/2012, at 12:15 AM, Garrett Smith wrote:
>>
>> I'm not sure what you are asking. The paper that you mentioned made it
>> quite clear that they were NOT experimenting with frames or anything in
>> the same area of design space.
>
> Richard, I'm not seeing what you're seeing.
>
> That's a PDF of presentation slides from 2011 User Conf-- it lists
> "frames" in the title and as a input to their process.

We're talking about
http://www.erlang-factory.com/upload/presentations/468/EUC_Hashes2011.pdf
and that's all I have to go on.

Slide 0: title
Slide 1: there's always room for improvements RECORDS Hash maps
Slide 2: outline of records (few named fields, many instances, matchable)
Slide 3: not a distinct data type, not suitable for many fields
Slide 4: outline of hash maps (proplists, dict, gb_trees, gb_sets)
Slide 5: what's wrong with proplists
Slide 6: what's wrong with dict
Slide 7: Requirements/Goals ***FOR HASHMAPS***

And it is quite clear from content slide 7 that they are
talking about something that can do the job of dict and
gb_tree; suitability for type checking is NOT listed.

More precisely, content slide 7 gives the impression of
a vague desire for something that's a floor polish AND
a desert topping, a bulk hashed container AND a possible
replacement for records.

Slide 8: mentions me, Joe Armstrong, Anton Lavrik, Phil Bagwell.
Slide 9: Erlson syntax.
Slide 10: >>> We have implemented prototypes and performed measurements
using the Vlists and HAMT datastructures internally in the
Erlang VM <<<

It is very very clear from this that the experiments
concerned something suitable for *LARGE* data structures.
Slide 11: Diagram of records as tuples.
Slide 12: Diagram of VLists / vhash lists.
Slide 13: Diagram of Hash Array Mapped Tries.

Popcount is one of those things that *ought* to be fast but
that often get left out of actual chips. POPCNT wasn't on
PCs until Core i7, I believe. It's in the SPARC V9
architecture, but is not in all SPARC V9 _hardware_, like
the machine on my desk.

Slide 14: Diagram of hash table with exception list
Slide 15: Another such diagram

This appears to be an adaptation of a 1980s technique for
O(1) array update; the latest version is compact and in
place while old versions have a list of deltas.

Slide 16: Graph of time in microseconds against number of elements
inserted for dict, gb_trees, vlist_hash, and hamt (up to ~ 30)
Slide 17: Another such (up to ~ 65 000)
Slide 18: Graph of lookup time (up to ~ 30)
Slide 19: Another such (up to ~ 65 000)
Slide 20: Graph of memory (up to ~ 30 elements)

These data structures take from 4 to 8 words per element.
Frames, in contrast, take 1 to 2 words.

Slide 21: Another such graph (up to ~ 65 000 elements)
Slide 22: Table of requirements by data structures.

Frames are conspicuous by their absence.

Slide 23: "new better records" "and Hashmaps" "ARE TWO DIFFERENT THINGS."

I agree with this distinction and made it a decade ago.

This talk was about Hashmaps, NOT about anything credible as
new better records.

It will be really good for Erlang to have a better implementation of
hashmaps. This was good work. It owes nothing to me; I'd never even
heard of Hash Array Mapped Tries before. I look forward to seeing
something coming of it.

But there is NOTHING reported in that presentation that evaluates
alternatives for "new better records".

> In any event, is not relevant to any discussion here what the OTP team
> is actually working on? Is this is a blip of interest in a decade old
> proposal that has no shot of getting into the language?

What's there is a solid examination of "can we provide better support
for functionally pure persistent dictionaries than what we've already
got" that briefly drags a "new better records" red herring over the
track at the beginning and then drops it. It expresses the _wish_
that one data structure might be usable for both purposes, and I
suppose you could argue that slide 20 does *implicitly* demonstrate
that the current set of candidates for better dictionaries are *NOT*
credible candidates for better records.

To be fair, there's no reason why a single logical data structure
has to have a *uniform* representation at run time (as I was saying
to a data base class yesterday). One could use something like frames
up to a certain size and something else above that size.

one data structure _might

Reply all

Reply to author

Forward