[erlang-questions] [ANN] EGTM: embedded ACID compliant NoSQL engine for Erlang

Tomas Morstein

Aug 23, 2012, 8:24:08 AM
to erlang-q...@erlang.org
Dear Erlang community,

It's finally here! We've just released the first public version of IDEA EGTM.

EGTM is an Erlang application built on top of the FIS GT.M engine, which is a complete implementation of M[UMPS] technology.
The main characteristic of M[UMPS] is that it is both a language and a database, so one can directly access persistent multidimensional arrays from the M language.

Main benefits of GT.M itself:
- it's ANSI/ISO M[UMPS] compatible
- fully ACID compliant
- it's _very_ fast even if your data layout is not so well optimized
- it's small and embedded
- it has native replication
- it can be distributed across multiple nodes (remote data may be treated as local)
- it's mature -- as the heart of the PROFILE core-banking system, it is used in hundreds of financial institutions around the world
- its Linux version is free

Main benefits of EGTM:
- inherits all the properties of GT.M
- allows Erlang to freely call MUMPS routines and share data with M[UMPS] applications without any limitations
- can be used as data-only storage for Erlang, without a single line of M[UMPS] code, so you don't have to learn the M language, which may not look so friendly (but is, in fact, very simple and powerful!)
- EGTM can be deployed on a private secondary GT.M replica instance just to do some data mining (web reporting via ChicagoBoss, for example) without affecting the master database
- can be used with any Erlang application as well as with ChicagoBoss or any other web framework
- through layered software (from IDEA :) ) like IODB, it's possible to map EGTM tree structures to Erlang objects and optionally access them via SQL
- IODB is not publicly available yet (coming soon!), but we use it in production, at the moment in conjunction with the ChicagoBoss integration module, although the model compiler is ready to be used standalone
- another piece of layered software is EGTM HAC, the EGTM High Availability Cluster extension -- the best of the HA properties of native GT.M and the distribution properties of Erlang

Even if the performance of any external binding will never be as good as native M code, it should be enough for many applications. If performance is a primary requirement, it's possible to implement complex tasks in native M and simply call the native routine from EGTM.

An example of an application written in the ChicagoBoss framework with pure EGTM (completely without IODB objects) is our website http://www.idea.cz.
I am not able to do it immediately, but the source code may be released later as a sample CB+EGTM reference project.

Some resources:
- EGTM on GitHub: http://github.com/ztmr/egtm
- EGTM "homepage": http://labs.idea.cz/egtm
- GT.M documentation: http://www.mumps.cz/gtm/
- slides about GT.M: http://www.mumps.cz/gtm/misc/120730-1agtmasnosqldatabase.pdf
- introduction to M-based systems for relational people: http://www.fooboo.org/~tmr/gtm/UniversalNoSQL.pdf

While I was posting this message, several discussions started in other Erlang-related groups:
- https://groups.google.com/forum/#!forum/chicagoboss
- https://www.linkedin.com/groups/EGTM-embedded-ACID-compliant-NoSQL-90878.S.149660834

Enjoy and feel free to ask!

Tom
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Zabrane Mickael

Aug 23, 2012, 8:53:49 AM
to Tomas Morstein, erlang-q...@erlang.org
Hi Tomas,

Didn't hear about MUMPS nor GT.M before.
It's simply awesome. Thanks for sharing !!!

Can I imagine "egtm" as a good replacement for my RIAK store (for fast key/value retrieval)?
Any practical benchmarks?

Finally, when are you planning to release "iodb" (http://www.idea.cz/technology)?

Regards,
Zabrane

Tomas Morstein

Aug 23, 2012, 8:20:19 PM
to erlang-q...@erlang.org
Hi,

Yes, MUMPS has been somewhat "hidden" from the mainstream for more
than 40 years (it was an operating system for DEC PDP minicomputers,
much like early UNIX versions).
Maybe that is also the reason why one of its commercial
implementations is called Caché (= hidden in French) :-)

To answer your questions:

(1) to be honest, I don't like benchmarks much. We don't
know RIAK well enough to compare against it. I also think that RIAK
is a much more high-level DB than GT.M/EGTM.

If you're asking about speed, I can only tell you stories
about M compared to relational databases (MySQL, PostgreSQL,
Oracle, MS-SQL). In short, even when we wanted to be fair
and artificially slowed down the M task with significant
de-optimizations, SQL needed hours for a job that
took a few minutes in MUMPS.
I will not share these "benchmarks" publicly as they're
good only for marketing, not for any fair, serious and
meaningful technology competition.
The number of factors and input variables is not small
and especially when it comes to NoSQL and distributed
environment, I'd say it's not comparable at all.

But because I am interested too, let's open a competition
like this:
- try to define a sensible test scenario (what's the goal,
what we're going to measure);
- implement it with RIAK;
- I will implement the same scenario with GT.M in the meantime
(in both variants: native M code and EGTM-powered Erlang)
- we can compare then...
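
One way such a shared scenario could be pinned down is a tiny harness that both sides run against their own backend. The sketch below is only an illustration in Python: `DictStore`, `run_scenario` and the put/get interface are hypothetical names, and the in-memory dict is a placeholder where a real run would wrap RIAK or GT.M/EGTM.

```python
import time


class DictStore:
    """In-memory placeholder backend (hypothetical).

    A real comparison would put a RIAK client or a GT.M/EGTM
    wrapper behind the same put/get interface.
    """

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def run_scenario(store, n_keys=10_000):
    """Write n_keys records, read them back, and return timings in seconds."""
    keys = [f"key{i:08d}" for i in range(n_keys)]

    t0 = time.perf_counter()
    for k in keys:
        store.put(k, k.upper())
    t1 = time.perf_counter()

    # Verify every value on the way back so the read phase cannot be skipped.
    misses = sum(1 for k in keys if store.get(k) != k.upper())
    t2 = time.perf_counter()

    return {"write_s": t1 - t0, "read_s": t2 - t1, "misses": misses}


result = run_scenario(DictStore(), n_keys=1000)
print(result)
```

The numbers from the placeholder backend mean nothing by themselves; the point is only that the goal (what is written, what is read, what is timed) is fixed in one place before anyone measures.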

(2) as for IODB, it is something we're using in production,
but it's not ready to be released to the public yet.
Actually, we're working on a brand new model compiler (which
unlocks a lot of new features as well as much better performance)
and we will probably change some general API functions to make
it more comfortable to use.
So the only thing I can say at the moment is: stay tuned! :-)

Since we use some IODB objects from JavaScript, we also have
a simple IODB REST API implemented as a ChicagoBoss controller,
which is part of IDEA CloudOS, where we also plan some web GUI
management (EGTM Global browser, IODB model manager, etc.),
but that is definitely still on our wishlist...

(3) what actually came to my mind (though I am not sure about
its value in practice):
- EGTM seems to be a good candidate to act as a Mnesia backend
(it can break all the DETS limits I am aware of)
- what about employing EGTM as a RIAK storage backend? :-)
(For example, I experimented with GT.M as a backend
for OpenLDAP around three years ago.)

Regards,
Tom.

Vincent de Phily

Aug 24, 2012, 5:16:28 AM
to erlang-q...@erlang.org, Tomas Morstein
On Thursday 23 August 2012 14:53:49 Zabrane Mickael wrote:
> Hi Tomas,
>
> Didn't hear about MUMPS nor GT.M before.
> It's simply awesome. Thanks for sharing !!!

You might want to avoid googling for "wtf mumps" to keep that spirit up then
:p

Sorry, couldn't resist, and I fully admit that I know very little about mumps
and that there are certainly good reasons why people are using it.

--
Vincent de Phily

Tomas Morstein

Aug 24, 2012, 7:02:55 AM
to erlang-q...@erlang.org
> > Didn't hear about MUMPS nor GT.M before.
> > It's simply awesome. Thanks for sharing !!!
>
> You might want to avoid googling for "wtf mumps" to keep that spirit
> up then
> :p

First, just a reminder:
EGTM's goal was to put the best of Erlang and MUMPS together :-)

So well, it's true that many people hate the M _language_ because,
for historical reasons, it has some features that result in situations
where an M beginner is not able to understand the code [*].

EGTM is quite different because you don't need to touch such
MUMPS code!

Just write regular Erlang code and use MUMPS only for its good
database properties; don't worry about the rest.

If you need to do some performance tuning, you can write
M code in whatever style you prefer, but you don't have to.
If you want to do Oracle tuning, you don't need to learn
a new, maybe crazy, language, but you need to become an Oracle
internals wizard.
And from my point of view, I'd go for learning a new language :)

-----
[*] back to these "language features":
Sure, it is possible to write inline code with single-character
commands, but that is only a _possible_ shortcut. Many modern M developers
use VeryLongCamelNaming.Possibly.With.Some.NameSpaces :-)

For example, Caché Studio (their IDE) has a button that will
expand/collapse all the code in your buffer to short/long
form.
So you open a routine with code like this:
    S X="" F  S X=$O(^Foo(X)) Q:X=""  W X,!
and if you click the expand button, it becomes:
    Set X="" For  Set X=$Order(^Foo(X)) Quit:X=""  Write X,!
and vice versa.
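
For readers who have never seen M, the one-liner above walks every first-level subscript of the global ^Foo in sorted order and prints it. A rough Python model of that loop is sketched below. It is not GT.M or EGTM code: the `Global` class and `walk` function are hypothetical stand-ins, and they handle only string subscripts (real M collation also sorts numeric subscripts ahead of strings).

```python
from bisect import bisect_right


class Global:
    """Toy in-memory stand-in for a one-level M global such as ^Foo."""

    def __init__(self):
        self._data = {}

    def set(self, subscript, value):
        self._data[subscript] = value

    def order(self, subscript):
        """Return the next subscript after `subscript`, or "" at the end.

        Sorting on every call keeps the sketch short; a real engine
        keeps its keys ordered on disk.
        """
        keys = sorted(self._data)
        i = bisect_right(keys, subscript)
        return keys[i] if i < len(keys) else ""


def walk(glob):
    """Mirror of: Set X="" For  Set X=$Order(^Foo(X)) Quit:X=""  Write X,!"""
    seen = []
    x = ""                      # Set X=""
    while True:                 # For (no argument: loop until Quit)
        x = glob.order(x)       # Set X=$Order(^Foo(X))
        if x == "":             # Quit:X=""
            break
        seen.append(x)          # Write X,!
    return seen


foo = Global()
for name in ("pear", "apple", "fig"):
    foo.set(name, 1)
print(walk(foo))  # ['apple', 'fig', 'pear']
```

The empty string plays two roles, exactly as in the M code: passed in, it means "start before the first subscript"; returned, it means "no more subscripts".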

The original short code was useful when transferring source
code over serial lines in the past.
Imagine that, for example, the OpenVista healthcare project
has around 25,000 routines, which is 331 MB of raw source
code with more than 2,000,000 LoC.
(Not counting their GUI client application.)

I guess the expanded source could grow to 1 GB, which is
not much today, but imagine it on, for example, a VAX computer
with 32 MB of RAM, a 25 MHz CPU and a 1 GB disk, serving a big
hospital with hundreds of users...

(And since we do migrations of legacy hardware, I can confirm
there are lots of mission-critical servers with a similar
or lower configuration still in production!)

> Sorry, couldn't resist, and I fully admit that I know very little
> about mumps
> and that there are certainly good reasons why people are using it.

Yep, this is a realistic point of view. Many people judge
what they have never seen, and that's an endless source of
rumours.

Tomas Morstein

Aug 30, 2012, 9:18:29 AM
to erlang-q...@erlang.org
For those who were interested in benchmarking, take a look here:
http://ksbhaskar.blogspot.com/2011/02/from-44-seconds-to-27-seconds-simple.html

Greg Burd

Sep 3, 2012, 3:12:07 PM
to Tomas Morstein, erlang-q...@erlang.org
Given that this project is AGPLv3 I assume that this is a dual-license commercial option product, correct?  If so, does someone need to acquire a license from you and from the copyright holders of GT.M or do you have some commercial agreement with them?  Where is the price list?

Just curious, given the prevalence of ASLv2 licensing for Erlang code; AGPLv3 is a highly infectious license.


--
@gregburd
Architect, Basho Technologies | http://basho.com | @basho

Max Bourinov

Sep 4, 2012, 3:32:48 AM
to Greg Burd, Tomas Morstein, erlang-q...@erlang.org
Hi Tomas,

Great news indeed.

Before I dive into it, could you please tell me:
  • What are the main advantages compared to a traditional RDBMS?
  • What kind of data is it good for? Is it good for keeping log entries, or should I rather use a traditional RDBMS (say PostgreSQL)?
I am trying to understand how I can use IDEA EGTM in my projects. So far we live with PostgreSQL. It is very good (as we all know), but there are some natural limitations when I use it with Erlang, just because it is a general-purpose DB.

Tomas, could you just give me some ideas where and how IDEA EGTM will kick the a**?

I think IDEA EGTM will perfectly match my needs, because at the very bottom my app's business logic is very similar to what banks do.

p.s. I have found all the docs and they look superior. So, I see that IDEA EGTM is pretty well kept software.

Best regards,
Max

Tomas Morstein

Sep 24, 2012, 5:58:18 PM
to erlang-q...@erlang.org
Hi, there are several points:
- IDEA EGTM depends on FIS GT.M, which is under AGPLv3 for Linux/x86, Linux/x64, OpenVMS/Alpha and Tru64. GT.M is available for more platforms, but under commercial conditions directly from FIS;
- IDEA EGTM is free (AGPLv3) and should work against all GT.M versions on all platforms where both Erlang and GT.M are available;
- there is no commercial license for the EGTM base. Only support and some non-public extensions are available on a commercial contract basis;
- we wanted to make the licensing more permissive (for example, Evan Miller, author of ChicagoBoss, recommended that we go with MIT), but it seems to be pointless since GT.M itself is under AGPLv3 in the best (= the most open) case (Linux);
- EGTM is not only Erlang code, it's a mixture of Erlang, C and M[UMPS].

So finally, you don't need to acquire any license from either FIS or IDEA if you're on Linux/x86 or Linux/x64. That's also the reason why there's no price list :-)
If you already have a valid commercial license of FIS GT.M for any non-Linux platform, EGTM is still free, but we expect/recommend that you order EGTM support from IDEA too.

...well, that's our default policy. For interesting projects, we are flexible enough to discuss individual conditions/exceptions.


On Monday, September 3, 2012 at 21:12:19 UTC+2, Greg Burd wrote:

Tomas Morstein

Sep 24, 2012, 5:59:21 PM
to erlang-q...@erlang.org
Hi Max,

Thanks for your interest and sorry for my late reply!

Honestly, it's not so easy to compare MUMPS with an RDBMS, as it's a completely different approach, even when compared to the rest of the NoSQL world...
I would really recommend that you take a look at http://www.fooboo.org/~tmr/gtm/UniversalNoSQL.pdf because it is an "overview for relational people."

M technology is an excellent choice for analytical applications, so if you're going to build a log aggregator and analyzer, you can benefit from hierarchical multidimensional indices as well as from the powerful M language, which can help you process large amounts of existing records at the speed of light :-)
BTW, we have a product that does exactly that -- it takes in tons of log records and indexes them, and the user analytics interface is built with EGTM+ChicagoBoss.

Sure, PostgreSQL is great and in the field of RDBMS it was always one of my preferred choices, but for some time now, I have preferred to use MUMPS for everything I can :-)

The documentation is not so good because it was done quickly just to have something, and is primarily targeted at people with existing MUMPS experience... but if you're happy with it, we're happy too! :-)
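
To give relational readers a feel for what a "hierarchical multidimensional index" over log records might look like, here is a small Python sketch. It only models the idea with nested dictionaries: the subscript layout (`byDay`, `byHost`), the record fields and the function names are all hypothetical, and nothing here is the EGTM API or the layout used by the product mentioned above.

```python
from collections import defaultdict


def tree():
    """A nested dict that creates sub-levels on demand."""
    return defaultdict(tree)


def index_record(index, record_id, record):
    """File one log record under two hypothetical subscript paths."""
    day = record["timestamp"][:10]  # e.g. "2012-09-24"
    index["byDay"][day][record["level"]][record_id] = ""
    index["byHost"][record["host"]][day][record_id] = ""


def count(node):
    """Count the leaf entries stored below a node."""
    if not isinstance(node, defaultdict):
        return 1
    return sum(count(child) for child in node.values())


logs = {
    1: {"timestamp": "2012-09-24T10:00:00", "level": "ERROR", "host": "web1"},
    2: {"timestamp": "2012-09-24T10:05:00", "level": "INFO", "host": "web2"},
    3: {"timestamp": "2012-09-25T08:30:00", "level": "ERROR", "host": "web1"},
}

index = tree()
for rid, rec in logs.items():
    index_record(index, rid, rec)

print(count(index["byDay"]["2012-09-24"]))            # 2
print(sorted(index["byDay"]["2012-09-24"]["ERROR"]))  # [1]
print(count(index["byHost"]["web1"]))                 # 2
```

Each question ("all errors on a given day", "everything from one host") becomes a walk down one branch instead of a scan over all records, which is the property the paragraph above refers to.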

Best regards,
Tom.

PS: there's a new GT.M V6.0-000 release worth checking out!

