* Cross <crossl...@gmail.com> [12/07/31 19:21]:
> Tarantool looks like a nice concept, the Dynamic SQL is something that
> looks good, but still keeps the "relational" idea that is one of my
> complains with the RDBMS world, if you have information that is required
> all together (like the receipts sample I referenced) why would you separate
> it into different tables? just to be able to query them later? will not be
> wiser to create a way to query elements that is not row/columns structured?
>
> What do you think Konstantin?
Thank you for this question.
Way back in the 1960s, databases didn't separate data
representation from data access.
To navigate an index, a database user had to know the
physical structure of the index.
The obvious deficiencies of this approach led to the
separation of the data model from data representation. The
relational model is one way to do it, and still the most popular.
One of the best-known deficiencies of the relational model is
the so-called object-relational impedance mismatch: there is more
than one way to map objects to relations, and none of them fits
all access patterns well.
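To make the mismatch concrete, here is a rough sketch using the
receipts sample from the question. The table names, field names and
the use of SQLite are my assumptions for illustration only, nothing
here is Tarantool-specific. It stores the same receipt in two ways:
normalized into two relations, and kept together as one opaque value.

```python
import json
import sqlite3

# A hypothetical receipt object; all field names are made up for this sketch.
receipt = {
    "id": 1,
    "customer": "alice",
    "items": [
        {"sku": "tea", "qty": 2, "price_cents": 350},
        {"sku": "mug", "qty": 1, "price_cents": 900},
    ],
}

db = sqlite3.connect(":memory:")

# Mapping A: normalized, one row per receipt plus one row per line item.
db.execute("CREATE TABLE receipt (id INTEGER PRIMARY KEY, customer TEXT)")
db.execute(
    "CREATE TABLE receipt_item "
    "(receipt_id INTEGER, sku TEXT, qty INTEGER, price_cents INTEGER)"
)
db.execute("INSERT INTO receipt VALUES (?, ?)", (receipt["id"], receipt["customer"]))
db.executemany(
    "INSERT INTO receipt_item VALUES (?, ?, ?, ?)",
    [(receipt["id"], i["sku"], i["qty"], i["price_cents"]) for i in receipt["items"]],
)

# Mapping B: the whole object kept together as a single serialized value.
db.execute("CREATE TABLE receipt_blob (id INTEGER PRIMARY KEY, body TEXT)")
db.execute("INSERT INTO receipt_blob VALUES (?, ?)", (receipt["id"], json.dumps(receipt)))

# Access pattern 1: load one whole receipt.
# A single lookup with B; A would need a join or a second query.
whole = json.loads(
    db.execute("SELECT body FROM receipt_blob WHERE id = 1").fetchone()[0]
)

# Access pattern 2: units sold per SKU across all receipts.
# One declarative statement with A ...
units_a = dict(db.execute("SELECT sku, SUM(qty) FROM receipt_item GROUP BY sku"))

# ... while with B the application has to decode every receipt itself.
units_b = {}
for (body,) in db.execute("SELECT body FROM receipt_blob"):
    for item in json.loads(body)["items"]:
        units_b[item["sku"]] = units_b.get(item["sku"], 0) + item["qty"]
```

Both mappings give the same answers. Each is convenient for one
access pattern and clumsy for the other, which is the point being
made above.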
The relational model also has a number of advantages: simplicity, ease of
analytical processing, and, let's not forget, performance:
by normalizing data, a user is forced to tell the DBMS
more about data constraints, distribution, future access
patterns.
This makes building efficient and to-the-point data representation
structures easier.
Unfortunately, past generations of database management systems
did not address one of the main architectural drawbacks that
plague relational databases: rigidity of schema change. Very few
mainstream DBMSs allow changing the structure of a relational
database quickly, without downtime or a significant performance
penalty. This is a drawback of the implementations, not of the
relational model itself.
It should also be kept in mind that in many cases the relational
model is overkill, and a simple key-value mapping is
sufficient.
And of course no single model *can* fit all needs (e.g. graph
databases are built around the notion of nodes & edges, yet good
luck trying to quickly calculate a CUBE on a bunch of nodes in a
graph database).
Unfortunately, the world of NoSQL, when it comes to the data
model, often simply takes us back to the 60s:
there is minimal abstraction of data access from data
representation, and once a certain representation has been chosen,
there is no way to change it without rewriting your application
(e.g. to fit the new performance profile).
Scalability is an answer, but a silly one: throwing more hardware
at a problem is not always economical.
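As a small illustration of what the separation of data access from
data representation buys, here is a sketch in Python with SQLite
(my choice for the example, not Tarantool; the table, column and
index names are hypothetical). The application's query text stays
the same while the physical representation changes underneath it.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE receipt "
    "(id INTEGER PRIMARY KEY, customer TEXT, total_cents INTEGER)"
)
db.executemany(
    "INSERT INTO receipt (customer, total_cents) VALUES (?, ?)",
    [("cust%d" % (n % 100), n) for n in range(1000)],
)

# The only thing the application knows about: a declarative query.
QUERY = "SELECT id, total_cents FROM receipt WHERE customer = ? ORDER BY id"


def plan(sql, args):
    """Return SQLite's description of how it intends to run the query."""
    return " ".join(row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql, args))


rows_before = db.execute(QUERY, ("cust7",)).fetchall()
plan_before = plan(QUERY, ("cust7",))

# Change the representation to fit a new performance profile.
# The application code and QUERY are left untouched.
db.execute("CREATE INDEX receipt_by_customer ON receipt (customer)")

rows_after = db.execute(QUERY, ("cust7",)).fetchall()
plan_after = plan(QUERY, ("cust7",))

print(plan_before)
print(plan_after)
```

The result rows are identical before and after, only the plan
changes from a full scan to an index search. In a store where the
application walks the physical structure itself, the same change
means rewriting the access code.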
All of the above outlines the reasoning behind the choice of the
"relaxed relational" model in Tarantool: to quickly fill the
solution gap we had with our social web projects, we needed a
solution which is simple enough in the first approximation, and
yet doesn't rule out extending the model in the future.
In Tarantool 1.5 we plan to extend the type system, but mainly
in the direction of providing an efficient representation for
the missing but commonly used data formats: homogeneous arrays,
date and time, exact decimal numbers.