Re: Too many NoSQL databases


Konstantin Osipov

Jul 31, 2012, 10:23:08 AM
to nosql-di...@googlegroups.com
* Cross <crossl...@gmail.com> [12/07/31 18:20]:

> But the lack of ACID compliance is something that is really difficult to
> accept in the enterprise world; that's why I started my own project to fill
> this gap.
>
> Now, what do you think about transactions in NoSQL databases? Pointless? Or
> just a feature that is impossible to achieve in a distributed system? I have
> an implementation of a transactional server for djondb and it works (it's
> in the lab right now), but I'm wondering why the big players didn't consider
> implementing transactions in their systems.

Tarantool/Box is already all-or-nothing: if write to disk fails,
data modification is rolled back and an error is returned to the
client.
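
A minimal Python sketch of the all-or-nothing behaviour described above. It is not Tarantool's implementation: `InMemoryStore`, `log_write`, `WriteFailed`, and `failing_disk` are hypothetical names invented for illustration, and the "disk" is just a callable that may raise.

```python
class WriteFailed(Exception):
    """Error returned to the client when the change could not be persisted."""


class InMemoryStore:
    """Toy in-memory store; an illustration only, not Tarantool's code."""

    _MISSING = object()

    def __init__(self, log_write):
        self.data = {}
        # log_write is a hypothetical callable that persists one change and
        # raises OSError if the write to disk fails.
        self.log_write = log_write

    def put(self, key, value):
        old = self.data.get(key, self._MISSING)
        self.data[key] = value            # apply the change in memory
        try:
            self.log_write(key, value)    # try to make it durable
        except OSError as exc:
            # Roll back so the store looks as if the write never happened.
            if old is self._MISSING:
                del self.data[key]
            else:
                self.data[key] = old
            raise WriteFailed(str(exc)) from exc


def failing_disk(key, value):
    """Simulated disk failure (assumption for the demo)."""
    raise OSError("disk full")


store = InMemoryStore(log_write=lambda key, value: None)
store.put("a", 1)                 # succeeds: the "disk" accepts the write

store.log_write = failing_disk
try:
    store.put("a", 2)             # fails: the old value must survive
except WriteFailed:
    pass
```

After the failed `put`, the store still holds the previous value for `"a"`, and the caller has received an error instead of a silent partial write.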

We're adding multi-statement transactions in Tarantool 1.5.

--
http://tarantool.org - an efficient, extensible in-memory data store

Cross

Jul 31, 2012, 10:37:27 AM
to nosql-di...@googlegroups.com
Tarantool looks like a nice concept, and the Dynamic SQL looks good, but it still keeps the "relational" idea, which is one of my complaints about the RDBMS world. If you have information that is needed all together (like the receipts sample I referenced), why would you separate it into different tables? Just to be able to query them later? Wouldn't it be wiser to create a way to query elements that are not structured as rows and columns?

What do you think Konstantin?
 


David Whitten

Jul 31, 2012, 10:43:08 AM
to nosql-di...@googlegroups.com
The GT.M database is a NoSQL database that supports ACID transactions.
You can read info about it at http://outoftheslipstream.com/
or more directly: http://www.mgateway.com/docs/universalNoSQL.pdf


--
You received this message because you are subscribed to the Google Groups "NOSQL" group.
To post to this group, send email to nosql-di...@googlegroups.com.
To unsubscribe from this group, send email to nosql-discussi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/nosql-discussion?hl=en.


Cross

Jul 31, 2012, 10:54:48 AM
to nosql-di...@googlegroups.com, whi...@netcom.com
Nice paper. I'll read it from start to finish; it looks like really good material to learn from.



William la Forge

Jul 31, 2012, 11:15:19 AM
to nosql-di...@googlegroups.com, whi...@netcom.com
I've also written a NoSQL database.

It is strictly in-memory but is fully durable and transactional:  https://github.com/laforge49/JFile 

It processes a million transactions a second on an i7 with ssd, so I think it is worth further development. It builds on a super fast incremental deserialization/reserialization library-- https://github.com/laforge49/JID --and a high-throughput actor library-- https://github.com/laforge49/JActor

Currently the database is more of a proof of concept for the actor library, which supports message passing at a rate of 250 million messages per second. A high-level description of how it works can be found here: http://barcampbangalore.org/bcb/bcb12/vertical-scaling-made-easy-through-high-performance-actors

I'd love to get some help on this--I've spent the last month just working on the actor library, covering some corner cases which I recently learned were not being handled well. But I should be done with that soon.

High throughput and vertical scaling have been my focus, and it looks like I'm beating everyone else on throughput by orders of magnitude.

Please let me know what you think.

Bill la Forge
CTO, JActor Consulting


Eric Evans

Jul 31, 2012, 12:00:24 PM
to nosql-di...@googlegroups.com
Perhaps it would be simpler if we started a thread for all of the
people who *haven't* written a NoSQL database.
Eric Evans
john.er...@gmail.com

Ladislav Thon

Jul 31, 2012, 12:22:15 PM
to nosql-di...@googlegroups.com
> Perhaps it would be simpler if we started a thread for all of the
> people who *haven't* written a NoSQL database.

I haven't. But I was thinking about it :-)

> Now, what do you think about transactions in NoSQL databases? Pointless? Or just a feature that is impossible to achieve in a distributed system? I have an implementation of a transactional server for djondb and it works (it's in the lab right now), but I'm wondering why the big players didn't consider implementing transactions in their systems.

Here's the thing: NoSQL doesn't automatically mean horizontal scaling. For me (and I know I'm not alone), the biggest value of NoSQL databases is the ability to use a different data model. Document databases are particularly nice in this regard. Take MongoDB, add transactions, get rid of all that "clustering" stuff (automatic sharding, replica sets, whatever; keep only the bare minimum needed for a master/slave setup with failover), make durability and safe mode the default (with the ability to switch them off at a reasonable granularity), and you've got a perfect database for something like 99% of the web.

LT

Florian Schintke

Jul 31, 2012, 3:22:18 PM
to nosql-di...@googlegroups.com
Scalaris is another distributed NoSQL store that supports
transactions across several keys, as well as replication. (See
scalaris.googlecode.com.)

Florian (contributor to Scalaris)


Cross

Jul 31, 2012, 4:07:09 PM
to nosql-di...@googlegroups.com
I agree with @ladicek: NoSQL does not mean "big data" or "scalable"; those are more a consequence of design decisions. The true value of a NoSQL store is that it can adapt its model to new kinds of data, whether XML, JSON, or any other schema. What matters here is having an easier way to store information and retrieve it later.

To @florian: wow! As I said, too many NoSQL database stores. What is the difference between Scalaris and other NoSQL stores? Sorry, but I didn't manage to find your documentation; the User Guide is empty.

Florian Schintke

Jul 31, 2012, 4:58:50 PM
to nosql-di...@googlegroups.com

> To @florian: wow! As I said, too many NoSQL database stores. What is the
> difference between Scalaris and other NoSQL stores? Sorry, but I didn't
> manage to find your documentation; the User Guide is empty.

You should be able to retrieve the Users and Developers Guide now,
sorry. The embedded Google pdf viewer was somehow broken.

The main difference between Scalaris and other NoSQL stores is the
already mentioned support for transactions across several keys while
also maintaining replicas for the data to improve availability under
churn.

Konstantin Osipov

Aug 1, 2012, 10:24:35 AM
to nosql-di...@googlegroups.com
* Cross <crossl...@gmail.com> [12/07/31 19:21]:

> Tarantool looks like a nice concept, and the Dynamic SQL looks good, but
> it still keeps the "relational" idea, which is one of my complaints about
> the RDBMS world. If you have information that is needed all together (like
> the receipts sample I referenced), why would you separate it into different
> tables? Just to be able to query them later? Wouldn't it be wiser to create
> a way to query elements that are not structured as rows and columns?
>
> What do you think Konstantin?

Thank you for this question.

Way back in the 1960s databases didn't separate data
representation and data access.

To navigate in an index, a database user had to know the
physical structure of the index.

The obvious deficiencies of this approach led to the separation of
the data model from the data representation. The relational model
is one way to do it, and still the most popular.

One of the best-known deficiencies of the relational model is
the so-called object-relational impedance mismatch: there is more
than one way to map objects to relations, and none of them fits
all access patterns well.

It also has a number of advantages: simplicity, ease of
analytical processing, and, let's not forget, performance:
by normalizing data, a user is forced to tell the DBMS
more about data constraints, distribution, and future access
patterns.

This makes building efficient and to-the-point data representation
structures easier.

Unfortunately, past generations of database management systems
did not address one of the main architectural drawbacks that
plague relational databases: the rigidity of schema changes. Very few
mainstream DBMSs allow the structure of a relational
database to be changed quickly, without downtime or a significant
performance penalty. This is a drawback not of the relational model
itself, but of its implementations.

It should also be kept in mind that in many cases a relational
model is overkill, and a simple key-to-value mapping is
sufficient.

And of course no single model *can* fit all needs (e.g. graph
databases are built around the notion of nodes & edges, yet good luck
trying to quickly calculate a CUBE on a bunch of nodes in a graph
database).

Unfortunately, the world of NoSQL, when it comes to the data
model, often simply takes us back to the 60s:
there is minimal abstraction of data access from data
representation, and once a certain representation has been chosen,
there is no way to change it without rewriting your application
(e.g. to fit the new performance profile).

Scalability is an answer, but a silly one: throwing more hardware
at a problem is not always economical.

This outlines the reasoning behind the choice of the "relaxed
relational" model in Tarantool: to quickly fill the solution gap
we had with our social web projects, we needed a solution which
is simple enough in the first approximation, and yet doesn't
negate future opportunities of extending the model.

In Tarantool 1.5 we plan to extend the type system, but mainly
in the direction of providing an efficient representation for
the missing but commonly used data formats: homogeneous arrays,
date and time, exact decimal numbers.

Peco Karayanev

Aug 1, 2012, 12:49:36 PM
to nosql-di...@googlegroups.com
Konstantin,
Excellent post! I would throw one more concept in the mix.
 
Democratization of Data - Access to data should be simplified and opened up so that more people can work with it. (I don't want to need SQL experts writing one-page queries to get what I need.)
 
thanks!


Jim Peters

Aug 1, 2012, 2:05:19 PM
to nosql-di...@googlegroups.com
Seems like it's a lot easier to write SQL queries than it is to write non-SQL queries ...

Konstantin Osipov

Aug 1, 2012, 2:15:23 PM
to nosql-di...@googlegroups.com
* Jim Peters <jim.p...@iname.com> [12/08/01 22:07]:
> Seems like it's a lot easier to write SQL queries than it is to write
> non-SQL queries ...

It depends on what you're trying to do.

Setting a member of a JSON object is a lot easier to
write in JavaScript than in SQL.

Writing a typical search query against the World database (*)
is going to be a lot easier in SQL.

(*) http://dev.mysql.com/doc/world-setup/en/index.html
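
To make the contrast concrete, here is a small Python sketch of both cases. It does not use the real World database: the `city` table, its columns, and its rows are made-up stand-ins, and the `profile` object is a hypothetical example document.

```python
import json
import sqlite3

# 1. Setting a member of a JSON object: one assignment on the parsed value.
profile = json.loads('{"name": "paul1", "settings": {"theme": "light"}}')
profile["settings"]["theme"] = "dark"

# 2. A search query: declarative filtering, grouping, and ordering in SQL.
#    The table and rows below are invented for the example.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE city (name TEXT, country_code TEXT, population INTEGER)")
db.executemany(
    "INSERT INTO city VALUES (?, ?, ?)",
    [
        ("Alpha", "AAA", 500),
        ("Beta", "AAA", 1500),
        ("Gamma", "BBB", 2000),
        ("Delta", "BBB", 100),
    ],
)
# Per country: how many cities have at least 500 inhabitants, and how big
# is the largest of them? Largest first.
rows = db.execute(
    """
    SELECT country_code, COUNT(*) AS cities, MAX(population) AS largest
    FROM city
    WHERE population >= 500
    GROUP BY country_code
    ORDER BY largest DESC
    """
).fetchall()
```

The nested update is a single line on the in-memory object, while the search is a single declarative statement; doing either one in the other style would take noticeably more code.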

Cross

Aug 1, 2012, 2:54:54 PM
to nosql-di...@googlegroups.com
I like the way Konstantin summarized the history of the NoSQL-SQL-NoSQL transition, although I don't totally agree with some points.

I have to agree that for some problems the relational model is a good fit, but querying a SQL database can be harder than NoSQL in some cases. Let's say you have soccer games and you store the number of goals in each game in a table like this:

| Game   |   Goals   |
| 1234   |       1   |
| 8388   |       3   |

and the goals themselves go into another table like this:

| Game  |   Time    |  User   |
| 1234  |    1:34   |  paul1  |
| 8388  |    0:45   |  johnP  |
...

If you want to get the fastest scorers, the SQL could be complex, right? But in NoSQL this is simple with map/reduce functions and a structure like this:

{  game: "1234", goals: [ { time: "1:34", user: "paul1" }, { time: "3:45", user: "carol" } ] }
{  game: "8388", goals: [ { time: "0:34", user: "matt" } ] }

Which is more complex? I think that will depend on the perspective of the developer.
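
For readers who want to compare the two styles side by side, here is a rough Python sketch using the documents above. It assumes "fastest scorer" means the earliest goal time per user; the helper names (`to_seconds`, `map_game`, `reduce_fastest`) and the flat `goals` table with its `scorer` column are made up for the example, not part of any particular database.

```python
from functools import reduce
import sqlite3

games = [
    {"game": "1234", "goals": [{"time": "1:34", "user": "paul1"},
                               {"time": "3:45", "user": "carol"}]},
    {"game": "8388", "goals": [{"time": "0:34", "user": "matt"}]},
]


def to_seconds(clock):
    """Convert a "minutes:seconds" string into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def map_game(doc):
    """Map step: emit one (user, seconds) pair per goal in a game document."""
    return [(goal["user"], to_seconds(goal["time"])) for goal in doc["goals"]]


def reduce_fastest(best, pair):
    """Reduce step: keep the earliest goal seen so far for each user."""
    user, seconds = pair
    best[user] = min(seconds, best.get(user, seconds))
    return best


pairs = [pair for doc in games for pair in map_game(doc)]
fastest = reduce(reduce_fastest, pairs, {})
ranking = sorted(fastest.items(), key=lambda item: item[1])

# The same question against a flat goals table, for comparison.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE goals (game TEXT, seconds INTEGER, scorer TEXT)")
db.executemany(
    "INSERT INTO goals VALUES (?, ?, ?)",
    [(doc["game"], seconds, scorer)
     for doc in games for scorer, seconds in map_game(doc)],
)
sql_ranking = db.execute(
    "SELECT scorer, MIN(seconds) AS fastest FROM goals "
    "GROUP BY scorer ORDER BY fastest"
).fetchall()
```

Both versions produce the same ranking; which one reads more naturally is, as said above, largely a matter of the developer's perspective and of how the data is already stored.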

One of the things that bugs me is that many people think that because the conclusion in the 60s was that the RDBMS was the right fit, it should rule forever and all other possible options are condemned to death. I still believe that if the problem has evolved (and we all saw how the web changed the way we think), then the solution should evolve too.

Winston Pacheco Junior

Aug 1, 2012, 3:01:04 PM
to nosql-di...@googlegroups.com
> Writing a typical search query against World database (*) is going to be a lot easier in SQL.

- Not necessarily! There are queries that are a lot easier to do with NoSQL: a Google-style search, for example, with boosting and all those things. All of this depends on the kind of NoSQL store and on the amount of resources that the specific store uses.
Lots of NoSQL stores nowadays have Lucene integration, and this is a very cool thing.
It all depends on the data format. Some data formats work better with NoSQL and some with SQL.


Eric Evans

Aug 1, 2012, 4:07:52 PM
to nosql-di...@googlegroups.com
On Wed, Aug 1, 2012 at 1:05 PM, Jim Peters <jim.p...@iname.com> wrote:
> Seems like it's a lot easier to write SQL queries than it is to write
> non-SQL queries ...

Yes.
Eric Evans
john.er...@gmail.com

Evan Buswell

Aug 1, 2012, 9:40:58 PM
to nosql-di...@googlegroups.com
Interesting discussion. In Gobpersist, I try to implement atomicity
of commit with client-side code. This only works if the back-end
database has locks or can emulate them, which both of the DBs to which
I've written interfaces can. Or, more basically, it works if there's any
way at all to ensure that you can fetch one key and store it without it
changing in between; that's not always possible. It's not perfect,
and there are probably all sorts of performance problems with this
approach, but it's working OK for us so far, seeing as the NoSQL store we're
using is just so much faster than a SQL alternative anyway. I haven't
really dealt with the now very big class of databases which don't
provide strong consistency, but it seems like with a lot of work you
could implement something like "eventually atomic" commits?
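
The "fetch one key and store it without it changing in between" requirement can be sketched as a compare-and-set retry loop. This is not Gobpersist's code or API: `VersionedStore`, `compare_and_set`, and `atomic_update` are hypothetical names, and the back end is simulated with an in-process dictionary.

```python
import threading


class VersionedStore:
    """Hypothetical back end offering a compare-and-set primitive."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}  # key -> (value, version)

    def get(self, key):
        """Return (value, version); a missing key is (None, 0)."""
        with self._lock:
            return self._items.get(key, (None, 0))

    def compare_and_set(self, key, expected_version, value):
        """Store value only if the key is still at expected_version."""
        with self._lock:
            _, version = self._items.get(key, (None, 0))
            if version != expected_version:
                return False
            self._items[key] = (value, version + 1)
            return True


def atomic_update(store, key, fn, max_retries=10):
    """Client-side read-modify-write that retries if the key changed."""
    for _ in range(max_retries):
        value, version = store.get(key)
        if store.compare_and_set(key, version, fn(value)):
            return True
    return False


store = VersionedStore()
store.compare_and_set("counter", 0, 10)

attempts = []


def increment(value):
    # Simulate another client sneaking in a write during our first attempt.
    if not attempts:
        _, version = store.get("counter")
        store.compare_and_set("counter", version, 100)
    attempts.append(value)
    return value + 1


ok = atomic_update(store, "counter", increment)
```

The first attempt is rejected because the key changed underneath it, and the retry recomputes the new value from fresh data. A back end without any such primitive (or locks to emulate it) leaves no safe place to hang this loop, which is the limitation described above.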

A few random things, too:

Tarantool looks great. I hadn't heard of it before, but am now
definitely interested.

> Unfortunately, the world of NoSQL, when it comes to the data
> model, often simply takes us back to the 60s:
> there is minimal abstraction of data access from data
> representation, and once a certain representation has been chosen,
> there is no way to change it without rewriting your application
> (e.g. to fit the new performance profile).

This is one of the major things that inspired Gobpersist. Though my
approach was to take NoSQL databases as-is and write a layer of
abstraction on top of that. The project doesn't really have enough
variety in its back ends to measure its success in that department,
but I'm pretty hopeful.

> Seems like it's a lot easier to write SQL queries than it is to write non-SQL queries ...

What's always been interesting to me about the NoSQL/SQL dichotomy is
the way that the presence or absence of the language itself has had
such gravity. In some ways, rejecting the relational model because of
SQL would be like rejecting object-oriented programming because of
C++. And if C++ were the only option for OOP, I might just reject the
whole thing. Which is not to say that the SQL language is "bad,"
simply that it is very clumsy for certain, specific purposes.
Especially for hierarchical data and graphs and all of those good
things that NoSQL has been doing much better at. For Gobpersist, I
tried to write a new query language specifically designed for
hierarchical data, but which isn't exactly non-relational. It's
probably more of a (functionally useful) experiment, than a finished
language. In actual execution, it's somewhat map-reduce-ish, but
again, doesn't/shouldn't map-reduce refer to implementation rather
than expression? Anyway, it just seems like query languages are
actually a pretty big area to explore, and it's strange the degree
that queries have been dominated by a single language for so long.

And while I'm on this subject, is there something like
turing-completeness for data query/data definition languages?

-Evan

David Whitten

Aug 2, 2012, 11:18:23 AM
to nosql-di...@googlegroups.com

I've heard folks say that SPARQL is a query language that works especially
well for hierarchical (and other network graph) type data.

 
> In actual execution, it's somewhat map-reduce-ish, but
> again, doesn't/shouldn't map-reduce refer to implementation rather
> than expression?  Anyway, it just seems like query languages are
> actually a pretty big area to explore, and it's strange the degree
> that queries have been dominated by a single language for so long.
>
> And while I'm on this subject, is there something like
> turing-completeness for data query/data definition languages?
>
> -Evan

Yingfei Xue

Aug 2, 2012, 11:18:18 PM
to nosql-di...@googlegroups.com
Good discussion. The traditional database log mechanism is very important for achieving ACID-like behavior. Maxtable is a distributed NoSQL store built on traditional database log recovery: it can not only restore the system when a machine goes down (hard fault), but also roll back an operation when an update hits an error (soft fault). It also introduces a distributed range cache, which can cache all on-disk data in memory if you have enough memory.
You can learn more at this link: http://code.google.com/p/maxtable/
--
Thanks,
Yingfei
