Kernels, Contexts, Threads, and Extensible Database Architecture

knorth

Apr 20, 2012, 9:03:31 PM

Dr. Dobb's Journal article discusses Oracle Database architecture:

Kernels, Contexts, Threads, and Extensible Database Architecture
http://www.drdobbs.com/database/232900522

Mladen Gogala

Apr 21, 2012, 2:44:28 AM

It's an interesting article. I will refrain from nitpicking on the
context switch and the processor modes; after all, the first great
strides in my career were made on a CPU that had 4 regular modes and an
"interrupt stack". The modes were user, supervisor, exec and kernel, and
the CPU was NVAX. Context switches and how to avoid them were a kind of
science back then.
Your article contains an interesting summary of some points in
Andrew Tanenbaum's "Modern Operating Systems", a classic textbook that
is still used at various universities.
That doesn't have too much to do with the databases per se, but is an
interesting introduction and there is a very strong historic link. As far
as the databases go, your article is certainly interesting, with one
missed point: open source databases, and that is what we're talking about
here, are nowhere near the commercial ones in terms of reliability, sheer
processing efficiency and speed, capacity and versatility. NewSQL has
yet to withstand the test of time and the prospects do not look good. SQL
is modeled after the mathematical theory called "naive set
theory" (except for the abomination called "ANSI Join Syntax", which is
an attempt to introduce brain damage into SQL) and is also well
understood by the accountants, which is vitally important. Two or three
years ago there was a movement claiming that NoSQL would take over the
world in a fashion similar to Pinky and the Brain. A brief summary of that
fad can be found on YouTube under the title "MongoDB is Web Scale". There
is no reason to conclude that the fate of this NewSQL will be any
different: it will probably carve a niche among Python enthusiasts, Ruby
enthusiasts and script kiddies.
Relational databases owe their overwhelming popularity to the rock solid
logical foundation of the SQL language as such. Contrary to popular
belief, set theory, with all of its rigor, including joins, projections
and set operations, is very well suited to the real world and is perfect
for operating on tables. SQL is still the king of the hill and while that
is so, there is very little chance that something will replace the
traditional RDBMS.
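
To illustrate the set operations mentioned above, here is a minimal sketch
using Python's bundled sqlite3 module. The emp and dept tables and their
rows are made up for the example and do not come from the article. It runs a
projection, a join written both in the older comma style and in the ANSI
style, and a UNION, and collects each result as a Python set so the two join
spellings can be compared directly.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH');
    INSERT INTO emp  VALUES (1, 'KING', 10), (2, 'SCOTT', 20), (3, 'ADAMS', 20);
""")


def rows(sql):
    """Run a query and return its result as a set of tuples."""
    return set(conn.execute(sql).fetchall())


# Projection: keep one column; as a set, duplicates collapse.
projection = rows("SELECT deptno FROM emp")

# Join in the comma style: Cartesian product restricted by a predicate.
comma_join = rows(
    "SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno = d.deptno"
)

# The same join in the ANSI JOIN ... ON style.
ansi_join = rows(
    "SELECT e.ename, d.dname FROM emp e JOIN dept d ON e.deptno = d.deptno"
)

# Set operation: UNION of two single-column relations.
union = rows("SELECT deptno FROM emp UNION SELECT deptno FROM dept")

print(projection, union, comma_join == ansi_join)
```

For an inner join the two spellings return the same set of rows, so the
disagreement over the ANSI syntax is about notation, not about the result.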

To make things worse for the newcomers, an RDBMS is not only a SQL
interpreter, it's also a resource manager. Resource managers were used on
computers like the IBM 360 to avoid context switches, which were extremely
expensive at that time, and that connects us to the beginning of your
article. That was the ruling architecture on IBM mainframes for several
decades. Unrelated to that, IBM also used to have a VMware of sorts, called
VM/CMS, long before VMware conquered the world. Transaction managers, like IMS
and CICS, were used to manage transactions, another business requirement,
mostly modeled after banking transactions. Some of the biggest and
earliest customers were banks, which explains why companies went out of
their way to adjust their software to the banking business. Transaction
managers are very complex and hard to do in a scalable way. Rules of the
game, also motivated and supported by the banking industry, known as ACID
rules, do not help. Things like repeatable reads and the ACID properties
themselves (atomicity, consistency, isolation and durability) tend to have
a devastating effect on performance, especially with new databases.
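
As a small illustration of the atomicity part of ACID, here is a sketch of a
banking-style transfer, again using Python's sqlite3 module. The account
table, the transfer function and the amounts are hypothetical, chosen only
for the example. The credit is applied first and the debit second, so that
when the debit violates the CHECK constraint the already applied credit has
to be rolled back with it.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits when the block
        # succeeds and rolls back when an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would overdraw the account; the credit was rolled back.
        return False


# Hypothetical table: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

ok = transfer(conn, 1, 2, 70)       # enough funds: both updates commit
failed = transfer(conn, 1, 2, 70)   # would overdraw: both updates undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

After the failed second call the balances are the same as after the first
one. An in-memory, single-connection example like this says nothing about
the cost of doing the same thing under heavy concurrent load, which is where
the performance penalty discussed above comes from.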

No Lego toys architecture can help there. Plugins are nice, as long as
the database is ACID compliant and can handle the volume. Also, the old
RDBMS-es are very well instrumented. That means that tuning the
applications is a very precise procedure, with numerous tools that can
help people to figure out where the time is spent and to help
performance. Various profilers, tracing tools, wait event interfaces and
the like are available. Typically, open source databases have very few such
tools and if they have morons on the steering committee, as in the
following link, the obscurity is all but guaranteed:

http://tinyurl.com/68gu822 (The author is very influential in the Postgres
community. That is probably one of the main reasons for the small size of
that community)

Producing a rock solid and usable RDBMS is a very expensive and labor
intensive undertaking and open source projects are simply not up to the
task. Partitioning, row level locking, hot backup, in-line upgrade,
usable procedural extensions, clustering and high availability are very
expensive to develop. The next Richter scale 10 earthquake in the Oracle
world will not be caused by an open source database or the advent of
NewSQL or NoSQL, it will be caused by the old nemesis:

http://tinyurl.com/c7vmuqq

Look at those prices! If DB2v10 turns out as good as rumor has it, and
if prices remain this low, there will be trouble in paradise. Big
trouble. There is also a whole slew of DB2 books on Amazon and DB2 is
now free for personal use. Looks like IBM is finally getting serious
about DB2 on Linux. IBM is the company that can re-kindle competition in
the RDBMS arena, not Percona, EnterpriseDB or 2ndQuadrant. NewSQL, NoSQL
or some other "simplified" version of SQL will not break the lock that SQL
vendors have on the market. It's just a fad for kids like this:

http://www.youtube.com/watch?v=oL-A4JYwgH4

It's a little league of the IT community and it will always remain a
little league. I would compare NewSQL and plugin oriented databases to
Kim Kardashian: pretty, present in the media but not really important.

--
http://mgogala.byethost5.com
