RMS is not as slow as one thinks. It can be very fast if you
understand the application and use direct I/Os.
Relational DBs usually have an internal function called a query
optimizer. It receives a query, then determines whether the query
should be executed in index or sequential mode. There is overhead
associated with this, and the output of the query optimizer is
not always correct (logic errors in the query, or optimizer
bugs). Hence, a query that should use an index might incorrectly
go sequential. This is the classic case where a query that
normally takes 5 seconds all of a sudden takes over a minute.
The symptom is a DB I/O count very much higher than normal.
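To make the index vs. sequential decision concrete, here is a toy sketch in Python. It is not how any particular optimizer works; the function name, the cost constants and the row counts are all made up for illustration. It only shows how a wrong estimate of matching rows can flip the choice.

```python
# Toy cost model (hypothetical): compare the estimated cost of an
# index lookup against a full sequential scan and pick the cheaper.
def choose_access_path(total_rows, estimated_matches,
                       seq_cost_per_row=1.0, index_cost_per_match=4.0):
    """Return 'index' or 'sequential', whichever looks cheaper."""
    sequential_cost = total_rows * seq_cost_per_row
    index_cost = estimated_matches * index_cost_per_match
    return "index" if index_cost < sequential_cost else "sequential"

TOTAL_ROWS = 1_000_000

# Accurate estimate: few rows match, so the index wins.
print(choose_access_path(TOTAL_ROWS, estimated_matches=100))

# Bad estimate (stale statistics, optimizer bug): the same query is
# now believed to match most rows, so the plan goes sequential.
print(choose_access_path(TOTAL_ROWS, estimated_matches=900_000))
```

If the real number of matches is still 100, the second plan reads the whole table for nothing, which is the jump in DB I/O described above.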
Now, the current RMS design has issues with maintenance, online
backups and likely a few other areas, so there is a trade-off.
However, why look at addressing future requirements with today's
technology?
We know there is a new file system coming on OpenVMS and I would
expect quite a few of the current RMS issues to be addressed with
the new design - including better performance.
> I would be extremely surprised if anyone wrote code to go block
> mode I/O on OpenVMS for data capture in the IoT space either.
>
> High transaction rate environments resort to items like sharding
> and distributed DB's like NoSQL Cassandra etc. as well as other
> techniques. So far OpenVMS doesn't have anything like these
> technologies, to my limited knowledge. At the device level the
> options are striping, but then you get hit with lack of
> redundancy, which isn't going to fly in most environments, and
> even striping isn't going to save you for lots of small data
> writes, which is what IoT will be primarily focused on.
>
There are huge load balancing trade-offs with distributed DB
sharding. In a nutshell, you assign different parts of the DB to
specific nodes. Each node can directly update only the part of
the DB it is assigned to. If one part of the DB becomes a hot
spot that exceeds the capacity of that single node, then your
only options are to replace that server with a bigger system,
re-design and re-partition the DB, or return an error to the
application.
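A minimal sketch of the hot spot problem, in Python. The key ranges, the four-node layout and the per-node capacity figure are assumptions invented for the example, not taken from any real product.

```python
import bisect
from collections import Counter

# Hypothetical layout: keys 0-999 are range-partitioned over 4 nodes.
# BOUNDARIES holds the first key owned by nodes 1, 2 and 3.
BOUNDARIES = [250, 500, 750]
NODE_CAPACITY = 100  # assumed max updates one node can absorb per interval

def node_for(key):
    """Return the only node allowed to update this key."""
    return bisect.bisect_right(BOUNDARIES, key)

def route(keys):
    """Count updates per owning node and list nodes over capacity."""
    load = Counter(node_for(k) for k in keys)
    overloaded = sorted(n for n, count in load.items() if count > NODE_CAPACITY)
    return load, overloaded

# Evenly spread workload: no node exceeds its capacity.
even_load, even_hot = route(range(0, 1000, 4))
print(dict(even_load), even_hot)

# Skewed workload: key 10 is hammered, so node 0 becomes the hot spot
# while the other three nodes sit mostly idle.
skewed_load, skewed_hot = route([10] * 150 + list(range(300, 1000, 10)))
print(dict(skewed_load), skewed_hot)
```

In the skewed case the spare capacity on the other nodes cannot help, because only the owning node may update that range. That leaves the three choices listed above.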
You have to design a sharded DB exceptionally well, which means
you really need to understand your workloads. That is the core of
the NonStop world. In their financial world, they understand
their transactions very well, but the big 800lb gorilla in every
NonStop environment is what happens if a workload exceeds the
capacity of one node?
A good WP that compares shared everything/disk DB's (OpenVMS,
Linux/GFS, z/OS) vs. shared nothing (Linux, Windows, UNIX,
NonStop) can be found here:
http://www.scaledb.com/wp-content/uploads/2015/11/Shared-Nohing-vs-Shared-Disk-WP_SDvSN.pdf
"Comparing shared-nothing and shared-disk in benchmarks is
analogous to comparing a dragster and a Porsche. The dragster,
like the hand-tuned shared-nothing database, will beat the
Porsche in a straight quarter mile race. However, the Porsche,
like a shared-disk database, will easily beat the dragster on
regular roads. If your selected benchmark is a quarter mile
straightaway that tests all out speed, like Sysbench, a
shared-nothing database will win. However, shared-disk will
perform better in real world environments."
See notes above.
Remember - new file system and other new core things coming.
We should stop trying to address the requirements of 5+ years
from now using today's limitations when we know OpenVMS has a
new engine (file system), new wheels (TCPIP stack) and a new
body (X86-64) coming in the next 18-24 months.
Also, as pointed out in the WP, one needs to weigh two designs
before the system is deployed: load balancing IO requests across
all available back end systems vs. DB sharding across many small
systems connected by high latency LAN networks (network write
latency vs. local memory or flash disk). Otherwise one has to
deal with hot spots, unplanned workloads or DR later.
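As a rough illustration of the load balancing point, the Python sketch below compares the busiest node under the two designs for the same skewed request stream. The account names, the modulo ownership rule and the four-node count are assumptions for the example. It deliberately ignores the lock and cache coordination cost a shared-disk cluster pays, so treat the numbers as a picture of the idea, not a benchmark.

```python
from collections import Counter
from itertools import cycle

def peak_load_sharded(requests, owner_of):
    """Busiest node when each request must go to the node owning its key."""
    load = Counter(owner_of(r) for r in requests)
    return max(load.values())

def peak_load_shared(requests, node_count):
    """Busiest node when any node can service any request (round-robin)."""
    load = Counter(node for node, _ in zip(cycle(range(node_count)), requests))
    return max(load.values())

NODES = 4

def owner_of(request):
    # Hypothetical ownership rule: account number modulo node count.
    return int(request.split("-")[1]) % NODES

# 100 requests, 90 of them against one hot account.
requests = ["acct-7"] * 90 + ["acct-%d" % i for i in range(10)]

print("sharded peak:", peak_load_sharded(requests, owner_of))
print("shared peak: ", peak_load_shared(requests, NODES))
```

With sharding, one node carries almost the whole load. With any-node dispatch the same requests spread evenly across the cluster.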
> IoT will drive the whole data / storage industry up another
> notch.
>
> We will see the early adopters take the lion's share of the IoT
> space and I happen to think that will be Linux yet again :-(, I
> really don't think OpenVMS is in any shape at present to even
> begin to participate, it's having enough fun and games getting
> itself onto x86.
>
> The rebuilding of OpenVMS is going to need to address why
> people abandoned the platform in the first place, it's not just
> a lack of x86 support. People are coding for other architectures
> currently and are doing so I think primarily because of good
> porting tools and excellent development frameworks, and Open
> source is now not just a nice to have but an essential.
>
Open source is indeed another good tool to have on one's tool
belt. More tools usually makes for a better carpenter.
However, there are trade-offs. Each solution architect
(carpenter) has to review these to determine what tools are right
for their environment.
In most cases, there will likely be a mix of custom code and open
source.
> On a philosophical front, man seems hell bent on sampling
> everything possible in the hope of controlling his environment
> and ultimately planning his existence. I happen to think it's
> folly to pursue such things to the nth degree, but until this
> approach is abandoned, expect IoT to keep getting more wild in
> its hype and promises. I mean if central banks cannot give up on
> their notion of a controlled economy (yeah, how well has that
> been for the planet!), then what hope is there that IoT will be
> de-hyped in the near future? i.e. none!
>
IoT is like Public Cloud, SDN, IT Utility, Adaptive Enterprise,
SOA, Real-Time Enterprise and a host of so many other industry
hype terms.
There is some truth behind each of these, though much of it is
just a re-invention of existing technologies. The definition of
each is left up to the individual, so in the end, you can define
these terms as anything you want.