On 2016-08-30 17:55:32 +0000, Dirk Munk said:
> Stephen Hoffman wrote:
>>
>> That's certainly been the classic OpenVMS marketing. You're likely
>> going to be learning much about that and the limits of that design and
>> the limits of OpenVMS-style clustering and about implementing and using
>> partitioning as Stark scales up and starts to distribute data
>> geographically, too. Though if you can stay within the limits and
>> within a single cluster or can otherwise avoid or isolate geographic
>> distribution, things will be easier.
>
> If you need to build a VMS cluster for performance reasons, then forget
> geographic spreading. The latency involved with that is so enormous
> that you will get far less performance instead of more performance.
Flip your design over and think of cases when the clusters themselves
need to be distributed. There are times when you have to
geographically distribute your configurations, beyond cases of disaster
tolerance. When you have to maintain locality to the remote systems.
When the design that OpenVMS marketing has envisioned gets turned on
its head. Stark is very likely aiming to be world-wide, which means
there's no easy way around dealing with the latency, and which means
the folks gain experience around what's involved in these
configurations. This is where the classic OpenVMS marketing designs
start to require rather more local development work and/or integration
with open source clustering tools.
> The maximum distance between two nodes shouldn't be more than 50 meters or so.
The world is rather wider than that. Configurations past 50 meters
can work well for many cluster applications. Configurations past 50
meters can be mandatory for some. Though there are cases where the
latency is secondary to the storage and not to the links. FC SAN
bandwidth is low and FC SAN SSD latency is high, when compared with
networks and server DRAM. Stark is headed toward dealing with
configurations much wider than 50 meters, too. Then there's the fun
(or "fun") of keeping all that working and consistent.
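For a rough sense of why distance dominates once the configuration gets wide, a back-of-the-envelope sketch (the fiber propagation figure is an approximation, and this ignores switching, queuing, and protocol overhead entirely):

```python
# Best-case round-trip latency from distance alone; figures are approximate.
# Light in fiber propagates at roughly 2/3 c, about 200,000 km/s, so about
# 200 km of one-way fiber costs a millisecond before any equipment is involved.
FIBER_KM_PER_MS = 200.0

def round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber, propagation delay only."""
    return 2 * distance_km / FIBER_KM_PER_MS

print(round_trip_ms(0.05))   # 50 meters: 0.0005 ms, effectively free
print(round_trip_ms(4000))   # ~4000 km cross-country link: 40.0 ms minimum
```

A lock round-trip that costs microseconds within a machine room costs tens of milliseconds across a continent, and no amount of tuning removes the propagation term.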
> Of course this applies to all kind of clusters, not just VMS.
Ayup, the physics are certainly common. Approaches toward
availability and consistency do differ. Kerry will be hitting those
in this environment, given the clients are inherently distributed,
which means either long latencies from clients to distant clusters,
or geographic distribution of the clusters and probably then local
software designs involving eventual consistency of the data across
those clusters.
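None of this is OpenVMS-specific. A minimal sketch of what "eventual consistency" tends to mean in practice, here as a last-writer-wins merge, which is one common and deliberately lossy approach (the key names and record shapes are invented for illustration):

```python
# Minimal last-writer-wins replica merge. Each replica keeps, per key,
# the timestamp of the write that produced the current value. Merging
# two replicas keeps the newer write per key, so both sites converge.

def merge(replica_a, replica_b):
    """Merge two replicas mapping key -> (timestamp, value); newer write wins."""
    merged = dict(replica_a)
    for key, (ts, value) in replica_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

site_east = {"acct42": (100, "balance=10")}
site_west = {"acct42": (105, "balance=12"), "acct7": (90, "balance=3")}
print(merge(site_east, site_west))
# Both sites converge, but the write at ts=100 is silently discarded --
# which is exactly why eventual consistency pushes design work into the
# application, rather than something the cluster layer solves for free.
```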
>> What's available in OpenVMS is a pain to use, though. More than a
>> little investment is needed to make all that work, and RMS can be very
>> much less than fun when application rolling upgrades are involved.
>
> Why? What has that to do with RMS?
Ponder rolling application upgrades in a cluster configuration using
RMS-based applications, particularly when the application developers
need to change the file formats. When the cluster and the
applications have to remain available. It all gets... interesting.
Sooner or later, the application developers find themselves dealing
with something that looks rather like the morass that is the cluster
RMS file-based management — which is very far from pretty — and working
with limitations within that design that can be difficult to remove.
Making changes in these RMS file-based application environments is a
tedious slog, with the usual solution involving increasing numbers of
files, or moving to approaches that use other types of tools or
databases in preference to using RMS files.
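One usual workaround for the format-change problem is record versioning: stamp each record with a layout version, and teach readers to normalize older layouts, so old and new application images can share the file during the rolling upgrade. A hypothetical sketch (the field names and layouts here are invented, not any actual RMS record definition):

```python
# Hypothetical sketch: tolerating two record layouts in one shared file
# during a rolling upgrade. Version 1 stored a single 'name' field;
# version 2 split it into 'first' and 'last'. Readers normalize upward.

def read_record(record):
    """Normalize a record to the newest layout, whichever version wrote it."""
    version = record.get("version", 1)
    if version == 1:
        first, _, last = record["name"].partition(" ")
        return {"version": 2, "first": first, "last": last}
    return record

# A v1 writer and a v2 writer can coexist in the cluster mid-upgrade:
print(read_record({"name": "Ada Lovelace"}))
print(read_record({"version": 2, "first": "Ada", "last": "Lovelace"}))
```

The cost is that every reader carries conversion code for every layout still present in the file, which is part of why the file counts tend to grow instead.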
All of these pieces — upgrades, clustering, patches, application
updates, shared RMS files — are interlinked, and it certainly appears
to have been a very long time since anybody's thought about and made
any changes around how all these pieces fit together, and whether the
existing design is sustainable and supportable. Sure, OpenVMS and
clustering works. That's certainly goodness. But — particularly if
the goal is to bring over new folks and new designs and not simply ease
OpenVMS into retirement — it's ill-documented, extremely difficult if
not impossible to upgrade without breaking compatibility, and burdened
with more than a few other limitations and pains and misfeatures.