Overall I think this is a great suggestion. Good enough in fact to warrant
- The W3C appears to be pretty committed to WS-Federation. Again, why not
leverage an existing standard to build a Cloud Federation standard on?
-----Original Message-----
From: cloud-computing@googlegroups.com
[mailto:cloud-computing@googlegroups.com] On Behalf Of Reuven Cohen
Sent: August 21, 2008 2:27 PM
To: cloud-computing
Subject: The Standardized Cloud
Over the last few weeks I've been engaged in several conversations about the
need for a common, interoperable and open set of cloud computing standards.
During these conversations a recurring theme has started to emerge: a need
for cloud interoperability, or the ability for diverse cloud systems and
organizations to work together in a common way. In my discussion yesterday
with Rich Wolski of the Eucalyptus project he described the need for what he
called "CloudVirt", similar to the libvirt project for
virtualization. For those of you who don't know about libvirt, it's an open
source toolkit which enables a common API interaction with the
virtualization capabilities of recent versions of Linux (and other OSes).
I would like to take this opportunity to share my ideas as well as get some
feedback on some of the key pain points I see for the creation of a common
cloud computing reference API or standard.
* Cloud Resource Description
The ability to describe resources is (in my opinion) the most important
aspect of any standardization effort. One potential avenue might be to use
the Resource Description Framework proposed by the W3C. The Resource
Description Framework (RDF) is a family of specifications, originally
designed as a metadata data model, which has come to be used as a general
method of modeling information through a variety of syntax formats. The RDF
metadata model is based upon the idea of making statements about Web
resources (or Cloud Resources) in the form of subject-predicate-object
expressions, called triples in RDF lingo. This standardized approach could
be adapted as a primary mechanism for describing cloud resources both
locally and remotely.
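
As a rough illustration of the subject-predicate-object idea, here is a minimal Python sketch. It uses plain tuples rather than a real RDF library, and every identifier in it (the "cloud:" prefix, hostedBy, memoryMB, the VM and provider names) is made up for the example and not taken from any existing vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical statements about cloud resources; the names are illustrative only.
TRIPLES = [
    Triple("cloud:vm-001", "cloud:hostedBy", "cloud:provider-a"),
    Triple("cloud:vm-001", "cloud:memoryMB", "2048"),
    Triple("cloud:vm-002", "cloud:hostedBy", "cloud:provider-b"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def subjects(triples, predicate, obj):
    """Return every subject for which the given predicate and object are stated."""
    return [t.subject for t in triples
            if t.predicate == predicate and t.obj == obj]
```

A local and a remote resource are described the same way, so a query such as subjects(TRIPLES, "cloud:hostedBy", "cloud:provider-b") does not need to know where the resource lives.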
* Cloud Federation (Cloud 2 Cloud)
The holy grail of cloud computing may very well be the ability to seamlessly
bridge both private clouds (datacenters) and remote cloud resources such as
EC2 in a secure and efficient manner. To accomplish this a federation
standard must be established. One of the biggest hurdles to overcome in
federation is the lack of a clear definition of what federation is.
So let me take a stab at defining it.
Cloud federation manages consistency and access controls when two or more
independent geographically distinct clouds share either authentication,
files, computing resources, command and control or access to storage
resources. Cloud federations can be classified into three categories:
peer-to-peer, replication, and hierarchical. Peer-to-peer seems to be the
most logical first step in creating a federation spec. Protocols like XMPP,
P4P and Virtual Distributed Ethernet may make for good starting points.
* Distributed Network Management
The need for a distributed and optimized virtual network is an important
aspect in any multi-cloud deployment. One potential direction could be to
explore the use of VPN or VDE technologies. My preference would be to use
VDE (Virtual Distributed Ethernet). As a quick refresher, a VPN is a way to
connect one or more remote computers to a protected network, generally by
tunnelling the traffic through another network. VDE implements a virtual
Ethernet in all its aspects, including virtual switches and virtual cables.
A VDE can also be used to create a VPN.
VDE interconnects real computers (through a tap interface), virtual
machines, and other networking interfaces through a common open
framework. VDE supports heterogeneous virtual machines running on different
hosting computers and could be the ideal starting point. Network shaping and
optimization may also play an important role in the ability to bridge two or
more cloud resources.
Some network optimization aspects may include:
* Compression - Relies on data patterns that can be represented more
efficiently.
* Caching/Proxy - Relies on human behavior: accessing the same data
over and over.
* Protocol Spoofing - Bundles multiple requests from chatty applications
into one.
* Application Shaping - Controls data usage based on spotting specific
patterns in the data and allowing or disallowing specific traffic.
* Equalizing - Makes assumptions on what needs immediate priority based
on the data usage.
* Connection Limits - Prevents access gridlock in routers and access
points due to denial-of-service attacks or peer-to-peer traffic.
* Simple Rate Limits - Prevents one user from getting more than a fixed
amount of data.
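
To make the last item concrete, here is a minimal Python sketch of a simple per-user rate limit. It assumes a fixed-window byte quota, which is only one of several possible schemes; the class name, parameters and user names are hypothetical, and the caller passes in the current time so the behavior is easy to check.

```python
from collections import defaultdict


class SimpleRateLimiter:
    """Caps each user at max_bytes per fixed window of window_seconds."""

    def __init__(self, max_bytes, window_seconds):
        self.max_bytes = max_bytes
        self.window_seconds = window_seconds
        # (user, window index) -> bytes already granted in that window.
        # Old windows are never pruned here; a real limiter would expire them.
        self._usage = defaultdict(int)

    def allow(self, user, nbytes, now):
        """Return True and record the usage if the transfer fits the quota."""
        key = (user, int(now // self.window_seconds))
        if self._usage[key] + nbytes > self.max_bytes:
            return False
        self._usage[key] += nbytes
        return True
```

With SimpleRateLimiter(max_bytes=100, window_seconds=60), a user who has already moved 60 bytes in the current minute is refused a further 50 bytes, while a different user, or the same user in the next minute, is allowed through.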
* Memory Management
When looking at the creation of a compute cloud, memory tends to be a major
factor in the performance of a given virtual environment, whether a virtual
machine or some other application component. Cloud memory management will
need to involve ways to allocate portions of virtual memory to programs at
their request, and to free it for reuse when no longer needed. This is
particularly important in "platform as a service" cloud deployments.
Several key memory management aspects may include:
* Provide memory space to enable several processes to be executed at
the same time
* Provide a satisfactory level of performance for the system users
* Protect each program's resources
* Share (if desired) memory space between processes
* Make the addressing of memory space as transparent as possible for
the programmer.
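
The allocate-on-request and free-for-reuse behavior described above could be sketched roughly as follows in Python. This is only bookkeeping against a fixed capacity, not a real memory manager; the MemoryPool name, its methods and the owner labels are all assumptions made for the example.

```python
class MemoryPool:
    """Tracks per-owner allocations against a fixed capacity, in MB."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self._allocated = {}  # owner -> MB currently held

    def available_mb(self):
        """Return how much of the pool is still unallocated."""
        return self.capacity_mb - sum(self._allocated.values())

    def allocate(self, owner, size_mb):
        """Grant the request only if it fits; return True on success."""
        if size_mb <= 0 or size_mb > self.available_mb():
            return False
        self._allocated[owner] = self._allocated.get(owner, 0) + size_mb
        return True

    def free(self, owner):
        """Release everything held by owner and return the amount freed."""
        return self._allocated.pop(owner, 0)
```

Keeping allocations keyed by owner is what protects one program's share from another: a request that does not fit is refused rather than taken from someone else's allocation.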
* Distributed Storage
I've been working on creating a cloud abstraction layer called "cloud raid"
as part of our ElasticDrive platform and have been looking at different
approaches for our implementation. My initial idea is to connect multiple
remote cloud storage services (S3, Nirvanix, CloudFS) for a variety of
purposes. During my research the XAM specification began to look like the
most suitable candidate. XAM addresses storage interoperability, information
assurance (security), storage transparency, long-term records retention and
automation for Information Lifecycle Management (ILM)-based practices.
XAM looks to solve key cloud storage problem spots, including:
* Interoperability: Applications can work with any XAM conformant
storage system; information can be migrated and shared
* Compliance: Integrated record retention and disposition metadata
* ILM Practices: Framework for classification, policy, and
implementation
* Migration: Ability to automate migration process to maintain long-term
readability
* Discovery: Application-independent structured discovery avoids
application obsolescence
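
Returning to the "cloud raid" idea of connecting multiple remote storage services, one possible shape is a mirrored (RAID 1 style) layer, sketched below in Python. This is not the ElasticDrive implementation and does not use the XAM API; the backends are in-memory stand-ins, and all class and method names are hypothetical.

```python
class InMemoryBackend:
    """Stand-in for a remote storage service; a real one would call its API."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}
        self.available = True  # flip to False to simulate an outage

    def put(self, key, data):
        if not self.available:
            raise ConnectionError(f"{self.name} unavailable")
        self.blobs[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} unavailable")
        return self.blobs[key]


class MirroredStore:
    """Writes every object to all backends and reads from the first that answers."""

    def __init__(self, backends):
        self.backends = list(backends)

    def put(self, key, data):
        """Return the names of the backends that accepted the write."""
        written = []
        for backend in self.backends:
            try:
                backend.put(key, data)
                written.append(backend.name)
            except ConnectionError:
                continue
        if not written:
            raise ConnectionError("no backend accepted the write")
        return written

    def get(self, key):
        """Try each backend in order, skipping ones that are down or lack the key."""
        for backend in self.backends:
            try:
                return backend.get(key)
            except (ConnectionError, KeyError):
                continue
        raise KeyError(key)
```

Because every backend sits behind the same put/get interface, an object written to two services stays readable when one of them is unreachable, which is the interoperability point in miniature.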
Potential Future Additions to the API
* I/O
The virtualization of I/O resources is a critical part of enabling a set of
emerging cloud deployment models. In large scale cloud deployments a
recurring issue has been the ability to effectively manage I/O resources,
whether at the machine level or the network level. One of the problems a lot
of users are encountering is that of the "nasty neighbor", a user who has
taken all available system I/O resources.
A common I/O API for sharing, security, performance, and scalability will
be needed to help resolve these issues. I've been speaking with
several hardware vendors on how we might be able to address this problem.
This will most likely have to be done at a later point, after a first draft
has been released.
* Monitoring and System Metrics
One of the best aspects of using cloud technology is the ability to scale
applications in tandem with the underlying infrastructure and the demands
placed on it. Rather than just scaling on system load, users should have the
ability to selectively scale on other metrics, such as response time, network
throughput or anything else the provider makes available.
Having a uniform way to interact with system metrics will give cloud
providers and consumers a common way to scale applications.
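
A minimal Python sketch of scaling on arbitrary metrics, rather than on load alone, might look like the following. The function name, the metric names and the threshold values are all assumptions for illustration; the policy (scale up if any watched metric is too high, scale down only if all are low) is just one simple choice.

```python
def scaling_decision(metrics, thresholds):
    """Decide 'scale_up', 'scale_down' or 'hold' from current metric readings.

    metrics maps a metric name to its current value.
    thresholds maps a metric name to a (low, high) pair.
    Metrics without a threshold entry are ignored.
    """
    watched = {name: value for name, value in metrics.items()
               if name in thresholds}
    if not watched:
        return "hold"
    # Any single metric above its high mark is enough to scale up.
    if any(value > thresholds[name][1] for name, value in watched.items()):
        return "scale_up"
    # Scale down only when every watched metric is below its low mark.
    if all(value < thresholds[name][0] for name, value in watched.items()):
        return "scale_down"
    return "hold"
```

For example, with thresholds of (100, 500) for a hypothetical response_ms metric and (10, 80) for network_mbps, a reading of 650 ms triggers a scale-up even when network throughput is low.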
* Security & Auditability
In my conversations with several Wall Street CIOs, the questions of both
security and cloud transparency with regard to external audits have come up
frequently.
---
My list of requirements is by no means a complete list. Cloud computing
encompasses a wide variety of technologies, architectures and deployment
models. What I am attempting to do is address the initial pain points
whether you are deploying a cloud or just using it. A lot of what I've
outlined may be better suited to a reference implementation than a standard,
but nonetheless I thought I'd put these ideas out for discussion.
Comments Welcome.
(Original Post: http://elasticvapor.com/2008/08/standardized-cloud.html)
--
Reuven Cohen
Founder & Chief Technologist, Enomaly Inc.
blog > www.elasticvapor.com
-
Get Linked in> http://linkedin.com/pub/0/b72/7b4