Where Does Virtualization End and Cloud Begin

There’s much heated debate over cloud computing. The fact is that cloud computing is a paradigm shift in how we deliver compute capacity to end users. Virtualization, on the other hand, while enabling many of the conveniences of cloud computing, is not required to actually build a cloud. Virtualization is also an actual technology that creates a paradigm shift, not merely a label for one, as cloud computing is. In other words, there is no product that can “cloud you,” while there are many products that can “virtualize your business.” That said, most well-designed clouds include virtualization technologies.

The virtualization technologies that help enable cloud computing paradigms include virtualized networks, virtualized storage, virtualized namespaces, IP addresses, and clustered resources; and, probably the most commonly referenced, virtualized server and desktop machines. I’ll assume, at this point, that if you’re reading this, you’re familiar with virtualization as it generally pertains to servers and desktops.

The question is, where does virtualization end and cloud begin?

To answer that, we have to establish some generally agreed-upon assertions about cloud computing. In general, a modern cloud infrastructure will provide at least:

  • Self-Service – A self-service user portal to deploy and destroy compute instances
  • Automation and Elasticity – An engine that automatically handles the deployment request, notifies the user when the resource is ready, and terminates the resource when the user indicates it is no longer needed
  • Metering and Reporting – An administrative tool that meters the consumption of resources in the cloud, as well as the ability to report on that usage; these may be separate tools or an integrated solution
  • Billing & Charge-back (optional) – Depending on whether the cloud is designed for monetary gain or for organizationally charged-back resources, the cloud may have the ability to do automated billing and/or automated charge-back

While any given cloud infrastructure may have dozens or hundreds of additional attributes, I think most cloud architects would agree that the list above covers the basic components of an Infrastructure as a Service (IaaS) cloud solution.
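To make that list more concrete, here is a minimal, purely hypothetical sketch of how those pieces might fit together; none of the class or method names below correspond to any real product’s API.

# Hypothetical sketch of the base IaaS components: self-service requests,
# an automation engine, and metering records for reporting or charge-back.
import uuid
from datetime import datetime

class SimpleCloud:
    def __init__(self):
        self.instances = {}   # instance_id -> metadata
        self.usage_log = []   # metering records

    # Self-service: a user requests an instance and gets a handle back.
    def request_instance(self, user, template, lifetime_hours=None):
        instance_id = str(uuid.uuid4())
        self.instances[instance_id] = {
            "user": user,
            "template": template,
            "started": datetime.utcnow(),
            "lifetime_hours": lifetime_hours,
        }
        self._meter(user, "deploy", instance_id)
        return instance_id  # the automation engine notifies the user when ready

    # Elasticity: the same engine destroys the instance on request or expiry.
    def destroy_instance(self, user, instance_id):
        self.instances.pop(instance_id, None)
        self._meter(user, "destroy", instance_id)

    # Metering and reporting: every action is recorded for later charge-back.
    def _meter(self, user, action, instance_id):
        self.usage_log.append((datetime.utcnow(), user, action, instance_id))

    def usage_report(self, user):
        return [record for record in self.usage_log if record[1] == user]

cloud = SimpleCloud()
vm = cloud.request_instance("alice", template="web-server", lifetime_hours=1)
cloud.destroy_instance("alice", vm)
print(cloud.usage_report("alice"))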

In general, I have seen two “marketectural” (marketing plus architecture) approaches to answering our question.

The first approach seems to treat virtualization as a very traditional component of cloud computing. The virtualization component of the cloud simply virtualizes servers and desktops and provides access to storage and networking for them. This approach does not include any aspect of user self-service or metering; it offers only the rudimentary monitoring capability that has always been included in the solution.

This first approach seems to be the most common in the industry. To complete the base cloud infrastructure, a company or service provider must purchase additional add-ons, which are often product-specific and appear designed to lock you into a single virtualization infrastructure.

The second approach is one that expands the architectural capabilities of the virtualization component and treats the cloud component of an IaaS solution as a higher level of capability and abstraction. This is the approach that Red Hat has taken with Red Hat Enterprise Virtualization (RHEV) 3.0 and its announced, upcoming CloudForms solution.

Red Hat sees the base components of an IaaS cloud as key enablers at the virtualization layer. So RHEV 3.0 includes a power user portal (self-service & elasticity) giving users the ability to:

  • View assigned servers and desktops
  • Create/Edit/Delete VMs
  • Run VMs with all options (including attaching a CD, etc.)
  • Create/Edit/Delete/Preview Snapshots
  • Create/Edit/Delete Templates
  • View VM statistics and status
  • View resource usage and statistics
    • Including network, storage, CPU and memory
  • Access the VM Console

RHEV 3.0 also includes an Enterprise Edition of JasperReports to provide advanced metering and reporting capability.

The only base component of a cloud architecture not included in RHEV 3.0 is billing and charge-back, which is optional because it is not necessary in all solutions. Generally, when billing and/or charge-back are required, they are implemented at a higher level, where they can include additional aspects and components of the cloud in the calculation of the resources consumed. This could include software licenses, physical disk usage, network usage and other resources that may or may not be consumed directly through the virtualized environment. Integrating the billing component into the virtualization layer would preclude a successful, extensible and inclusive billing and charge-back solution. For companies that require this capability, Red Hat has partners with solutions like Tivoli’s Usage and Accounting Manager (TUAM).

If Red Hat is offering an entire IaaS cloud solution, at no additional charge, as components of its RHEV 3.0 offering, then where does CloudForms come in?

Red Hat views cloud computing as more than just virtualizing your infrastructure. Clouds should provide abstraction from both the hardware and the virtualization layer itself. A cloud infrastructure should give the enterprise or service provider the capability to use any sort of back-end infrastructure they choose, physical or virtual, and provide automated, rules-based, dynamic access to those resources. That is the intent of Red Hat’s future CloudForms offering.

CloudForms employs a concept of Deployables, which can define entire workloads, including database servers, app servers and web servers, all in a single package that can be deployed from a single end-user interface. Whereas RHEV 3.0 is going to deliver an entire IaaS cloud solution at one of the lowest costs in the industry, CloudForms will enable things like the automated deployment of an entire Platform as a Service (PaaS) solution with just a few clicks.

In general, virtualization will end where your virtualization solution provider decides it will. Most virtualization solutions provide only the basics and call the rest (self-service, metering, reporting, etc.) “cloud” at significant additional cost. Red Hat’s virtualization solution provides everything an enterprise or service provider needs to stand up an IaaS offering at one low cost. Future offerings like CloudForms will serve to further enhance RHEV 3.0’s capabilities and ensure that customers have choice and flexibility in their cloud architectures, not just one solution that can’t fit all.

Cloud Storage – Overcomplicated, Underfunctional

This piece is not about cloud storage as an off-site storage solution. This is about the storage used by public cloud providers and private cloud implementations.

There is much debate over what kind of storage should be used in cloud computing solutions. Some say it depends; some claim internal disks in cloud servers are just fine; others say only Fibre Channel-based enterprise disk is acceptable for a highly available, commercial cloud offering. I’m on the “it depends” side of the fence; in other words, standing on top of it. The type of storage used is not really what concerns me right now.

The lack of cloud awareness in storage is what’s bothering me. It seems that there is no cloud-ready storage solution. Enterprise storage, advanced storage management, storage replication, deduplication, etc., are all based on the premise of legacy, static, stateful computing. While there have been enhancements to support virtualization, these enhancements do not inherently address the needs of cloud computing.

There are some key components of a cloud. One of them is a self-service user interface. This means that users who are not savvy, who do not understand storage tiers, and who do not really care how storage works beyond being able to save files can deploy and destroy workload instances on the cloud’s storage back end at their whim. Another key component of a cloud architecture is workload automation. The cloud management platform needs to be able to automatically deploy and destroy workloads based on requests from the aforementioned self-service user interface.

This poses a certain dilemma for traditional enterprise storage offerings. Existing enterprise storage has no way to know the intended statefulness or longevity of a block of data in a cloud. There are no mechanisms in place for the cloud management software to notify the storage that a particular data set, virtual machine, OS instance, application, etc., may only be temporary, or that it may need to exist for the next 6 months.

Why does any of this matter? Well, let’s say a user schedules a workload instance for one hour. They deploy that workload, and it happens to have identical bits to another workload deployed on the cloud’s back-end storage. The storage, being smart, analyzes this and begins a disk deduplication process to reduce the overall usage of physical capacity. The deduplication process is now consuming processor time and hard drive I/O as it performs this function. It may be about 50% complete when the hour expires and the cloud management software automatically removes the workload that the storage subsystem has spent the last hour attempting to deduplicate.

This is just one example of an extremely inefficient use of a storage controller’s capabilities and performance capacity. There are certainly ways to manually work around this, but the workarounds very quickly become limiting and shortly thereafter render many advanced storage management techniques useless with cloud computing.

The solution to the matter is for storage vendors to enhance their storage management interfaces (storage APIs) to the storage subsystems themselves, as well as their advanced storage management software, with cloud-aware capabilities. The cloud management software should be able to communicate to the storage back end what kind of storage it needs and how long it will need it, and should have the ability to modify those policies on the fly as users inevitably change their minds about which resources, and how many, they need at any given time.
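As a sketch of what such a cloud-aware storage API could look like, consider the following. This is entirely hypothetical; the interface, field names and thresholds are invented here rather than taken from any vendor’s actual product. The point is simply that the cloud management layer passes lifetime and statefulness hints down to the storage, and the storage uses them to decide whether expensive operations like deduplication are worth starting.

# Hypothetical cloud-aware storage interface; no real vendor API is implied.
from dataclasses import dataclass
from typing import Optional

@dataclass
class StoragePolicy:
    expected_lifetime_hours: Optional[float]  # None means indefinite
    stateful: bool                            # does the data outlive the workload?
    tier: str                                 # e.g. "ssd", "sas", "archive"

class CloudAwareArray:
    def __init__(self):
        self.volumes = {}

    def provision(self, volume_id, size_gb, policy):
        self.volumes[volume_id] = (size_gb, policy)

    def should_deduplicate(self, volume_id, min_lifetime_hours=4.0):
        # Skip expensive deduplication for volumes that will be gone
        # before the effort pays off.
        _, policy = self.volumes[volume_id]
        if policy.expected_lifetime_hours is None:
            return True
        return policy.expected_lifetime_hours >= min_lifetime_hours

    def update_policy(self, volume_id, policy):
        # The cloud manager can revise its hints as users change their minds.
        size_gb, _ = self.volumes[volume_id]
        self.volumes[volume_id] = (size_gb, policy)

array = CloudAwareArray()
array.provision("vm-scratch-01", 20,
                StoragePolicy(expected_lifetime_hours=1, stateful=False, tier="ssd"))
print(array.should_deduplicate("vm-scratch-01"))  # False: not worth the I/O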

Web Hosting Providers Are Cloud Pioneers

In the 1990s and early 2000s, web hosting was just that: web hosting. Web hosting services were extremely simple or extremely complex. There wasn’t much in between for the average end user. Modern web hosting providers, like DreamHost, have been offering Cloud-like services for some time. It could even be argued that some of these “otherwise simple” web hosting providers are actually the pioneers of public Cloud Computing, driven by a utility computing model.

The Similarities

Public Clouds are a place to put workloads that an individual or business is not executing locally on their own systems. Web hosting providers offer essentially this service. Historically, though, there were no automated systems or on-demand capacity provided by web hosts.

Let’s look at the offerings of DreamHost, a provider whose services I know fairly well. DreamHost offers “Goodies,” which include its One-Click Installs. These are workloads like WordPress, phpBB (forums), Zenphoto (an image content management system) and about a dozen more that users can automatically deploy with very little effort and total automation on the back end.

DreamHost also offers integration with Google’s Gmail, Apps and Hosting, should a user decide to integrate with their Cloud-based offerings. Additionally, there’s Amazon CloudFront integration to utilize Amazon’s S3 storage for rapid end-user performance worldwide. All of this integration amounts to Cloud-In-Cloud (CiC) offerings that go beyond the hybrid Cloud into Cloud unions and intersections. Fair warning to the numerically challenged: We may soon be expressing Cloud architectures in the form of mathematical sets.

The Differences

While all of this automation and hosting may sound very cloudy, there are limitations and, hence, reasons that DreamHost may not offer a real Cloud experience. The user does not get to do anything they want with the service.

IBM’s Blue Cloud and Amazon’s EC2, for example, allow developers to write code, utilize tremendous grid-based compilation processing and execute generalized workloads. The presentation of the platform is automated; what the user does on it is generally not. A user’s public Cloud experience is generally assumed to be whatever they want it to be, restricted only by the operating systems provided or the capabilities of the Cloud itself.

With DreamHost, the system is self-serviceable and almost completely automated, but that limits what the user can accomplish. It is against the terms of service, for example, to use DreamHost’s processing power for large compilation tasks or distributed computational processing. These are generally considered some of the primary purposes of Cloud Computing.

Cloudy Computing

The concept of Cloud Computing is continuing to take shape. There is no exact right or wrong as to what Cloud is. There are clearly a lot of uses for it, though. Just as HPC Clouds and Commercial Clouds become more interchangeable, the services offered by modern web hosting providers are becoming more Cloud-like all the time.

Hybrid Clouds, More Hype Than Happen, More Talk Than Tech

I had a great conversation with a well-respected colleague of mine today. We discussed what it will take to deliver on the promise of hybrid clouds. We both agreed that a significant amount of intelligence needs to be added to the current architecture of Cloud Computing in order to even begin to deliver on the promise of making a hybrid cloud a reality. My colleague seems to think it will take the industry another decade to really make these technologies as ubiquitous as IP and the Internet itself. I’m of the opinion that we can get there faster if the industry collaboratively focuses on some of the major hurdles.

The Hybrid Cloud

A hybrid cloud is one in which a workload can theoretically move seamlessly between a private cloud and a public cloud. Hybrid clouds offer the promise that you can have protected workloads internally, capacity-driven workloads in an on-demand public cloud, and the ability to shift some of those workloads between the two, depending on requirements.

For the last few years, from about 2008 on, various individuals and organizations have been touting the benefits of a hybrid cloud architecture. On paper, hybrid clouds look wonderful. But there is a disconnect between the paper diagram and the reality of the situation.

It appears as though many of the supposed cloud experts involved in the mass hysteria of hybrid clouds have yet to dig deeply into the technical limitations of modern workload portability. While the concept of a hybrid cloud, and the ability to shift workloads from one datacenter to another sounds fantastic, there is a significant gap between the architecture of the existing technology and the business requirements.

The Missing Links

An exhaustive list of all the obstacles involved in hybrid clouds is not our intention. Generally speaking, there are successful implementations of both private and public clouds already. Yet, at this time, the major obstacle of bridging the two into a hybrid cloud is workload mobility.

Workload mobility is what allows the two cloud types to talk to each other, for lack of a better term. Workload mobility can be accomplished in several different ways. A workload can be migrated offline or online, it can be the entire operating system and application stack, or it could be just the application. A single workload may include multiple instances of operating systems and applications or it may be a single entity.

Which aspects of workload mobility get implemented on a cloud-by-cloud basis are left up to the designers and owners of each cloud. Regardless of the implementation details, the private cloud must have a means of offloading a workload to a public cloud and/or vice versa. The case may also exist for workload mobility between two public clouds. It is generally taken for granted in many hybrid cloud architectural designs that this capability already exists, but the technology to deliver workload mobility in today’s hybrid cloud is actually quite limited.

To some extent, workload mobility does exist. But the existing workload mobility was designed to be utilized within a single datacenter, or within a single network. Hybrid clouds require that a workload, usually a virtual machine itself, move outside the datacenter, usually over a WAN to another datacenter. While this sort of workload mobility can be accomplished on a limited basis today, the existing technology is not designed to support commonplace and well-managed mobility of workloads across the WAN.

For the purpose of this article, the workload is assumed to include access to the data that the workload requires. Every aspect of shifting the workload from one location, or one cloud, to another should include the same qualifications for the workload’s respective data. To effectively accomplish everyday workload mobility across the WAN, several aspects of workload mobility must be addressed (a hypothetical sketch of how these might be expressed follows the list):

  • Workload delivery guarantee – not just a simple Ack (Acknowledgement Packet)
  • Workload mobility Quality of Service (QoS)
  • Workload Security & Compliance
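To illustrate what these requirements might look like in practice, here is a hypothetical description of a migration request; the field names are invented for this sketch and do not reflect any existing standard or product.

# Hypothetical hybrid-cloud migration request; all field names are illustrative.
from dataclasses import dataclass, field
from typing import List

@dataclass
class MigrationRequest:
    workload_id: str
    source_cloud: str
    destination_cloud: str
    # Delivery guarantee: an Estimated Time of Migration (ETM), negotiated
    # before the move starts so the workload can prepare itself.
    etm_minutes: int
    fail_back_on_timeout: bool = True
    # Mobility QoS: the priority of the migration itself, which may differ
    # from the runtime priority of the workload being moved.
    migration_priority: int = 5   # 1 = highest, 10 = lowest
    # Security and compliance: constraints that must hold in transit and at rest.
    encryption_in_transit: bool = True
    compliance_tags: List[str] = field(default_factory=list)  # e.g. ["PCI-DSS"]

request = MigrationRequest(
    workload_id="billing-app-03",
    source_cloud="private-dc-east",
    destination_cloud="public-provider-a",
    etm_minutes=45,
    migration_priority=2,
    compliance_tags=["PCI-DSS"],
)
print(request)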

Workload Delivery Guarantee

Delivery guarantee does not just refer to the successful move of a workload, the ability to fail back if the move is unsuccessful, or the ability to acknowledge its success. Workload delivery guarantee requires that there be some method of planning ahead, before the workload migration begins, to ensure the timely arrival of the workload. In the interest of time, it also requires that there be a predetermined time frame for the workload to arrive at its new destination. Since this time frame should be predetermined, it implies that the workload should be made aware of its Estimated Time of Migration (ETM). Additionally, based on the time estimate, the workload should have the opportunity to act accordingly and prepare itself prior to the activation of the migration process.

Workload Mobility QoS

The Quality of Service aspect of workload mobility is tied closely to the delivery guarantee. QoS is a method of organizing the priority of network traffic, and it is required at the workload level in the same way that network QoS is necessary at the packet level. Without some means of determining the priority of workloads during migration, it would be very difficult to offer any sort of ETM prior to or during the actual migration process.

It is also important to bear in mind that workload mobility QoS is not directly tied to the relative importance of the workload that’s migrating, although that may often be the case. For example, the QoS level assigned to the migration of a particular workload may be higher or lower than the processor priority, or uptime priority, assigned to the workload itself.

Workload Security & Compliance

Security and compliance are becoming increasingly important as more regulatory bodies scrutinize how business is done with respect to technology. Contrary to what some technology purists seem to believe, almost every business has some sort of regulatory restrictions on it. This includes PCI Compliance for credit card and retail transactions and financial compliance for every business that files its taxes or keeps its records electronically. To claim that security and compliance are only issues for major financial, federal, or health related industries just shows a lack of business acumen on the part of some technologists.

Having established the necessity for compliance with hundreds of regulatory bodies, we still lack clearly established methods of ensuring compliance during workload migrations from one cloud to another.

What Needs To Be Done

The industry really needs to collaborate to address the above issues, and several others. Workload mobility is the cornerstone of hybrid clouds, and right now, that capability is extremely limited at best.

The most obvious work needs to be done at the network layer. This includes integration with the virtualization layer, as the virtualization layer is almost always a critical component of workload mobility. Above that, there is optional work to be done at the operating system and application layers, to further facilitate the transparency of migrating workloads inter-cloud.

The enhancements required at the network layer are the most critical at this juncture. The current level of network awareness for workload mobility is akin to an aviation system that has only local air traffic control and no communication between cities. Planes would take off and land in whatever order they were ready to go or arrive. At some point, too many planes would be waiting to arrive at a single city because no planning was done ahead of time, and they would start to run out of gas in the air or have to request priority clearance to land in front of other planes that were expecting to be on the ground shortly. Most of the time, things would get sorted out; every now and then, we’d lose a plane. But even when things worked out, it would not provide any sort of reliable flight times.

The need to increase the integration with the virtualization layer is a natural extension of the network layer. In the above analogy, air traffic control needs to be able to communicate to the plane its expected departure and arrival times before it leaves the gate. There also needs to be a means of ensuring that those times remain accurate, and a method of notifying the plane once it has taken off if there is an emergency that requires it to take action. There is no guarantee that the primary, intended server or network connection will be available from start to finish.

The extra mile is integration with operating systems and applications. This provides the ability to not only update the wrapper that holds the workload, but also the application performing the work and the operating system supporting it (though I conjecture we are not far off from those becoming integrated, as well). This is the equivalent of the plane’s captain being able to communicate with the flight crew and the passengers in the cabin. Everyone can prepare for how long the flight will be, and can be updated if there are any changes to their status.

The issues surrounding security and compliance will need to be addressed at all the layers of existing architectural models. Most systems have traditionally been designed to be held in a secured environment, with the onus of security placed on exogenous utilities and appliances. That paradigm has to shift somewhat, as the workloads themselves will need to maintain a state of security during migration. Depending on the implementation, that state of security can optionally be maintained when not migrating, adding to the overall benefits of the additional architecture; much like wearing your seatbelt in the plane while it is still parked at the gate.

In the coming years, we will undoubtedly be hearing from some of the industry leaders, and probably some emerging ones, about technologies they are developing to address these needs. Currently, hybrid clouds trail the airline industry in their ability to transport workloads effectively. With proper consideration and collaboration, hybrid clouds may offer the equivalent of commercial flights to the moon in the next several years. It is safe to assume that many unforeseen needs will arise along the way and will create entirely new markets for Cloud Computing technologies.

Cloud Computing, What It Is, And What It’s Not

Executive Overview

Cloud computing is a concept. It is an architectural framework by which one or many organizations can deploy, manage and retract any workload, public or private. Cloud computing addresses business needs from a self-service, automated workload perspective. The concept collectively addresses all the aspects of modern computing, from components (SAN, Network, Servers, Software) to implementations (Virtual Desktops, Hosted Applications, E-mail, etc.) in a comprehensive, cohesive solution.

What It Is

The industry has evolved from implementing disparate, individual systems to sharing workloads and the cost of those workloads (Grids), to offering software and solutions as services (Service Oriented Architecture – SOA). Cloud is the next step in the evolution of the industry; it is the step that meets business requirements with a dynamic approach. “My business, my user, needs to do this.” Cloud makes this possible with the fewest duplicated efforts.

The buzzword-laden and slightly more complete definition:

A Cloud is a dynamic, infinitely scalable, expandable, and completely contractible architecture. It may consist of multiple, disparate, local and non-local hardware and virtualized platforms hosting legacy, fully installed, stateless, or virtualized instances of operating systems and application workloads.

What It Is Not

Cloud computing is not a platform, specific hardware architecture, specific software architecture or any specific product. It is not Internet-based computing, nor is it merely the use of shared resources or the storage of data somewhere abstract. Were any of this the case, then the first time an e-mail, a document or any other piece of data were ever stored on a server on the Internet, that would have been considered a Cloud.

Marketers seem to be struggling with how to position and sell Cloud computing and the offerings based on it. This is leading to many misrepresentations of what a Cloud is. Most of the Cloud computing solutions being marketed today are merely hyped-up repackagings of Internet-based or Web 2.0 computing. These solutions and offerings are components of what a Cloud computing architecture would represent.

Amazon’s online resource offering, EC2, is a good example of the common divide between marketing and technology. Amazon’s site defines EC2 as: “Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud.” It would more appropriately be defined as something to the effect of “… is a web service based on Amazon’s Cloud computing architecture that provides resizable compute capacity to its users via the Internet.”

The Internet is not “the Cloud”, yet that seems to be the most common misuse of the term. This misuse is confusing business people as to what the Cloud really is and technology professionals as to how a Cloud is useful within their organizations. As a technology professional, it is important to understand that Cloud computing has benefits and applications far beyond large web service and hosting providers.

This does not preclude Cloud computing from use in Internet-based solutions. Amazon’s EC2 and Google Apps are good examples of this. The technologies used to deploy these systems are either heavily or entirely Cloud-based. The systems are dynamic, extensible and expandable. They may or may not exhibit all of the qualities of a true Cloud computing architecture, but they certainly qualify as being Cloud-based.

Another misconception of Clouds is that they are exclusively public, private, internal or external. Based on its definition, Cloud computing is a construct to implement any of those solutions, independently of each other or all together. A properly designed Cloud computing architecture could allow an organization to dynamically deploy, manage and retract internal, external, public and private workloads.

Although a Public and a Private Cloud could be one and the same, most commonly, if a Cloud computing architecture is implemented to offer billable, service-based offerings to external users or customers, it is considered a Public Cloud. Likewise, it is becoming commonplace to refer to Cloud computing architectures that only offer compute services to internal employees as Private Clouds.

Cloud computing is fast becoming fantastic marketing jargon for companies and organizations that may or may not have the capability or the desire to really explain it or deliver on its promise. It is not an easy concept to grasp. The more abstract a concept is, the harder it is to explain; and even harder to properly implement. Cloud computing is an abstract concept that includes the implementation of multiple abstract technologies. All of the intangibles involved make explaining Cloud computing difficult, but poor explanation should not minimize what Cloud computing can accomplish.

How We Get There

Cloud Characteristics

Extensible – it can be modified to suit multiple purposes while the base architecture remains intact

Accessible – the services are easy to deploy, access and manage

Scalable – the components used in the design can be scaled indefinitely

Contractible – the deployed services can be easily removed

Clouds are built from existing and emerging technologies. Cloud computing architectures will be put in place and merged with existing, installed systems. They will incorporate every major technology used today. Virtualization and interconnectivity are only two of the vital technologies that go into implementing a Cloud computing solution. Service Oriented Architecture (SOA), Storage Area Networks (SAN), and dynamic configuration of virtual networks (VLANs) and physical networks are all part of it. Self-service user portals, virtualized desktops and shared compute resources could all be part of a well-designed Cloud.

Cloud computing is accomplished with a building-block approach. Start with the base reference architecture. Install the underlying tools to deploy, manage and retract the resources on that architecture. Then add the necessary components (hardware and software) for the workloads that a particular Cloud needs to support. As workload requirements increase, additional building blocks are added to the Cloud.

What about traditional Operating System (OS) deployment tools? What about application deployment and orchestration tools? These legacy tools are part of the building blocks that get added to a Cloud computing architecture. On their own, they do not constitute a Cloud. They are part of the components that provide the ability to add and customize workloads within the Cloud.

One of the major requirements of Cloud computing is that the underlying tools to deploy, manage and retract the resources in a Cloud must be indefinitely scalable. If the underlying architecture is not scalable beyond any known capacity, then the design could be limiting.
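As a toy illustration of the building-block idea, and of why the underlying design must keep scaling, here is a short, hypothetical sketch; the class names and capacity figures are invented for this example.

# Toy sketch of the building-block approach; all names and numbers are invented.
class BuildingBlock:
    def __init__(self, name, capacity_units):
        self.name = name
        self.capacity_units = capacity_units

class BuildingBlockCloud:
    def __init__(self, base_architecture):
        self.base_architecture = base_architecture
        self.blocks = []  # compute, storage and network blocks added over time

    def total_capacity(self):
        return sum(block.capacity_units for block in self.blocks)

    def ensure_capacity(self, required_units, block_factory):
        # Add identical building blocks until demand is met; the base
        # architecture never changes, which is what keeps the design scalable.
        while self.total_capacity() < required_units:
            self.blocks.append(block_factory())

cloud = BuildingBlockCloud(base_architecture="reference-architecture-v1")
cloud.ensure_capacity(100, lambda: BuildingBlock("compute-node", capacity_units=16))
print(len(cloud.blocks), cloud.total_capacity())  # 7 blocks, 112 units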

Why Clouds?

Why are we building Clouds? What does Cloud computing accomplish?

Cloud computing

  • Reduces time to deployment
  • Reduces administration
  • Increases application flexibility
  • Decreases dependence on proprietary platforms
  • Enables Fit For Purpose Computing
  • Decouples the Workload from the Platform

Conclusion

The industry will build Clouds because Cloud computing is the next major step in delivering solutions, not just applications. Organizations need to rapidly deploy new workloads faster than ever. They need to be able to dynamically modify how those workloads are deployed. They need to be able to redeploy and retract those workloads, on-demand, like never before. Cloud computing is the logical next step in dynamic infrastructures and architectures.