Saturday, December 03, 2011

Will Amazon Support Linux Containers?

Early on, Amazon EC2 was recognized as the leading IaaS provider because of its ability to easily provision new virtual machines in a variety of configurations (size, speed, attachments, etc.). Virtual machines are a powerful yet simple tool for engineers, but they come at a price (a performance hit). At MomentumSI, we've been pondering whether Amazon will ever support Linux Containers in its cloud.


When asked, "Will Amazon Support Linux Containers?" Raj comments, "Would love it. We may see a type of instance which allows containers on it. You will have to take the whole machine and not just a container on it. That way AWS will not have to bother about maintaining the host OS. Given the complexities I think it will be a lower priority for Amazon and as it may be financially counterproductive; they may never do it."



Tom comments, "I doubt it. While I'm one of, if not *the*, biggest proponent of linux containers, the business reasoning still lags the technical reasoning. Intel, for instance, would *hate* such a move. Why? They spent a ton of money on virtualization at a chip level, which becomes a non-issue in containers (no hardware gets shared at the metal, rather, it's all one kernel for all containers). So, while it would be a great thing to see, the business market simply doesn't support this at this point, other than for folks like Pixar or other compute heavy folks.

What I *would* bet on is that AWS internally switches to some container based systems. For instance, ElasticMapReduce is far better off in a container world than in a VM world. Easier to maintain, direct access to 'cpu speed' and no need to virtualize access to disks -- it's all just there (even ISCSI ends up better in containers -- no 'vm to hypervisor' network translations)."


Amazon will likely be forced into one of three positions: 
1. Delivering sub-optimal platform performance on VMs (current state)
2. Supporting Linux Containers behind the scenes but not giving customers access to them.
3. Delivering Linux Containers to customers and dealing with a whole new set of technical headaches. 


I'm more optimistic than my counterparts on the likelihood of #3. My reasons are simple: First, Amazon has done what it needed to do to satisfy customer needs. Second, I think they'll need to do it to remain competitive with companies like Rackspace. As developers move from "needing a VM" to "needing a platform" (database, app server, etc.), Amazon will be pressed to expose a higher-performance layer to platform developers. One thing my associates and I agreed on is that we are not likely to see containers in 2012... perhaps 2013?

Tuesday, November 22, 2011

Is Cloud Foundry a PaaS?

I've been asking some people in the industry a really simple question: "Is Cloud Foundry a Platform as a Service?"

The obvious answer would seem to be "yes" - after all, VMware told us it's a PaaS:

That should be the end of it, right? For some reason, when I hear "as-a-Service", I expect a "service" - as in Service Oriented. I don't think that's too much to ask. For example, when Amazon released their relational data service, they offered me a published service interface:
https://rds.amazonaws.com/doc/2010-07-28/AmazonRDSv4.wsdl

I know there are people who hate SOAP, WS-*, WSDL, etc. - that's cool, to each their own. If you prefer, use the RESTful API: http://docs.amazonwebservices.com/AmazonRDS/latest/APIReference/

Note that the service interface IS NOT the same as the interface of the underlying component (MySQL, Oracle, etc.), as those are exposed separately.

Back to my question - is Cloud Foundry a PaaS?

If so, can someone point me to the WSDLs, RESTful interfaces, etc.?

Will those interfaces be submitted to DMTF, OASIS or another standards body?

Alternatively, is it merely a platform substrate that ties together multiple server-side technologies (similar to JBoss or WebSphere)?

Will cultural pushback kill private clouds?

Derrick Harris asks the question, "Will cultural pushback kill private clouds?" His questioning comes from a piece by Lydia Leong in which she notes that many enterprises have fat management structures and aren't organized like the leaner cloud providers.

I tend to agree with the premise that the enterprise will have difficulties in adopting private cloud but not for the reasons the authors noted. The IaaS & PaaS software is available. Vendors are now offering to manage your private cloud in an outsourced manner. More often than not, companies are educated on cloud and "get it". They have one group of people who create, extend and support the cloud(s). They have another group who use it to create business solutions. It's a simple consumer & provider relationship.

Traditionally, there are three ways things get done in Enterprise IT:
1. The CIO says "get 'er done" (and writes a check)
2. A smart business/IT person uses program funds to sneak in a new technology (and shows success)
3. Geeks on the floor just go and do it.

With the number of downloads of open source stacks like OpenStack and Eucalyptus, it is apparent that model #3 is getting some traction. My gut tells me that the #2 guys are just pushing their stuff to the public cloud (will beg forgiveness - not asking for permission). On #1, many CIOs are hopeful that they can just 'extend their VMware' play - while more aggressive CIOs are looking to the next generation cloud vendors to provide something that matches the public cloud features in a more direct manner.

There are adoption issues in the enterprise. However, they're the same old issues. Fat org-charts aren't going away and will not be the life or death of private cloud. In my opinion, we need the CIOs to make bold statements on switching to an internal/external cloud operating model. Transformation isn't easy. And telling the CIO that they need to fire a bunch of managers in order to look more like a cloud provider is silly advice and a complete non-starter.

Friday, August 12, 2011

Measuring Availability of Cloud Systems

The analysts at Saugatuck Technology recently wrote a note on "Cloud IT Failures Emphasize Need for Expectation Management". One comment caught my attention:

"Recall that the availability of a group of components is the product of all of the individual component availabilities. For example, the overall availability of 5 components, each with 99 percent availability, is: 0.99 X 0.99 X 0.99 X 0.99 X 0.99 = 95 percent."

I understand their math - but it strikes me as odd that they would use this thinking when discussing cloud computing. In cloud environments, the components are often available as virtualized n+1 highly available pairs. If one is down, the other takes over. In a non-cloud world, this architecture is typically reserved for the most critical components (e.g., load balancers or other single points of failure). It's also common to create a complete replica of the environment in a disaster recovery area (e.g., AWS availability zones). In theory, this leads to very high up-time.
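To make the comparison concrete, here is a minimal Python sketch of the two calculations: the analysts' series product, and the textbook formula for a redundant group that stays up as long as one copy is up. The function names are my own, and the redundancy formula assumes the copies fail independently, which is exactly the assumption questioned next.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def redundant_availability(availability, copies):
    """Availability of a group that is up if at least one copy is up.

    Assumes the copies fail independently of each other.
    """
    return 1 - (1 - availability) ** copies


# Five single components at 99% each, as in the analysts' example.
chain = series_availability([0.99] * 5)

# One component deployed as a highly available pair.
pair = redundant_availability(0.99, 2)

# The same five-component chain, with every component deployed as a pair.
chain_of_pairs = series_availability([pair] * 5)

print(f"5 single components in series: {chain:.4f}")
print(f"one redundant pair:            {pair:.4f}")
print(f"5 redundant pairs in series:   {chain_of_pairs:.4f}")
```

Under the independence assumption, pairing each component lifts the five-component chain from roughly 95 percent to roughly 99.95 percent.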

Let me put this another way... I currently have 2 cars in my driveway. Let's say each of them has 99% up-time. If one car doesn't start, I'll try the other car. If neither car starts, I'll most likely walk over to my neighbor's house and ask to borrow one of their two cars (my DR plan). You can picture the math... in the 1% chance that car A fails, there's a 99% chance that car B will succeed, and so on. However, experience in both cars and in computing tells us that this math doesn't work either. For instance, if car A didn't start because it was 20 degrees below zero outside, there's a good chance that car B won't start - and for that matter, my neighbor's cars won't start either. Structural or natural problems tend to infect the mass.
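One rough way to see the effect of the cold snap is to split each car's downtime into a shared part (an event that takes out every car at once) and a part that is independent per car. The sketch below does that in Python. The `common_cause` parameter, its value of 0.5 percent, and the function name are illustrative assumptions of mine, not measured data or an established model for cloud systems.

```python
def group_availability(unit_availability, units, common_cause=0.0):
    """Availability of `units` redundant units with a shared failure event.

    `common_cause` is the probability of an event (such as a cold snap) that
    takes every unit down at once. Each unit's overall availability stays at
    `unit_availability`; the rest of its downtime is treated as independent.
    """
    unit_downtime = 1 - unit_availability
    if not 0 <= common_cause <= unit_downtime:
        raise ValueError("common_cause must lie between 0 and the unit downtime")
    # Per-unit downtime that remains once the shared event is factored out.
    independent_downtime = (unit_downtime - common_cause) / (1 - common_cause)
    return (1 - common_cause) * (1 - independent_downtime ** units)


# Four cars (two of mine, two of my neighbor's), each with 99% up-time.
independent = group_availability(0.99, 4)
correlated = group_availability(0.99, 4, common_cause=0.005)

print(f"four independent cars:        {independent:.8f}")
print(f"with a shared cold-snap risk: {correlated:.8f}")
```

With any meaningful shared risk, the group's downtime is dominated by the shared event, and adding more cars barely helps.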

I wish I could show you the new math for calculating availability in cloud systems - but it's beyond my pay grade. What I know is that the old math isn't accurate. Anyone have suggestions on a more modern approach?

Thursday, August 11, 2011

OpenShift: Is it really PaaS?

Red Hat recently announced an upgraded version of OpenShift with exciting new features including support for Java EE6, Membase, MongoDB and more. See details at:

As I dug through the descriptions, I found myself with more questions than answers. When you say Membase or MongoDB are available as part of the PaaS, what does this really mean? For example:
  • They're pre-installed in clustered or replicated manner?
  • They're monitored out of the box?
  • Will it auto-scale based on the monitoring data and predefined thresholds? (both up and down?)
  • They have a data backup / restore facility as part of the as-a-service offering?
  • The backup / restore are as-a-service?
  • The backup / restore use a job scheduling system that's available as-a-service?
  • The backup / restore use an object storage system that has cross data center replication?
Ok, you get the idea. Let me be clear - I'm not suggesting that OpenShift does or doesn't do these things. Arguments can be made that in some cases, it doesn't need to do them. My point is that several new "PaaS offerings" are coming to market and they smell like the same-ole-sh!t. If nothing else, the product marketing teams will need to do a better job of explaining what they currently have. Old architects need details.

It's no secret that I'm a fan of Amazon's approach of releasing their full APIs (AWS Query, WSDL, Java & Ruby APIs, etc.) along with some great documentation. They've built a layered architecture whereby the upper layers (PaaS) leverage lower layers (Automation & IaaS) to do things like monitoring, deployment & configuration of both the platforms and the infrastructure elements (block storage, virtual compute, etc.). The bar has been set for what makes something PaaS - and going forward, products will be measured on this basis. It's ok if your offering doesn't do all the sophisticated things you find in AWS - but it's better to be up front about it. Old architects will understand.

Tuesday, April 26, 2011

Private Cloud Provisioning Templates

One of the primary benefits of a cloud computing environment is the increased automation. The Provisioning Service is perhaps the core mechanism to deliver this. To better understand the kinds of things we might orchestrate, take a look at the following template. You'll notice that it takes on the same format as Amazon's CloudFormation. This example launches a load balancer as part of our LB-aaS solution for a Eucalyptus cloud:

{
  "ToughTemplateFormatVersion" : "2011-03-01",

  "Description" : "Launch Load Balancer instance and install LB software.",

  "Parameters" : {
    "AvailabilityZone" : {
      "Description" : "AvailabilityZone in which an instance should be created",
      "Type" : "String"
    },
    "AccountId" : {
      "Description" : "Account Id",
      "Type" : "String"
    },
    "LoadBalancerName" : {
      "Description" : "Load Balancer Name",
      "Type" : "String"
    }
  },

  "Mappings" : {
    "AvailabilityZoneMap" : {
      "msicluster" : {
        "SecurityGroups" : "default",
        "ImageId" : "emi-FF070BFE",
        "KeyName" : "rarora",
        "EKI" : "eki-3A4A0D5A",
        "ERI" : "eri-B2C7101A",
        "InstanceType" : "c1.medium",
        "UserData" : "80"
      }
    }
  },

  "Resources" : {
    "LoadBalancerLaunchConfig": {
      "Type": "TOUGH::LaunchConfiguration",
      "Properties": {
        "AccountId" : { "Ref" : "AccountId" },
        "SecurityGroups" : { "Fn::FindInMap" : [ "AvailabilityZoneMap", { "Ref" : "AvailabilityZone" }, "SecurityGroups" ]},
        "ImageId" : { "Fn::FindInMap" : [ "AvailabilityZoneMap", { "Ref" : "AvailabilityZone" }, "ImageId" ]},
        "KeyName" : { "Fn::FindInMap" : [ "AvailabilityZoneMap", { "Ref" : "AvailabilityZone" }, "KeyName" ]},
        "InstanceType" : { "Fn::FindInMap" : [ "AvailabilityZoneMap", { "Ref" : "AvailabilityZone" }, "InstanceType" ]},
        "EKI" : { "Fn::FindInMap" : [ "AvailabilityZoneMap", { "Ref" : "AvailabilityZone" }, "EKI" ]},
        "ERI" : { "Fn::FindInMap" : [ "AvailabilityZoneMap", { "Ref" : "AvailabilityZone" }, "ERI" ]}
      }
    },
    "LoadBalancerInstance" : {
      "Type" : "TOUGH::EUCA::LaunchInstance",
      "Properties" : {
        "AccountId" : { "Ref" : "AccountId" },
        "AvailabilityZone": { "Ref" : "AvailabilityZone" },
        "LaunchConfig" : { "Ref" : "LoadBalancerLaunchConfig" },
        "Setup" : {
        }
      }
    },
    "RegisterLoadBalancerInstance" : {
      "Type" : "TOUGH::ElasticLoadBalancing::RegisterLoadBalancerInstance",
      "Properties" : {
        "AccountId" : { "Ref" : "AccountId" },
        "LoadBalancerName" : { "Ref" : "LoadBalancerName" },
        "Instance" : { "Ref" : "LoadBalancerInstance" }
      }
    },
    "Setup" :{
      "Type" : "TOUGH::EUCA::Parallel",
      "Operations" : {
        "TrackLoadBalancerInstance" : {
          "Type" : "TOUGH::EUCA::TrackInstance",
          "Name" : "LoadBalancerInstance",
          "Properties" : {
            "AccountId" : { "Ref" : "AccountId" },
            "InstanceId" : { "Fn::GetAtt" : [ "LoadBalancerInstance", "InstanceId" ] }
          }
        },
        "InstalLoadBalancerSoftware" : {
          "Type" : "TOUGH::ElasticLoadBalancing::InstallLoadBalancerSoftware",
          "Properties" : {
            "AccountId" : { "Ref" : "AccountId" },
            "IP" : { "Fn::GetAtt" : [ "LoadBalancerInstance", "PublicIp" ] }
          }
        }
      }
    }
  },

  "Outputs" : {
    "PublicIP" : {
      "Description" : "PublicIP address of the LoadBalancer",
      "Value" : { "Fn::GetAtt" : [ "LoadBalancerInstance", "PublicIp" ] }
    }
  }
}

The JSON format can be a bit difficult to read if you're not familiar with it. Amazon and others now have UIs that facilitate the creation of the templates. In this example, there are a few items worth noting:
1. The template accepts input variables and returns information at the end of execution
2. The orchestration automates a series of tasks (launches a bare image, installs LB software, tracks the progress, configures the software, registers the newly launched instance, etc.)
3. The templates treat the cloud concepts (availability zones, cloud services, etc.) as first-class concepts in the syntax.
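To illustrate the first point, the following Python sketch shows how input parameters and the "Mappings" section could be substituted into a resource's properties. It is only a guess at CloudFormation-style semantics for "Ref" and "Fn::FindInMap"; it is not how the Tough Provisioning Service is actually implemented, the `resolve` function is hypothetical, and the "demo-account" value is made up for the example.

```python
def resolve(node, parameters, mappings):
    """Recursively replace Ref and Fn::FindInMap nodes with concrete values."""
    if isinstance(node, dict):
        if set(node) == {"Ref"}:
            # Names that are not input parameters (for example, references
            # to other resources) are left untouched in this sketch.
            return parameters.get(node["Ref"], node)
        if set(node) == {"Fn::FindInMap"}:
            map_name, key, field = (
                resolve(part, parameters, mappings)
                for part in node["Fn::FindInMap"]
            )
            return mappings[map_name][key][field]
        return {
            name: resolve(value, parameters, mappings)
            for name, value in node.items()
        }
    if isinstance(node, list):
        return [resolve(item, parameters, mappings) for item in node]
    return node


# A fragment of the template above.
mappings = {
    "AvailabilityZoneMap": {
        "msicluster": {"ImageId": "emi-FF070BFE", "InstanceType": "c1.medium"}
    }
}
properties = {
    "AccountId": {"Ref": "AccountId"},
    "ImageId": {"Fn::FindInMap": ["AvailabilityZoneMap", {"Ref": "AvailabilityZone"}, "ImageId"]},
    "InstanceType": {"Fn::FindInMap": ["AvailabilityZoneMap", {"Ref": "AvailabilityZone"}, "InstanceType"]},
    "LaunchConfig": {"Ref": "LoadBalancerLaunchConfig"},
}

# Hypothetical input values supplied by the caller.
parameters = {"AvailabilityZone": "msicluster", "AccountId": "demo-account"}

resolved = resolve(properties, parameters, mappings)
print(resolved)
```

A real provisioner would also need to resolve references between resources and "Fn::GetAtt" lookups, which only become available after an instance has launched.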

Keep in mind that the orchestration scripts can be multiple levels deep. This example was a simple one just to launch a load balancer. A more complicated orchestration would initiate multiple orchestration templates.

In the coming months, we'll be releasing a series of templates designed to orchestrate the provisioning of many common applications. The provisioning templates will fully leverage the power of the cloud (auto-scale, auto-recover, auto-snapshot, auto-balance, etc.).

Sunday, April 24, 2011

Private Cloud Provisioning & Configuration

Cloud provisioning has focused on the rapid acquisition and initialization of a new server, disk or some other piece of infrastructure. Provisioning a single piece of infrastructure is now quite easy. Provisioning an entire set is much more complicated. In addition to the setup of a single piece of equipment, it's necessary to understand the dependencies between elements. In some cases, certain infrastructure components must be launched before another element, or configuration data from one item needs to be used in a third element. Getting it all right is a difficult task and is a major cause of system failures. An approach to solving the problem is to consider the Deployment Fidelity, that is, the degree to which a deployment is able to fully describe its architecture and configuration in a digitally precise manner.

Historically, application architects have used Word documents and Visio diagrams to depict the relationship between their software modules and the hardware infrastructure that would host them. Deployment Fidelity deals with accurately describing a set of computing resources and their relationship to each other. Organizations that embrace high fidelity will digitally describe their software and hardware topology (what type of hardware, operating systems, memory, infrastructure services, platform services, etc.) and pass the digital description to the cloud provisioner for execution. The business value is two-fold. First, the high fidelity description reduces the chances of manual error, especially during hand-off. Second, the automation of the provisioning task reduces the deployment time and associated costs (e.g., sysadmins running individual scripts, testers waiting for new environments, etc.).





To increase the Deployment Fidelity, the relationships between elements must be captured. For instance, if an application server uses a relational database, the link between the two is recorded and configuration variables (such as IP addresses) are noted. If the server has an outage, a replacement can be auto-launched with the same configuration information. As the complexity of an application increases (load balancers, web servers, app servers, multiple databases, message queues, pub/sub, etc.) the need to keep a digital description becomes extremely important in order to reduce the chance of errors during deployment.
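As a small illustration of why captured relationships matter, the Python sketch below derives a launch order from a dependency map using the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). The topology and element names are hypothetical and are not taken from any Tough template.

```python
from graphlib import TopologicalSorter

# Hypothetical topology: each element maps to the elements it depends on.
topology = {
    "database": set(),
    "message_queue": set(),
    "app_server": {"database", "message_queue"},
    "web_server": {"app_server"},
    "load_balancer": {"web_server"},
}

# static_order() yields every element after all of its dependencies.
launch_order = list(TopologicalSorter(topology).static_order())
print(launch_order)
```

Because the relationships are recorded as data, the same description can be replayed to rebuild the environment, and a circular dependency is reported as an error (`graphlib.CycleError`) instead of being discovered during deployment.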

From an organizational perspective, there are two highlights:
1. The deployment architect can describe their proposed solution with complete fidelity - no misinterpretation. In addition, if there is an issue, the changes to the architecture can be captured in version control, just as if it were another piece of software code.
2. The sysadmin or release engineer can take the provisioning script and easily create a new environment (e.g., replicating Dev to Test).

Today, MomentumSI is announcing the release of two new services that orchestrate the provisioning of complex application topologies and then provide the configuration information:
The Tough Provisioning Service provides functionality equivalent to that found in Amazon's CloudFormation and is API/Syntax compatible with their offering.

The Tough Configuration Service integrates the most popular configuration management systems into the private cloud. Use your choice of Chef or Puppet to create configuration scripts and then expose them as enterprise grade services (secure access, multiple node delivery, guaranteed transmission, closed loop feedback, etc.).

Our solution brings this functionality to your private cloud by complementing your existing investment in VMware or Eucalyptus.

For more information, see Tough Solutions.

Tuesday, April 05, 2011

Are Enterprise Architects Intimidated by the Cloud?

EAs are often the champions of large change initiatives that span multiple business units. If they're not on board - we've got problems.

Here's why I ask the question:
1. It's my perception (perhaps incorrect) that the EA leadership typically doesn't come from a background in infrastructure architecture. It's been my observation that the EAs who tend to get promoted usually have a background in business or application architecture. These people are often hesitant to enter deep discussions on CPU power consumption, DNS propagation, VLAN decisions, storage protocols, hypervisor trade-offs, etc.

2. Most people have agreed that the cloud can be viewed as a series of layers. You can attack it from top (SaaS) or bottom (IaaS). Quite frankly, there isn't *that much* architecture in SaaS (other than the secure connection and integration). That leaves IaaS as the starting point - which takes me back to point #1 - IaaS intimidates the EA team, meaning that they're relying on the I.T. data center operations team (and localized infrastructure architects) to define the foundational IaaS layers which will serve PaaS, Dev/Test, disaster recovery, Hadoop clusters, etc.

Any truth here? Leave a comment (moderated) or send me an email either way: jschneider AT MomentumSI DOT com