Wednesday, March 16, 2011

Providing Cloud Service Tiers

In the early days of cloud computing, emphasis was placed on 'one size fits all'. However, as our delivery capabilities have increased, we're now able to deliver more product variations, where some products provide the same function (e.g., storage) but deliver better performance, availability, recovery, etc., and are priced higher. I.T. must recognize that some applications are business critical while others are not; forcing users to pay for the same class of service across the spectrum is not a viable option. We've spent a good deal of time analyzing various cloud configurations, and can now deliver tiered classes of service in our private clouds.

Reviewing the trials, tribulations and successes of implementing cloud solutions, one can separate tiers of cloud services into two categories: 1) higher-throughput elastic networking; and 2) higher-throughput storage. We leave the third (more CPU) out of this discussion because it generally boils down to 'more machines,' whereas storage and networking span all machines.

Higher network throughput raises complex issues regarding how one structures networks – VLAN or L2 isolation, shared segments, and others. Those complexities, and the related costs, increase dramatically when adding multi-speed NICs and switches – for instance, 10GBase-T, NIC teaming and other such facilities. We will delve into all of that in a future post.

Tiered Storage on Private Cloud

Where tiered storage classes are at issue, cost and complexity are not such dramatic barriers, unless we include a mix of network and storage (i.e., iSCSI storage tiers). For the sake of simplicity, let's ignore that and break the areas of tiered interest into: 1) elastic block storage (“EBS”); 2) cached virtual machine images; and 3) running virtual machine (“VM”) images. In the MomentumSI private cloud, we've implemented multiple tiers of storage services by adding solid state drives (SSDs) to each of these areas, but doing so requires matching the nature of the storage usage with the location of the physical drives.

Consider implementing EBS on higher-speed SSDs. Because EBS volumes are exposed over network channels so that they remain attachable to various VMs, SSDs will not deliver their characteristic speed improvements unless a very high-speed network carries the drive signaling and data. Whether one uses ATA over Ethernet (AoE), iSCSI, NFS, or another model to project storage across the VM fabric, even standard SATA II drives under load can saturate a one-gigabit Ethernet segment. By exposing EBS volumes on their own 10GbE network segments, however, EBS traffic stands a much better chance of not overloading the network. For instance, at MSI we create a second tier of EBS service by mounting SSDs on the mount points under which volumes will exist – e.g., /var/lib/eucalyptus/volumes, by default, on a Eucalyptus storage controller. Doing so gives users of EBS volumes the option of paying more for 'faster drives.'
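As a minimal sketch of that setup (the device name, filesystem, and the idea of scripting it in Python are illustrative assumptions, not our actual tooling):

```python
# Hypothetical sketch: mount a dedicated SSD partition over the Eucalyptus
# storage controller's volume directory so new EBS volumes land on the
# faster tier. Device and filesystem are assumptions for illustration.
import os
import subprocess

SSD_DEVICE = "/dev/sdb1"                    # hypothetical SSD partition
VOLUME_DIR = "/var/lib/eucalyptus/volumes"  # default Eucalyptus SC volume path

def mount_fast_tier(device, mount_point, fstype="ext4"):
    """Mount the SSD over the volume directory if it isn't mounted already."""
    if not os.path.isdir(mount_point):
        os.makedirs(mount_point)
    with open("/proc/mounts") as mounts:
        if any(line.split()[1] == mount_point for line in mounts):
            return  # something is already mounted there; leave it alone
    subprocess.check_call(["mount", "-t", fstype, device, mount_point])

if __name__ == "__main__":
    mount_fast_tier(SSD_DEVICE, VOLUME_DIR)
```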

While EBS gives users of cloud storage a higher tier of user storage, cloud operations themselves also present a point of optimization, and thus of tiered service. The goal is to optimize the creation of images and to spin them up faster. Two particular operations generate significant disk activity in a cloud implementation. The first is caching VM images on hypervisor mount points. Consider Eucalyptus, which stores copies of kernels, ramdisks (initrd), and Eucalyptus Machine Image (“EMI”) files on a (usually) local drive at the Node Controller (“NC”). One could also store EMIs on iSCSI, AoE or NFS storage, but the same discussion as for EBS applies (pair fast networking with fast drives). The key to the EMI cache is not so much fast writes as rapid reads. The second operation is creating the running copy: for each running instance of an EMI (i.e., a VM), the NC copies the cached EMI and uses that copy to spin up the VM. Therefore, what we desire is very fast reads from the EMI cache, with very fast writes to the running EMI store. Clearly that does not happen if the same drive spindle and head carry both operations.

In our labs, we use two drives to support the higher-speed cloud tier operations: one for the cache and one for the running VM store. However, to get a Eucalyptus NC, for instance, to use those drives in the most optimal fashion, we must direct the reads and writes to different disks – one drive (disk1) dedicated to the cache, and one drive (disk2) dedicated to writing/running VM images. Continuing with Eucalyptus as the example setup (though other cloud controllers show similar traits), the NC will, by default, store the EMI cache and VM images on the same drive – precisely what we don't want for higher tiers of service.

By default, Eucalyptus NCs store running VMs on the mount point /usr/local/eucalyptus/???, where ??? represents a cloud user name. The NC also stores cached EMI files on /usr/local/eucalyptus/eucalyptus/cache – clearly within the same directory tree. Therefore, unless one mounts another drive (partition, AoE or iSCSI drive, etc.) on /usr/local/eucalyptus/eucalyptus/cache, the NC will create all running images by copying from the EMI cache to the run-space area (/usr/local/eucalyptus/???) on the same drive. That causes significant delays in creating and spinning up VMs. The simple solution: mount one SSD on /usr/local/eucalyptus, and then mount a second SSD on /usr/local/eucalyptus/eucalyptus/cache. A cluster of Eucalyptus NCs could share the entire SSD 'cache' drive by exposing it as an NFS mount that all NCs mount at /usr/local/eucalyptus/eucalyptus/cache. Consider, though, that the cloud may write an EMI to the cache in response to a request to start a new VM on one node controller, while another NC attempts to read that EMI before the cached write completes, due to a second request to spin up that EMI (not an uncommon scenario). There are a number of ways to solve that problem; one is sketched below.
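One such approach is to have the writer publish an EMI under a temporary name and atomically rename it into place once the bytes are on disk, so a second NC never opens a half-written image. The paths below are the Eucalyptus defaults; the helper functions themselves are hypothetical.

```python
# Hypothetical sketch of one fix for the shared-cache race: write the EMI
# under a temporary name, then rename it into place. rename() is atomic
# within a single directory, so readers never see a partially written file.
import os

CACHE_DIR = "/usr/local/eucalyptus/eucalyptus/cache"  # shared NFS mount

def publish_emi(emi_id, data):
    """Write an EMI into the shared cache without exposing partial writes."""
    tmp_path = os.path.join(CACHE_DIR, emi_id + ".part")
    final_path = os.path.join(CACHE_DIR, emi_id)
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())         # push the bytes to the NFS server
    os.rename(tmp_path, final_path)  # atomic publish within one directory

def emi_is_ready(emi_id):
    """A second NC only trusts EMIs that have been renamed into place."""
    return os.path.exists(os.path.join(CACHE_DIR, emi_id))
```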

The gist here: by placing SSDs at strategic points in a cloud, we can create two forms of higher-tiered storage services: 1) higher-speed EBS volumes; and 2) faster VM spin-up times. Both create valid billing points, and both can exist together or separately in different hypervisor clusters. This capability is now available via our Eucalyptus Consulting Services and will soon be available for vCloud Director.

Next up – VLAN, L2, and others for tiered network services.

Monday, March 14, 2011

Auto Scaling as a Service

The Tough Auto Scaling Service is our offering to enable automated scaling of an application tier at runtime. System data collected by a monitoring service provides the intelligence to provision or deprovision resources according to SLAs. Out of the box, our service uses our own Tough Monitoring Service; however, since we're using the de facto standard (Amazon Web Services), you can plug in any implementation that is AWS compatible.

The auto scaling service works by defining an 'auto scaling group', which identifies the kind of service that will shrink or expand based on system load. The most common use case for auto scaling is the Web tier, where additional Web servers are added on the fly to respond to heavy loads. Auto scaling can also be used on stateful tiers, but extra attention must be paid to managing the state replication mechanisms (clustering, etc.). A minimal sketch of defining such a group follows.
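Because the API is AWS compatible, the definition can look just like it does against Amazon – here sketched with the open-source boto library. The names, image id, and group sizes are placeholders, not values from our product, and pointing the client at a private endpoint is omitted for brevity.

```python
# Minimal sketch with boto (an AWS client library): define a launch
# configuration and an auto scaling group for the Web tier.
from boto.ec2.autoscale import (AutoScaleConnection, LaunchConfiguration,
                                AutoScalingGroup)

conn = AutoScaleConnection(aws_access_key_id="ACCESS_KEY",
                           aws_secret_access_key="SECRET_KEY")

# What each newly provisioned Web server looks like.
lc = LaunchConfiguration(name="web-lc", image_id="emi-12345678",
                         instance_type="m1.small")
conn.create_launch_configuration(lc)

# The group: the Web tier floats between 2 and 10 servers based on load.
group = AutoScalingGroup(group_name="web-tier",
                         availability_zones=["cluster01"],
                         launch_config=lc, min_size=2, max_size=10)
conn.create_auto_scaling_group(group)
```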

As new servers are provisioned in response to load, they can be added to a dynamically programmable load balancer. This enables inbound application traffic to be evenly divided across the array of virtual servers identified in an auto scaling group. Conversely, when load returns to normal levels, the virtual servers are taken out of the load-balanced pool, allowing a graceful shutdown. To enable this scenario, we're using our Tough Load Balancing Service, but once again, customers can use any AWS-compatible load balancer to perform this operation – for example, along these lines:
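(Another boto sketch against the AWS-compatible API; the balancer name and instance ids are hypothetical.)

```python
# Sketch: registering and deregistering virtual servers against an
# ELB-compatible load balancer with boto. Names and ids are placeholders.
from boto.ec2.elb import ELBConnection

elb = ELBConnection(aws_access_key_id="ACCESS_KEY",
                    aws_secret_access_key="SECRET_KEY")

# Scale out: put the newly provisioned servers into the pool.
elb.register_instances("web-lb", ["i-0a1b2c3d", "i-0e4f5a6b"])

# Scale in: pull a server from the pool so it can shut down gracefully.
elb.deregister_instances("web-lb", ["i-0e4f5a6b"])
```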

One of the key concepts of cloud computing is the concept of 'elasticity'; another is 'automation'. The Auto Scaling Service brings these two concepts together and applies them toward the compute side of the world to provide three key benefits:
1. Increased success rates on Service Level Agreements - The system auto scales to meet SLAs
2. Higher utilization rates - Unused virtual servers are released back to the pool
3. Reduced operating costs - Predefined policies automate activities that previously would have been human-intensive tasks.

Combined, these three benefits make auto scaling a critical component of any private/hybrid cloud environment. It's also worth pointing out that the auto scaling service is a fundamental building block for enabling other scalable services, such as Platform Services (PaaS).

Wednesday, March 09, 2011

Non-Invasive Cloud Monitoring as a Service

The Tough Cloud Monitoring solution is our next generation offering targeting virtualized workloads, as well as PaaS services, housed in either traditional data centers or private cloud environments.

By monitoring, we mean 'health and performance' monitoring of infrastructure and platforms. Our service provides the traditional statistical information: CPU utilization, disk I/O, network traffic, etc. This raises the question, "why does the world need yet another monitoring solution?" Quite frankly, we were surprised that there weren't better options available on the market. So, once again, we started from scratch with a new design center:

1. Make it massively scalable and highly available
Some of our customers currently have thousands of virtualized workloads operating, and it is clear that the next generation of service providers will have tens of thousands running. Our design needed to scale easily, both in data collection and in burst storage. We bit the bullet and designed a solution from scratch with Apache Cassandra at the core. This enabled us to leverage its built-in cross-data-center peer replication schemes and dynamic partitioning. In addition, Cassandra was a good fit for us because it was designed to accept very fast (stream-oriented) writes of data.
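To make the write path concrete, here's a rough sketch of the kind of schema this implies, using the pycassa Cassandra client: one wide row per instance/metric pair, with timestamped columns appended as samples stream in. The keyspace, column family, and comparator are illustrative assumptions, not our actual schema.

```python
# Illustrative sketch with pycassa: append monitoring samples as
# timestamped columns in a wide row per (instance, metric) pair.
# Assumes a 'metrics' column family with a LongType column comparator.
import time
import pycassa

pool = pycassa.ConnectionPool("monitoring", server_list=["cass1:9160"])
metrics = pycassa.ColumnFamily(pool, "metrics")

def record_sample(instance_id, metric, value):
    """One sample = one column write; Cassandra absorbs these very quickly."""
    row_key = "%s:%s" % (instance_id, metric)
    metrics.insert(row_key, {int(time.time() * 1000): str(value)})

record_sample("i-0a1b2c3d", "CPUUtilization", 73.2)
```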

2. The monitors should be non-invasive and agent free
Being non-invasive is always a good goal; it makes it easier to collect data on targets without having to install additional software on the machines (which can be a real problem when you already have lots of machines running in production). Knock on wood, but so far we've been able to deliver all of our monitors completely out-of-band. No need to install Ganglia, collectd, etc. on hundreds or thousands of boxes...

3. The monitors should support a standard, service oriented API
In building our early private clouds, we were surprised to see that most system monitoring tools were "closed" systems. They collected the data but didn't make it easily available to other systems; they were designed to deliver the information to humans in HTML. This was a non-starter for us, since the new world is about achieving higher levels of system automation (not human tasking). Naturally, we went with the de facto standard: Amazon Web Services and the CloudWatch API. Our solution delivers full compatibility with CloudWatch from a WSDL, AWS Query and command-line perspective. This makes it really easy for the monitoring data to be consumed by other services like Auto Scale.
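For example, any CloudWatch client can pull statistics through the standard API. A sketch with boto (the metric, namespace, and instance id are placeholders):

```python
# Sketch: querying a CloudWatch-compatible endpoint with boto so that
# other services (or a simple script) can consume the monitoring data.
from datetime import datetime, timedelta
from boto.ec2.cloudwatch import CloudWatchConnection

cw = CloudWatchConnection(aws_access_key_id="ACCESS_KEY",
                          aws_secret_access_key="SECRET_KEY")

end = datetime.utcnow()
points = cw.get_metric_statistics(
    period=300,                          # five-minute buckets
    start_time=end - timedelta(hours=1), # the last hour of data
    end_time=end,
    metric_name="CPUUtilization",
    namespace="AWS/EC2",
    statistics=["Average"],
    dimensions={"InstanceId": "i-0a1b2c3d"})

for point in points:
    print("%s %s" % (point["Timestamp"], point["Average"]))
```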

4. Use a consistent model for IaaS and PaaS
By supporting the AWS service interface model, we inherited this feature. Just as CloudWatch monitors services like Amazon's Elastic Load Balancer and Relational Database Service, we'll be providing similar support for internal PaaS platforms.

We believe that we have achieved all of our design goals. The Tough Cloud Monitor is available today for traditional data centers, private clouds or service providers.

Tuesday, March 08, 2011

Tough Load Balancing as a Service

Last week, MomentumSI announced the availability of our Tough Load Balancing Service along with a Cloud Monitoring and Auto Scaling solution.

The concept of load balancing has been around for decades - so nothing too new there. However, applying the 'as a Service' model to load balancing remains a fairly new concept, especially in the traditional data center. Public cloud providers like Amazon have offered a similar function for the last couple of years and have seen significant interest in their offering. We believe LB-aaS offers an equivalent productivity boost to traditional data centers, private cloud customers or service providers who want to extend their current offerings.

Our design goals for the solution were fairly simple - and we believe we met each of them:

1. Don't interfere with the capability of the underlying load balancer
The LB-aaS solution wraps traditional load balancers (currently, software-based only) to enable rapid provisioning, life-cycle management, configuration and high-availability pairing. All of these functions run outside the ingress/egress path of the data, which means you do not incur additional latency in the actual balancing. Our design also lets us snap in various load balancer implementations. The current solution binds to HAProxy, and to Pound for SSL termination. Based on customer demand, we anticipate adding further providers (e.g., F5, Zeus). Our goal is to nail the "as-a-Service" aspect of the problem and to be able to easily swap in the right load balancer implementation for our customers' specific needs.
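The flavor of the wrapping, reduced to a toy: render the balancer's configuration from the service's own model, write it out, and gracefully reload – all out of band, while the data path stays pure HAProxy. The template, paths, and port below are illustrative, not our provisioning code.

```python
# Toy sketch of out-of-band config management for a wrapped HAProxy:
# render a backend section from a server pool and reload the balancer.
import subprocess

BACKEND_TEMPLATE = """backend %(name)s
    balance roundrobin
%(servers)s
"""

def render_backend(name, servers, port=80):
    lines = "\n".join("    server %s %s:%d check" % (host, host, port)
                      for host in servers)
    return BACKEND_TEMPLATE % {"name": name, "servers": lines}

def apply_config(config_text, path="/etc/haproxy/haproxy.cfg"):
    with open(path, "w") as f:
        f.write(config_text)
    # A graceful reload keeps existing connections alive, so the change
    # itself adds no latency to traffic already in flight.
    subprocess.check_call(["service", "haproxy", "reload"])

apply_config(render_backend("web-pool", ["10.0.0.11", "10.0.0.12"]))
```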

2. Make life easier for the user
I was recently at one of my enterprise customers, speaking with an I.T. program manager. She commented that her team was in a holding pattern while they ordered a new load balancer for their application. Her best guess was that it was going to take about 5 weeks to get through their internal procurement cycle, and then another 2-3 weeks for the I.T. operations people to get around to installing, configuring and testing it. When I told her about our LB-aaS solution (2-3 minutes to provision and another whopping 5-10 minutes to configure), she just started laughing... and made a comment about necessity being the mother of invention.

3. Deliver an open API
Delivering an open API was an easy decision for us. We went with the Amazon Web Services Elastic Load Balancer API, maintaining compatibility with their WSDL as well as providing command-line capabilities and support for the AWS Query protocol. As the ecosystem around AWS continues to grow, we want companies to be able to plug into our software immediately, without code-level changes.

4. Don't cause pain down the road
We've seen some companies put software-based load balancers into their VM image templates. We see this as last year's stop-gap solution. The lack of device-specific life-cycle management leads to configuration drift, and the lack of a service-oriented interface means you can't use the load balancer as part of an integrated solution pattern (like auto scaling). Let's face it, the world is moving to an 'as a service' model for some good reasons.

Again, the Tough Load Balancing Service is available today and can easily work in current data centers, private clouds or service providers.

Wednesday, March 02, 2011

Separating IaaS into Two Layers

For some time now, I've been watching cloud architects consider their strategy for deploying wide-scale Infrastructure-as-a-Service (IaaS). Many of my friends are quick to draw the standard Gartner cloud stack (SaaS, PaaS followed by IaaS). And although I think this is a simple way to look at the layers, it can be dangerous if that's where the conversation ends.

I'd like to suggest that we consider at least two distinct IaaS layers:



Some people call the first layer "Hardware-as-a-Service". It primarily focuses on the 'virtualization' of hardware, enabling better manipulation by the upper layers. This was the core proposition of the original EC2. There are some great vendors in this space, like Eucalyptus, Cloud.com and VMware. Cool projects are also emerging out of OpenStack, which many of the aforementioned companies hope to adopt and extend.

The second layer is the 'automation layer'. It focuses on providing convenience mechanisms around layer-1 services. This includes everything from making multi-step human tasks more easily accomplished through orchestrations, to closed-loop systems akin to those described in autonomic computing. The core elements delivered in layer 2 include self-inspection, self-healing, self-protection and resource optimization. These are some pretty powerful concepts – so powerful, in fact, that it often makes sense for consuming technologies to bind to layer 2, rather than directly to layer 1.
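In code terms, the heart of a layer-2 service is little more than a reconciliation loop over layer-1 primitives. A toy sketch (the hook functions are placeholders for whatever IaaS client you bind to):

```python
# Toy closed loop in the autonomic spirit: observe actual state, compare
# to desired state, and act through layer-1 primitives.
import time

def reconcile(desired, get_healthy_count, launch, terminate):
    """One self-healing pass: converge the actual count toward 'desired'."""
    actual = get_healthy_count()
    for _ in range(desired - actual):
        launch()       # under capacity: add instances
    for _ in range(actual - desired):
        terminate()    # over capacity: release instances

def run_forever(desired, hooks, interval=60):
    while True:
        reconcile(desired, *hooks)
        time.sleep(interval)
```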

We're starting to see this layered approach unveil itself at Amazon. Services like Elastic Beanstalk focus on integrating many of the lower-layer building blocks into an easy-to-consume bundle, while also delivering several of the autonomic properties. It's pretty cool stuff. But it's only cool if you actually use it. I loved that Amazon started off real low in the stack (EC2 servers) and worked their way up – it was fundamentally the right way to rethink the problem. The downside is that many engineers are now overly comfortable using the original atomic elements when they should be looking harder at the new convenience layers (e.g., CloudFormation, Elastic Beanstalk, etc.)

The announcement we made yesterday regarding custom implementations of Amazon CloudWatch, Elastic Load Balancer and Auto Scaling for the private cloud demonstrates our commitment to this approach. We're also big believers in industry standards. In my younger (and more naive) days, I would have preached 'open standards' over 'industry standards', but I've sat in on too many industry conference calls listening to vendors with agendas bicker over standards, only to wait years for a solution designed by a committee. When it comes to cloud standards, I'll gladly let those younger (or more patient) than I fight those fights. Until then, we're backing the de facto standard, AWS. And to those who say "a standardized API isn't important", I'll have to kindly disagree ;-)

Friday, February 25, 2011

Amazon CloudFormation Exceeds Expectations

Today, Amazon released their latest offering, CloudFormation. Simply put, CloudFormation is the service we've all been waiting for. The entire topology of an application can be described, including the images, storage, security, load balancing, auto scaling, databases, messaging and more. It's the glue that holds it all together.

CloudFormation provides some UI screens that allow developers and release engineers to easily describe the makeup of their applications. Under the covers, the description is turned into a structured template. This template can then be sent to the AWS provisioning engine, which understands the dependency chain and launches each component in the proper order.

In standard AWS fashion, the template descriptions are available for developers to review or to create from scratch. Once a template is created, CloudFormation provides an API that developers can call to take the template as input and execute it. It's great to see Amazon continue down the path of not only providing UIs but also making the functions available as services.
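To give a feel for the format, here's a trimmed-down illustration of a two-resource template with an explicit dependency. The resource names and property values are made up for the example, and many required details are omitted for brevity.

```python
# Illustrative two-resource template: the provisioning engine reads the
# DependsOn attribute and launches the database before the web server.
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppDatabase": {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {"Engine": "MySQL",
                           "DBInstanceClass": "db.m1.small",
                           "AllocatedStorage": "5",
                           "MasterUsername": "admin",
                           "MasterUserPassword": "change-me"},
        },
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "DependsOn": "AppDatabase",   # launch-order constraint
            "Properties": {"ImageId": "ami-12345678",
                           "InstanceType": "m1.small"},
        },
    },
}

print(json.dumps(template, indent=2))
```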

At MomentumSI, we've been anticipating the launch of this service. It pulls together all of the piece-parts that Amazon has been developing over the years. Finally, a picture can be painted of how Amazon can be used for complete application solutions. I tip my hat.

Tuesday, November 23, 2010

Defending the Private Cloud

Phil - You know I love hyperbole as much as the next guy, but come on... discrediting the private cloud?? (and to those who aren't aware - I've corresponded with Phil for just over 7 years and have a sincere respect for him... but that doesn't mean I won't rip into his posts ;-)

Phil writes:
"Back in January, I made a controversial prediction that private clouds will be discredited by year end. Now, in the eleventh month of the year, the cavalry has arrived to support my prediction, in the form of a white paper published by a most unlikely ally, Microsoft."
A whitepaper from Microsoft is the cavalry? Wow - it must have been written by Bill Gates himself! Or... a couple of noobs with MBAs and banking backgrounds... But to be fair, the paper rocks. It's dead on. It says that the cloud model is a good one - and that *eventually* more and more applications will be a good fit for the large-scale public clouds.

The cool stuff described in the paper is evolving; it will take time (like a decade). That said... let's take a look at the realities that my clients live with on a daily basis:

1. ALL (not some) of my large clients have a mixed computing environment including some combination of AIX, Solaris and Z. NONE (not some) of the public cloud providers have options for supporting all of these environments. I know, you're thinking to yourself... well, they should just port the applications to Linux/Wintel and all would be good. However, in the vast majority of the cases, the applications are packaged software and my clients have little influence over the vendors who own them. So, to be clear - a significant portion of the applications are not targets for the current large scale cloud providers (like Amazon, Microsoft, etc.)

2. Most applications are data intensive and coupled together. This presents a problem when you want to move applications from your internal data center to a public cloud. I compare it to pulling a paper clip out of your desk drawer only to find it bound to a bunch of other paper clips. Enterprise applications are often glued together, either with low-latency requirements between them or with large amounts of data that must move between them (not good if you have remote data centers with thin pipes and ingress/egress fees). **Phil, it's about Loose Coupling ;-)

3. Hardware and software provisioning times in the enterprise are embarrassing. The amount of time/money that is wasted waiting for new environments to be procured, stood up, tested, secured, etc. would astound you. The pain is real TODAY - and waiting a decade for a public cloud to be able to support the half-dozen hardware platforms, operating systems, COTS licenses, etc. you need to perform integration testing on isn't an option.

4. Mankind didn't suddenly change in 2010. It turns out that wholesale moves from one computing model to another are not in the corporate DNA. Enterprises that excel at mitigating risk are taking incremental steps to the cloud. First, they're interested in finding out simple things like "how will my business-critical application perform if we virtualize it?" or "if we move our data-intensive application off our vertically scaled mainframe onto a horizontally scaled, commodity-compute (shared-nothing) architecture, will it still perform?" You see... enterprise I.T. has lots of unknowns around cloud architectures. It will take some time for them to understand the basics. Once they answer the architectural questions, figuring out who hosts it is a rather simple problem (price, service & reliability).




Private cloud is a natural stepping stone. Most I.T. professionals that I have met do not understand the architectures, processes and operating models (regardless of public or private). Pushing naive people to a public cloud where their mistakes will be hidden by a magically elastic service interface is *not a good idea*. Trust me... it shows up when they get the bill. Instead, I wholeheartedly recommend a stepwise approach to learning about horizontal scaling, sharding, MapReduce, BigData, multi-tenant services, etc. in an environment where they can observe actions and outcomes.

Wednesday, August 25, 2010

MomentumSI Partners on Private / Hybrid Cloud

Why self-service private cloud?
  • Improved agility — Deployment cycles shrink from months to minutes, making IT far more responsive to business lines and other internal customers.
  • Reduced capital expense — Utilization of hardware capacity improves dramatically due to elastic provisioning and de-provisioning of services.
  • Reduced operating costs — Software infrastructure and provisioning processes are standardized and automated; control is delegated to decentralized constituents.
  • Reduced risk — The controlled cloud provides an alternative to rogue deployments to the public cloud. The ability to move workloads between deployment environments (physical, virtual or cloud) avoids platform lock-in.
The core technologies and services within the MomentumSI self-service private/hybrid cloud platform include:


Tuesday, June 08, 2010

Current challenges for Application Performance Engineering

Application performance engineering is a discipline encompassing the expertise, tools, and methodologies needed to ensure that applications meet their non-functional performance requirements. Performance engineering has understandably become more complex with the rise of multi-tier, distributed application architectures that include SOA, BPM, SaaS, PaaS, cloud and others. Although performance engineering ideally should be applied across the lifecycle, we're seeing more factors that unfortunately push it into the production phase, typically to resolve problems that have already gotten out of hand. That's clearly a tougher challenge, so how did we get to this point?

In the client-server past, performance optimization was something that folks in the IT department typically figured out through trial and error. Developers learned to write more efficient database queries, database administrators learned to index and cache, and system administrators monitored CPU and memory to upgrade when needed.

As application architectures started to get more complex, the dependencies increased and it became harder for one team to track down problems without chasing its tail. More organizations adopted something previously used only by enterprises with highly scalable, reliable, mission-critical applications – the performance testing lab. Vendors like Mercury created popular load testing tools like LoadRunner, and organizations invested millions in lab hardware and software in an attempt to recreate production environments that they could control for testing purposes.

Unfortunately, these performance labs became very difficult to cost-justify. First, it always seemed to take too much time and money to set up the realistic test environments you'd like, particularly as apps became more distributed. Next, projects were often already behind schedule when it came time to test, so lab time often had to be cut short. Factors like these minimized the lab's value, but the real killer was the high maintenance cost of all that hardware and software, along with the data center and staff.

This put many IT organizations in a tough spot. With limited means to perform system-wide performance testing, and with more SaaS/PaaS/cloud services in their architectures, they had to make do with whatever subsystem-level performance testing they could get. After that, it's crossing your fingers and resigning yourself to further optimization in production.

Unfortunately, production can be a very frustrating place to try to optimize performance, particularly when you have performance problems and growing complaints from customers, partners, etc. It's in these pressured environments that you need true performance engineers who follow a methodical, systematic, end-to-end approach. Performance bottlenecks can reside in a myriad of places in highly distributed architectures, and you need a disciplined methodology to analyze dependencies, isolate problem areas, and then leverage best-of-breed tools to trace, profile and optimize each of the tiers and technologies in the application delivery path. This takes a lot of skill and expertise.

In short, the challenges faced by today's application performance engineer in production settings are a far cry from the client-server days of in-house tuning and experimentation. We expect the role of Performance Engineer to grow in importance as SOA, BPM, cloud, and SaaS/PaaS implementations increase, at least until more viable pre-production system performance testing options rise to the challenge.

Friday, October 23, 2009

SOA Manifesto

Here are a few quick links:

The MomentumSI SOA Manifesto from 2007:

Our discussion forum on the 2009 SOA Manifesto:

The 2009 SOA Manifesto:

Sunday, June 21, 2009

The Case for Expropriated Reuse

Expropriated Reuse is a form of reuse that focuses on the here and now. The goal isn’t to define some new service and hope for ‘accidental reuse’ or even to put forward a case for ‘planned reuse’. Instead, it’s the act of going out and finding redundant code that already exists across multiple systems and turning it into a single shared service. I’ll repeat that: Find redundant code and refactor out the common elements into shared services. This ALWAYS results in multiple consumers (or mandated consumers, if you prefer).

I’ll give a quick example. Over time, company X had ‘accidentally’ created 5 software modules that existed inside of larger applications that all did some form of ‘quoting a price to customers’. This led to 5 teams maintaining it, 5 sets of computer hardware, etc. Expropriated Reuse is the act of going to the applications and cutting out the common elements and turning them into one shared service. Note that this is different than application rationalization or application consolidation that tends to use a nuclear bomb to deal with the problem. We’re recommending sniper firing with a bit of additional precision bombing.

The reasons that we do this are mostly financial. We want to reduce the amount of code that has to be maintained and operated. We want to reduce our costs. There are plenty of other reasons like increased quality, time-to-market, etc. but I’m done with those softies. The case is cold hard cash. Show me how to save money or go away.

IMHO, the new SOA agenda is about expropriated reuse. The SOA Program must actively identify opportunities to make the enterprise software portfolio more efficient and less costly. Just like in city planning exercises we must acknowledge the needs of the community over the needs of the few. And I’m in agreement with Clinton that ‘we should never waste a good crisis’. Reduced I.T. budgets have created a ‘crisis of efficiency’ in virtually all of our clients. The imperative is to find ways to reduce budgets in the short term and over the next 3-5 years.

To quote Todd Biske, “… a common analogy for enterprise architecture these days is that of city planning. … (does) your architecture have more parallels to the hot political potato of eminent domain? Are you having to bulldoze applications that were only built a few years ago whose residents are currently happy for the greater good? What about old town? Any plans for renovation? If you aren’t asking these questions, then are you really embracing enterprise SOA?”

I’ve been wondering: Should all SOA programs that do not have the authority to issue an order of ‘eminent domain’ on the software portfolio be shut down? Does your SOA program have a hunting license to go find inefficiencies / duplicate coding and to issue an order of eminent domain on that code? Can you imagine what our country would look like if we couldn’t issue an order of eminent domain to capture land for our highways, bridges or railroads? Can you imagine if we didn’t have the ability to implement ‘easement by necessity’? Consider your town/neighborhood and think about the following: Railroad easements, storm drain or storm water easements, sanitary sewer easements, electrical power line easements, telephone line easements, fuel gas pipe easements, etc. What a mess it would be. The crisis of budgets must draw out the leaders. If you haven’t already been issued a hunting license, it’s time to go get one.

So I’ll answer my own question. If you’re SOA program is responsible for watching the blinking lights on your newly acquired SOA Management tool, or making sure that people enter their ‘accidental’ services into your flashy registry, I’ll recommend that they shut you down. This is a waste of time. The SOA group must be given the imperative and authority (a hunting license) to find waste in the enterprise and to destroy it. SOA isn’t about policing people on WS-Standards or similar crap – it’s about saving your company millions.

The Case for Planned Reuse

In my last post, I argued that the concept of 'accidental services' – or 'build it and they will come' – is a bad idea, because... they typically don't come. Services created with one very specific consumer in mind are typically limited in capability and scope, and result in limited reuse.

The MomentumSI Harmony method suggests that service analysis be performed on the first consumer's needs as well as on potential consumers that aren't in the immediate scope. This is easier said than done. How do you identify the requirements of a service if you have 'phantom consumers'? The short answer is that there are techniques that involve looking at UI models, process models, data models and other artifacts that will give you insight into the domain. The result is a list of potential consumers and a plan for their eventual consumption. The point is that there are techniques to help organizations define services according to a plan – and doing so leads to increased reuse and a better software portfolio.

Again, Planned Reuse is most effective when you're working in a new domain and you don't already have a bunch of conflicting/overlapping software in place. The immediate project might call for an 'Order Service', but you know that the service will eventually be called by the Web eCommerce system, the call center software, the B2B gateway, etc. Those projects aren't in scope – but you consider their needs when designing the service.

This is all fine, but what happens when you’re analyzing a service for an immediate project that clearly should be called by existing projects/software? This is the case for Expropriated Reuse.

The Case Against Accidental Reuse

"Accidental Reuse" is a term that I've been throwing around a lot lately. In layman's terms, it means, "If you build it, they will come." This notion has been disproved in virtually every field (except in the Field of Dreams) - and I suggest it is even more unlikely in the field of software engineering.

It has been my observation that most software engineering teams suffer from a bad case of 'not invented here' syndrome. We work in a discipline that is not regulated, and the certifications are a joke at best. Most programmers don't have engineering training, and most architects have learned by doing and observing. In short, there are good reasons for one team not to fully trust the output of another team. We also have the issue of human entitlement: it's been my observation that a generation of software developers has been created who feel they are entitled to 'not be bored' and 'deserve cool problems'. Once again, we have people who feel the need to create new stuff – not reuse.

I could go on - but I'm guessing there is no need. Anyone who has been in the industry has seen the problems. This leads back to my original point: accidental reuse is a bad idea. And here, I'm as guilty as the guy sitting next to me on the last 20 panels I've been on, telling the world to 'build it and they will come'... design for reuse... create services... register the services... and they will come. Apologies.

Most SOA report cards that I see still show metrics on how many services there are, how many consumers there are, or how many times a service has been called. These are fine metrics, but in my humble opinion they leave out one very important metric: how many unknown consumers use the service? (accidental reuse, planned reuse & expropriated reuse)

Now it gets interesting. My aggregate data shows that most services are built with one consumer in mind, and almost 80% of all transactions go through the original consumer. Less than 2% go through 'accidental channels'. Of that 2%, most were for externally facing (B2B) systems, where advance knowledge of consumption is limited, or for 'technical' or 'entity' services that offered limited functionality and, in some cases, limited return.

The remaining 18% go toward 'planned reuse' or 'expropriated reuse'. This number is too low - but this is the opportunity. More on this later...

Coming next:
- The Case for Planned Reuse
- The Case for Expropriated Reuse (right of eminent domain)

Saturday, December 20, 2008

Friday, November 07, 2008

Nomination for Federal CTO

From the Barack Obama campaign site:

"Obama will appoint the nation's first Chief Technology Officer (CTO) to ensure that our government and all its agencies have the right infrastructure, policies and services for the 21st century. The CTO will ensure the safety of our networks and will lead an interagency effort, working with chief technology and chief information officers of each of the federal agencies, to ensure that they use best-in-class technologies and share best practices."

I nominate Tim O'Reilly as the first Federal CTO. Do I hear a second??

Thursday, October 23, 2008

16 Corrections on Cloud Computing

In March of 2008, RedMonk analyst James Governor published his list of "15 Ways to Tell if it's not Cloud Computing".

The consultants at MomentumSI have found 16 corrections:


Monday, October 20, 2008

Real World SOA

It was my pleasure to be a guest on the Real World SOA Podcast with David Linthicum. Take a peek:

http://weblog.infoworld.com/realworldsoa/archives/2008/10/my_conversation.html

Topics:
- How do we help organizations with SOA?
- What is the purpose of a SOA methodology?
- What is the next big thing???

Wednesday, September 03, 2008

Talking to the Business about SOA

Recently, there's been more chatter about how (or whether) you should talk to the business about SOA. Yesterday, I sat in on the SOA Consortium conference call where this was the main theme. Interestingly, when the moderator posed the question, a couple of participants were quick to respond: "we don't talk to the business about SOA..." The moderator took it in stride and started down the path of business and I.T. alignment – and once again the participants pushed back. Not to be dissuaded, the moderator went down the BPM path. Once again, the participants pushed back. The group commented that "talking SOA is too abstract for the business" and that there was a "need to talk about business-specific functionality vs. just high-level Agility and Change".

After attending this call, I stumbled onto Joe 2.0's blog post on the exact same topic! Even funnier was that he was quoting Jean-Jacques Dubray (JJ), with whom I had a 3-hour phone call on the subject just days earlier. JJ had commented, "My experience is that the key people that you have to focus all your energy on are the developers, architects, business analysts, QAs and operations."
Joe 2.0 goes on:
Dubray says SOA is a “pure IT problem.” But in this era of the online collaborative organization, when we rely on technology for every aspect of our business, are there really any “pure IT” problems?


I don't want to split hairs... but IMHO, the answer is "yes, some problems are just I.T. problems". Sure, I.T. problems, like HR or garbage collection, may bubble their way up to become business problems, but at the end of the day I.T. has to figure out how to do its job and go do it. When the janitor picks up the trash in my office, they do it in the most efficient way they know how. They don't ask 'the business' whether they should do it efficiently - they just do it. When did I.T. become such wussies?

I'm a big believer in talking to the business about whatever they want to talk about... Inventory Visibility? Love it. Customer Loyalty? Love it. New Product Introduction? Love it. That said, I believe it's I.T.'s responsibility to bring technology solutions forward. Most business people understand things like forms, windows, graphs, reports, etc. They understand visual deliverables (not invisible deliverables like WSDLs). I think that is why we're seeing the most successful SOA shared-service centers adopting capabilities around Rich Composite Applications, Mashups and other edge-of-the-enterprise development capabilities. They engage with the business about business problems and then use mashups and other techniques to quickly demo/prototype/build solutions that their users can relate to.

If you're looking for inspiration on this process, I'm happy to recommend a book on the subject, Mashup Corporations.
The authors do a great job of walking readers through a fictional company. As business problems are encountered, they introduce Web 2.0 and mashup solutions. Prototypes are put together and the concepts are tested out. SOA is discussed as the 'efficient way' to make it happen. Again, they didn't talk to the business about SOA (or even services)!

Monday, September 01, 2008

PaaS Enables New ROI


If you haven't already checked out Amy Shuen's book, "Web 2.0: A Strategy Guide", you should grab a copy; it's worth the read. Amy discusses the trends around Web 2.0 in the clearest, most concise manner I could have hoped for. But enough gushing about her book - one of her diagrams inspired me to think about the effects that PaaS has on the enterprise I.T. development model.

Amy pointed out that a version of the Long Tail lives in the I.T. application development world. Certain business problems (e.g., order management) have a very real and significant value proposition; these systems are often purchased from ISVs. The next set of applications often has slightly less of an ROI; these are often built by the I.T. custom development group. In many cases they are departmental applications or add-ons to the procured systems. Recently, new SaaS solutions have been finding their way into the enterprise because they fulfill point requirements and have a low cost of entry.


In the past, this left lots of business problems in the hands of shadow I.T. or power users. But all too often, new system concepts were taken to the I.T. review board and turned down because they didn't project an adequate ROI. The return on some of these systems might have been rewarding, but the initial investment (hardware, infrastructure licenses, long development cycles, etc.) drove the overall ROI down to the point where the idea was rejected. These systems are prime candidates for PaaS, where the initial investment is significantly decreased by the pre-hosted, pay-by-the-drink model. Once again, hosted platforms will likely be the key enabler of long-tail opportunities.

The long-tail model of reviewing new system requests is an interesting method for I.T. governance and planning committees to consider. It is my belief that if enterprise organizations fail to meet the needs of the long tail, those needs will be met by third-party providers who will be all too willing to help!