Saturday, July 31, 2010

Cloud Services & Availability SLA – Should scheduled maintenance be excluded from the measurements?

The availability SLA is an important criterion for enterprise customers when selecting a Cloud service provider.  A provider that doesn’t offer appropriate SLAs often doesn’t even make it onto the “short list”.

As an example, Amazon’s EC2 availability SLA is 99.95%. Amazon also offers transparency by providing a website that publishes up-to-the-minute status on their services and any related issues.  In the case of AWS, Amazon makes the following exclusions:

Amazon EC2 SLA Exclusions

The Service Commitment does not apply to any unavailability, suspension or termination of Amazon EC2, or any other Amazon EC2 performance issues:

  • (i) that result from Service Suspensions described in Section 7.1 of the AWS Agreement;
  • (ii) caused by factors outside of our reasonable control, including any force majeure event or Internet access or related problems beyond the demarcation point of Amazon EC2;
  • (iii) that result from any actions or inactions of you or any third party;
  • (iv) that result from your equipment, software or other technology and/or third party equipment, software or other technology (other than third party equipment within our direct control);
  • (v) that result from failures of individual instances not attributable to Region Unavailability; or
  • (vi) arising from our suspension and termination of your right to use Amazon EC2 in accordance with the AWS Agreement (collectively, the “Amazon EC2 SLA Exclusions”).

If availability is impacted by factors other than those explicitly listed in this agreement, we may issue a Service Credit considering such factors in our sole discretion.

Here is Section 7.1 of the AWS Agreement:

  • “…suspended for the duration of any unanticipated or unscheduled downtime or unavailability of any portion or all of the Services for any reason, including as a result of power outages, system failures or other interruptions…”
  • we shall also be entitled, without any liability to you, to suspend access to any portion or all of the Services at any time, on a Service-wide basis:
    • (a) for scheduled downtime to permit us to conduct maintenance or make modifications to any Service;
    • (b) in the event of a denial of service attack or other attack on the Service or other event that we determine, in our sole discretion, may create a risk to the applicable Service, to you or to any of our other customers if the Service were not suspended; or
    • (c) in the event that we determine that any Service is prohibited by law or we otherwise determine that it is necessary or prudent to do so for legal or regulatory reasons (collectively, “Service Suspensions”).

To the extent we are able, we will endeavor to provide you email notice of any Service Suspension in accordance with the notice provisions set forth in Section 15 below and to post updates on the AWS Websites regarding resumption of Services following any such suspension, but shall have no liability for the manner in which we may do so or if we fail to do so.

So, any outage, whether unintended or intended by Amazon, may be left out of their service availability measurements.  In fairness to Amazon, I haven’t experienced many outages, nor do I remember any notices for scheduled maintenance.  Nevertheless, if one occurs, it won’t be counted as an outage.
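To put the 99.95% figure in perspective, here is a minimal sketch that converts an availability percentage into a downtime budget. The function name and the two measurement periods (a 365-day year and a 30-day month) are my own assumptions for illustration; check the SLA itself for the period Amazon actually measures over.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: int) -> float:
    """Minutes of downtime an availability SLA permits over a period.

    Hypothetical helper for illustration; the measurement period is an
    assumption, not something taken from any provider's SLA text.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


if __name__ == "__main__":
    # Assumed periods: a 365-day year and a 30-day month.
    for label, days in (("365-day year", 365), ("30-day month", 30)):
        minutes = allowed_downtime_minutes(99.95, days)
        print(f"{label}: {minutes:.1f} minutes ({minutes / 60:.2f} hours)")
```

Under these assumptions, 99.95% leaves a budget of roughly 4.4 hours per year, and any downtime that falls under an exclusion is not charged against that budget.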

You can find EC2's SLA at http://aws.amazon.com/ec2-sla/.
--------------------------------
The other day, I received the following notification from GoGrid:
[Image: notification from GoGrid]
In this case, customers basically had no administration access to their running servers for 4 hours. 

GoGrid claims to offer 100% uptime in their SLA, but there are some exclusions, as follows:

  1. downtime during scheduled maintenance or Emergency Maintenance
  2. outages caused by acts or omissions of Customer, including its applications, equipment, or facilities, or by any use or user of the Service authorized by Customer
  3. outages caused by hackers, sabotage, viruses, worms, or other third party wrongful actions
  4. DNS issues outside of GoGrid's control
  5. outages resulting from Internet anomalies outside of GoGrid's control
  6. outages resulting from fires, explosions, or force majeure
  7. outages to the Customer Portal
  8. failures during a "beta" period

According to item #1, scheduled maintenance is not included as part of their availability SLA.

Rackspace also claims to offer 100% availability in their SLA, but they also exclude scheduled maintenance from their availability measurement.

I appreciate the difference between scheduled maintenance and an unexpected outage.  However, from an availability perspective, I think scheduled maintenance should be included in the availability measurements and reporting.  Otherwise, the reported numbers mislead clients.

The Cloud service provider has options for maintaining continuous availability during maintenance, and those options are under its control.  Whether an outage is scheduled or unscheduled, the business impact may be no smaller just because the subscriber knew about it in advance.
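The gap between the two ways of measuring can be sketched in a few lines. This is only an illustration: the function name, the 30-day (720-hour) month, and the choice to treat a 4-hour maintenance window as full downtime are my assumptions, not figures from any provider's report.

```python
def availability_percent(period_hours: float,
                         unscheduled_hours: float,
                         scheduled_hours: float,
                         count_scheduled: bool) -> float:
    """Availability over a period, with or without scheduled maintenance.

    Hypothetical helper for illustration only.
    """
    downtime = unscheduled_hours
    if count_scheduled:
        # Charge the maintenance window against availability as well.
        downtime += scheduled_hours
    return 100.0 * (period_hours - downtime) / period_hours


if __name__ == "__main__":
    month_hours = 30 * 24  # assumed 30-day month
    # Assumed scenario: no unscheduled outage, one 4-hour maintenance window.
    excluded = availability_percent(month_hours, 0.0, 4.0, count_scheduled=False)
    included = availability_percent(month_hours, 0.0, 4.0, count_scheduled=True)
    print(f"maintenance excluded: {excluded:.2f}%")
    print(f"maintenance included: {included:.2f}%")
```

In this scenario the provider can still report 100% for the month, while a customer who could not use the service during those 4 hours experienced about 99.44%.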

What is your experience with Cloud service providers?   Do they explain the SLAs clearly?  Do you think scheduled maintenance should be included or excluded?
