Cloud hosting Sitecore - Scaling for peak loads

In my first post in this series, I looked at options for automated deployment of Sitecore in the cloud to deliver high availability. In this second post, we take a look at how we can auto-scale these environments in and out to address different load scenarios.


Pattern 2: Scaling for peak loads

One of the key benefits of cloud platforms is their ability to scale horizontally by adding new instances of application nodes. For a web site like those served by Sitecore, this generally means adding additional content delivery nodes to the web farm.

Sometimes you know when peak loads will occur. This could be a regular peak such as Monday morning, or a planned occasion such as an event or even a large newsletter broadcast. Other times you don't see a peak coming, such as an unexpected mention of your site going viral through social media channels. Both Amazon Web Services and Microsoft Azure support auto-scaling of a delivery farm to ensure that traffic peaks can be handled by your site.

Microsoft Azure

The Sitecore Azure module deploys delivery service nodes as cloud services inside Azure. These can take advantage of Microsoft Azure's built-in features for scalability.

If you know in advance that peak loads occur at certain times, Microsoft Azure lets you set the number of cloud service instances that will be available to serve requests for web content based on a schedule. In the example below, I set up a standard schedule for day and night times (day being configured as 10:30 AM to 10:30 PM) and for weekends, and I also set up a special schedule for the site around the Trendspot conference event.



I can then configure my site to automatically scale up with extra instances during the times and dates when I am expecting additional traffic for my “Trendspot” event. While the event is running, I want six instances during the day rather than the usual four.
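To show how this hangs together outside the portal, here is a minimal sketch of the same idea using the Azure Monitor autoscale API via the azure-mgmt-monitor Python SDK. The subscription, resource group, target resource (a VM scale set rather than the classic cloud service web roles configured in the portal screens above), time zone, instance counts and event dates are all hypothetical placeholders, so treat it as an illustration of schedule-based profiles rather than a drop-in script.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

# All of these names and values are hypothetical placeholders.
subscription_id = "<subscription-id>"
resource_group = "sitecore-delivery-rg"
target_resource = (
    f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
    "/providers/Microsoft.Compute/virtualMachineScaleSets/sitecore-cd"
)
time_zone = "AUS Eastern Standard Time"  # hypothetical time zone


def recurring_profile(name, instances, days, hour, minute):
    """A fixed-size profile that becomes active at the given weekly time."""
    return {
        "name": name,
        "capacity": {"minimum": instances, "maximum": instances, "default": instances},
        "rules": [],  # no metric rules: pure schedule-based scaling
        "recurrence": {
            "frequency": "Week",
            "schedule": {"time_zone": time_zone, "days": days,
                         "hours": [hour], "minutes": [minute]},
        },
    }


weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
profiles = [
    # Usual pattern: four delivery instances 10:30-22:30 on weekdays, two otherwise.
    recurring_profile("Weekday daytime", "4", weekdays, 10, 30),
    recurring_profile("Weekday night", "2", weekdays, 22, 30),
    recurring_profile("Weekend", "2", ["Saturday", "Sunday"], 0, 0),
    # One-off profile for the event period (hypothetical dates): six instances.
    {
        "name": "Trendspot event",
        "capacity": {"minimum": "6", "maximum": "6", "default": "6"},
        "rules": [],
        "fixed_date": {"time_zone": time_zone,
                       "start": "2014-07-21T10:30:00", "end": "2014-07-23T22:30:00"},
    },
]

client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)
client.autoscale_settings.create_or_update(
    resource_group,
    "sitecore-cd-autoscale",
    {
        "location": "australiaeast",  # hypothetical region
        "target_resource_uri": target_resource,
        "enabled": True,
        "profiles": profiles,
    },
)
```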



Azure allows you to set your site to auto-scale based on queue length (primarily for worker roles) or CPU load. The CPU option lets you set a target CPU utilisation band for your web roles, and the Azure platform will then scale the number of cloud service nodes in and out to maintain the requested utilisation level. In addition, you set a minimum and maximum number of nodes for scaling: the minimum (say, two nodes) ensures your availability requirements are still met, and the maximum keeps overall hosting spend under control.
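Metric-based rules are expressed in the same autoscale setting as the schedules (a target resource only takes one setting), so in the sketch above a CPU-band profile like the one below could be added alongside, or instead of, the schedule-only profiles. The thresholds and instance counts are again purely illustrative: the floor of two nodes covers the availability requirement and the ceiling caps the hosting spend.

```python
# Illustrative CPU-band profile: never fewer than two delivery nodes (availability),
# never more than eight (cost ceiling). Thresholds and counts are hypothetical.
cpu_profile = {
    "name": "Scale on CPU",
    "capacity": {"minimum": "2", "maximum": "8", "default": "2"},
    "rules": [
        {
            # Add one node when average CPU across the farm exceeds 75% for 10 minutes.
            "metric_trigger": {
                "metric_name": "Percentage CPU",
                "metric_resource_uri": target_resource,  # same placeholder as above
                "time_grain": "PT1M",
                "statistic": "Average",
                "time_window": "PT10M",
                "time_aggregation": "Average",
                "operator": "GreaterThan",
                "threshold": 75,
            },
            "scale_action": {"direction": "Increase", "type": "ChangeInCount",
                             "value": "1", "cooldown": "PT10M"},
        },
        {
            # Remove one node again when average CPU falls back below 40%.
            "metric_trigger": {
                "metric_name": "Percentage CPU",
                "metric_resource_uri": target_resource,
                "time_grain": "PT1M",
                "statistic": "Average",
                "time_window": "PT10M",
                "time_aggregation": "Average",
                "operator": "LessThan",
                "threshold": 40,
            },
            "scale_action": {"direction": "Decrease", "type": "ChangeInCount",
                             "value": "1", "cooldown": "PT10M"},
        },
    ],
}
```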

Autoscaling is implemented in the Azure fabric rather than on individual role instances, which means that once it is set up, it is retained on the slot where it is configured even when the production and staging slots are swapped during a deployment. Your production scaling rules therefore stay in place whenever you deploy.

Amazon Web Services

Amazon Web Services provides autoscaling of EC2 instances based on alarms generated by its CloudWatch monitoring system. CloudWatch supports a number of different metrics for generating alarms, and the platform can be extended with new custom metrics.



Once the appropriate alarms have been configured, autoscaling rules can be defined to add EC2 instances to, or remove them from, the autoscaling group when the specified alarms fire. Since autoscaling is driven by the alarms, you can configure your farm to scale based on any metric that might be a bottleneck for your application (for example, disk reads).
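As a rough sketch of how those pieces fit together with the AWS SDK (boto3), the snippet below creates a simple scale-out policy on an existing auto-scaling group of delivery instances and a CloudWatch alarm that triggers it on sustained high CPU. The group name, region and thresholds are hypothetical placeholders rather than anything from the original setup.

```python
import boto3

# Hypothetical region and auto-scaling group of Sitecore delivery instances.
REGION = "ap-southeast-2"
ASG_NAME = "sitecore-cd-asg"

autoscaling = boto3.client("autoscaling", region_name=REGION)
cloudwatch = boto3.client("cloudwatch", region_name=REGION)

# A simple policy that adds one EC2 instance to the delivery farm when triggered.
scale_out = autoscaling.put_scaling_policy(
    AutoScalingGroupName=ASG_NAME,
    PolicyName="sitecore-cd-scale-out",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# A CloudWatch alarm that fires the policy when average CPU across the group
# stays above 70% for two consecutive five-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="sitecore-cd-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": ASG_NAME}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[scale_out["PolicyARN"]],
)
```

A matching scale-in policy and low-CPU alarm would shrink the farm again when load drops, and switching the alarm to a different metric such as DiskReadOps is just a change to the alarm definition.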

AWS autoscaling currently doesn't natively support scaling your environment on a schedule; however, this use case can be implemented by creating a time-based custom metric and using it as the basis for alarms that increase and decrease the number of instances. It would also be possible to create custom metrics and scale based on other data coming out of your Sitecore solution, such as the number of concurrent users or even the rates of particular types of application transactions like e-commerce orders or report views. Although this takes more effort to set up and would need to be built into your application, it gives you more control over scaling based on what really causes performance bottlenecks.
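To illustrate the custom-metric idea, publishing a data point to CloudWatch is a single call; alarms and scaling policies can then be built on it exactly as for the built-in metrics. The namespace, metric name and value below are hypothetical, and the code that gathers the number would live in your Sitecore solution or a small scheduled job.

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="ap-southeast-2")  # hypothetical region

# A business-level metric read from your application, published periodically:
# here, the current number of concurrent users (hypothetical value).
concurrent_users = 1250

cloudwatch.put_metric_data(
    Namespace="Sitecore/Delivery",  # hypothetical custom namespace
    MetricData=[
        {
            "MetricName": "ConcurrentUsers",
            "Value": concurrent_users,
            "Unit": "Count",
        }
    ],
)
```

A purely time-based schedule can be approximated the same way, for example by publishing 1 during business hours and 0 overnight and alarming on that value.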

Sitecore licensing

Hopefully you can see from the simple examples above that, once you have invested the effort in moving your hosting platform to one of these public cloud services, the configuration needed to take advantage of auto-scaling is quite straightforward. The big question when it comes to using cloud hosting to scale out a Sitecore-based environment, however, is always licensing!

Some Sitecore licensing models currently require that you license the maximum number of servers that you will run. This obviously causes issues with scaling out to meet higher loads (particularly unexpected ones); however, even if your Sitecore licensing does have this restriction, you can still make use of auto-scaling cloud functionality to ensure that you aren't paying for peak-level hosting infrastructure at all times. For example, you might scale down from three or four delivery servers during the day to run only two overnight. This can have a significant impact on the overall hosting costs of your system.

If you do have the type of site that has to handle large peak loads in excess of your normal operating capacity, you can also talk to Sitecore about licensing models that do provide the freedom to scale as much as required.  

Up next...

In the next installment of this series, we'll have a look at how you can use public cloud hosting platforms to better serve global audiences for your Sitecore sites.

