How Does EC2 Auto Scaling Work?
What is Amazon EC2 Auto Scaling?
Amazon EC2 Auto Scaling helps maintain enough EC2 instances to handle an application’s workload. Dynamic scaling is one of the major benefits that lead organizations to migrate workloads to public cloud services like Amazon Web Services.
EC2 auto scaling is based on collections of instances, referred to as auto scaling groups. Within an auto scaling group, you specify the minimum and maximum number of instances for the group. EC2 auto scaling then adjusts the group dynamically, staying within those limits.
You can also specify a desired capacity, either when you create the group or at any time afterward, and EC2 auto scaling maintains that number of instances. Additionally, you can define scaling policies; EC2 auto scaling then launches or terminates instances as application demand changes, in line with those policies.
See the example in the diagram below. It illustrates an auto scaling group with a minimum of one instance, a maximum of four, and a desired capacity of two. EC2 auto scaling adjusts the number of instances accordingly.
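The relationship between the three numbers in this example can be sketched in a few lines of Python. This is a simplified model, not AWS code: `GroupLimits` and `bounded_capacity` are hypothetical names, and the only assumption is that capacity requested by a scaling policy is kept within the group’s minimum and maximum.

```python
from dataclasses import dataclass


@dataclass
class GroupLimits:
    """Hypothetical container for a group's size limits."""
    min_size: int
    max_size: int


def bounded_capacity(requested: int, limits: GroupLimits) -> int:
    """Keep a requested instance count inside the [min, max] range."""
    return max(limits.min_size, min(requested, limits.max_size))


limits = GroupLimits(min_size=1, max_size=4)
print(bounded_capacity(2, limits))  # 2: the desired capacity is within range
print(bounded_capacity(7, limits))  # 4: capped at the maximum
print(bounded_capacity(0, limits))  # 1: raised to the minimum
```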
EC2 Auto Scaling Components
Amazon’s EC2 auto scaling includes the following components:
- Groups: EC2 instance groups are the basic logical unit for scaling and management purposes. You specify minimum, maximum, and desired EC2 instance quantities.
- Configuration templates: a launch template serves as a configuration template for launching each group’s EC2 instances. You can also use a launch configuration, but this offers fewer features. You can configure the instance type, security groups, key pair, Amazon Machine Image (AMI) ID, and block device mapping.
- Scaling options: there are several ways to adjust the scale of auto scaling groups. Groups can scale according to a schedule or when specific conditions occur (dynamic scaling).
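As a rough illustration of how a group and its launch template fit together, the sketch below assembles the parameters for a group. The dictionary keys follow the parameter names of boto3’s `create_auto_scaling_group` call; the helper `build_group_request`, the group name, the template name, and the subnet IDs are all hypothetical placeholders. The block only builds and prints the request, so it runs without AWS credentials.

```python
def build_group_request(name, launch_template, min_size, max_size, desired, subnet_ids):
    """Assemble keyword arguments for creating an auto scaling group."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max")
    return {
        "AutoScalingGroupName": name,
        # The launch template carries instance type, AMI ID, key pair, and so on.
        "LaunchTemplate": {"LaunchTemplateName": launch_template, "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


request = build_group_request(
    "web-asg", "web-template", 1, 4, 2, ["subnet-aaa", "subnet-bbb"]
)
print(request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").create_auto_scaling_group(**request)
```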
Auto Scaling Strategies
The following are common strategies you can define to manage the behavior of auto scaling groups.
Perpetuate Existing Instance Levels
The simplest auto scaling strategy is to configure the group to maintain a fixed number of instances. EC2 auto scaling performs periodic health checks on all instances, terminates any that fail, and launches replacements. This keeps a predetermined number of instances running at all times.
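The replace-on-failure behavior can be modeled as a small reconciliation step. This is a toy simulation under stated assumptions, not how AWS implements it: `reconcile` is a hypothetical function, instance IDs are made up, and health is reduced to a single boolean per instance.

```python
from itertools import count


def reconcile(instances, desired, id_source):
    """Return (running, terminated) after one health pass.

    `instances` maps an instance ID to a healthy flag.
    """
    # Keep healthy instances, up to the desired count.
    kept = [iid for iid, healthy in instances.items() if healthy][:desired]
    # Everything else (unhealthy or surplus) is terminated.
    terminated = [iid for iid in instances if iid not in kept]
    # Launch replacements until the desired count is reached again.
    launched = [f"i-new{next(id_source)}" for _ in range(desired - len(kept))]
    return kept + launched, terminated


fleet = {"i-a": True, "i-b": False, "i-c": True}
running, terminated = reconcile(fleet, desired=3, id_source=count(1))
print(running)     # ['i-a', 'i-c', 'i-new1']
print(terminated)  # ['i-b']
```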
Implement Manual Scaling
Manual scaling is the most basic way of actively scaling resources. You change the group’s maximum, minimum, or desired capacity yourself, and EC2 auto scaling creates or terminates instances to reach and maintain the capacity you specified.
Scale in Accordance with a Schedule
You can also schedule scaling actions for specific times and dates, which is especially useful when you can forecast demand accurately. Scheduling avoids a continuous stream of reactive scaling events and lets you predict how much capacity is available at any given time.
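A scheduled action might look like the following sketch, which scales a group up on weekday mornings and back down in the evening. The keys follow the parameter names of boto3’s `put_scheduled_update_group_action`; the helper `build_scheduled_action`, the group name, the action names, and the chosen times and sizes are assumptions for illustration. Nothing is sent to AWS.

```python
def build_scheduled_action(group_name, action_name, cron, min_size, max_size,
                           desired, time_zone="UTC"):
    """Assemble keyword arguments for a recurring scheduled scaling action."""
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": action_name,
        "Recurrence": cron,  # five-field cron expression
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        "TimeZone": time_zone,
    }


# Scale up at 08:00 and down at 20:00, Monday to Friday.
morning = build_scheduled_action("web-asg", "weekday-up", "0 8 * * 1-5", 2, 8, 4)
evening = build_scheduled_action("web-asg", "weekday-down", "0 20 * * 1-5", 1, 4, 1)
for action in (morning, evening):
    print(action["ScheduledActionName"], action["Recurrence"], action["DesiredCapacity"])
# Each dictionary could be passed to
#   boto3.client("autoscaling").put_scheduled_update_group_action(**action)
```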
Scale Along with Demand
Scaling by demand, also called dynamic scaling, is where AWS shines. It can be combined with the previous three strategies and responds to fluctuating traffic, including unpredictable spikes. For example, with a target tracking policy you can specify that average CPU utilization should stay close to 80%, even as application load changes.
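The 80% CPU example can be sketched in two parts. `build_cpu_target_policy` assembles parameters using the names of boto3’s `put_scaling_policy` call for a target tracking policy. `estimate_capacity` is a hypothetical, simplified proportional model of how capacity might follow the metric; it is an assumption for illustration and not the algorithm AWS actually uses.

```python
import math


def build_cpu_target_policy(group_name, target_cpu):
    """Assemble keyword arguments for a CPU target tracking policy."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-{int(target_cpu)}",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": float(target_cpu),
        },
    }


def estimate_capacity(current, avg_cpu, target_cpu, min_size, max_size):
    """Simplified model: scale the instance count in proportion to load."""
    wanted = math.ceil(current * avg_cpu / target_cpu)
    return max(min_size, min(wanted, max_size))


policy = build_cpu_target_policy("web-asg", 80)
print(policy["PolicyName"])                     # web-asg-cpu-80
print(estimate_capacity(4, 95.0, 80.0, 1, 10))  # 5: load is above target
print(estimate_capacity(4, 40.0, 80.0, 1, 10))  # 2: load is below target
```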
EC2 Auto Scaling Best Practices
Here are a few best practices you can use to make AWS auto scaling more effective.
Use One-Minute Metrics Frequency
AWS monitoring is a critical part of EC2 auto scaling, because scaling events can depend on application and instance metrics.
Set EC2 instance metrics to a one-minute frequency whenever possible, to ensure a swift response to changes in utilization. A five-minute frequency can slow down the response, so that scaling decisions are based on old metric data. By default, EC2 instances have basic monitoring, which provides instance metric data at five-minute intervals. You can switch them to detailed monitoring to receive instance metric data at one-minute intervals, for an additional charge.
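To see why the reporting interval matters, consider a toy calculation. It assumes, as a simplification, that a datapoint only becomes visible at the end of each reporting period and ignores delivery latency and alarm evaluation time; `first_datapoint_after` is a hypothetical helper.

```python
import math


def first_datapoint_after(event_s, period_s):
    """Time (in seconds) of the first period boundary at or after an event."""
    return math.ceil(event_s / period_s) * period_s


spike = 130  # a load spike starts 130 seconds into the hour
for period in (60, 300):
    seen = first_datapoint_after(spike, period)
    print(f"{period}s period: visible at {seen}s, delay {seen - spike}s")
```

Under these assumptions the spike shows up after 50 seconds with one-minute metrics and after 170 seconds with five-minute metrics.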
Configure Auto Scaling Group Health Check
Configure the auto scaling group’s health check so that EC2 auto scaling can determine the health of registered EC2 instances and replace unhealthy ones. By default, the group relies on EC2 status checks, which work at the hypervisor and network level. When you use an AWS Elastic Load Balancer (ELB) to distribute traffic across group instances, you can enable the ELB health check as well, which adds a check at the application level.
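Switching a group to ELB health checks could be expressed as below. The keys follow the parameter names of boto3’s `update_auto_scaling_group`; the helper `build_health_check_update` and the 300-second grace period are assumptions, and the validation deliberately covers only the two health check types discussed here.

```python
def build_health_check_update(group_name, check_type="ELB", grace_period_s=300):
    """Assemble keyword arguments for changing a group's health check."""
    if check_type not in ("EC2", "ELB"):
        raise ValueError("this sketch only handles the EC2 and ELB check types")
    if grace_period_s < 0:
        raise ValueError("grace period cannot be negative")
    return {
        "AutoScalingGroupName": group_name,
        "HealthCheckType": check_type,
        # Seconds to wait after launch before health checks count.
        "HealthCheckGracePeriod": grace_period_s,
    }


update = build_health_check_update("web-asg")
print(update)
# Could be sent with:
#   boto3.client("autoscaling").update_auto_scaling_group(**update)
```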
Start with Forecast-Only, Then Use Forecast-and-Scale
Predictive scaling schedules future capacity using workload forecasts. The quality of those predictions depends on how cyclic the workloads and application requests are. In forecast-only mode, predictive scaling generates forecasts without acting on them, which lets you judge the forecast quality and the scaling activities it recommends before any scaling actually happens.
After creating your scaling plan in forecast-only mode, and seeing that the forecasts are accurate for your workloads, set the predictive scaling mode to forecast-and-scale, allowing it to scale workloads automatically.
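The two modes can be illustrated with a small sketch. It assumes a predictive scaling policy attached directly to an EC2 auto scaling group (rather than a scaling plan) and uses the parameter names of boto3’s `put_scaling_policy`; the helper `build_predictive_policy`, the group name, and the 40% CPU target are hypothetical.

```python
def build_predictive_policy(group_name, target_cpu, scale=False):
    """Assemble keyword arguments for a predictive scaling policy.

    scale=False only forecasts; scale=True also acts on the forecast.
    """
    mode = "ForecastAndScale" if scale else "ForecastOnly"
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-predictive",
        "PolicyType": "PredictiveScaling",
        "PredictiveScalingConfiguration": {
            "MetricSpecifications": [
                {
                    "TargetValue": float(target_cpu),
                    "PredefinedMetricPairSpecification": {
                        "PredefinedMetricType": "ASGCPUUtilization"
                    },
                }
            ],
            "Mode": mode,
        },
    }


trial = build_predictive_policy("web-asg", 40)
live = build_predictive_policy("web-asg", 40, scale=True)
print(trial["PredictiveScalingConfiguration"]["Mode"])  # ForecastOnly
print(live["PredictiveScalingConfiguration"]["Mode"])   # ForecastAndScale
```

Because both requests share the same policy name, applying the second one would update the existing policy in place once the forecasts look trustworthy.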
Configure Auto Scaling Group Notifications
Auto scaling groups can send notifications when a launch, terminate, or other EC2 scaling event occurs. Enable the notifications feature and point it at an Amazon SNS topic dedicated to your auto scaling group, so that subscribers (for example, email addresses) receive notifications of scaling events in real time.
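A notification configuration could be assembled as follows. The keys and event type strings follow boto3’s `put_notification_configuration`; the helper `build_notification_request`, the group name, and the topic ARN (with a placeholder account ID) are assumptions for illustration.

```python
SCALING_EVENTS = [
    "autoscaling:EC2_INSTANCE_LAUNCH",
    "autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
    "autoscaling:EC2_INSTANCE_TERMINATE",
    "autoscaling:EC2_INSTANCE_TERMINATE_ERROR",
]


def build_notification_request(group_name, topic_arn, events=SCALING_EVENTS):
    """Assemble keyword arguments that link a group to an SNS topic."""
    if not topic_arn.startswith("arn:"):
        raise ValueError("topic_arn must be an ARN")
    return {
        "AutoScalingGroupName": group_name,
        "TopicARN": topic_arn,
        "NotificationTypes": list(events),
    }


request = build_notification_request(
    "web-asg", "arn:aws:sns:us-east-1:123456789012:asg-events"
)
print(request["NotificationTypes"])
# Could be sent with:
#   boto3.client("autoscaling").put_notification_configuration(**request)
```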
Conclusion
In this article I covered the basics of EC2 auto scaling:
- Auto scaling components, including auto scaling groups, configuration templates and scaling options.
- Auto scaling strategies, including perpetuating instance levels, manual scaling, scaling on a schedule, and scaling dynamically according to demand.
- Auto scaling best practices, including setting metrics to a frequency of one minute, configuring health checks, testing forecasts before using them to scale, and using auto scaling group notifications.