Improving Spot Instance Reliability with Cloud Backup
Spot instances let you use spare cloud capacity at discounted rates. They help reduce your overall cloud costs while ensuring cloud resources do not go to waste. You request a spot instance by defining your requirements and, depending on the provider, setting the maximum price you are willing to pay. Once your cloud vendor has spare capacity that meets your requirements, it supplies a spot instance.
Amazon Web Services (AWS), Microsoft Azure, and Google Cloud offer spot instances at a discount of up to 90% compared to the on-demand price. You can leverage these discounts to make the most of your budget. However, spot instances are not reliable: if the vendor needs the capacity back, it may interrupt or terminate your instance.
In this article I’ll explain how [you can use cloud backup](https://bluexp.netapp.com/cloud-backup) to improve the reliability and resiliency of workloads running on spot instances, and enable speedy recovery after spot instance termination.
When to Use Spot Instances
Here are key use cases for spot instances:
- Running fault-tolerant applications, including web servers, continuous integration/continuous delivery (CI/CD) pipelines, Hadoop data processing, and API backends.
- Running workloads that save data to persistent storage, including Amazon Elastic Block Store (Amazon EBS), Amazon Simple Storage Service (Amazon S3), Amazon Elastic File System (Amazon EFS), Amazon Relational Database Service (Amazon RDS), and Amazon DynamoDB.
- Scaling applications, including stateless web services, big data analytics, massively parallel computations, and image rendering. In these cases, use spot instances to supplement your on-demand instances as needed. Do not try to use spot instances to handle 100% of your workload.
- Supporting stateless, nonproduction applications, including development and testing servers that can handle occasional downtime.
Do not use spot instances to support sensitive databases or workloads.
How Spot Instances Work
This discussion explains how spot instances work in AWS, but the concepts apply to most cloud providers.
AWS sets the spot price according to long-term supply and demand trends for spare EC2 capacity. When you run an instance, you pay the spot price in effect while it runs, with usage billed in per-second increments for most operating systems.
You set a maximum price for spot instances, so you’ll never pay more than planned. If capacity becomes unavailable or the spot price exceeds the maximum price you set, the instance will automatically be stopped or terminated (based on your preferences).
While the spot price can change at any time, it generally changes no more than once an hour. You can view historical and current spot prices in the AWS Management Console or with the AWS CLI describe-spot-price-history command. You can use this information to choose a maximum price based on spot price trends.
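As a rough illustration of turning price history into a maximum price, the sketch below takes records shaped like the SpotPriceHistory entries returned by describe-spot-price-history (where SpotPrice is a string) and suggests a cap slightly above a high percentile of the observed prices. The function name, the 95th-percentile choice, the 10% headroom, and the sample prices are all assumptions made for illustration, not AWS recommendations.

```python
import math


def suggest_max_price(history, percentile=0.95, headroom=0.10):
    """Suggest a maximum spot price from historical price records.

    `history` is a list of dicts that each carry a "SpotPrice" string.
    The percentile and headroom defaults are illustrative assumptions.
    """
    prices = sorted(float(record["SpotPrice"]) for record in history)
    if not prices:
        raise ValueError("no price history supplied")
    # Nearest-rank percentile: the smallest observed price that covers
    # the requested share of samples.
    rank = max(1, math.ceil(percentile * len(prices)))
    return round(prices[rank - 1] * (1 + headroom), 4)


# Hypothetical sample data, not real AWS prices.
sample_history = [
    {"InstanceType": "m5.large", "SpotPrice": "0.0350"},
    {"InstanceType": "m5.large", "SpotPrice": "0.0362"},
    {"InstanceType": "m5.large", "SpotPrice": "0.0410"},
    {"InstanceType": "m5.large", "SpotPrice": "0.0358"},
]
print(suggest_max_price(sample_history))
```

A higher percentile or more headroom lowers the chance of price-driven interruptions at the cost of a higher worst-case price. Capacity-driven interruptions can still happen regardless of the cap.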
In terms of performance, spot instances are identical to regular EC2 instances while they run, and you can terminate them when you no longer need them. When you terminate an instance yourself, you pay for the time you used, including the final partial hour.
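If Amazon interrupts a spot instance during its first hour, it does not charge you for that usage. After the first hour, instances billed per second are charged for the seconds used, while hourly-billed instances are not charged for the interrupted partial hour. This applies whether the interruption is caused by a lack of capacity or by the spot price exceeding your maximum price.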
Using Cloud Backup to Improve Reliability for Spot Instances
If you are running workloads on spot instances, there is always the risk that a spot instance will be unexpectedly terminated and data will be lost. It is also important to preserve the state of the workload. For example, in a batch job, keep a record of the system’s progress through the batch so that the job can resume from the point where it stopped.
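To make the batch-progress idea concrete, here is a minimal checkpointing sketch. It assumes the checkpoint file lives on storage that survives the instance (for example, a retained volume or a network file system). The function names and the JSON layout are hypothetical choices for illustration.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no checkpoint)."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint via a temp file so a crash cannot leave it half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp_path, path)  # atomic rename within the same filesystem


def run_batch(items, process, checkpoint_path):
    """Process items in order, skipping those completed in an earlier run."""
    start = load_checkpoint(checkpoint_path)
    for index in range(start, len(items)):
        process(items[index])
        # Record progress only after the item has finished successfully.
        save_checkpoint(checkpoint_path, index + 1)
```

If the instance is terminated mid-item, that item is processed again on the next run, so this approach works best when processing a single item is idempotent.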
Here are a few ways to back up spot instances to enable easy recovery from termination:
1. Use Managed Disks
The major cloud providers allow you to create a virtual hard disk and attach it to a compute instance. For example, Amazon provides EBS volumes, Azure offers managed disks, and Google Cloud has persistent disks.
There is usually the option of keeping this storage volume even after the instance shuts down. This is the basic way to retain data from your spot instances—attach a storage volume to them, and retain the storage volume after the instance shuts down. You can then restart the workload in a new instance and reconnect the old storage volume.
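On AWS, whether an EBS volume is kept is controlled by the DeleteOnTermination flag in the instance's block device mapping. The sketch below only builds that request fragment as a plain dictionary. The helper name, device name, size, and volume type are assumptions for illustration, and you would pass the result as the BlockDeviceMappings parameter of whatever launch call you use.

```python
def persistent_volume_mapping(device_name="/dev/sdf", size_gib=100, volume_type="gp3"):
    """Build a block device mapping whose EBS volume outlives the instance.

    The defaults are illustrative assumptions; pick values that match
    your workload and AMI.
    """
    return {
        "DeviceName": device_name,
        "Ebs": {
            "VolumeSize": size_gib,        # size in GiB
            "VolumeType": volume_type,
            "DeleteOnTermination": False,  # keep the volume after termination
        },
    }


# Fragment of launch parameters that a launch request could include.
launch_params = {"BlockDeviceMappings": [persistent_volume_mapping()]}
print(launch_params)
```

Retained volumes continue to incur storage charges, so it is worth deleting them once the data has been reattached or copied elsewhere.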
2. Take Regular Snapshots
Assuming you have a persistent storage volume such as an EBS volume, most cloud providers allow you to take snapshots of your storage volumes and keep them in low-cost storage such as Amazon S3. This is a great idea for spot instances. Take snapshots on a regular schedule, and when the spot instance shuts down, you can restore the data from the most recent snapshot.
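A snapshot job can be as small as the sketch below. It assumes you pass in an EC2 client such as the one returned by boto3.client("ec2") and call the function from your own scheduler (cron, for example). The function name and the description format are hypothetical.

```python
from datetime import datetime, timezone


def snapshot_volume(ec2_client, volume_id, now=None):
    """Request a snapshot of one EBS volume and return the snapshot ID.

    `ec2_client` is expected to expose create_snapshot(VolumeId=..., Description=...)
    the way the boto3 EC2 client does.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=f"Scheduled backup of {volume_id} at {stamp}",
    )
    return response["SnapshotId"]


# Usage (requires boto3 and AWS credentials; the volume ID is a placeholder):
#   import boto3
#   snapshot_volume(boto3.client("ec2"), "vol-0123456789abcdef0")
```

Taking the client as a parameter keeps the function easy to test with a stand-in object and leaves region and credential handling to the caller.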
3. Back Up to Other Locations
There are various ways to back up your data to other locations, making it possible to resume workloads after a spot instance terminates:
- Create an event stream from your workloads and save data in Amazon S3, DynamoDB, or other databases.
- Use AWS Backup to perform scheduled backups of your instance every few hours. You will then have the ability to restore your workload to the last point of recovery.
- Use other backup systems to save data from the instance to an on-premises or other storage location at predetermined intervals.
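The first option in the list above, streaming workload events to durable storage, could look something like the following sketch. It assumes an S3 client such as boto3.client("s3") is passed in. The function name, key layout, and payload shape are hypothetical.

```python
import json


def record_progress(s3_client, bucket, job_id, sequence, payload):
    """Store one progress event as a JSON object in S3 and return its key.

    `s3_client` is expected to expose put_object(Bucket=..., Key=..., Body=...)
    the way the boto3 S3 client does. The key layout is an assumption.
    """
    # Zero-padded sequence numbers keep keys in processing order when listed.
    key = f"jobs/{job_id}/events/{sequence:08d}.json"
    body = json.dumps(payload).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```

After a termination, a replacement instance can list the keys under the job's prefix and resume from the highest sequence number it finds.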
Things to Consider When Using AWS Spot Instances
Availability
You cannot rely on spot instances for long-running or always-on workloads because spare capacity is not always available. Instead, use spot instances alongside your reserved or on-demand instances. This combination helps ensure you can support your workloads even when spot instances are interrupted.
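One way to reason about such a mix is a fixed on-demand base plus a percentage of the remaining capacity, similar in spirit to the OnDemandBaseCapacity and OnDemandPercentageAboveBaseCapacity settings of EC2 Auto Scaling groups. The sketch below is a simplified model of that split. The function name is hypothetical, and rounding up in favor of on-demand is my assumption, not a statement of how AWS rounds.

```python
import math


def split_capacity(total, on_demand_base, on_demand_percent_above_base):
    """Split a desired instance count into on-demand and spot portions.

    Rounding up toward on-demand is an illustrative assumption.
    """
    base = min(total, on_demand_base)
    above_base = total - base
    on_demand_above = math.ceil(above_base * on_demand_percent_above_base / 100)
    on_demand = base + on_demand_above
    return {"on_demand": on_demand, "spot": total - on_demand}


# 10 instances: 2 always on-demand, then 25% of the remaining 8 on-demand.
print(split_capacity(10, on_demand_base=2, on_demand_percent_above_base=25))
```

Sizing the on-demand base to the minimum capacity the workload needs means a wave of spot interruptions degrades throughput rather than taking the service down.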
Spot Allocation Strategy
AWS offers several spot allocation strategies, which determine how a spot fleet or Auto Scaling group picks instances from the available capacity pools, for example by lowest price or by available capacity. You can combine these strategies with auto scaling to scale your spot fleets automatically: when demand increases, AWS adds resources to maintain performance. The allocation strategy also helps select the most suitable spot instance available at the lowest price.
Rightsizing Instances for Optimal Performance
You can improve performance and make better use of resources by rightsizing spot instances, that is, matching instance types and sizes to what the workload actually needs. You can use a third-party tool to rightsize your instances based on Amazon CloudWatch data. Automation tools allow you to avoid manually analyzing historical data.
When you rightsize spot instances, you should consider your cloud environment and the context of your workload. You can make decisions based on whether your workload is permanent or temporary, whether you need predictable EC2 capacity, and the stage of the development lifecycle (i.e., development, testing, or production).
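As a simple illustration of the kind of analysis a rightsizing tool automates, the sketch below classifies an instance from a series of CPU utilization percentages, such as values exported from CloudWatch. The function name and the 20% and 80% thresholds are assumptions for illustration. Real tools typically also weigh memory, network, and disk metrics.

```python
from statistics import mean


def rightsizing_hint(cpu_samples, low=20.0, high=80.0):
    """Return "downsize", "upsize", or "keep" from CPU utilization percentages.

    The thresholds are illustrative assumptions, not AWS guidance.
    """
    if not cpu_samples:
        raise ValueError("no utilization samples supplied")
    average = mean(cpu_samples)
    peak = max(cpu_samples)
    if peak < low:
        return "downsize"  # even the busiest sample leaves most CPU idle
    if average > high:
        return "upsize"    # sustained high load suggests the instance is too small
    return "keep"


print(rightsizing_hint([5, 10, 15]))
```

Using the peak for the downsize check and the average for the upsize check is a deliberately conservative choice: it avoids shrinking an instance that has occasional spikes and avoids growing one because of a single busy sample.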
Conclusion
In this article, I explained the basics of spot instances and showed how to solve one of the biggest problems of spot instances—their lack of reliability—with cloud backup. I showed a few ways to protect your workloads against the unexpected shutdown of spot instances:
- Leave Amazon EBS volumes behind if instances are terminated.
- Take regular snapshots of storage volumes.
- Back up data to additional locations.
I hope this will be useful as you deepen the use of spot instances in your organization to maximize cost savings.