Report ID
Report Authors
Alexander Pucher, Rich Wolski, and Chandra Krintz
Report Date

Spot instances are commonly offered by IaaS cloud providers to opportunistically utilize spare capacity and meet temporary user demand for additional resources at low cost. Although the availability of service SLAs is a core paradigm of cloud computing, spot instances typically come without any service quality guarantees. We aim to extend the spot instance service to provide SLAs for eviction probability, based on the user's estimate of the maximum expected instance lifetime. In addition to providing users with better usability and ahead-of-time quality of service guarantees, this statistical certainty also opens the door to cloud-to-cloud federation of workloads. For this federation to be possible, however, the statistical guarantees must be adhered to strictly, for a wide range of real-world workloads, at cloud scale.

To this end, we propose a new approach to providing SLAs on the time-until-eviction for spot instances. We employ Monte-Carlo simulation to compute the quantiles of the conditional distributions of future spot instances for different available capacity levels. An IaaS cloud scheduler then uses these quantiles to determine when to provision federated requests in order to maintain an SLA at a specific target eviction probability for spot instances. We investigate the reliability of such SLA enforcement using synthetic and real-world traces, test its viability for cloud-to-cloud workload federation, and provide an in-depth analysis of trade-offs and cost factors of such federation.
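
As a rough illustration of the quantile-based admission idea described above, the following Python sketch estimates a time-until-eviction quantile by Monte-Carlo simulation and uses it to decide whether to admit a request. It is not the report's implementation: the demand model (free capacity performing a bounded random walk), the rule that a spot instance is evicted when free capacity reaches zero, and all names (`simulate_time_to_eviction`, `eviction_quantile`, `admit`) are assumptions made for this example only.

```python
import random


def simulate_time_to_eviction(free_slots, total_slots, horizon, rng):
    """Return the first step at which free capacity reaches zero.

    Assumed toy model: at each step, free capacity changes by -1, 0, or +1
    with equal probability, clamped to [0, total_slots]. Runs that survive
    the whole horizon are reported as `horizon` (censored).
    """
    if free_slots <= 0:
        return 0  # no spare capacity: the instance would be evicted at once
    level = free_slots
    for step in range(1, horizon + 1):
        level = min(total_slots, max(0, level + rng.choice((-1, 0, 1))))
        if level == 0:
            return step
    return horizon


def eviction_quantile(free_slots, total_slots, target_prob,
                      runs=2000, horizon=500, seed=0):
    """Empirical target_prob-quantile of time-until-eviction at this capacity level."""
    rng = random.Random(seed)
    times = sorted(
        simulate_time_to_eviction(free_slots, total_slots, horizon, rng)
        for _ in range(runs)
    )
    index = min(int(target_prob * runs), runs - 1)
    return times[index]


def admit(free_slots, total_slots, requested_lifetime, target_prob):
    """Admit only if the requested lifetime is strictly below the quantile.

    At most a target_prob fraction of simulated runs end before the quantile,
    so an admitted request was evicted within its lifetime in at most that
    fraction of simulated runs.
    """
    quantile = eviction_quantile(free_slots, total_slots, target_prob)
    return requested_lifetime < quantile


if __name__ == "__main__":
    for free in (2, 10, 20):
        q = eviction_quantile(free, 20, 0.05)
        print(f"free={free:2d}  5% quantile={q}  admit(lifetime=10)={admit(free, 20, 10, 0.05)}")
```

In this sketch, the quantile is recomputed per request for brevity; a scheduler would more plausibly precompute one quantile per capacity level and look it up at admission time. A real deployment would also replace the random walk with a model or trace of actual capacity fluctuations.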

paper-tr.pdf (493.04 KB)