Cost Optimal Elastic Auto-Scaling in Cloud Infrastructure

Wednesday, 17 December 2014
Supratik Mukhopadhyay1, Subhajit Sidhanta1, Samrat Ganguly2, Sangram Ganguly3 and Ramakrishna R Nemani3, (1)Louisiana State University, Computer Science, Baton Rouge, LA, United States, (2)NEC Corporation of America, santa clara, CA, United States, (3)NASA Ames Research Center, Moffett Field, CA, United States
Today, elastic scaling is critical part of leveraging cloud. Elastic scaling refers to adding resources only when it is needed and deleting resources when not in use. Elastic scaling ensures compute/server resources are not over provisioned. Today, Amazon and Windows Azure are the only two platform provider that allow auto-scaling of cloud resources where servers are automatically added and deleted. However, these solution falls short of following key features: A) Requires explicit policy definition such server load and therefore lacks any predictive intelligence to make optimal decision; B) Does not decide on the right size of resource and thereby does not result in cost optimal resource pool.

In a typical cloud deployment model, we consider two types of application scenario: A. Batch processing jobs → Hadoop/Big Data case B. Transactional applications → Any application that process continuous transactions (Requests/response)

In reference of classical queuing model, we are trying to model a scenario where servers have a price and capacity (size) and system can add delete servers to maintain a certain queue length. Classical queueing models applies to scenario where number of servers are constant. So we cannot apply stationary system analysis in this case. We investigate the following questions 1. Can we define Job queue and use the metric to define such a queue to predict the resource requirement in a quasi-stationary way? Can we map that into an optimal sizing problem? 2. Do we need to get into a level of load (CPU/Data) on server level to characterize the size requirement? How do we learn that based on Job type?