College of Science & Engineering
With the advent of microservice-based software design, packaging services in containers such as Docker and deploying them to IaaS platforms backed by a cloud scheduler has enabled rapid prototyping and the development of scalable solutions. However, managing large-scale data centers for maximum utilization and performance is difficult, especially with a heterogeneous mix of workloads (long-running services and batch jobs) that must be scheduled while accounting for resource demands, deployment cost, and the priority of each job or service. Several existing solutions address orchestration, scheduling, logging, zero downtime, and error resiliency; often, however, scheduling simply assigns the next job in the ready priority queue based on current resource availability.
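The availability-driven baseline described above can be sketched as a greedy priority-queue scheduler. This is an illustrative toy, not the implementation of any real orchestrator; the job and machine representations are assumptions made for the example.

```python
import heapq

def schedule(jobs, machines):
    """Greedy sketch: pop the highest-priority job from the ready queue
    and place it on the first machine with enough free CPU and memory.
    Jobs are (priority, cpu, mem) tuples; machines are [cpu, mem] free
    capacities. These structures are hypothetical, for illustration only."""
    # Negate priority so Python's min-heap pops the highest priority first.
    ready = [(-prio, i, cpu, mem) for i, (prio, cpu, mem) in enumerate(jobs)]
    heapq.heapify(ready)
    placement = {}
    while ready:
        _, job_id, cpu, mem = heapq.heappop(ready)
        for m_id, free in enumerate(machines):
            if free[0] >= cpu and free[1] >= mem:
                free[0] -= cpu
                free[1] -= mem
                placement[job_id] = m_id
                break  # jobs that fit nowhere are simply dropped in this sketch
    return placement

# Two machines with [cpu, mem] capacity; three jobs as (priority, cpu, mem).
machines = [[4, 8], [2, 4]]
jobs = [(1, 2, 4), (3, 4, 8), (2, 2, 4)]
print(schedule(jobs, machines))  # → {1: 0, 2: 1}; job 0 cannot be placed
```

Note how the lowest-priority job is left unscheduled once capacity is exhausted: this first-fit behavior is exactly what leaves packing headroom on the table and motivates the prediction-based approach below.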
This project aims to optimize the scheduling space so that the number of tasks scheduled per machine is maximized. To pack more containers onto each machine, the researchers use state-of-the-art deep learning techniques (RNNs, LSTMs) to predict system performance in real time. Packing many containers onto a machine without losing performance reduces the migration of jobs within and across the cluster and helps meet the required SLOs.
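The prediction step can be sketched with a single LSTM cell run over a window of recent utilization samples. This is a minimal NumPy sketch under stated assumptions: the weights here are random placeholders (in the project they would be trained on cluster telemetry), and the window values and hidden size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One standard LSTM cell step; the four gates (input, forget,
    cell candidate, output) are stacked in that order inside z."""
    z = W @ x + U @ h + b
    n = h.size
    i = 1 / (1 + np.exp(-z[:n]))        # input gate
    f = 1 / (1 + np.exp(-z[n:2*n]))     # forget gate
    g = np.tanh(z[2*n:3*n])             # candidate cell state
    o = 1 / (1 + np.exp(-z[3*n:]))      # output gate
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

# Hypothetical setup: estimate the next CPU-utilization sample from a
# short window of past samples, to judge whether a machine can take
# another container without breaching performance targets.
hidden = 8
W = rng.standard_normal((4 * hidden, 1)) * 0.1
U = rng.standard_normal((4 * hidden, hidden)) * 0.1
b = np.zeros(4 * hidden)
w_out = rng.standard_normal(hidden) * 0.1  # untrained readout weights

window = [0.42, 0.47, 0.55, 0.61]          # past utilization samples
h, c = np.zeros(hidden), np.zeros(hidden)
for u in window:
    h, c = lstm_step(np.array([u]), h, c, W, U, b)
pred = float(w_out @ h)                    # next-step utilization estimate
print(round(pred, 4))
```

A scheduler could then admit a new container only while the predicted utilization stays below an SLO-derived threshold, rather than reacting to current availability alone.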