Cost-effective and QoS-aware resource allocation for cloud computing
Date of Issue: 2016
School of Computer Engineering
Centre for Multimedia and Network Technology
As the most important problem in cloud computing, resource allocation affects not only the costs of cloud operators and users but also the performance of cloud jobs. Provisioning too many resources wastes energy and money, while provisioning too few degrades the performance of cloud applications. Current research on resource allocation focuses mainly on homogeneous resource allocation and treats CPU as the most important resource. However, as the resource demands of cloud workloads become increasingly heterogeneous across resource types, current methods are unsuitable for other kinds of jobs, such as memory-intensive applications. Nor are they efficient at offering economical, high-quality resource allocation in clouds. In this thesis, we first propose a resource provisioning method, BigMem, that takes the characteristics of memory-based resource allocation into account. Memory-intensive applications have recently become popular for high-throughput and low-latency computing. Current resource provisioning methods focus on other resources, such as CPU and network bandwidth, which are the bottlenecks in traditional cloud applications. For memory-intensive jobs, however, main memory is the bottleneck resource for performance. Main memory should therefore be the first consideration in resource allocation and provisioning for VMs in clouds that host memory-intensive applications. By considering the unique behavior of resource provisioning for memory-intensive jobs, BigMem effectively reduces resource usage for dynamic workloads in clouds. Specifically, we use Markov chain modeling to periodically determine the required number of PMs, and we further optimize resource utilization through VM migration and resource overcommitment. We evaluate our design using simulation with synthetic and real-world traces. 
Experimental results show that BigMem provisions an appropriate number of resources for highly dynamic workloads while maintaining an acceptable service-level agreement (SLA). Compared with peak-load provisioning and heuristic methods, BigMem reduces the average number of active machines in the data center by 63% and 27%, respectively. These results translate into good performance for users and low cost for cloud providers. To support different types of workloads in clouds (such as memory-intensive and computation-intensive applications), we then propose a heterogeneous resource allocation method, skewness-avoidance multi-resource allocation (SAMR), which considers the skewness of different resource types to optimize resource usage in clouds. Current IaaS clouds provision resources as virtual machines (VMs) with homogeneous resource configurations, in which the different resource types in a VM each take a similar share of the capacity of a physical machine (PM). However, most user jobs demand different amounts of different resources. For instance, high-performance computing jobs require more CPU cores, while memory-intensive applications require more memory. Existing homogeneous resource allocation mechanisms cause resource starvation, in which dominant resources are starved while non-dominant resources are wasted. To overcome this issue, we propose SAMR, which allocates resources according to the diversified requirements on different resource types. Our solution includes a job allocation algorithm that places heterogeneous workloads so as to avoid skewed resource utilization in PMs, and a model-based approach that estimates the appropriate number of active PMs for operating SAMR. We show that the model-based approach has relatively low complexity, which makes it practical to operate, and that its estimates are accurate. Extensive simulation results show the effectiveness of SAMR and its performance advantages over its counterparts. 
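To make the idea of skewness avoidance concrete, the following is a minimal sketch and not the SAMR algorithm itself. It assumes a skewness metric defined as the root of the squared relative deviations of each resource's utilization from the PM's mean utilization, and a greedy placement rule that picks the PM that ends up least skewed; the function names skewness and place_job, and the representation of a PM as a list of utilization fractions, are hypothetical.

```python
from math import sqrt


def skewness(utilization):
    """Skewness of one PM's per-resource utilization (fractions in [0, 1]).

    Assumed metric: sqrt(sum((u_i / mean - 1)^2)). It is 0 when all
    resource types are equally utilized and grows as they diverge.
    """
    mean = sum(utilization) / len(utilization)
    if mean == 0:
        return 0.0  # an idle PM is treated as perfectly balanced
    return sqrt(sum((u / mean - 1.0) ** 2 for u in utilization))


def place_job(pms, demand):
    """Greedy placement: return the index of the PM that can hold the
    job and has the lowest skewness afterwards, or None if none fits.

    pms    -- list of per-PM utilization vectors, e.g. [cpu, memory]
    demand -- the job's demand vector, in the same units and order
    """
    best_index, best_skew = None, float("inf")
    for i, used in enumerate(pms):
        after = [u + d for u, d in zip(used, demand)]
        if any(a > 1.0 for a in after):
            continue  # the job would exceed this PM's capacity
        s = skewness(after)
        if s < best_skew:
            best_index, best_skew = i, s
    return best_index


# Two PMs with [cpu, memory] utilization; the job is CPU-heavy.
pms = [[0.6, 0.2], [0.2, 0.6]]
chosen = place_job(pms, [0.3, 0.1])
```

In this example the CPU-heavy job goes to the second PM, whose spare capacity is mostly CPU, so that neither resource type is left stranded.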
Finally, we turn to a resource allocation problem in a specific application: media computing in clouds. As the "biggest big data", video streaming contributes the largest portion of global network traffic today and is expected to continue to do so. Because of heterogeneous mobile devices, networks and user preferences, the demand for transcoding source videos into different versions has increased significantly. However, video transcoding is time-consuming, and guaranteeing quality of service (QoS) for large volumes of video data is very challenging, particularly for real-time applications with strict delay requirements. In this thesis, we propose a cloud-based online video transcoding system (COVT) that aims to offer an economical, QoS-guaranteed solution for online large-volume video transcoding. COVT uses performance profiling to measure how transcoding tasks perform on different infrastructures. Based on the profiles, we model the cloud-based transcoding system as a queue and derive the QoS values of the system using queuing theory. With the analytically derived relationship between the QoS values and the number of CPU cores required for the transcoding workload, COVT solves the optimization problem and obtains the minimum resource reservation for given QoS constraints. A task scheduling algorithm is further developed to dynamically adjust the resource reservation and schedule tasks so as to guarantee the QoS. We implement a prototype of COVT and experimentally study its performance on real-world workloads. Experimental results show that COVT effectively provisions the minimum number of resources for a predefined QoS. To validate the effectiveness of the proposed method on large-scale video data, we further perform a simulation evaluation, which again shows that COVT is capable of achieving cost-effective and QoS-aware video transcoding in a cloud environment.
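The relationship between QoS and the number of CPU cores can be illustrated with a small sketch. It is not the COVT model: it assumes, purely for illustration, that the transcoding system behaves as an M/M/c queue (Poisson arrivals, exponential service times, one task per core) and that the QoS constraint is a bound on the mean queueing delay. The names erlang_c, mean_wait and min_cores, and all parameter values, are hypothetical.

```python
from math import factorial


def erlang_c(c, offered_load):
    """Probability that an arriving task has to wait in an M/M/c queue.

    offered_load is arrival_rate / service_rate (in Erlangs).
    """
    rho = offered_load / c
    if rho >= 1.0:
        return 1.0  # unstable system: every task waits
    top = offered_load ** c / factorial(c) / (1.0 - rho)
    bottom = sum(offered_load ** k / factorial(k) for k in range(c)) + top
    return top / bottom


def mean_wait(c, arrival_rate, service_rate):
    """Mean time a task spends queued before a core picks it up."""
    offered_load = arrival_rate / service_rate
    if offered_load >= c:
        return float("inf")  # the queue grows without bound
    return erlang_c(c, offered_load) / (c * service_rate - arrival_rate)


def min_cores(arrival_rate, service_rate, max_wait, limit=10_000):
    """Smallest number of cores whose mean queueing delay is <= max_wait."""
    for c in range(1, limit + 1):
        if mean_wait(c, arrival_rate, service_rate) <= max_wait:
            return c
    raise ValueError("no core count up to the limit satisfies the bound")


# Example: 2 tasks/s arrive, each core completes 1 task/s on average.
cores_loose = min_cores(2.0, 1.0, max_wait=0.5)
cores_tight = min_cores(2.0, 1.0, max_wait=0.1)
```

Tightening the delay bound from 0.5 s to 0.1 s raises the reservation from three cores to four in this example, which is the kind of cost versus QoS trade-off the optimization resolves. In practice the service rate would come from the performance profiles rather than being fixed by hand.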