April / June 2015
Cloud-Link: Special Issue on Cloud Resource Allocation
Cloud computing allows the effective sharing of computing resources over the Internet. Hence, managing resource allocation is an important issue in cloud computing. This issue of Cloud-Link addresses this important topic; 11 articles have been selected to cover different aspects of the subject.
Virtual machines are one of the key components in cloud computing. Three articles, “Optimization of Composite Cloud Service Processing with Virtual Machines,” “Truthful Greedy Mechanisms for Dynamic Virtual Machine Provisioning and Allocation in Clouds,” and “A Toolkit for Modeling and Simulation of Real Time Virtual Machine Allocation in a Cloud Data Center,” address various resource allocation issues related to virtual machines. MapReduce is a popular technique to facilitate data-intensive computing in a cloud environment. The articles “PRISM: Fine-Grained Resource-Aware Scheduling for MapReduce” and “Cost-Effective Resource Provisioning for MapReduce in a Cloud” study scheduling and resource management issues, respectively, for MapReduce. It is also important to consider application-related and network-related resource allocation issues. Resource management for database-as-a-service is examined in the article “SmartSLA: Cost-Sensitive Management of Virtualized Resources for CPU-Bound Database Services.” The article “Innovative Schemes for Resource Allocation in the Cloud for Media Streaming Applications” proposes resource allocation schemes for cloud-based media streaming applications. The article “Experimental Demonstration of Datacenter Resources Integrated Provisioning Over Multi-Domain Software Defined Optical Networks” investigates how computing and network resources can be allocated in a collaborative manner over software defined optical networks.
Last but not least, the following three articles presented in this issue of Cloud-Link address other important topics related to cloud resource allocation: resource provisioning for hybrid clouds is studied in the article “Towards Operational Cost Minimization in Hybrid Clouds for Dynamic Resource Provisioning with Delay-Aware Optimization,” QoS-aware resource allocation is considered in the article “Aggressive Resource Provisioning for Ensuring QoS in Virtualized Environments,” and the article “A Real Time Group Auction System for Efficient Allocation of Cloud Internet Applications” proposes a group auction method for resource allocation.
We hope that this issue of Cloud-Link can provide you with useful references to further explore this important and interesting topic. Articles have been selected based on various considerations (for example, variety, relevancy, and anticipated reader interest) and unavoidably there are many other useful and insightful articles that have not been included. You are also encouraged to search through IEEE Xplore and other databases for further reading.
Note that two of the aforementioned articles, “Optimization of Composite Cloud Service Processing with Virtual Machines” and “Truthful Greedy Mechanisms for Dynamic Virtual Machine Provisioning and Allocation in Clouds,” are available (free of charge) through Cloud-Link from IEEE. The articles were selected partly based on your responses and comments. We encourage you to continue to discuss the articles, provide comments, and tell us through LinkedIn what you think.
We are looking for topics for upcoming issues. If you have any suggestions, please email them to email@example.com.
Henry Chan, Victor Leung, Jens Jensen, and Tomasz Wiktorski
Optimization of Composite Cloud Service Processing with Virtual Machines
Sheng Di, D. Kondo, and Cho-Li Wang
Published in IEEE Transactions on Computers, June 2015
By leveraging virtual machine (VM) technology, this article’s authors optimize cloud system performance based on refined resource allocation, in processing user requests with composite services. Their contribution is three-fold. First, they devise a VM resource allocation scheme with a minimized processing overhead for task execution. Next, they comprehensively investigate the best-suited task scheduling policy with different design parameters. Finally, they explore the best-suited resource sharing scheme with adjusted divisible resource fractions on running tasks in terms of the proportional-share model (PSM), which can be split into an absolute mode (called AAPSM) and relative mode (RAPSM). The authors implement a prototype system over a cluster environment deployed with 56 real VM instances, and summarize valuable experience from their evaluation. When resources are in short supply, lightest workload first (LWF) is mostly recommended because it can minimize the overall response extension ratio (RER) for both sequential-mode tasks and parallel-mode tasks. In a competitive situation with over-commitment of resources, the best approach is combining LWF with both AAPSM and RAPSM. This outperforms other solutions in the competitive situation, by more than 16 percent with respect to the worst-case response time and by more than 7.4 percent with respect to fairness.
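To make the proportional-share idea concrete, here is a minimal sketch of the basic PSM: a divisible resource is split among running tasks in proportion to their weights. The AAPSM/RAPSM variants studied in the article further adjust each task's fraction, which is not modeled here; the function and weight values are illustrative.

```python
def proportional_share(capacity, weights):
    """Split a divisible resource among tasks in proportion to their weights.

    Illustrates the plain proportional-share model (PSM) only; the paper's
    absolute/relative adjusted variants (AAPSM/RAPSM) further tune each
    task's fraction and are not modeled here.
    """
    total = sum(weights.values())
    if total == 0:
        return {task: 0.0 for task in weights}
    return {task: capacity * w / total for task, w in weights.items()}

# Example: 8 CPU cores shared by three tasks with weights 1, 1, and 2.
shares = proportional_share(8.0, {"t1": 1, "t2": 1, "t3": 2})
```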
Truthful Greedy Mechanisms for Dynamic Virtual Machine Provisioning and Allocation in Clouds
M.M. Nejad, L. Mashayekhy, and D. Grosu
Published in IEEE Transactions on Parallel and Distributed Systems, February 2015
A major challenging problem for cloud providers is designing efficient mechanisms for virtual machine (VM) provisioning and allocation. Such mechanisms enable the cloud providers to effectively use their available resources and obtain higher profits. Recently, cloud providers have introduced auction-based models for VM provisioning and allocation, which allow users to submit bids for their requested VMs. The authors formulate the dynamic VM provisioning and allocation problem for the auction-based model as an integer program considering multiple types of resources. They then design truthful greedy and optimal mechanisms for the problem such that the cloud provider provisions VMs based on the requests of the winning users and determines their payments. They show that the proposed mechanisms are truthful, that is, the users do not have incentives to manipulate the system by lying about their requested bundles of VM instances and their valuations. They perform extensive experiments using real workload traces in order to investigate the performance of the proposed mechanisms. The authors’ proposed mechanisms achieve promising results in terms of revenue for the cloud provider.
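The greedy selection behind such mechanisms can be sketched as follows: rank requests by a bid-density metric and allocate bundles while capacity remains. The density metric used here (bid divided by total requested amount) and the omission of a payment rule are simplifications; the paper pairs greedy selection with a critical-payment scheme to obtain truthfulness.

```python
def greedy_vm_allocation(requests, capacity):
    """Greedily allocate VM bundles to the highest bid-density requests.

    requests: list of (user, bundle, bid), where bundle maps each resource
    (e.g. 'cpu', 'mem') to the requested amount and bid is the user's value.
    capacity: total available amount per resource.

    Illustrative only: the density metric is an assumption, and the
    payment computation required for truthfulness is omitted.
    """
    remaining = dict(capacity)
    winners = []
    ranked = sorted(requests, key=lambda r: r[2] / sum(r[1].values()), reverse=True)
    for user, bundle, bid in ranked:
        if all(bundle[res] <= remaining[res] for res in bundle):
            for res in bundle:
                remaining[res] -= bundle[res]
            winners.append(user)
    return winners

# Hypothetical requests against a capacity of 8 CPUs and 10 GB of memory.
requests = [
    ("u1", {"cpu": 4, "mem": 8}, 10.0),
    ("u2", {"cpu": 2, "mem": 2}, 6.0),
    ("u3", {"cpu": 6, "mem": 4}, 7.0),
]
winners = greedy_vm_allocation(requests, {"cpu": 8, "mem": 10})
```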
A Toolkit for Modeling and Simulation of Real Time Virtual Machine Allocation in a Cloud Data Center
Wenhong Tian, Yong Zhao, Minxian Xu, Yuanliang Zhong, and Xiashuang Sun
Published in IEEE Transactions on Automation Science and Engineering, January 2015
Resource scheduling in infrastructure as a service (IaaS) is one of the keys for large-scale cloud applications. Extensive research on all issues in a real environment is extremely difficult because it requires developers to consider network infrastructure and the environment, which might be beyond their control. In addition, the network conditions cannot be predicted or controlled. Therefore, performance evaluation of workload models and cloud provisioning algorithms in a repeatable manner under different configurations and requirements is difficult. There is still a lack of tools that enable developers to compare different resource scheduling algorithms in IaaS regarding both computing servers and user workloads. To fill this gap in tools for evaluation and modeling of cloud environments and applications, the authors propose CloudSched. CloudSched can help developers identify and explore appropriate solutions considering different resource scheduling algorithms. Unlike traditional scheduling algorithms that consider only one factor such as CPU, which can cause hotspots or bottlenecks in many cases, CloudSched treats multidimensional resources such as CPU, memory, and network bandwidth in an integrated fashion for both physical machines and virtual machines (VMs) for different scheduling objectives (algorithms). In this paper, two existing simulation systems at the application level for cloud computing are studied, a novel lightweight simulation system is proposed for real-time VM scheduling in cloud data centers, and results from applying the proposed simulation system are analyzed and discussed.
PRISM: Fine-Grained Resource-Aware Scheduling for MapReduce
Qi Zhang, M.F. Zhani, Yuke Yang, R. Boutaba, and B. Wong
Published in IEEE Transactions on Cloud Computing, April/June 2015
MapReduce has become a popular model for data-intensive computation in recent years. By breaking down each job into small map and reduce tasks and executing them in parallel across a large number of machines, MapReduce can significantly reduce the running time of data-intensive jobs. However, despite recent efforts toward designing resource-efficient MapReduce schedulers, existing solutions that focus on scheduling at the task level still offer suboptimal job performance. This is because tasks can have highly varying resource requirements during their lifetime, which makes it difficult for task-level schedulers to effectively use available resources to reduce job execution time. To address this limitation, the authors introduce PRISM, a fine-grained resource-aware MapReduce scheduler that divides tasks into phases, where each phase has a constant resource usage profile, and performs scheduling at the phase level. The authors first demonstrate the importance of phase-level scheduling by showing the resource usage variability within the lifetime of a task using a wide range of MapReduce jobs. They then present a phase-level scheduling algorithm that improves execution parallelism and resource use without introducing stragglers. In a 10-node Hadoop cluster running standard benchmarks, PRISM offers high resource use and provides 1.3× improvement in job running time compared to the current Hadoop schedulers.
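The phase-level idea can be illustrated with a toy scheduling step: because each phase has a roughly constant resource profile, the scheduler can pack phases onto a node rather than whole tasks. The fitness score used here (minimize leftover slack) and all names are assumptions for illustration, not PRISM's actual heuristic.

```python
def pick_phase(node_free, runnable_phases):
    """Pick the runnable phase that best fits a node's free resources.

    node_free: free resources on the node, e.g. {'cpu': ..., 'mem': ...}.
    runnable_phases: list of (task, phase, profile) where profile is the
    phase's (assumed constant) resource usage.

    Illustrative sketch of phase-level scheduling; the slack-minimizing
    score is an assumption, not the paper's algorithm.
    """
    feasible = [
        (task, phase, profile)
        for task, phase, profile in runnable_phases
        if all(profile[r] <= node_free[r] for r in profile)
    ]
    if not feasible:
        return None
    # Prefer the phase that leaves the least unused capacity behind.
    return min(feasible, key=lambda p: sum(node_free[r] - p[2][r] for r in p[2]))

choice = pick_phase(
    {"cpu": 2, "mem": 4},
    [("job1", "map", {"cpu": 1, "mem": 1}),
     ("job2", "shuffle", {"cpu": 1, "mem": 3}),
     ("job3", "reduce", {"cpu": 3, "mem": 2})],
)
```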
Cost-Effective Resource Provisioning for MapReduce in a Cloud
B. Palanisamy, A. Singh, and Ling Liu
Published in IEEE Transactions on Parallel and Distributed Systems, May 2015
This paper presents a new MapReduce cloud service model, Cura, for provisioning cost-effective MapReduce services in a cloud. In contrast to existing MapReduce cloud services such as a generic compute cloud or a dedicated MapReduce cloud, Cura has a number of unique benefits. First, Cura is designed to provide a cost-effective solution to efficiently handle MapReduce production workloads that have a significant amount of interactive jobs. Second, unlike existing services that require customers to decide the resources to be used for the jobs, Cura leverages MapReduce profiling to automatically create the best cluster configuration for the jobs. While the existing models allow only per-job resource optimization, Cura implements a globally efficient resource-allocation scheme that significantly reduces the resource usage cost in the cloud. Third, Cura leverages unique optimization opportunities when dealing with workloads that can withstand some slack. By effectively multiplexing the available cloud resources among the jobs based on the job requirements, Cura achieves significantly lower resource usage costs for the jobs. Cura’s core resource management schemes include cost-aware resource provisioning, VM-aware scheduling, and online virtual machine reconfiguration. The authors’ experimental results using Facebook-like workload traces show that their techniques lead to more than 80 percent reduction in the cloud compute infrastructure cost with up to 65 percent reduction in job response times.
SmartSLA: Cost-Sensitive Management of Virtualized Resources for CPU-Bound Database Services
Pengcheng Xiong, Yun Chi, Shenghuo Zhu, Hyun Jin Moon, C. Pu, and H. Hacigumus
Published in IEEE Transactions on Parallel and Distributed Systems, May 2015
Virtualization-based multitenant database consolidation is an important technique for database-as-a-service (DBaaS) providers to minimize their total cost, which is composed of service-level agreement (SLA) penalty cost, infrastructure cost, and action cost. Due to the bursty and diverse tenant workloads, over-provisioning for the peak or under-provisioning for the off-peak often results in either infrastructure cost or SLA penalty cost. Moreover, although the process of scaling out database systems will help DBaaS providers satisfy tenants’ SLA, its indiscriminate use has performance implications or incurs action cost. In this paper, the authors propose SmartSLA, a cost-sensitive, virtualized, resource-management system for CPU-bound database services, which is composed of two modules. The system modeling module uses machine-learning techniques to learn a model for predicting the SLA penalty cost for each tenant under different resource allocations. Based on the learned model, the resource allocating module dynamically adjusts the resource allocation by weighing the potential reduction of SLA penalty cost against increase of infrastructure cost and action cost. SmartSLA is evaluated by using the TPC-W and modified YCSB benchmarks with dynamic workload trace and multiple database tenants. The experimental results show that SmartSLA is able to minimize the total cost under time-varying workloads compared to the other cost-insensitive approaches.
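The cost trade-off at the heart of such a resource allocator can be sketched as a one-step rule: add a resource unit only if the predicted drop in SLA penalty cost outweighs the extra infrastructure cost plus the action cost of making the change. The function and its inputs are illustrative assumptions; SmartSLA searches over allocations with a learned predictor rather than applying this single-step test.

```python
def should_add_resource(penalty_now, penalty_after, infra_cost, action_cost):
    """Decide whether adding one resource unit to a tenant pays off.

    penalty_now / penalty_after: predicted SLA penalty cost before and
    after the adjustment (in the paper, produced by a learned model).
    infra_cost: cost of the extra infrastructure; action_cost: cost of
    performing the adjustment itself.

    Illustrative one-step rule, not SmartSLA's actual optimizer.
    """
    return (penalty_now - penalty_after) > (infra_cost + action_cost)
```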
Innovative Schemes for Resource Allocation in the Cloud for Media Streaming Applications
A. Alasaad, K. Shafiee, H.M. Behairy, and V.C.M. Leung
Published in IEEE Transactions on Parallel and Distributed Systems, April 2015
Media streaming applications have recently attracted a large number of users on the Internet. With the advent of these bandwidth-intensive applications, it is economically inefficient to provide streaming distribution with guaranteed quality of service (QoS) relying only on central resources at a media content provider. Cloud computing offers an elastic infrastructure that media content providers (for example, video on demand (VoD) providers) can use to obtain streaming resources that match the demand. Media content providers are charged for the amount of resources allocated (reserved) in the cloud. Most of the existing cloud providers (for example, Amazon CloudFront and Amazon EC2) employ a pricing model for the reserved resources that is based on nonlinear time-discount tariffs. Such a pricing scheme offers discount rates depending nonlinearly on the period of time during which the resources are reserved in the cloud. In this case, an open problem is to decide on both the right amount of resources reserved in the cloud, and their reservation time, such that the financial cost to the media content provider is minimized. The authors propose a simple, easy-to-implement algorithm for resource reservation that maximally exploits discounted rates offered in the tariffs, while ensuring that sufficient resources are reserved in the cloud. Based on the prediction of demand for streaming capacity, their algorithm is carefully designed to reduce the risk of making wrong resource allocation decisions. The results of the authors’ numerical evaluations and simulations show that the proposed algorithm significantly reduces the monetary cost of resource allocations in the cloud as compared to other conventional schemes.
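The effect of nonlinear time-discount tariffs can be seen with a toy cost comparison: reserving a longer term than strictly needed can still be cheaper if its discounted hourly rate offsets the unused hours. The tariff table and function below are hypothetical stand-ins for the provider pricing discussed in the article, not the authors' algorithm.

```python
def cheapest_reservation(hours_needed, tariffs):
    """Choose the reservation term that minimizes cost for a demand forecast.

    tariffs: list of (term_hours, price_per_hour) pairs, where longer
    terms carry nonlinearly discounted hourly rates (a hypothetical
    tariff table). Returns ((term, number_of_terms), total_cost).
    """
    best = None
    for term, rate in tariffs:
        # Pay for full terms; round the needed hours up to whole terms.
        n_terms = -(-hours_needed // term)  # ceiling division
        cost = n_terms * term * rate
        if best is None or cost < best[1]:
            best = ((term, n_terms), cost)
    return best

# 100 hours of predicted streaming demand against three discount tiers:
# hourly at $1.00, daily at $0.80/h, weekly at $0.50/h.
plan, cost = cheapest_reservation(100, [(1, 1.00), (24, 0.80), (168, 0.50)])
```

Here the weekly tier wins even though 68 of its 168 hours go unused, which is exactly the trade-off the reservation algorithm must weigh against demand-prediction risk.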
Experimental Demonstration of Datacenter Resources Integrated Provisioning Over Multi-Domain Software Defined Optical Networks
Haoran Chen, Jie Zhang, Yongli Zhao, Junni Deng, Wei Wang, Ruiying He, Xiaosong Yu, Yuefeng Ji, Haomian Zheng, Yi Lin, and Haifeng Yang
Published in IEEE Journal of Lightwave Technology, April 2015
Due to the emergence of cloud computing and various cloud services that are remote and geographically distributed, datacenters interconnected by optical networks have attracted much attention from network operators and service providers. With the purpose of supporting cloud services more effectively and efficiently, IT resources and interconnected network resources provisioning could be considered in an orchestrated way. In this paper, the authors present a datacenter resources integrated provisioning (DRIP) architecture using coordinated virtualization of distributed datacenters and operator’s multidomain software defined optical networks. The DRIP architecture aims to accomplish IT resources and optical network resources integrated allocation. In order to investigate the feasibility and efficiency of the proposed architecture, two IT resources allocation strategies and two virtual network composition strategies are evaluated on the authors’ testbed. They perform an experimental demonstration to evaluate the strategies’ performance in terms of three metrics: CPU use ratio of physical hosts, virtual network failure rate, and average latency.
Towards Operational Cost Minimization in Hybrid Clouds for Dynamic Resource Provisioning with Delay-Aware Optimization
Song Li, Yangfan Zhou, Lei Jiao, Xinya Yan, Xin Wang, and M.R.-T. Lyu
Published in IEEE Transactions on Services Computing, May/June 2015
Recently, a hybrid cloud computing paradigm has been widely advocated as a promising solution for software-as-a-service (SaaS) providers to effectively handle dynamic user requests. With such a paradigm, the SaaS providers can seamlessly extend their local services into the public clouds so that the dynamic user request workload to a SaaS can be elegantly processed with both the local servers and the rented computing capacity in the public cloud. However, although it is suggested that a hybrid cloud might save costs compared with building a powerful, private cloud, considerable renting and communication costs are still introduced in such a paradigm. Optimizing this operational cost thus becomes a major concern for SaaS providers adopting the hybrid cloud computing paradigm, yet this critical problem remains unanswered in the current state of the art. In this paper, the authors focus on optimizing the operational cost for the hybrid cloud paradigm by theoretically analyzing the problem with a Lyapunov optimization framework. This allows the authors to design an online, dynamic provision algorithm. In this way, the approach can address the real-world challenges where no a priori information of public cloud renting prices is available and the future probability distribution of user requests is unknown. The authors then conduct an extensive experimental study based on a set of real-world data, and the results confirm that their algorithm can work effectively in reducing the operational cost.
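The hybrid-cloud setting can be pictured with a static dispatch rule: serve requests locally up to capacity and rent public cloud capacity for the overflow. This one-shot split is only an illustration of the cost structure; the paper's contribution is an online algorithm, derived via Lyapunov optimization, that makes this decision over time without knowing future prices or demand. All names and numbers here are assumptions.

```python
def dispatch(requests, local_capacity, rent_price):
    """Split a request batch between local servers and the public cloud.

    Serve up to local_capacity requests locally and rent cloud capacity
    for the overflow; returns (local, rented, rent_cost). A static toy
    split, not the paper's online Lyapunov-based algorithm.
    """
    local = min(requests, local_capacity)
    rented = requests - local
    return local, rented, rented * rent_price

# 120 requests against 100 units of local capacity at $0.50 per rented unit.
result = dispatch(120, 100, 0.5)
```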
Aggressive Resource Provisioning for Ensuring QoS in Virtualized Environments
J. Liu, Y. Zhang, Y. Zhou, D. Zhang, and H. Liu
Published in IEEE Transactions on Cloud Computing, April/June 2015
Elasticity has now become an elemental feature of cloud computing, as it enables virtual machine instances to be dynamically added or removed as workload changes. However, effective virtualized resource management is still one of the most challenging tasks. When the workload of a service increases rapidly, existing approaches cannot respond efficiently to the growing performance requirement because of either inaccuracy of adaptation decisions or the slow process of adjustments, both of which might result in insufficient resource provisioning. As a consequence, the quality of service (QoS) of the hosted applications might degrade and the service level objective (SLO) will thus be violated. In this paper, the authors introduce SPRNT, a novel resource-management framework, to ensure high-level QoS in the cloud computing system. SPRNT uses an aggressive resource provisioning strategy, which encourages SPRNT to substantially increase the resource allocation in each adaptation cycle when workload increases. This strategy first provisions resources that are possibly more than actual demands, and then reduces the over-provisioned resources if needed. By applying the aggressive strategy, SPRNT can satisfy the increasing performance requirement in the first place so that the QoS can be kept at a high level. The experimental results show that SPRNT achieves up to 7.7× speedup in adaptation time, compared with existing efforts. By enabling quick adaptation, SPRNT limits the SLO violation rate to at most 1.3 percent even when dealing with rapidly increasing workload.
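The aggressive strategy can be sketched as a single adaptation step: when demand rises, allocate more than the observed demand so the ramp-up is covered immediately, then trim back once demand falls. The headroom factor and trim rule below are illustrative assumptions, not SPRNT's actual controller.

```python
def aggressive_provision(current, demand, headroom=0.5):
    """One adaptation cycle of an aggressive provisioning strategy.

    current: currently allocated resource amount; demand: observed demand.
    On rising demand, over-provision by a headroom fraction to protect QoS
    during the ramp-up; on falling demand, reclaim over-provisioned
    resources gradually. Both parameters are illustrative.
    """
    if demand > current:
        # Over-provision first so growing workload is covered immediately.
        return demand * (1 + headroom)
    # Trim toward demand, but never below it.
    return max(demand, current * 0.75)
```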
A Real Time Group Auction System for Efficient Allocation of Cloud Internet Applications
Chonho Lee, Ping Wang, and D. Niyato
Published in IEEE Transactions on Services Computing, March/April 2015
The increasing number of cloud-based Internet applications has led to the demand for efficient resource and cost management. This paper proposes a real-time group auction system for the cloud instance market. The system is designed based on a combinatorial double auction, and its applicability and effectiveness are evaluated in terms of resource efficiency and monetary benefits to auction participants (for example, cloud users and providers). The proposed auction system helps participants decide when, how, and to which users providers should allocate their resources. Furthermore, the authors propose a distributed algorithm using a group formation game that determines which users and providers will trade resources by their cooperative decisions. To determine how to allocate the resources, the utility optimization problem is formulated as a binary integer programming problem, and a nearly optimal solution is obtained by a heuristic algorithm with quadratic time complexity. In comparison studies, the proposed real-time group auction system with cooperation outperforms an individual auction in terms of resource efficiency (for example, the request acceptance rate for users and resource use for providers) and monetary benefits (for example, average payments for users and total profits for providers).
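A double auction's matching step can be illustrated with a toy greedy rule: serve buyers highest-bid-first from providers lowest-ask-first, trading while the bid covers the ask. This is a deliberately simplified stand-in for the paper's combinatorial double auction and group formation game; all names and prices are hypothetical.

```python
def match_double_auction(buy_bids, sell_asks):
    """Greedy matching for a simple double auction on cloud instances.

    buy_bids: list of (user, units, price) the user will pay per instance.
    sell_asks: list of (provider, units, price) the provider asks per
    instance. Returns a list of (user, provider, quantity) trades.

    Toy sketch only; the paper's combinatorial double auction handles
    bundles and group formation, which are not modeled here.
    """
    buyers = sorted(buy_bids, key=lambda b: -b[2])   # highest bid first
    sellers = sorted(sell_asks, key=lambda s: s[2])  # lowest ask first
    trades = []
    si = 0
    s_units = sellers[0][1] if sellers else 0
    for user, units, bid in buyers:
        while units > 0 and si < len(sellers) and bid >= sellers[si][2]:
            qty = min(units, s_units)
            trades.append((user, sellers[si][0], qty))
            units -= qty
            s_units -= qty
            if s_units == 0:
                si += 1
                if si < len(sellers):
                    s_units = sellers[si][1]
    return trades

trades = match_double_auction(
    [("u1", 3, 5.0), ("u2", 2, 2.0)],
    [("p1", 2, 1.0), ("p2", 4, 3.0)],
)
```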