Automatic Identification of I/O Bottleneck and Run-time Optimization for Cluster Virtualization

Sponsored by NSF-CCF HECURA

Abstract:

Extending virtualization technology into high-performance cluster platforms generates exciting new possibilities. However, I/O efficiency in virtualized environments, particularly disk I/O, remains poorly understood and largely untested.

The objective of this research is to investigate fundamental techniques for virtual clusters that not only facilitate rigorous performance studies, but also identify places where performance is suffering and then optimize the system to lessen the impact of such bottlenecks. To accomplish this objective, the following research tasks will be conducted: 1) An in-depth analysis of I/O efficiency in virtualized environments and investigation of intelligent and automated I/O bottleneck identification schemes; 2) Design and development of techniques to optimize I/O to address the detected I/O bottlenecks; 3) Development of an extensible framework for characterizing I/O workloads across virtualized clusters.

This research will contribute substantially to understanding virtualized I/O, identifying I/O bottlenecks, and optimizing I/O, and will thus help cluster systems use virtualization technology most effectively. The project will also contribute to society by promoting research and engaging under-represented groups, helping students advance their careers in science and engineering.

Personnel

- Investigator

Dr. Xubin He, Virginia Commonwealth University, PI

- Collaborators

Dr. Stephen Scott, Oak Ridge National Lab
Changsheng Xie, Huazhong University of Science and Technology
Yulong Yu and He Guo, Dalian University of Technology

- Graduate Students

Tao Lu, Virginia Commonwealth University (PhD student, Fall 2012-)
Yuhua Guo, Virginia Commonwealth University (PhD student, Fall 2013-)
Morgan Stuart, Virginia Commonwealth University (MS student, Fall 2013-)
Chao Yu, Virginia Commonwealth University (Visiting PhD student, Spring-Fall 2012)
Chentao Wu, Virginia Commonwealth University (PhD student, Fall 2010-Fall 2012)
Ben Eckart, Tennessee Tech University (MS student, Spring 2009-August 2010)
Ferrol Aderholdt, Tennessee Tech University (MS student, Fall 2009-August 2010)

- Undergraduate Students

Morgan Stuart, Virginia Commonwealth University (Fall 2012-Summer 2013)
David Elizondo, Virginia Commonwealth University (Spring 2012)
David Lyons, Virginia Commonwealth University (Spring 2011)
Juho Yoo, Tennessee Tech University (Spring 2010)
Jason Taylor, Tennessee Tech University (Spring 2009)

Recent Publications

M. Stuart, T. Lu, and X. He, "Alleviating I/O Interference via Caching and Rate-Controlled Prefetching without Degrading Migration Performance", 9th Parallel Data Storage Workshop (PDSW), held in conjunction with SC'14, New Orleans, November 2014.
T. Lu and M. Stuart and K. Tang and X. He (2014). Clique Migration: Affinity Grouping of Virtual Machines for Inter-Cloud Live Migration. International Conference on Networking, Architecture, and Storage (NAS). Tianjin, China. Best Student Paper Award.
S. Li and X. He and S. Wan and Y. Guo and P. Huang and D. Chen and Q. Cao and C. Xie (2014). Exploiting Decoding Computational Locality to Improve the I/O Performance of an XOR-coded Storage Cluster under Concurrent Failures. International Symposium on Reliable Distributed Systems (SRDS). Nara, Japan.
P. Huang and P. Subedi and X. He and S. He and K. Zhou (2014). FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance. The USENIX Annual Technical Conference (ATC). Philadelphia.
M. Fu and D. Feng and Y. Hua and X. He and Z. Chen and W. Xia and F. Huang and Q. Liu (2014). Accelerating Restore and Garbage Collection in Deduplication-based Backup Systems via Exploiting Historical Information. The USENIX Annual Technical Conference (ATC). Philadelphia.
P. Huang and G. Wu and X. He and W. Xiao (2014). An Aggressive Worn-out Flash Block Management Scheme to Alleviate the SSD Performance Degradation. The European Conference on Computer Systems (Eurosys). Amsterdam, The Netherlands.
Y. Yu, X. He, H. Guo, S. Zhong, Y. Wang, X. Chen, and W. Xiao, “APR: A Novel Parallel Repacking Algorithm for Efficient GPGPU Parallel Code Transformation”, Proceedings of Workshop on General Purpose Processing Using GPUs (GPU-7), 2014.
Y. Yu, X. He, H. Guo, Y. Wang, and X. Chen, “A Credit-Based Load-Balance-Aware CTA Scheduling Optimization Scheme in GPGPU”, International Journal of Parallel Programming, August 2014.
Guanying Wu, Ping Huang, and Xubin He, "Reducing SSD Access Latency via NAND Flash Program and Erase Suspension", Journal of Systems Architecture, Vol. 60, No. 4, April 2014, pp. 345-356.
C. Wu, X. He, Q. Cao, C. Xie, and S. Wan, "Hint-K: An Efficient Multi-level Cache Using K-step Hints", IEEE Transactions on Parallel and Distributed Systems, Vol. 25, No. 3, March 2014.
Guanying Wu, Xubin He, Ningde Xie, Tong Zhang, "Exploiting Workload Dynamics to Improve SSD Read Latency via Differentiated Error Correction Codes", ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol. 18, issue 4, October 2013.
S. Wan, X. He, et al, “An Efficient Penalty-Aware Cache to Improve the Performance of Parity-Based Disk Arrays under Faulty Conditions”, IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 24, No. 8, August 2013.
Chentao Wu and Xubin He, "A Flexible Framework to Enhance RAID-6 Scalability via Exploiting the Similarities among MDS Codes", Proc. of the 42nd International Conference on Parallel Processing (ICPP'2013), Lyon, France, October 1-4, 2013 (acceptance rate: 30%, 59 out of 193 submissions).
X. Zhou, Q. Cao, C. Xie, and X. He, “D-PALD: A Dynamic Power-Aware Load Dispatcher with Response Time Percentile Guarantee in Heterogeneous Clusters”, Proc. of the 8th IEEE International Conference on Networking, Architecture, and Storage (NAS), July 2013 (acceptance rate: 37.5%).
Tao Lu, Morgan Stuart, and Xubin He, "SLM: Synchronized Live Migration of Virtual Clusters across Data Centers", Poster presentation, the 11th USENIX Conference on File and Storage Technologies (FAST2013), Feb 12-15, 2013.
C. Wu, X. He, Q. Cao, C. Xie, and S. Wan, “Hint-K: An Efficient Multi-level Cache Using K-step Hints”, IEEE Transactions on Parallel and Distributed Systems (TPDS), accepted in February 2013.
Chentao Wu and Xubin He, "GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling", Proc. of the 41st International Conference on Parallel Processing (ICPP'2012), Pittsburgh, PA, September 10-13, 2012 (acceptance rate: 28%).
Chentao Wu, Xubin He, Jizhong Han, Huailiang Tan, and Changsheng Xie, "SDM: A Stripe-based Data Migration Scheme to Improve the Scalability of RAID-6", Proc. of the IEEE International Conference on Cluster Computing (Cluster'2012), Beijing, September 24-28, 2012 (acceptance rate: 28.86%).
Min Li, Yulong Zhang, Kun Bai, Wanyu Zang, Meng Yu, and Xubin He, "Improving Cloud Survivability through Dependency based Virtual Machine Placement". Proc. of the International Conference on Security and Cryptography (SECRYPT'12), Rome, Italy, 24-27 July 2012.
B. Eckart, X. He, C. Wu, F. Aderholdt, F. Han, S. Scott, "Distributed Virtual Diskless Checkpointing: A Highly Fault Tolerant Scheme for Virtualized Clusters", the High-Performance Grid and Cloud Computing Workshop, to be held with the 26th International Parallel and Distributed Processing Symposium (IPDPS), May, 2012.
Y. Yu, Y. Wang, H. Guo, and X. He, "Optimization Schemes to Improve Hybrid Co-scheduling for Concurrent Virtual Machines", International Journal of Parallel, Emergent and Distributed Systems, Feb. 2012.
Y. Yu, Y. Wang, H. Guo, and X. He, "Hybrid Co-Scheduling Optimizations for Concurrent Applications in Virtualized Environment", the 6th IEEE International Conference on Networking, Architecture, and Storage (NAS), Dalian, China, July 28-30, 2011 (acceptance rate: 33/107=30.8%).
C. Wu, X. He, G. Wu, S. Wan, X. Liu, Q. Cao, and C. Xie, "HDP Code: A Horizontal-Diagonal Parity Code to Optimize I/O Load Balancing in RAID-6," Proceedings of the 41st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN2011), June 27-June 30, 2011, Hong Kong, China (acceptance rate: 26/148=17.6%).
S. Wan, Q. Cao, J. Huang, S. Li, X. Li, S. Zhan, L. Yu, C. Xie, and X. He, "Victim Disk First: An Asymmetric Cache to Boost the Performance of Disk Arrays under Faulty Conditions", The USENIX Annual Technical Conference, Portland, OR, June 15-17, 2011 (acceptance rate: 27/180=15%).
C. Wu, S. Wan, X. He, Q. Cao, and C. Xie, "H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6", The 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Anchorage, Alaska, May 16-20, 2011.
C. Wu, X. He, Q. Cao, and C. Xie, "Hint-K: An Efficient Multi-level Cache Using K-step Hints", Proceedings of the 39th International Conference on Parallel Processing, September 13-16, 2010.
B. Eckart, F. Aderholdt, J. Yoo, X. He, and S. Scott, "A Top-Down Approach to Dynamically Tune I/O for HPC virtualization," Proceedings of the 4th Workshop on System-level Virtualization for High Performance Computing (HPCVirt), in conjunction with the 5th ACM SIGOPS European Conference on Computer Systems (EuroSys), April 2010.
F. Aderholdt, B. Eckart, X. He, and S. Scott, "Investigating Locality Reformations for Cluster Virtualization", Poster presentation, the 8th USENIX Conference on File and Storage Technologies (FAST2010), Feb 23-26, 2010.
B. Eckart, X. He, H. Ong, S. Scott, "An Extensible I/O Performance Analysis Framework for Distributed Environments", Euro-Par, Delft, The Netherlands, August 25-28, 2009.
B. Eckart, X. He, H. Ong, and S. Scott, “KVM on Clusters: Tackling the Disk I/O Bottleneck for HPC Virtualization”, poster presentation at the 7th USENIX conference on File and Storage Technologies (FAST), Feb. 24-27, 2009, San Francisco, CA.

Theses/Dissertations

[PhD] Chentao Wu, “Improve the Performance and Scalability of RAID-6 Systems Using Erasure Codes”, Date Graduated: December 2012. First employment after graduation: Assistant Professor, Shanghai Jiaotong University, Shanghai, China.

[MS] Ben Eckart, "A Distributed Framework for Runtime Optimization and Diskless Checkpointing on Virtualized Clusters", Date Graduated: August 2010. First employment after graduation: PhD candidate at CMU.

Software Release

ExPerT: An EXtensible Performance Toolkit, http://code.google.com/p/expert/

Sponsor

National Science Foundation (NSF)