NSF

Collaborative Research: Cross-Layer Exploration of Non-Volatile Solid-State Memories to Achieve Effective I/O Stack for High-Performance Computing Systems

Sponsored by NSF-CCF

Abstract:

The objective of this research is to develop techniques that utilize solid-state memory technologies from device, circuit, architecture, and system perspectives across the I/O hierarchy, in order to exploit their true potential for improving I/O stack performance in high-performance computing systems.

I/O-friendly memory system architectures will be developed to enable hybrid processor-memory 3D integration with substantially reduced off-chip I/O traffic. Adaptive cache management and hotspot prediction methods will be developed to address the low random write performance of solid-state drives, and data processing techniques will be developed to enable run-time configurable trade-offs among solid-state drive performance characteristics. A comprehensive full-system simulation infrastructure will be developed to evaluate and demonstrate the research under diverse high-performance computing workloads.

The research will help high-performance computing systems make the most effective use of existing and emerging memory and processing technologies to tackle the grand challenge of I/O stack design. It can contribute greatly to keeping high-performance computing systems on their historic scaling track, and hence benefit numerous real-life applications in fields such as biology, chemistry, earth science, and health care. This project will also benefit society by engaging under-represented groups, disseminating research infrastructure for education and training, and reaching out to high school students.

 

Personnel

- Investigators

- Collaborators

- Post Doctoral Researcher

- Graduate Students

- Undergraduate Students

Recent Publications

  1. P. Huang, P. Subedi, X. He, S. He, and K. Zhou (2014). FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance. The USENIX Annual Technical Conference (ATC). Philadelphia.
  2. P. Huang, G. Wu, X. He, and W. Xiao (2014). An Aggressive Worn-out Flash Block Management Scheme to Alleviate the SSD Performance Degradation. The European Conference on Computer Systems (Eurosys). Amsterdam, The Netherlands.
  3. G. Wu and X. He, “Reducing SSD Access Latency via NAND Flash Program and Erase Suspension,” Journal of Systems Architecture, Vol. 60, No. 4, 2014, pp. 345-356.
  4. G. Wu, X. He, N. Xie, and T. Zhang, “Exploiting Workload Dynamics to Improve SSD Read Latency via Differentiated Error Correction Codes,” ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol. 18, No. 4, October 2013.
  5. Hua Wang, Ping Huang, Shuang He, Ke Zhou, Chunhua Li, and Xubin He, “A Novel I/O Scheduler for SSD with Improved Performance and Lifetime”, Proc. of the 29th IEEE Symposium on Massive Storage Systems and Technologies (MSST), May 2013 (acceptance rate: 30 out of 109 submissions=27.5%).
  6. Chentao Wu and Xubin He, "GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling", Proc. of the 41st International Conference on Parallel Processing (ICPP'2012), Pittsburgh, PA, September 10-13, 2012 (acceptance rate: 28%).
  7. Chentao Wu, Xubin He, Jizhong Han, Huailiang Tan, and Changsheng Xie, "SDM: A Stripe-based Data Migration Scheme to Improve the Scalability of RAID-6", Proc. of the IEEE International Conference on Cluster Computing (Cluster'2012), Beijing, September 24-28, 2012 (acceptance rate: 28.86%).
  8. G. Wu and X. He, "Delta FTL: Improving SSD Lifetime via Exploiting Content Locality", Proc. of the European Conference on Computer Systems (Eurosys'2012), acceptance rate: 27/178=15%.
  9. G. Wu and X. He, "Reducing SSD Read Latency via NAND Flash Program and Erase Suspension", Proc. of the 10th USENIX Conference on File and Storage Technologies (FAST '12), acceptance rate: 26/137=19%.
  10. G. Wu, X. He, and B. Eckart, "An Adaptive Write Buffer Management Scheme for Flash-based SSD," ACM Transactions on Storage, Feb. 2012.
  11. Xin Chen, Xubin He, He Guo, and Yuxin Wang, “Design and Evaluation of an Online Anomaly Detector for Distributed Storage Systems”, Journal of Software, 2011 (in print).
  12. G. Wu, C. Wu, and X. He, "Latent Sector Error Modeling and Detection for NAND Flash-based SSDs", Poster session report, the 9th USENIX Conference on File and Storage Technologies (FAST2011), Feb 15-17, 2011.
  13. S. Wan, Q. Cao, J. Huang, S. Li, X. Li, S. Zhan, L. Yu, C. Xie, and X. He, "Victim Disk First: An Asymmetric Cache to Boost the Performance of Disk Arrays under Faulty Conditions", The USENIX Annual Technical Conference, Portland, OR, June 15-17, 2011 (acceptance rate: 27/180=15%).
  14. C. Wu, X. He, G. Wu, S. Wan, X. Liu, Q. Cao, and C. Xie, "HDP Code: A Horizontal-Diagonal Parity Code to Optimize I/O Load Balancing in RAID-6," Proceedings of the 41st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN2011), June 27-June 30, 2011, Hong Kong, China (acceptance rate: 26/148=17.6%).
  15. C. Wu, S. Wan, X. He, Q. Cao, and C. Xie, "H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6", The 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Anchorage, Alaska, May 16-20, 2011.
  16. Chentao Wu, Xubin He, Qiang Cao and Changsheng Xie, "Hint-K: An Efficient Multi-level Cache Using K-step Hints", Proceedings of the 39th International Conference on Parallel Processing (ICPP), Sept. 13-16, 2010.
  17. G. Wu, X. He, N. Xie, and T. Zhang, "DiffECC: Improving SSD Read Performance Using Differentiated Error Correction Coding Schemes", The 18th Annual Meeting of the IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunications Systems, August 17-19, 2010. Best Paper Award Candidate.
  18. S. Wan, Q. Cao, C. Xie, B. Eckart, and X. He, "Code-M: A Non-MDS Erasure Code Scheme to Support Fast Recovery from up to Two-Disk Failures in Storage Systems", Proceedings of the 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2010), June 28-July 1, 2010.
  19. G. Wu, B. Eckart, and X. He, "BPAC: An Adaptive Write Buffer Management Scheme for Flash-based Solid State Drives," The 26th IEEE Symposium on Massive Storage Systems and Technologies (MSST2010), May 6-7, 2010.
  20. L. McNeese, G. Wu, and X. He, "The Hot Pages Associative Translation Layer for Solid State Drives," Work in progress report, the 8th USENIX Conference on File and Storage Technologies (FAST2010), Feb 23-26, 2010.
  21. C. Wu, X. He, S. Wan, Q. Cao, C. Xie, “Hotspot Prediction and Cache in Distributed Stream-processing Storage Systems”, To be presented at the International Performance Computing and Communication Conference (IPCCC), December 14-16, 2009.

Invention Disclosure and Patent

Thesis/Dissertations

[PhD] Guanying Wu, "Performance and Reliability Study and Exploration of NAND Flash-based Solid State Drives", Date Graduated: August 2013. First employment after graduation: LSI Inc., Denver, CO.

[PhD] Chentao Wu, “Improve the Performance and Scalability of RAID-6 Systems Using Erasure Codes”, Date Graduated: December 2012. First employment after graduation: Assistant Professor, Shanghai Jiaotong University, Shanghai, China.

[MS] Guanying Wu, "Design and Evaluation of an Adaptive Write Buffer Cache for Solid State Drives", Date Graduated: December 2009. First employment after graduation: PhD candidate at TTU/VCU.

Sponsor

National Science Foundation (NSF)