Abstract
Storage performance on supercomputers is variable: it depends not only on an application’s own workload but also on the other I/O activities running concurrently. In particular, degraded meta-data access performance leads to poor storage performance across applications running at the same time. We focus on two representative performance problems, high load and slow response of a meta-data server, based on an analysis of meta-data server activities using file system performance metrics on the K computer. We investigate the root causes of these problems through MDTEST benchmark runs and confirm that server-side quality-of-service management of service thread assignment for incoming client requests on a meta-data server improves performance.
Acknowledgment
The results of this paper were obtained using the K computer.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Tsujita, Y., Furutani, Y., Hida, H., Yamamoto, K., Uno, A., Sueyasu, F. (2018). I/O Interference Alleviation on Parallel File Systems Using Server-Side QoS-Based Load-Balancing. In: Yokota, R., Weiland, M., Shalf, J., Alam, S. (eds) High Performance Computing. ISC High Performance 2018. Lecture Notes in Computer Science, vol 11203. Springer, Cham. https://doi.org/10.1007/978-3-030-02465-9_3
DOI: https://doi.org/10.1007/978-3-030-02465-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02464-2
Online ISBN: 978-3-030-02465-9
eBook Packages: Computer Science (R0)