In this episode of "Call the Doctor," Scot Breitenfeld discusses the h5fuse script and the subfiling VFD in HDF5. The discussion opens with a question about who typically uses the subfiling VFD: mostly scientific and engineering applications that write large HDF5 files in parallel. The subfiling VFD is designed to address the problems of writing to a single shared file in large-scale runs involving thousands of ranks.
Scot then explains how to use the h5fuse script within an application. He demonstrates an example using the ExaMPM application, which performs time steps and outputs HDF5 files. The h5fuse script is called after each time step to fuse the subfiles into a valid HDF5 file. Scot provides code snippets and explains the process of determining the location of the configuration file, setting up the h5fuse utility, and checking the completion of the process.
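The steps described above (locating the configuration file, launching h5fuse, and checking for completion) can be sketched roughly as follows. This is a minimal Python illustration, not the code shown in the session. It assumes the subfiling configuration file is named `<file>.subfile_<inode>.config` and sits next to the HDF5 file, and that h5fuse accepts the configuration file through a `-f` option; both should be checked against the HDF5 release in use. The helper names `subfiling_config_path`, `h5fuse_command`, `start_fuse`, and `fuse_finished` are hypothetical.

```python
import os
import subprocess


def subfiling_config_path(h5_path, config_dir=None):
    """Return the expected subfiling configuration file for an HDF5 file.

    Assumption: the file is named "<basename>.subfile_<inode>.config",
    where <inode> is the inode of the HDF5 file itself, and it lives next
    to that file unless a separate config_dir is given.
    """
    inode = os.stat(h5_path).st_ino
    directory = config_dir or os.path.dirname(os.path.abspath(h5_path))
    name = f"{os.path.basename(h5_path)}.subfile_{inode}.config"
    return os.path.join(directory, name)


def h5fuse_command(config_path, h5fuse="h5fuse"):
    """Build the h5fuse command line (the -f option is an assumption)."""
    return [h5fuse, "-f", config_path]


def start_fuse(h5_path, h5fuse="h5fuse"):
    """Launch h5fuse in the background so the next time step can proceed."""
    command = h5fuse_command(subfiling_config_path(h5_path), h5fuse)
    return subprocess.Popen(command)


def fuse_finished(proc):
    """Non-blocking check: True once h5fuse has exited successfully."""
    code = proc.poll()  # None while the process is still running
    if code is None:
        return False
    if code != 0:
        raise RuntimeError(f"h5fuse exited with status {code}")
    return True
```

In a time-stepping loop, an application would call `start_fuse` once the output file for a step has been closed, keep computing, and call `fuse_finished` on the returned process before relying on the fused file.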
He also shares performance results for fusing files on the Frontier system, where he ran the application on 1024 nodes. Fusing a nearly 1 TB file with h5fuse took about 40 seconds on average, and the simulation time was essentially the same with or without the h5fuse step.
During the Q&A session, Scot clarifies that the h5fuse script is executed on all nodes, but each node fuses only the subfiles it can access. He also explains that modifying the NetCDF or CGNS libraries for subfiling involves replacing the MPIO driver with the subfiling driver and configuring the application to use subfiling through environment variables or hardcoded options.
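As a rough illustration of the environment-variable route, the sketch below builds an environment for launching an application whose HDF5 layer already selects the subfiling driver (in C, by calling `H5Pset_fapl_subfiling` where `H5Pset_fapl_mpio` was used before). The variable names `H5FD_SUBFILING_IOC_PER_NODE` and `H5FD_SUBFILING_STRIPE_SIZE` are given as recalled from the HDF5 subfiling documentation and should be verified there. The values are examples only, and the helper name `subfiling_env` is hypothetical.

```python
import os


def subfiling_env(ioc_per_node=1, stripe_size=32 * 1024 * 1024, base_env=None):
    """Return a copy of the environment with subfiling settings added.

    The values are illustrative; the variables only matter if the
    application already opens its files with the subfiling driver.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["H5FD_SUBFILING_IOC_PER_NODE"] = str(ioc_per_node)  # I/O concentrators per node
    env["H5FD_SUBFILING_STRIPE_SIZE"] = str(stripe_size)    # stripe size in bytes
    return env
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting the job launcher for the application.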
This session happened on July 18, 2023. You can also watch this episode online.
Call the Doctor is a series of weekly, unscripted, live events! The HDF Group’s staff members will answer attendee questions and, for example, go over the previous week’s HDF Forum posts. The HDF Clinics are free sessions intended to help users tackle real-world HDF problems from a common cold to severe headaches and offer relief where that’s possible. As time permits, we will include how-tos, offer advice on tool usage, review your code samples, teach you survival in the documentation jungle, and discuss what’s new or just around the corner in the land of HDF.
Join us every Tuesday at 12:20 p.m. Central (US/Canada) on Zoom!