The HDF Group's John Readey and Matt Larson discuss the multi-dataset functionality in HDF5 and its potential for parallel processing in HPC applications. They explain how the HDF5 library now lets users read from or write to multiple datasets, each with its own selection, in a single call (H5Dread_multi and H5Dwrite_multi). They highlight the implementation of this feature in HSDS (Highly Scalable Data Service) and its advantages for client-server applications that use cloud storage, where each separate request adds latency. Matt shares his benchmark results, which show a significant speedup from rearranging requests to use H5Dread_multi and H5Dwrite_multi. John presents a Python analog called "multi-manager," which provides an easy way to manage and perform actions on multiple groups, attributes, and datasets. They also discuss the limitations and potential benefits of the multi-dataset calls and their integration with HSDS. Overall, the video explores how multi-dataset I/O and parallel processing can improve performance in HDF5 applications.
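To make the round-trip argument concrete, here is a minimal toy sketch in Python. It is not the HDF5, HSDS, or "multi-manager" API: the ToyServer class and its read and read_multi methods are hypothetical names invented for illustration, and the "server" is just an in-memory dictionary that counts requests. It only shows why bundling several (dataset, selection) pairs into one call can reduce the number of requests a client has to make.

```python
from dataclasses import dataclass


@dataclass
class ToyServer:
    """Hypothetical in-memory stand-in for a remote data service (not HSDS)."""
    datasets: dict
    round_trips: int = 0

    def read(self, name, sel):
        # One request per dataset: every call counts as one round trip.
        self.round_trips += 1
        return self.datasets[name][sel]

    def read_multi(self, names, sels):
        # One request carrying many (dataset, selection) pairs.
        self.round_trips += 1
        return [self.datasets[n][s] for n, s in zip(names, sels)]


# Four small made-up datasets, each with its own selection.
server = ToyServer({f"dset{i}": list(range(i, i + 10)) for i in range(4)})
names = list(server.datasets)
sels = [slice(0, 3)] * len(names)

# Read the datasets one at a time and record the request count.
one_by_one = [server.read(n, s) for n, s in zip(names, sels)]
trips_single = server.round_trips

# Read the same selections again with a single batched request.
server.round_trips = 0
batched = server.read_multi(names, sels)
trips_multi = server.round_trips

print(trips_single, trips_multi)
```

Both approaches return the same data; only the request count differs. In a real client-server setup, the actual gain depends on network latency and on how the server handles the batched request, which is what the benchmarks in the video measure.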
Call the Doctor is a series of weekly, unscripted, live events! The HDF Group’s staff members will answer attendee questions and, for example, go over the previous week’s HDF Forum posts. The HDF Clinics are free sessions intended to help users tackle real-world HDF problems, from a common cold to severe headaches, and to offer relief where that’s possible. As time permits, we will include how-tos, offer advice on tool usage, review your code samples, teach you survival in the documentation jungle, and discuss what’s new or just around the corner in the land of HDF.
Join us every Tuesday at 12:20 p.m. Central Time (US/Canada) on Zoom!