IN21D-07:
The Hierarchical Data Format (HDF): A Foundation for Sustainable Data and Software

Tuesday, 16 December 2014: 9:30 AM
Michael J Folk and Ted Habermann, HDF Group, Champaign, IL, United States
Abstract:
Many groups have estimated that scientists spend more than 50% of their time trying to figure out how to use and understand data they have already discovered and downloaded. Standard metadata and data formats play a critical role in simplifying this process and freeing significant time for analysis and science.

When combined with community-specific conventions and application-specific semantics, the Hierarchical Data Format (HDF) and its software provide a robust and reliable foundation that many scientific communities rely on for interoperability and high-performance storage of metadata and data.

HDF is currently used at national science facilities all over the world to standardize data access and share data across observational communities with demanding data and computing requirements. In many of these situations, HDF is the standard format underlying community-standard formats with other names. In addition, HDF is supported or used transparently by many open-source and commercial data management and analysis tools, and it forms the basis for building sustainable archives in many communities.

The HDF foundation addresses many technical obstacles related to sustainable data and software; overcoming them is a critical step towards data reuse and sustainable science. The next step is identifying individuals and organizations that are already using HDF successfully and bringing them together to share experiences and best practices. This presentation is a call to action to take that next step.