HashDist: Reproducible, Relocatable, Customizable, Cross-Platform Software Stacks for Open Hydrological Science

Friday, 19 December 2014
Aron Jamil Ahmadia and Christopher E Kees, Coastal and Hydraulics Laboratory, Vicksburg, MS, United States
Developing scientific software is a continuous balance between not reinventing the wheel and getting fragile codes to interoperate with one another. Binary software distributions such as Anaconda provide a robust starting point for many scientific software packages, but this solution alone is insufficient for many scientific software developers. HashDist provides a critical component of the development workflow, enabling highly customizable, source-driven, and reproducible builds for scientific software stacks, available from both the IPython Notebook and the command line.

To address these issues, the Coastal and Hydraulics Laboratory at the US Army Engineer Research and Development Center has funded the development of HashDist in collaboration with Simula Research Laboratories and the University of Texas at Austin. HashDist is motivated by a functional approach to package build management, and features intelligent caching of sources and builds, parametrized build specifications, and the ability to interoperate with system compilers and packages. HashDist enables the easy specification of "software stacks", which allow both the novice user to install a default environment and the advanced user to configure every aspect of their build in a modular fashion. As an advanced feature, HashDist builds can be made relocatable, allowing the easy redistribution of binaries on all three major operating systems as well as cloud, and supercomputing platforms. As a final benefit, all HashDist builds are reproducible, with a build hash specifying exactly how each component of the software stack was installed.

This talk discusses the role of HashDist in the hydrological sciences, including its use by the Coastal and Hydraulics Laboratory in the development and deployment of the Proteus Toolkit as well as the Rapid Operational Access and Maneuver Support project. We demonstrate HashDist in action, and show how it can effectively support development, deployment, teaching, and reproducibility for scientists working in the hydrological sciences.

The HashDist documentation is available from: http://hashdist.readthedocs.org/en/latest/ HashDist is currently hosted at: https://github.com/hashdist/hashdist