Sustaining an Online, Shared Community Resource for Models, Robust Open Source Software Tools and Data for Volcanology – the Vhub Experience
Abstract: Over the last 5 years we have created a community collaboratory, Vhub.org [Palma et al., J. App. Volc. 3:2 doi:10.1186/2191-5040-3-2], as a place to find volcanology-related resources, a venue for users to disseminate tools, teaching resources and data, and an online platform to support collaborative efforts. As the community (current active users > 6000 from an estimated community of comparable size) has embedded the collaboratory's tools into educational and research workflows, it has become imperative to:
a) redesign tools into robust, reusable open source software for online and offline use and enhancement;
b) share large datasets seamlessly and securely with remote collaborators and other users; and
c) support complex workflows for uncertainty analysis, validation and verification, and data assimilation with large datasets.
The focus of tool development and redevelopment has been twofold: first, to apply best practices in software engineering and exploit newer hardware such as multi-core processors and graphics processing units; second, to enhance capabilities that support inverse modeling, uncertainty quantification using large ensembles and design of experiments, calibration, and validation. The software engineering practices we follow include open source development (to facilitate community contributions), modularity, and reusability. Our initial targets are four popular tools on Vhub: TITAN2D, TEPHRA2, PUFF and LAVA. Using tools like these requires many observation-driven datasets, e.g. digital elevation models of topography, satellite imagery, and field observations of deposits. These data are often maintained in private repositories and shared privately by “sneaker-net”. As a partial solution, we tested mechanisms based on the iRODS software for online sharing of private data with public metadata and access limits. Finally, we adapted workflow engines (e.g. Pegasus) to support the complex data and computing workflows needed for applications such as uncertainty quantification for hazard analysis using physical models.