IN21E-01
Personalized, Shareable Geoscience Dataspaces For Simplifying Data Management and Improving Reproducibility

Tuesday, 15 December 2015: 08:00
2020 (Moscone West)
Tanu Malik, University of Chicago, Chicago, IL, United States
Abstract:
Research activities are iterative, collaborative, and now data- and compute-intensive. Such research activities mean that even the many researchers who work in small laboratories must often create, acquire, manage, and manipulate much diverse data and keep track of complex software. They face difficult data and software management challenges, and data sharing and reproducibility are neglected. There is signficant federal investment in powerful cyberinfrastructure, in part to lesson the burden associated with modern data- and compute-intensive research. Similarly, geoscience communities are establishing research repositories to facilitate data preservation. Yet we observe a large fraction of the geoscience community continues to struggle with data and software management. The reason, studies suggest, is not lack of awareness but rather that tools do not adequately support time-consuming data life cycle activities.

Through NSF/EarthCube-funded GeoDataspace project, we are building personalized, shareable dataspaces that help scientists connect their individual or research group efforts with the community at large. The dataspaces provide a light-weight multiplatform research data management system with tools for recording research activities in what we call geounits, so that a geoscientist can at any time snapshot and preserve, both for their own use and to share with the community, all data and code required to understand and reproduce a study. A software-as-a-service (SaaS) deployment model enhances usability of core components, and integration with widely used software systems. In this talk we will present the open-source GeoDataspace project and demonstrate how it is enabling reproducibility across geoscience domains of hydrology, space science, and modeling toolkits.