IN43D-04
Building Capacity Through Hands-on Computational Internships to Assure Reproducible Results and Implementation of Digital Documentation in the ICERT REU Program

Thursday, 17 December 2015: 14:25
2020 (Moscone West)
Rosalia Gomez and John Gentle, Texas Advanced Computing Center, Austin, TX, United States
Abstract:
Modern data pipelines and computational processes require meticulous methodologies to ensure that source data, algorithms, and results are properly curated, managed, and retained while remaining discoverable, accessible, and reproducible. Given the complexity of the scientific problem domains being researched, combined with the overhead of learning to use advanced computing technologies, it is paramount that the next generation of scientists and researchers learn to embrace best practices.

The Integrative Computational Education and Research Traineeship (ICERT) is a National Science Foundation (NSF) Research Experience for Undergraduates (REU) Site at the Texas Advanced Computing Center (TACC). During Summer 2015, two ICERT interns joined the 3DDY project. 3DDY converts geospatial datasets into file types suited to new modes of interaction, such as natural user interfaces, interactive visualization, and 3D printing. Mentored by TACC researchers for ten weeks, students with no previous background in computational science learned to use scripts to build the first prototype of the 3DDY application and used Wrangler, the newest high-performance computing (HPC) resource at TACC.

Test datasets for quadrangles in central Texas were used to assemble the 3DDY workflow and code. Test files were successfully converted into stereolithography (STL) format, which is suitable for use with 3D printers. Test files and scripts were documented and shared on the Figshare site, while metadata for the 3DDY application was documented using OntoSoft.
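
To illustrate the kind of conversion described above, the following is a minimal sketch of turning a small elevation grid into ASCII STL text. It is not the 3DDY code: the function name `heightmap_to_ascii_stl` and the parameters `cell_size`, `z_scale`, and `name` are hypothetical, and the input is assumed to be a plain rectangular list of elevation values rather than a real quadrangle dataset. The sketch emits only the top surface; a printable model would also need side walls and a base to form a closed solid.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]


def _normal(a: Vertex, b: Vertex, c: Vertex) -> Vertex:
    """Unit normal of triangle (a, b, c) from the cross product of two edges."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # degenerate triangle
    return (nx / length, ny / length, nz / length)


def heightmap_to_ascii_stl(
    grid: Sequence[Sequence[float]],
    cell_size: float = 1.0,   # hypothetical: horizontal spacing between samples
    z_scale: float = 1.0,     # hypothetical: vertical exaggeration factor
    name: str = "terrain",
) -> str:
    """Return ASCII STL text for the top surface of a rectangular elevation grid."""
    rows, cols = len(grid), len(grid[0])
    if rows < 2 or cols < 2:
        raise ValueError("grid must be at least 2 x 2")

    def vertex(r: int, c: int) -> Vertex:
        return (c * cell_size, r * cell_size, grid[r][c] * z_scale)

    lines: List[str] = [f"solid {name}"]
    for r in range(rows - 1):
        for c in range(cols - 1):
            v00, v01 = vertex(r, c), vertex(r, c + 1)
            v10, v11 = vertex(r + 1, c), vertex(r + 1, c + 1)
            # Split each grid cell into two triangles, wound so normals point up.
            for tri in ((v00, v01, v11), (v00, v11, v10)):
                nx, ny, nz = _normal(*tri)
                lines.append(f"  facet normal {nx:.6e} {ny:.6e} {nz:.6e}")
                lines.append("    outer loop")
                for x, y, z in tri:
                    lines.append(f"      vertex {x:.6e} {y:.6e} {z:.6e}")
                lines.append("    endloop")
                lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Tiny made-up 3 x 3 grid, for demonstration only.
    demo = [[0.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 0.0]]
    print(heightmap_to_ascii_stl(demo, cell_size=10.0))
```

Each grid cell yields two triangular facets, so an R by C grid produces 2(R-1)(C-1) facets. ASCII STL is used here because it is easy to inspect by hand; binary STL is more compact for large datasets.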

These efforts validated a straightforward set of workflows to transform geospatial data and established the first prototype version of 3DDY. Adding the data and software management procedures helped students realize a broader set of tangible results (e.g., Figshare entries) and better document their progress and the final state of their work for the research group and community. The procedures also helped students and researchers follow a clear set of formats and fill in necessary details that might otherwise be lost, and they exposed the students to next-generation workflows and practices for digital scholarship and scientific inquiry for converting geospatial data into formats that are easy to reuse.