Community Capacity Building as a vital mechanism for enhancing the growth and efficacy of a sustainable scientific software ecosystem: experiences running a real-time bi-coastal "Open Science for Synthesis" Training Institute for young Earth and Environmental scientists

Tuesday, 16 December 2014
Mark Schildhauer (1), Matthew B. Jones (1), Benjamin Bolker (2), W. Christopher Lenhardt (3), Stephanie E Hampton (4), Ray Idaszak (3), Stacy Rebich Hespanha (5), Stan Ahalt (3) and Laura Christopherson (3); (1) National Center for Ecological Analysis and Synthesis, Santa Barbara, CA, United States, (2) McMaster University, Hamilton, ON, Canada, (3) Renaissance Computing Institute, Chapel Hill, NC, United States, (4) Washington State University, Pullman, WA, United States, (5) University of California Santa Barbara, Santa Barbara, CA, United States
Continuing advances in computational capabilities, access to Big Data, and virtual collaboration technologies are creating exciting new opportunities for accomplishing Earth science research at finer resolutions, with much broader scope, using powerful modeling and analytical approaches that were unachievable just a few years ago. Yet there is a perceptible lag in the research community's ability to capitalize on these new possibilities, because many researchers lack the relevant skill sets, especially with regard to multi-disciplinary and integrative investigations that involve active collaboration.

UC Santa Barbara’s National Center for Ecological Analysis and Synthesis (NCEAS) and the University of North Carolina’s Renaissance Computing Institute (RENCI) were recipients of NSF OCI S2I2 “Conceptualization awards”, charged with helping define the needs of the research community relative to enabling science and education through “sustained software infrastructure”. Over the course of our activities, a consistent request from Earth scientists was for “better training in software that enables more effective, reproducible research.”

This community-based feedback led to the creation of an “Open Science for Synthesis” Institute, an innovative, three-week, bi-coastal training program for early career researchers. We provided a mix of lectures, hands-on exercises, and working group experience on topics including: data discovery and preservation; code creation, management, sharing, and versioning; scientific workflow documentation and reproducibility; statistical and machine modeling techniques; virtual collaboration mechanisms; and methods for communicating scientific results. All technologies and quantitative tools presented were suitable for advancing open, collaborative, and reproducible synthesis research.

In this talk, we will report on the lessons learned from running this ambitious training program, which involved coordinating classrooms between two remote sites and included developing original synthesis research activities as part of the course. We also report on the feedback provided by participants as to the learning approaches and topical issues they found most engaging, and why.