IN52A-01
DREAM: Distributed Resources for the Earth System Grid Federation (ESGF) Advanced Management

Friday, 18 December 2015: 10:20
2020 (Moscone West)
Dean Norman Williams, Lawrence Livermore National Laboratory, Livermore, CA, United States and Earth System Grid Federation
Abstract:
The data associated with climate research is often generated, accessed, stored, and analyzed on a mix of unique platforms. The volume, variety, velocity, and veracity of this data creates unique challenges as climate research attempts to move beyond stand-alone platforms to a system that truly integrates dispersed resources. Today, sharing data across multiple facilities is often a challenge due to the large variance in supporting infrastructures. This results in data being accessed and downloaded many times, which requires significant amounts of resources, places a heavy analytic development burden on the end users, and mismanaged resources.

Working across U.S. federal agencies, international agencies, and multiple worldwide data centers, and spanning seven international network organizations, the Earth System Grid Federation (ESGF) has begun to solve this problem. Its architecture employs a system of geographically distributed peer nodes that are independently administered yet united by common federation protocols and application programming interfaces. However, significant challenges remain, including workflow provenance, modular and flexible deployment, scalability of a diverse set of computational resources, and more.

Expanding on the existing ESGF, the Distributed Resources for the Earth System Grid Federation Advanced Management (DREAM) will ensure that the access, storage, movement, and analysis of the large quantities of data that are processed and produced by diverse science projects can be dynamically distributed with proper resource management. This system will enable data from an infinite number of diverse sources to be organized and accessed from anywhere on any device (including mobile platforms). The approach offers a powerful roadmap for the creation and integration of a unified knowledge base of an entire ecosystem, including its many geophysical, geographical, social, political, agricultural, energy, transportation, and cyber aspects. The resulting aggregation of data combined with analytics services has the potential to generate an informational universe and knowledge system of unprecedented size and value to the scientific community, downstream applications, decision makers, and the public.