Adaptable Information Models in the Global Change Information System

Tuesday, 16 December 2014: 4:13 PM
Brian Duggan1,2, Andrew Buddenberg3, Steve Aulenbach1,2, Robert Wolfe4 and Justin Goldstein1,5, (1)US Global Change Research Program, Washington D.C., DC, United States, (2)University Corporation for Atmospheric Research, Boulder, CO, United States, (3)National Climatic Data Center, Asheville, NC, United States, (4)NASA Goddard Space Flight Center, Washington, DC, United States, (5)NASA Goddard Space Flight Center, Greenbelt, MD, United States
The US Global Change Research Program has sponsored the creation of the Global Change Information System (GCIS) to provide a web-based source of accessible, usable, and timely information about climate and global change for use by scientists, decision makers, and the public. The GCIS played multiple roles during the assembly and release of the Third National Climate Assessment. It provided human and programmable interfaces, relational and semantic representations of information, and discrete identifiers for various types of resources, which could then be manipulated by a distributed team with a wide range of specialties. The GCIS also served as a scalable backend for the web-based version of the report.

In this talk, we discuss the infrastructure decisions made during the design and deployment of the GCIS, as well as ongoing work to adapt to new types of information. Both a constrained relational database and an open-ended triple store are used to ensure data integrity while maintaining fluidity. Using natural primary keys allows identifiers to propagate through both models. Changing identifiers are accommodated through fine-grained auditing and explicit mappings to external lexicons. The system exposes a practical RESTful API whose endpoints are also URIs in an ontology. Both the relational schema and the ontology are malleable, and stability is ensured through test-driven development and continuous integration testing using modern open source techniques. Content is also validated through continuous testing techniques. A high degree of scalability is achieved through caching.
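
As a rough illustration of how a natural primary key might propagate through both models, the sketch below stores a record in a constrained relational table and derives triples whose subject URI is built from the same key. It is not the GCIS implementation: the base URL, table layout, vocabulary paths, and the helper names `resource_uri` and `to_triples` are all assumptions made for this example.

```python
import sqlite3

# Hypothetical base URL for illustration only; not the real GCIS host.
BASE = "https://example.org"


def resource_uri(kind: str, identifier: str) -> str:
    """Build a URI that could serve as both REST endpoint and RDF subject."""
    return f"{BASE}/{kind}/{identifier}"


def to_triples(kind: str, row: sqlite3.Row) -> list:
    """Turn one relational row into (subject, predicate, object) triples."""
    subject = resource_uri(kind, row["identifier"])
    return [
        (subject, f"{BASE}/vocab/{column}", row[column])
        for column in row.keys()
        if column != "identifier"
    ]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA foreign_keys = ON")  # enforce relational constraints

# Assumed schema: the natural key (a readable identifier) is the primary key.
conn.execute(
    "CREATE TABLE report (identifier TEXT PRIMARY KEY, title TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE chapter ("
    " identifier TEXT PRIMARY KEY,"
    " report_identifier TEXT NOT NULL REFERENCES report(identifier),"
    " title TEXT NOT NULL)"
)
conn.execute("INSERT INTO report VALUES ('example-report', 'An Example Report')")

# The relational side rejects a chapter that points at an unknown report.
try:
    conn.execute(
        "INSERT INTO chapter VALUES ('intro', 'missing-report', 'Introduction')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The semantic side reuses the same identifier inside the subject URI.
row = conn.execute("SELECT * FROM report").fetchone()
triples = to_triples("report", row)
```

In this sketch the string `example-report` appears unchanged in the table, in the endpoint path, and in the triple subject, which is the property the abstract attributes to natural keys. The constraints stay on the relational side, while the triples remain free to carry additional predicates.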