PA53A-2235
Automated Atmospheric Composition Dataset-Level Metadata Discovery: Difficulties and Surprises

Friday, 18 December 2015
Poster Hall (Moscone South)
Richard F Strub1, Stefan R Falke2, Steve Kempler1, Ed Fialkowski3, Oleg Goussev4 and Chris Lynnes1, (1)NASA Goddard Space Flight Center, Greenbelt, MD, United States, (2)Northrop Grumman, St Louis, MO, United States, (3)George Mason University Calverton, Calverton, MD, United States, (4)German Aerospace Center DLR Berlin, Berlin, Germany
Abstract:
The Atmospheric Composition Portal (ACP) is an aggregator and curator of information related to remotely sensed atmospheric composition data and analysis. It uses existing tools and technologies and, where needed, enhances those capabilities to provide interoperable access, tools, and contextual guidance for scientists and value-adding organizations using remotely sensed atmospheric composition data. The initial focus is on Essential Climate Variables identified by the Global Climate Observing System: CH4, CO, CO2, NO2, O3, SO2, and aerosols. This poster addresses our efforts in building the ACP Data Table, an interface to help discover and understand remotely sensed data related to atmospheric composition science and applications. We harvested the GCMD, CWIC, and GEOSS metadata catalogs using machine-to-machine technologies such as OpenSearch and Web Services. We also manually investigated the many CEOS data providers' portals and other catalogs where those data might be aggregated. This poster describes the excellence, variety, and challenges we encountered.
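As an illustration of the machine-to-machine access mentioned above, the sketch below fills in an OpenSearch-style URL template, in which `{name}` marks a required parameter and `{name?}` an optional one. The endpoint `catalog.example.org`, its query parameter names, and the helper `fill_template` are assumptions made for illustration only; they are not the actual GCMD, CWIC, or GEOSS interfaces, and this is not necessarily how the ACP harvester was built.

```python
import re
from urllib.parse import quote


def fill_template(template, params):
    """Substitute OpenSearch-style {name} and {name?} placeholders.

    Required placeholders raise KeyError when no value is supplied;
    optional ones (trailing '?') are replaced by an empty string.
    """
    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in params:
            # Percent-encode the value so it is safe inside a query string.
            return quote(str(params[name]), safe="")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)


# Hypothetical template, of the kind a catalog's OpenSearch
# description document might advertise.
TEMPLATE = (
    "https://catalog.example.org/opensearch"
    "?q={searchTerms}&start={startIndex?}&count={count?}"
)

url = fill_template(TEMPLATE, {"searchTerms": "NO2 tropospheric column", "count": 10})
print(url)
```

A harvester built this way would read the template from each catalog's description document rather than hard-coding it, so the same code can query several catalogs.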
Conclusions:
1. Because of the large amount of data in their catalogs, the most significant benefit the major catalogs provide is their machine-to-machine tools, such as OpenSearch and Web Services, rather than any GUI usability improvements.
2. There is a trend at the large catalogs towards simulating small data provider portals through advanced services.
3. Populating metadata catalogs using ISO 19115 is too complex for users to do consistently; the resulting records are difficult to parse visually or with XML libraries, and too complex for Java XML binders like CASTOR.
4. Searching for IDs first and then retrieving the data (as GCMD and ECHO allow) works better for machine-to-machine operations than returning entire metadata entries at once, which led to timeouts.
5. Metadata harvest and export activities between the major catalogs have led to a significant amount of duplication. (This is currently being addressed.)
6. Most (if not all) Earth science atmospheric composition data providers store a reference to their data at GCMD.
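The ID-first pattern from conclusion 4 can be sketched as a two-phase harvest: page through lightweight ID listings, then fetch each full record with its own small request. The functions `list_ids` and `fetch_record` and the in-memory `_CATALOG` below are hypothetical stand-ins for a catalog's web services, not the GCMD or ECHO APIs.

```python
def harvest(list_ids, fetch_record, page_size=100):
    """Yield full metadata records using an ID-first, two-phase harvest.

    list_ids(offset, limit) returns a page of entry IDs (a cheap request);
    fetch_record(entry_id) returns one full metadata record.
    """
    offset = 0
    while True:
        ids = list_ids(offset, page_size)
        if not ids:
            break  # no more pages
        for entry_id in ids:
            # Each record is a small, independent request, so a slow or
            # failing entry does not stall the whole harvest.
            yield fetch_record(entry_id)
        offset += len(ids)


# In-memory stand-in for a remote catalog (assumption, for illustration).
_CATALOG = {
    f"entry-{i}": {"id": f"entry-{i}", "title": f"Dataset {i}"}
    for i in range(5)
}


def list_ids(offset, limit):
    return sorted(_CATALOG)[offset:offset + limit]


def fetch_record(entry_id):
    return _CATALOG[entry_id]


records = list(harvest(list_ids, fetch_record, page_size=2))
print(len(records), records[0]["title"])
```

In a real harvester, `fetch_record` is where a per-request timeout and retry would go, which is what makes this pattern more robust than requesting every full entry in a single response.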