IN51C-05
A Look Under the Hood: How the JPL Tropical Cyclone Information System Uses Database Technologies to Present Big Data to Users

Friday, 18 December 2015: 08:58
2020 (Moscone West)
Brian Knosp, Michael Gangl, Svetla M Hristova-Veleva, Richard M Kim, Peggy Li, Joseph Turk and Quoc A Vu, Jet Propulsion Laboratory, Pasadena, CA, United States
Abstract:
The JPL Tropical Cyclone Information System (TCIS) brings together satellite, aircraft, and model forecast data from several NASA, NOAA, and other data centers to assist researchers in comparing and analyzing data and model forecasts related to tropical cyclones. Since 2010, the TCIS has run a near-real-time (NRT) data portal during each North Atlantic hurricane season, which typically runs from June through October.

Data collected by the TCIS varies by type, format, contents, and frequency and is served to the user in two ways: (1) as image overlays on a virtual globe and (2) as derived output from a suite of analysis tools. In order to support these two functions, the data must be collected and then made searchable by criteria such as date, mission, product, pressure level, and geospatial region. Creating a database architecture that is flexible enough to manage, intelligently interrogate, and ultimately present this disparate data to the user in a meaningful way has been the primary challenge.

The database solution for the TCIS is a hybrid MySQL + Solr implementation. We also tested other relational database and NoSQL solutions, such as PostgreSQL and MongoDB, respectively; of those tested, the hybrid has given the TCIS the best query speed and result reliability. It also supports the challenging, memory-intensive geospatial queries needed by the analysis tools that users have requested. Though MySQL and Solr are hardly new technologies on their own, our implementation had to be customized and tuned to accurately store, index, and search the TCIS data holdings.
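
As a rough illustration of how search criteria such as date, mission, product, pressure level, and geospatial region could map onto a Solr request, the sketch below builds filter-query (fq) parameters for Solr's standard select handler. It is not the TCIS implementation: every field name (obs_time, mission, product, pressure_hpa, footprint) and every example value is hypothetical, and the bounding-box filter assumes the footprint field uses one of Solr's spatial field types that accept rectangle range queries.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def solr_time(dt):
    """Format a datetime as the UTC ISO-8601 string that Solr date fields expect."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_solr_params(start, end, mission=None, product=None,
                      pressure_hpa=None, bbox=None, rows=50):
    """Build Solr select parameters as (key, value) pairs.

    All field names are hypothetical; a real schema would differ.
    String values are quoted but not escaped, so this sketch assumes
    they contain no Solr special characters.
    """
    # Each criterion becomes its own fq so Solr can cache filters independently.
    filters = [f"obs_time:[{solr_time(start)} TO {solr_time(end)}]"]
    if mission:
        filters.append(f'mission:"{mission}"')
    if product:
        filters.append(f'product:"{product}"')
    if pressure_hpa is not None:
        filters.append(f"pressure_hpa:{pressure_hpa}")
    if bbox:
        # Rectangle range query: lower-left "lat,lon" TO upper-right "lat,lon".
        south, west, north, east = bbox
        filters.append(f"footprint:[{south},{west} TO {north},{east}]")
    return [("q", "*:*"), ("rows", str(rows))] + [("fq", f) for f in filters]


# Hypothetical search: one month of one product at 850 hPa over part of the Atlantic.
params = build_solr_params(
    start=datetime(2015, 8, 1, tzinfo=timezone.utc),
    end=datetime(2015, 8, 31, 23, 59, 59, tzinfo=timezone.utc),
    mission="example_mission",
    product="example_product",
    pressure_hpa=850,
    bbox=(10.0, -80.0, 35.0, -40.0),
)
query_string = urlencode(params)  # append to a Solr select URL
print(query_string)
```

Keeping each criterion in a separate fq, rather than folding everything into q, is one common choice for this kind of faceted search, since frequently repeated filters (a mission or a pressure level, for example) can then be reused across queries.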

In this presentation, we will discuss how we arrived at our MySQL + Solr database architecture, why it offers us the most consistently fast and reliable results, and how it supports our front end so that we can offer users a look into our “big data” holdings.