Application of Open Source Technologies for Oceanographic Data Analysis

Friday, 18 December 2015
Poster Hall (Moscone South)
Michael Gangl1, Thomas Huang1, Nga T Quach1, Brian D Wilson2, George Chang1, Edward M Armstrong1 and Toshio Michael Chin2, (1)NASA Jet Propulsion Laboratory, Pasadena, CA, United States, (2)Jet Propulsion Laboratory, Pasadena, CA, United States
NEXUS is a data-intensive analysis solution developed with a new approach for handling science data that enables large-scale data analysis by leveraging open source technologies such as Apache Cassandra, Apache Spark, Apache Solr, and Webification. NEXUS has been selected to provide on-the-fly time-series and histogram generation for the Soil Moisture Active Passive (SMAP) mission for Level 2 and Level 3 Active, Passive, and Active Passive products. It also provides an on-the-fly data subsetting capability. NEXUS is designed to scale horizontally, enabling it to handle massive amounts of data in parallel. It takes a new approach on managing time and geo-referenced array data by dividing data artifacts into chunks and stores them in an industry-standard, horizontally scaled NoSQL database. This approach enables the development of scalable data analysis services that can infuse and leverage the elastic computing infrastructure of the Cloud. It is equipped with a high-performance geospatial and indexed data search solution, coupled with a high-performance data Webification solution free from file I/O bottlenecks, as well as a high-performance, in-memory data analysis engine. In this talk, we will focus on the recently funded AIST 2014 project by using NEXUS as the core for oceanographic anomaly detection service and web portal. We call it, OceanXtremes