IN51B-1803
Parallel and Scalable Big Data Analysis in the Earth Sciences with JuML

Friday, 18 December 2015
Poster Hall (Moscone South)
Markus Goetz, Organization Not Listed, Washington, DC, United States
Abstract:
The significantly increasing number of sensors with better resolutions across a wide variety of earth observation projects continuously contributes to the availability of ‘big data’ in the earth sciences. Not only do the volume, velocity, and variety of these datasets pose increasing challenges for their analysis, but their complexity (e.g. the high number of dimensions in hyperspectral images) also requires algorithms that are able to scale. This contribution will provide insights into the Juelich Machine learning Library (JuML) and its contents, which have been actively used in several scientific use cases in the earth sciences. We discuss and categorize challenges related to ‘big data’ analysis and outline parallel algorithmic solutions driven by those use cases.