IN31A-1757
Climatic response variability and machine learning: development of a modular technology framework for predicting bio-climatic change in Pacific Northwest ecosystems
Wednesday, 16 December 2015
Poster Hall (Moscone South)
Abstract:
The creation and use of large amounts of data in scientific investigations has become common practice. Data collection and analysis for large scientific computing efforts are not only increasing in volume and number; the methods and analysis procedures are also evolving toward greater complexity (Bell, 2009; Clarke, 2009; Maimon, 2010). In addition, the growth of diverse data-intensive scientific computing efforts (Soni, 2011; Turner, 2014; Wu, 2008) has demonstrated the value of supporting scientific data integration. Efforts to bridge these perspectives have been attempted, in varying degrees, with modular scientific computing analysis regimes implemented with a modest amount of success (Perez, 2009). This constellation of effects, namely 1) growth in the volume and amount of data, 2) a growing data-intensive science base with challenging needs, and 3) disparate data organization and integration efforts, has created a critical gap: systems of scientific data organization and management typically do not effectively enable integrated data collaboration or data-intensive science-based communications. Our research attempts to address this gap by developing a modular technology framework for data science integration efforts, with climate variation as the focus. The intention is that this model, if successful, could be generalized to other application areas.
Our research aim focused on the design and implementation of a modular, deployable technology architecture for data integration. Developed using aspects of R, interactive Python, SciDB, THREDDS, JavaScript, and varied data mining and machine learning techniques, the Modular Data Response Framework (MDRF) was implemented to explore case scenarios for bio-climatic variation as they relate to Pacific Northwest ecosystem regions.
Our preliminary results, using historical NetCDF climate data for calibration across the inland Pacific Northwest region (Abatzoglou and Brown, 2011), show clear ecosystem shifts over a ten-year period (2001-2011), based on multiple supervised classifier methods for bioclimatic indicators.
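To make the classification step concrete, the following is a minimal sketch of how a supervised classifier might flag grid cells whose bioclimatic class changes between two periods. The abstract does not name the classifiers used, so a simple nearest-centroid rule stands in here as an assumption; the function names, class labels, cell ids, and indicator values are all hypothetical and chosen only for illustration, not taken from the MDRF or its data.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Return {label: centroid} computed from labeled indicator vectors."""
    groups = defaultdict(list)
    for vec, label in zip(samples, labels):
        groups[label].append(vec)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*vecs))
        for label, vecs in groups.items()
    }


def predict(centroids, vec):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


def shifted_cells(centroids, early, late):
    """Return sorted ids of cells whose predicted class differs between periods."""
    return sorted(
        cell
        for cell in early
        if cell in late
        and predict(centroids, early[cell]) != predict(centroids, late[cell])
    )


# Hypothetical training data: (mean annual temperature in deg C,
# annual precipitation in mm). Values are invented for illustration.
# A real workflow would standardize the indicators first, since
# precipitation dominates the raw Euclidean distance here.
train_x = [(4.0, 900.0), (5.0, 1000.0), (10.0, 300.0), (11.0, 250.0)]
train_y = ["forest", "forest", "steppe", "steppe"]
centroids = fit_centroids(train_x, train_y)

# Hypothetical per-cell indicators for an early and a late period.
early = {"cell_a": (4.5, 950.0), "cell_b": (9.0, 400.0), "cell_c": (6.0, 700.0)}
late = {"cell_a": (4.8, 930.0), "cell_b": (9.5, 350.0), "cell_c": (9.0, 500.0)}

print(shifted_cells(centroids, early, late))
```

In practice the per-cell indicators would be derived from the gridded NetCDF climate data rather than typed in by hand, and several classifiers could be compared on the same inputs.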