IN52A-03
A Cloud Computing Workflow for Scalable Integration of Remote Sensing and Social Media Data in Urban Studies

Friday, 18 December 2015: 10:48
2020 (Moscone West)
Aiman S Soliman, National Center for Supercomputing Applications, CyberGIS Center for Advanced Digital and Spatial Studies, Urbana, IL, United States
Abstract:
Urban ecosystems are unique earth environments because both their physical and social components contribute to the overall dynamics of the system. To date, remote sensing data (e.g. optical and LiDAR) have allowed researchers to monitor the development of impervious surfaces; however, they have not been adequate for detecting the associated social dynamics. Geo-located social media (e.g. Twitter) provides a data source for detecting population dynamics and understanding how people interact with their physical environment. However, integrating social media with remote sensing data has been hindered by large data volumes and by the lack of models for integrating remote sensing products with unstructured social media data. In this research, we leveraged the NSF Chameleon cloud computing platform to provide the virtual clusters and elastic auto-scaling of resources needed for the synthesis of land use and geo-located Twitter data. In this context, data synthesis was used to address research questions related to population dynamics in major metropolitan areas. We provide an overview of a cloud computing workflow composed of a set of coupled, scalable synthesis modules for: a) data preprocessing, which includes storage and query of heterogeneous data streams; b) spatial data integration, which matches geo-located Twitter data with user-defined land use maps based on a conceptual model of human mobility; and c) visualization of urban mobility patterns. Our results demonstrate the flexibility that cloud computing offers for connecting data, synthesis methods and computing resources, which would otherwise be very difficult for scientists without specialized computing training to set up and control. Furthermore, we demonstrate the capabilities of the CyberGIS-based workflow using a case study that compares commuting distances across major US cities from 2013 through the present.
We demonstrate how our workflow will support discoveries in urban ecological studies and help link the human and physical dimensions of environmental research.
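
The spatial integration step and the commuting-distance case study can be illustrated with a minimal sketch. The abstract does not specify the conceptual mobility model, so the following is an assumption made purely for illustration: a user's home is taken to be the residential grid cell they tweet from most often, their workplace the most frequent commercial cell, and the commute the great-circle distance between the two cell centers. The grid origin, cell size, class labels and the function names `cell_of`, `cell_center`, `haversine_km` and `commute_km` are all hypothetical and are not taken from the workflow itself.

```python
import math
from collections import Counter

# Hypothetical land use grid: square cells of CELL_DEG degrees,
# indexed from an assumed south-west origin.
CELL_DEG = 0.01
ORIGIN_LAT, ORIGIN_LON = 41.80, -87.70


def cell_of(lat, lon):
    """Return the (row, col) index of the grid cell containing a point."""
    row = math.floor((lat - ORIGIN_LAT) / CELL_DEG)
    col = math.floor((lon - ORIGIN_LON) / CELL_DEG)
    return (row, col)


def cell_center(cell):
    """Return the (lat, lon) of the center of a grid cell."""
    row, col = cell
    return (ORIGIN_LAT + (row + 0.5) * CELL_DEG,
            ORIGIN_LON + (col + 0.5) * CELL_DEG)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    r = 6371.0088  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def commute_km(tweets, landuse):
    """Estimate one user's commuting distance.

    tweets:  list of (lat, lon) tuples for a single user.
    landuse: dict mapping (row, col) cells to a class label.
    Returns None when no home or work cell can be inferred.
    """
    counts = {"residential": Counter(), "commercial": Counter()}
    for lat, lon in tweets:
        cell = cell_of(lat, lon)
        label = landuse.get(cell)
        if label in counts:
            counts[label][cell] += 1
    if not counts["residential"] or not counts["commercial"]:
        return None
    home = counts["residential"].most_common(1)[0][0]
    work = counts["commercial"].most_common(1)[0][0]
    return haversine_km(*cell_center(home), *cell_center(work))


# Toy inputs: two labeled cells and a handful of tweets from one user.
landuse = {(0, 0): "residential", (5, 5): "commercial"}
tweets = [(41.805, -87.695), (41.806, -87.694), (41.855, -87.645)]
distance = commute_km(tweets, landuse)
```

In a setting like the one described, a per-user function of this kind could be applied independently to each user's tweets, which is what makes the matching step straightforward to distribute across an auto-scaled cluster; the home/work heuristic itself would need to be replaced by whatever mobility model a study actually adopts.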