NH43B-1884
Intensity Maps Production Using Real-Time Joint Streaming Data Processing From Social and Physical Sensors
Abstract:
Intensity is one of the most useful measures of earthquake hazard, as it quantifies the strength of shaking produced at a given distance from the epicenter. Today, there are several data sources that could be used to determine intensity level which can be divided into two main categories. The first category is represented by social data sources, in which the intensity values are collected by interviewing people who experienced the earthquake-induced shaking. In this case, specially developed questionnaires can be used in addition to personal observations published on social networks such as Twitter. These observations are assigned to the appropriate intensity level by correlating specific details and descriptions to the Modified Mercalli Scale. The second category of data sources is represented by observations from different physical sensors installed with the specific purpose of obtaining an instrumentally-derived intensity level. These are usually based on a regression of recorded peak acceleration and/or velocity amplitudes. This approach relates the recorded ground motions to the expected felt and damage distribution through empirical relationships.The goal of this work is to implement and evaluate streaming data processing separately and jointly from both social and physical sensors in order to produce near real-time intensity maps and compare and analyze their quality and evolution through 10-minute time intervals immediately following an earthquake. Results are shown for the case study of the M6.0 2014 South Napa, CA earthquake that occurred on August 24, 2014. The using of innovative streaming and pipelining computing paradigms through IBM InfoSphere Streams platform made it possible to read input data in real-time for low-latency computing of combined intensity level and production of combined intensity maps in near-real time. The results compare three types of intensity maps created based on physical, social and combined data sources. Here we correlate the count and density of Tweets with intensity level and show the importance of processing combined data sources at the earliest time stages after earthquake happens. This method can supplement existing approaches of intensity level detection, especially in the regions with high number of Twitter users and low density of seismic networks.