Determination of Trends in Ozone in the Mid-Atlantic Using Non-Negative Matrix Factorization

Tuesday, 16 December 2014
Ashley Russell (1), Pengyu Xiao (2), Steven G Brown (1) and Laura Balzano (2); (1) Sonoma Technology, Inc., Petaluma, CA, United States; (2) University of Michigan, Ann Arbor, MI, United States
Air pollution data are routinely collected at high time resolution at many sites in the United States, but such data are often assessed one site at a time or in small jurisdictional groups rather than on a large-scale, regional basis. Examining air pollution data, such as ambient ozone measurements, in a regional context may be advantageous because air pollution is influenced by a combination of micro-scale, local, and regional sources.

Non-negative matrix factorization (NMF) algorithms have been widely used by the environmental research community to identify factors governing pollutant concentrations. NMF can also be useful for identifying and interpreting outlier data, particularly for large data sets.
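As a minimal illustration of the factorization itself (not any of the specific algorithms compared in this work), the sketch below approximates a non-negative matrix X by a product W H using the standard Lee-Seung multiplicative updates for the Frobenius-norm objective. The function name nmf_multiplicative, the rank, the iteration count, and the synthetic matrix are assumptions chosen only for illustration; no real monitoring data are used.

```python
import numpy as np


def nmf_multiplicative(X, rank, n_iter=1000, eps=1e-9, seed=0):
    """Approximate non-negative X (m x n) as W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates, which keep both factors
    non-negative and do not increase the squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + eps   # hypothetical "site loadings"
    H = rng.random((rank, n)) + eps   # hypothetical "temporal factors"
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for a sites-by-hours matrix (assumed sizes, not real data).
rng = np.random.default_rng(1)
true_W = rng.random((20, 2))
true_H = rng.random((2, 48))
X = true_W @ true_H

W, H = nmf_multiplicative(X, rank=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this framing, each row of H would be read as an underlying temporal pattern and each row of W as how strongly a site expresses those patterns; the interpretation of real factors depends on the data and the algorithm used.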

We applied NMF algorithms to ozone data collected at over 100 monitoring sites in the Mid-Atlantic states during the summer of 2013 to examine their utility for identifying outlier data and outlier monitoring sites in the ozone monitoring network. We compared results from five NMF algorithms, each with different strengths (such as robustness to missing data or to outliers), to assess differences in their ability to identify outliers and to determine the underlying factors influencing ambient ozone concentrations. In the future, these NMF methods can be applied to any large data matrix, such as those from networks of small, low-cost air pollution sensors and large-scale environmental monitoring networks.
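One simple way such outlier screening could work is sketched below, under stated assumptions: a masked (weighted) multiplicative-update NMF is fit only to the observed entries, and observations whose residual from the low-rank fit is large relative to a robust scale (median absolute deviation) are flagged. This is not one of the five algorithms compared here. The function names masked_nmf and flag_outliers, the threshold k, the rank, and the synthetic sites-by-hours matrix with an injected spike are all hypothetical.

```python
import numpy as np


def masked_nmf(X, mask, rank, n_iter=1000, eps=1e-9, seed=0):
    """Fit W @ H to the observed entries of X only (mask == True)."""
    rng = np.random.default_rng(seed)
    M = mask.astype(float)
    X0 = np.where(mask, X, 0.0)           # zero-fill missing values
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X0) / (W.T @ (M * (W @ H)) + eps)
        W *= (X0 @ H.T) / ((M * (W @ H)) @ H.T + eps)
    return W, H


def flag_outliers(X, mask, W, H, k=5.0):
    """Flag observed entries whose residual exceeds k robust standard deviations."""
    resid = np.where(mask, np.where(mask, X, 0.0) - W @ H, np.nan)
    center = np.nanmedian(resid)
    mad = np.nanmedian(np.abs(resid - center))
    dev = np.abs(np.where(mask, resid - center, 0.0))
    return dev > k * 1.4826 * mad, dev


# Synthetic data (assumed): 40 sites x 120 hours, rank 2, small noise.
rng = np.random.default_rng(2)
X = rng.random((40, 2)) @ rng.random((2, 120))
X = np.clip(X + rng.normal(0.0, 0.02, X.shape), 0.0, None)
X[5, 10] += 3.0                            # injected spike at site 5, hour 10

missing = rng.random(X.shape) < 0.05       # about 5% missing observations
missing[5, 10] = False                     # keep the spike observed
X[missing] = np.nan
mask = ~missing

W, H = masked_nmf(X, mask, rank=2)
flagged, dev = flag_outliers(X, mask, W, H)
site_counts = flagged.sum(axis=1)          # flagged observations per site
print("spike flagged:", bool(flagged[5, 10]))
```

Counting flagged observations per site (site_counts above) is one possible way to move from outlier data points to candidate outlier sites; the threshold k trades sensitivity against false alarms and would need tuning on real data.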