A41A-0031
Assessing the use of Machine Learning Algorithms for Predicting Electron Impact Mass Spectra

Thursday, 17 December 2015
Poster Hall (Moscone South)
David O Topping, University of Manchester, Manchester, M13, United Kingdom
Abstract:
The importance of atmospheric aerosol particles is clear. Numerical models of atmospheric aerosol are built around our best understanding of aerosol formation, with mechanistic models attempting to account for the movement of compounds between the gaseous and condensed phases at a molecular level. How do we know that our predictions of aerosol composition are accurate, that the instruments used are sensitive enough, or that our knowledge of key processes is adequate to enable accurate predictions? Despite a wealth of data from instruments such as the aerosol mass spectrometer (AMS), there are no direct methods for comparing model output with measurements to tell us how sensitive the results are to the key processes or components included in, or omitted from, these models.

In short, can we take detailed speciated output from mechanistic models and predict the resultant electron impact (EI) mass spectra? To answer this, we must first assess the ability of algorithms to replicate a library of mass spectra from a range of compounds. This is the focus of a proof-of-concept study using a suite of supervised learning algorithms applied to output from the AMS. Supervised learning is not a single technique but a family of different algorithms, and the performance of any given method is sensitive to the 'signature', or molecular representation, used as a basis for the training. By employing representations ranging from generic fingerprints used in clustering procedures to the fragmentations used in property predictive techniques, we probe the ability of a number of supervised learning methods to replicate measured EI mass spectra from a library of AMS measurements. Sensitivity to both under- and overfitting is assessed, and predictions based on gas phase degradation mechanisms are presented to illustrate the potential for refining future mechanisms.
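
To make the idea of learning a mapping from molecular fingerprint to mass spectrum concrete, the following is a minimal sketch and not the method used in the study. It swaps in a similarity-weighted k-nearest-neighbour regressor as one simple example of a supervised learner. The binary fingerprints, the four-channel spectra and all function names are invented for illustration; a real application would use fingerprints generated from molecular structures and measured AMS spectra.

```python
from math import sqrt


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def normalise(spectrum):
    """Scale a spectrum so that its channel intensities sum to one."""
    total = sum(spectrum)
    return [v / total for v in spectrum] if total else list(spectrum)


def predict_spectrum(query_fp, train_fps, train_spectra, k=2):
    """Predict a spectrum as the similarity-weighted mean of the k most
    similar training compounds (hypothetical helper, for illustration)."""
    scored = sorted(
        ((tanimoto(query_fp, fp), spec)
         for fp, spec in zip(train_fps, train_spectra)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weights = [s for s, _ in scored]
    if sum(weights) == 0:
        # No shared fingerprint bits: fall back to an unweighted mean.
        weights = [1.0] * len(scored)
    weight_sum = sum(weights)
    n_channels = len(train_spectra[0])
    return [
        sum(w * spec[i] for w, (_, spec) in zip(weights, scored)) / weight_sum
        for i in range(n_channels)
    ]


def cosine(u, v):
    """Cosine similarity, a common score for comparing two mass spectra."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy training library (invented values): six-bit fingerprints and
# four-channel spectra normalised to unit total intensity.
train_fps = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]
train_spectra = [
    normalise([8, 1, 1, 0]),
    normalise([7, 2, 1, 0]),
    normalise([0, 1, 2, 7]),
    normalise([0, 1, 1, 8]),
]

# A compound outside the library, sharing most bits with the first two.
query_fp = [1, 1, 0, 0, 0, 1]
predicted = predict_spectrum(query_fp, train_fps, train_spectra, k=2)
score = cosine(predicted, train_spectra[0])
```

In this sketch the choice of k acts as a crude control on fitting: k=1 simply reproduces the nearest library spectrum, which risks overfitting, while a large k averages over dissimilar compounds, which risks underfitting. Changing the fingerprint changes which compounds count as neighbours, which is one way the sensitivity to molecular representation shows up.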