A Semi-Automated Machine Learning Algorithm for Tree Cover Delineation from 1-m NAIP Imagery Using a High Performance Computing Architecture

Tuesday, 16 December 2014
Saikat Basu1, Sangram Ganguly2, Ramakrishna R Nemani2, Supratik Mukhopadhyay1, Cristina Milesi3, Petr Votava4, Andrew Michaelis5, Gong Zhang2, Bruce D Cook6, Sassan S Saatchi7 and Edward Boyda8, (1)Louisiana State University, Computer Science, Baton Rouge, LA, United States, (2)NASA Ames Research Center, Moffett Field, CA, United States, (3)NASA-CSUMB, Sunnyvale, CA, United States, (4)California State University Monterey Bay, Seaside, CA, United States, (5)University Corporation at Monterey Bay, Seaside, CA, United States, (6)NASA Goddard Space Flight Center, Greenbelt, MD, United States, (7)NASA Jet Propulsion Laboratory, Pasadena, CA, United States, (8)Bay Area Environmental Research, Sonoma, CA, United States
Accurate tree cover delineation is useful for deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) satellite imagery. Numerous algorithms have been designed to delineate tree cover in coarse- to high-resolution satellite imagery, but most of them do not scale to the terabytes of data typical of VHR datasets. In this paper, we present an automated probabilistic framework for the segmentation and classification of 1-m VHR data obtained from the National Agriculture Imagery Program (NAIP), with the aim of deriving tree cover estimates for the entire continental United States using a High Performance Computing architecture. The results from the classification and segmentation algorithms are consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on a Conditional Random Field (CRF), which helps capture the higher-order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and retrained by incorporating expert knowledge through the relabeling of misclassified image patches. This leads to a significant improvement in true positive rates and a reduction in false positive rates. Tree cover maps were generated for the state of California, which is covered by 11,095 NAIP tiles and spans a total geographical area of 163,696 sq. miles. Our framework produced correct detection rates of around 85% for fragmented forests and 70% for urban tree cover areas, with false positive rates lower than 3% for both regions. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR high-resolution canopy height model show the effectiveness of our algorithm in generating accurate high-resolution tree cover maps.
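
The detection and false positive rates quoted above can be illustrated with a minimal sketch. It assumes a pixel-wise definition (true positive rate = correctly detected tree pixels / all reference tree pixels; false positive rate = non-tree pixels labeled as tree / all reference non-tree pixels), which the abstract does not spell out. The function name `detection_rates` and the toy masks are hypothetical and are not taken from the authors' framework or data.

```python
def detection_rates(predicted, reference):
    """Return (true_positive_rate, false_positive_rate) for two binary masks.

    Both masks are equal-sized 2-D lists of 0/1 values, where 1 marks a
    tree pixel. Assumption: rates are computed pixel by pixel.
    """
    tp = fp = fn = tn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1      # tree pixel correctly detected
            elif p:
                fp += 1      # non-tree pixel labeled as tree
            elif r:
                fn += 1      # tree pixel missed
            else:
                tn += 1      # non-tree pixel correctly rejected
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy 3x4 masks, invented for illustration only.
reference = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
predicted = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]

tpr, fpr = detection_rates(predicted, reference)
print(f"true positive rate: {tpr:.3f}, false positive rate: {fpr:.3f}")
```

In this toy case three of the four reference tree pixels are detected and one of the eight non-tree pixels is mislabeled. In practice the reference mask would come from an independent source such as expert-labeled patches or a canopy height model.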