The data post-processing pipeline for AmeriFlux data products

Friday, 19 December 2014
Deb Agarwal (1), Gilberto Pastorello (2), Cristina Poindexter (3), Dario Papale (4), Carlo Trotta (5), Alessio Ribeca (5), Eleonora Canfora (5), Boris Faybishenko (6) and Taghrid Samak (6): (1) LBNL, Berkeley, CA, United States, (2) Lawrence Berkeley National Lab, Emeryville, CA, United States, (3) University of California Berkeley, Berkeley, CA, United States, (4) Tuscia University, Department for Innovation in Biological, Agro-food and Forest systems (DIBAF), Viterbo, Italy, (5) University of Tuscia, DIBAF, Viterbo, Italy, (6) Lawrence Berkeley National Laboratory, Berkeley, CA, United States
The AmeriFlux network gathers, curates, and publishes data collected by independently managed field sites measuring fluxes of carbon, water, and energy across the Americas. The data are processed into fluxes and quality controlled by individual tower teams and sent to the network for publication. After further data quality control, these data go through a series of post-processing steps to generate derived and value-added data products. In this presentation we describe these steps and discuss our approach to combining them into a consistent and reproducible processing pipeline that is being used to generate a new release of these data products. The first step involves two Ustar threshold calculation approaches, namely the Moving Point Test (MPT) and the Change Point Detection (CPD) approaches. Based on a combination of bootstrapping and these two Ustar threshold calculation methods, an ensemble of Ustar thresholds is generated. The values in this ensemble are all used for Ustar filtering and also to generate an uncertainty estimate. A model efficiency comparison approach is used to select reference values for both the Ustar threshold and the net ecosystem exchange (NEE). The next step is the gapfilling of micro-meteorological variables, using the Marginal Distribution Sampling (MDS) method for shorter gaps and, for longer gaps, downscaled data based on the ERA-Interim data products, harmonized to the data from each site. In the next step, two methods are used to gapfill the NEE and energy fluxes: the first based on the MDS method and the second based on Artificial Neural Networks (ANN). Another step is the partitioning of NEE into ecosystem respiration and gross primary production (GPP). This step currently uses two methods: one based on nighttime data (using a respiration model) and another based on daytime data (using respiration and photosynthesis models).
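The Ustar thresholding and filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline's implementation: `toy_threshold` is a deliberately simplified, hypothetical plateau-style estimator standing in for MPT and CPD, and the function names, bin count, plateau fraction, and synthetic data are all assumptions made for illustration. What the sketch does show is the general pattern of resampling the data with replacement, estimating one threshold per resample, and applying every threshold in the resulting ensemble as a filter.

```python
import random


def toy_threshold(ustar, nee, n_bins=10, plateau_frac=0.95):
    """Toy plateau-style estimator (an assumption, NOT the MPT or CPD algorithm).

    Sorts records by Ustar, splits them into equal-size bins, and returns the
    mean Ustar of the first bin whose mean NEE reaches plateau_frac of the
    mean NEE over all higher bins.
    """
    pairs = sorted(zip(ustar, nee))
    size = len(pairs) // n_bins  # leftover records at the top are ignored
    bins = [pairs[i * size:(i + 1) * size] for i in range(n_bins)]
    for i in range(n_bins - 1):
        here = sum(n for _, n in bins[i]) / size
        higher = [n for b in bins[i + 1:] for _, n in b]
        if here >= plateau_frac * (sum(higher) / len(higher)):
            return sum(u for u, _ in bins[i]) / size
    # No plateau found: fall back to the mean Ustar of the highest bin.
    return sum(u for u, _ in bins[-1]) / size


def bootstrap_thresholds(ustar, nee, n_boot=100, seed=0):
    """Build a sorted ensemble of thresholds by resampling with replacement."""
    rng = random.Random(seed)
    n = len(ustar)
    ensemble = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(
            toy_threshold([ustar[i] for i in idx], [nee[i] for i in idx])
        )
    return sorted(ensemble)


def ustar_filter(ustar, nee, threshold):
    """Mark NEE as missing (None) where Ustar is below the threshold."""
    return [v if u >= threshold else None for u, v in zip(ustar, nee)]


# Synthetic nighttime data (hypothetical): NEE rises with Ustar, then levels off.
rng = random.Random(42)
ustar = [rng.uniform(0.0, 1.0) for _ in range(500)]
nee = [3.0 * min(u / 0.3, 1.0) + rng.gauss(0.0, 0.3) for u in ustar]

thresholds = bootstrap_thresholds(ustar, nee)
# One filtered NEE series per ensemble member.
filtered = [ustar_filter(ustar, nee, t) for t in thresholds]
```

Each entry of `filtered` is one version of the NEE series. Keeping all of them, rather than a single filtered series, is what allows the spread across the ensemble to be carried forward as an uncertainty estimate.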
The final step involves a calculation of uncertainties, the determination of reference data products for some of the variables, the computation of their uncertainty values, and the aggregation to other temporal scales (e.g., monthly and yearly). This work is being developed in close collaboration with European partners from ICOS, and is being used to generate new versions of the global flux data products for FLUXNET.
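The temporal aggregation and ensemble-based uncertainty in the final step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual procedure: the function names, the use of monthly means, and the choice of the 25th, 50th, and 75th percentiles as the reported spread are all hypothetical, and the ensemble is synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta
import statistics


def monthly_means(timestamps, values):
    """Average the non-missing values in each (year, month)."""
    groups = defaultdict(list)
    for t, v in zip(timestamps, values):
        if v is not None:
            groups[(t.year, t.month)].append(v)
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


def ensemble_spread(timestamps, ensemble):
    """Per-month quartiles across ensemble members (needs at least 2 members)."""
    per_member = [monthly_means(timestamps, member) for member in ensemble]
    months = sorted(set().union(*per_member))
    summary = {}
    for month in months:
        vals = [pm[month] for pm in per_member if month in pm]
        q1, q2, q3 = statistics.quantiles(vals, n=4)
        summary[month] = {"p25": q1, "p50": q2, "p75": q3}
    return summary


# Half-hourly timestamps covering January and February 2014 (31 + 28 days).
start = datetime(2014, 1, 1)
timestamps = [start + timedelta(minutes=30 * i) for i in range(48 * 59)]

# Five hypothetical ensemble members, each a constant series for simplicity.
ensemble = [[1.0 + 0.1 * k] * len(timestamps) for k in range(5)]

summary = ensemble_spread(timestamps, ensemble)
```

Aggregating each ensemble member first and summarizing across members afterwards keeps the uncertainty tied to the aggregated quantity itself. The same pattern would apply to yearly aggregation by grouping on `t.year` alone.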