Geoscientific Model Development Discussions | 2021
CLIMFILL: A Framework for Intelligently Gap-filling Earth Observations
Abstract
Earth observations contain many missing values. The abundance and often complex patterns of these gaps can hinder the combination of different observational datasets and may bias estimates. To overcome this, missing values in geoscientific data are regularly infilled with estimates from univariate gap-filling techniques such as spatiotemporal interpolation. However, these techniques mostly ignore valuable information that may be present in other dependent observed variables. Here we propose CLIMFILL, a multivariate gap-filling procedure that builds upon simple interpolation by additionally applying a statistical imputation method designed to account for dependence across variables. In contrast to popular upscaling approaches, CLIMFILL does not need a gap-free gridded donor variable for gap-filling. CLIMFILL is tested using gap-free ERA5 reanalysis data of ground temperature, surface layer soil moisture, precipitation, and terrestrial water storage to represent central interactions between soil moisture and climate. These data were matched with corresponding remote sensing observations and masked where the observations have missing values. CLIMFILL successfully recovers the dependence structure among the variables across all land cover types and altitudes, thereby enabling subsequent mechanistic interpretations. Soil moisture-temperature feedback, which is underestimated in high-latitude regions due to sparse satellite coverage, is adequately represented in the multivariate gap-filling. Univariate performance metrics such as correlation and bias are improved compared to gap-filling by spatiotemporal interpolation for a wide range of missing value fractions and missingness patterns. Estimates of surface layer soil moisture in particular profit from taking into account the multivariate dependence structure of the data.
The framework allows tailoring the gap-filling process to different environmental conditions, domains, or specific use cases and hence can be used as a flexible tool for gap-filling a large range of remote sensing and in situ observations commonly used in climate and environmental research.
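
The two-stage idea described in the abstract (a simple initial fill, followed by a multivariate statistical imputation, evaluated by masking gap-free data) can be illustrated with a minimal sketch. This is not the CLIMFILL implementation. The synthetic four-variable dataset, the random missingness mask, the column-mean initial fill (standing in for spatiotemporal interpolation), the iterative linear regression (standing in for the statistical imputation method, which the abstract does not specify), and all function names are assumptions made for illustration only.

```python
import numpy as np


def initial_fill(data):
    """Stage 1 stand-in (assumption): fill each gap with the column mean.

    In the abstract this stage is a spatiotemporal interpolation; the column
    mean is used here only to keep the sketch short.
    """
    filled = data.copy()
    col_means = np.nanmean(data, axis=0)
    rows, cols = np.where(np.isnan(data))
    filled[rows, cols] = col_means[cols]
    return filled


def multivariate_fill(data, n_iter=10):
    """Stage 2 stand-in (assumption): iterative regression imputation.

    Each variable's gaps are re-estimated from all other variables with
    ordinary least squares, and the sweep is repeated n_iter times.
    """
    missing = np.isnan(data)
    filled = initial_fill(data)
    for _ in range(n_iter):
        for j in range(data.shape[1]):
            gaps = missing[:, j]
            if not gaps.any():
                continue
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(others)), others])
            # Fit only on rows where variable j was actually observed.
            coef, *_ = np.linalg.lstsq(design[~gaps], filled[~gaps, j], rcond=None)
            filled[gaps, j] = design[gaps] @ coef
    return filled


# Hypothetical gap-free "truth": four variables sharing one common driver.
rng = np.random.default_rng(0)
n = 2000
driver = rng.normal(size=n)
truth = np.column_stack([
    driver + 0.3 * rng.normal(size=n),
    -0.8 * driver + 0.3 * rng.normal(size=n),
    0.5 * driver + 0.3 * rng.normal(size=n),
    0.9 * driver + 0.3 * rng.normal(size=n),
])

# Mask 30 % of the values at random (an assumption; the study instead uses
# the missingness patterns of the matching remote sensing observations).
mask = rng.random(truth.shape) < 0.3
gappy = np.where(mask, np.nan, truth)

baseline = initial_fill(gappy)
filled = multivariate_fill(gappy)


def metrics(estimate):
    """Correlation and bias, computed on the masked values only."""
    corr = float(np.corrcoef(estimate[mask], truth[mask])[0, 1])
    bias = float(np.mean(estimate[mask] - truth[mask]))
    return corr, bias


print("initial fill      (corr, bias):", metrics(baseline))
print("multivariate fill (corr, bias):", metrics(filled))
```

Because the truth is known at the masked positions, both fills can be scored on exactly the values that were removed, which mirrors the evaluation design outlined in the abstract. Observed values are never modified; only the gaps are re-estimated.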