Abstract:
Droughts are becoming more frequent in many regions of the world while the need to increase agricultural production persists. Together, these pressures demand more careful regional water resources planning and assessment of irrigation needs and, thus, more precise estimates of real evapotranspiration (ET). In recent years, models based on artificial intelligence have been applied to several water management problems. The main challenges are choosing the best algorithm, the availability of climatic data, and obtaining adequately representative features. This study evaluated six machine learning models in two categories, i.e., point-wise (Multi-Layer Perceptron (MLP), Ensemble of MLPs, Meta-Learning) and probabilistic and uncertainty-aware (Mixture Density Networks, MC Dropout, Deep Ensembles), for accurately estimating daily ET from limited meteorological data across climate regions (from dry continental to Mediterranean) and seasons, using data from Ameriflux and Euroflux towers. Our dataset combines publicly available data from 26 sites covering 2000 to 2018: real ET values (the response variable) obtained from the Ameriflux and Euroflux towers, together with climate and remotely sensed data (land surface temperature (LST), normalized difference vegetation index (NDVI), and albedo) obtained from EEflux.
In this thesis, we incorporate utility-based learning and data oversampling (SMOGN) to improve the recall of our models on extreme (relevant) values of ET. We also experiment with different feature selection techniques and interpretability tools (SHAP and LIME), which show that air temperature, relative humidity, LST, and NDVI are the top contributing features. We designed the study so that agricultural specialists and farmers can choose between a point-wise forecast and a probabilistic forecast with uncertainty estimates. Our best-performing point-wise model is a Reptile meta-learner built on a multi-layer perceptron. This meta-learner achieved an R² of 0.79, an RMSE of 0.90, and a recall of 0.96 on the holdout subset of our entire dataset.