New publication in Remote Sensing of Environment
Abstract
Accurate large-scale crop yield mapping is crucial for understanding how weather and climate variability affect food security. While temporal deep learning models have achieved notable success in extracting time-series features for yield mapping, they often struggle to model spatial dependencies, such as yield spatial autocorrelations and the influence of time-invariant variables (e.g., soil properties and topography). To address this limitation, we propose KGML-Graph, a knowledge-guided graph machine learning framework that integrates spatial learning with temporal structures to explicitly capture these underutilized spatial dependencies. We incorporate knowledge-guided edge weights, derived from historical yield correlations, into the graph structure to identify patterns among counties with similar yield dynamics and enhance model training. The framework was evaluated using data from 627 counties across the U.S. Corn Belt from 2000 to 2020, benchmarking against Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Temporal Convolutional Neural Network (TempCNN). Results show that KGML-Graph outperformed benchmarks in cross-year testing, reducing RMSE by at least 10.8%, and improved R² by at least 9.3% in temporal extrapolation during 2017–2020. Furthermore, it demonstrated superior spatial transferability by maintaining accuracy in unseen regions and significantly reducing spatial autocorrelation in estimation residuals. Under extreme climatic conditions, the model achieved at least a 14.4% improvement in R² on out-of-distribution test data and reduced the mean estimation residual from -0.413 to -0.074 metric tons per hectare, effectively mitigating the systematic yield overestimation observed in baseline models.
Our attribution analyses highlighted the contributions of both graph structure and knowledge-guided edge weights to the improved performance by better capturing spatial patterns and representing key static variables, such as soil organic carbon content. These findings underscore the potential of KGML-Graph as a robust framework for unifying spatial and temporal learning to support accurate, large-scale crop yield mapping across diverse climatic and geographic conditions.
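
As a rough illustration of what edge weights "derived from historical yield correlations" could look like, the sketch below builds a county-by-county weight matrix from the Pearson correlation of historical yield series. It is not the paper's implementation: the function name correlation_edge_weights, the threshold value, the choice of Pearson correlation, and the toy yield numbers are all assumptions made for this example.

```python
import numpy as np


def correlation_edge_weights(yields, threshold=0.5):
    """Build a symmetric edge-weight matrix from historical yields.

    yields: array of shape (n_counties, n_years), one yield series per county.
    threshold: hypothetical cutoff; pairs correlated below it get no edge.
    """
    yields = np.asarray(yields, dtype=float)
    # np.corrcoef treats each row as one variable, so this returns an
    # (n_counties, n_counties) matrix of Pearson correlations.
    corr = np.corrcoef(yields)
    # Keep only sufficiently similar counties; weak or negative pairs get 0.
    weights = np.where(corr >= threshold, corr, 0.0)
    # Remove self-loops so a county is not its own neighbour.
    np.fill_diagonal(weights, 0.0)
    return weights


# Toy data (made up): counties 0 and 1 move together, county 2 moves opposite.
toy_yields = np.array([
    [9.0, 10.0, 11.0, 12.0],
    [8.5, 9.5, 10.5, 11.5],
    [12.0, 11.0, 10.0, 9.0],
])
toy_weights = correlation_edge_weights(toy_yields)
```

In this toy case only counties 0 and 1 end up connected. A graph model could then use such a matrix as weighted adjacency when aggregating information across counties. How the published framework actually thresholds, normalizes, or combines these weights with the graph structure is not stated in the abstract.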