Developing benchmark datasets to facilitate AI-engaged algorithms for ecosystem modeling and climate change mitigation. This research aims to create a comprehensive dataset that integrates observational and simulation data, which is crucial for accurately modeling greenhouse gas (GHG) fluxes and other climate variables. The dataset will draw on remote sensing, weather, soil property, and management data sourced from new and ongoing projects; public databases such as FLUXNET, ERA, CMIP6, and Sentinel-2; and simulations from process-based models such as ecosys, DayCent, DNDC, CLM, and ELM. Simulation data will be key to guiding AI algorithms in data-sparse situations and to decomposing complex processes that are difficult to measure directly. Ongoing efforts include agroecosystem benchmark dataset development and the AI4NM harmonized CH4 dataset project. This work will be foundational for integrating AI into scientific research, bridging computing and the domain sciences.
Broadening participation in computing by developing a user-friendly knowledge-guided machine learning (KGML) interface for agricultural and natural ecosystem modeling. Despite its advantages, KGML remains complex to develop: it requires expertise in both machine learning and the underlying scientific principles, which limits its broader application. This work aims to simplify KGML model development for agricultural and natural ecosystem modeling tasks by creating a user-friendly AI interface, PyKGML. PyKGML enables users to construct KGML models by selecting from a library of state-of-the-art ML models, knowledge-guided architectures, scientific initializations, and constraint-based loss functions, each designed to incorporate domain insights. It supports diverse data sources, including direct measurements (e.g., soil sampling, chamber data, eddy covariance flux measurements, remote sensing images) and synthetic datasets (e.g., simulation data from process-based models), and facilitates training and testing with user-provided data. Current efforts are documented at: https://ai4agriculture.github.io/PyKGML_development
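To make the idea of a constraint-based loss function concrete, the following is a minimal sketch in plain Python. It is not PyKGML code and does not reflect the PyKGML API. The function names (`mse`, `kg_loss`), the `weight` parameter, and the choice of constraint are all assumptions made for illustration. The constraint used here is the carbon flux balance NEE = Reco - GPP, under the sign convention that positive NEE is a net release to the atmosphere. The loss adds a penalty for violating this balance to the ordinary data-fit error.

```python
from typing import Sequence


def mse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean squared error between predictions and observations."""
    if len(pred) != len(obs) or len(pred) == 0:
        raise ValueError("pred and obs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred)


def kg_loss(
    nee_pred: Sequence[float],
    gpp_pred: Sequence[float],
    reco_pred: Sequence[float],
    nee_obs: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Hypothetical knowledge-guided loss: data fit plus a balance penalty.

    Assumed constraint (illustrative only): NEE = Reco - GPP.
    `weight` controls how strongly violations of the balance are penalized.
    """
    if not (len(nee_pred) == len(gpp_pred) == len(reco_pred)):
        raise ValueError("predicted flux sequences must have equal length")

    # Data term: how well predicted NEE matches observed NEE.
    data_term = mse(nee_pred, nee_obs)

    # Knowledge term: mean squared residual of the flux balance.
    residuals = [n - (r - g) for n, g, r in zip(nee_pred, gpp_pred, reco_pred)]
    constraint_term = sum(res ** 2 for res in residuals) / len(residuals)

    return data_term + weight * constraint_term


if __name__ == "__main__":
    # Toy values chosen for illustration; not real flux data.
    loss = kg_loss(
        nee_pred=[1.0, 2.0],
        gpp_pred=[3.0, 3.0],
        reco_pred=[4.0, 5.0],
        nee_obs=[1.0, 2.0],
    )
    print(loss)
```

In a real training setup, the same structure would typically be written with a differentiable tensor library so that the penalty term contributes gradients. The plain-Python form is used here only to keep the sketch self-contained.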