WO2024138084A1

WO2024138084A1 - Localized carbon source intervention modeling

Info

Publication number: WO2024138084A1
Application number: PCT/US2023/085583
Authority: WO
Inventors: Jyoti SHANKAR; Nathan BASCH; Joseph WEEKS
Original assignee: Indigo Ag, Inc.
Priority date: 2022-12-23
Filing date: 2023-12-21
Publication date: 2024-06-27

Abstract

A system and method for determining expected ecosystem attributes for an agronomic region using an emulator system. The system includes a network system configured to access ground-truth agronomic data and remote sensing data for a plurality of agronomic regions, apply a first agronomic model to generate emulator training data for the agronomic regions, train a second agronomic model to generate an emissions baseline for an input agronomic region, and apply a third agronomic model to generate one or more interventions for the input agronomic region. The interventions may include one or more agronomic practices that are different from typical agronomic practices for the region, and may be selected based on their potential impact on an ecosystem attribute. The system may be used to determine the contribution of one or more agronomic practices, fields, or combinations thereof to ecosystem attributes and impacts of a geographic region.

Description

LOCALIZED CARBON SOURCE INTERVENTION MODELING

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. Patent Application 63/477,086, titled “Localized Carbon Source Intervention Modeling,” filed on December 23, 2022, U.S. Patent Application 63/530,570, titled “Localized Carbon Source Intervention Modeling,” filed on August 3, 2023, and U.S. Patent Application 63/606,484, titled “Localized Carbon Source Intervention Modeling,” filed on December 5, 2023, each of which is incorporated by reference herein in its entirety.

TECHNICAL FIELD

[0002] The invention relates generally to agronomic data collection and subsequent modeling of ecosystem attributes. More particularly, the invention pertains to techniques of utilizing mixed-source data to create an efficient, but high-fidelity agronomic model capable of predicting ecosystem attributes in various agricultural regions.

BACKGROUND

[0003] Historically, understanding of agricultural ecosystem attributes (e.g., emissions, yield, etc.) has been either highly localized or generalized over large regions. Granular data collected directly from the field, or "ground-truth" data, bears high resource costs. Simultaneously, remote sensing data, while efficient to gather, lacks the precision and accuracy of ground-truth data. Better methods are therefore required to enable farmers and researchers to predict and possibly influence the attributes of an agricultural ecosystem on a larger scale using localized, high-quality data.

[0004] As such, an innovation that addresses these challenges by introducing a system and method that combines localized ground-truth data and remote sensing data to train predictive models would be beneficial. Such a model enables predictions for regions where agronomic data is less readily available, less dense, or of lower quality.

SUMMARY

[0005] In some aspects, the techniques described herein relate to a method for determining one or more ecosystem attributes for a geographic region, the method including: accessing ground-truth agricultural data representing a first set of agronomic regions; applying a first agronomic model to the ground-truth agricultural data to determine an ecosystem attribute for the first set of agronomic regions; accessing remote sensing data representing a second set of agronomic regions; training a second agronomic model using the ground-truth agricultural data, the determined ecosystem attribute, the remote sensing data, the first set of agronomic regions, and the second set of agronomic regions, the second agronomic model trained to predict one or more ecosystem attributes for agronomic regions based on (1) the ground-truth data and determined ecosystem attribute for the first set of agronomic regions, and (2) the remote sensing data for the second set of agronomic regions; and applying the second agronomic model to an agronomic region to quantify one or more ecosystem attributes for the agronomic region.
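
As a purely illustrative aid, and not a description of any claimed implementation, the flow of paragraph [0005] can be sketched as follows. The sketch assumes a toy linear relationship as a stand-in for the first agronomic model, a least-squares regression as a stand-in for the second agronomic model (the emulator), and, for simplicity, remote sensing features that are available for the same regions as the ground-truth data. All function names, feature layouts, and numeric values are hypothetical.

```python
import numpy as np


def high_fidelity_model(ground_truth):
    """Hypothetical stand-in for the first agronomic model.

    A real high-fidelity model would be a costly process simulation; this toy
    version assumes the attribute is a linear function of two features.
    """
    return 2.0 * ground_truth[:, 0] - 0.5 * ground_truth[:, 1] + 1.0


def train_emulator(features, targets):
    """Fit a least-squares emulator (stand-in for the second agronomic model)."""
    design = np.column_stack([features, np.ones(len(features))])  # add intercept
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def apply_emulator(weights, features):
    """Predict the ecosystem attribute for one or more agronomic regions."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ weights


rng = np.random.default_rng(0)

# Step 1: ground-truth data for a first set of regions (hypothetical features).
ground_truth = rng.uniform(0.0, 10.0, size=(50, 2))

# Step 2: the first model determines the ecosystem attribute for those regions.
attribute = high_fidelity_model(ground_truth)

# Step 3: remote sensing data, assumed here to be a noisier view of the same features.
remote_sensing = ground_truth + rng.normal(0.0, 0.1, size=ground_truth.shape)

# Step 4: train the emulator on remote sensing features and first-model outputs.
weights = train_emulator(remote_sensing, attribute)

# Step 5: apply the emulator to a region that has only remote sensing data.
new_region = np.array([[4.0, 2.0]])
prediction = apply_emulator(weights, new_region)
print(prediction)
```

The design point illustrated is that the expensive first model runs only on the regions with ground-truth data, while the inexpensive second model is the one applied to new regions.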

BRIEF DESCRIPTION OF THE DRAWINGS

[0006] The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate various exemplary embodiments and together with the description, serve to explain the principles of the disclosed embodiments.

[0007] FIG. 1A is a flowchart illustrating an exemplary method of determining agricultural practice changes, according to embodiments disclosed herein.

[0008] FIG. 1B is a flowchart illustrating a method for determining an agricultural practice change using an ecosystem attribute emulator inputting ground-truth data and remote sensing data, according to embodiments disclosed herein.

[0009] FIG. 2 is a schematic illustrating differences between total emissions with and without a source intervention, according to embodiments disclosed herein.

[0010] FIG. 3 is a schematic illustrating different practice distributions, according to embodiments disclosed herein.

[0011] FIG. 4 is a chart illustrating an exemplary construction of neighbor pairs, according to embodiments disclosed herein.

[0012] FIG. 5 is a chart illustrating results from co-occurring changes in management practices, according to embodiments disclosed herein.

[0013] FIG. 6 is a map illustrating a paired region, according to embodiments disclosed herein.

[0014] FIG. 7 is a schematic illustrating practices representing a target Nitrogen balance in the baseline year, according to embodiments disclosed herein.

[0015] FIG. 8 is a graph illustrating concurrent practice changes, according to embodiments disclosed herein.

[0016] FIG. 9 is a graph comparing a number of co-occurring practices with optimal planting over different geographic regions, according to embodiments disclosed herein.

[0017] FIG. 10A is a series of graphs illustrating a frequency of interventions by number of practice changes (levers), according to embodiments disclosed herein.

[0018] FIG. 10B is a series of graphs illustrating a CO2e impact associated with interventions with a number of levers, according to embodiments disclosed herein.

[0019] FIG. 11A is a graph illustrating emissions deltas resulting from various combinations of management levers.

[0020] FIG. 11B is a graph illustrating emissions deltas resulting from various combinations of interventions.

[0021] FIG. 12A is a map of exemplary region CMZ 40 or KS South Central.

[0022] FIG. 12B is a series of graphs illustrating common intervention combinations in the KS South Central region.

[0023] FIG. 13 is an exemplary schematic illustrating a remote sensing dataset, according to embodiments disclosed herein.

[0024] FIG. 14 is a schematic illustrating an exemplary carbon dataset, according to embodiments disclosed herein.

[0025] FIG. 15 is a computing node according to embodiments of the present disclosure.

[0026] FIG. 16 is a graph illustrating differences in a determined ecosystem attribute for different states when calculated using a high-fidelity model and a low-fidelity model.

[0027] FIG. 17 is an example system environment according to the principles described herein.

DETAILED DESCRIPTION

[0028] Reference will now be made in detail to the exemplary embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.

[0029] The systems, devices, and methods disclosed herein are described in detail by way of examples and with reference to the figures. The examples discussed herein are examples only and are provided to assist in the explanation of the apparatuses, devices, systems, and methods described herein. None of the features or components shown in the drawings or discussed below should be taken as mandatory for any specific implementation of any of these devices, systems, or methods unless specifically designated as mandatory.

[0030] Also, for any methods described, regardless of whether the method is described in conjunction with a flow diagram, it should be understood that unless otherwise specified or required by context, any explicit or implicit ordering of steps performed in the execution of a method does not imply that those steps must be performed in the order presented but instead may be performed in a different order or in parallel.

I. INTRODUCTION

[0031] Over time, the types and quantity of data available to farmers to aid in generating and modifying management practices for their farms have increased. At the onset, farmers were limited to what the eye could see, the nose could smell, and the hands could touch. However, as technology has progressed, farmers are now able to access measurements from a wide variety of sources, such as on-field “ground-truth” measurements and remote sensing measurements. Farmers have increasingly leveraged this data in modifying their farming practices to, e.g., produce fewer emissions.

[0032] To expand, generating high-quality "ground truth" data in agricultural fields involves comprehensive onsite measurements and sophisticated analysis. These techniques can yield accurate data about the exact state of the field, including information about crop growth, soil health, microclimates, and other relevant elements. Yet, they necessitate substantial time and resources, from the deployment of measurement tools and techniques to laborious gathering and meticulous consolidation of multivariate data, all of which limit the frequency and applicability of the data on a larger scale. On the other hand, farmers are also able to generate sparse, low-fidelity remote sensing datasets of fields by applying techniques like satellite imagery or drone-based photographic analysis. Compared to high-fidelity measurements, these methodologies present a relatively economical approach for data gathering. They enable frequent capture of larger field areas, facilitating the acquisition of a broad, albeit less detailed, perspective of the field state.

[0033] Unfortunately, even though improved data gathering has become more mainstream, the techniques oftentimes go underutilized because they can be expensive and difficult to implement. As a result, there is highly disparate data for various agronomic regions across the world. Some regions have reams of data obtained from chemical sensors, spectrometers, satellites, etc., while other regions only have the information sensed by their tending farmer.

[0034] Because of this, there has been a bifurcation in agricultural models given the widely variable data on which they may be applied. On one hand, high-fidelity models using high-quality ground-truth datasets provide a route to in-depth understanding and predictions. However, these models, albeit accurate, are notable for their complexity, computational expense, and operational inefficiency: they demand powerful computing resources, intensive data processing, and extensive processing time. For example, generating possible interventions with a typical high-quality agronomic model using dense ground-truth data requires tens of millions of simulations. While some of these simulations and data processing may run in parallel to reduce the computation time, the overall processor and computation requirements for such processes are, in aggregate, incredibly high. Due to these factors, high-fidelity models leveraging ground-truth data have limited scalability and real-time application potential. On the other hand, low-fidelity models leveraging sparse remote sensing datasets are inherently more efficient but less accurate. They can swiftly churn through the sparse data and generate rapid insights, but the trade-off occurs in accuracy. The difference in capabilities between the models originates from the lower granularity of the data, leading to analytical predictions that are less precise than those generated from high-quality datasets.

[0035] As such, a new model (effectively a series of models) trained to simulate the results of high-fidelity models could effectively bridge this gap. By leveraging both high-quality ground truth data and sparse, low-fidelity remote sensing datasets, this model can combine the efficiency of models leveraging remote sensing with the accuracy of the models leveraging ground truth measurements. This hybrid elevates field management capabilities by enabling frequent, highly accurate insights for field processes, while not requiring the substantial processing and data requirements necessary for a standalone high-caliber model.

[0036] Described herein are systems and methods that enable farmers to leverage the data sensed from acutely measured agronomic regions to make predictions for under-measured agronomic regions. That is, the present disclosure includes systems and methods that enable farmers to utilize data from well-measured agronomic regions to make projections for less-measured areas in a manner that lowers resource requirements, which enhances accessibility and efficiency.

II. QUANTIFYING AND REDUCING ECOSYSTEM IMPACT

[0037] Quantifying an ecosystem attribute (e.g., at a specific time) for a field is a challenging endeavor, particularly when trying to measure a shift in a baseline for that attribute. The quantification requires capturing a multitude of measurements, parsing and normalizing the data for various agronomic factors, and then generating and applying a model that can utilize the data. Quantifying a baseline shift necessitates modeling changes to various agronomic practices that lead to the initial baseline, and predicting how those modified practices will impact the ecosystem (and corresponding attributes).

[0038] Quantifying a baseline ecosystem attribute is difficult for many reasons. First, there is a large amount of uncertainty in various agronomic models like DayCent, and generating models based on data from uncertain models can lead to greater uncertainty if not addressed in model configuration. Second, there is typically a loose connection between emissions and weather, soil, and spectral data, making identification of signal and correlation in agronomic data very challenging. Third, producing estimates over immensely large regions is difficult because determining the physical boundaries for data represented in a dataset is challenging. Fourth, the amount of data is truly astronomical, including millions of high-quality, dense data sets that require large amounts of memory and processing power to process. Fifth, integrating real-world agronomic expertise with remote sensing data requires specialized training and processing. Sixth, ecosystem attributes are spatially heterogeneous, with a given ecosystem attribute varying on the order of meters, necessitating efficient methods of creating data that represent larger areas. Finally, agronomic data is not standardized, making the combination of various datasets from various sources immensely difficult.

[0039] The emulator system provided hereinbelow addresses all of these challenges to generate an estimate for an ecosystem attribute in the field. The emulator system’s functionality, therefore, rests on tangible and innovative techniques using algorithms (e.g., machine-learned algorithms) to improve farming practices and optimize their impact on ecosystems.

[0040] The steps implemented to design an intervention and/or quantify an ecosystem impact associated with agricultural practices are detailed below. In a first design step, regions are profiled for their prevalent agronomic practices and practice combinations that comprise management systems. Practices that have empirically been observed to produce a positive ecosystem impact (e.g., reduce greenhouse gas emissions) are identified, and from these practices, practice combinations are constructed. If farmers participate in the intervention, they will be required to implement these practices. An empirical model estimating the contribution of practice combinations to emissions is also built.

[0041] After a plan is designed, implementation occurs. For example, each project can intervene on a specified amount of grain/production. Farmers enroll in the project by committing to a practice (management) combination intervention on a specified amount of their production via a contract. Farmers earn the premium following the implementation of the practice (management) combinations and delivery of their contract production.

[0042] To quantify an ecosystem impact (e.g., GHG emissions reduction, increased soil carbon sequestration, etc.), the enrolled fields are verified as having implemented the systems. For each enrolled field, the model difference is calculated. Various scenarios can be used in quantification. The intervention scenario accounts for the estimated ecosystem attribute for the grain produced utilizing combinations. The supply-shed scenario (or preintervention scenario from the supply-shed) accounts for the expected ecosystem attribute for the grain produced utilizing practices/management. Quantification for either scenario follows the Value Chain (Scope 3) and reporting guidance as closely as possible. In some embodiments, only the impact of the process that adheres to the Value Chain (Scope 3) interventions greenhouse gas emissions is accounted for. The only process that the intervention changed is the distribution of grain within the intervention. It is estimated that the ecosystem impact from the changed processes impacts physical parameters such as soil, and weather is left unchanged. The expected ecosystem impact is estimated by modeling both interventions in the same field. Additionally, per the same guidelines, estimation is required in both scenarios. Consistency is maintained by using the same model in both scenarios.

[0043] In an execution phase, differences between field-level estimated ecosystem attributes (e.g., GHG emissions) are computed to quantify ecosystem attribute changes underlying shifts in practice co-occurrence patterns. The difference is taken between two estimates for the same field: one in which the program crop is grown with the program-directed observed practices, and one in which the program crop is grown with practices that commonly co-occur in a comparable neighboring field. These differences are aggregated over a comparable neighboring region (a supply shed).
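
For illustration only, the field-level differencing and aggregation of paragraph [0043] might be sketched as below. The sketch assumes that both estimates for a field come from the same model, that the attribute is expressed as emissions in a hypothetical unit (kg CO2e), and that aggregation is a simple sum; an actual embodiment may weight fields differently. All names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FieldEstimate:
    """Modeled emissions for one enrolled field under two scenarios (hypothetical units)."""
    field_id: str
    intervention_emissions: float    # program crop with program-directed practices
    counterfactual_emissions: float  # same field with commonly co-occurring neighbor practices


def supply_shed_delta(fields):
    """Return per-field differences and their sum over the supply shed.

    A negative value means the intervention scenario has lower modeled emissions.
    Summation is an assumed aggregation rule, used here only for illustration.
    """
    deltas = {
        f.field_id: f.intervention_emissions - f.counterfactual_emissions
        for f in fields
    }
    return deltas, sum(deltas.values())


fields = [
    FieldEstimate("field-A", intervention_emissions=100.0, counterfactual_emissions=120.0),
    FieldEstimate("field-B", intervention_emissions=80.0, counterfactual_emissions=95.0),
]
per_field, total = supply_shed_delta(fields)
print(per_field, total)
```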

[0044] To eliminate the data burden, there is no historical data gathered from the farmer. In-season data requirements are minimal and data collection and validation are generally not performed by a farmer. The low-data requirement is enabled by the choice and design of the impact model as described. However, the intentional low data burden design puts constraints on the volume and type of data that can be acquired. The low data requirement is enabled by and factored into the impact estimation model for intervention accounting. The model design heavily influences the number and extent of data any project will require and the nature of the intervention project. As a result, any impact estimation modeling decisions must be made with care - to not affect or minimally affect the data burden of the project implementation.

[0045] Ecosystem attribute changes need to be permanent for the season. Seasonal permanence of the ecosystem attribute changes we enable is critical because of the following mechanics of quantification, for example:

• Annual accounting: emissions reductions are attached to the grain grown and sold to the sponsor company (the company requesting the intervention and/or quantification of ecosystem impact), which then incorporates these emissions reductions into its annual inventory accounting.

• Co-mingling and loss of traceability: grain grown during the season gets co-mingled and sold to the company that is buying the emissions reductions. Because of this co-mingling, the grain cannot be traced back to the land that produced it (without colossal infrastructural changes in how current grain supply chains are structured).

• No buffers or monitoring beyond the season of contract: buffers are not maintained for loss of permanence, and practices are not monitored beyond the one-season contract with the farmer.

As a result, any emissions sources included and quantified should result in permanent reductions for the season.

[0046] The quantification model incorporates sampling theory at its core. Characterizing how grain is being grown in a supply-shed at a field level is a large sampling task. However, it is made simpler with remote sensing, sampling, and probability theory. The key to scaling is through an understanding of sampling and probability theory and vast amounts of inspiration from how a leviathan task such as the U.S. Census is accomplished with surprisingly accurate results. Start by collecting the most detailed information about the supply shed where this is possible with remote sensing. Then, gather empirical data (literature, source farmers, carbon farmers, yellowbook) from a sample of the supply shed - where available. The above knowledge sources are combined using a series of imputation models to fill gaps in the data. The resulting data characterizes how grain in the supply shed was grown, along with an understanding of how to intervene to change the ecosystem attribute profile of the grain.

[0047] On a supply-shed scale, it is impossible to collect detailed data on every field contributing grain to the overall supply. Additionally, for most commodities produced in the United States, regardless of management practice, all grain is combined at large handling facilities such that the ecosystem attributes associated with the total supply are homogenized. However, for example, the ecosystem attributes of a bushel of grain obtained from this total are not likely the emissions of a hypothetical “average field”. The ecosystem attributes of a random sample of grain grown in the supply shed are a weighted average of ecosystem attributes with each bushel that has contributed to the overall supply. The ecosystem attributes themselves are consequences of the practice combinations that are implemented by farmers to produce grain in each field. A “random draw” approach is taken to estimate the ecosystem attributes associated with grain not produced in the intervention project. Briefly, either annually published regional university extension survey datasets or a remote sensing algorithm-based workflow or both are used, combined with imputation modeling to estimate the co-occurrence probabilities of management practices in the overall production system within the previously delineated project supply-shed boundary. (If available, data contributed by carbon farmers as part of this dataset can also be used). Statistical models are used to estimate the emissions associated with a bushel of grain produced using each practice combination. The emissions per bushel of grain is the weighted average combination of the model-estimated emissions. The weights are the co-occurrence probabilities of each practice combination - e.g., how frequent are a corn-wheat rotation and a stripper-header harvest compared to a wheat-wheat rotation and conventional tilling?
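
The weighted average described in paragraph [0047] can be illustrated with a short sketch. It assumes that each practice combination has a model-estimated emissions value per bushel and a co-occurrence probability, and that the probabilities sum to one. The practice combination labels echo the example above, while the probabilities, emissions values, and units are hypothetical.

```python
def emissions_per_bushel(practice_probs, practice_emissions):
    """Weighted average of per-bushel emissions over practice combinations.

    practice_probs: co-occurrence probability of each practice combination.
    practice_emissions: model-estimated emissions per bushel for each combination
    (hypothetical units, e.g., kg CO2e per bushel).
    """
    total_prob = sum(practice_probs.values())
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("co-occurrence probabilities must sum to 1")
    return sum(prob * practice_emissions[combo] for combo, prob in practice_probs.items())


# Hypothetical supply shed with two practice combinations.
probs = {
    "corn-wheat rotation + stripper-header harvest": 0.25,
    "wheat-wheat rotation + conventional tilling": 0.75,
}
emissions = {
    "corn-wheat rotation + stripper-header harvest": 4.0,
    "wheat-wheat rotation + conventional tilling": 8.0,
}
print(emissions_per_bushel(probs, emissions))  # 0.25 * 4.0 + 0.75 * 8.0 = 7.0
```

In this "random draw" view, the result is the expected emissions of a bushel drawn at random from the co-mingled supply, rather than the emissions of a hypothetical average field.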

III. DEFINITIONS

[0048] As used herein, the term “exemplary” is used in the sense of “example,” rather than “ideal.” Moreover, the terms “a” and “an” herein do not denote a limitation of quantity but rather denote the presence of one or more of the referenced items.

[0049] As used herein, an “ecosystem benefit” is used equivalently with “ecosystem attribute” or “environmental attribute,” each of which refers to an environmental characteristic (for example, as a result of agricultural production) that may be quantified and valued (for example, as an ecosystem credit or sustainability claim). Examples of ecosystem benefits include without limitation reduced water use, reduced nitrogen use, increased soil carbon sequestration, greenhouse gas emission reduction, greenhouse gas emission avoidance, etc. An example of a mandatory program requiring accounting of ecosystem attributes is California’s Low Carbon Fuel Standard (LCFS). Field-based agricultural management practices can be a means for reducing the carbon intensity of biofuels (e.g., biodiesel from soybeans).

[0050] An “ecosystem impact” is a change in an ecosystem attribute relative to a baseline. In various embodiments, baselines may reflect a set of regional standard practices or production (a comparative baseline), prior production practices and outcomes for a field or farming operation (a temporal baseline), or a counterfactual alternative scenario (a counterfactual baseline). For example, a temporal baseline for determination of an ecosystem impact may be the difference between a safrinha crop production period and the safrinha crop production period of the prior year. In some embodiments, an ecosystem impact can be generated from the difference between an ecosystem attribute for the latest crop production period and a baseline ecosystem attribute averaged over a number (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) of prior production periods.
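
As a minimal sketch of the temporal baseline described in paragraph [0050], the function below computes an ecosystem impact as the latest production period's attribute minus the average of the most recent prior periods. The function name, argument names, and numeric values are hypothetical, and the sign convention (negative means a reduction) is an assumption.

```python
def ecosystem_impact(latest, prior_periods, n_periods):
    """Difference between the latest attribute and a baseline averaged over prior periods.

    latest: ecosystem attribute for the latest crop production period.
    prior_periods: attribute values for earlier periods, oldest first.
    n_periods: how many of the most recent prior periods to average (e.g., 2 to 10).
    """
    if n_periods < 1 or n_periods > len(prior_periods):
        raise ValueError("n_periods must be between 1 and the number of prior periods")
    baseline = sum(prior_periods[-n_periods:]) / n_periods
    return latest - baseline


# Hypothetical emissions for three prior production periods and the latest one.
prior = [110.0, 100.0, 105.0]
print(ecosystem_impact(90.0, prior, n_periods=3))  # baseline 105.0, impact -15.0
print(ecosystem_impact(90.0, prior, n_periods=2))  # baseline 102.5, impact -12.5
```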

[0051] A counterfactual scenario refers to alternative factual assumptions, for example, what could have happened within the crop growing season in an area of land given alternative practices, what could have happened within a supply shed (e.g. sourcing region) given alternative assumptions regarding the fields included within the supply shed, what could have happened within the crop growing season in an area of land given alternative climate or weather conditions, etc. In various embodiments, a counterfactual scenario is based on an approximation of supply shed practices, for example, average rates of utilization or adoption of agronomic practices within a region.

[0052] An “ecosystem credit” is a unit of value corresponding to an ecosystem benefit or ecosystem impact, where the ecosystem attribute or ecosystem impact is measured, verified, and/or registered according to a methodology. In some embodiments, an ecosystem credit may be a report of the inventory of ecosystem attributes (for example, an inventory of ecosystem attributes of a management zone, an inventory of ecosystem attributes of a farming operation, an inventory of ecosystem attributes of a supply shed, an inventory of ecosystem attributes of a supply chain, an inventory of a processed agricultural product, etc.). In some embodiments, an ecosystem credit is a life-cycle assessment. In some embodiments, an ecosystem credit may be a registry-issued credit. Optionally, an ecosystem credit is generated according to a methodology approved by an issuer. An ecosystem credit may represent a reduction or offset of an ecologically significant compound (e.g., carbon credits, water credits, nitrogen credits). In some embodiments, a reduction or offset is compared to a baseline of ‘business as usual’ if the ecosystem crediting or sustainability program did not exist (e.g., if one or more practice changes made because of the program had not been made).

[0053] In some embodiments, a reduction or offset is compared to a baseline of one or more ecosystem attributes (e.g., ecosystem attributes of one or more: field, sub-field region, county, state, region of similar environment, supply shed geographic region, a supply shed, etc.) during one or more prior production periods. For example, ecosystem attributes of a field in 2022 may be compared to a baseline of ecosystem attributes of the field in 2021. In some embodiments, a reduction or offset is compared to a baseline of one or more ecosystem attributes (e.g., ecosystem attributes of one or more: field, sub-field region, county, state, region of similar environment, supply shed geographic region, a supply shed, etc.) during the same production period. For example, ecosystem attributes of a field may be compared to a baseline of ecosystem attributes of a supply shed comprising the field. An ecosystem credit may represent a permit to reverse an ecosystem benefit, for example, a license to emit one metric ton of carbon dioxide. A carbon credit represents a measure (e.g., one metric ton) of carbon dioxide or other greenhouse gas emissions reduced, avoided, or removed from the atmosphere. A nutrient credit, for example, a water quality credit, represents pounds of a chemical removed from an environment (e.g., by installing or restoring nutrient-removal wetlands) or reduced emissions (e.g., by reducing rates of application of chemical fertilizers, managing the timing or method of chemical fertilizer application, changing the type of fertilizer, etc.). Examples of nutrient credits include nitrogen credits and phosphorus credits. A water credit represents a volume (e.g., 1000 gallons) of water usage that is reduced or avoided, for example by reducing irrigation rates, managing the timing or method of irrigation, or employing water conservation measures such as reducing evaporation application.

[0054] A “sustainability claim” is a set of one or more ecosystem benefits associated with an agricultural product (for example, including ecosystem benefits associated with the production of an agricultural product). Sustainability claims may or may not be associated with ecosystem credits. For example, a consumer packaged goods entity may contract raw agricultural products from producers reducing irrigation, in order to make a sustainability claim of supporting the reduction of water demand on the final processed agricultural product. The producers reducing irrigation may or may not also participate in a water ecosystem credit program, where ecosystem credits are generated based on the quantity of water that is actually reduced compared against a baseline.

[0055] “Offsets” are credits generated by third parties outside the value chain of the party with the underlying carbon liability (e.g., oil company that generates greenhouse gases from combusting hydrocarbons purchases carbon credit from a farmer).

[0056] “Insets” are ecosystem resource (e.g., carbon dioxide) reductions within the value chain of the party with the underlying carbon liability (e.g., oil company who makes biodiesel reduces carbon intensity of biodiesel by encouraging farmers to produce the underlying soybean feedstock using sustainable farming practices). Insets are considered Scope 1 reductions.

[0057] “Scopes” are categories of greenhouse gas emissions. Greenhouse gas emissions are often categorized as Scope 1, Scope 2, or Scope 3. Scope 1 emissions are direct greenhouse gas emissions that occur from sources that are controlled or owned by an organization. Scope 2 emissions are indirect greenhouse gas emissions associated with purchase of electricity, steam, heating, or cooling. Scope 3 emissions are the result of activities from assets not owned or controlled by the reporting organization, but that the organization indirectly impacts in its value chain. Scope 3 emissions represent all emissions associated with an organization’s value chain that are not included in that organization’s Scope 1 or Scope 2 emissions. Scope 3 emissions include activities upstream of the reporting organization or downstream of the reporting organization. Upstream activities include, for example, purchased goods and services (e.g., agricultural production such as wheat, soybeans, or corn may be purchased inputs for production of animal feed), upstream capital goods, upstream fuel and energy, upstream transportation and distribution (e.g., transportation of raw agricultural products such as grain from the field to a grain elevator), waste generated in upstream operations, business travel, employee commuting, or leased assets. Downstream activities include, for example, transportation and distribution other than with the vehicles of the reporting organization, processing of sold goods, use of goods sold, end of life treatment of goods sold, leased assets, franchises, or investments.

[0058] An ecosystem credit may generally be categorized as either an inset (when associated with the value chain of production of a particular agricultural product), or an offset, but not both concurrently.

[0059] As used herein, a “crop-growing season” may refer to a fundamental unit of grouping crop events by non-overlapping periods of time. In various embodiments, harvest events are used where possible.

[0060] An “issuer” is an issuer of ecosystem credits, which may be a regulatory authority or another trusted provider of ecosystem credits. An issuer may alternatively be referred to as a “registry”.

[0061] A “token” (alternatively, an “ecosystem credit token”) is a digital representation of an ecosystem benefit, ecosystem impact, or ecosystem credit. The token may include a unique identifier representing one or more ecosystem credits, ecosystem attribute, or ecosystem impact, or, in some embodiments a putative ecosystem credit, putative ecosystem attribute, or putative ecosystem impact, associated with a particular product, production location (e.g., a field), production period (e.g., crop production season), and/or production zone cycle (e.g., a single management zone defined by events that occur over the duration of a single crop production season).

[0062] “Ecosystem credit metadata” is at least information sufficient to identify an ecosystem credit issued by an issuer of ecosystem credits. For example, the metadata may include one or more of a unique identifier of the credit, an issuer identifier, a date of issuance, identification of the algorithm used to issue the credit, or information regarding the processes or products giving rise to the credit. In some embodiments, the credit metadata may include a product identifier as defined herein. In other embodiments, the credit is not tied to a product at generation, and so there is no product identifier included in the credit metadata.

[0063] A “product” is any item of agricultural production, including crops and other agricultural products, in their raw, as-produced state (e.g., wheat grains), or as processed (e.g., oils, flours, polymers, consumer goods (e.g., crackers, cakes, plant-based meats, animal-based meats (for example, beef from cattle fed a product such as corn grown from a particular field), bioplastic containers, etc.)). In addition to harvested physical products, a product may also include a benefit or service provided via the use of the associated land (for example, for recreational purposes such as a golf course, or as pasture land for grazing wild or domesticated animals, where domesticated animals may be raised for food or recreation).

[0064] “Product metadata” are any information regarding an underlying product, its production, and/or its transaction which may be verified by a third party and may form the basis for issuance of an ecosystem credit and/or sustainability claim. Product metadata may include at least a product identifier, as well as a record of entities involved in transactions.

[0065] As used herein, “quality” or a “quality metric” may refer to any aspect of an agricultural product that adds value. In some embodiments, quality is a physical or chemical attribute of the crop product. For example, a quality may include, for a crop product type, one or more of a variety; a genetic trait or lack thereof; genetic modification or lack thereof; genomic edit or lack thereof; epigenetic signature or lack thereof; moisture content; protein content; carbohydrate content; ash content; fiber content; fiber quality; fat content; oil content; color; whiteness; weight; transparency; hardness; percent chalky grains; proportion of corneous endosperm; presence of foreign matter; number or percentage of broken kernels; number or percentage of kernels with stress cracks; falling number; farinograph; adsorption of water; milling degree; immature grains; kernel size distribution; average grain length; average grain breadth; kernel volume; density; L/B ratio; wet gluten; sodium dodecyl sulfate sedimentation; toxin levels (for example, mycotoxin levels, including vomitoxin, fumonisin, ochratoxin, or aflatoxin levels); and damage levels (for example, mold, insect, heat, cold, frost, or other material damage).

[0066] In some embodiments, quality is an attribute of a production method or environment. For example, quality may include, for a crop product, one or more of: soil type; soil chemistry; climate; weather; magnitude or frequency of weather events; soil or air temperature; soil or air moisture; degree days; rain fed; irrigated or not; type of irrigation; tillage frequency; tillage type; cover crop (present or historical); fallow seasons (present or historical); crop rotation; organic; shade grown; greenhouse; level and types of fertilizer use; levels and type of chemical use; levels and types of herbicide use; pesticide-free; levels and types of pesticide use; no-till; use of organic manure and byproducts; minority produced; fair wage; geography of production (e.g., country of origin, American Viticultural Area, mountain grown); pollution-free production; reduced pollution production; levels and types of greenhouse gas production; carbon neutral production; levels and duration of soil carbon sequestration; and others. In some embodiments, quality is affected by, or may be inferred from, the timing of one or more production practices. For example, food grade quality for crop products may be inferred from the variety of plant, damage levels, and one or more production practices used to grow the crop. In another example, one or more qualities may be inferred from the maturity or growth stage of an agricultural product such as a plant or animal. In some embodiments, a crop product is an agricultural product.

[0067] In some embodiments, quality is an attribute of a method of storing an agricultural good (e.g., the type of storage: bin, bag, pile, in-field, box, tank, or other containerization), the environmental conditions (e.g., temperature, light, moisture, relative humidity, presence of pests, CO2 levels) during storage of the crop product, method of preserving the crop product (e.g., freezing, drying, chemically treating), or a function of the length of time of storage. In some embodiments, quality may be calculated, derived, inferred, or subjectively classified based on one or more measured or observed physical or chemical attributes of a crop product, its production, or its storage method. In some embodiments, a quality metric is a grading or certification by an organization or agency. For example, grading by the USDA, organic certification, or non-GMO certification may be associated with a crop product. In some embodiments, a quality metric is inferred from one or more measurements made of plants during growing season. For example, wheat grain protein content may be inferred from measurement of crop canopies using hyperspectral sensors and/or NIR or visible spectroscopy of whole wheat grains. In some embodiments, one or more quality metrics are collected, measured, or observed during harvest. For example, dry matter content of corn may be measured using near-infrared spectroscopy on a combine. In some embodiments, the observed or measured value of a quality metric is compared to a reference value for the metric. In some embodiments, a reference value for a metric (for example, a quality metric or a quantity metric) is an industry standard or grade value for a quality metric of a particular agricultural good (for example, U.S. No. 3 Yellow Corn, Flint), optionally as measured in a particular tissue (for example, grain) and optionally at a particular stage of development (for example, silking). In some embodiments, a reference value is determined based on a supplier’s historical production record or the historical production record of present and/or prior marketplace participants.

[0068] A “field” is the area where agricultural production practices are being used (for example, to produce a transacted agricultural product) and/or where ecosystem credits and/or sustainability claims are generated.

[0069] As used herein, a “field boundary” may refer to a geospatial boundary of an individual field.

[0070] As used herein, an “enrolled field boundary” may refer to the geospatial boundary of an individual field enrolled in at least one ecosystem credit or sustainability claim program on a specific date.

[0071] In various embodiments, a field is a unique object that has temporal and spatial dimensions. In various embodiments, the field is enrolled in one or more programs, where each program corresponds to a methodology. As used herein, a “methodology” (equivalently “program eligibility requirements” or “program requirements”) is a set of requirements associated with a program, and may include, for example, eligibility requirements for the program (for example, eligible regions, permitted practices, and eligible participants (for example, size of farms, types of product permitted, types of production facilities permitted, etc.)), and/or environmental effects of activities of program participants, reporting or oversight requirements, required characteristics of technologies (including modeling technologies, statistical methods, etc.) permitted to be used for prediction, quantification, or verification of results by program participants, etc. Examples of methodologies include protocols administered by Climate Action Reserve (CAR) (climateactionreserve.org), such as the Soil Enrichment Protocol; methodologies administered by Verra (verra.org), such as the Methodology for Improved Agricultural Land Management; farming sustainability certifications; life cycle assessment; and other similar programs. In various embodiments, the field data object includes field metadata. “One or more methodologies” refers to a data structure comprising program eligibility requirements for a plurality of programs. More briefly, a methodology may be a set of rules set by a registry or other third party, while a program implements the rules set in the methodology.

[0072] In various embodiments, the field metadata includes a field identifier that identifies a farm (e.g., a business) and a farmer who manages the farm (e.g., a user). In various embodiments, the field metadata includes field boundaries that are a collection of one or more polygons describing geospatial boundaries of the field. In some embodiments, polygons representing fields or regions within fields (e.g., management event boundaries, etc.) may be detected from remote sensing data using computer vision methods (for example, edge detection, image segmentation, and combinations thereof) or machine learning algorithms (for example, maximum likelihood classification, random tree classification, support vector machine classification, ensemble learning algorithms, convolutional neural network, etc.).

[0073] In various embodiments, the field metadata includes farming practices that are a set of farming practices on the field. In various embodiments, farming practices are a collection of practices across multiple years. For example, farming practices include crop types, tillage method, fertilizers and other inputs, etc., as well as temporal information related to each practice which is used to establish crop growing seasons and ultimately to attribute outcomes to practices. In various embodiments, the field metadata includes outcomes. In various embodiments, the outcomes include at least an effect size of the farming practices and an uncertainty of the outcome. In various embodiments, an outcome is a recorded result of a practice, notably: harvest yields, sequestration of greenhouse gases, and/or reduction of emissions of one or more greenhouse gases.

[0074] In various embodiments, the field metadata includes agronomic information, such as soil type, climate type, etc. In various embodiments, the field metadata includes evidence of practices and outcomes provided by the grower or other sources. For example, a scale ticket from a grain elevator, an invoice for cover crop seed from a distributor, farm machine data, remote sensing data, a time stamped image or recording, etc. In various embodiments, the field metadata includes product tracing information such as storage locations, intermediaries, final buyer, and tracking identifiers.

[0075] In various embodiments, the field object is populated by data entry from the growers directly. In various embodiments, the field object is populated using data from remote sensing (satellite, sensors, drones, etc.). In various embodiments, the field object is populated using data from agronomic data platforms such as John Deere and Granular, and/or data supplied by agronomists, and/or data generated by remote sensors (such as aerial imagery, satellite derived data, farm machine data, soil sensors, etc.). In various embodiments, at least some of the field metadata within the field object is hypothetical for simulating and estimating the potential effect of applying one or more practices (or changing one or more practices) to help growers make decisions as to which practices to implement for optimal economic benefit.

[0076] In various embodiments, the system may access one or more models capable of processing the field object, process the field object based on the one or more models, and return an output based on the metadata contained within the field object. In various embodiments, a collection of models can be applied to a field object to estimate, simulate, and/or quantify the outcome (e.g., the effect on the environment) of the practices implemented on a given field. In various embodiments, the models may include process-based biogeochemical models. Process-based models are alternately referred to as mechanistic models. In various embodiments, the models may include machine learning models. In various embodiments, the models may include rule-based models. In various embodiments, the models may include a combination of models (e.g., ensemble models).

[0077] As used herein, a “management event” may refer to a grouping of data about one or more farming practices (such as tillage, harvest, etc.) that occur within a field boundary or an enrolled field boundary. A “management event” contains information about the time when the event occurred and has a geospatial boundary defining where within the field boundary the agronomic data about the event applies. Management events are used for modeling and credit quantification, and are designed to facilitate grower data entry and assessment of data requirements. Each management event may have a defined management event boundary that can be all or part of the field area defined by the field boundary. A “management event boundary” (equivalently a “farming practice boundary”) is the geospatial boundary of an area over which farming practice action is taken or avoided. In some embodiments, if a farming practice action is an action taken or avoided at a single point, the management event boundary is a point location. As used herein, a farming practice and an agronomic practice are of equivalent meaning.

[0078] As used herein, a “management zone” may refer to an area within an individual field boundary defined by the combination of management event boundaries that describe the presence or absence of management events at any particular time or time window, as well as attributes of the management events (if any event occurred). A management zone may be a contiguous region or a non-contiguous region. A “management zone boundary” may refer to a geospatial boundary of a management zone. In some embodiments, a management zone is an area coextensive with a spatially and temporally unique set of one or more farming practices. In some embodiments, an initial management zone includes historic management events from one or more prior cultivation cycles (for example, at least 2, at least 3, at least 4, at least 5, or a number of prior cultivation cycles required by a methodology). In some embodiments, a management zone generated for the year following the year for which an initial management zone was created will be a combination of the initial management zone and one or more management event boundaries of the next year. A management zone can be a data-rich geospatial object created for each field using an algorithm that crawls through management events (e.g., all management events) and groups the management events into discrete zonal areas based on features associated with the management event(s) and/or features associated with the portion of the field in which the management event(s) occur. The creation of management zones enables the prorating of credit quantification for the area within the field boundary based on the geospatial boundaries of management events.

[0079] In some embodiments, a management zone is created by sequentially intersecting a geospatial boundary defining a region wherein management zones are being determined (for example, a field boundary), with each geospatial management event boundary occurring within that region at any particular time or time window, wherein each of the sequential intersection operations creates two branches - one with the intersection of the geometries and one with the difference. The new branches are then processed with the next management event boundary in the sequence, bifurcating whenever there is an area of intersection and an area of difference. This process is repeated for all management event boundaries that occurred in the geospatial boundary defining the region. The final set of leaf nodes in this branching process defines the geospatial extent of the set of management zones within the region, wherein each management zone is non-overlapping, and each individual management zone contains a unique set of management events relative to any other management zone defined by this process.
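The branching procedure of paragraph [0079] can be illustrated with a minimal sketch. It assumes, purely for illustration, that boundaries are approximated as sets of grid cells rather than true polygons; the names `rect` and `build_management_zones` and the event identifiers are hypothetical and do not appear in the specification.

```python
from typing import FrozenSet, List, Tuple

Cell = Tuple[int, int]                  # one grid cell (column, row); stand-in for real geometry
Region = FrozenSet[Cell]                # a boundary approximated as a set of cells (assumption)
Zone = Tuple[Region, FrozenSet[str]]    # (extent, ids of the events that cover that extent)


def rect(x0: int, y0: int, x1: int, y1: int) -> Region:
    """Hypothetical helper: boundary covering columns x0..x1-1 and rows y0..y1-1."""
    return frozenset((x, y) for x in range(x0, x1) for y in range(y0, y1))


def build_management_zones(field: Region,
                           events: List[Tuple[str, Region]]) -> List[Zone]:
    """Sequentially split the field by each management event boundary."""
    leaves: List[Zone] = [(field, frozenset())]   # start with one branch: the whole field
    for event_id, boundary in events:
        next_leaves: List[Zone] = []
        for extent, applied in leaves:
            inside = extent & boundary            # branch 1: intersection
            outside = extent - boundary           # branch 2: difference
            if inside:
                next_leaves.append((inside, applied | {event_id}))
            if outside:
                next_leaves.append((outside, applied))
        leaves = next_leaves                      # process new branches with the next event
    return leaves                                 # leaf nodes are the management zones


# Usage: a 4x4 field with two overlapping (hypothetical) management events.
field = rect(0, 0, 4, 4)
events = [("tillage", rect(0, 0, 2, 4)), ("cover_crop", rect(0, 0, 4, 2))]
zones = build_management_zones(field, events)
for extent, applied in zones:
    print(sorted(applied), len(extent), "cells")
```

Because every leaf is reached by a distinct sequence of inside/outside decisions, the resulting zones do not overlap and no two zones carry the same set of events. A production system would presumably perform the same intersection and difference operations on polygon geometries.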

[0080] As used herein, a “zone-cycle” may refer to a single cultivation cycle on a single management zone within a single field, considered collectively as a pair that define a foundational unit (e.g., also referred to as an “atomic unit”) of quantification for a given field in a given reporting period.

[0081] As used herein, a “baseline simulation” may refer to a point-level simulation of constructed baselines for the duration of the reported project period, using initial soil sampling at that point (following SEP requirements for soil sampling and model initialization) and management zone-level grower data (that meets SEP data requirements).

[0082] As used herein, a “with-project simulation” may refer to a point-level simulation of adopted practice changes at the management zone level that meet SEP requirements for credit quantification.

[0083] As used herein, a “field-level project start date” may refer to the start of the earliest cultivation cycle where a practice change was detected and attested by a grower.

[0084] As used herein, a “required historic baseline period” may refer to years (in 365 day periods, not calendar years) of required historic information prior to the field-level project start date that must fit requirements of the data hierarchy in order to be modeled for credits. A number of required years is specified by the SEP, based on crop rotation and management.

[0085] As used herein, a “cultivation cycle” (equivalently a “crop production period” or “production period”) may refer to the period between the first day after harvest or cutting of a prior crop on a field or the first day after the last grazing on a field, and the last day of harvest or cutting of the subsequent crop on a field or the last day of last grazing on a field. For example, a cultivation cycle may be: a period starting with the planting date of the current crop and ending with the harvest of the current crop, a period starting with the date of the last field prep event in the previous year and ending with the harvest of the current crop, a period starting with the last day of crop growth in the previous year and ending with the harvest or mowing of the current crop, a period starting with the first day after the harvest in the prior year and ending with the last day of harvest of the current crop, etc. In some embodiments, cultivation cycles are approximately 365-day periods from the field-level project start date that contain completed crop growing seasons (planting to harvest/mowing, or growth start to growth stop). In some embodiments, cultivation cycles extend beyond a single 365-day period and cultivation cycles are divided into one or more cultivation cycles of approximately 365 days, optionally where each division of time includes one planting event and one harvest or mowing event.

[0086] As used herein, “historic cultivation cycles” may refer to cycles defined in the same way as cultivation cycles, but for the period of time in the required historic baseline period.

[0087] As used herein, “historic segments” may refer to individual historic cultivation cycles, separated from each other in order to be used to construct baseline simulations.

[0088] As used herein, “historic crop practices” may refer to crop events occurring within historic cultivation cycles.

[0089] As used herein, a “baseline thread” (or “parallel baseline threads”) may refer to a repeating cycle of the required historic baseline period that begins at the management zone level project start date. The number of baseline threads equals the number of unique historic segments (e.g., one baseline thread per each year of the required historic baseline period). Each baseline thread begins with a unique historic segment and runs in parallel to all other baseline threads to generate baseline simulations for a with-project cultivation cycle.
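As a minimal sketch of how parallel baseline threads might be laid out, the following example builds one repeating thread per historic segment, each starting from a different segment. It assumes that segments repeat in their original chronological order and that each segment can be represented by a simple label; the function name `baseline_threads` and the crop labels are hypothetical.

```python
from itertools import cycle, islice
from typing import List


def baseline_threads(historic_segments: List[str],
                     n_project_cycles: int) -> List[List[str]]:
    """Build one repeating thread per historic segment.

    Thread i starts with segment i and then keeps cycling through the
    historic baseline period in order (assumed ordering).
    """
    threads = []
    for start in range(len(historic_segments)):
        # Rotate the historic period so this thread begins with a unique segment.
        rotated = historic_segments[start:] + historic_segments[:start]
        # Repeat the rotated period for as many project cycles as are reported.
        threads.append(list(islice(cycle(rotated), n_project_cycles)))
    return threads


# Usage: a three-year historic baseline period and four with-project cycles.
for thread in baseline_threads(["corn", "soybean", "wheat"], 4):
    print(thread)
# ['corn', 'soybean', 'wheat', 'corn']
# ['soybean', 'wheat', 'corn', 'soybean']
# ['wheat', 'corn', 'soybean', 'wheat']
```

Reading the output column by column gives the set of baseline segments that run in parallel against each with-project cultivation cycle. How the resulting baseline simulations are combined is not shown here.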

[0090] As used herein, an “overlap in practices” may refer to unrealistic agronomic combinations that arise at the start of baseline threads, when dates of agronomic events in the concluding cultivation cycle overlap with dates of agronomic events in the historic segment that is starting the baseline thread. In this case, logic is in place based on planting dates and harvest dates to make adjustments based on the type of overlap that is occurring.

[0091] An “indication of a geographic region” is a latitude and longitude, an address or parcel id, a geopolitical region (for example, a city, county, state), a region of similar environment (e.g., a similar soil type or similar weather), a supply shed, a boundary file, a shape drawn on a map presented within a GUI of a user device, an image of a region, an image of a region displayed on a map presented within a GUI of a user device, or a user id where the user id is associated with one or more production locations (for example, one or more fields).

[0092] For example, polygons representing fields may be detected from remote sensing data using computer vision methods (for example, edge detection, image segmentation, and combinations thereof) or machine learning algorithms (for example, maximum likelihood classification, random tree classification, support vector machine classification, ensemble learning algorithms, convolutional neural network, etc.).

[0093] “Ecosystem observation data” are observed or measured data describing an ecosystem, for example weather data, soil data, remote sensing data, emissions data (for example, emissions data measured by an eddy covariance flux tower), populations of organisms, plant tissue data, and genetic data. In some embodiments, ecosystem observation data are used to connect agricultural activities with ecosystem variables. Ecosystem observation data may include survey data, such as soil survey data (e.g., SSURGO). In various embodiments, the system performs scenario exploration and model forecasting, using the modeling described herein. In various embodiments, the system proposes climate-smart crop fuel feedstock CI integration with an existing model, such as the Greenhouse gases, Regulated Emissions, and Energy use in Technologies Model (GREET), which can be found online at https://greet.es.anl.gov/ (the GREET models are incorporated by reference herein).

[0094] A “crop type data layer” is a data layer containing a prediction of crop type, for example USDA Cropland Data Layer provides annual predictions of crop type, and a 30m resolution land cover map is available from MapBiomas (https://mapbiomas.org/en). A crop mask may also be built from satellite-based crop type determination methods, ground observations including survey data or data collected by farm equipment, or combinations of two or more of: an agency or commercially reported crop data layer (e.g., CDL), ground observations, and satellite-based crop type determination methods.

[0095] A “vegetative index” (“VI”) is a value related to vegetation as computed from one or more spectral bands or channels of remote sensing data. Examples include simple ratio vegetation index (“RVI”), perpendicular vegetation index (“PVI”), soil adjusted vegetation index (“SAVI”), atmospherically resistant vegetation index (“ARVI”), soil adjusted atmospherically resistant VI (“SARVI”), difference vegetation index (“DVI”), and normalized difference vegetation index (“NDVI”). NDVI is a measure of vegetation greenness which is particularly sensitive to minor increases in surface cover associated with cover crops.
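For illustration, two of the listed indices can be computed from per-pixel reflectance using their commonly published forms: NDVI = (NIR - Red) / (NIR + Red) and SAVI = (1 + L)(NIR - Red) / (NIR + Red + L). The sketch below is not taken from the specification; the function names, the default soil adjustment factor L = 0.5, and the choice to return 0.0 when both bands are zero are assumptions.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return 0.0  # assumption: treat a pixel with no signal as index 0
    return (nir - red) / denom


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil adjusted vegetation index; soil_factor is the usual 'L' term."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Usage with hypothetical reflectance values for a bare pixel and a cover-cropped pixel.
print(round(ndvi(0.25, 0.20), 3))   # sparse cover
print(round(ndvi(0.50, 0.10), 3))   # denser green cover
print(round(savi(0.50, 0.10), 3))
```

Both indices increase as near-infrared reflectance rises relative to red reflectance, which is the behavior that makes NDVI responsive to added surface cover.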

[0096] “SEP” stands for soil enrichment protocol. The SEP version 1.0 and supporting documents, including requirements and guidance, (incorporated by reference herein) can be found online at https://www.climateactionreserve.org/how/protocols/soil-enrichment/. As is known in the art, SEP is an example of a carbon registry methodology, but it will be appreciated that other registries having other registry methodologies (e.g., carbon, water usage, etc.) may be used, such as the Verified Carbon Standard VM0042 Methodology for Improved Agricultural Land Management, v1.0 (incorporated by reference herein), which can be found online at https://verra.org/methodology/vm0042-methodology-for-improved-agricultural-land-management-v1-0/. The Verified Carbon Standard methodology quantifies the greenhouse gas (GHG) emission reductions and soil organic carbon (SOC) removals resulting from the adoption of improved agricultural land management (ALM) practices. Such practices include, but are not limited to, reductions in fertilizer application and tillage, and improvements in water management, residue management, cash crop and cover crop planting and harvest, and grazing practices.

[0097] “LRR” refers to a Land Resource Region, which is a geographical area made up of an aggregation of Major Land Resource Areas (MLRA) with similar characteristics.

[0098] DayCent is a daily time series biogeochemical model that simulates fluxes of carbon and nitrogen between the atmosphere, vegetation, and soil. It is a daily version of the CENTURY biogeochemical model. Model inputs include daily maximum/minimum air temperature and precipitation, surface soil texture class, and land cover/use data. Model outputs include daily fluxes of various N-gas species (e.g., N2O, NOx, N2); daily CO2 flux from heterotrophic soil respiration; soil organic C and N; net primary productivity; daily water and nitrate (NO3) leaching, and other ecosystem parameters.

[0099] Source interventions shrink emissions. Source intervention projects change on-farm production practices within supply sheds from which large food and beverage companies procure their grain or other agricultural needs. By intervening to change the process by which grain is produced, sponsoring companies can reduce their Scope 3 emissions. Scope 3 emissions are emissions not produced by the large company itself, and not the result of activities from assets owned or controlled by the company, but by those that the company is indirectly responsible for, up and down its value chain. Source designs and implements process changes and quantifies the emissions reduced by the process change in grain production.

[00100] Interventions are made to change farm practice distribution. For example, a company needs a grain supply. For a specified volume of grain in the company’s supply shed, interventions change the distribution of farm practices that produce the grain. As a result of the intervention, the co-occurrence probabilities of specific practice combinations are changed in the portion of grain production within the intervention project. Interventions can be relatively small compared to the size of the supply shed today. For example, an intervention may be 2 million cwt of rice, relative to a supply shed projected to produce more than 86 million cwt of rice.

[00101] Resource-efficient practices emit less per unit grain produced. The intervention changes the grain production process for a specific volume of grain and shrinks the emissions associated with its production. The grain produced with the intervention co-mingles with other grain in the supply shed before it is procured by the sponsoring company. To account for this co-mingling, emission reductions associated with interventions are quantified following the well-established mass-balance approaches described in the following protocols:

• Corporate Value Chain (Scope 3) Accounting and Reporting Standard, part of the GHG Protocol Corporate Accounting and Reporting Standard;

• Technical Guidance for Calculating Scope 3 Emissions; and

• Value Chain (Scope 3) interventions greenhouse gas accounting and reporting guidance.

These protocols are the global corporate Scope 3 emissions standards equivalent to the US SEP.
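The effect of an intervention on a co-mingled supply shed can be sketched as a simple volume-weighted calculation. This is a simplified reading of mass-balance accounting and not the full rules of the protocols listed above. The volumes (2 million cwt within an 86 million cwt supply shed) come from the rice example above; the emission intensities and the function name `shed_inventory` are hypothetical.

```python
from typing import Optional


def shed_inventory(total_volume_cwt: float,
                   baseline_intensity: float,
                   intervention_volume_cwt: float = 0.0,
                   intervention_intensity: Optional[float] = None) -> float:
    """Total emissions (kg CO2e) for a supply shed with part of its volume under intervention.

    Intensities are in kg CO2e per cwt (hypothetical units and values).
    """
    if intervention_intensity is None:
        intervention_intensity = baseline_intensity
    if not 0.0 <= intervention_volume_cwt <= total_volume_cwt:
        raise ValueError("intervention volume must lie within the supply shed volume")
    untouched_volume = total_volume_cwt - intervention_volume_cwt
    return (untouched_volume * baseline_intensity
            + intervention_volume_cwt * intervention_intensity)


# Usage: 2 million cwt of an 86 million cwt shed moves from 10 to 8 kg CO2e/cwt (assumed values).
before = shed_inventory(86e6, 10.0)
after = shed_inventory(86e6, 10.0, intervention_volume_cwt=2e6, intervention_intensity=8.0)
print(before - after)          # emissions reduction attributable to the intervention
print(after / 86e6)            # resulting average intensity of the co-mingled shed
```

The reduction depends only on the intervention volume and the change in intensity, which is why it can be accounted for even though the grain is physically co-mingled before procurement.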

IV. AN EMISSIONS EMULATOR FOR SOURCE INTERVENTIONS

[00102] As discussed above, accurately generating an ecosystem-attribute baseline for a field to generate one or more interventions is challenging due to the disparate nature of agronomic data across the world. To illustrate, consider a field with a complete or nearly complete picture of its agronomic state (e.g., robust current and historical ground-truth agronomic data, complementary remote sensing data, documented management practices, interviews with farmers, etc.). In this case, simulating the ecosystem attributes of the field with one or more models (for example, process-based biogeochemical models, inventory-based greenhouse gas emissions calculators, etc.) may provide an accurate (or more accurate) representation of the field’s baseline ecosystem attributes due to the robust dataset on which the models can act. As such, one could reasonably expect that interventions generated for the field would be more likely to provide an accurate prediction of ecosystem attribute modification for the field (due to the accurate baseline).

[00103] On the other hand, consider a field with a sparse picture of its agronomic state (e.g., only remote sensing data, an academic projection using outdated agronomic management practices, etc.). In this case, simulating ecosystem attributes of the field with one or more agricultural and/or agronomy models may not provide an accurate representation of the field's baseline ecosystem attributes due to the sparse dataset on which the models can act. As such, any interventions generated for the field, if even possible, are less likely to provide an accurate prediction of ecosystem attribute modification for the field (due to the inaccurate baseline).

[00104] To that end, various methods of simulating an ecosystem attribute (such as greenhouse gas emissions) to generate a baseline from which to generate interventions for a specific field or supply shed are now described.

First Method

[00105] In an example, ground-truth agricultural data may be used to determine an agricultural practice change. FIG. 1A is a flow chart illustrating an exemplary method 100 for determining an agricultural practice change. Exemplary method 100 may include one or more of the steps below. In step 101, the method may include reading a plurality of agricultural practices associated with a target geographic region. In step 102, the method may include determining, from the plurality of agricultural practices, a first value for a quality attribute (e.g., a data point for the agronomic data, whether ground-truth or remotely sensed) of the target geographic region. In step 103, the method may include reading a plurality of agricultural practices associated with each of a plurality of reference geographic regions. In step 104, the method may include selecting a first of the plurality of reference geographic regions based on its similarity to the target geographic region. In step 105, the method may include determining at least one agricultural practice difference between the target geographic region and the first of the plurality of reference geographic regions. In step 106, the method may include determining at least one target agricultural practice for the target geographic region. In step 107, the method may include simulating a second value of the quality attribute of the target geographic region based on the at least one target agricultural practice. In step 108, the method may include determining a recommendation regarding adoption of the target agricultural practices based on a comparison of the first and second values of the quality attribute.
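Steps 101 through 108 of method 100 can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the claimed implementation: similarity is taken to be Euclidean distance between numeric region features, the quality attribute is taken to be an emissions value, the simulation is replaced by a lookup of hypothetical per-practice emission factors, and the target practices are simply the practices of the most similar reference region. All names and values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class GeoRegion:
    name: str
    features: Tuple[float, ...]      # e.g., normalized soil and weather descriptors (assumed)
    practices: FrozenSet[str]        # agricultural practices read for the region


# Hypothetical emission factors standing in for a real simulation model.
EMISSION_FACTORS: Dict[str, float] = {
    "conventional_till": 100.0, "no_till": 60.0,
    "no_cover_crop": 50.0, "cover_crop": 30.0,
}


def simulate_attribute(practices: FrozenSet[str]) -> float:
    """Stand-in simulator: sum of per-practice emission factors."""
    return sum(EMISSION_FACTORS[p] for p in practices)


def recommend_practice_change(target: GeoRegion, references: List[GeoRegion]) -> dict:
    first_value = simulate_attribute(target.practices)                  # steps 101-102
    nearest = min(references,                                           # steps 103-104
                  key=lambda r: math.dist(r.features, target.features))
    differences = nearest.practices - target.practices                  # step 105
    target_practices = nearest.practices                                # step 106 (assumed rule)
    second_value = simulate_attribute(target_practices)                 # step 107
    return {                                                            # step 108
        "reference": nearest.name,
        "practice_differences": differences,
        "first_value": first_value,
        "second_value": second_value,
        "recommend_adoption": second_value < first_value,
    }


# Usage with hypothetical regions.
target = GeoRegion("target", (0.0, 0.0), frozenset({"conventional_till", "no_cover_crop"}))
refs = [
    GeoRegion("ref_a", (0.1, 0.1), frozenset({"no_till", "no_cover_crop"})),
    GeoRegion("ref_b", (5.0, 5.0), frozenset({"no_till", "cover_crop"})),
]
print(recommend_practice_change(target, refs))
```

In this sketch the recommendation is positive whenever the simulated second value is lower than the first; a real comparison could also weigh yield, cost, or uncertainty.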

[00106] Now, as described above, the impact of any change in practice on yield and emissions generated from the method 100 depends on what the farmer’s historical practices were and/or the accuracy and fidelity of the baseline generated from those historical practices. From the larger company’s perspective, the impact of practice changes depends on the difference (delta) between the inventory of emissions at the start of the intervention project and the inventory of emissions after practice changes have been adopted. If the farmers in an intervention were already selling their crop into the supply chain of the company and continue to sell their crop into the supply chain of the company after an intervention, then the complete impact of their practice changes can travel to the company. Additionally, the emissions inventory of crop that the company buys is the emissions released in the production of that crop. Exemplary analysis from method 100 can cover the following facets of intervention in agricultural practices:

• What are the intervention impact ranges for single or multiple practice changes moving from higher-emitting to lower-emitting field management?

• What does it take to “Shift the curve”? What happens if the co-occurrence probabilities of practices are changed in the same locations, weather, and soil?

• How do these ranges differ by supply shed?

• What is the corresponding estimated emissions inventory of the crop that is coming from the supply shed?

• How will the emissions inventory change if the intervention is implemented on a specific volume of crop in the supply shed?

Exemplary results are displayed in FIG. 5.
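The last question in the list above, how the emissions inventory changes when an intervention covers a specific volume of crop, reduces to simple arithmetic once per-bushel emissions intensities are available. The sketch below assumes hypothetical intensities and volumes; the function name and units are illustrative assumptions.

```python
def inventory_after_intervention(total_volume_bu, baseline_intensity,
                                 intervention_volume_bu, intervention_intensity):
    """Emissions inventory (kg CO2e) when part of the sourced volume adopts the intervention."""
    if not 0 <= intervention_volume_bu <= total_volume_bu:
        raise ValueError("intervention volume must lie within the sourced volume")
    untouched_volume = total_volume_bu - intervention_volume_bu
    return (untouched_volume * baseline_intensity
            + intervention_volume_bu * intervention_intensity)


# Hypothetical supply shed: 1,000,000 bu at 10 kg CO2e/bu,
# with an intervention on 200,000 bu lowering intensity to 8 kg CO2e/bu.
before = inventory_after_intervention(1_000_000, 10.0, 0, 10.0)
after = inventory_after_intervention(1_000_000, 10.0, 200_000, 8.0)
change = after - before
print(before, after, change)
```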

[00107] Unfortunately, oftentimes, generating an intervention for each individual field using method 100 may not be feasible given the sparsity of agronomic data for various fields. That is, as described above, generating emissions baselines and their corresponding interventions for fields with dense agronomic data may prove feasible, while generating the same for fields with meager agronomic data may prove infeasible and/or unreliable.

Second Method

[00108] As such, an algorithm including a sequence of models (e.g., an “emissions emulator system” or “emulator system”) enabling the generation of baselines and their corresponding interventions is described. The emulator system includes a sequence of models that employ both statistical imputation of data and emulation of field-level data to generate baselines and interventions. The algorithm allows for a field-level characterization of an emissions baseline, a simulation of possible interventions, a determination of impactful interventions based on the simulation, and the generation of intervention recommendations.

[00109] Key to this approach is the synthesis of ground-truth data and remote sensing data. Ground-truth data, as described above, is high-quality agronomic data provided by farmers (or some other party) about their fields and/or the surrounding fields (e.g., sensor measurements, reported farming practices, etc.). Remote sensing data is data obtained about the field without direct input from the farmer (e.g., satellite images, weather data, etc.). The remote sensing data may be processed to infer additional agronomic data about the field. For example, a satellite image may be analyzed to determine one or more agronomic practices that are implemented in a field. However, remote sensing data generally does not provide sufficiently detailed data for generating high-quality emissions baselines and interventions. For instance, remote sensing data may not include management events such as fertilizer applications and associated attributes such as the type and rate of fertilizer applied, etc. This data, however, may be included in ground-truth data and may be key to estimating emissions and designing interventions. Finally, remote sensing data is unable to provide estimates of emissions from these practices.

[00110] Notably, however, remote sensing data is still a valuable resource that allows simulation efforts to scale to wider agronomic regions using an algorithm that integrates ground-truth datasets (e.g., datasets that contain these farm data, databases, and sources that either contain or can simulate emissions estimates) and the remote sensing data. The algorithm below reflects a system of models (e.g., an emulator system) that impute localized management practices including dense, high-fidelity ground-truth data to the large-scale, sparse, low-fidelity, remote sensing field data. In turn, the algorithm is formulated to generate (e.g., emulate) ecosystem attributes (e.g. greenhouse gas emissions) estimates at scale over the imputed remote sensing field data (e.g., on a per-field basis).

[00111] In more detail, various datasets (e.g., some of those described in the list below) are combined in an algorithm, implemented as an emulator system, that estimates ecosystem attributes, such as greenhouse gas emissions, at the field level using a series of imputation models. The imputation models, in effect, address missing variables in remote sensing data by imputing those variables from similar variables in ground-truth data. The resulting data is used to a) characterize the various management practices applied in a field (e.g., how grain in the supply shed was grown), and b) build an emissions model that leverages a sequence of models in an ensemble. These two pieces provide information on how to design an intervention to, e.g., change the emissions profile of the crop grown in a given supply shed.

1) Datasets: gathering and processing the datasets for the appropriate supply shed (or other agronomic region) and crop of interest
   a) Remote sensing data for, e.g., the past 3 (or more) years. Example remote sensing datasets are shown in FIG. 13, and may include:
      i) Measurements from satellite images
      ii) Inferences from satellite images
   b) Ground-truth data, including one or more of:
      i) Field-level, directly observed agronomic data (for example, type of cash and cover crops planted; dates of planting, applying one or more inputs (for example, fertilizer, compost, insecticide, herbicide, etc.), grazing, harvesting a crop, terminating a cover crop, etc.)
      ii) Literature on primary emissions studies
      iii) LCA dataset from literature and data from a life cycle inventory database (for example, the ecoinvent dataset)
      iv) Estimates (e.g., model outputs, forecasts, etc.)
      v) Weather data
      vi) Soil data
      vii) IPCC data, e.g., a set of climate zones reflecting geographic boundaries with distinct climate characteristics as defined by the Intergovernmental Panel on Climate Change
      viii) Regional university extension or USDA survey datasets
2) Algorithm: designing an ecosystem attribute emulator, which includes one or more of the following steps:
   a) Quantifying ecosystem attributes for the ground-truth data, for example generating one or more of:
      i) DayCent simulations
      ii) Fast-GHG simulations
      iii) DNDC simulations
   b) Generating additional training data
      i) Applying, in some cases, robust statistical methods to impute variables from, e.g., complementary ground-truth data to missing variables in remote sensing datasets, variables in ground truth management data using observations from remote sensing data, etc.
      ii) Applying, in some cases, an input engine to the ground truth data to generate synthetic agronomic reference data
   c) Training a model (for example, a tree-based ensemble model such as Extreme Gradient Boosting or some other tree-based ensemble model) using the remote sensing data, the ground-truth data, the ecosystem attribute simulations, and the imputed dataset to generate emissions baselines on a field-level basis, and
   d) Generating field-level ecosystem attribute estimates that may be used to generate interventions for fields or supply sheds
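Steps 2a, 2c, and 2d above can be illustrated with a toy emulator: an expensive simulator labels a small set of fields, and a cheap tree-based ensemble is trained to reproduce those labels. The sketch below is an assumption-laden stand-in. `expensive_simulator` is a hypothetical linear function, not DayCent, and the gradient-boosted decision stumps are written in plain Python in place of an Extreme Gradient Boosting library so the example has no dependencies.

```python
def expensive_simulator(n_rate, tillage_passes):
    """Hypothetical stand-in for a process-based model (step 2a)."""
    return 100.0 + 0.8 * n_rate + 25.0 * tillage_passes


def fit_stump(rows, residuals):
    """Find the single split (feature, threshold) that minimizes squared error."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[feature] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[feature] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, feature, threshold, left_mean, right_mean)
    return best[1:]


def train_emulator(rows, targets, rounds=100, learning_rate=0.1):
    """Step 2c: gradient boosting on squared error with one-split trees."""
    base = sum(targets) / len(targets)
    predictions = [base] * len(rows)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(targets, predictions)]
        feature, threshold, left_mean, right_mean = fit_stump(rows, residuals)
        stumps.append((feature, threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if row[feature] <= threshold else right_mean)
            for row, p in zip(rows, predictions)
        ]
    return base, learning_rate, stumps


def emulate(model, row):
    """Step 2d: field-level estimate from the trained emulator."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (left if row[feature] <= threshold else right)
        for feature, threshold, left, right in stumps
    )


def mean_squared_error(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


rows = [(float(n), float(t)) for n in (0, 50, 100, 150, 200) for t in (0, 1, 2, 3)]
targets = [expensive_simulator(n, t) for n, t in rows]
model = train_emulator(rows, targets)

mean_target = sum(targets) / len(targets)
baseline_mse = mean_squared_error([mean_target] * len(targets), targets)
emulator_mse = mean_squared_error([emulate(model, row) for row in rows], targets)
print(baseline_mse, emulator_mse)
```

Once trained, the emulator can be evaluated on many fields far more cheaply than the simulator it imitates, which is the reason for the two-stage design.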

[00112] Following this algorithm, one of the main goals is to generate sufficient data to accurately train a model for the sequence of models described herein. In step 2b, one approach is to determine the co-occurrence probabilities of practices at the level of the supply shed and estimate models for a field-level characterization of the ecosystem attribute (e.g. emissions) inventory in the supply shed. For context, some example imputation techniques may include: coarse tillage to inferred fine-grained tillage, no-till data to zero tillage passes, conventional tillage data to three tillage passes with moldboard plow, identification of cover crop to perennial cover crop and planting date, identification of irrigation to an amount of water applied to a crop, among other examples. Missing and sparse data is addressed by sequential and multiple imputation techniques. Another approach is to generate synthetic field data using an input engine. The input engine receives a period of historic ground truth data for a specific time and place and generates synthetic data for a field or similar field that resembles that data. In effect, the input engine creates a series of “ground truth” agronomic actions that produce “ground truth” agronomic data based on, e.g., a decade’s worth of real agronomic data for the field. In some cases, the input engine can be trained to generate agronomic data for variables that are more sparse or missing from the ground truth data such that the simulated data is more robust than actual ground-truth data. In both of these cases, the algorithm addresses the challenges of missing and sparse data, small samples, and nonrepresentativeness of areas for interventions that occur for various reasons (as described hereinabove) in modern agronomic practice.
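Two of the ideas in this paragraph lend themselves to a short sketch: the rule-based imputation of coarse tillage classes to fine-grained tillage attributes, and the co-occurrence probabilities of practices within a supply shed. The record layout, key names, and pairwise definition of co-occurrence below are illustrative assumptions; only the no-till and conventional tillage mappings come from the text.

```python
from collections import Counter
from itertools import combinations

# Rule-based imputations named in the text; the dictionary layout is an assumption.
TILLAGE_IMPUTATION = {
    "no_till": {"tillage_passes": 0, "implement": None},
    "conventional": {"tillage_passes": 3, "implement": "moldboard_plow"},
}


def impute_tillage(record):
    """Return a copy of a remote sensing record with fine-grained tillage fields filled in."""
    imputed = dict(record)
    imputed.update(TILLAGE_IMPUTATION.get(record.get("tillage_class"), {}))
    return imputed


def co_occurrence_probabilities(fields):
    """Fraction of fields in the supply shed in which each pair of practices co-occurs."""
    counts = Counter()
    for practices in fields:
        counts.update(combinations(sorted(practices), 2))
    return {pair: n / len(fields) for pair, n in counts.items()}


imputed_record = impute_tillage({"field_id": "F1", "tillage_class": "conventional"})

supply_shed = [
    {"cover_crop", "no_till"},
    {"cover_crop", "no_till", "irrigation"},
    {"irrigation"},
    {"no_till"},
]
probabilities = co_occurrence_probabilities(supply_shed)
print(imputed_record)
print(probabilities)
```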

[00113] FIG. 1B is a flow chart illustrating a method for determining an agricultural practice change (intervention) using an emissions emulator inputting ground-truth data and remote sensing data, according to one example embodiment. The method 110 may include additional or fewer steps, and the steps may occur in a different order. Moreover, one or more of the steps may be repeated or omitted.

[00114] The method may be implemented by a network system including an emulator system and an intervention module (see below). The network system, using the emulator system and intervention module, is configured to generate an estimate of an ecosystem attribute for an agronomic region, and generate an intervention for the agronomic region based on the estimate. The network system may be connected to one or more remote systems via a network. Each of the one or more remote systems includes a data accessing module configured for accessing agronomic data and/or a data measurement module configured for measuring agronomic data.

[00115] The network system accesses 111 ground-truth agronomic data for each of a first set of agronomic regions. Each agronomic region may be, e.g., a field or a combination of fields, a supply shed or a combination of supply sheds, etc. The ground-truth agronomic data may include, e.g., current or previous weather data, current or previous soil data, current or previous crops grown in the agronomic regions, one or more current or previous tillage pass events, one or more current or previous nitrogen fertilizer events, one or more emissions measurements, or any other applicable on-field measurements. The network system may receive or access the ground-truth agronomic data from one or more client systems and/or one or more data stores.

[00116] The network system applies 112 a first agronomic model to the ground-truth data to generate emulator training data for a first set of agronomic regions. That is, the first agronomic model inputs the ground-truth data for the first set of agronomic regions and outputs emulator training data for the first set of agronomic regions. The emulator training data is a data set that provides high-quality, dense agronomic data for each agronomic region in the first set of agronomic regions based on the input ground-truth data.

[00117] The emulator training data generally includes one or more ecosystem attributes of the first set of agronomic regions but may include additional or different data. For example, the first agronomic model may generate an ecosystem attribute for an agronomic region in the training data that is missing in the ground-truth data, generate an ecosystem attribute that is a synthesis of the input ground-truth data, or generate an ecosystem attribute that is a function of, or computed from, the input ground-truth data. In some configurations, the emulator training data may also include the input ground-truth data itself. Additionally, the first agronomic model is typically a computationally expensive high-quality agronomic model configured to generate accurate, robust estimates of various agronomic data based on input agronomic data. Examples of models for estimation of one or more ecosystem attributes include DayCent (Parton et al. 1998 Global and Planetary Change 19(1-4) 35-48; Kelly et al. 2000 J. Geophys. Res. 105 20,093-20,100; Del Grosso et al. 2001 In M. Schaffer (ed.) Modeling carbon and nitrogen dynamics for soil management, CRC Press LLC, Boca Raton FL), DayCent-CR (Mathers et al. 2023 Geoderma 438, 116647), RothC, MEMs (Robertson et al. 2019 Biogeosciences 16(6) 1225-1248; Zhang et al. 2021 Biogeosciences 18 3147-3171), Fast-GHG (atkinson.cornell.edu/fast-ghg/), BAMS (Riley et al. 2014 Geoscientific Model Development 7(4) 1335-1355; Tang et al. 2019 Biogeochemistry 144(2) 197-214), Comission (Ahrens et al. 2015 Soil Biology and Biochemistry 88 390-402), Corpse (Sulman et al. 2014 Nature Climate Change 4(12) 1099-1102; Sulman et al. 2017 Ecology Letters 20(8) 1043-1053), MEND (Wang et al. 2013 Ecology Applications 23(1) 255-272), Mimics (Wieder et al. 2014 Biogeosciences 11(14) 2899-3917; Kyker-Snowman et al. 2022 Geoscientific Model Development 11(6) 2111-2138), SALUS (Basso et al. 2006 Italian Journal of Agronomy 1(4) 677-688), DNDC (Li et al. J. Geophys. Res. 97 9759-9776), etc.

[00118] The first agronomic model may be trained using measured ground-truth data for an agronomic region (e.g., emissions data obtained from one or more eddy covariance towers) and corresponding agronomic practice data (e.g., data obtained from farming equipment including without limitation irrigation systems, harvesting equipment/vehicles, planting equipment/vehicles, etc.) for that agronomic region and/or agronomic area. Additionally, the first agronomic model may be trained using ground-truth and emissions data for “similar” agronomic regions (or areas). A similar agronomic region may be a region having substantially similar characteristics (e.g., soil type, weather, management practices) or locations (e.g., a neighboring field). In this manner, the first agronomic model is trained to identify correlations and correspondences between ground-truth data and agronomic measurements such that the model may predict agronomic measurements for other agronomic regions given the ground-truth data for the input agronomic region(s).

[00119] The network system accesses 113 remote sensing data representing a second set of agronomic regions. The remote sensing data may include, e.g., remotely sensed previous or current weather data, previous or current soil data, previous or current crop information, previous or current tillage state, previous or current irrigation state, etc. Typically, the second set of agronomic regions includes a greater number of agronomic regions than the first set of agronomic regions because generating remote sensing data for an agronomic region is easier than generating ground-truth data (and emulator training data) for an agronomic region. Additionally, in some configurations, the second set of agronomic regions may be the same set of agronomic regions as the first set of agronomic regions, or the second set of agronomic regions may be smaller than the first set of agronomic regions. The second set of agronomic regions may comprise some, all, or none of the first set of agronomic regions. The second set of agronomic regions may be larger than the first set of agronomic regions. Whatever the configuration, the first set of agronomic regions represents those regions for which there is high-quality dense agronomic data, and the second set of agronomic regions represents those regions for which there is lower-quality, sparse agronomic data.

[00120] The remote sensing data is typically sparser (e.g. lower spatial resolution) and has lower fidelity than the ground-truth data and/or the emulator training data. That is, the remote sensing data may include fewer variables (e.g., measurements, data points, etc.) at lower resolution (e.g., one datapoint per field rather than multiple datapoints per field) than the ground-truth data and/or emulator training data. In some cases, the second set of agronomic regions may include one or more of the agronomic regions in the first set of agronomic regions. The ground-truth data and/or emulator training data for an agronomic region may be supplemented with remote sensing data. The network system may receive or access the remote sensing agronomic data from one or more remote systems and/or one or more data stores.

[00121] In some embodiments, the network system may analyze remote sensing data to detect one or more unobserved agricultural practices. For example, one or more filtering and/or semantic segmentation operations can be applied to remote sensing data to determine an estimate of the percent crop residue cover of soil. Examples of filtering operations include mean filtering, Gaussian filtering, high-pass filtering, filtering based on green spectral bands, etc. Examples of segmentation operations include convolutional neural networks, deep convolutional neural networks, fully convolutional networks, pyramid scene parsing networks, Encoder-Decoders, U-Nets, DeepLab, ParseNet, etc. Region-growing / superpixel algorithms are commonly used as a first step in segmentation. This creates “superpixels” based on pixel similarity that are then easier to combine into a final segmentation. A low percent of crop residue cover and/or patterns of crop residue cover within a geographic region are associated with tillage events. The percent crop residue cover and/or pattern of crop residue cover may be associated with one or more unobserved agricultural practices, for example, one or more permutations of: the tillage instrument utilized (e.g., chisel, disk, moldboard, minimum till ripper, etc.), the depth of tillage, tillage frequency (e.g., number of tillage passes), etc. Each of the foregoing is an example of an attribute of the agricultural practice of tillage. Other agricultural practices may have attributes such as the application rate of an input application activity (e.g. fertilizer, herbicide, insecticide, etc.), a seeding rate or seeding density, a rate and/or duration of irrigation, a frequency of input application, a cover crop blend, etc.
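As a rough illustration of the filtering route described above, the sketch below applies a 3x3 mean filter to a small grid of per-pixel residue scores, estimates percent crop residue cover, and flags a likely tillage event when cover is low. The residue score, the 0.5 pixel threshold, and the 30 percent cutoff are hypothetical assumptions, and a production system would more likely use one of the segmentation networks listed in the paragraph.

```python
def mean_filter(image):
    """3x3 mean filter; border pixels average over the neighbors that exist."""
    rows, cols = len(image), len(image[0])
    smoothed = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        smoothed.append(out_row)
    return smoothed


def percent_residue_cover(image, residue_threshold=0.5):
    """Percent of smoothed pixels whose residue score meets the (hypothetical) threshold."""
    pixels = [value for row in mean_filter(image) for value in row]
    covered = sum(1 for value in pixels if value >= residue_threshold)
    return 100.0 * covered / len(pixels)


def likely_tilled(image, cover_cutoff_percent=30.0):
    """Low residue cover is treated as evidence of a tillage event."""
    return percent_residue_cover(image) < cover_cutoff_percent


bare_field = [[0.1] * 4 for _ in range(4)]      # little residue left on the soil
residue_field = [[0.9] * 4 for _ in range(4)]   # residue covers the soil
print(percent_residue_cover(bare_field), likely_tilled(bare_field))
print(percent_residue_cover(residue_field), likely_tilled(residue_field))
```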

[00122] The network system may train 114 a second agronomic model to generate an estimate of one or more ecosystem attributes for an input agronomic region. The network system may input any of (1) the ground-truth data, (2) the first set of agronomic regions, (3) the determined emulator training data for the first set of agronomic regions, (4) the remote sensing data, and (5) the second set of agronomic regions, and output of an ecosystem attribute model (for example, a process-based biogeochemical model) for any given input agronomic region (e.g., any given field). For instance, a farmer may input a field to the second agronomic model and receive an emission baseline for the field (even if the input field does not correspond to any remote sensing and/or ground-truth data).

[00123] The network system may train the second agronomic model using one or more techniques. In a first example, the network system may train the second agronomic model by applying statistically robust imputation models to the ground-truth data and remote sensing data which imputes information from the dense ground-truth data to the sparse remote sensing data. For example, a first agronomic region may have a robust ground-truth data set while its neighboring second agronomic region may have a sparse remote sensing data set. As such, the network system may impute some of the information from the ground-truth dataset for the first agronomic region to the second agronomic region given their proximity and/or similarity. In a second example, the network system may train the second agronomic model by identifying correlations and correspondences between (1) the ground-truth data and emulator training data for the first set of agronomic regions and (2) the remote sensing data for the second set of agronomic regions. In effect, the second agronomic model is trained to infer ecosystem attributes (e.g., emissions baselines) for any given agronomic region based on underlying, identifiable (or, in some cases, not identifiable) relationships between the (1) sparse remote sensing data and (2) the dense ground truth and emulator training data.
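The first training example above, imputing information from a region with dense ground-truth data to a nearby region with sparse remote sensing data, can be sketched with a nearest-neighbor rule. This is a deliberately simple stand-in for the statistically robust imputation models the text refers to; the coordinate fields, variable names, and region records are hypothetical.

```python
import math


def nearest_ground_truth(sparse_region, ground_truth_regions):
    """Pick the ground-truth region closest to the sparse region (proximity only)."""
    def distance(region):
        return math.hypot(region["x_km"] - sparse_region["x_km"],
                          region["y_km"] - sparse_region["y_km"])
    return min(ground_truth_regions, key=distance)


def impute_from_neighbor(sparse_region, ground_truth_regions, variables):
    """Fill variables that are missing in the sparse record from the nearest donor."""
    donor = nearest_ground_truth(sparse_region, ground_truth_regions)
    imputed = dict(sparse_region)
    for name in variables:
        if imputed.get(name) is None:
            imputed[name] = donor[name]
    imputed["imputed_from"] = donor["region_id"]
    return imputed


ground_truth = [
    {"region_id": "A", "x_km": 0.0, "y_km": 0.0, "n_rate_lb_per_acre": 150.0, "tillage_passes": 2},
    {"region_id": "B", "x_km": 10.0, "y_km": 10.0, "n_rate_lb_per_acre": 90.0, "tillage_passes": 0},
]
sparse = {"region_id": "S", "x_km": 1.0, "y_km": 1.0, "n_rate_lb_per_acre": None}

filled = impute_from_neighbor(sparse, ground_truth, ["n_rate_lb_per_acre", "tillage_passes"])
print(filled)
```

A similarity measure over soil type, weather, or management practices could replace the distance function without changing the rest of the sketch.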

[00124] The network system receives 115 a particular agronomic region from a client system and applies 116 the second agronomic model to the agronomic region to determine an emission level for the particular agronomic region. In various examples, the particular agronomic region (1) may have corresponding remote sensing data and corresponding ground-truth data, (2) may have corresponding remote sensing data but does not have corresponding ground-truth data, or (3) may not have either corresponding remote sensing data or corresponding ground-truth data. In various examples, the particular agronomic region includes a specific field or field(s), a type of field, a supply shed (e.g. sourcing region), etc.

[00125] In some configurations, the network system may also receive remote sensing data and/or ground-truth data for the particular agronomic region. In this case, the network system may input the received remote sensing data and/or ground-truth data for the particular agronomic region and the agronomic region itself into the second agronomic model when generating the emissions baseline for the particular agronomic region. In this case, typically, the emissions baseline is more likely to be correct because the second agronomic model has a deeper set of input data for the particular agronomic region from which to generate its emissions baseline.

[00126] The network system may apply a third agronomic model to generate 117 one or more interventions for the particular agronomic region. The one or more interventions may include one or more agronomic practices that are different from the typical (or recorded) agronomic practices for the region. The one or more different agronomic practices may modify an ecosystem attribute in the particular agronomic region which would, in turn, lead to a modified estimate of the ecosystem attribute (e.g. emissions estimate) for the agronomic region. The modified emission estimate (from the one or more different agronomic practices) relative to the original emission baseline allows the network system to determine an environmental impact for the particular agronomic region and each set of one or more agronomic practices. A set of optimized agronomic practices to implement (an intervention) may be selected based on maximizing a beneficial change in an ecosystem attribute, maximizing the permanence of an ecosystem attribute, reducing a permanence risk of an ecosystem attribute, maximizing a benefit to plant health, maximizing a benefit to soil health, reducing uncertainty in a quantification of an ecosystem attribute, maximizing the overall ecosystem impact for the field, minimizing requirements for future management events, or minimizing current or future data collection requirements.
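One of the selection criteria above, maximizing a beneficial change in an ecosystem attribute relative to the baseline, can be sketched as an exhaustive search over small sets of practice changes. The `emulate_emissions` function and its effect values are hypothetical placeholders for the second agronomic model, and the additive effects are an assumption made for brevity.

```python
from itertools import combinations

# Hypothetical additive effects (kg CO2e per acre) standing in for the emulator.
EFFECTS = {
    "nitrogen_balance": -10.0,
    "cover_crop": -40.0,
    "no_till": -60.0,
    "optimal_planting_date": -25.0,
}


def emulate_emissions(practices):
    """Placeholder for applying the trained emulator to a set of practices."""
    return 500.0 + sum(EFFECTS.get(p, 0.0) for p in practices)


def rank_interventions(baseline_practices, candidate_practices, emulate, max_changes=2):
    """Rank sets of added practices by emissions reduction relative to the baseline."""
    baseline = emulate(frozenset(baseline_practices))
    new_practices = sorted(set(candidate_practices) - set(baseline_practices))
    ranked = []
    for k in range(1, max_changes + 1):
        for added in combinations(new_practices, k):
            modified = emulate(frozenset(baseline_practices) | set(added))
            ranked.append((baseline - modified, added))
    ranked.sort(reverse=True)
    return baseline, ranked


baseline_value, ranked = rank_interventions(
    {"nitrogen_balance"},
    {"cover_crop", "no_till", "optimal_planting_date"},
    emulate_emissions,
)
best_reduction, best_practices = ranked[0]
print(baseline_value, best_reduction, best_practices)
```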

[00127] In some embodiments, the network system can be applied to estimate the contribution of one or more agronomic practices, one or more fields, or combinations of one or more agronomic practices and one or more fields, to ecosystem attributes or ecosystem impacts of a geographic region, optionally, in a current, past, or future time period.

Varying Model Fidelity

[00128] More generally, the processes set forth above describe combining “higher” fidelity agronomic data with “lower” fidelity agronomic data to generate a baseline value for one or more ecosystem attributes (e.g. methane emissions) for an agronomic region. Notably, what may be considered “higher” fidelity agronomic data and “lower” fidelity agronomic data may depend on the configuration of the network system.

[00129] As described above, the “higher” fidelity agronomic data was ground-truth data received or accessed from client systems and/or databases, and the “lower” fidelity agronomic data was remote sensing data received or accessed from remote sensing systems or databases. In another configuration, the “higher” fidelity agronomic data may be ground-truth data received or accessed from client systems or databases, and the “lower” fidelity agronomic data may be data sampled or aggregated from the “higher” fidelity ground-truth data. For instance, if the “higher” fidelity ground-truth data includes each day and time a tilling pass occurs on a field, the “lower” fidelity data may include “tilled field” (derived from the ground-truth data). In effect, the “lower” fidelity data may be down-sampled data created from the “higher” fidelity data.

[00130] The choice of which type of data to use for higher fidelity vs. lower fidelity data has implications for training an agronomic model to generate an emissions baseline. Typically, there is an inverse relationship between the fidelity of the “lower” fidelity data and the imputation breadth of the trained model. In other words, it is easier to impute lower fidelity data across a large, disparate set of agronomic regions than it is to impute higher fidelity data across that same set of agronomic regions. Therefore, in various configurations, the network system may allow a user to select which fidelity level to apply when generating an emissions baseline. Similarly, the network system may also enable a mixture model that applies various levels of fidelity to agronomic regions based on the available datasets.

[00131] To illustrate this, FIG. 16 illustrates differences in a determined ecosystem attribute for different states when calculated using a high-fidelity model and a low-fidelity model. In the illustrated example, the x-axis represents a determined ecosystem attribute (e.g., kilograms of emitted carbon dioxide per bushel) using a low-fidelity emulation model, while the y-axis represents the same ecosystem attribute determined using a high-fidelity emulation model. In this example, a high-fidelity emulation model indicates that the emulator is trained to determine an ecosystem attribute on a denser dataset than the low-fidelity emulator.

[00132] In the figure, variations in color correspond to different agronomic regions (e.g., states). Each data point represents an emulation of the agronomic region. Thus, each data point represents the emulation of an ecosystem attribute for an agronomic region, with the x-axis indicating the result with a low-fidelity emulator and the y-axis indicating the result with a high-fidelity emulator.

[00133] The illustrated figure includes a dashed red line showing where the high-fidelity model and the low-fidelity model make the same determination of the ecosystem attribute. Additionally, large points on the graph represent an average of the emulations (both high fidelity and low fidelity) for a given agronomic region.
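The down-sampling example given earlier in this section, in which timestamped tilling passes are reduced to a “tilled field” label, can be sketched as follows. The event layout and field identifiers are hypothetical assumptions.

```python
from datetime import datetime


def downsample_tillage(events):
    """Collapse timestamped management events per field into a 'tilled' flag."""
    fields = {}
    for event in events:
        summary = fields.setdefault(event["field_id"], {"tilled": False})
        if event["practice"] == "tillage_pass":
            summary["tilled"] = True
    return fields


# Higher-fidelity ground-truth data: each day and time a management event occurred.
high_fidelity = [
    {"field_id": "F1", "practice": "tillage_pass", "time": datetime(2022, 4, 3, 9, 30)},
    {"field_id": "F1", "practice": "tillage_pass", "time": datetime(2022, 4, 20, 14, 0)},
    {"field_id": "F2", "practice": "planting", "time": datetime(2022, 5, 1, 8, 15)},
]

low_fidelity = downsample_tillage(high_fidelity)
print(low_fidelity)
```

The lower-fidelity record discards the number and timing of passes, which is what makes it easier to impute across a large set of regions.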

V. LEVERAGING AN EMISSIONS EMULATOR SYSTEM

[00134] As described above, a network system may be configured to leverage an emissions emulator (e.g., a series of agronomic models) to generate interventions for agronomic regions using ground-truth agronomic data and remote sensing agronomic data. The emulator generates an agronomic baseline for a region, and the network system generates one or more interventions for that agronomic region. The intervention may include one or more modified agronomic practices that produce a (beneficial) environmental impact on the agronomic region. The following figures illustrate how interventions affect environmental impacts and the effects of those interventions and environmental impacts on various parties.

[00135] Returning now to FIG. 2, FIG. 2 is a schematic illustrating differences between total emissions with and without a source intervention, according to embodiments disclosed herein. As shown, the total emissions output is reduced with a source intervention. The source emissions reduction occurs as a part of Scope 3, where Scope 3 emissions are emissions not produced by the large company itself, and not the result of activities from assets owned or controlled by the company, but by those that the company is indirectly responsible for, up and down its value chain. The reduction in emissions is calculated by collecting data on interventions and accounting for what the interventions changed. Scopes 1, 2, and 3 are calculated by a company based on the data they can obtain.

[00136] FIG. 3 is a schematic illustrating different practice distributions, according to embodiments disclosed herein. A bushel of grain within the intervention project will have been grown with a different distribution of practices compared to a bushel from the larger supply-shed. P1, P2, P3, P4, ... Pn denote different agricultural practices.

[00137] FIG. 4 is a chart illustrating an exemplary construction of neighbor pairs, according to embodiments disclosed herein. Neighboring pairs are used to assess the impact of interventions in the planning phase of a project. Imputation modeling can be used to combine and augment knowledge and/or data sources as well. An intermediate metric that measures the extent of the distributional shift that the interventions bring about could be the KL divergence between the initial supply-shed co-occurrence probabilities and the resulting probabilities from the intervention. Continuous improvement may lead to a larger shift from the supply-shed and an appreciable shift each year against the previous year. Both the intervention and the neighboring supply-shed scenario model follow the guidelines in the GHG protocol guidance. This modeling framework incorporates some assumed design. For example, an intervention project is far more effective in a region where practices are high emitting and there is scope for improvement. Regions with low-emitting practices offer lower results. The assumptions make a case for targeted interventions to the regions where they would have the greatest impact. In this case, for example, focus is on the effect of adopting an Optimal Planting Date as the practice of interest. In the pair illustrated in FIG. 7, the field of interest is already implementing practices that represent a good Nitrogen balance (P4) in the baseline year. In a subsequent year of the project, the field implements an Optimal Planting Date (P1) as well as a Phosphorus application (P6) based on what its neighbor implemented.
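The KL divergence mentioned above as an intermediate metric can be computed directly from two discrete distributions over practice combinations. In the sketch below, the direction of the divergence (post-intervention relative to initial), the epsilon guard, and the example probabilities are assumptions; the text does not fix any of them.

```python
import math


def kl_divergence(p, q, epsilon=1e-12):
    """D_KL(P || Q) in nats over practice combinations; q is floored at epsilon."""
    total = 0.0
    for key in set(p) | set(q):
        p_k = p.get(key, 0.0)
        q_k = max(q.get(key, 0.0), epsilon)
        if p_k > 0:
            total += p_k * math.log(p_k / q_k)
    return total


# Hypothetical shares of supply-shed fields by practice combination (each sums to 1).
initial = {"P1+P4": 0.2, "P4 only": 0.5, "neither": 0.3}
after_intervention = {"P1+P4": 0.5, "P4 only": 0.3, "neither": 0.2}

shift = kl_divergence(after_intervention, initial)
no_shift = kl_divergence(initial, initial)
print(shift, no_shift)
```

A value of zero means the intervention left the practice distribution unchanged, and larger values indicate a larger shift away from the initial supply-shed distribution.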

[00138] Using the discussed framework, the method would predict the impact of a suboptimal planting date and a good nitrogen balance on yield (1a) and emissions (1b). Then, the impact of adopting an Optimal Planting Date on yield (2a) is predicted (using a pre-trained yield model ensemble), the new Nitrogen balance is calculated, and then emissions (2b) are predicted (using a pre-trained emissions model ensemble). The changes in Phosphorus applications are also plugged in as part of these predictions.

[00139] The difference between the updated emissions predictions (2b) and the original emissions predictions (1b) is determined, as well as the difference between updated yield predictions (2a) and the original yield predictions (1a). These two differences allow for a sense of how adopting an optimal planting date impacts both yield and emissions in the presence of other co-occurring practices. Results for co-occurrences while adopting an Optimal Planting Date are shown in FIG. 8 and FIG. 9. FIG. 8 is a schematic illustrating the effect of adopting an optimal planting date using a neighbor pair, according to an exemplary embodiment.
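The sequence of predictions (1a), (1b), (2a), (2b) and the two differences can be traced with a small worked example. The functions below are hypothetical stand-ins for the pre-trained yield and emissions model ensembles, and every coefficient (yields, nitrogen removal per bushel, emissions per unit of nitrogen balance) is an assumed value for illustration only.

```python
N_REMOVED_PER_BUSHEL = 1.0  # hypothetical lb N removed in grain per bushel


def predict_yield(optimal_planting_date):
    """Stand-in for the yield model ensemble (bu/acre)."""
    return 60.0 if optimal_planting_date else 55.0


def nitrogen_balance(n_applied, yield_bu):
    """Applied nitrogen minus nitrogen removed in the harvested grain."""
    return n_applied - N_REMOVED_PER_BUSHEL * yield_bu


def predict_emissions(n_balance, phosphorus_applied):
    """Stand-in for the emissions model ensemble (kg CO2e/acre)."""
    return 200.0 + 2.0 * n_balance + (5.0 if phosphorus_applied else 0.0)


n_applied = 80.0

# Baseline year: suboptimal planting date, no Phosphorus application.
yield_1a = predict_yield(optimal_planting_date=False)
emissions_1b = predict_emissions(nitrogen_balance(n_applied, yield_1a), False)

# Project year: Optimal Planting Date adopted, Phosphorus application plugged in.
yield_2a = predict_yield(optimal_planting_date=True)
emissions_2b = predict_emissions(nitrogen_balance(n_applied, yield_2a), True)

yield_delta = yield_2a - yield_1a
emissions_delta = emissions_2b - emissions_1b
print(yield_delta, emissions_delta)
```

Note how the new yield feeds the recalculated nitrogen balance before emissions are predicted, mirroring the order of operations in the paragraphs above.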

[00140] Additionally, farmers often implement a varying number of practices each year on their fields. By including the effect of other practices that may have occurred alongside that intervention, the approach attempts to realistically capture observed co-occurrence patterns and their effect on yield as well as emissions (and emissions changes). In other words, the approach incorporates observed differences in farming practices within the supply shed. These observed differences help illustrate which farmers are ahead of the curve, and how certain farmers could incorporate new practices to lower their emissions and even increase their yield. To evaluate the impact of practice changes, farmers in the same supply shed are paired up and differences in their absolute emissions are evaluated. FIG. 9 is a graph comparing a number of co-occurring practices with optimal planting over different geographic regions, according to embodiments disclosed herein.

[00141] FIG. 5 is a chart illustrating exemplary results from co-occurring changes in management practices, according to embodiments disclosed herein. A paired-neighbor practice co-occurrence algorithm is used to understand how shifting the existing practice co-occurrence patterns in a given region affects yields as well as emissions. This method assumes that practice co-occurrences as reported by farmers in a program are representative of farmers in the region. For example, consider FIG. 6, a map illustrating a paired region in which a field-year of interest occurs. Appropriate neighbor-pairs are selected for this field-year by considering all of the other field-years in the same region that are available in the dataset. It is assumed that the selected neighbor-pairs represent the feasible future states of the field of interest. In this example, there are a total of seven practices that could be mechanistically connected to, and directly or indirectly contribute to, the emissions resulting from growing the wheat on the field. The practices that were reported from the given field-year are contrasted with the practices reported from selected neighbor-pairs with lower emissions.
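A minimal sketch of the neighbor-pair selection described above is given below. The record layout, the identifiers, the emissions values, and the function name `select_neighbor_pairs` are hypothetical; the sketch only assumes that neighbors are field-years in the same region with lower emissions, as the paragraph states.

```python
# Hypothetical field-year records; practice labels follow the P1..Pn convention.
field_years = [
    {"id": "A-2021", "region": "R1", "practices": {"P4"}, "emissions": 640.0},
    {"id": "B-2021", "region": "R1", "practices": {"P1", "P4", "P6"}, "emissions": 590.0},
    {"id": "C-2021", "region": "R1", "practices": {"P2"}, "emissions": 700.0},
    {"id": "D-2021", "region": "R2", "practices": {"P1", "P4"}, "emissions": 560.0},
]

def select_neighbor_pairs(target, candidates):
    """Pair the target with same-region field-years that have lower emissions
    and contrast the reported practices of each pair."""
    pairs = []
    for other in candidates:
        if other["id"] == target["id"]:
            continue
        if other["region"] != target["region"]:
            continue
        if other["emissions"] >= target["emissions"]:
            continue
        pairs.append({
            "neighbor": other["id"],
            # practices the neighbor reports that the target does not
            "practices_to_adopt": sorted(other["practices"] - target["practices"]),
            # practices the target reports that the neighbor does not
            "practices_to_drop": sorted(target["practices"] - other["practices"]),
        })
    return pairs

target = field_years[0]
pairs = select_neighbor_pairs(target, field_years)
for pair in pairs:
    print(pair)
```

Each returned pair represents one feasible future state of the field of interest; a fuller implementation would presumably also match on soil, weather, and crop before accepting a neighbor.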

[00142] Stacking occurs when multiple practice changes (sometimes referred to herein as levers) are implemented within a given field-year (e.g., conventional till → no till and a nitrogen timing change). Stacking is an important concept because differences (deltas) between farmers often occur not as a single intervention but as a stack of several within the same year. Stacking is common in synthetic data (FIG. 10A), and FIG. 10B shows some of the delta trends as more practices are stacked. FIG. 10A is a series of graphs illustrating a frequency of interventions by number of levers, according to embodiments disclosed herein. FIG. 10B is a series of graphs illustrating a CO2e impact associated with interventions with a number of levers, according to embodiments disclosed herein. Practice changes include without limitation: fertility management (for example, application of nutrients such as nitrogen to maintain soil composition within optimal range accounting for nutrients removed by crops), fertility timing (for example, timing of nitrogen application relative to weather events, seasons (e.g., spring application), etc.), use of starter fertilizer (e.g., phosphorus application), tillage and residue management (e.g., conservation tillage such as no-till, in-row subsoiling, strip-till, ridge-till, or any other tillage practice that builds up crop residue on the soil surface), etc.

[00143] Since relatively few interventions in the simulated data occur by themselves, emissions deltas are estimated as the median emissions change within each practice/supply shed across varying numbers of stacks. This means that within each individual intervention, other interventions are included that may have occurred alongside that intervention. A histogram of all simulated emissions deltas across number of levers is provided (FIG. 11A) as well as across interventions (FIG. 11B). FIG. 11A is a graph illustrating emissions deltas resulting from various combinations of management levers. FIG. 11B is a graph illustrating emissions deltas resulting from various combinations of interventions.

[00144] FIG. 12A is a map of exemplary region CMZ 40 or KS South Central. FIG. 12B is a series of graphs illustrating common intervention combinations in the KS South Central region.
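The median-based estimate of emissions deltas across stacks can be sketched as below. The simulated records, lever names, and delta values are invented for illustration. The sketch assumes that a delta observed for a stack of levers counts toward every lever in that stack, which is one reading of the statement that co-occurring interventions are included within each individual intervention.

```python
from collections import defaultdict
from statistics import median

# Hypothetical simulated deltas: (supply shed, levers changed together, CO2e delta).
simulated = [
    ("KS South Central", ("no_till",), -40.0),
    ("KS South Central", ("no_till", "n_timing"), -65.0),
    ("KS South Central", ("no_till", "n_timing", "starter_p"), -55.0),
    ("KS South Central", ("n_timing",), -10.0),
    ("KS South Central", ("n_timing", "starter_p"), -20.0),
]

deltas_by_lever = defaultdict(list)
for shed, levers, delta in simulated:
    for lever in levers:
        # a stacked delta contributes to each lever that is part of the stack
        deltas_by_lever[(shed, lever)].append(delta)

# Median emissions change per practice and supply shed, across all stack sizes.
median_delta = {key: median(values) for key, values in deltas_by_lever.items()}

for (shed, lever), value in sorted(median_delta.items()):
    print(shed, lever, value)
```

Using the median rather than the mean keeps a single unusually large stacked delta from dominating the estimate for a lever that rarely occurs alone.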

[00145] FIG. 13 is an exemplary schematic illustrating a remote sensing dataset, according to embodiments disclosed herein. The dataset includes inferred boundaries for the supply shed, inferred practices for those boundaries, and inferred yield. A caveat for this dataset is that the management practices available in L5 RSG inference are very coarse. Accordingly, there is a need to impute more granular management practices that extend this dataset using a modeling sequence.

[00146] FIG. 14 is a schematic illustrating an exemplary detailed dataset, according to embodiments disclosed herein. Data from farmers in the supply shed and growing the crop of interest gives additional field level datapoints with detailed management data. The dataset is farmer contributed. The dataset allows for the building of a model sequence to impute missing information in remote sensing data, and to build a first iteration of an emissions model.

[00147] FIG. 15 is a schematic of an exemplary computing node. Computing node 10 is only one example of a suitable computing node and is not intended to suggest any limitation as to the scope of use or functionality of embodiments described herein. Regardless, computing node 10 is capable of being implemented and/or performing any of the functionality set forth hereinabove.

[00148] In computing node 10 there is a computer system/server 12, which is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with computer system/server 12 include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed computing environments that include any of the above systems or devices, and the like.

[00149] Computer system/server 12 may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, and so on that perform particular tasks or implement particular abstract data types. Computer system/server 12 may be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.

VI. COMPUTING SYSTEM

[00150] As shown in FIG. 15, computer system/server 12 in computing node 10 is shown in the form of a general-purpose computing device. The components of computer system/server 12 may include, but are not limited to, one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including system memory 28 to processor 16.

[00151] Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, Peripheral Component Interconnect (PCI) bus, Peripheral Component Interconnect Express (PCIe), and Advanced Microcontroller Bus Architecture (AMBA).

[00152] Computer system/server 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer system/server 12, and it includes both volatile and non-volatile media, removable and nonremovable media.

[00153] System memory 28 can include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32.

Computer system/server 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 can be provided for reading from and writing to a non-removable, non-volatile magnetic media (not shown and typically called a "hard drive"). Although not shown, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM or other optical media can be provided. In such instances, each can be connected to bus 18 by one or more data media interfaces. As will be further depicted and described below, memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the disclosure.

[00154] Program/utility 40, having a set (at least one) of program modules 42, may be stored in memory 28 by way of example, and not limitation, as well as an operating system, one or more application programs, other program modules, and program data. Each of the operating system, one or more application programs, other program modules, and program data or some combination thereof, may include an implementation of a networking environment. Program modules 42 generally carry out the functions and/or methodologies of embodiments as described herein.

[00155] Computer system/server 12 may also communicate with one or more external devices 14 such as a keyboard, a pointing device, a display 24, etc.; one or more devices that enable a user to interact with computer system/server 12; and/or any devices (e.g., network card, modem, etc.) that enable computer system/server 12 to communicate with one or more other computing devices. Such communication can occur via Input/Output (I/O) interfaces 22. Still yet, computer system/server 12 can communicate with one or more networks such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet) via network adapter 20. As depicted, network adapter 20 communicates with the other components of computer system/server 12 via bus 18. It should be understood that although not shown, other hardware and/or software components could be used in conjunction with computer system/server 12. Examples include, but are not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems, etc.

[00156] FIG. 17 illustrates an example system environment, according to the principles described herein. The environment 1600 of the embodiment of FIG. 17 includes a network system 1602 and one or more remote systems 1604 connected via a network 1606. In other examples and/or embodiments, environment 1600 may include fewer, different, or additional components or systems than those described herein. Moreover, in various embodiments, the functionality attributed to one system or component in the environment 1600 may be attributable to another system or component.

[00157] The one or more remote systems 1604 may be any system configured to transmit dense, high-fidelity ground truth data or sparse low-fidelity agronomic data to the network system 1602 via the network 1606. The one or more remote systems 1604 include one or more data accessing modules 1630 and/or one or more data measurement modules 1632. Each data accessing module 1630 is configured to access agronomic data pertaining to the agronomic regions for which the network system 1602 employs an emulator system 1612 to generate an expected ecosystem attribute. A data accessing module 1630 is configured to access ground truth and/or agronomic data in environment 1600. For example, the data accessing module 1630 may be configured to access recommendations from industry experts, access weather data, etc. A data measurement module 1632 is configured to measure ground truth and/or agronomic data in the environment. For example, the data measurement module 1632 may be a sensor in the field, a satellite measuring agronomic data, etc.

[00158] The network system 1602 is any system configured to receive ground truth and/or agronomic data via a network 1606 and determine an ecosystem attribute. The network system 1602 includes an emulator system 1612, an intervention module 1614, and a learning module 1616. The network system 1602 may include additional, fewer, or different elements than those described herein.

[00159] The network system 1602 includes an emulator system 1612 configured to determine an ecosystem attribute for agronomic regions. To do so, the emulator system 1612 employs a first agronomic model 1620 and a second agronomic model 1622. The first agronomic model 1620 is configured for processing ground truth data to determine ecosystem attributes for agronomic regions based on that high-quality, dense ground truth data. The second agronomic model 1622 is configured to quantify an ecosystem attribute by emulating the results of the first agronomic model for agronomic regions for which there is not necessarily high-quality agronomic data. That is, the second agronomic model 1622 determines ecosystem attributes for any agronomic region, regardless of whether there is high-quality data (or remote sensing data) for that agronomic region. To that end, the second agronomic model 1622 is trained based on (1) the ground-truth data and determined ecosystem attributes for agronomic regions for which the network system 1602 has high-quality agronomic data, and (2) the remote sensing data for agronomic regions for which the network system 1602 has such data. Generally, the remote sensing data covers more agronomic regions than the ground truth data.
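The emulation relationship between the two models can be illustrated with a deliberately simplified sketch. Everything here is an assumption made for illustration: `first_model` is an invented placeholder for the first agronomic model 1620, the feature names and values are hypothetical, and a one-variable least-squares line stands in for the second agronomic model 1622, which in practice could be any machine-learned model.

```python
from statistics import mean

# Placeholder for the first agronomic model: needs detailed ground-truth inputs.
def first_model(n_rate, tillage_passes):
    return 300.0 + 2.5 * n_rate + 20.0 * tillage_passes

# Regions with both ground-truth data and a coarse remote-sensing feature.
training_regions = [
    {"n_rate": 80.0, "tillage_passes": 3, "rs_residue_cover": 0.10},
    {"n_rate": 100.0, "tillage_passes": 2, "rs_residue_cover": 0.30},
    {"n_rate": 120.0, "tillage_passes": 1, "rs_residue_cover": 0.55},
    {"n_rate": 90.0, "tillage_passes": 0, "rs_residue_cover": 0.80},
]

# Emulator training data: outputs of the first model become the targets.
targets = [first_model(r["n_rate"], r["tillage_passes"]) for r in training_regions]
xs = [r["rs_residue_cover"] for r in training_regions]

def fit_line(x_values, y_values):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(x_values), mean(y_values)
    sxx = sum((x - x_bar) ** 2 for x in x_values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(xs, targets)

# Second model: emulates the first using only the remote-sensing feature.
def second_model(rs_residue_cover):
    return intercept + slope * rs_residue_cover

# Applied to a region that has remote sensing data but no ground truth.
estimate = second_model(0.45)
print(round(estimate, 1))
```

The design point is that the second model never sees the detailed management inputs at prediction time, so it can be applied wherever remote sensing coverage exists.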

[00160] The network system 1602 includes an intervention module 1614 configured to generate one or more interventions for agronomic regions based on the determined ecosystem attributes. The interventions may change the ecosystem attribute in a positive manner or a negative manner. For instance, the intervention may recommend one or more farming practices to reduce the greenhouse gas emissions for a field based on the emulated ecosystem attribute for that field.

[00161] The network system may include a learning module 1616 configured to train and/or apply a machine-learned model configured to determine an ecosystem attribute, baseline, impact, etc. In various embodiments, the machine-learned model can be the first agronomic model 1620 and/or the second agronomic model 1622, but could include other machine-learned models. The machine-learned model may be trained on any data received at the network system. For instance, the machine-learned model may be trained on ground-truth agronomic data and remote sensing agronomic data received from the one or more remote systems 1604.

[00162] In various embodiments, a learning system (e.g., learning module 1616) is provided. In some embodiments, a feature vector is provided to a learning system. Based on the input features, the learning system generates one or more outputs. In some embodiments, the output of the learning system is a feature vector. In some embodiments, the learning system comprises an SVM. In other embodiments, the learning system comprises an artificial neural network. In some embodiments, the learning system is pre-trained using training data. In some embodiments, training data is retrospective data. In some embodiments, the retrospective data is stored in a data store. In some embodiments, the learning system may be additionally trained through manual curation of previously generated outputs.

[00163] In some embodiments, the learning system is a trained classifier. In some embodiments, the trained classifier is a random decision forest. However, it will be appreciated that a variety of other classifiers are suitable for use according to the present disclosure, including linear classifiers, support vector machines (SVM), or neural networks such as recurrent neural networks (RNN).

[00164] Suitable artificial neural networks include but are not limited to a feedforward neural network, a radial basis function network, a self-organizing map, learning vector quantization, a recurrent neural network, a Hopfield network, a Boltzmann machine, an echo state network, long short term memory, a bi-directional recurrent neural network, a hierarchical recurrent neural network, a stochastic neural network, a modular neural network, an associative neural network, a deep neural network, a deep belief network, a convolutional neural network, a convolutional deep belief network, a large memory storage and retrieval neural network, a deep Boltzmann machine, a deep stacking network, a tensor deep stacking network, a spike and slab restricted Boltzmann machine, a compound hierarchical-deep model, a deep coding network, a multilayer kernel machine, or a deep Q-network.

[00165] The training of the machine-learned models described herein (such as neural networks and other models referenced herein) includes the performance of one or more non-mathematical operations or implementation of non-mathematical functions at least in part by a machine or computing system, examples of which include but are not limited to data loading operations, data storage operations, data toggling or modification operations, non-transitory computer-readable storage medium modification operations, metadata removal or data cleansing operations, data compression operations, image modification operations, noise application operations, noise removal operations, and the like. Accordingly, the training of the machine-learned models described herein may be based on or may involve mathematical concepts, but is not simply limited to the performance of a mathematical calculation, a mathematical operation, or an act of calculating a variable or number using mathematical methods.

[00166] Likewise, it should be noted that the training of the models described herein cannot be practically performed in the human mind alone. The models are innately complex, including vast numbers of weights and parameters associated through one or more complex functions. Training and/or deployment of such models involves so great a number of operations that it cannot feasibly be performed by the human mind alone, nor with the assistance of pen and paper. In such embodiments, the operations may number in the hundreds, thousands, tens of thousands, hundreds of thousands, millions, billions, or trillions. Moreover, the training data may include hundreds, thousands, tens of thousands, hundreds of thousands, or millions of measurements. Accordingly, such models are necessarily rooted in computer technology for their implementation and use.

[00167] The present disclosure may be embodied as a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.

[00168] The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

[00169] Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

[00170] Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user’s computer, partly on the user’s computer, as a stand-alone software package, partly on the user’s computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.

[00171] Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

[00172] These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

[00173] The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

[00174] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

VII. ADDITIONAL INTERVENTION EXAMPLES AND CONSIDERATIONS

A specific neighbor-pair example

[00175] Field A (Field-year of interest) and Field B (Neighbor pair) operate within the region using the same management practices, but Field A is planted at an Optimal planting date (in September) while Field B is planted sub-optimally in late October.

[00176] In Step 1, we run Field A's management practices (plus location, weather, and soil data) through our yield model ensemble to predict baseline yield, estimate the nitrogen balance, and then run this data through the emissions model ensemble to predict emissions. In Step 2, we change the planting date to Field B's. We then estimate the updated yield with our yield ensemble, estimate the new nitrogen balance, and plug the updated yield and planting date data into our emissions model ensemble to get the new emissions.

[00177] In Step 3: we obtain the difference between Field A's original emissions predictions and the emissions prediction with Field B's planting date (and accompanying new yield + nitrogen balance).

[00178] We aggregate and summarize this emissions and yield delta with all the other deltas from the other Neighbor pairs and for all of the other practices of interest. Doing this over several neighbor pairs gives a realistic distribution of deltas that can be attributed to adopting an Optimal Planting Date in each region, in both the presence and absence of other practices. Results are shown in FIG. 8 and FIG. 9.

Optimizing yield as a function of five actionable farm management practices

[00179] The model we built estimates that yield can be expressed as:

$$Y = A \cdot P_D + B \cdot P_R + C \cdot N + D \cdot S_{PH} + E \cdot P_A$$

where $Y$ is the yield as expressed by a linear combination of at least five farm management practices, $P_D$, $P_R$, $N$, $S_{PH}$, and $P_A$. $P_D$ is the planting delay as measured from the earliest USDA planting date in the county where the agronomy practices occur, $P_R$ is the planting rate, $N$ is the nitrogen level in the soil at a specific time during a cultivation cycle, $S_{PH}$ is the pH of the soil, and $P_A$ is the delay in pesticide application after a plant disease outbreak was reported in the county. $A$, $B$, $C$, $D$, and $E$ are numerical values that may scale the various management practices based on quantitative or qualitative observations in the field. As an example, $A$ may be 5, $B$ may be 3, $C$ may be 6, $D$ may be 2, and $E$ may be 1. The yield formula may include other management practices and the scaling factors may be different.

[00180] In optimization, we minimize a function for the purpose of mathematical and notational convenience. So -yield (or a dollar amount) is the objective function or cost function. The minimal value of the objective function, i.e., of -yield (or a dollar amount), is called the optimal cost. The five farm management practices are called decision variables. The values of the five decision variables that lead to the optimal cost constitute the optimal solution.

[00181] Suppose that we are not interested in planting dates before the earliest USDA-recommended planting date for a given county; that we are interested only in a planting rate between a recommended minimum and an economically feasible maximum planting rate for a specific farmer; that nitrogen is applied only if the farmer can afford it; that pH is corrected with lime application only if the farmer can afford the cost of nitrogen, pesticides, and lime together; or that pesticide is sprayed only if the farmer can hire labor to spray at the right time in a responsive manner. Each of these conditions that needs to be met is called a constraint. If the values of the decision variables do not meet these constraints, they cannot be a solution. A set of values for the decision variables that satisfies the constraints is called a feasible solution. If the objective function is a linear combination of decision variables and the constraints can also be expressed as linear functions, the optimization procedure is called linear optimization.
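The terminology above (objective function, decision variables, constraints, feasible solution, optimal cost) can be made concrete with a small sketch. It uses the example scaling factors from the text (5, 3, 6, 2, 1), while the lower and upper bounds are hypothetical stand-ins for the constraints, with arbitrary units. Only independent bounds on each variable are assumed, which is the simplest linear optimization case; joint constraints such as a shared budget would require a general linear programming solver.

```python
# Example scaling factors A..E from the text, keyed by decision variable.
coefficients = {"P_D": 5.0, "P_R": 3.0, "N": 6.0, "S_PH": 2.0, "P_A": 1.0}

# Hypothetical (lower, upper) bounds standing in for the constraints.
bounds = {
    "P_D": (0.0, 30.0),   # no planting before the earliest recommended date
    "P_R": (1.0, 2.0),    # recommended minimum to feasible maximum rate
    "N": (0.0, 1.5),      # limited by what the farmer can afford
    "S_PH": (5.5, 7.0),   # limited by affordable lime application
    "P_A": (0.0, 14.0),   # limited by labor availability
}

def yield_value(x):
    """Linear yield model: sum of scaling factor times practice value."""
    return sum(coefficients[name] * x[name] for name in coefficients)

def cost(x):
    """Objective (cost) function to be minimized: negative yield."""
    return -yield_value(x)

def is_feasible(x):
    """A candidate is feasible if every variable lies within its bounds."""
    return all(bounds[n][0] <= x[n] <= bounds[n][1] for n in bounds)

def solve_box_lp():
    """With a linear objective and independent bounds, each variable sits at
    whichever bound lowers the cost."""
    solution = {}
    for name, coefficient in coefficients.items():
        lower, upper = bounds[name]
        # the cost coefficient is -coefficient; a negative one favors the upper bound
        solution[name] = upper if -coefficient < 0 else lower
    return solution

optimal_solution = solve_box_lp()
optimal_cost = cost(optimal_solution)
print(optimal_solution, optimal_cost)
```

Because the example scaling factors are all positive, this sketch pushes every decision variable to its upper bound; fitted factors with mixed signs would lead to a different optimal solution.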

For isolating the effects of an observed intervention on a field

[00182] In this case, we determine the delta between the emissions estimates from practices observed on the field and the emissions estimates if the field were not in the program (i.e., belonged to the population of the fields in its supply shed and thus inherited the practice patterns in the supply shed).
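One way to sketch this delta is to treat the counterfactual as the expected emissions under the supply shed's practice pattern shares. That interpretation, the pattern names, the shares, and the emissions values are all assumptions made for illustration; the disclosure does not specify how the inherited practice patterns are combined.

```python
# Hypothetical emissions estimates per practice pattern (arbitrary units),
# e.g. as produced by an emissions model.
emissions_by_pattern = {
    "conventional": 700.0,
    "reduced_till": 620.0,
    "no_till+n_timing": 540.0,
}

# Hypothetical share of each practice pattern among fields in the supply shed.
supply_shed_shares = {
    "conventional": 0.6,
    "reduced_till": 0.3,
    "no_till+n_timing": 0.1,
}

def counterfactual_emissions(emissions, shares):
    """Expected emissions if the field followed the supply-shed practice mix."""
    return sum(shares[pattern] * emissions[pattern] for pattern in shares)

# Emissions estimate for the practices actually observed on the program field.
observed = emissions_by_pattern["no_till+n_timing"]
counterfactual = counterfactual_emissions(emissions_by_pattern, supply_shed_shares)

# Negative values mean the program field emits less than its counterfactual.
delta = observed - counterfactual
print(delta)
```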

Paired neighbor approach for evaluating effect size (emissions change) of a future intervention(s)

[00183] In this case, we determine the delta between the emissions estimates from practices observed on the field and the emissions estimates if the field switched one or more of its practices to resemble those of a neighbor with similar characteristics.

Emissions cohort approach for identifying a low-emissions inventory within a supply shed

[00184] In this case, we determine the emissions inventory of each field in the supply-shed using a model estimated either with high-density field data and remote sensing (for wheat) or with literature and extension farmer profiling (for rice). Given a delivery location, we can estimate the aggregate emissions of a cohort of fields supplying the location and contrast it with emissions from other similar delivery cohorts across the supply shed. The cohorts with the lowest emissions become targets for sourcing.
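The cohort comparison can be sketched as a simple aggregation. The field records, delivery location names, and values are hypothetical, and normalizing aggregate emissions by delivered bushels is an assumption introduced here so that cohorts of different sizes can be compared.

```python
from collections import defaultdict

# Hypothetical field-level emissions inventory (tonnes CO2e) and deliveries.
fields = [
    {"field": "F1", "delivery_location": "Elevator A", "emissions": 12.0, "bushels": 4000.0},
    {"field": "F2", "delivery_location": "Elevator A", "emissions": 9.0, "bushels": 3000.0},
    {"field": "F3", "delivery_location": "Elevator B", "emissions": 10.0, "bushels": 5000.0},
    {"field": "F4", "delivery_location": "Elevator B", "emissions": 8.0, "bushels": 4000.0},
    {"field": "F5", "delivery_location": "Elevator C", "emissions": 15.0, "bushels": 4500.0},
]

# Aggregate emissions and deliveries for each cohort of fields.
totals = defaultdict(lambda: {"emissions": 0.0, "bushels": 0.0})
for record in fields:
    cohort = totals[record["delivery_location"]]
    cohort["emissions"] += record["emissions"]
    cohort["bushels"] += record["bushels"]

# Emissions per bushel for each cohort (assumed comparison basis).
intensity = {
    location: cohort["emissions"] / cohort["bushels"]
    for location, cohort in totals.items()
}

# Cohorts ordered from lowest to highest emissions intensity.
ranked = sorted(intensity.items(), key=lambda item: item[1])
sourcing_target = ranked[0][0]
print(ranked)
```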

[00185] Wheat flavor: ns from carbon farmers. The ensembles are used to predict emissions using boundaries and remote sensing generated practice inferences.

[00186] Inferences: The ensembles are used to predict emissions on an exhaustive (but feasible) set of practice combinations constrained by prevalences reported in annual regional extension surveys.

Quantifying field-level inventory uncertainty in emissions estimates

[00187] In this case, we first construct a field-level dataset that consists of all the practices that were observed with 100% certainty on a given field. For the set of practices for which there is insufficient data, we plug in all possible combinations of practices that can co-occur in the supply shed. An emissions ensemble trained on literature is used to predict the emissions on the field level dataset. The final result is a distribution of emissions estimates for each field - from which uncertainty estimates can be derived.
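
By way of non-limiting illustration, the enumeration of unobserved practice combinations may be sketched as below. The practice names, the `toy_emulator` coefficients, and the optional `can_co_occur` filter are hypothetical; a trained emissions ensemble would replace the toy function.

```python
from itertools import product
from statistics import mean, pstdev


def toy_emulator(practices):
    """Hypothetical stand-in for an emissions ensemble trained on literature."""
    estimate = 1.0
    if practices["tillage"] == "conventional":
        estimate += 0.4
    if practices["cover_crop"]:
        estimate -= 0.2
    return estimate


def emissions_distribution(known, unknown_options, emulator, can_co_occur=None):
    """Predict emissions for every permitted completion of the unknown practices."""
    names = list(unknown_options)
    estimates = []
    for combo in product(*(unknown_options[n] for n in names)):
        practices = {**known, **dict(zip(names, combo))}
        if can_co_occur is not None and not can_co_occur(practices):
            continue  # skip combinations not seen together in the supply shed
        estimates.append(emulator(practices))
    return estimates


known = {"crop": "wheat"}  # practices observed with certainty
unknown_options = {
    "tillage": ["conventional", "no_till"],
    "cover_crop": [True, False],
}
estimates = emissions_distribution(known, unknown_options, toy_emulator)
summary = {
    "mean": mean(estimates),
    "spread": pstdev(estimates),
    "low": min(estimates),
    "high": max(estimates),
}
```

The spread and range of `estimates` give one simple way to express field-level uncertainty; other summaries, such as quantiles, could be derived from the same list.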

Claims

What is claimed is:

1. A method for determining one or more ecosystem attributes for an agronomic region, the method comprising: accessing ground-truth agricultural data representing a first set of agronomic regions; applying a first agronomic model to the ground-truth agricultural data to determine an ecosystem attribute for the first set of agronomic regions; accessing remote sensing data representing a second set of agronomic regions; training a second agronomic model using the ground-truth agricultural data, the determined ecosystem attribute, the remote sensing data, the first set of agronomic regions, and the second set of agronomic regions, the second agronomic model trained to predict one or more ecosystem attributes for agronomic regions based on (1) the ground-truth data and determined ecosystem attribute for the first set of agronomic regions, and (2) the remote sensing data for the second set of agronomic regions; applying the second agronomic model to the agronomic region to determine one or more ecosystem attributes for the agronomic region.

2. The method of claim 1, further comprising: applying a third agronomic model to the determined ecosystem attribute to generate an intervention for the agronomic region, the intervention including a different agronomic practice for the agronomic region than a standard agronomic practice for the agronomic region.

3. The method of claim 2, wherein the different agronomic practice, if implemented in the agronomic region, is predicted to modify the ecosystem attribute for the agronomic region.

4. The method of claim 2, wherein the different agronomic practice, if implemented in the agronomic region, is predicted to generate a beneficial change to the ecosystem attribute.

5. The method of claim 2, further comprising generating a recommendation for a farmer of the agronomic region based on the determined intervention.

6. The method of claim 1, wherein training the second agronomic model to predict one or more ecosystem attributes comprises: training the second agronomic model to infer ecosystem attributes for any agronomic region based on underlying relationships between (1) the ground-truth data and determined ecosystem attribute for the first set of agronomic regions, and (2) the remote sensing data for the second set of agronomic regions.

7. The method of claim 1, wherein accessing ground-truth data further comprises: accessing a plurality of agricultural practices associated with the first set of agronomic regions; determining, from the plurality of agricultural practices, one or more attributes of the plurality of agricultural practices; and generating ground-truth data comprising the one or more attributes of the plurality of agricultural practices.

8. The method of claim 1, wherein accessing remote sensing data further comprises: accessing a plurality of remotely sensed images of the second set of agronomic regions; determining, from the plurality of remotely sensed images, one or more attributes of agronomic attributes of the second set of agronomic regions; and generating remote sensing data comprising the one or more attributes of the agronomic attributes.

9. The method of claim 1, wherein accessing remote sensing data further comprises: accessing a first plurality of remotely sensed images of the second set of agronomic regions and a second plurality of remotely sensed images of the second set of agronomic regions; determining, from the first and second plurality of remotely sensed images, one or more attributes of unobserved agricultural practices occurring in the second set of agronomic regions; and generating remote sensing data comprising the one or more attributes of the unobserved agricultural practices.

10. The method of claim 1, wherein each of the first set of agronomic regions and the second set of agronomic regions comprise one or more fields or farms.

11. The method of claim 1, wherein the ecosystem attribute is a quantification of one or more of water use, nitrogen use, soil carbon sequestration, or greenhouse gas emissions.

12. The method of claim 1, wherein the second agronomic model is configured to emulate one or more results of the first agronomic model.

13. The method of claim 1, wherein the second set of agronomic regions comprises the first set of agronomic regions.

14. The method of claim 1, wherein the first set of agronomic regions is different than the second set of agronomic regions.

15. The method of claim 1, wherein the agronomic region is not included in the first set of agronomic regions.

16. The method of claim 1, wherein the agronomic region is not included in the second set of agronomic regions.

17. The method of claim 1, wherein the ground-truth data does not comprise ground-truth data corresponding to the agronomic region.

18. The method of claim 1, wherein the accessed remote sensing data does not comprise remote sensing data corresponding to the agronomic region.

19. A non-transitory computer-readable storage medium comprising computer program instructions for determining one or more ecosystem attributes for an agronomic region, the computer program instructions, when executed by one or more processors, causing the one or more processors to: access ground-truth agricultural data representing a first set of agronomic regions; apply a first agronomic model to the ground-truth agricultural data to determine an ecosystem attribute for the first set of agronomic regions; access remote sensing data representing a second set of agronomic regions; train a second agronomic model using the ground-truth agricultural data, the determined ecosystem attribute, the remote sensing data, the first set of agronomic regions, and the second set of agronomic regions, the second agronomic model trained to predict one or more ecosystem attributes for agronomic regions based on (1) the ground-truth data and determined ecosystem attribute for the first set of agronomic regions, and (2) the remote sensing data for the second set of agronomic regions; apply the second agronomic model to the agronomic region to determine one or more ecosystem attributes for the agronomic region.

20. A system comprising: a first remote system configured for generating ground-truth agricultural data for agronomic regions; a second remote system configured for generating remote sensing data for agronomic data; and a network system comprising: one or more processors; and a non-transitory computer readable storage medium comprising computer program instructions for determining one or more ecosystem attributes for an agronomic region, the computer program instructions, when executed by the one or more processors, causing the one or more processors to: access, from the first remote system, ground-truth agricultural data representing a first set of agronomic regions; apply a first agronomic model to the ground-truth agricultural data to determine an ecosystem attribute for the first set of agronomic regions; access, from the second remote system, remote sensing data representing a second set of agronomic regions; train a second agronomic model using the ground-truth agricultural data, the determined ecosystem attribute, the remote sensing data, the first set of agronomic regions, and the second set of agronomic regions, the second agronomic model trained to predict one or more ecosystem attributes for agronomic regions based on (1) the ground-truth data and determined ecosystem attribute for the first set of agronomic regions, and (2) the remote sensing data for the second set of agronomic regions; apply the second agronomic model to the agronomic region to determine one or more ecosystem attributes for the agronomic region.