Nothing Special   »   [go: up one dir, main page]

Global Crop Calendars of Maize and Wheat in The Framework of The Worldcereal Project

Download as pdf or txt
Download as pdf or txt
You are on page 1of 30

GIScience & Remote Sensing

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/tgrs20

Global crop calendars of maize and wheat in the


framework of the WorldCereal project

Belén Franch, Juanma Cintas, Inbal Becker-Reshef, María José Sanchez-


Torres, Javier Roger, Sergii Skakun, José Antonio Sobrino, Kristof Van Tricht,
Jeroen Degerickx, Sven Gilliams, Benjamin Koetz, Zoltan Szantoi & Alyssa
Whitcraft

To cite this article: Belén Franch, Juanma Cintas, Inbal Becker-Reshef, María José Sanchez-
Torres, Javier Roger, Sergii Skakun, José Antonio Sobrino, Kristof Van Tricht, Jeroen Degerickx,
Sven Gilliams, Benjamin Koetz, Zoltan Szantoi & Alyssa Whitcraft (2022) Global crop calendars of
maize and wheat in the framework of the WorldCereal project, GIScience & Remote Sensing, 59:1,
885-913, DOI: 10.1080/15481603.2022.2079273

To link to this article: https://doi.org/10.1080/15481603.2022.2079273

© 2022 The Author(s). Published by Informa View supplementary material


UK Limited, trading as Taylor & Francis
Group.

Published online: 06 Jun 2022. Submit your article to this journal

Article views: 1031 View related articles

View Crossmark data

Full Terms & Conditions of access and use can be found at


https://www.tandfonline.com/action/journalInformation?journalCode=tgrs20
GISCIENCE & REMOTE SENSING
2022, VOL. 59, NO. 1, 885–913
https://doi.org/10.1080/15481603.2022.2079273

Global crop calendars of maize and wheat in the framework of the WorldCereal
project
Belén Franch a,b†, Juanma Cintasa†, Inbal Becker-Reshefb, María José Sanchez-Torresa, Javier Rogerc, Sergii Skakunb,d,
José Antonio Sobrinoa, Kristof Van Trichte, Jeroen Degerickxe, Sven Gilliamse, Benjamin Koetzf, Zoltan Szantoif,g
and Alyssa Whitcraftb
a
Global Change Unit, Image Processing Laboratory, Universitat de Valencia, Paterna (Valencia), Spain; bDepartment of Geographical Sciences,
University of Maryland, College Park, MD, USA; cLARS group, Universitat Politècnica de València, Valencia, Spain; dNASA Goddard Space Flight
Center, Greenbelt, MD, USA; eVITO Remote Sensing, Mol, Belgium; fScience, Applications & Climate Department, European Space Agency,
Paris, France; gDepartment of Geography & Environmental Studies, Stellenbosch University, South Africa

ABSTRACT ARTICLE HISTORY


Crop calendars provide valuable information on the timing of important stages of crop development Received 20 December 2021
such as the planting or Start of Season (SOS) and harvesting dates or End of Season (EOS). This Accepted 9 May 2022
information is critical for many crop monitoring applications such as crop-type mapping, crop condition KEYWORDS
monitoring, and crop yield estimation and forecasting. Spatially detailed information on the crop Agriculture monitoring; Start
calendars provides an important asset in this respect, as it allows the algorithms to account for specific of Season; End of Season;
local circumstances while also maximizing their robustness and global applicability. Existing global crop LSP; Sentinel-2; Landsat 8
calendar products, as produced by the Group on Earth Observations’ Global Agricultural Monitoring
(GEOGLAM) Crop Monitor, the United States Department of Agriculture Foreign Agricultural Service
(USDA-FAS), the Food and Agriculture Organization (FAO), and the European Commission Joint
Research Center’s Anomaly hot Spots of Agricultural Production (ASAP), generally provide this informa­
tion only at national or subnational level. In this work, we present gridded SOS and EOS maps for wheat
and maize that represent the crop calendars’ spatial variability at 0.5° spatial resolution. These maps are
generated in the framework of WorldCereal, which is a European Space Agency (ESA) funded project
whose cropland and crop-type wheat and maize algorithms at global scale and at 10 m spatial
resolution require this information. The proposed maps are built leveraging the above noted global
products (Crop Monitor, USDA-FAS, FAO, ASAP) whose datasets are combined into a baseline map and
sampled to train a Random Forest algorithm based on climatic and geographic data. Their evaluation
against test data from the baseline maps set aside for validation purposes show a good performance
with SOS (EOS) R2 of 0.87 (0.92) and a root mean square error (RMSE) of 27 (26) days for wheat, showing
the lowest errors (RMSE <15 days) in North America, Central Europe, South Africa, and Australia, all
critical areas for global wheat production and trade. Meanwhile, the largest errors (RMSE between 40
and 60 days) occurred in regions of South America close to the Amazon Forest and in Africa close to the
Congo Basin. In the case of maize, the SOS (EOS) evaluation shows an R2 of 0.88 (0.79) and an RMSE of
24 (28) days for maize, with the best performing regions (RMSE < 15 days) located in the Northern
Hemisphere, South Africa, and Australia, important areas for global maize production and trade.
Meanwhile, the worst performing regions were in Brazil, Saudi Arabia and India. Additionally, the
crop calendars were evaluated using a simple Land Surface Phenology (LSP) model based on Sentinel-2
and Landsat 8 Earth Observation data from Sentinel-2 and Landsat 8 over known wheat and maize
fields. The results show a SOS (EOS) R2 of 0.75 (0.88) and an RMSE of 25 (18) days for wheat and SOS
(EOS) R2 of 0.80 (0.88) and an RMSE of 35 (24) days for maize. Therefore, the presented calendars present
an advancement over the existing crop calendar products in terms of capturing spatial coverage and
variability and reporting their accuracy.

1 Introduction
Nowadays, agriculture has to face challenges such as temperatures that destabilizes crop phenology (FAO
the adaptation to a climate with the increasing fre­ 2008, 2015; Mbow et al. 2019); a continuous growing
quency of extreme events (e.g. floods, droughts, human population with higher nutritional demands
frosts, heat waves) and an increase of mean global (Godfray et al. 2010; FAO 2015); and a situation with

CONTACT Belén Franch belen.franch@uv.es



Both authors contributed equally to this work and should be considered co-first authors.
Supplemental data for this article can be accessed online at https://doi.org/10.1080/15481603.2022.2079273
© 2022 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use,
distribution, and reproduction in any medium, provided the original work is properly cited.
886 B. FRANCH ET AL.

scarcity of fuel/energy-vectors that increases the pro­ on the Growing Degree Days (GDD) temperature accu­
duction costs (FAO 2015; Godfray et al. 2010). Under mulation, the WorldCereal crop-type mapping algo­
these premises, knowledge of the main crops’ rithms at global scale also exploits GDD. This
requirements (e.g. water, nutrients, C/N, and pH), transformation of the time dimension to a GDD dimen­
extent, location, and condition is crucial to enhance sion requires information on the SOS to determine
market transparency. These premises also follow the when to start the accumulation of temperature,
mandate of the G20 Group on Earth Observations which also requires information on the SOS. Further,
Global Agricultural Monitoring Initiative (GEOGLAM) the Essential Agricultural Variables for GEOGLAM effort
(Becker-Reshef et al. 2019). Earth Observation (EO) is (Whitcraft et al. 2019; www.geoglam.org), consider the
widely used in agricultural monitoring applications crop calendars as essential inputs for downstream pro­
(Atzberger 2013; Whitcraft, Becker-Reshef, and cessing (i.e. combination with other EO datasets for
Justice 2015a; Skakun et al. 2017; Becker-Reshef further analysis) and also for non-technical audiences
et al. 2018, 2019; Benami et al. 2021) such as cropland as they seek to understand which crops are in season,
and crop-type classification (Mingwei et al. 2008; where, and how current seasons are progressing rela­
Teluguntla et al. 2018; Orynbaikyzy et al. 2020; Wang tive to “typical” years. They are further required not
et al. 2020), hydrological requirement estimation simply as backward-looking products (i.e. after the
(Teluguntla et al. 2018; Javed et al. 2020), phenology season is completed), but as current season products
description (Whitcraft, Becker-Reshef, and Justice as critical inputs to production outlooks, agricultural
2015b; Garcia-Perez et al. 2021; Descals et al. 2021; land use and management practices detection, and
Bórnez et al. 2020; Helman 2018) and yield estimation disaster risk assessment. Thus, it is important to gen­
and prediction (Becker-Reshef et al. 2010; Kogan et al. erate a crop calendar with three main characteristics:
2013; Franch et al. 2019, 2021). global extent, sufficient resolution to account for spa­
Crop calendars provide information on the timing of tial variability in cropping practices and weather con­
phenological stages of the crops (Caparros-Santiago, ditions, and a defined season when it was created in
Rodriguez-Galiano, and Dash 2021). Among the differ­ order to determine their relevance to current cropping
ent phenological stages, the crop’s planting and har­ practices (Whitcraft, Becker-Reshef, and Justice 2015b).
vesting dates, hereafter referred to as Start of Season In the literature, different studies defined crop calen­
(SOS) and End of Season (EOS), are critical inputs to dars for a wide range of crops, applications, and scales.
several agricultural applications. For instance, the Depending on how they are generated, they can be
GEOGLAM Crop Monitor (CM) for the Agricultural sorted into three main groups: based on local informa­
Market Information System (AMIS) reports infer crop tion, crop analysts or already existing crop calendars;
conditions based on crop calendars (Fritz et al. 2019; based on crop models; and based on EO. Within the
Whitcraft et al. 2019). Franch et al. (2015; 2021) devel­ first group the main global crop calendar products are
oped a crop yield forecasting model based on the produced by the Food Agriculture Organization (FAO;
Growing Degree Days (GDD) accumulation, which in https://cropcalendar.apps.fao.org/) and the Foreign
turn requires SOS information. The WorldCereal project Agriculture Service of the United States Department
(www.esa-worldcereal.org), funded by the European of Agriculture (USDA-FAS; https://ipad.fas.usda.gov/
Space Agency (ESA), aims to develop an open-source ogamaps/cropcalendar.aspx). They are all based on
system that will allow users to generate seasonally spatial interpolation of census-based data, local crop
updated global-level map products of cropland extent, analysts’ information, data from international agricul­
irrigation, and crop type (limited to wheat and maize), tural agencies or field visits to the targeted area.
at 10 meters resolution. These products will be gener­ However, they are mostly provided at national-level
ated/updated within 1 month after the EOS of the or sub-national ecological regions and hence do not
major growing season in a certain area. To do so, capture possible variations of SOS and EOS within their
WorldCereal will rely on Agro-Ecological Zones (AEZ) correspondent units. Further, their sources are typically
maps that group areas with similar SOS and EOS where poorly documented, they provide a range of SOS and
the algorithm will be triggered at the same time. EOS dates in time spans of 10–15 days, or much longer,
Additionally, similarly to Franch et al. (2015; 2021) and there is rarely information on their accuracy
that developed a crop yield forecasting model based (Stehfest et al. 2007; USDA-NASS, 2010; Whitcraft,
GISCIENCE & REMOTE SENSING 887

Becker-Reshef, and Justice 2015b). In the literature, EO approaches are based on associating the pheno­
there are some authors that explore local crop calen­ logical stages of a crop with the temporal evolution of
dars and the effects that climate change has over them, vegetation indexes, such as the well-established
although they have very limited spatial representative­ Normalized Difference Vegetation Index (NDVI). This
ness (e.g. Banerjee and Dawn 2020; Yang et al. 2021). methodology is known as Land Surface Phenology
Recently, Jägermeyr et al. (2021) created global crop (LSP). From a LSP analysis, metrics such as SOS (and
calendars on planting and maturity dates of winter and EOS) can be extracted by detecting the first (or last)
spring wheat, maize, soybean and rice. These calendars signals of vegetation cover generally based on the
were generated integrating different crop calendar temporal trajectory of a given vegetation index.
products such as the global Joint Research Center’s However, given that they rely on detecting the change
Anomaly Hot Spots of Agricultural Production (ASAP; in vegetation cover, EO-based crop calendars typically
Dimou, Meroni, and Rembold 2018), Sacks et al. (2010) show a delay in SOS and an advance in EOS relative to
or MIRCA2000 (Portmann, Siebert, and D.ll 2010) actual planting and harvesting dates, respectively.
among others. However, this study did not include Given the differences that these products may provide,
any evaluation on the accuracy of the presented crop hereafter we will refer to them as LSP SOS and LSP EOS
calendars. maps. Bornez et al. (2020) evaluated different
Crop calendars built on crop models are based approaches and vegetation indices to derive LSP
on building relationships between climate variables metrics. They highlighted that LSP algorithms can be
and the crop phenological stages. For example, inferred based on threshold and percentiles, moving
Stehfest et al. (2007) commented that crop calen­ averages, first derivatives and fitted models (e.g.
dars of USDA-FAS and FAO are insufficient for yield Jönsson and Eklundh 2004; Patel and Oza 2014;
modeling. Therefore, they used the DayCent eco­ Whitcraft, Becker-Reshef, and Justice 2015b; Ren,
system model to generate custom crop calendars of Campbell, and Shao 2017; de Castro et al. 2018;
wheat, maize, rice and soybean. They modeled the Becker et al. 2020, 2021). Within the EO-based
planting date which lead to the greatest yields, approaches, the ASAP product (Dimou, Meroni, and
assuming that farmers would optimize planting Rembold 2018) used Sentinel-2 data to derive LSP
and harvesting windows for yield. In this work, the SOS and LSP EOS with the SPIRITS software, and that
planting date maps were evaluated aggregating then were cross-checked with existing crop calendars
them at national level with FAO and USDA-FAS (i.e. FAO and USDA) to assign them to a crop.
crop calendars information. Their results showed Lastly, the GEOGLAM CM crop calendars for winter
a coefficient of determination (R2) of 0.67 for winter wheat, spring wheat, maize, rice, and soybean, serve as
wheat and 0.66 for maize planting dates. Sawano inputs to their monthly crop condition reports (Becker-
et al. (2008) computed rice crop calendars in the Reshef et al. 2019). These are generated from a mix of
Northeast of Thailand based on field surveys and air three broad types of calendars, and regularly updated
temperature from weather stations to infer Growing as new and improved calendars are released and
Degree Days whose accumulation was associated assimilated into the GEOGLAM products.
with the different stages of rice. Kyu Lee and An The aforementioned limitations of the existing
Dang (2020) used the FAO’s AquaCrop model (Hsiao crop calendar products with respect to their age in
et al. 2009) to assess the effect of climate change in a climate change scenario that is modifying the tim­
cassava crops yields in the Binh Thuan Province ing of crop phenological stages all over the world
(Vietnam). They found that the planting dates that (Jeong et al. 2011; Piao et al. 2015); representation
would produce the highest yields were advanced of fine-level variability; geographic completeness; and
between 14 and 21 days for spring crops, and validation status, therefore limit their applicability,
between 7 and 14 days for summer crops compared particularly to in-season mapping activities such as
to old crop calendars. Similarly, Zambrano Nájera those that characterize WorldCereal. Additionally,
and Ortega (2021) used the water requirements of the lack of ground-truth data at fine resolution has
tobacco paired with precipitation information to been the principal bottleneck to developing fine-
adapt its planting date in the region of Ovejas resolution crop calendars. Therefore, the main objec­
(Colombia) to minimize irrigation needs. tive of this study is to build pixel-based, spatially
888 B. FRANCH ET AL.

explicit SOS and EOS maps of wheat and maize for the 3 Materials and methods
entire globe, while still being limited by a lack of
Figure 1 shows a flowchart that summarizes the main
sufficient ground data on planting and harvest.
steps followed in this study.
Particularly, the main scientific questions we want to
address, are the following:
Q1: How can we leverage the existing global crop 3.1 Existing Crop Calendar Products
calendar products to derive wheat and maize calen­
dars with finer spatial resolution? Table 1 summarizes the official crop calendar pro­
Q2: Can we use moderate resolution EO data to ducts considered in this work to create the baseline
evaluate the SOS and EOS maps? crop calendars More details on each of these products
Thus, we aim to take advantage of existing global are provided in the following sections.
crop calendar products to derive maximum spatial
coverage, and then utilize EO data and climate infor­ 3.1.1 Crop Monitor
mation to improve them. Lastly, we evaluated the GEOGLAM has been publishing its monthly CM reports
resulting SOS and EOS maps both considering data since 2013, in response to the 2011 G20 Agricultural
from the baseline map set aside for validation pur­ Ministers Action Plan on Price Volatility and Markets
poses and using Sentinel-2 and Landsat 8 data over (Parihar et al. 2012). These CM reports provide timely,
several locations where the crop type is known. transparent, and actionable information on crop

Figure 1. Flowchart of the main steps followed in this study. The shapes represent: source data (rounded squares), main processing
steps (squares), or outputs (circle). The different colors link the path followed by each data source through the flowchart.

Table 1. Source and main characteristics of the crop calendar products considered in this study.
Crop
phenological Spatial Temporal
Crop calendar Data source stagesa resolution resolution Reference
GEOGLAM Crop Partners organization analysts, Remote Sensing, 5 Administrative 15 days Becker-Reshef et al.
Monitor for AMIS USDA-FAS, USDA-NASS, and FAO Unit (2019)
ASAP Remote Sensing, USDA-FAS and FAO 4 Administrative 10 days Dimou, Meroni, and
Unit Rembold (2018)
USDA-FAS Crop analysts (not documented) 3 National & 15 days USDA (2006)
subnational
FAO Crop analysts 3 Agro- 15 days Sacks et al. (2010)
Ecological
Zone
a
Number of crops phenological stages recovered by the crop calendar.
GISCIENCE & REMOTE SENSING 889

conditions for the four main commodity crops (wheat, that most of the information provided by the former is
maize, rice, soybean) in the major producer/exporting already captured by CM, we used the latter as input to
countries. It brings together the international agricul­ fill in CM gaps in some African countries. This particu­
ture monitoring community with over 40 contributing lar calendar is based on crop analysts information and
organizations that build a multi-source consensus provides data on the planting and harvesting dates for
about crop growing conditions, status and agro- 130 crops, in 283 AEZs, and with a temporal resolution
climatic conditions likely to impact global production of 15 days (as described in Sacks et al. 2010).
(Becker-Reshef et al. 2019). The information of CM is
aggregated at Administrative Unit Level 2 (AU) level, 3.1.4 ASAP
which are homogeneous boundaries based on the AEZ The European Commission’s Joint Research Center’s
of the FAO Global Administrative Unit Layers (GAUL). In ASAP system offers information about the SOS and
order to create the crop calendars, the CM considered the EOS of different crops in 80 priority countries for
the 5-year average of a complete crop growth cycle, the European Development Fund and those moni­
input from partners, remote sensing, and USDA and tored by GEOGLAM CM for Early Warning (CM4EW,
FAO crop calendars, employing a similar method to the Dimou, Meroni, and Rembold 2018; Rembold et al.
one Sacks et al. (2010). As a result, the CM crop calen­ 2019). Its crop calendars are downscaled from existing
dars offer information about the stage of the main crop calendars (i.e. FAO and USDA-FAS) to national and
commodities traded in the international food market. sub-national levels through LSP metrics. Their objec­
Particularly, the main crop stages that CM provide are tive was to study the link between LSP and crop
as follows: Planting, Early vegetative, Reproductive, calendars information. In order to create their own
Ripening, and Harvest; with 15 days of temporal reso­ crop calendars, they employed FAO (Fritz et al. 2019),
lution. Furthermore, it also offers this information for USDA (USDA 2006) and GrisP (GRiSP 2010) crop calen­
countries at risk of food insecurity through its Crop dars; crop yield and production data from country
Monitor for Early Warning (Becker-Reshef et al. 2020). statistics, government agencies and FAOSTATS; crop
masks from Pérez-Hoyos et al. (2017); and MODIS pro­
3.1.2 USDA-FAS duct MOD13A2v6 over the period 2013–2016. With
The USDA-FAS (https://ipad.fas.usda.gov/ogamaps/ that information, they detected the main crops and
cropcalendar.aspx) International Production and computed their LSP SOS and LSP EOS over the NDVI
Assessment (IPAD) division has the objective of mon­ profile averaged for the period. Then, they confronted
itoring the agricultural production of the major com­ LSP SOS and LSP EOS in a scatter plot to create clusters
modity crops, as well as assessing food security and extracted the start and the end of the LSP SOS and
throughout the world (Fritz et al. 2019; Becker- LSP EOS. Finally, they crossed this information with the
Reshef et al. 2010). In order to achieve this, USDA- existing crop calendars SOS and EOS, with a margin of
FAS employs a consensus of evidence approach by 30 days, to assign a crop type. The final product repre­
integrating reports, field surveys, climate data, crop sents the start and end of the LSP SOS and LSP EOS
models, and remote sensing data (Becker-Reshef et al. with a temporal resolution of 10 days, at a national or
2010). To support this, USDA-FAS created its own crop sub-national level, depending on the area.
calendars based on crop analysts input at national
level with a 15-day temporal resolution (as described
3.2. Baseline Maps
in Sacks et al. 2010). USDA-FAS SOS and EOS are
represented as a period rather than a single date to The baseline crop calendars are based on the integra­
account for possible local and annual variations. tion of existing crop calendars information: CM,
USDA-FAS, FAO and ASAP, in that order of prevalence.
3.1.3 FAO This order was decided based on their up-to-date
Currently, there are two crop calendars generated by information, documentation, applicability, and meth­
FAO: Global Information and Early Warning System odology. CM’s crop calendars cover a total of four
(GIEWS), a crop calendar at national-level for the phenological stages. Also, it is based on the best
Early Warning countries, and a crop calendar focused available data from different international agencies,
on 44 African countries (FAO 2021). In this study, given national ministries, and expert knowledge and it has
890 B. FRANCH ET AL.

been broadly used (Becker-Reshef et al. 2019). dates provided. Finally, given that ASAP is the only
Therefore, this product has been considered as the product based on EO data, it was only considered
base map whose missing information is complemen­ when any of the other products was not available
ted with the other products. ASAP, USDA-FAS, and for a given region. Looking for consistency between
FAO calendars’ information were considered based the products and given the range of dates provided
on their spatial consistency with the nearby data by ASAP, we selected within the range the closest
from CM. The geographical boundaries of each sub­ date to the nearby regions with calendar data
national unit of the baseline map, hereafter named as extracted from the other products.
polygons, were inherited from CM and ASAP. When
the information was recovered from USDA-FAS and 3.2.2 Considerations for maize
FAO, the administrative units of CM were used. Some regions in tropical areas can contain multiple
In this work, the baseline maps served as source of seasons of maize. CM and ASAP include information
training and validation data to develop the pixel- on two or even three maize growing seasons in some
based crop calendar maps. regions. However, since the models are led by climate
variables and those regions are generally restricted to
3.2.1 Considerations for wheat tropical areas, the maize calendar presented in this
Winter wheat crop does not refer to a specific wheat work just focus on one of them. Particularly, we
species, but to the timing when wheat is planted selected the first season defined in CM as the larger
(Sacks et al. 2010). In the Northern Hemisphere winter producing season (Crop Monitor 2021a) provides
wheat plants go through a winter period with tem­ a table with information on the season selected for
peratures below 0°C when the crop is generally cov­ maize), but ensuring maximum consistency with
ered by snow and is dormant. However, some neighboring countries.
Northern Hemisphere regions (e.g. Canada, northern To create the baseline SOS and EOS maps of maize,
United States of America (U.S.), northern Europe, we selected the first date within the range of dates
Ukraine, Russia, Kazakhstan) also plant spring wheat, provided in CM, USDA-FAS and FAO; and the most
whose main difference with winter wheat is that it is consistent date with the surrounding areas for ASAP.
planted during the spring and harvested later. In the
Southern Hemisphere, winter temperatures are
milder, and wheat does not go into dormancy. 3.3 Auxiliary data
These differences in the wheat phenological stages Table 2 shows a summary of all auxiliary data consid­
require an extra step in the definition of the wheat ered to build our pixel-based crop calendars. The
SOS. In the framework of WorldCereal, we aim to temperature, precipitation and dew point tempera­
generate a standardized wheat calendar of the ture averaged by month were downloaded for the
Northern and Southern hemispheres. Additionally, period 1990–2019 from ERA5-Land climate products
note that these calendars are static, and the particular (ERA5-Land 2021). ERA5-Land products are down­
climatology of each year may advance or delay the scaled from ERA5 products, improving its spatial reso­
SOS. Consequently, in this work, we selected the SOS lution to 0.1°. Minimum, maximum, average and
as 1 month before the start of the Vegetative- standard deviation statistics were computed monthly,
Productive stage as defined in CM. This means that seasonally, and annually for each product (Muñoz-
the selected SOS in the Northern Hemisphere is Sabater et al. 2021). Latitude and longitude of each
referred to winter wheat and does not capture the pixel represented by the x and y coordinates from the
planting date but the end of the dormancy of winter geographic projection were also selected as features.
wheat. In the case of USDA-FAS and FAO crop calen­
dars, which provide a range of dates when the SOS
Table 2. Auxiliary data considered in this study.
happens, we selected a month before the end of the Variable Source Resolution Aggregated
range of SOS to account for the typical delays 2 m air temperature ERA-5 Land 0.1° Month
between planting and emergence of the crop. 2 m air dew point temperature ERA-5 Land 0.1° Month
Precipitation ERA-5 Land 0.1° Month
Regarding the EOS, in CM, USDA-FAS and FAO we Latitude Own-made 0.5° -
selected the average date of the harvest range of Distances Own-made 0.5° -
GISCIENCE & REMOTE SENSING 891

Finally, following the methodology of Hengl et al. fitting we need to sample the training and valida­
(2018a), we also considered as features the euclidean tion data. This has been done using as a reference
distances among the training data. This step is quite the CM static coarse resolution crop-type maps
important since it will add to the model some non- (Crop Monitor 2021b). These maps represent
modeled spatial variation. All the data were the percent of each pixel covered by a specific
resampled to 50 km. crop at 0.05° resolution (approx. 5.6 km at the
equator). For each crop (i.e. wheat and maize), we
selected the pixel with the highest percentage of
3.4 WorldCereal reference data crop inside each polygon. With this approach, we
WorldCereal has generated a common input baseline assume that the crop calendar information corre­
(CIB) which is defined as a database of Analysis Ready sponds to the most representative area within each
Data (ARD) input patch cubes that is used as polygon, and this should be related to the area
a reference to train, evaluate and inter-compare the where the crop is more concentrated, that is,
different algorithms. These patch cubes are context- with a higher percentage of crop as defined by
aware stacks of ARD Sentinel-1, Sentinel-2, Landsat 8 the crop-type maps. As a result, the baseline
and non-EO data, coinciding with the reference data maps’ information was sampled considering
collected. By extracting a large enough context of a total of 354 samples of wheat (Figure 2a) and
imagery around each reference data point/polygon, 443 samples of maize (Figure 2b). However, these
the resulting input patch cubes are ready to serve data were not homogeneously distributed in space.
pixel-based as well as context-based algorithms. In order to achieve a homogeneous distribution,
Reference samples in the CIB are distributed into cali­ data was split in 10 groups based on their spatial
bration/validation/test sets to guarantee their inde­ information. Then, each group was split to three sub­
pendence and consistency during benchmarking of groups based on their SOS information, and sampled
the different products. The resulting CIB hosts all 10 points inside each subgroup, setting aside the non-
extracted ARD input patch cubes together with their selected data for validation purposes. Note that when
training labels, and metadata describing the origin of a subgroup did not meet the minimum requirement
each patch. The CIB is not a static database but evolves of 10 samples, the whole subgroup was selected as
simultaneously with the addition of newly collected training data.
reference datasets. The CIB used in this work is
shown in Table 3, showing that the WorldCereal pro­
ducts have been benchmarked to date for 61,291 sam­ 3.6 Crop Calendar data definition
ples spread around most continents, although a clear Crop calendars are generally described as the Day Of
bias is visible toward regions with easily accessible the Year (DOY) when the SOS or EOS happen.
reference data (such as the European LPIS and US However, DOYs range from 1 (January 1) to 365
crop data layer). Additionally, the html files provided (December 31) being DOY 1 and DOY 365 separated
as supplementary materials show the spatial distribu­ only for 1 day. This means that calendars have
tion of this dataset. Nevertheless, this CIB provides an a circular distribution, and should be treated as
unprecedented wealth of agricultural reference sam­ such during the modeling process (Mardia and
ples to evaluate the crop calendars. Note that in this Jupp 2000; Gatto and Jammalamadaka 2015).
work we just rely on the optical data from Sentinel-2 Following Cremers, we mapped the DOY into
and Landsat 8 to infer the LSP SOS and LSP EOS a unitary circle, in which each DOY is represented
validation data as is further described in section 3.8.2. by a unitary vector, a mean direction (i.e. an angle),
and sine and cosine components. During the spatial
prediction modeling), each component was mod­
3.5 Sampling strategy for the baseline maps
eled separately based on the auxiliary data. Finally,
The baseline maps provide constant calendar infor­ the resulting maps of each component were reor­
mation through each polygon. However, to gener­ dered back to a circle by estimating their arc tangent
ate the model and avoid artifacts caused by over (1) and were translated back to DOY.
892 B. FRANCH ET AL.

Table 3. CIB samples’ overview pointing out their respective source dataset, location, date of collection (year), geometry type, and the
number of samples present in the CIB.
Source Location Year Geometry Number of samples
LPIS Flanders Flanders region (Belgium) 2017–2019 POLYGON 8941
LPIS France France 2017–2019 POLYGON 8796
LPIS Latvia Latvia 2019 POLYGON 2960
LPIS Austria Austria 2017–2019 POLYGON 8899
JECAM -Kyiv- Ukraine (NASU-SSAU) Kyiv oblast (Ukraine) 2019 POLYGON 1029
SIGPAC Catalunya Catalonia province (Spain) 2019 POLYGON 2978
SIGPAC Andalucia Andalusia province (Spain) 2019 POLYGON 3094
USDA Crop Data Layer USA 2019 MAP 14709
Buenos Aires Grain Exchange Argentina Argentina 2018–2019 POLYGON 3558
Sen2Agri South Sudan South Sudan 2017 POINT/POLYGON 4505
Gardian PBI India (IFPRI) India 2017–2018 POINT 943
Gardian OCP Nigeria (IIATA) Nigeria 2017 POINT 399
RadiantEarth Uganda Uganda 2017 POLYGON 199
RadiantEarth Tanzania Tanzania 2018 POLYGON 249
RadiantEarth Kenya Kenya 2019 POLYGON 32
TOTAL 61,291

sinθ f ðXGsinθ ; XCsinθ Þ estimates of random decision trees using techniques


θ ¼ arctan ¼ arctan (1)
cosθ gðXGcosθ ; XCcosθ Þ as bagging (Hengl et al. 2018a; Géron 2019).
However, Random Forest doesn’t account for spatial
where θ is the mean direction representing the DOY,
autocorrelation since each observation is considered
sin and cosθ are the predicted sin and cosine compo­
independent. To overcome this and add a spatial
nents of the mean direction, f(⋅) (and g(⋅)) represent the
sense into the model, the model was trained with
trained spatial random forest algorithm, whose output
geographical features, more specifically euclidean
is the sine (and cosine) component that defines the
distances to each point, as proposed by Hengl et al.
calendar dates, and whose inputs are the geographical
(2018a).
(XG) and climate (XC) features of each model.
The spatial Random Forest model considered is
based on the Python library sklearn (Pedregosa
et al. 2011). Each component describing the crop
3.7 Spatial prediction modeling
calendars (i.e. sine and cosine once the data were
Kriging interpolation algorithms are frequently used transformed to circular metrics) was considered
to build spatial prediction algorithms with good independently in the modeling process. In each
results (Shukla et al. 2020; Xu et al. 2019; Lin et al. case, we assessed the importance of each feature
2018; Giustini et al. 2019). However, data must meet (i.e. latitude, climatic variables and distances)
the following assumptions (Hengl et al. 2018a; Hengl based on the permutation feature importance
2007): method (Altmann et al., 2010) and selected the
top 15. This number was chosen after carefully
(1) Normal distribution of the residuals analyzing the trade-off between error minimization
(2) The variogram must be estimated from the and number of features considered to avoid over
data before the application of the model fitting. Then, the model was trained based on the
(3) It is assumed an almost-linear relation among selected features, the number of trees, and the
dependent variables and covariates. maximum depth of each tree. Spatial autocorrela­
tion can generate overfitting of the model
Since crop calendar’s data have a circular distribu­ (Brenning 2012; Schratz et al. 2019) and underesti­
tion, it does not meet the first requirement, and it mate the error. Therefore, we implemented
does not ensure the third. As an alternative, our a spatial cross-validation algorithm similar to
model is based on the methodology presented by Brenning (2012). Our algorithm was iterated 10
(Hengl et al. 2018a) and tested by van den Hoogen times, creating 15 spatial groups in each iteration,
et al. (2019) and Hengl et al. (2018b). Basically, we based on the Leave-One-Group-Out algorithm.
applied a Random Forest algorithm, a robust Appendix A provides additional information about
machine learning algorithm based on ensemble the features selected. The resulting SOS and EOS
GISCIENCE & REMOTE SENSING 893

Figure 2. Training (white triangles) and validation (red dots) data selected from the wheat (top) and maize (bottom) baseline maps to
build and evaluate the pixel-based maps. Data location shown over the SOS baseline maps.

maps were finally post-processed to improve spa­ average filter with a moving window of 3 by 3
tial and temporal consistency. On the one hand, pixels. On the other hand, looking for the SOS
the resulting maps showed some noise scattered and EOS maps consistency, we computed the
along the coast lines caused by model inaccura­ Length of Season (LOS) and masked out values
cies. This noise was reduced by applying a circular with LOS lower than 30 days or greater than
894 B. FRANCH ET AL.

280 days. Finally, we used a circular average with improved cloud and shadow masking deployed in the
a moving window of 9 by 9 pixels to fill in the WorldCereal system consists of the erosion of the
masked areas and expand the coast borders. original mask by 30 m to reduce bright surfaces
wrongly labeled as clouds and the dilation of the
original mask by 200 m to reduce impurities near
3.8 Evaluation
cloud edges wrongly labeled as clear.
We validated the model following two strategies: (1) After applying the cloud masking step described
against the baseline validation samples and (2) above, the resulting images may still present some
against the LSP SOS and LSP EOS, based on the undetected clouds and shadows, which may nega­
WorldCereal reference data. tively impact the optical time series. Therefore, an
additional multitemporal gap detection and removal
3.8.1 Evaluation using baseline data algorithm was implemented (Van Hoolst et al., 2016).
The sampled dataset was divided into training, used This algorithm identifies local minima in time series of
to build the model, and test subsets, which were NDVI, which exceed a threshold and are hence
considered to evaluate the performance of the flagged to be added to the general time-series mask
model. In addition, to inter-compare the predicted which is then propagated into all individual optical
crop calendars and the baseline crop calendars, the bands of that sensor.
pixel-based data was aggregated at polygon level but In each location, the CIB extracts EO information
masking out areas with no crop by using as over a polygon where the centroid is labeled as
a reference the CM crop-type maps. Note that the a given crop. Therefore, we just extracted the surface
cited aggregation at polygon level was done by reflectance values time series of the central pixel and
means of circular averages (Mardia and Jupp 2000). then, we computed the NDVI from the red and near-
Finally, we computed at each polygon the circular infrared bands. Additionally, we computed a 3-daily
residuals and, from them, the Root Mean Square composite by taking the median reflectance values of
Error (RMSE) and R2 metrics from them. all available cloud-free acquisitions in a 6-day moving
window. However, the resulting NDVI time-series still
3.8.2 EO-based evaluation strategy presented some noise. Therefore, this noise was
EO data preprocessing. In this study, the CIB dataset smoothed by a moving average with a search window
is considered as a reference where EO data is of nine dates. Finally, we filtered out data acquired
extracted over locations where the crop type is well over locations whose NDVI pattern did not show
determined. Based on this dataset, we can apply sim­ a consistent NDVI pattern with the neighboring loca­
ple LSP models based on the EO signal, and more tions by estimating average NDVI profiles within each
specifically optical data, retrieved in each location subnational unit of the baseline map.
tagged as wheat or maize to infer the crop calendars
information. In this task, we work on surface reflec­ LSP method. We developed a simple algorithm to
tance data at 10 m resolution retrieved by Sentinel-2 detect and validate the LSP SOS and EOS metrics for
and Landsat 8. They both include a cloud mask which wheat and maize crops. It is based on the methodol­
in Sentinel-2 is based on the Sen2Cor scene classifica­ ogy presented in Jönsson and Eklundh (2004)
tion and in Landsat it is based on the FMask algo­ (Figure 3), where the SOS and EOS are determined
rithm. As pointed out by Baetens et al. (2019), both based on NDVI thresholds:
Sen2Cor and FMask cloud/shadow masks should be �
NDVISOS ¼ mprev þ NDVIpeak mprev � p (2)
dilated to reduce cloud- and shadow-related impuri­
ties and drastically boost the quality of the mask. At �
NDVIEOS ¼ mpost þ NDVIpeak mpost � p (3)
the same time, pixel-based cloud masks tend to be
sensitive to bright surfaces such as certain green­
m ¼ minðNDVIÞ (4)
house roofs, which are marked as cloudy in the
cloud masks. When applying a dilation to these where NDVISOS and NDVIEOS are the NDVI at the SOS
masks, wrongly labeled bright surfaces are smeared and EOS dates; mprev and mpost are the minimum
out and will mask otherwise clear areas. Therefore, the value of NDVI at the left (previous) and right
GISCIENCE & REMOTE SENSING 895

Figure 3. Example of NDVI time-series over a maize field in Argentina. The green and the red circle represent the LSP SOS and LSP EOS,
respectively; the black triangles point out the local minimums, the white triangle represents the season’s peak, and the shadowed
buffer from SOS to EOS highlights the crop’s season.

(posterior) of the peak of the season; p is a constant; being y the observed values, and the x the predicted
and NDVIpeak is the NDVI value at the peak of the values. However, since the distribution of DOY data is
season. Despite Jönsson and Eklundh (2004) set the circular, we had to modify RMSE metric to deal with
value of p as 0.1, we set it as 0.2 to ensure that the SOS circular distances by computing circular residuals
and the EOS are detected in the correct ranges (i.e. instead of linear ones:
from the minimum to the peak and from the peak to a ¼ jx yj (7)
the minimum, respectively). Finally, we selected the
closest date of the time-series to the NDVI SOS and b ¼ j2π aj (8)
NDVI EOS as the SOS and the EOS dates.
However, given the simplicity of the LSP model, we d� ¼ minfa; bg (9)
established a time window to apply these equations.
obtaining a circular RMSE when combining
Basically, we defined a buffer of ±30 days based on
Equations 5–9:
the baseline crop calendars maps when equations 2–
vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
4 were applied. Next, the resulting LSP SOS and EOS u N
u1 X
were aggregated at polygon level and were evaluated RMSEcirc ¼t d2 (10)
N i¼1 circ
against the baseline map data. Finally, the proposed
pixel-level calendar maps were evaluated against the Finally, we represented the estimated SOS and EOS
LSP metrics. maps against the baseline calendars or LSP SOS/LSP
EOS metrics using two-dimensional (y = f(x)) plots.
3.8.3 Evaluation metrics While these plots are visual and easy to interpret,
The root mean squared error (RMSE; Géron 2019) and this supposes the linearization of a non-linear para­
the R2 metrics were estimated to assess the perfor­ meter. Thus, to minimize the error caused by the DOY
mance of the model. The RMSE is defined as circular nature and taking advantage of the grouping
vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi of the Northern and Southern Hemisphere DOYs (as
u N
u1 X will be shown in the results section) the observed and
RMSE ¼ t d2 (5)
N i¼1 i predicted values were shifted by a cycle of 2π (i.e.
365 days) forcing some DOYs by the end of the
where N is the number of observations, and di are calendar year to have negative values. Particularly,
linear residuals usually defined as SOS and EOS values greater than 250 days for
wheat, and values greater than 210 (SOS) and 350
d¼x y (6)
(EOS) days for maize, were shifted by a cycle
896 B. FRANCH ET AL.

(365 days). In this context, the reported R2 statistic, December, while in the south from June to July; in
which assesses how strong the linear relationship is Chile, the harvesting dates span from March to July; in
between two variables and consequently should be the South of Africa, the dates span from September to
taken carefully, is defined in James et al. (2013) as November; and in Australia and New Zealand the
P 2 harvesting dates span from October to January; and
2 d
R ¼1 P (11) Indonesia, the dates span from July to August.
ðy �yÞ2

where y is an observation and ȳ is the average of the 4.1.1 Evaluation


observations. Evaluation using baseline data. Figure 5 shows the
evaluation of the modeled crop calendars with the
baseline data set aside for validation purposes and
4 Results Figure 6 shows the spatial distribution of the resi­
duals. The SOS evaluation (Figure 5a) shows an R2 of
4.1 Wheat
0.87 and an RMSE of 27 days. Figure 6a shows that
Figure 4(a, b)) shows the SOS and EOS predicted maps most of the residuals (over 85%) are lower than
masking out the non-cropland areas using the 30 days. However, there are two major outliers
Copernicus Land Cover product. In the case of the where the baseline provides a SOS around DOY 200
SOS, we can distinguish four main behaviors in and the modeled maps show values around 1.
terms of geographical locations: the Northern Looking at Figure 6a these two polygons are located
Hemisphere, the Mediterranean basin, and its sur­ in Tanzania and Yemen, where the lack of data forces
roundings down to the Equator, the effect of the the extrapolation of the model. Additionally, there are
tropics and the rainforests area, and the Southern some cases, represented in the scatter plot with hor­
Hemisphere. The Northern Hemisphere SOS after dor­ izontal lines, where the model shows some variability
mancy generally spans from January to April. whilst the baseline provides constant SOS.
However, surrounding the Mediterranean basin and Figure 5b shows the evaluation of the EOS maps
the across Middle East, the planting dates span from with an R2 of 0.92 and an RMSE of 26 days. The
November to February. In equatorial areas such as residual map (Figure 6b) shows that 82% of them
North Africa, the planting dates span from are lower than 30 days. However, there are some
December to August, while in Central America and residuals larger than 60 days (less than 3%) and are
the North of South America, the planting dates span located in Guatemala, Chile, Tanzania and Yemen
from March to April. Finally, in the Southern (outliers in Figure 5b). Additionally, in Figure 6b,
Hemisphere, the planting dates span from April to there is some horizontal data with baseline EOS
August. around the beginning of the year, while the modeled
There are three general patterns for wheat EOS EOS provides a variation from DOY 325 (−40 in the
(Figure 4b): the homogeneous pattern in the plot) to DOY 50.
Northern Hemisphere, whose dates increase as lati­
tude decrease; the equatorial area, with different pat­ EO-based evaluation. The supplementary material
tern in America, North and South Africa and Asia; and provides a html file where the reader can explore
the Southern Hemisphere, where Argentina and Chile, spatially the wheat sites within the CIB, select
Southern Africa and Oceania show different dates. a given site, and check its corresponding wheat
Specifically, in the Northern Hemisphere, including NDVI seasonal curve (highlighted in gray) and the
the Mediterranean basin, the dates span from April resulting LSP SOS and LSP EOS metrics (red dots).
to July; in the equatorial area, the dates span from Figure 7 shows the evaluation of all these LSP metrics
August to October in central and south America, in (i.e. SOS and EOS) against the baseline crop calendars
Africa from December to May at the north of the values by aggregating the LSP singular values to poly­
Equator and from October to March at the south, gon level. As discussed in the Introduction section,
and in Asia from March to April. The Southern a bias was expected between both datasets caused by
Hemisphere shows a heterogeneous pattern: dates the different crop development stages that each of
in the north of Argentina spans from November to them represents. In fact, the LSP metrics were
GISCIENCE & REMOTE SENSING 897

Figure 4. Wheat crop calendar maps over a base map from the Google Earth dataset. Locations with no agricultural areas are excluded.
Furthermore, the color legend is circular, meaning that the blue color of December 31 matches the blue color of January 1.

corrected for a constant shift of −20 days in the case of the variance, while the LSP EOS explains 97% of the
of the LSP SOS and +27 days in the case of the EOS variance. However, the SOS evaluation shows
LSP. These bias values were extracted from the linear a horizontal pattern along DOY 40, that is caused by
regression’s intercept of the baseline vs LSP metrics the baseline crop calendars constant values, while the
before correcting the shift. The LSP SOS explains 82% LSP metrics show some variability. These
898 B. FRANCH ET AL.

Figure 5. Wheat crop calendar evaluation at polygon scale.

discrepancies are located across northern states in the Specifically, in the Northern Hemisphere the SOS
US where CM shows constant dates of the transition span from April to May, from August to December
to vegetative phase. in the west of Southern Hemisphere, and from
After the successful LSP model evaluation with the August to late November in the east of the
baseline data, these results were then considered to Southern Hemisphere. However, there are some
evaluate the modeled SOS and EOS maps with the dates that do not match these rules, such as the
computed LSP SOS and EOS (Figure 8). The predicted dates of December and January in Indonesia; about
SOS explains 75% of the variance of the LSP SOS with April and May in Vietnam, Philippines and Venezuela;
an RMSE of 25 days, while the EOS explains 88% of the and from February to March in the equatorial west of
variance and an RMSE of 18 days. There are vertical Africa.
lines indicating generalization of the predicted SOS In the case of the EOS (Figure 9b), the global pat­
with respect to the LSPs. This might be caused either tern is more continuous. Generally, we can split the
by the inter-annual LSP metrics variation (note that results into three areas, the Northern Hemisphere, the
the CIB contains data from 2017 to 2021) or because west of the Southern Hemisphere and the east of the
there is an actual variation of the calendars that the Southern Hemisphere. In the Northern Hemisphere,
model cannot capture. the dates span from July to November, without con­
sidering the area of Malaysia; in the west of the
Southern Hemisphere the dates span from May to
4.2 Maize November, although the latter can be
Figure 9 shows the predicted SOS and EOS dates for a consequence of the extrapolation; and in the east
maize masking out the non-cropland areas using the of the Southern Hemisphere, including Malaysia, the
Copernicus Land Cover product. Maize maps show EOS span from December to June, with the latter days
a more constant pattern than wheat in both SOS and over Australia and New Zealand.
EOS. At this point we must keep in mind that the
SOS of wheat in the Northern Hemisphere represents 4.2.1 Evaluation
the winter wheat emergence after dormancy. Thus, Evaluation using baseline data. Figure 10 shows
the variability of the wheat SOS, and consequently the evaluation of the modeled metrics against the
EOS, is linked to the winter temperatures in each baseline data at polygon level, while Figure 11
area. In the SOS (Figure 9a), the map shows three shows the spatial distribution of the residuals. Maize
different patterns in the Northern Hemisphere, the SOS results (Figure 10a) show a good agreement with
west and the east of the Southern Hemisphere. an R2 of 0.88 and RMSE of 24 days. Over 87% of
GISCIENCE & REMOTE SENSING 899

Figure 6. Wheat residuals at polygon scale.

residuals of the SOS (Figure 11a) are lower than model while dates are closer in a circular context.
30 days and only 3% are larger than 60 days, and Note that the data represented in Figures 10a and
are located in the south of Brazil, Congo, Angola, 10b are grouped in two main clusters, which refer
Uganda, Saudi Arabia, India, Thailand and Philippines. to the Northern and Southern Hemisphere data.
Maize’s EOS evaluation (Figure 10b) shows an Looking at the residuals map (Figure 11b), most
2
R of 0.79 and an RMSE of 28 days. Note that the of the residuals (88%), and mainly in the Northern
low R2 is mainly caused by the data in the bot­ Hemisphere, are lower than 30 days. However,
tom right area, which do not fit well to a linear compared to the previous maps’ analysis, it
900 B. FRANCH ET AL.

Figure 7. Wheat LSP SOS and EOS evaluation against baseline.

Figure 8. Wheat SOS and EOS modeled maps evaluation against LSP metrics. The colors represent the data density.

shows the largest percentage of the residuals EO-based evaluation. Similar to wheat, the supple­
with values greater than 60 days (4%). These mentary material provides a html file where the
errors are located in Mexico, Brazil, Chile, reader can explore spatially the maize sites within
Guinea, Nigeria, Congo, Saudi Arabia, Thailand, the CIB, select a given site, and check its correspond­
Vietnam and Philippines. ing maize NDVI seasonal curve (highlighted in gray)
The source of the high residual errors in some and the resulting LSP SOS and LSP EOS metrics (red
areas is heterogeneous. In Brazil, Saudi Arabia and dots). Figure 12 shows the evaluation of all these LSP
India, the errors are linked to the large size of the metrics with the baseline crop calendars. As discussed
polygons, where the baseline maps provide in the previous section, the direct comparison of both
a constant calendar data while the climate data datasets shows a constant bias that has been cor­
show very different climate patterns. Meanwhile, rected in these plots by shifting +8 days the LSP SOS
in other areas such as Congo or Guinea, the and +15 the LSP EOS. The SOS evaluation shows an
model cannot describe the calendar dates. These RMSE of 37 days and an R2 of 0.64, while the EOS
areas are generally located in the Tropics, where evaluation shows an RMSE of 24 days and an R2 of
the SOS and EOS dates might be better explained 0.89. Also in this case, the observations are split into
by other socioeconomic factors rather than two groups that reference the Northern and Southern
climatology. concentrations of the baseline crop calendar.
GISCIENCE & REMOTE SENSING 901

Figure 9. Maize crop calendar dates over a base map from the Google Earth dataset. Locations with no agricultural areas are excluded.
Furthermore, the color legend is circular, meaning that the blue color of December 31 matches the blue color of January 1.

Figure 13 shows the evaluation of the modeled SOS variance with an RMSE of 35 days; and they can explain
and EOS maps with the LSP metrics considering the 88% of the LSP EOS variance with an RMSE of 24 days. In
aforementioned bias correction. The results show that this case, there are some vertical patterns that show LSP
the proposed maps can explain 80% of the LSP SOS variance while the proposed maps remain constant.
902 B. FRANCH ET AL.

Figure 10. Maize R2 evaluation at polygon scale.

5 Discussion or 15 days. Therefore, in the framework of the


WorldCereal project where these maps will be used
In this work, we inferred continuous (pixel-based)
by the classification algorithms, 30 days before the
global wheat and maize crop calendar maps based
vegetative stage ensures that, independently on the
on discrete information from existing global crop
yearly weather variation, the plant has not emerged
calendar products and meteorological data. First, we
yet. However, this selection, that provides consistency
built a baseline map by combining the information
between the Northern and Southern hemisphere,
provided by the existing products. CM was consid­
omits the winter wheat period before dormancy in
ered as the basemap and the other products comple­
the Northern hemisphere.
mented its information over the missing areas. Special
Second, the baseline SOS and EOS maps are gen­
attention was put toward the different products’ con­
eralized through the different countries or subna­
sistency in definition and temporal and spatial resolu­
tional units (polygons). Therefore, the baseline
tions. This was a challenging task given the range of
maps were sampled by assuming that the calendar
dates that each product considers to define the SOS
information within each polygon is representative of
or the period between planting and vegetative phase.
the area where the given crop is predominant. This
In fact, according to Whitcraft, Becker-Reshef, and
Justice (2015b), crop calendars are generally outdated was achieved by using as a reference the static crop-
and poorly documented, leading to a lack of confi­ specific masks from CM and representing
dence in the accuracy of the planting and harvesting the percent of each pixel covered by a specific
dates. In the framework of WorldCereal, whose main crop. We acknowledge that assigning to a given
objective is building robust wheat and maize classifi­ area the average crop calendar dates of the whole
cation algorithms and looking for coherence of the region might be biased in in countries with large
wheat’s SOS definition across the Northern and agricultural regions. For example, in the Middle East
Southern Hemisphere, we defined the SOS as 1 countries where the climate conditions in the same
month prior to the beginning of the vegetative period have considerable differences through the
phase. As explained in the Materials and methods north to south or east to west, the crops’ (i.e.
section, we selected this period of time after comput­ wheat and maize) SOS and EOS and their classifica­
ing the average time period between planting and tion patterns should be different. This is in fact the
the vegetative stage of wheat in the Southern major source of error causing the high residuals in
Hemisphere based on CM, which was estimated as 2 the maize calendars across the Middle East and India.
months. Note that the products considered in the However, this is the best information currently
baseline map provided the data in time frames of 10 available.
GISCIENCE & REMOTE SENSING 903

Figure 11. Maize residuals at polygon scale.

Third, we built a model to capture the calendar’s homogeneously distributed. This could be the conse­
spatial variability mainly based on meteorological quence of the low variability of the baseline calendars
data (dew point temperature and precipitation). in the Northern Hemisphere. The main advantage of
However, Sacks et al. (2010) explained that in some the proposed model is that given its simplicity it could
cases socioeconomic and cultural aspects have be easily applied to infer crop calendars of other
a strong impact on the crop calendars. Looking at annual crops.
the features’ importance (Appendix A), climate vari­ In the evaluation of the model based on a test
ables have a greater weight in the wheat compo­ subset of the baseline maps, the results show that
nents, but this is not the case for maize models, a model based on non-linear modeling of crop calen­
where the features’ importance is lower and more dar’s information at global scale shows good
904 B. FRANCH ET AL.

Figure 12. Maize LSP SOS and LSP EOS evaluation against baseline.

Figure 13. Maize SOS and EOS modeled maps evaluation against LSP metrics. The colors represent the data density.

performance metrics with SOS (EOS) R2 of 0.87 (0.92) Also, in large countries or subnational units with
and an RMSE of 27 (26) days for wheat and SOS (EOS) diverse agricultural conditions and very limited infor­
R2 of 0.88 (0.79) and an RMSE of 24 (28) days for maize. mation such as the Middle East and India as men­
These metrics show the good performance of the tioned above. Additionally, higher latitudes and
presented products when compared to previous isolated isles are generated based purely on extrapo­
works in the literature such as Stehfest et al. (2007), lation and buffering. Thus, these areas are expected to
whose results showed an R2 of 0.67 for wheat and 0.66 have high uncertainties. We must note the limited
for maize planting dates. However, these metrics are information available on crop calendars to build
limited to areas with available data. Hence, any calen­ a strong baseline of well-distributed training samples.
dar information outside these zones must be taken Given that fine-scale ground data are not available,
carefully, especially in the tropics where errors are we examined reported crop calendars in the literature.
larger according to the residual maps shown in Next, we highlight some cases where there are some
Figures 6 and 11. This might be a consequence: of discrepancies among products. In Tanzania, where the
the lack of data available to train the models, their literature and the proposed calendars define the wheat
multiple agricultural seasons, and their lower depen­ SOS in April, the baseline points out the SOS earlier in
dency on the climatology (main driver of our model). December. In fact, this is the largest residual (difference
GISCIENCE & REMOTE SENSING 905

between modeled and baseline SOS) in Africa shown in This might be a consequence of socio-economic factors
Figure 6. This inconsistency can be explained by the (that explain a large spread in LSP), the errors gener­
fact that in this particular polygon, the baseline shows ated by the simple LSP model considered, the presence
a very different SOS compared to the neighboring of errors in the in-situ reference data or the wide range
countries. Therefore, the model used to generate the of dates between the planting and vegetative phase
proposed calendars provided a stronger weight to the (and the ripening through harvest phase) that the
consistency across the neighboring countries, provid­ existing products currently have which minimizes the
ing a more accurate crop calendar estimation. In Nepal, spatial variation between polygons. Once evaluated,
both the baseline maps and the proposed calendars this data was then used to evaluate the predicted
match the dates reported in the literature (Thapa et al. calendar maps. This evaluation showed a SOS (EOS)
2020) excepting the EOS of the western region, where R2 of 0.75 (0.88) and an RMSE of 25 (18) days for
the baseline agrees with the reported calendar while wheat and a SOS (EOS) R2 of 0.80 (0.88) and an RMSE
the proposed calendars are shifted from 30 to 50 days. of 35 (24) days for maize. These results are consistent
Finally, in southern Brazil (Becker et al. 2020), both with the evaluation based on the baseline test dataset.
datasets SOS are shifted 2 months compared to the While the different crops and season showed equiva­
reported calendars, while the EOS is shifted 1 month. lent R2, the SOS evaluation errors were higher than the
Finally, the calendar maps were also evaluated EOS errors and maize showed higher errors than wheat.
based on the outputs from a simple LSP model based The higher errors found in the SOS maps when com­
on Sentinel-2 observations over in-situ reference sam­ pared to EOS can be a consequence of the larger
ples whose crop type is known. Given the simplicity of variability of the planting date when compared to the
the LSP model considered the SOS and EOS detection harvesting date. Meanwhile, maize shows higher errors
might show some inaccuracies. However, building than wheat given the larger amount of data analyzed
a strong LSP model is out of the scope of this work and their wider spatial distribution. Additionally, the
and the LSP data is just considered as an additional performance metrics are well aligned to other studies
evaluation dataset of the proposed maps. Looking at found in the literature such as Mishra et al. (2021),
the NDVI seasonal evolution across the different sites of whose results based on rice crop calendars showed
the CIB (supplementary materials) the considered LSP SOS and EOS estimations with an R2 of 0.88 and 0.82,
model generally captures well the LSP SOS and LSP and an RMSE of 34.2 days and 46.5 days, respectively.
EOS both for wheat and maize. The resulting data Despite the overall good metrics there are still some
derived from the LSP model was aggregated at poly­ discrepancies that can be caused by the yearly meteor­
gon level and was evaluated against the baseline data­ ological variations, which are captured by the LSP
set. These datasets showed a SOS (EOS) bias of 20 (27) metrics but are not represented in the static calendar
days in wheat and 8 (15) days in maize. This is a direct maps. Additionally, the horizontal pattern found in
consequence of how the SOS or EOS is defined in each Figure 7a corresponds to SOS data across the US and
product and the shift considered in each case to linear­ Europe and are caused by the stability of the baseline
ize the circular calendar dates. While the proposed SOS crop calendars whilst the LSP SOS show a variation
maps consider the SOS as one month before the vege­ that can get to 100 days.
tative phase, the LSP metrics detect the presence of The presented model is novel, and with the colla­
vegetation and therefore are detecting a crop stage boration of the agricultural communities around the
well within the vegetative phase. Reversely, the EOS as world to increase the quantity and quality of the refer­
defined in the baseline considers the average date ence data, and with the integrated use of ground and
within the harvest dates, while the LSP metrics detect EO data, it would be feasible to obtain more precise
the senescence of vegetation. When correcting this outputs. Particularly, LSP maps could be used to
constant bias, the evaluation showed a good consis­ improve the resolution of the calendars and their accu­
tency in both crops with an RMSE of 34 (17) days and racy in some regions where there is a lack of informa­
an R2 of 0.82 (0.97) for wheat SOS (EOS) and an RMSE of tion. However, while there are many different methods
37 (24) days and an R2 of 0.64 (0.89) for maize SOS to infer LSP metrics, if they are crop specific, they need
(EOS). However, there are some cases where the base­ to rely on crop-type maps. WorldCereal will produce
line data remain constant while the LSP metrics do not. those at global scale and at fine spatial resolution.
906 B. FRANCH ET AL.

However, this project will rely on the current crop Atzberger, C. February 2013. “Advances in Remote Sensing of
calendars first, to trigger the algorithms within Agriculture: Context Description, Existing Operational
1 month after the EOS and second to normalize the Monitoring Systems and Major Information Needs.” Remote
Sensing 5 (2): 949–981. 2072-4292. http://www.mdpi.com/
metrics with the accumulated GDD from the SOS. Thus,
2072-4292/5/2/949
in future studies, these maps can be improved in an Baetens, L., C. Desjardins, and O. Hagolle. 2019. “Validation of
iterative way with the WorldCereal crop-type maps. Copernicus Sentinel-2 Cloud Masks Obtained from MAJA,
As a summary and replying to the science ques­ Sen2cor, and Fmask Processors Using Reference Cloud
tions stated in the introduction section, in this work Masks Generated with a Supervised Active Learning
we have successfully applied a spatial Random Forest Procedure.” Remote Sensing 11 (4): 433.
Banerjee, A., and D. S. Dawn. 2020. “Preparation of Crop
algorithm to exploit the current crop calendar data­
Calendar on Mangalbari Town under Matiali Block,
sets available and build SOS and EOS winter wheat Jalpaiguri District.” Agriculture Journal IJOEAR 6 (12): 10.
and maize calendars that provide continuous spatial Becker, W. B., J. Richetti, E. Mercante, J. C. Esquerdo, C. A. Silva,
variability (replying to Q1). Regarding validation, A. Paludo, and J. A. Johann. August 2020. “Agricultural
given the scarcity of existing ground data and calen­ Soybean and Crop Calendar Based on Moderate Resolution
dars, as well as their unknown accuracy, it is very Satellite Images for Southern Brazil.” Semina: Ciências
Agrárias 41: 2419–2428. 16790359. http://www.uel.br/revis
challenging to evaluate our calendars in a rigorous
tas/uel/index.php/semagrarias/article/view/37964
way. In this context, a simple LSP model based on EO Becker, W. B., L. Silva, J. Richetti, T. Ló, and J. Johann. February
data acquired over already known crop types, was 2021. “Harvest Date Forecast for Soybeans from Maximum
a useful source to evaluate those maps, though as Vegetative Development Using Satellite Images.”
discussed above there are some limitations that International Journal of Remote Sensing 42: 1121–1138.
should be considered (LSP model errors and misclas­ 0143-1161. https://www.tandfonline.com/doi/full/10.1080/
01431161.2020.1823042
sification of the data) (replying to Q2).
Becker-Reshef, I., C. Justice, M. Sullivan, E. Vermote, C. Tucker,
A. Anyamba, J. Small, et al. June 2010. “Monitoring Global
Croplands with Coarse Resolution Earth Observations: The
Data availability statement Global Agriculture Monitoring (GLAM) Project.” Remote
The four crop calendar maps created in this study are available Sensing 2 (6): 1589–1609. 2072-4292. http://www.mdpi.
through the GitHub repository https://github.com/ucg-uv com/2072-4292/2/6/1589
/research_products/tree/main/cropcalendars. Becker-Reshef, I., B. Franch, B. Barker, E. Murphy, A. Santamaria-
Artigas, M. Humber, S. Skakun, and E. Vermote. October 2018.
“Prior Season Crop Type Masks for Winter Wheat Yield
Disclosure statement Forecasting: A US Case Study.” Remote Sensing 10 (10): 1659.
2072-4292. http://www.mdpi.com/2072-4292/10/10/1659
No potential conflict of interest was reported by the author(s). Becker-Reshef, I., B. Barker, M. Humber, E. Puricelli, A. Sanchez,
R. Sahajpal, K. McGaughey, et al. December 2019. “The
GEOGLAM Crop Monitor for AMIS: Assessing Crop
Funding Conditions in the Context of Global Markets.” Global Food
Security 23: 173–181. 22119124. https://linkinghub.elsevier.
This study was funded by the ESA WorldCereal project com/retrieve/pii/S221191241830155X
(4000130569/20/I-NB) and by the program Generacio Talent Becker-Reshef, I., C. Justice, B. Barker, M. Humber, F. Rembold,
of Generalitat Valenciana (CIDEGENT/2018/009);Conselleria de R. Bonifacio, . . . J. Verdin. 2020. “Strengthening Agricultural
Innovación, Universidades, Ciencia y Sociedad Digital, Decisions in Countries at Risk of Food Insecurity: The
Generalitat Valenciana. GEOGLAM Crop Monitor for Early Warning.” Remote Sensing
of Environment 237: 111553. doi:10.1016/j.rse.2019.111553.
Benami, E., Z. Jin, M. R. Carter, A. Ghosh, R. J. Hijmans, A. Hobbs,
ORCID B. Kenduiywo, and D. B. Lobell. February 2021. “Uniting
Belén Franch http://orcid.org/0000-0003-0593-7874 Remote Sensing, Crop Modelling and Economics for
Agricultural Risk Management.” Nature Reviews Earth &
Environment 2 (2): 140–159. 2662-138X http://www.nature.
References com/articles/s43017-020-00122-y
Bornez, K., A. Descals, A. Verger, and J. Peñuelas. 2020. “Land
Altmann, A., L. Toloşi, O. Sander, and T. Lengauer. 2010. Surface Phenology from VEGETATION and PROBA-V Data.
“Permutation Importance: A Corrected Feature Importance Assessment Over Deciduous Forests.” International Journal
Measure.” Bioinformatics 26 (10): 1340–1347. of Applied Earth Observation and Geoinformation 84: 101974.
GISCIENCE & REMOTE SENSING 907

Bórnez, K., A. Descals, A. Verger, and J. Peñuelas. February 2020. Franch, B., E. Vermote, S. Skakun, J. Roger, I. Becker-Reshef,
“Land Surface Phenology from VEGETATION and E. Murphy, and C. Justice. April 2019. “Remote Sensing Based
PROBA-V Data. Assessment over Deciduous Forests.” Yield Monitoring: Application to Winter Wheat in United States
International Journal of Applied Earth Observation and and Ukraine.” International Journal of Applied Earth Observation
Geoinformation 84: 101974. 03032434 https://linkinghub. and Geoinformation 76: 112–127. 03032434. https://linkinghub.
elsevier.com/retrieve/pii/S0303243419301278 elsevier.com/retrieve/pii/S0303243418306536
Brenning, A. “Spatial cross-validation and Bootstrap for the Franch, B., E. Vermote, S. Skakun, A. Santamaria-Artigas, N.
Assessment of Prediction Rules in Remote Sensing: The Kalecinski, J.-C. Roger, I. Becker-Reshef, B. Barker, C. Justice,
R Package Sperrorest.” In 2012 IEEE International and J. Sobrino. December 2021. “The ARYA Crop Yield
Geoscience and Remote Sensing Symposium, 5372–5375, Forecasting Algorithm: Application to the Main Wheat
Munich, Germany, July 2012. IEEE. doi: 10.1109/ Exporting Countries.” International Journal of Applied Earth
IGARSS.2012.6352393. Observation and Geoinformation 104: 102552. 03032434.
Caparros-Santiago, J. A., V. Rodriguez-Galiano, and J. Dash. https://linkinghub.elsevier.com/retrieve/pii/S0303243
January 2021. “Land Surface Phenology as Indicator of 421002592
Global Terrestrial Ecosystem Dynamics: A Systematic Fritz, S., L. See, J. C. L. Bayas, F. Waldner, D. Jacques, I. Becker-
Review.” ISPRS Journal of Photogrammetry and Remote Reshef, A. Whitcraft, et al. January 2019. “A Comparison of
Sensing 171: 330–347. 09242716. https://linkinghub.else Global Agricultural Monitoring Systems and Current Gaps.”
vier.com/retrieve/pii/S0924271620303269 Agricultural Systems 168: 258–272. 0308521X. https://linkin
Crop Monitor. “Classification System.” 2021a. https://cropmoni ghub.elsevier.com/retrieve/pii/S0308521X17312027
tor.org/index.php/crop-conditions/classification-systems/ Garcia-Perez, M. A., L. Quesada-Ruiz, J. A. Caparros-Santiago,
de Castro, A., J. Six, R. Plant, and J. Peña. November 2018. E. Sánchez-Rodríguez, and V. Rodriguez-Galiano. September
“Mapping Crop Calendar Events and Phenology-Related 2021. “A Case of Study of Land Surface Phenology for CAP
Metrics at the Parcel Level by Object-Based Image Analysis Management: Using Sentinel-2 Data to Obtain
(OBIA) of MODIS-NDVI Time-Series: A Case Study in Central Phenometrics for Winter Cereals in Andalusia, Spain.” In
California.” Remote Sensing 10 (11): 1745. 2072-4292 http:// Remote Sensing for Agriculture, Ecosystems, and Hydrology
www.mdpi.com/2072-4292/10/11/1745 XXIII, edited by C. M. Neale and A. Maltese, 6. Online Only,
Descals, A., A. Verger, G. Yin, and J. Penuelas. 2021. Spain: SPIE. doi:10.1117/12.2600114.
“A Threshold Method for Robust and Fast Estimation of Gatto, R., and S. R. Jammalamadaka. March 2015. “Directional
land-surface Phenology Using Google Earth Engine.” IEEE Statistics: Introduction.” In Wiley StatsRef: Statistics Reference
Journal of Selected Topics in Applied Earth Observations and Online, edited by N. Balakrishnan, T. Colton, B. Everitt,
Remote Sensing 1. 1939-1404, 2151-1535 https://ieeexplore. W. Piegorsch, F. Ruggeri, and J. L. Teugels, 1–8. Chichester, UK:
ieee.org/document/9265265/ John Wiley & Sons, . doi:10.1002/9781118445112.stat00201.pub2.
Dimou, M., M. Meroni, and F. Rembold. 2018. Development of Géron, A. 2019. Hands-On Machine Learning with Scikit-Learn,
a National and sub-national Crop Calendars Data Set Keras, and Tensorflow: Concepts, Tools, and Techniques to
Compatible with Remote Sensing Derived Land Surface Build Intelligent Systems. second ed. USA: O’Reilly Media.
Phenology: Technical Description of the Selection Method Giustini, F., G. Ciotoli, A. Rinaldini, L. Ruggiero, and M. Voltaggio.
Used for Building the Crop Calendars Data Set of the Anomaly April 2019. “Mapping the Geogenic Radon Potential and Radon
Hot Spots of Agricultural Production (ASAP). LU: Publications Risk by Using Empirical Bayesian Kriging Regression: A Case
Office. https://data.europa.eu/doi/10.2760/25859 Study from A Volcanic Area of Central Italy.” Science of the Total
ERA5-Land, 2021. European Centre for Medium-Range Weather Environment 661: 449–464. 00489697. https://linkinghub.else
Forecasts. Accesed the 10th of December of 10th December vier.com/retrieve/pii/S0048969719301640
2021. https://www.ecmwf.int/en/era5-land Godfray, H. C. J., J. R. Beddington, I. R. Crute, L. Haddad,
FAO. “Climate Change and Food Security: A Framework D. Lawrence, J. F. Muir, J. Pretty, S. Robinson, S. M. Thomas,
Document.” FAO, Viale delle Terme di Caracalla 00153 Rome, and C. Toulmin. February 2010. “Food Security: The
Italy, 2008. https://www.fao.org/3/k2595e/k2595e00.htm Challenge of Feeding 9 Billion People.” Science 327 (5967):
FAO. “Climate Change and Food Security: Risks and Responses.” 812–818. 0036-8075, 1095-9203. https://www.science.org/
FAO, 2015. https://www.fao.org/documents/card/en/c/ doi/10.1126/science.1185383
82129a98-8338-45e5-a2cd-8eda4184550f/ GRiSP. “Global Rice Science Partnership.” 2010.
FAO. “FAO Crop Calendar.” 2021. https://cropcalendar.apps. Helman, D. March 2018. “Land Surface Phenology: What Do We
fao.org/home Really ‘See’ from Space?” Science of the Total Environment
Franch, B., E. F. Vermote, I. Becker-Reshef, M. Claverie, J. Huang, 618: 665–673. 00489697. https://linkinghub.elsevier.com/
J. Zhang, and J. A. Sobrino. 2015. “Improving the Timeliness retrieve/pii/S004896971731954X
of Winter Wheat Production Forecast in the United States of Hengl, T.2007.European Commission, Joint Research Centre, and
America, Ukraine and China Using MODIS Data and NCAR Institute for Environment and Sustainability (Ispra,
Growing Degree Day Information.” Remote Sensing of I. A Practical Guide to Geostatistical Mapping of
Environment 161: 131–148. Environmental Variables.Luxembourg: Publications Office.
908 B. FRANCH ET AL.

Hengl, T., M. Nussbaum, M. N. Wright, G. B. Heuvelink, and Mardia, K. V., and P. E. Jupp.2000.Directional Statistics.
B. Gräler. August 2018a. “Random Forest as a Generic Wiley Series in Probability and Statistics.Chichester;
Framework for Predictive Modeling of Spatial and New York: J. Wiley.
spatio-temporal Variables.” PeerJ 6: e5518. 2167-8359. https:// Mbow, C., C. Rosenzweig, L. G. Barioni, T. G. Benton, M. Herrero,
peerj.com/articles/5518 M. Krishnapillai, E. Liwenga, et al. 2019. “Food Security.” In
Hengl, T., M. G. Walsh, J. Sanderman, I. Wheeler, S. P. Harrison, and Climate Change and Land: An IPCC Special Report on Climate
I. C. Prentice. August 2018b. “Global Mapping of Potential Change, Esertification, Land Degradation, Sustainable Land
Natural Vegetation: An Assessment of Machine Learning Management, Food Security, and greenhouse Gas Fluxes in
Algorithms for Estimating Land Potential.” PeerJ 6: e5457. Terrestrial Ecosystems, edited by P. Shukla, J. Skea, E. Calvo
2167-8359. https://peerj.com/articles/5457 Buendia, V. Masson-Delmotte, H.-O. Pörtner, D. Roberts,
Hsiao, T. C., L. Heng, P. Steduto, B. Rojas-Lara, D. Raes, and E. Fereres. P. Zhai, et al., Chapter 5. Geneva: Intergovermental Panel on
May 2009. “AquaCrop-The FAO Crop Model to Simulate Yield Climate Change.
Response to Water: III. Parameterization and Testing for Maize.” Mingwei, Z., Z. Qingbo, C. Zhongxin, L. Jia, Z. Yong, and
Agronomy Journal 101 (3): 448–459. 00021962. http://doi.wiley. C. Chongfa. December 2008. “Crop Discrimination in
com/10.2134/agronj2008.0218s Northern China with Double Cropping Systems Using
Jägermeyr, J., C. Müller, A. C. Ruane, J. Elliott, J. Balkovic, Fourier Analysis of time-series MODIS Data.” International
O. Castillo, B. Faye, et al. November 2021. “Climate Impacts Journal of Applied Earth Observation and Geoinformation
on Global Agriculture Emerge Earlier in New Generation of 10 (4): 476–485. 03032434. https://linkinghub.elsevier.com/
Climate and Crop Models.” Nature Food 2 (11): 873–885. 2662- retrieve/pii/S0303243407000736
1355. https://www.nature.com/articles/s43016-021-00400-y Mishra, B., L. Busetto, M. Boschetti, A. Laborte, and A. Nelson.
James, G., D. Witten, T. Hastie, and R. Tibshirani. 2013. An December 2021. “RICA: A Rice Crop Calendar for Asia Based
Introduction to Statistical Learning, Volume 103 of Springer on MODIS Multi Year Data.” International Journal of Applied
Texts in Statistics. New York, New York, NY: Springer. Earth Observation and Geoinformation 103: 102471. 03032434.
doi:10.1007/978-1-4614-7138-7. https://linkinghub.elsevier.com/retrieve/pii/
Javed, M. A., S. Rashid Ahmad, W. K. Awan, and B. A. Munir. S0303243421001781
March 2020. “Estimation of Crop Water Deficit in Lower Bari Monitor, C. “Crop Monitor Baseline Data.” 2021b.
Doab, Pakistan Using Reflection-Based Crop Coefficient.” Muñoz-Sabater, J., E. Dutra, A. Agustí-Panareda, C. Albergel,
ISPRS International Journal of Geo-Information 9 (3): 173. G. Arduini, G. Balsamo, S. Boussetta, et al. “ERA5-Land: A
2220-9964. https://www.mdpi.com/2220-9964/9/3/173 state-of-the-art Global Reanalysis Dataset for Land
Jeong, S. J., C. H. HO, H. J. Gim, and M. E. Brown. 2011. Applications.” preprint, Data, Algorithms, and Models, March
“Phenology Shifts at Start Vs. End of Growing Season in 2021. https://essd.copernicus.org/preprints/essd-2021-82/
Temperate Vegetation over the Northern Hemisphere for Nass, U. 2010. Field Crops: Usual Planting and Harvesting Dates.
the Period 1982–2008.” Global Change Biology 17 (7): USDA National Agricultural Statistics Service, Agriculural
2385–2399. doi:10.1111/j.1365-2486.2011.02397.x. Handbook, p. 628.
Jönsson, P., and L. Eklundh. October 2004. “TIMESAT—a Orynbaikyzy, A., U. Gessner, B. Mack, and C. Conrad. August
Program for Analyzing time-series of Satellite Sensor Data.” 2020. “Crop Type Classification Using Fusion of Sentinel-1
Computers & Geosciences 30 (8): 833–845. 00983004. https:// and Sentinel-2 Data: Assessing the Impact of Feature
linkinghub.elsevier.com/retrieve/pii/S0098300404000974 Selection, Optical Data Availability, and Parcel Sizes on the
Kogan, F., N. Kussul, T. Adamenko, S. Skakun, O. Kravchenko, Accuracies.” Remote Sensing 12 (17): 2779. 2072-4292.
O. Kryvobok, A. Shelestov, A. Kolotii, O. Kussul, and https://www.mdpi.com/2072-4292/12/17/2779
A. Lavrenyuk. August 2013. “Winter Wheat Yield Parihar, J. S., C. Justice, J. Soares, O. Leo, P. Kosuth, I. Jarvis,
Forecasting in Ukraine Based on Earth Observation, D. Williams, W. Bingfang, J. Latham, and I. Becker-Reshef
Meteorological Data and Biophysical Models.” International “GEO-GLAM: A GEOSS-G20 Initiative on Global Agricultural
Journal of Applied Earth Observation and Geoinformation 23: Monitoring.” In Proceedings of the 39th COSPAR Scientific
192–203. 03032434. https://linkinghub.elsevier.com/ Assembly, Mysore, India, 14–22 July 2012.
retrieve/pii/S0303243413000123 Patel, J. H., and M. P. Oza. November 2014. “Deriving Crop
Kyu Lee, S., and T. An Dang. June 2020. “Crop Calendar Shift as Calendar Using NDVI time-series.” The International
a Climate Change Adaptation Solution for Cassava Cultivation Archives of the Photogrammetry, Remote Sensing and
Area of Binh Thuan Province, Vietnam.” Pakistan Journal of Spatial Information Sciences XL-8: 869–873. 2194-9034.
Biological Sciences 23 (7): 946–952. 10288880. https://www. https://www.int-arch-photogramm-remote-sens-spatial-inf-
scialert.net/abstract/?doi=pjbs.2020.946.952 sci.net/XL-8/869/2014/
Lin, J., A. Zhang, W. Chen, and M. Lin. August 2018. “Estimates Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel,
of Daily PM2.5 Exposure in Beijing Using Spatio-Temporal B. Thirion, O. Grisel, M. Blondel, et al. 2011. “Scikit-learn:
Kriging Model.” Sustainability 10 (8): 2772. 2071-1050. http:// Machine Learning in Python.” The Journal of machine
www.mdpi.com/2071-1050/10/8/2772 Learning research 12: 2825–2830.
GISCIENCE & REMOTE SENSING 909

Pérez-Hoyos, A., F. Rembold, H. Kerdiles, and J. Gallego. Ecological Modelling 209 (2–4): 203–219. 03043800.
November 2017. “Comparison of Global Land Cover Datasets https://linkinghub.elsevier.com/retrieve/pii/
for Cropland Monitoring.” Remote Sensing 9 (11): 1118. 2072- S0304380007003365
4292. http://www.mdpi.com/2072-4292/9/11/1118 Teluguntla, P., P. S. Thenkabail, A. Oliphant, J. Xiong,
Piao, S., J. Tan, A. Chen, Y. H. Fu, P. Ciais, Q. Liu, . . . J. Penuelas. M. K. Gumma, R. G. Congalton, K. Yadav, and A. Huete.
2015. “Leaf Onset in the Northern Hemisphere Triggered by October 2018. “A 30-m Landsat-derived Cropland Extent
Daytime Temperature.” Nature Communications 6 (1): 1–8. Product of Australia and China Using Random Forest
doi:10.1038/ncomms7911. Machine Learning Algorithm on Google Earth Engine Cloud
Portmann, F. T., S. Siebert, and P. D.ll. 2010. “MIRCA2000 - Computing Platform.” ISPRS Journal of Photogrammetry and
Global Monthly Irrigated and Rainfed Crop Areas around Remote Sensing 144: 325–340. 09242716. https://linkinghub.
the Year 2000: A New high-resolution Data Set for elsevier.com/retrieve/pii/S0924271618302090
Agricultural and Hydrological Modeling.” Global Thapa, S., A. Ghimire, J. Adhikari, A. Thapa, and B. Thapa. 2020.
Biogeochem. Cycles 24: 1–24. doi:10.1029/2008GB003435. “Impacts of Sowing and Climatic Conditions on Wheat Yield
Rembold, F., M. Meroni, F. Urbano, G. Csak, H. Kerdiles, in Nepal.” Malaysian Journal of Halal Research 3 (1): 38–40.
A. Perez-Hoyos, G. Lemoine, O. Leo, and T. Negre. doi:10.2478/mjhr-2020-0006.
January 2019. “ASAP: A New Global Early Warning USDA. “Major World Crop Areas and Climatic Profiles.” United
System to Detect Anomaly Hot Spots of Agricultural States Department of Agriculture, 2006. https://naldc.nal.
Production for Food Security Analysis.” Agricultural usda.gov/download/CAT88895275/PDF
Systems 168: 247–257. 0308521X. https://linkinghub.else van den Hoogen, J., S. Geisen, D. Routh, H. Ferris, W. Traunspurger,
vier.com/retrieve/pii/S0308521X17309095 D. A. Wardle, R. G. M. de Goede, et al. August 2019. “Soil
Ren, J., J. B. Campbell, and Y. Shao. July 2017. “Estimation of Nematode Abundance and Functional Group Composition at
SOS and EOS for Midwestern US Corn and Soybean Crops.” a Global Scale.” Nature 572 (7768): 194–198. 0028-0836, 1476-
Remote Sesing 9 (7): 722. 2072-4292. https://www.mdpi. 4687. http://www.nature.com/articles/s41586-019-1418-6
com/2072-4292/9/7/722 Van Hoolst, R., H. Eerens, D. Haesen, A. Royer, L. Bydekerke, O.
Sacks, W. J., D. Deryng, J. A. Foley, and N. Ramankutty. June Rojas, L. Yanyun, and P. Racionzer. 2016. “Fao’s AVHRR-Based
2010. “Crop Planting Dates: An Analysis of Global Patterns: Agricultural Stress Index System (ASIS) for Global Drought
Global Crop Planting Dates.” Global Ecology and Monitoring.” International Journal of Remote Sensing 37 (2):
Biogeography pages no–no. 1466822X, 14668238. http:// 418–439. doi:10.1080/01431161.2015.1126378.
doi.wiley.com/10.1111/j.1466-8238.2010.00551.x Wang, S., S. Di Tommaso, J. M. Deines, and D. B. Lobell.
Sawano, S., T. Hasegawa, S. Goto, P. Konghakote, A. Polthanee, December 2020. “Mapping Twenty Years of Corn and
Y. Ishigooka, T. Kuwagata, and H. Toritani. March 2008. Soybean across the US Midwest Using the Landsat
“Modeling the Dependence of the Crop Calendar for Archive.” Scientific Data 7 (1): 307. 2052-4463. http://www.
rain-fed Rice on Precipitation in Northeast Thailand.” Paddy nature.com/articles/s41597-020-00646-4
and Water Environment 6 (1): 83–90. 1611-2490, 1611-2504. Whitcraft, A. K., I. Becker-Reshef, and C. O. Justice. 2015a.
http://link.springer.com/10.1007/s10333-007-0102-x “A Framework for Defining Spatially Explicit Earth
Schratz, P., J. Muenchow, E. Iturritxa, J. Richter, and A. Brenning. Observation Requirements for A Global Agricultural
August 2019. “Hyperparameter Tuning and Performance Monitoring Initiative (GEOGLAM).” Remote Sensing 7 (2):
Assessment of Statistical and machine-learning Algorithms 1461–1481. doi:10.3390/rs70201461.
Using Spatial Data.” Ecological Modelling 406: 109–120. Whitcraft, A. K., I. Becker-Reshef, and C. O. Justice. 2015b.
03043800. https://linkinghub.elsevier.com/retrieve/pii/ “Agricultural Growing Season Calendars Derived from
S0304380019302145 MODIS Surface Reflectance.” International Journal of Digital
Shukla, K., P. Kumar, G. S. Mann, and M. Khare. 2020. “Mapping Earth 8 (3): 173–197. 1753-8947, 1753-8955. http://www.
Spatial Distribution of Particulate Matter Using Kriging and tandfonline.com/doi/abs/10.1080/17538947.2014.894147
Inverse Distance Weighting at Supersites of Megacity Delhi.” Whitcraft, A. K., I. Becker-Reshef, C. O. Justice, L. Gifford,
Sustainable Cities and Society 54 (101997): 101997. A. Kavvada, and I. Jarvis. December 2019. “No Pixel Left
doi:10.1016/j.scs.2019.101997. Behind: Toward Integrating Earth Observations for
Skakun, S., B. Franch, E. Vermote, J.-C. Roger, I. Becker- Agriculture into the United Nations Sustainable
Reshef, C. Justice, and N. Kussul. June 2017. “Early Development Goals Framework.” Remote Sensing of
Season large-area Winter Crop Mapping Using MODIS Environment 235: 111470. 00344257. https://linkinghub.else
NDVI Data, Growing Degree Days Information and vier.com/retrieve/pii/S0034425719304894
a Gaussian Mixture Model.” Remote Sensing of Xu, H., M. J. Bechle, M. Wang, A. A. Szpiro, S. Vedal, Y. Bai, and
Environment 195: 244–258. 00344257. https://linkin J. D. Marshall. March 2019. “National PM2.5 and NO2
ghub.elsevier.com/retrieve/pii/S0034425717301888 Exposure Models for China Based on Land Use Regression,
Stehfest, E., M. Heistermann, J. A. Priess, D. S. Ojima, and Satellite Measurements, and Universal Kriging.” Science of the
J. Alcamo. December 2007. “Simulation of Global Crop Total Environment 655: 423–433. 00489697. https://linkin
Production with the Ecosystem Model DayCent.” ghub.elsevier.com/retrieve/pii/S0048969718344802
910 B. FRANCH ET AL.

Yang, H., S. Ranjitkar, W. Xu, L. Han, J. Yang, L. Wu, and J. Xu. Zambrano Nájera, J. D. C., and O. Ortega. January 2021. “Effect
August 2021. “Crop-climate Model in Support of Adjusting of Climate Change on Burley Tobacco Crop Calendars.”
Local Ecological Calendar in the Taxkorgan, Eastern Pamir Revista Facultad Nacional de Agronomía Medellín 74 (1):
Plateau.” Climatic Change 167 (3–4): 56. 0165-0009, 1573- 2248-7026, 0304-2847. https://revistas.unal.edu.co/index.
1480. https://link.springer.com/10.1007/s10584-021-03204-y php/refame/article/view/88867

Appendix A. Model Features


A.1 Wheat

Figure A1: Top 15 most important features of the random forest regressor for the sine component of the wheat’s SOS.

A.2 Maize

Figure A2: Top 15 most important features of the random forest regressor for the cosine component of the wheat’s SOS.
GISCIENCE & REMOTE SENSING 911

Figure A3: Top 15 most important features of the random forest regressor for the sine component of the wheat’s EOS.

Figure A4: Top 15 most important features of the random forest regressor for the cosine component of the wheat’s EOS.
912 B. FRANCH ET AL.

Figure A5: Top 15 most important features of the random forest regressor for the sine component of the maize’s SOS.

Figure A6: Top 15 most important features of the random forest regressor for the cosine component of the maize’s SOS.
GISCIENCE & REMOTE SENSING 913

Figure A7: Top 15 most important features of the random forest regressor for the sine component of the maize’s EOS.

Figure A8: Top 15 most important features of the random forest regressor for the cosine component of the maize’s EOS.

You might also like