Article

Geoclimatic Distribution of Satellite-Observed Salinity Bias Classified by Machine Learning Approach

1
Guangdong Key Laboratory of Ocean Remote Sensing, State Key Laboratory of Tropical Oceanography, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
2
College of Marine Science, University of Chinese Academy of Sciences, Qingdao 266400, China
3
CSIRO Environment, Crawley, WA 6009, Australia
4
Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou 511458, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(16), 3084; https://doi.org/10.3390/rs16163084
Submission received: 8 July 2024 / Revised: 2 August 2024 / Accepted: 20 August 2024 / Published: 21 August 2024
(This article belongs to the Section Ocean Remote Sensing)
Figure 1. Flowchart of the data selection, assembly processing, and production of the final classification. The 15 maps are the geographical distribution of the 15 classes, and the coloured dots in the maps represent the ΔS value of the sample.
Figure 2. Ensemble mean (blue line) and spread (grey shading) of the BIC score for increasing the number of GMM classes. The black bars are the standard deviation of the ensemble mean. The BIC scores are computed for 50 random sample groups, each consisting of 90% of the total profiles.
Figure 3. Visualisation of the classification results. For each class, the mean value of SST, rain rate, and wind speed is plotted as a 3D coordinate. (a) Mean values of each class; the size of the marker represents the sample size of the class, and the colour of the marker represents the mean ΔS of the class. To better illustrate the spread of the classes without hiding the small classes, we subdivided the classes into 3 subplots according to different temperature ranges. (b) Classes with mean SST below 10 °C, corresponding to the triangle markers in (a); (c) between 10–20 °C, corresponding to the square markers in (a); (d) above 20 °C, corresponding to the round markers in (a). The x-axis is SST, the y-axis is wind speed, and the z-axis is rain rate. Rain rate is plotted in log scale for ease of visualisation in (b–d). The details of each class are given in Table 1.
Figure 4. Classes with mean SST higher than 25 °C. (a,b,e–g) Scatterplot maps of ΔS (unit: PSU) in the classes K11, K13, K3, K15, and K8, respectively. The dotted area in (a) is where the number of members exceeds 200 in a 5° × 5° grid cell and the samples exceed 12. Regions where samples are insufficient for identifying the predominant season are discarded. (c,d,h–j) Prevailing season of the observations in the same classes above. Colours represent over 50% of the observations in the area being taken in the same season: blue is December to February of the following year, green is March to May, red is June to August, orange is September to November, and grey means there is no prevailing season in the area.
Figure 5. Classes with mean SST between 10 °C and 25 °C. (a,b,e–g) Scatterplot maps of ΔS in classes K1, K6, K9, K14, and K7, respectively. (c,d,h–j) Prevailing season of the observations. The legend is the same as Figure 4.
Figure 6. Scatterplot of all SMAP SSS bias observations over a PSU (x-axis) and latitude (y-axis) plot. The coloured shading represents the observation count in a 0.02 PSU and 0.5° grid size. The overlaid dashed lines are the mean rain rate (black) and the mean SSS (red), respectively, along the latitude. The mean rain rate and SSS values are in the top x-axis.
Figure 7. Classes with mean SST lower than 10 °C. (a,b) Scatterplot maps of ΔS in classes K2 and K10, respectively. (c,d) Prevailing season of the observations in classes K2 and K10, respectively. The legend is the same as Figure 4.
Figure 8. The distribution of members in K12 and its relationship with sea ice concentration. (a) Scatterplot map of K12, where the colour represents ΔS. (b) Prevailing season of the observations. (c) Scatter plot of observations with sea ice presence within 50 km, with the colour representing the percentage of ice concentration. (d) Observations and mean ΔS concerning sea ice concentration. (e) Scatterplot within the classification parameter space, with the x-, y-, and z-axes representing SST, wind speed, and rain rate, respectively, and the colour of the marker representing ΔS.
Figure 9. The distribution of members in K4 and its relationship with precipitation. (a) Scatterplot map of K4. (b) Prevailing season of the observations. (c) Annual mean precipitation. (d) Relations between ΔS and rain rate; the colour is the member count in the corresponding ΔS and rain rate. (e) Scatterplot for classification parameters, same as in Figure 8e. The observation count in (d) is calculated with the bin size of 0.1 PSU along the x-axis and 2.5 mm/day along the y-axis.
Figure 10. The distribution of members in K5 and its relationship with sea surface current. (a) Scatter plot of K5. (b) Prevailing season of the observations. (c) Annual mean Eddy Kinetic Energy (EKE) of surface current (shading) overlaps with the mean velocity of sea surface current (contour, unit: m/s). (d) Snapshot of SMAP SSS and ocean surface current. The colour shading is SSS, the quiver is current, and the red pentagram marker is Argo observation.

Abstract

Sea surface salinity (SSS) observed by satellite has been widely used since the successful launch of the first salinity satellite in 2009. However, compared with other oceanographic satellite products (e.g., sea surface temperature, SST) that became operational in the 1980s, the SSS product is less mature and lacks effective validation from the user end. We employed an unsupervised machine learning approach to classify the Level 3 SSS bias from the Soil Moisture Active Passive (SMAP) satellite and its observing environment. The classification model divides the samples into fifteen classes based on four variables: satellite SSS bias, SST, rain rate, and wind speed. SST is one of the most significant factors influencing the classification. In regions with cold SST, satellite SSS has an accuracy poorer than 0.2 PSU (Practical Salinity Unit), mainly due to the higher uncertainty in the cold environment. A small number of observations near the seawater freezing point show a significant fresh bias caused by sea ice. A systematic bias of the SMAP SSS product is found in the mid-latitudes: positive bias tends to occur north (south) of 45°N(S) and negative bias is more common in the 25°N(S)–45°N(S) bands, likely associated with the SMAP calibration scheme. A significant bias also occurs in regions with strong ocean currents and eddy activity, likely due to spatial mismatch in the highly dynamic background. Notably, satellite SSS and in situ data correlations remain good in similar environments with weaker ocean dynamic activity, implying that satellite salinity data are reliable in dynamically active regions for capturing high-resolution details. The features of the SMAP SSS shown in this work call for careful consideration by the data user community when interpreting biased values.

1. Introduction

In recent years, satellite-observed sea surface salinity (SSS) data have been widely used to study variations in ocean salinity. Currently, three satellite missions provide SSS data products: Soil Moisture and Ocean Salinity (SMOS), operational since 2010; Aquarius/Satélite de Aplicaciones Científicas (SAC)-D, operated from 2011 to 2015; and Soil Moisture Active Passive (SMAP), operational since 2015. All three satellites follow a sun-synchronous orbit with a repeat cycle of 8 days, gaining near-global coverage and a weekly temporal resolution that could not be achieved before [1,2,3]. The ongoing satellite salinity missions, SMOS and SMAP, provide Level 3 SSS data mapped and averaged over the orbit revisit period with a spatial resolution of 0.25° × 0.25° and an 8-day temporal resolution. These high-resolution SSS observations are valuable because satellites provide real-time measurements of the global ocean surface, while other high-resolution datasets, such as reanalysis, are based on mathematical simulations and limited by model accuracy.
SSS remote sensing uses an L-band radiometer to measure the sea surface’s brightness temperature (TB), and then computes the salinity value via a relation between seawater emissivity (which depends on the physicochemical properties of seawater, including salinity) and TB [4]. Among other processes, the L-band radiometer measurement is also influenced by sea surface temperature (SST), sea surface roughness, solar and sky emissions, and radio frequency interference (RFI) from anthropogenic activities [5]. Many studies have refined the models and algorithms used in the retrieval process of satellite SSS, aiming to better fit the TB–SSS relationship [6,7,8,9]. Data quality assessments have shown that Level 3 satellite SSS datasets from SMOS, SMAP, and Aquarius have achieved an accuracy of about 0.2 PSU and correlate well with in situ observations [10]. However, it is known that the commonly used Level 3 satellite SSS datasets have poorer quality at high latitudes, characterised by cold SST and high wind speed. The bias between satellite and in situ measurements is also more pronounced in regions with high precipitation, such as the Inter-tropical Convergence Zone (ITCZ) and the South Pacific Convergence Zone (SPCZ) [11]. The influence of the ocean surface environment mainly comes from three aspects. First, radiometric sensitivity to salinity is lower in cold SST regions, causing larger deviations at high latitudes [12,13,14]. Second, high wind speed increases sea surface roughness, reducing the accuracy of satellite SSS measurement [15]. Third, fresh water from precipitation can make the ocean skin layer (the top 5 centimetres, which are detected by microwave remote sensing) fresher, leading to a fresh bias compared to in situ observations, which reflect a depth of up to 5 m below the ocean surface [16].
While salinity is an oceanic parameter as important as temperature in the research on ocean circulations and climate changes [17,18,19], the resolution and accuracy of its measurement lag behind. While efforts have been made to improve the retrieval processes, here we focus on improvements at the level of data post-processing.
Traditional in situ SSS data include along-track data collected from ships, moored buoys, and the Array for Real-Time Geostrophic Oceanography (Argo) floats, which began operating in the 21st century. Currently, Argo is the most widely distributed in situ observation system in the global ocean, maintaining approximately 4000 active floats [20]. Two major processes are responsible for the variations in SSS: dynamical processes, such as horizontal advection and vertical entrainment, and the thermodynamical process of freshwater flux, including precipitation, evaporation, and river runoff [21,22,23]. Through these two forcing mechanisms, SSS can act as a proxy for upper ocean circulation and the global hydrological cycle. While SSS is affected by climate factors, salinity remote sensing instruments are also influenced by environments with climatic variabilities. Therefore, analysing the seasonality and geographical distribution of satellite data bias is crucial for assessing the reliability of satellite-derived data.
Machine learning methods have become efficient tools for ocean data processing as the volume of samples has accumulated over the past decades. In satellite salinity research, several machine learning techniques have proven robust in enhancing data quality, particularly in coastal regions [24,25,26]. However, the application of machine learning as a bias classifier has not been common in previous studies. We looked for a statistical method that could account for multiple factors to address the connections between the observation environments and the SSS bias between satellite and in situ measurements. In particular, we focus on how environmental variables affect the bias. An unsupervised classification method, the Gaussian Mixture Model (GMM), is used in this work to detect relationships between the SSS deviations between satellite and in situ observations and the corresponding environmental parameters. In the literature, GMM is also interpreted as a Profile Classification Method (PCM). Several studies have utilised the PCM/GMM in classifying ocean temperature and salinity profiles to extract the information from the datasets, demonstrating the remarkable abilities of the model in identifying ocean dynamic structures (e.g., fronts and gyres), classifying ocean thermodynamic patterns (e.g., mode water), and tracking the formation and mixing of the water masses [27,28,29]. In this work, the unsupervised machine learning method is used to detect classes of observation that differ in the relation between SSS bias and environmental variables. The classification model shows good skill in capturing the dynamical and thermal processes related to the satellite bias. Notably, it picks out classes under the influence of sea ice and strong ocean currents, showing that these observations significantly differ from others.
The goal of this work is to provide a better understanding of satellite SSS data and its applications across conditions that are difficult to identify with traditional methods.
The article is organised as follows: Data and Methods are presented in Section 2, Section 3 interprets the classification results and gives the details of the geoclimatic distribution of the classes, and Section 4 is the Conclusion and Discussion.

2. Data and Methods

2.1. Data

Salinity data from the SMAP satellite are obtained from the Jet Propulsion Laboratory (JPL) Physical Oceanography Distributed Active Archive Center (PODAAC). The SMAP SSSs used in this study are Level 3 data, which are standard mapped and have an 8-day running mean. For validation purposes, Argo floats provide in situ reference to the satellite data. Observations of salinity from Argo at depths shallower than 5 m (denoted as Argo SSS) are used for comparison with the nearest satellite grid point. This comparison between the Argo float and the satellite matchup point is constrained by a maximum distance of 19.5 km (half the distance from a grid centre to the nearest grid point) and a maximum time lag of 24 h. The satellite SSS bias (ΔS) is the difference between satellite SSS and Argo SSS.
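The matchup rule described above can be illustrated with a minimal sketch. It assumes each observation is a plain dictionary with hypothetical keys `lat`, `lon`, `time`, and `sss`, that the satellite grid point carries a single reference time, and that ΔS is taken as satellite minus Argo (consistent with a positive value meaning a saline bias). None of these names come from the original processing chain.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0
MAX_DISTANCE_KM = 19.5              # matchup radius quoted in the text
MAX_TIME_LAG = timedelta(hours=24)  # maximum time lag quoted in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def salinity_bias(argo, sat):
    """Return delta_S = satellite SSS - Argo SSS, or None if not a valid matchup.

    `argo` and `sat` are dicts with the hypothetical keys
    'lat', 'lon', 'time' (datetime) and 'sss' (PSU).
    """
    distance = haversine_km(argo["lat"], argo["lon"], sat["lat"], sat["lon"])
    lag = abs(argo["time"] - sat["time"])
    if distance > MAX_DISTANCE_KM or lag > MAX_TIME_LAG:
        return None
    return sat["sss"] - argo["sss"]
```

In practice the nearest grid point would be looked up directly from the regular 0.25° grid; the explicit distance check here only makes the two thresholds visible.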
The environmental data used in this study include SST data from Optimum Interpolation 1/4 Degree Daily Sea Surface Temperature (OISST), precipitation data from Global Precipitation Measurement (GPM), and wind speed data from Cross-Calibrated Multi-Platform Wind Vector Analysis Product (CCMP). For every match between satellite SSS and Argo SSS, the values of ΔS, SST, rain rate, and wind speed of the corresponding 0.25° × 0.25° map grid point and date are combined in a profile. The set of all profiles constitutes the members of the classification shown in Figure 1. The classification input does not include temporal and geographical attributes of observations as the inherent seasonal and annual cycles vary between regions. In the second stage of the analysis, the time–space information will be reintroduced to illustrate the geoclimatic features of each class.
The objective of this study is to classify the observations without cross-validating the classification model. Consequently, all available data are included in the model training. The trained model then classifies the input data into various classes. All input variables are equally weighted and normalised to ensure consistency among the dimensions.
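The text does not specify how the variables are normalised. As one plausible reading, the sketch below standardises each of the four columns (ΔS, SST, rain rate, wind speed) to zero mean and unit variance so that no dimension dominates the classification; the z-score choice and the function name `standardise` are assumptions.

```python
import numpy as np


def standardise(profiles):
    """Scale each column to zero mean and unit variance (assumed z-score scheme).

    `profiles` has shape (n_profiles, 4) with columns ordered as
    delta_S, SST, rain rate, wind speed.
    Returns the scaled array plus the column means and standard deviations,
    which are needed to map class centres back to physical units.
    """
    profiles = np.asarray(profiles, dtype=float)
    mean = profiles.mean(axis=0)
    std = profiles.std(axis=0)
    return (profiles - mean) / std, mean, std
```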

2.2. Unsupervised Machine Learning Classification

The Gaussian Mixture Model (GMM) is a probabilistic model that assumes data samples are derived from a combination of distinct Gaussian distributions. In Equation (1), $\mathcal{N}(x; \mu, \Sigma)$ is the multidimensional normal probability density function (PDF), with $x$ being a profile (a data sample of shape [1, $D$], where $D$ is the number of dimensions, as described in Section 2.1), and $\mu$ and $\Sigma$ being the mean values and covariance of the PDF, respectively [28]. Equation (2) describes the GMM as the weighted sum $p(x)$ of $K$ distributions (referred to as classes), with weights $\lambda_k$.
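$$\mathcal{N}(x; \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^{D} |\Sigma|}} \exp\left(-\frac{1}{2} (x - \mu)^{\top} \Sigma^{-1} (x - \mu)\right) \tag{1}$$
$$p(x) = \sum_{k=1}^{K} \lambda_k \, \mathcal{N}(x; \mu_k, \Sigma_k) \tag{2}$$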
By using this model, we can find the various groups of environmental factors influencing the bias in satellite SSS.
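To make Equations (1) and (2) concrete, the following sketch evaluates the Gaussian density and the mixture density for a single profile with NumPy. The `posteriors` helper applies Bayes' rule to give the probability of each class for that profile, which is the usual basis for assigning a profile to a class; the function names are illustrative and the assignment rule is an assumption about common GMM practice, not a statement of the exact procedure used here.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Multivariate normal density N(x; mu, Sigma), as in Equation (1)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = x.shape[-1]
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    # Squared Mahalanobis distance (x - mu)^T Sigma^-1 (x - mu)
    maha = diff @ np.linalg.solve(sigma, diff)
    return np.exp(-0.5 * maha) / norm


def mixture_pdf(x, weights, means, covs):
    """Weighted sum p(x) of K Gaussian components, as in Equation (2)."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))


def posteriors(x, weights, means, covs):
    """Probability that profile x belongs to each of the K classes."""
    parts = np.array([w * gaussian_pdf(x, m, c)
                      for w, m, c in zip(weights, means, covs)])
    return parts / parts.sum()
```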
$$\mathrm{BIC}(K) = -2 L(K) + N_f(K) \log(n) \tag{3}$$
$$N_f(K) = K - 1 + KD + \frac{K D (D - 1)}{2} \tag{4}$$
GMM is an unsupervised machine learning model; as such, no labels are assigned to the dataset. However, we need to select the number of classes. The Bayesian information criterion (BIC) is a widely used method for determining the number of classes ($K$). The BIC score is calculated using Equation (3), where $L(K)$ is the log-likelihood of the fitted model, $N_f(K)$ is the number of independent parameters to be estimated, and $n$ is the number of profiles in the training set. A lower BIC score is expected for a well-fitted GMM, since the score accounts for both goodness of fit and parsimony (fewer classes). In this study, the BIC score does not reach a minimum as the number $K$ increases. However, the BIC score appears to flatten at $K$ = 15, at which point the spread of the score also decreases (Figure 2). Consequently, we chose to use 15 classes in our classification (Figure 1). In this work, the GMM is implemented with the scikit-learn package (version 1.4.1), an open source Python package [30].
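The BIC ensemble summarised in Figure 2 (50 random groups, each holding 90% of the profiles) could be reproduced along the following lines with scikit-learn's `GaussianMixture`. This is a sketch under assumptions: the full covariance type, the random seeds, and the function name `bic_ensemble` are illustrative, and scikit-learn's `bic` method uses its own count of free parameters, which is not guaranteed to match Equation (4).

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def bic_ensemble(profiles, k_values, n_groups=50, fraction=0.9, seed=0):
    """Return BIC scores with shape (n_groups, len(k_values)).

    Each row corresponds to one random subsample containing `fraction`
    of the profiles; each column to one candidate number of classes K.
    """
    profiles = np.asarray(profiles, dtype=float)
    rng = np.random.default_rng(seed)
    n = profiles.shape[0]
    size = int(fraction * n)
    scores = np.empty((n_groups, len(k_values)))
    for g in range(n_groups):
        subset = profiles[rng.choice(n, size=size, replace=False)]
        for j, k in enumerate(k_values):
            # covariance_type="full" is an assumption, not stated in the text
            gmm = GaussianMixture(n_components=k, covariance_type="full",
                                  random_state=g)
            gmm.fit(subset)
            scores[g, j] = gmm.bic(subset)
    return scores
```

The ensemble mean and spread per K are then `scores.mean(axis=0)` and `scores.std(axis=0)`; K would be chosen where the mean curve flattens and the spread shrinks.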

3. Geoclimatic Distribution of the Classes

3.1. Environmental Signatures of the Classification Result

Table 1 and Figure 3 display the 15 classes from the GMM classification. One of the most significant features of the classes is their strong relation with SST and wind speed: the warm classes with mean temperatures greater than 25 °C have low wind speeds, and the classes with mean temperatures less than 10 °C have high wind speeds. The SST–wind relationship is linked to global climate and atmospheric circulation patterns. In the middle-to-high latitudes, SST is cooler due to the lower solar incidence angle, and wind speed is higher due to the influence of the polar vortex and Westerlies. Earlier studies have found that this region is particularly challenging for satellite salinity measurements [11,13]. Observations are usually collected during periods of no rainfall because the daily rainfall distribution is scattered. As a result, over half of the classes consist of observations with a mean precipitation of less than 1 mm/day. However, since rainfall affects SSS directly, we include the rain rate in the classification to further explore its role in satellite SSS bias.
Classes characterised by both low ΔS mean and standard deviation (SD) exhibit warm SST, light rainfall, and low wind speed. Over half of these classes show a saline bias, especially in the warmest regions. The saline bias is particularly robust in the tropics for SMAP SSS, a phenomenon attributed to the bias of the reference field used for SMAP retrieval algorithms [31,32]. In most classes, the bias is small and meets the expected accuracy of the instrument, identifying areas where the satellite SSS more closely matches in situ data [1]. Our model also identifies several classes that exhibit significant bias. For example, class K12 has a high mean |ΔS|, while K10 has a high ΔS SD, although it also exhibits a mean |ΔS| close to 0 (0.05 PSU). Despite their small size, these classes capture important features of salinity remote sensing in special conditions, which we describe in detail in Section 3.3.

3.2. Similar Classes in the Different SST Range

Each class from the classification consists of members that share common environmental features, confined to a specific region. In the classes with similar mean values, the geographic and seasonal features are also alike. In the BIC-guided clustering presented above, we have selected 15 classes to ensure that the results are sensitive to extreme and rare conditions. The drawback of selecting so many classes is that some classes consisting of a small ΔS, and thus less informative observations, have been separated despite only minor differences. The model appears to be particularly sensitive to small differences in rain rate, given that precipitation values are close to 0, but extremes can be over 100 mm/day. As shown in Figure 3, the rain rates of the classes are well-differentiated only by using a log scale. However, the actual difference in rainfall in these classes is insignificant. To address this problem, we utilise the prior knowledge of salinity remote sensing principles and the global ocean salinity budgets to group similar classes into three SST ranges according to the level of dispersal of the class in all dimensions and their geographical distribution. The warm SST range includes five classes with a mean temperature greater than 25 °C, located in the tropics. The middle SST range includes four classes with a mean temperature between 10 °C and 20 °C, located at mid-latitudes. Finally, the cold SST range includes two classes with a temperature less than 10 °C, mostly located at latitudes above 40°.
The classes in the warm SST range are shown in Figure 4. They span all tropical and subtropical regions, with the class boundaries shifting towards higher latitudes during the summer in both hemispheres, following the seasonal evolution of SST. The largest class, K11, includes over 36% of the total observations. Areas with higher member density (over 200 members in a 5° × 5° grid cell) in K11 (shaded in black dots in Figure 4a) are concentrated in the subtropical open ocean, where satellite SSSs have minimum bias. In the tropics, because more measurements are taken under no-rain conditions and Argo floats are abundant, many samples are also categorised as small-bias observations. However, the tropics still have a higher mean bias due to rainfall. The other classes in the warm region (K13, K3, K15, and K8) are approximately half the size of K11. The spatial coverage of these classes decreases as the mean SST increases. K13, which has the closest mean SST to K11, has a slightly smaller spatial range than K11, and the smallest SD of ΔS of all classes. The mean SSTs of K3, K8, and K15 exceed 28 °C, above the threshold for tropical convection, and their SD of ΔS increases with mean SST. The warmest class, K8, is characterised by a mean precipitation of 5.96 mm/day, significantly higher than the other four classes in the warm regions. The area covered by K8 aligns with the location of the ITCZ and SPCZ, where heavier rainfall causes stronger surface stratification that amplifies the difference between the ocean skin layer sampled by satellite and the depth sampled by Argo observations (about 5 m deep). Therefore, despite the mean saline bias in the tropics, the mean ΔS in K8 is lower than in the other four classes in the warm SST range.
This result is consistent with earlier studies suggesting that the tropical deep convection may generate a rain-induced fresh lens, leading to the freshening of satellite SSS compared with the in situ observations [16,33].
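The prevailing-season maps that accompany each class (as described in the caption of Figure 4) rest on a simple rule: within each 5° × 5° cell, a season prevails if it accounts for more than 50% of the observations. A minimal sketch of that rule follows. The input layout, the function name, and the `min_samples` cut-off for discarding sparsely sampled cells are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

# Month number -> season label (DJF spans December to the following February)
SEASONS = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
           6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}


def prevailing_season(observations, cell_deg=5.0, min_samples=12):
    """Map each grid cell to its prevailing season.

    `observations` is an iterable of (lat, lon, month) tuples for one class.
    Cells with fewer than `min_samples` observations (assumed threshold)
    are omitted; cells where no season exceeds 50% are labelled 'none'.
    Longitudes must use one consistent convention (e.g. -180 to 180).
    """
    cells = defaultdict(Counter)
    for lat, lon, month in observations:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key][SEASONS[month]] += 1

    result = {}
    for key, counts in cells.items():
        total = sum(counts.values())
        if total < min_samples:
            continue  # too few samples to identify a prevailing season
        season, n = counts.most_common(1)[0]
        result[key] = season if n > 0.5 * total else "none"
    return result
```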
In the middle SST range from 10 °C to 20 °C, classes K1, K6, K9, and K14 show greater bias than those in warmer temperatures, with especially greater saline bias north of 35°N, south of 40°S, and in the cold tongue of the eastern equatorial Pacific. These classes migrate southward (northward) in boreal winter (summer), as shown in Figure 5. The mean bias of the four classes is less than 0.1 PSU, but the mean biases in latitudes above 40° in both hemispheres are about 0.2 PSU. Across all these classes, positive bias dominates at higher latitudes, and negative bias dominates at lower latitudes. The dividing line between the positive and negative bias in the northern hemisphere shows a west-to-east tilt towards the equator. The warmer regions in the western boundary current and extensions are dominated by fresh bias. In the southern hemisphere, the dividing line approximates the Subantarctic Front. Class K7 is between the warm SST and middle SST ranges (Figure 5g,j), with a mean SST of 21.9 °C. It is in the subtropical oceans, characterised by lighter rainfall and milder winds, representing a transition from classes within the middle SST range to the warm SST range. Members in K7 mostly show a light fresh bias, covering the areas where SMAP SSSs have a negative mean bias.
Throughout the classification, a latitude-based bias pattern characterised by a saline bias in the low and high latitudes and a fresh bias in the middle latitudes becomes apparent (Figure 5a,b,e,f). The transition from saline to fresh bias occurs between 25° and 45° in both hemispheres. The mean fresh bias in the subtropics is −0.3 PSU in the northern hemisphere and −0.1 PSU in the southern hemisphere. The greater bias in the northern hemisphere is due to higher precipitation corresponding to the descending branch of the Hadley cell [34]. The saline bias in the temperate zone is 0.3 PSU in both hemispheres. The variation in bias with latitude mirrors the meridional SSS mean and rain rate, although the peak latitudes of mean bias have shifted poleward (Figure 6). The spatial structure of this bias suggests that it relates to latitude, as supported by earlier satellite SSS evaluation studies [10]. Further analysis is needed to determine the causes of this bias and how to address it.
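A latitude dependence of this kind can be checked by averaging ΔS in narrow latitude bands. The sketch below bins observations into 0.5° bands (the latitude bin size quoted for Figure 6) and returns the mean bias and sample count per band; the function name and the NaN convention for empty bands are assumptions.

```python
import numpy as np


def zonal_mean_bias(lat, delta_s, bin_deg=0.5):
    """Mean delta_S per latitude band.

    Returns band centres, mean bias per band (NaN where a band is empty),
    and the number of observations per band.
    """
    lat = np.asarray(lat, dtype=float)
    delta_s = np.asarray(delta_s, dtype=float)
    edges = np.arange(-90.0, 90.0 + bin_deg, bin_deg)
    n_bins = len(edges) - 1
    # Band index of each observation; lat == 90 is folded into the last band
    idx = np.clip(np.digitize(lat, edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=delta_s, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts
    centres = edges[:-1] + bin_deg / 2
    return centres, means, counts
```

Plotting `means` against `centres` for all matchups would show where the sign of the mean bias changes with latitude.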
The cold SST range, characterised by a mean temperature below 10 °C, includes two classes (K2 and K10) with similar locations (Figure 7), which exhibit a seasonal pattern that migrates towards lower latitudes in the winter of both hemispheres. The mean SSTs of K2 and K10 are 7.5 °C and 4.0 °C, respectively. The warmer class has larger coverage, while the colder class is closer to the polar region and only appears in winter in the North Pacific and Northwestern Atlantic. K2 has over 1.5 times as many members as K10 and half the SD of K10, meaning that there are fewer observations in the colder scenario and that their biases spread further towards both positive and negative values. This further indicates that the low sensitivity of the salinity retrieval algorithm in cold SST is the main cause of the large salinity bias.

3.3. Classifying the Outliers

The classification has revealed unexpected results in three small classes: K12, K4, and K5. Although these outlier classes only include a small percentage of the total observations, they demonstrate the potential of SSS remote sensing in unconventional observation environments. Since these data satisfy the quality control criterion set by the distribution organisation, they are retained during the retrieval process. By investigating the characteristics of these classes, we try to elucidate the underlying physical processes.
K12 shows the most significant fresh bias among all classes. Unlike the previous understanding of the common case in freshwater flux-induced satellite fresh bias, this class is not accompanied by heavy rainfall. This class has the lowest SST found near the polar regions, including the North Atlantic and Southern oceans (Figure 8a,b). The mean SST of K12 is 1.1 °C, close to the freezing point of seawater (the freezing point of seawater with the salinity of about 32 PSU is −1.8 °C, which is the common case in the polar regions), indicating that this class is influenced by the freshwater from sea ice melting rather than precipitation. By comparing with the sea ice concentration data, we find that over 2/3 of the samples in K12 are within 50 km of an observation grid point where sea ice is found, and the bias of these observations is mostly negative. The range of the samples in K12 advances towards lower latitudes in the winter of both hemispheres as the coverage of sea ice expands, and retreats to higher latitudes in the summer. However, several observations with positive biases, representing the tail of the PDF, have been categorised in this class. Although they share similar environments, they are not associated with ice, as in 150°W of the Southern Ocean (Figure 8c). The freshening intensifies with increasing ice concentration. Most of the samples in this class are taken in conditions of no or light rainfall. The samples with higher rain rates show a smaller bias, suggesting that the significant fresh bias is related to higher concentrations of sea ice rather than precipitation (Figure 8e). However, satellite salinity products discard observations taken over ice and land, resulting in limited data from locations with higher ice concentrations. Hence, the sample size is inadequate in high-ice-concentration conditions. 
Note that the satellite SSS measured in these conditions is still significantly lower than the Argo SSS, so despite the overall freshening observed by satellite in the areas with sea ice, it is difficult to confirm the salinity value in the skin layer because of the large error associated with low-temperature conditions.
It is worth noting that although satellite-observed freshening has been linked to rain-induced freshwater [33,35,36], the satellite SSS observations taking place in heavy rainfall are sparse and have little influence on the Level 3 SSS data that have been applied with an 8-day running mean, so the Level 3 data from a single satellite cannot capture SSS variations with frequencies higher than 8 days. Previous studies have investigated the lifespan of rain-induced fresh lenses on the ocean surface, revealing durations of only tens of hours, significantly shorter than the temporal scale of Level 3 satellite salinity data [37,38]. In this classification, only a small subset, class K4, exhibits a slight mean freshening under an extremely high rain rate. The locations of this class are dispersed, occurring with heavy rainfall across the ITCZ, SPCZ, and Westerlies, showing the sparsity features of heavy rainfall (Figure 9). Regions with sufficient data to identify the seasonal pattern showed heavy-rainfall-related observations in the rain belt of the north Pacific westerly in boreal autumn, and the SPCZ moving eastward in boreal winter. The ΔS in this class leans more towards negative values, and the members are concentrated where the rain rate is about 20 mm/day (Figure 9e). However, not all members in this class show a negative bias, emphasising that rain-induced skin layer freshening is challenging to detect in Level 3 SSSs. Level 2 satellite salinity data should be used when investigating rain-related SSS variations that have not been time-smoothed [39]. The SD of ΔS is higher than that of other classes with similar temperatures (K7), suggesting that heavy rainfall has reduced measurement precision. The higher sea surface roughness caused by raindrops is the most probable cause of satellite bias in heavy rainfall, leading to greater observational uncertainty. 
Besides the extreme rainfall cases in K4, the classification also shows classes with high precipitation, such as K8 in the warmest ocean surface and K6 in the mid-latitude frontal zones. The coverage of K4 combines that of K6 and K8, but K4 has only about one-quarter to one-third as many members as K6 or K8 (Table 1) due to its extreme nature.
The spatial pattern of K5 mirrors the regions of strong ocean surface currents. This class may be delineated by the thermohaline properties and winds associated with strong wind-driven circulation, although no ocean current information was included in the classification (Figure 10). In regions of the western boundary current extensions and the Antarctic Circumpolar Current (ACC), satellite observations show mostly fresher salinity. In contrast, satellite salinity shows a more positive bias in tropical regions, such as the eastern equatorial Pacific. The Gulf Stream extension shows a mix of positive and negative biases, unlike the Kuroshio extension at the same latitude. In addition to the intensified wind-driven circulation regions, observations within the Mediterranean Sea and coastal regions of the North Indian Ocean also fall into this class, characterised by the presence of active eddies [40,41]. In regions with strong ocean dynamical processes, the discrepancy between satellite observations and Argo float data is larger than in calmer regions. The characteristics of this class deserve further investigation. Note that the existing in situ observations are inadequate for data validation in these regions. For instance, the observations are dispersed along the current axis in the Kuroshio extension, where the horizontal salinity gradient is large (Figure 10d). The existing data quality assessment is therefore less accurate in regions with strong currents and salinity fronts.

4. Discussion

The classification shows that most satellite observations align well with Argo floats in regions with a temperate-to-warm SST, weak precipitation, and weak-to-moderate winds. Significant satellite SSS bias is found in the cold SST regions. This is primarily attributed to the low accuracy of L-band radiometer measurements at low temperatures [12], although some observations are also affected by sea ice melting. Rain-related satellite SSS bias may be influenced more by prolonged rain-induced stratification than by rainfall itself [16]. This classification makes the latitude-dependent bias in SMAP L3 SSS data easier to identify, especially in the classes located at mid-latitudes. Our analysis reveals that more negative biases are found between 25° and 45° latitude and more positive biases in the tropics and near the polar circles. This bias pattern is observed globally in the same latitude bands with clear dividing lines, suggesting that it is a systematic bias associated with the SMAP salinity calibration. Among the classes with large SSS bias, class K12 shows a significant fresh bias related to sea ice melting, and class K5 stands out in regions with strong ocean dynamical activity, where strong horizontal salinity fronts also exist. Meanwhile, class K4, characterised by extreme rainfall, exhibits a much smaller fresh bias, indicating that substantial freshwater input over a short period of time does not degrade satellite SSS quality as significantly as we had suspected.
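One simple way to examine a latitude-dependent bias of this kind is to average ΔS within absolute-latitude bands. The sketch below is an illustrative assumption rather than the procedure used in this study: the band edges follow the 25° and 45° limits mentioned above, while the function names and the six sample values are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict

def band_of(lat, edges=(25.0, 45.0)):
    """Assign a latitude (degrees, either hemisphere) to an absolute-latitude band."""
    a = abs(lat)
    if a < edges[0]:
        return "tropics"
    if a <= edges[1]:
        return "mid-latitude"
    return "high-latitude"

def mean_bias_by_band(samples):
    """Average dS (satellite minus Argo, PSU) over all samples in each band."""
    acc = defaultdict(lambda: [0.0, 0])
    for lat, ds in samples:
        entry = acc[band_of(lat)]
        entry[0] += ds
        entry[1] += 1
    return {band: total / count for band, (total, count) in acc.items()}

# Hypothetical (latitude, dS) pairs, made up for illustration only.
samples = [
    (-10.0, 0.08), (5.0, 0.06),      # tropics
    (30.0, -0.05), (-40.0, -0.03),   # mid-latitudes
    (60.0, 0.10), (-62.0, 0.04),     # high latitudes
]

means = mean_bias_by_band(samples)
for band, value in means.items():
    print(f"{band}: mean dS = {value:+.2f} PSU")
```

Applied to the matched satellite and Argo samples, a sign change in the band means at fixed latitudes in both hemispheres would be consistent with a calibration-related bias rather than a regional one.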
The limited availability of in situ observations hampers satellite salinity validation studies, especially in regions exhibiting significant SSS variations. The spatial and temporal mismatches between Argo floats and satellite observations are particularly pronounced in areas characterised by rapid advection and mixing, such as frontal zones with strong salinity gradients. Since the satellite SSS bias is small in the same range of SST, rain rate, and wind speed, we believe that the satellite observations in K5 reflect the actual SSS. Further field investigation, such as high-resolution observations onboard research vessels or buoys, is necessary to validate the satellite salinity in the dynamically active areas.

5. Conclusions

In this study, we categorised the satellite SSS data bias along with three environmental properties and analysed the geographical characteristics of different classes. Using an unsupervised machine learning method enabled us to examine the relationships of multiple variables more objectively. GMM revealed patterns that are difficult to detect via conventional statistical methods. The classification highlights the SMAP SSS bias influenced by geographically specific environments and seasonality. It also isolated outlier classes compared to the validation dataset, such as in regions that are significantly influenced by sea ice melting, extreme rainfall, and active sea surface currents/fronts. These findings will inform the development of the next-generation satellite salinity instruments and retrieval algorithms by identifying areas of underperformance in the current SSS dataset.
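For readers who wish to experiment with this type of classification, the following is a minimal sketch of fitting a GMM and choosing the number of classes by the BIC score with scikit-learn [30]. It is not the configuration used in this study: the data are synthetic stand-ins for the four variables (ΔS, SST, rain rate, wind speed), and the standardisation step, the full covariance type, the candidate range of 1 to 6 classes, and the 5 random subsets (instead of the 50 described for Figure 2) are all assumptions made to keep the example small.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data; columns are dS (PSU), SST (deg C), rain rate (mm/day), wind (m/s).
# The three cluster centres and spreads are hypothetical, not values from the study.
centres = np.array([[0.1, 27.0, 0.5, 6.0],
                    [0.0, 14.0, 1.0, 8.5],
                    [-1.0, 2.0, 4.0, 10.5]])
spreads = np.array([[0.2, 1.0, 0.3, 1.0],
                    [0.4, 1.5, 0.5, 1.2],
                    [1.0, 0.8, 1.0, 1.5]])
X = np.vstack([rng.normal(c, s, size=(300, 4)) for c, s in zip(centres, spreads)])

# Standardise so that no single variable dominates (an assumption of this sketch).
X = StandardScaler().fit_transform(X)

def mean_bic(X, k, n_groups=5, frac=0.9):
    """Mean and standard deviation of the BIC over random subsets holding `frac` of the samples."""
    scores = []
    for g in range(n_groups):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        gmm = GaussianMixture(n_components=k, covariance_type="full", random_state=g)
        gmm.fit(X[idx])
        scores.append(gmm.bic(X[idx]))
    return float(np.mean(scores)), float(np.std(scores))

# Evaluate each candidate number of classes and keep the one with the lowest mean BIC.
bic_by_k = {k: mean_bic(X, k) for k in range(1, 7)}
best_k = min(bic_by_k, key=lambda k: bic_by_k[k][0])

# Final classification with the selected number of classes.
labels = GaussianMixture(n_components=best_k, covariance_type="full",
                         random_state=0).fit_predict(X)
print("selected number of classes:", best_k)
```

Averaging the BIC over random subsets, rather than relying on a single fit, gives an indication of how stable the selected number of classes is with respect to sampling.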

Author Contributions

Conceptualisation, Y.O., Y.Z. and Y.D.; methodology, Y.O., Y.Z. and Y.D.; software, Y.O.; validation, Y.O. and Y.Z.; formal analysis, Y.O. and Y.Z.; investigation, Y.O., Y.Z., M.F., F.B. and Y.D.; resources, Y.Z. and Y.D.; data curation, Y.O.; writing—original draft preparation, Y.O.; writing—review and editing, Y.Z., M.F., F.B. and Y.D.; visualisation, Y.O.; supervision, Y.Z., M.F., F.B. and Y.D.; project administration, Y.Z. and Y.D.; funding acquisition, Y.Z. and Y.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Chinese Academy of Sciences (183311KYSB20200015, XDB42010305, 133244KYSB20190031, SCSIO202204, and SCSIO202201), National Natural Science Foundation of China (42276026), Guangdong Natural Science Funds for Distinguished Young Scholar (2024B1515020037), Guangdong Basic and Applied Basic Research Foundation (2023A1515012691), Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (2019BT02H594).

Data Availability Statement

The 8-day running mean SMAP Level 3 SSS data is provided by NASA JPL (https://podaac.jpl.nasa.gov/dataset/SMAP_JPL_L3_SSS_CAP_8DAY-RUNNINGMEAN_V5?ids=&values=&search=SMAP&provider=PODAAC, accessed on 5 April 2023). The in situ SSS observation data is from the Global Argo scattered data set V3.0, provided by the China Argo Real-time Data Center (http://www.argo.org.cn/index.php?m=content&c=index&a=lists&catid=100, accessed on 5 April 2023). The SST data is provided by the NOAA 1/4° Daily Optimum Interpolation Sea Surface Temperature (https://www.ncei.noaa.gov/products/optimum-interpolation-sst, accessed on 9 April 2023). The wind data is the Cross-Calibrated Multi-Platform (CCMP) gridded Level 3 ocean vector wind analysis product provided by Remote Sensing Systems (RSS), using satellite, moored buoy, and model wind data as input (https://www.remss.com/measurements/ccmp/, accessed on 11 April 2023). The precipitation data is provided by the NASA Global Precipitation Measurement (GPM) mission. The precipitation data used in this work is GPM V06, accessed on 25 March 2023; the GPM product has since been upgraded to V07, which can be found at https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGDF_07/summary?keywords=%22IMERG%20final%22. The ice concentration data is obtained from the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) (https://osi-saf.eumetsat.int/products/osi-401-d, accessed on 7 November 2023). The mean velocity and EKE of the surface current are computed from Ocean Surface Current Analysis Real-time (OSCAR) data (https://podaac.jpl.nasa.gov/dataset/OSCAR_L4_OC_third-deg, accessed on 7 March 2022).

Acknowledgments

This work is supported by the Chinese Academy of Sciences (183311KYSB20200015, XDB42010305, 133244KYSB20190031, SCSIO202204, and SCSIO202201), National Natural Science Foundation of China (42276026), Guangdong Natural Science Funds for Distinguished Young Scholar (2024B1515020037), Guangdong Basic and Applied Basic Research Foundation (2023A1515012691), the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (2019BT02H594). In the preparation of this manuscript, we employed the assistance of ChatGPT (version GPT-4o) for grammar correction and stylistic enhancement of the English text. To ensure the preservation of the original content, all suggestions made by ChatGPT were meticulously reviewed by all authors prior to their incorporation into the manuscript.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Entekhabi, D.; Njoku, E.G.; O’Neill, P.E.; Kellogg, K.H.; Crow, W.T.; Edelstein, W.N.; Entin, J.K.; Goodman, S.D.; Jackson, T.J.; Johnson, J.; et al. The Soil Moisture Active Passive (SMAP) Mission. Proc. IEEE 2010, 98, 704–716.
  2. Kerr, Y.H.; Waldteufel, P.; Wigneron, J.-P.; Delwart, S.; Cabot, F.; Boutin, J.; Escorihuela, M.-J.; Font, J.; Reul, N.; Gruhier, C.; et al. The SMOS Mission: New Tool for Monitoring Key Elements of the Global Water Cycle. Proc. IEEE 2010, 98, 666–687.
  3. Lagerloef, G.; Colomb, F.R.; Le Vine, D.; Wentz, F.; Yueh, S.; Ruf, C.; Lilly, J.; Gunn, J.; Chao, Y.; deCharon, A.; et al. The Aquarius/SAC-D Mission: Designed to Meet the Salinity Remote-Sensing Challenge. Oceanography 2008, 21, 68–81.
  4. Sirounian, V. Effect of temperature, angle of observation, salinity, and thin ice on the microwave emission of water. J. Geophys. Res. 1968, 73, 4481–4486.
  5. Reul, N.; Grodsky, S.A.; Arias, M.; Boutin, J.; Catany, R.; Chapron, B.; D’Amico, F.; Dinnat, E.; Donlon, C.; Fore, A.; et al. Sea surface salinity estimates from spaceborne L-band radiometers: An overview of the first decade of observation (2010–2019). Remote Sens. Environ. 2020, 242, 111769.
  6. Boutin, J.; Vergely, J.L.; Marchand, S.; D’Amico, F.; Hasson, A.; Kolodziejczyk, N.; Reul, N.; Reverdin, G.; Vialard, J. New SMOS Sea Surface Salinity with reduced systematic errors and improved variability. Remote Sens. Environ. 2018, 214, 115–134.
  7. Vinogradova, N.; Lee, T.; Boutin, J.; Drushka, K.; Fournier, S.; Sabia, R.; Stammer, D.; Bayler, E.; Reul, N.; Gordon, A.; et al. Satellite Salinity Observing System: Recent Discoveries and the Way Forward. Front. Mar. Sci. 2019, 6, 243.
  8. Yin, X.B.; Wang, Z.Z.; Liu, Y.G.; Cheng, Y.C. A new algorithm for microwave radiometer remote sensing of sea surface salinity without influence of wind. Int. J. Remote Sens. 2008, 29, 6789–6800.
  9. Yin, X.B.; Boutin, J.; Dinnat, E.; Song, Q.T.; Martin, A. Roughness and foam signature on SMOS-MIRAS brightness temperatures: A semi-theoretical approach. Remote Sens. Environ. 2016, 180, 221–233.
  10. Bao, S.; Wang, H.; Zhang, R.; Yan, H.; Chen, J. Comparison of Satellite-Derived Sea Surface Salinity Products from SMOS, Aquarius, and SMAP. J. Geophys. Res. Ocean. 2019, 124, 1932–1944.
  11. Ouyang, Y.; Zhang, Y.; Chi, J.; Sun, Q.; Du, Y. Deviations of satellite-measured sea surface salinity caused by environmental factors and their regional dependence. Remote Sens. Environ. 2023, 285, 113411.
  12. Lang, R.; Zhou, Y.; Utku, C.; Le Vine, D. Accurate measurements of the dielectric constant of seawater at L band. Radio Sci. 2016, 51, 2–24.
  13. Tang, W.; Yueh, S.; Yang, D.; Fore, A.; Hayashi, A.; Lee, T.; Fournier, S.; Holt, B. The Potential and Challenges of Using Soil Moisture Active Passive (SMAP) Sea Surface Salinity to Monitor Arctic Ocean Freshwater Changes. Remote Sens. 2018, 10, 869.
  14. Yu, L.; Josey, S.A.; Bingham, F.M.; Lee, T. Intensification of the global water cycle and evidence from ocean salinity: A synthesis review. Ann. N. Y. Acad. Sci. 2020, 1472, 76–94.
  15. Yueh, S.H.; West, R.; Wilson, W.J.; Li, F.K.; Njoku, E.G.; Rahmat-Samii, Y. Error sources and feasibility for microwave remote sensing of ocean surface salinity. IEEE Trans. Geosci. Remote Sens. 2001, 39, 1049–1060.
  16. Boutin, J.; Chao, Y.; Asher, W.E.; Delcroix, T.; Drucker, R.; Drushka, K.; Kolodziejczyk, N.; Lee, T.; Reul, N.; Reverdin, G.; et al. Satellite and In Situ Salinity: Understanding Near-Surface Stratification and Subfootprint Variability. Bull. Am. Meteorol. Soc. 2016, 97, 1391–1407.
  17. Birchfield, G.E. A coupled ocean-atmosphere climate model: Temperature versus salinity effects on the thermohaline circulation. Clim. Dyn. 1989, 4, 57–71.
  18. Du, Y.; Zhang, Y.; Shi, J. Relationship between sea surface salinity and ocean circulation and climate change. Sci. China Earth Sci. 2019, 62, 771–782.
  19. Fedorov, A.V.; Pacanowski, R.C.; Philander, S.G.; Boccaletti, G. The Effect of Salinity on the Wind-Driven Circulation and the Thermal Structure of the Upper Ocean. J. Phys. Oceanogr. 2004, 34, 1949–1966.
  20. Wong, A.P.S.; Wijffels, S.E.; Riser, S.C.; Pouliquen, S.; Hosoda, S.; Roemmich, D.; Gilson, J.; Johnson, G.C.; Martini, K.; Murphy, D.J.; et al. Argo Data 1999–2019: Two Million Temperature-Salinity Profiles and Subsurface Velocity Observations from a Global Array of Profiling Floats. Front. Mar. Sci. 2020, 7, 700.
  21. Delcroix, T.; Hénin, C. Seasonal and interannual variations of sea surface salinity in the tropical Pacific Ocean. J. Geophys. Res. 1991, 96, 22135–22150.
  22. Delcroix, T.; Hénin, C.; Porte, V.; Arkin, P. Precipitation and sea-surface salinity in the tropical Pacific Ocean. Deep Sea Res. Part I Oceanogr. Res. Pap. 1996, 43, 1123–1141.
  23. Cronin, M.F.; McPhaden, M.J. Upper ocean salinity balance in the western equatorial Pacific. J. Geophys. Res. Ocean. 1998, 103, 27567–27587.
  24. Medina-Lopez, E.; Ureña-Fuentes, L. High-Resolution Sea Surface Temperature and Salinity in the Global Ocean from Raw Satellite Data. Remote Sens. 2019, 11, 2191.
  25. Akhil, V.P.; Vialard, J.; Lengaigne, M.; Keerthi, M.G.; Boutin, J.; Vergely, J.L.; Papa, F. Bay of Bengal Sea surface salinity variability using a decade of improved SMOS re-processing. Remote Sens. Environ. 2020, 248, 111964.
  26. Jang, E.; Kim, Y.J.; Im, J.; Park, Y.-G.; Sung, T. Global sea surface salinity via the synergistic use of SMAP satellite and HYCOM data based on machine learning. Remote Sens. Environ. 2022, 273, 112980.
  27. Jones, D.C.; Holt, H.J.; Meijers, A.J.S.; Shuckburgh, E. Unsupervised Clustering of Southern Ocean Argo Float Temperature Profiles. J. Geophys. Res. Ocean. 2019, 124, 390–402.
  28. Maze, G.; Mercier, H.; Fablet, R.; Tandeo, P.; Lopez Radcenco, M.; Lenca, P.; Feucher, C.; Le Goff, C. Coherent heat patterns revealed by unsupervised classification of Argo temperature profiles in the North Atlantic Ocean. Prog. Oceanogr. 2017, 151, 275–292.
  29. Xia, X.; Hong, Y.; Du, Y.; Xiu, P. Three Types of Antarctic Intermediate Water Revealed by a Machine Learning Approach. Geophys. Res. Lett. 2022, 49, e2022GL099445.
  30. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  31. Cummings, J.A.; Smedstad, O.M. Ocean Data Impacts in Global HYCOM. J. Atmos. Ocean. Technol. 2014, 31, 1771–1791.
  32. Fore, A.G.; Yueh, S.H.; Tang, W.; Stiles, B.W.; Hayashi, A.K. Combined Active/Passive Retrievals of Ocean Vector Wind and Sea Surface Salinity With SMAP. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7396–7404.
  33. Boutin, J.; Martin, N.; Reverdin, G.; Yin, X.; Gaillard, F. Sea surface freshening inferred from SMOS and ARGO salinity: Impact of rain. Ocean Sci. 2013, 9, 183–192.
  34. Xie, S.-P. The shape of continents, air-sea interaction, and the rising branch of the Hadley circulation. In The Hadley Circulation: Present, Past and Future; Springer: Cham, Switzerland, 2004; pp. 121–152.
  35. Asher, W.E.; Jessup, A.T.; Branch, R.; Clark, D. Observations of rain-induced near-surface salinity anomalies. J. Geophys. Res. Ocean. 2014, 119, 5483–5500.
  36. Tang, W.; Yueh, S.H.; Hayashi, A.; Fore, A.G.; Jones, W.L.; Santos-Garcia, A.; Jacob, M.M. Rain-Induced Near Surface Salinity Stratification and Rain Roughness Correction for Aquarius SSS Retrieval. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 5474–5484.
  37. ten Doeschate, A.; Sutherland, G.; Bellenger, H.; Landwehr, S.; Esters, L.; Ward, B. Upper Ocean Response to Rain Observed From a Vertical Profiler. J. Geophys. Res. Ocean. 2019, 124, 3664–3681.
  38. Reverdin, G.; Supply, A.; Drushka, K.; Thompson, E.J.; Asher, W.E.; Lourenço, A. Intense and Small Freshwater Pools from Rainfall Investigated During Spurs-2 on 9 November 2017 in the Eastern Tropical Pacific. J. Geophys. Res. Ocean. 2020, 125, e2019JC015558.
  39. Reul, N.; Chapron, B.; Grodsky, S.A.; Guimbard, S.; Kudryavtsev, V.; Foltz, G.R.; Balaguru, K. Satellite Observations of the Sea Surface Salinity Response to Tropical Cyclones. Geophys. Res. Lett. 2021, 48, e2020GL091478.
  40. Chen, G.; Wang, D.; Hou, Y. The features and interannual variability mechanism of mesoscale eddies in the Bay of Bengal. Cont. Shelf Res. 2012, 47, 178–185.
  41. Pujol, M.I.; Larnicol, G. Mediterranean sea eddy kinetic energy variability from 11 years of altimetric data. J. Mar. Syst. 2005, 58, 121–142.
Figure 1. Flowchart of the data selection, assembly processing, and production of the final classification. The 15 maps are the geographical distribution of the 15 classes, and the coloured dots in the maps represent the ΔS value of the sample.
Figure 2. Ensemble mean (blue line) and spread (grey shading) of the BIC score for increasing the number of GMM classes. The black bars are the standard deviation of the ensemble mean. The BIC scores are computed for 50 random sample groups, each consisting of 90% of the total profiles.
Figure 3. Visualisation of the classification results. For each class, the mean value of SST, rain rate, and wind speed is plotted as a 3D coordinate. (a) is the mean values of each class, the size of the marker represents the sample size of the class, and the colour of the marker represents the mean ΔS of the class. To better illustrate the spread of the classes and without hiding the small classes, we subdivided the classes into 3 subplots according to different temperature ranges. (b) Classes with mean SST below 10 °C, corresponding to the triangle markers in (a); (c) between 10–20 °C, corresponding to the square markers in (a); (d) above 20 °C, corresponding to the round markers in (a). The x-axis is SST, the y-axis is wind speed, and the z-axis is rain rate. Rain rate is plotted in log scale for ease of visualisation in (b–d). The details of each class are referred to in Table 1.
Figure 4. Classes with mean SST higher than 25 °C. (a,b,e–g) Scatterplot maps of ΔS (unit: PSU) in classes K11, K13, K3, K15, and K8, respectively. The dotted area in (a) is where the number of members exceeds 200 in a 5° × 5° grid cell and the samples exceed 12. Regions where samples are insufficient for identifying the predominant season are discarded. (c,d,h–j) Prevailing season of the observations in the same classes above. Colours represent over 50% of the observations in the area being taken in the same season: blue is December to February of next year, green is March to May, red is June to August, orange is September to November, and grey means there is no prevailing season in the area.
Figure 5. Classes with mean SST between 10 °C and 25 °C. (a,b,e–g) Scatterplot maps of ΔS in classes K1, K6, K9, K14, and K7, respectively. (c,d,h–j) Prevailing season of the observations. The legend is the same as Figure 4.
Figure 6. Scatterplot of all SMAP SSS bias observations over a PSU (x-axis) and latitude (y-axis) plot. The coloured shading represents the observation count in a 0.02 PSU and 0.5° grid size. The overlaid dashed lines are the mean rain rate (black) and the mean SSS (red), respectively, along the latitude. The mean rain rate and SSS values are in the top x-axis.
Figure 7. Classes with mean SST lower than 10 °C. (a,b) Scatterplot maps of ΔS in classes K2 and K10, respectively. (c,d) Prevailing season of the observations in classes K2 and K10, respectively. The legend is the same as Figure 4.
Figure 8. The distribution of members in K12 and its relationship with sea ice concentration. (a) Scatterplot map of K12, where the colour represents ΔS. (b) Prevailing season of the observations. (c) Scatter plot of observations with sea ice presence within 50 km, with the colour representing the percentage of ice concentration. (d) Observations and mean ΔS concerning sea ice concentration. (e) Scatterplot within the classification parameter space, with the x-, y-, and z-axes representing SST, wind speed, and rain rate, respectively, and the colour of the marker representing ΔS.
Figure 9. The distribution of members in K4 and its relationship with precipitation. (a) Scatterplot map of K4. (b) Prevailing season of the observations. (c) Annual mean precipitation. (d) Relations between ΔS and rain rate, the colour is the member count in the corresponding ΔS and rain rate. (e) Scatterplot for classification parameters, same as in Figure 8e. The observation count in (d) is calculated with the bin size of 0.1 PSU along the x-axis and 2.5 mm/day along the y-axis.
Figure 10. The distribution of members in K5 and its relationship with sea surface current. (a) Scatter plot of K5. (b) Prevailing season of the observations. (c) Annual mean Eddy Kinetic Energy (EKE) of surface current (shading) overlaps with the mean velocity of sea surface current (contour, unit: m/s). (d) Snapshot of SMAP SSS and ocean surface current. The colour shading is SSS, the quiver is current, and the red pentagram marker is Argo observation.
Table 1. Statistics of the 15 classes identified by the model, including the mean ΔS, SST, rain rate, wind speed, standard deviation, and percentage of the total data volume of each class.
Class | ΔS (PSU) | SST (°C) | Rain rate (mm/day) | Wind speed (m/s) | SD of ΔS (PSU) | Percentage of total data volume
1     | 0.09     | 12.58    | 0.00               | 6.96             | 0.56           | 19.17%
2     | 0.24     | 7.53     | 2.43               | 9.82             | 0.55           | 4.63%
3     | 0.10     | 28.13    | 0.04               | 6.23             | 0.23           | 4.97%
4     | −0.04    | 22.19    | 24.82              | 7.88             | 0.42           | 0.84%
5     | −0.37    | 13.19    | 1.11               | 10.12            | 1.50           | 0.67%
6     | 0.09     | 11.61    | 7.99               | 9.38             | 0.58           | 2.71%
7     | −0.07    | 21.93    | 2.71               | 7.57             | 0.24           | 2.36%
8     | 0.02     | 29.10    | 5.96               | 5.66             | 0.30           | 3.20%
9     | 0.00     | 15.38    | 0.58               | 8.66             | 0.33           | 5.85%
10    | 0.05     | 4.05     | 0.31               | 10.05            | 1.03           | 2.74%
11    | 0.07     | 25.26    | 0.00               | 6.25             | 0.22           | 35.04%
12    | −2.43    | 1.14     | 3.92               | 10.54            | 5.64           | 0.23%
13    | 0.07     | 26.58    | 0.16               | 6.80             | 0.19           | 5.81%
14    | 0.05     | 13.61    | 0.07               | 8.53             | 0.43           | 6.43%
15    | 0.06     | 28.66    | 0.84               | 5.99             | 0.25           | 5.35%
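As a quick consistency check on Table 1, the sketch below recomputes the total data share, the volume-weighted mean ΔS over all classes, and the data share in each of the three SST ranges used in Figure 3. It is an illustrative calculation on the tabulated class means and is not part of the original analysis; the variable and function names are hypothetical.

```python
# (class, mean dS [PSU], mean SST [deg C], share of total data volume [%]), copied from Table 1
TABLE_1 = [
    (1, 0.09, 12.58, 19.17), (2, 0.24, 7.53, 4.63), (3, 0.10, 28.13, 4.97),
    (4, -0.04, 22.19, 0.84), (5, -0.37, 13.19, 0.67), (6, 0.09, 11.61, 2.71),
    (7, -0.07, 21.93, 2.36), (8, 0.02, 29.10, 3.20), (9, 0.00, 15.38, 5.85),
    (10, 0.05, 4.05, 2.74), (11, 0.07, 25.26, 35.04), (12, -2.43, 1.14, 0.23),
    (13, 0.07, 26.58, 5.81), (14, 0.05, 13.61, 6.43), (15, 0.06, 28.66, 5.35),
]

def sst_group(sst):
    """SST ranges used to split the classes into the subplots of Figure 3."""
    if sst < 10.0:
        return "below 10 degC"
    if sst <= 20.0:
        return "10-20 degC"
    return "above 20 degC"

# The class shares should add up to (approximately) 100%.
total_share = sum(share for _, _, _, share in TABLE_1)

# Mean dS over all samples, weighting each class mean by its share of the data.
weighted_mean_ds = sum(ds * share for _, ds, _, share in TABLE_1) / total_share

# Share of the data falling in each SST range.
share_by_group = {}
for _, _, sst, share in TABLE_1:
    key = sst_group(sst)
    share_by_group[key] = share_by_group.get(key, 0.0) + share

print(f"total share: {total_share:.2f}%")
print(f"volume-weighted mean dS: {weighted_mean_ds:+.3f} PSU")
for key, share in share_by_group.items():
    print(f"{key}: {share:.2f}% of samples")
```

With the tabulated values, the weighted mean ΔS is roughly 0.06 PSU, and more than half of the samples fall in classes with a mean SST above 20 °C, which illustrates how strongly the large warm-water classes dominate the overall statistics.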
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Ouyang, Y.; Zhang, Y.; Feng, M.; Boschetti, F.; Du, Y. Geoclimatic Distribution of Satellite-Observed Salinity Bias Classified by Machine Learning Approach. Remote Sens. 2024, 16, 3084. https://doi.org/10.3390/rs16163084
