Fig. 2. A series of plots illustrating fluctuations in ocean and wave variables.
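The notation below describes a standard LSTM cell. For reference, one common formulation consistent with these symbols is shown here (a sketch, not necessarily the exact equations of the original; σ denotes the logistic sigmoid and ⊙ the element-wise product):

\[
\begin{aligned}
g^{\langle t \rangle} &= \tanh\!\left(W_{ig}\, x^{\langle t \rangle} + W_{hg}\, h^{\langle t-1 \rangle} + b_g\right)\\
i^{\langle t \rangle} &= \sigma\!\left(W_{ii}\, x^{\langle t \rangle} + W_{hi}\, h^{\langle t-1 \rangle} + b_i\right)\\
f^{\langle t \rangle} &= \sigma\!\left(W_{if}\, x^{\langle t \rangle} + W_{hf}\, h^{\langle t-1 \rangle} + b_f\right)\\
o^{\langle t \rangle} &= \sigma\!\left(W_{io}\, x^{\langle t \rangle} + W_{ho}\, h^{\langle t-1 \rangle} + b_o\right)\\
c^{\langle t \rangle} &= f^{\langle t \rangle} \odot c^{\langle t-1 \rangle} + i^{\langle t \rangle} \odot g^{\langle t \rangle}\\
h^{\langle t \rangle} &= o^{\langle t \rangle} \odot \tanh\!\left(c^{\langle t \rangle}\right)
\end{aligned}
\]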
where 𝑥<𝑡> is a vector from the input matrix X at time t, 𝑔<𝑡> is the information used to update the cell state 𝑐<𝑡>, the superscript t is the current time index, t − 1 is the previous time index, and tanh is the hyperbolic tangent function. 𝑓<𝑡>, 𝑖<𝑡>, and 𝑜<𝑡> are the forget, input, and output gates, respectively, and ℎ<𝑡> is the output from the LSTM layer. The input weight matrices are 𝑊𝑖𝑔, 𝑊𝑖𝑓, 𝑊𝑖𝑖, and 𝑊𝑖𝑜, with sizes N × B, where B is the number of LSTM cells; 𝑊ℎ𝑔, 𝑊ℎ𝑓, 𝑊ℎ𝑖, and 𝑊ℎ𝑜 are the recurrent weight matrices, of size N × N; and b (e.g., 𝑏𝑜) is the bias term in the activation function.

The collected data were divided into two groups, one for training and one for testing: the training group includes most of the original data set, and the test group consists of what is left. Next, the data were iterated through and divided into multiple NumPy arrays, or frames, until all of the data had been stored in frames. As noted, splitting the data set into frames in this way exploits temporal information and improves model performance (a brief NumPy sketch of this framing step is given below). The training data set was used to build and train the deep learning model. Subsequently, the test data set was fed to the trained model to verify its performance against real data. Training estimates were assessed by comparing the training and validation losses and the training and validation accuracies. Losses were expressed as the root mean squared error (RMSE) and are treated as equivalent to accuracies.
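Concretely, the per-feature loss can be written in the standard RMSE form (a sketch of the presumed definition; T denotes the number of evaluated time steps):

\[
\mathrm{RMSE}_m = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left(\hat{y}_m^{\langle t \rangle} - y_m^{\langle t \rangle}\right)^2}
\]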
where ŷ𝑚 is the prediction of interest m = 1, . . . , M at time t from the output matrix ŷ. Tables II and III list the RMSEs (losses) for the training and testing data sets, respectively.
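As a concrete illustration of the framing step described earlier, the following minimal NumPy sketch splits a multivariate time series into overlapping input frames and next-step targets. It is an assumed implementation: the function name make_frames, the window length of 5 (chosen to match the input shape in Table I), and the 80/20 train/test split are illustrative, not taken from the study.

```python
import numpy as np

def make_frames(data, window=5, horizon=1):
    """Split a (T, n_features) series into overlapping frames.

    Each input frame covers `window` consecutive time steps; the
    target is the observation `horizon` steps after the frame ends.
    """
    X, y = [], []
    for start in range(len(data) - window - horizon + 1):
        X.append(data[start:start + window])           # (window, n_features)
        y.append(data[start + window + horizon - 1])   # next observation
    return np.array(X), np.array(y)

# Example: 1000 records of the five buoy variables (Wvht, Dpd, Apd, Mwd, Wtmp)
series = np.random.rand(1000, 5)
X, y = make_frames(series)
split = int(0.8 * len(X))              # most of the data for training
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
print(X_train.shape, y_train.shape)    # (796, 5, 5) (796, 5)
```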
TABLE I
RNN-LSTM DEEP LEARNING MODEL SUMMARY.

Layer (type)        Shape           Number of Parameters
lstm (LSTM)         (None, 5, 32)   8065
lstm_1 (LSTM)       (None, 5, 16)   3136
dropout (Dropout)   (None, 5, 16)   0
lstm_2 (LSTM)       (None, 10)      1080
dense (Dense)       (None, 5)       55

TABLE II
TRAINING LOSS (RMSES).

Feature   RMSE
Wvht      0.413
Dpd       12.55
Apd       1.123
Mwd       0.291
Wtmp      0.413
TABLE III
TESTING LOSS (RMSES).

Feature   RMSE
Wvht      0.394
Dpd       11.82
Apd       1.089
Mwd       0.307
Wtmp      0.394

E. HYPERPARAMETER OPTIMIZATION

The Adam method was used to minimize the loss function; dropout and batch normalization were implemented to reduce overfitting and to hasten convergence. In natural language processing models using RNNs, dropout rates range from 0.2 to 0.6, and a rate of 0.2 was used in this study. We set the learning rate to 0.001 and the batch size to 200. Table I depicts the architecture of the deep learning model.
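Under these hyperparameters, the architecture summarized in Table I can be sketched in Keras roughly as follows. This is an assumed reconstruction: the framework (tensorflow.keras), the input feature count n_features, the number of epochs, and the validation split are not stated in this excerpt, and Table I lists no batch-normalization layer, so none is included here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

n_features = 5  # assumption: one input per buoy variable (Wvht, Dpd, Apd, Mwd, Wtmp)

model = models.Sequential([
    layers.LSTM(32, return_sequences=True,
                input_shape=(5, n_features)),   # lstm:    (None, 5, 32)
    layers.LSTM(16, return_sequences=True),     # lstm_1:  (None, 5, 16)
    layers.Dropout(0.2),                        # dropout rate from Sec. E
    layers.LSTM(10),                            # lstm_2:  (None, 10)
    layers.Dense(5),                            # dense:   (None, 5)
])

# Adam with the stated learning rate; RMSE tracked as the reported metric.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="mse",
              metrics=[tf.keras.metrics.RootMeanSquaredError()])

# Training sketch (epoch count and validation split are placeholders):
# history = model.fit(X_train, y_train, epochs=100, batch_size=200,
#                     validation_split=0.2)
```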