Faller Pressure Production
Keywords: Forecasting; Data-driven; Deep learning; Oil production; Pre-salt

Production forecasting plays an important role in oil and gas production, aiding engineers to perform field management. However, this can be challenging for complex reservoirs such as the highly heterogeneous carbonate reservoirs from Brazilian Pre-salt fields. We propose a new setup for forecasting multiple outputs using machine-learning algorithms and evaluate a set of deep-learning architectures suitable for time-series forecasting. The setup proposed is called N-th Day and it provides a coherent solution for the problem of forecasting multiple data points in which a sliding window mechanism guarantees there is no data leakage during training. We also devise four deep-learning architectures for forecasting, stacking the layers to focus on different timescales, and compare them with different existing off-the-shelf methods. The obtained results confirm that specific architectures, as those we propose, are crucial for oil and gas production forecasting. Although LSTM and GRU layers are designed to capture temporal sequences, the experiments also indicate that the investigated scenario of production forecasting requires additional and specific structures.
∗ Corresponding author.
E-mail address: rafael.werneck@ic.unicamp.br (R.d.O. Werneck).
https://doi.org/10.1016/j.petrol.2021.109937
Received 15 April 2021; Received in revised form 3 November 2021; Accepted 6 November 2021
Available online 13 December 2021
0920-4105/© 2021 Elsevier B.V. All rights reserved.
R.d.O. Werneck et al. Journal of Petroleum Science and Engineering 210 (2022) 109937
This paper investigates forecasting for oil production and bottom-hole pressure (BHP) of producer wells in a carbonate reservoir, considering only data from the floating production storage and offloading units (FPSOs). Traditional methods in the oil and gas industry leverage numerical reservoir models to perform production forecasts, a model-based approach known to be time-consuming and computationally expensive. The literature for forecasting well production based on data-driven features, mostly with deep learning techniques, is still scarce and not well documented (Cao et al., 2016; Kubota and Reinert, 2019; Zhan et al., 2019; Davtyan et al., 2020; Liu et al., 2020; Zhong et al., 2020).

This section presents different setups for time-series forecasting in the prior art that consider a broader horizon, i.e., setups that perform multiple output predictions for a target variable. The multiple output prediction aims at predicting a sequence of two or more data points based on a sequence of input data. We discuss the scarce literature on data-driven oil production forecasting. Finally, we describe off-the-shelf and state-of-the-art methods for forecasting general data that could be adapted to the present problem.

2.1. Forecasting setups

The literature of time-series forecasting lacks a default setup on how to perform forecasts considering multiple outputs. Bontempi (2008) describes the two most common approaches to multiple outputs as a conditional distribution over input and output sequences. Fig. 1 depicts these setups in a graphical model, where Fig. 1(a) is the iterated prediction, and Fig. 1(b) is the direct prediction. The iterated prediction is an iterative one-step-ahead prediction, in which the dependencies are preserved, but the error is propagated throughout the iterations. On the other hand, the direct approach has different models for each next step, making it a conditionally independent problem.

Brownlee proposed another setup for multiple outputs on his website (Brownlee, 2018). In this approach, each prediction window's output is concatenated in a sequence, considering a one-step sliding window, and then evaluated. The advantage of this approach is that every multiple output prediction is considered when calculating the metrics. However, it is impossible to have a plot representation of the prediction results for this method, as it has more than one result for each data point.

Pao (2007) proposed a rolling cross-validation setup, in which the testing set is separated into folds. The training for each fold is composed of the previous data in the time series. This approach is similar to what we call retraining. When using retraining, we improve the learning model in each step of the testing by incorporating the most recent data points. However, one disadvantage of this method is that retraining in each step is time-consuming.

Bedi and Toshniwal (2019) studied a different technique. They used an output window of multiple outputs and slid the window by the size of the output for the next prediction, with the ground-truth values as input. Thus, they combined their results at the end of the prediction and presented a plot of their predictions. In this method, they avoid multiple predictions for the same day. However, they combined different confidences of the prediction when concatenating the last day of a window with the first day of the next window.

We detail these different setups in the following figures. For each figure, one must consider each column as a day in the test set and the different rows as different predictions. The concatenation approach performs its multiple output prediction and slides the window by one day to perform the next prediction. All predictions are then concatenated and the representation is used to calculate the metrics. In Fig. 2, we have three predictions (blue, green, and red, with yellow representing each input) of three days, and they are concatenated at the end. Hence, the last day of the first prediction (day 3 in blue) is succeeded by the first day of the second prediction (day 2 in green), and so on. This final concatenation is used to calculate the metrics of this setup.

We included a yellow strip to represent the training set in the retraining setup, showing that we have a larger training set for each forecast. The method retrains before performing the next prediction. In the end, the predictions complete the test set (last row), and we then evaluate the method. Fig. 3 depicts this setup.

Finally, the sliding window setup performs a prediction for multiple outputs without intersections between the columns, i.e., if the first prediction predicts day 1 to day 3, the second one predicts day 4 to day 6. Fig. 4 illustrates this method. These different setups and their problems (time-consuming retraining and the combination of different prediction confidences) motivated us to discuss and propose a new setup that is more compatible with the challenge of forecasting for longer periods.
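To make the difference between the one-step concatenation setup and the non-overlapping sliding window setup concrete, the following is a minimal sketch, not the implementation used in this work. The helper names (`make_windows`, `persistence_model`, `concatenation_forecast`, `tumbling_forecast`) and the toy series are hypothetical, and a naive last-value forecaster stands in for any trained model.

```python
import numpy as np

def make_windows(series, n_in, n_out, step):
    """Cut (input, target) windows from a 1-D series, moving `step` points each time."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - n_in - n_out + 1, step)
    X = np.stack([series[s:s + n_in] for s in starts])
    Y = np.stack([series[s + n_in:s + n_in + n_out] for s in starts])
    return X, Y

def persistence_model(x, n_out):
    """Hypothetical stand-in forecaster: repeat the last observed value."""
    return np.full(n_out, x[-1])

def concatenation_forecast(series, n_in, n_out):
    # One-step sliding window: every window's full output is kept,
    # so most days receive more than one prediction.
    X, Y = make_windows(series, n_in, n_out, step=1)
    preds = np.concatenate([persistence_model(x, n_out) for x in X])
    return preds, Y.ravel()

def tumbling_forecast(series, n_in, n_out):
    # The window slides by its own output size: each day is predicted once.
    X, Y = make_windows(series, n_in, n_out, step=n_out)
    preds = np.concatenate([persistence_model(x, n_out) for x in X])
    return preds, Y.ravel()

series = np.arange(10.0)  # toy data, days 0..9
concat_pred, concat_true = concatenation_forecast(series, n_in=2, n_out=3)
tumble_pred, tumble_true = tumbling_forecast(series, n_in=2, n_out=3)
```

In the concatenated target sequence, day 4 is immediately followed by day 3, which is why this setup cannot be plotted as a single curve, whereas the non-overlapping variant yields one value per day.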
model can exploit other dynamical trends to improve the generalization. After concatenating the input encodings, the decoder predicts oil, water, and gas as multivariate time series. Although the results indicate a long-term forecasting ability, they only perform 6-data-points-ahead forecasting and do not compare it with other strategies.

Most of these studies are data-driven, the same category as our proposed approach. Nevertheless, such methods comprise simple combinations of linear regressions or recurrent networks with fewer layers. This paper proposes solutions with deeper architectures and, most importantly, different (and more complete) evaluation setups. Therefore, we opted not to compare the proposed methods with the ones mentioned above but, instead, with more recent deep-learning derived methods and off-the-shelf solutions.

2.3. Baseline methods

We adopted two baselines for our experiments: Pressure-Normalized Decline Curve Analysis and a simple Recurrent Neural Network. Decline Curve Analysis (DCA) is a traditional method used in the oil and gas industry to predict future production (Belyadi et al., 2019) using a graphical procedure to analyze the declining production rates. Lacayo and Lee (2014) proposed a modified curve analysis method for unconventional reservoirs that did not achieve the pressure stabilized state. The Pressure-Normalized Decline Curve Analysis (PN-DCA) performs a decline curve analysis using pressure-normalized production rates. This pressure-normalized rate can be described by Eq. (1).

$$\Delta p_N = \frac{q}{p_i - p_{wf}} \tag{1}$$

where $q$ is the production rate, $p_i$ the average initial reservoir pressure, and $p_{wf}$ is the flowing pressure. This pressure-normalized rate $\Delta p_N$ can be calculated by Eq. (2).

$$\frac{1}{\Delta p_N} = m\sqrt{t} + b' \tag{2}$$

where $m$ is the slope of the straight line and $b'$ is where the curve intercepts the $y$-axis.

RNNs are one of the most promising techniques for time-series forecasting. Their main advantage is the presence of memory cells capable of propagating information through time. At a specific time-step $t$, the output $h_t$ is calculated based on the current time-step input $X_t$ and the previous time-step output $h_{t-1}$. The RNN network is a simple network composed of an RNN layer, followed by a single dense layer to our output size.

2.4. Off-the-shelf methods

As the literature of data-driven oil-production forecasting is still scarce, we also adopt state-of-the-art methods for general forecasting and compare them to our proposal. These methods are considered off-the-shelf, as they are general-purpose and not tailored for oil production forecasting.

Salinas et al. (2020) proposed DeepAR, an autoregressive recurrent neural network for probabilistic forecasting. This method learns a global model for all time series in a dataset. The authors claim as an advantage that the model learns seasonal behaviors across time series. They make a probabilistic forecast in the form of Monte Carlo samples learning from similar items. Moreover, their method can incorporate many likelihood functions to apply in the data. DeepAR was proposed for Amazon's retail businesses but was also evaluated on datasets of various problems.

Oreshkin et al. (2019) presented a neural architecture based on backward and forward residual links and fully connected layers for forecasting univariate time series. Their architecture is generic and straightforward; it does not rely on time series feature engineering; it is easy to interpret and extend. They also used ensembling to be comparable to other methods from the M4 forecasting competition. They evaluated their approach, called N-BEATS, on the M4 (Makridakis et al., 2018), M3 (Makridakis and Hibon, 2000), and Tourism (Athanasopoulos et al., 2011) datasets.

Vaswani et al. (2017) proposed an architecture based on attention mechanisms called Transformer. Their network model has an encoder–decoder structure using stacked self-attention and point-wise connected layers. As the model has neither recurrent nor convolution layers, they needed a positional encoding to have information about relative positions in the sequence. The Transformer architecture was evaluated on translation tasks, outperforming other architectures.

Taylor and Letham (2018) described a modular regression model that can be adjusted to different time series with the help of a specialist. Their method, named Prophet, uses a decomposable time series model to obtain three components: trend, seasonality, and holidays. The authors frame the problem as a curve-fitting exercise, as they claim it provides flexibility, no need for interpolating missing values, efficiency in fitting the model, and interpretable parameters. Taylor and Letham evaluated Prophet in business time series, specifically for Facebook Events.

3. Proposed method

In this section, we present our proposed approaches to deal with bottom-hole pressure and oil production forecasting. Our main contributions are divided into three fronts: first, in terms of validation, we propose a more realistic setup for comparing different forecasting methods when predicting multiple days. This setup allows us to plot results correctly and avoid mixing different prediction confidences over a long range. Second, we introduce a series of pre-processing, data augmentation, and inclusion of injection data in the forecasting modeling. Finally, we propose methods to leverage cutting-edge deep-learning formulations for temporal data to tackle bottom-hole pressure and oil production forecasting. Fig. 5 presents a pipeline of our methodology.

3.1. Proposed evaluation setups

To avoid the problems raised in Section 2, we propose two forecasting setups for time series. The first, which we denominate First Prediction, slides a multi-output window one step at a time. This method obtains the first prediction for each test data point. We thus considered all the data from the first forecast window and, for the subsequent predictions, only the last data of the output. Fig. 6 details how this approach works. Considering each column a data point and each row the predictions of multiple outputs with a sliding window of one point each time, the First Prediction approach corresponds to the first prediction made for each data point. In this case, we select the three data predictions for the first forecasting and then the last data point of each subsequent prediction to compose our final forecast. We can plot the predictions with this setup, but it still combines different confidences considering the first prediction window.

Our second setup focuses on the last prediction data, obtaining a result that is more compatible with the challenging problem of forecasting for longer periods. We name this approach N-th Day. In this case, we perform the same multi-output window with sliding steps of the previous setup but only consider each window's last day for the evaluation. Thus, the obtained results for this approach represent the most challenging forecast data, which is the most distant data in our output window from the input data. This setup has the advantage of not combining different prediction confidences in its results. However, it does not provide every data point needed for a plot; the First Prediction setup complements it in this case.

This setup helps us to focus the evaluation on the behavior of the forecasting methods for the N-th Day prediction. Fig. 7 shows an example of this evaluation approach considering an output window of size 3. For each prediction (each row), each forecast's last predicted value (third data point) is selected to compose the final forecast series.
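The selection rules of the First Prediction and N-th Day setups can be illustrated with a short sketch. It assumes the predictions are already stored as a matrix with one row per one-step sliding window and one column per output day; the function names and the toy numbers are hypothetical and only meant to mirror the description above.

```python
import numpy as np

def first_prediction(preds):
    """Keep the whole first window, then only the last value of each later window."""
    preds = np.asarray(preds, dtype=float)
    return np.concatenate([preds[0], preds[1:, -1]])

def nth_day(preds):
    """Keep only the last (most distant) value of every window."""
    preds = np.asarray(preds, dtype=float)
    return preds[:, -1]

def nth_day_targets(test_series, n_out):
    """Ground truth aligned with nth_day: the test set from its n_out-th point onward."""
    return np.asarray(test_series, dtype=float)[n_out - 1:]

# Toy example with an output window of size 3 over test days 1..5.
# Row r holds the forecast for days r+1, r+2, r+3 (values are made up).
preds = [[1.0, 2.0, 3.0],
         [2.1, 3.1, 4.1],
         [3.2, 4.2, 5.2]]
test_days = [1.0, 2.0, 3.0, 4.0, 5.0]

plot_series = first_prediction(preds)        # one value per test day
eval_series = nth_day(preds)                 # third-day forecasts only
eval_truth = nth_day_targets(test_days, 3)   # days 3, 4, 5
```

Under these assumptions, `plot_series` covers every test day and can be plotted, while `eval_series` contains only forecasts made at the same distance from the input, so no prediction confidences are mixed.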
Table 1
Perturbations for the data augmentation in the training phase with their descriptions.

| Perturbation | Description |
| --- | --- |
| Add noise | Adds random noise to the time series. |
| Convolution | Performs a convolution in the time series with a kernel window, i.e., a composition function between the time series' values and a kernel function (e.g., triangular, Hann window), which acts as a filter. |
| Drift | Adds a drift value to some points of the time series. |
| Pool | Divides the time series into windows and then applies a pooling function (e.g., maximum, minimum, average) to each window point. |
| Quantization | Defines level sets according to a distribution (e.g., uniform, quantile, k-means) and rounds the time series' values to the nearest level in the level set. |
| Reverse | Reverses the timeline of a series. |
| Time warp | Randomly changes the speed of the timeline. |

Fig. 9. Representation of an LSTM layer. $X_t$ represents the input at instant $t$, $C_{t-1}$ and $C_t$ are the memory from the previous LSTM cell and the current cell, and $h_{t-1}$ and $h_t$ are the output of the previous cell and the output of the current cell. The gates of this cell are enumerated as follows: (1) is the forget gate, (2) is the input gate, and (3) is the output gate. In this recurrent cell, the input and the previous output decide how much of the memory from the last cell to consider, the input gate then updates the memory state of the cell, and finally, the memory of the cell combined with the inputs results in the output of the cell.
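The gate sequence described for the LSTM cell in Fig. 9 can be sketched as a single time step in NumPy. This is the textbook LSTM formulation, written here as an illustration and not taken from the networks used in this work; the weight layout (one matrix per gate acting on the concatenation of $h_{t-1}$ and $X_t$), the sizes, and the random initialization are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W[k] has shape (hidden, hidden + features) for k in 'f', 'i', 'g', 'o'."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])   # (1) forget gate: how much old memory to keep
    i = sigmoid(W["i"] @ z + b["i"])   # (2) input gate: how much new content to write
    g = np.tanh(W["g"] @ z + b["g"])   # candidate memory content
    c_t = f * c_prev + i * g           # updated memory state C_t
    o = sigmoid(W["o"] @ z + b["o"])   # (3) output gate
    h_t = o * np.tanh(c_t)             # cell output h_t
    return h_t, c_t

# Hypothetical sizes and random weights, used only to run the sketch.
rng = np.random.default_rng(0)
n_features, n_hidden = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_features)) for k in "figo"}
b = {k: np.zeros(n_hidden) for k in "figo"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_features)):   # five time steps of toy input
    h, c = lstm_cell_step(x_t, h, c, W, b)
```

Carrying `h` and `c` from one call to the next is what lets the layer propagate information through time.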
considering the BHP and reservoir pressure difference, both from the
injector and producer.
Table 2
The ANNs details we have used in the experiments. A plot of the models can be found in Fig. A.1.

| GRU2 | GRUconv | Seq2Seq | CNN |
| --- | --- | --- | --- |
| 2x gru: units = 128, return sequences = T/F | 2x conv1D: filters = 64/32, kernel size = 4, strides = 2, padding = ''valid'' | 2x gru: units = 128, dropout = 0.1, recurrent dropout = 0.5, return sequences = T | 3x conv1D: filters = 64, kernel size = 2, padding = ''same'' + batch normalization |
| 10x dense: units = 128/64/32/30 | 2x gru: units = 128, dropout = 0.1, recurrent dropout = 0.5, return sequences = T/F | lambda: last 30 days | global average pooling1D |
| | 10x dense: units = 128/64/32/30 | time distributed: dense(1) | 1x dense: units = 30 |
4. Experimental protocol
fluid production (oil, gas, and water), pressure (bottom-hole), and the ratio between them (water cut, gas–oil ratio, and gas–liquid ratio). The reservoir contains 16 producers and 16 injector wells, divided into nine water injectors and seven WAG injectors. For the oldest producer well, we have five years of historical data.

The final dataset is the UNISIM-II-M-CO benchmark,² created by the UNISIM group at the University of Campinas. This synthetic benchmark, run on the CMG-GEM simulator,³ has production and injection trends similar to the private dataset apart from being a carbonate reservoir based on real field data. The model is a synthetic light-oil based on a combination of Pre-Salt characteristics, such as fractures, Super-K layers, and high heterogeneity (Correia et al., 2015). The fluid model is compositional, with seven components in the oil phase. The simulation model has 6.5 years of production history and contains eight WAG injectors alternating every six months and ten producer wells. All the producers and injectors of the simulation model present total and partial closure frequency similar to real cases.

² https://www.unisim.cepetro.unicamp.br/benchmarks/en/unisim-ii/overview, accessed on November 23rd, 2020.
³ https://www.cmgl.ca/, accessed on September 29th, 2020.

In this subsection, we describe the adopted evaluation protocols.

We selected the Metro and the Energy datasets for the experiments comparing the forecasting setups to show that our proposed setup can [...] Section 3.4) with Huber loss in all experiments as our proposed baseline, unless stated otherwise. As the approach is stochastic, we performed ten runs to obtain a margin of its results. We obtained the mean of these predictions as our final result. We fed batches of data containing 85 points as input to learn how to predict the following 30 data points (output) for training the network. All experiments were performed with 100 epochs, using an early stopping approach after 10 epochs without improving the validation loss, a common practice in machine learning.

For the oil datasets, we performed experiments considering our N-th Day approach, as it is better for assessing the quality of the forecasting further in the future, as discussed in Section 3.1. As the datasets contain daily data, we performed a forecast of one month ahead (30 data points output), given an input of 85 data points.

Thirty days might seem a small time frame for petroleum engineers, as they are used to forecasting years using numerical simulations. However, it is a large enough window to help decide interventions in the field's operation. A reservoir simulator is usually not predictive enough for short-term events, particularly for large and heterogeneous reservoirs, due to the complexity of representing such reservoirs and computational limitations. Machine-learning approaches can deal better with the complex data available from different sources and high-frequency data of the oil and gas industry. Therefore, through the more refined data, these machine-learning approaches can be more accurate in predicting a near-future event, such as kicks, hydrate formation, or early water and gas breakthroughs.

All the networks tested in these datasets use Huber loss. We selected two targets for our experiments in these datasets: daily oil production and well bottom-hole pressure. Each data point input of the networks consists of all available variables for the given data point, e.g., production data, injection data, and well pressure.

After obtaining the forecast results, we performed post-processing in the data. We ensured that the target was not negative for the oil datasets, as our targets are daily pressure and daily production. We also removed any outliers above one standard deviation.

Upon obtaining the forecasting results, we evaluated the approaches through three metrics commonly used in forecasting problems (Kubota and Reinert, 2019; Oreshkin et al., 2019; Liu et al., 2020): Mean Absolute Error (MAE), which shows the magnitude of errors,

$$\mathrm{MAE}(X, h) = \frac{1}{m}\sum_{i=1}^{m} \left| h(x_i) - y_i \right|,$$

Root Mean Square Error (RMSE), which measures how spread out the errors are,

$$\mathrm{RMSE}(X, h) = \sqrt{\frac{1}{m}\sum_{i=1}^{m} \left( h(x_i) - y_i \right)^2},$$

and Symmetric Mean Absolute Percentage Error (SMAPE),

$$\mathrm{SMAPE}(X, h) = \frac{100}{m}\sum_{i=1}^{m} \frac{\left| h(x_i) - y_i \right|}{\left( |y_i| + |h(x_i)| \right)/2},$$

where $h(x_i)$ are the predicted values, $y_i$ the ground-truth, and $m$ the number of evaluated data points.

5. Experiments and results

In this section, we present the sets of experiments and discuss their results. The first set of experiments (Section 5.1) compares different forecasting setups. The second set (Section 5.2) shows our forecasting method applied in the oil and gas field and highlights the importance of data pre-processing. The third set (Section 5.3) adds information about the injection data, considering the delay of influence extracted from correlation techniques. Finally, the last set (Section 5.4) compares the proposed approach with a number of off-the-shelf data-driven solutions.

5.1. Comparing forecast setups

Our first experiment compared the different forecast setups in the literature (Concatenation, Tumbling, and Retraining) with our proposed approach (N-th Day). All these setups perform multiple output forecasting, contemplate a more extensive testing set, and combine their outputs differently. Table 3 shows how these methods perform in the Metro and Energy datasets.

Table 3
Forecasting setups on Metro and Energy datasets.

| Dataset | Metric | Concatenation | Tumbling | Retraining + Tumbling | N-th Day |
| --- | --- | --- | --- | --- | --- |
| Metro | MAE | 1062.65 | 1054.47 | 1046.67 | 1432.57 |
| Metro | RMSE | 1405.41 | 1391.53 | 1383.81 | 1764.99 |
| Metro | SMAPE | 43.49 | 43.30 | 43.03 | 54.21 |
| Energy | MAE | 43.29 | 43.25 | 43.21 | 45.78 |
| Energy | RMSE | 91.56 | 91.33 | 88.56 | 93.79 |
| Energy | SMAPE | 36.08 | 36.15 | 36.21 | 38.25 |

As Table 3 shows, the N-th Day evaluation setup does not have the best result, as expected. This is coherent with its proposition of considering only the last data of each horizon window and integrating this data at the end. The last day is the most challenging data point to be predicted, as it is the furthest from the input data.

All other setups consider the other predicted data points (e.g., 1-day prediction) in their evaluations and, because of this, their results tend to be better. For instance, the first predicted data point is straightforward, as the method can even use the same last data point seen without any learning method and still obtain reasonable results. We can see in the Energy dataset that the results for the N-th Day setup are not far from the other approaches. The subsequent experiments present results using only the N-th Day setup for forecasting.

It is interesting to notice that the Retraining setup outperforms the Concatenation and Tumbling methods. However, as it retrains before every new prediction, its cost is multiple times the cost of the other methods.
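The three metrics reported in Table 3 can be computed as in the sketch below. It uses the standard definitions of MAE, RMSE, and SMAPE (with SMAPE expressed in percent); the function names, the toy values, and the handling of points where both the ground truth and the forecast are zero are assumptions made for illustration.

```python
import numpy as np

def mae(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_pred - y_true))

def rmse(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

def smape(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    # Assumption: a point where truth and forecast are both zero contributes zero error.
    ratio = np.divide(np.abs(y_pred - y_true), denom,
                      out=np.zeros_like(denom), where=denom > 0)
    return 100.0 * np.mean(ratio)

# Toy values, not taken from any dataset in this work.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]
scores = {"MAE": mae(y_true, y_pred),
          "RMSE": rmse(y_true, y_pred),
          "SMAPE": smape(y_true, y_pred)}
```

Because SMAPE is normalized by the magnitude of the values, it allows a comparison across targets with very different scales, such as production rates and pressures, which MAE and RMSE do not.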
Fig. 12. Augmentation and anomaly detection experiments on the private dataset, considering 104 experiments (4 networks, 13 wells, and 2 target variables).

Fig. 13. Augmentation and anomaly detection experiments on the UNISIM-II-M-CO dataset, considering 80 experiments (4 networks, 10 wells, and 2 target variables).

Our second experiment focuses on field response data, i.e., fluid production and wells pressure from a producing reservoir. We want to remove erroneous data that we cannot predict, such as human intervention.

We separated the anomaly removal and the data augmentation to see how they improved the forecasting results for these experiments. We used a z-score method for anomaly removal. Considering data augmentation, we created data points that correspond to intervals of 3 h in our original daily data, interpolating these data linearly.

The perturbations for data augmentation are presented in Table 1. Using our two targets, we performed all augmentations combined with our four networks on 13 producer wells of the private dataset (producers with enough data to perform the perturbations). We compared the results to select the augmentation approach that performed better in more experiments. Fig. 12(a) shows the number of experiments that have the best result with the corresponding augmentation. This figure shows that the augmentation by data points performed better in more wells than the other approaches. Moreover, Fig. 12(b) considers the same experiments and shows the influence of the use of an anomaly detector.

We also performed these experiments in the UNISIM-II-M-CO dataset, considering all its ten producer wells. Figs. 13(a) and 13(b) present these results for augmentation and anomaly detection, respectively. It is clear that, for the UNISIM-II-M-CO dataset, the removal of anomalies worsens both results (augmentation and anomaly detection). We believe this is because the interference in this benchmark model was artificially generated, following a distribution. Thus the anomaly and synthetic data are intrinsic to the time series data so that any anomaly removal would disrupt the dataset model, and interpolation would only provide noise.

The next experiments in this work consider the best results on augmentation and anomaly removal for each dataset. For experiments with the private dataset, we used augmentation by 3 h and z-score anomaly removal. For the UNISIM-II-M-CO dataset, we neither perform an augmentation nor remove anomalies. Figs. 14 and 15 present plots for three wells and two targets, DailyProdOil (daily oil production) and DailyPressureBHP (daily measure of well bottom-hole pressure), respectively. We can see how the forecast (red) performs in these plots compared to the ground-truth (blue).

5.3. Injector data

Our following experiments considered injector data as input of our forecasting approaches. In these experiments, for each producer well, we devised the TLCC to determine which injector wells are connected to the producer and the lag between them. Tables 4 and 5 show the SMAPE metric for our two datasets, considering the oil production of a well with and without correlated injector wells.

As shown in Tables 4 and 5, we do not have a definitive answer on whether it is better to use data from injector wells, especially considering the private dataset. Intuitively, we think it is better to include this information. We believe future investigations should be performed to enhance the correlation of producer and injector wells so that more precise information could be used as input to our algorithms.

Table 4
SMAPE results for 3 wells on the private dataset using GRU2 method and considering (or not) their correlated injectors. Well P1 is connected to injectors I1 with 9 days delay and I2 with 5 days delay. Well P2 has 3 connected injectors: I2, I3, and I4, with 1 day, 15 days, and 1 day delays, respectively. Producer well P3 is correlated with 6 different injector wells: I5 without delay, I6 with 2 days delay, and wells I7, I8, I9, and I10 with 1 day delay each.

| Well | Target | Without injector | With correlated injectors |
| --- | --- | --- | --- |
| P1 | DailyProdOil | 29.12 | 28.73 |
| P1 | DailyPressureBHP | 0.99 | 0.96 |
| P2 | DailyProdOil | 46.12 | 38.94 |
| P2 | DailyPressureBHP | 2.60 | 2.69 |
| P3 | DailyProdOil | 7.01 | 7.44 |
| P3 | DailyPressureBHP | 0.35 | 1.00 |

5.4. Comparing data-driven techniques

Our last round of experiments compares our methods with baselines and different off-the-shelf deep-learning networks from the literature,
Fig. 14. DailyProdOil forecasting in the private dataset. The purple strip represents the start of the test set and has the size of the output window. The red line is the forecasted
data, and the blue points are the ground-truth data. The green shadow is the maximum and minimum values obtained considering all 10 runs. (For interpretation of the references
to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 15. DailyPressureBHP forecasting in the private dataset. The purple strip represents the start of the test set and has the size of the output window. The red line is the
forecasted data, and the blue points are the ground-truth data. The green shadow is the maximum and minimum values obtained considering all 10 runs. (For interpretation of
the references to color in this figure legend, the reader is referred to the web version of this article.)
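The forecasts on the private dataset shown in Figs. 14 and 15 rely on the pre-processing selected in Section 5.2: z-score anomaly removal and linear interpolation of the daily data to 3 h intervals. A minimal sketch of both steps follows. The threshold of 3, the choice to refill removed points by interpolation, the order of the two steps, and all names are assumptions, since the text does not specify them.

```python
import numpy as np

def remove_anomalies_zscore(values, threshold=3.0):
    """Drop points whose |z-score| exceeds `threshold`, refilling them linearly."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0.0:                      # constant series: nothing to flag
        return values.copy()
    z = (values - values.mean()) / std
    keep = np.abs(z) <= threshold
    days = np.arange(len(values))
    # Assumption: flagged points are replaced by interpolating the kept neighbours.
    return np.interp(days, days[keep], values[keep])

def upsample_daily_to_3h(values):
    """Linearly interpolate a daily series onto a 3-hour grid (8 points per day)."""
    values = np.asarray(values, dtype=float)
    days = np.arange(len(values))
    fine_days = np.linspace(0, len(values) - 1, (len(values) - 1) * 8 + 1)
    return np.interp(fine_days, days, values)

# Toy daily series with one spike standing in for an operational anomaly.
daily = np.full(30, 100.0)
daily[10] = 1000.0
cleaned = remove_anomalies_zscore(daily)
augmented = upsample_daily_to_3h(cleaned)
```

With 30 daily points, the 3 h grid holds 233 points, which gives the networks roughly eight times more training samples for the same production history.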
Fig. 16. SMAPE results for the private dataset considering oil production and bottom-hole pressure. The blue bars are the networks proposed for use in this work, the red bars are the off-the-shelf methods from the literature, and the green bars are the baseline methods. All experiments were performed with augmentation and anomaly removal. Prophet* means that this network uses a retraining approach. Note that the PN-DCA method is not applicable (N/A) to perform BHP forecasting. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
both injector and producer are fed to the network, disregarding the actual delay.

Thirty days of production forecast can be used to assist short-term decisions, such as kicks or early water and gas breakthrough.
Fig. 17. SMAPE results for the UNISIM-II-M-CO dataset considering oil production and bottom-hole pressure. The blue bars are the networks proposed for use in this work, the red bars are the off-the-shelf methods from the literature, and the green bars are the baseline methods. These experiments considered the best pre-processing for the UNISIM-II-M-CO dataset (no augmentation and no anomaly removal). Prophet* means that this network uses a retraining approach. Note that the PN-DCA method is not applicable (N/A) to perform BHP forecasting. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
As mentioned before, this time window is much smaller than those usually considered in model-based approaches, whose forecasts cover more than ten years. We believe that data-driven and model-based approaches are complementary. An even better solution may be found by joining the two into a hybrid procedure, which is a topic for future research.
CRediT authorship contribution statement

Rafael de Oliveira Werneck: Conceptualization, Methodology, Software, Investigation, Writing – original draft. Raphael Prates: Software, Investigation, Writing – review & editing. Renato Moura: Software, Investigation, Writing – review & editing. Maiara Moreira Gonçalves: Data curation, Writing – review & editing. Manuel Castro: Data curation, Writing – review & editing. Aurea Soriano-Vargas: Visualization, Writing – review & editing. Pedro Ribeiro Mendes Júnior: Formal analysis, Writing – review & editing. M. Manzur Hossain: Resources, Writing – review & editing. Marcelo Ferreira Zampieri: Resources, Data curation, Writing – review & editing. Alexandre Ferreira: Conceptualization, Supervision, Writing – review & editing. Alessandra Davólio: Supervision, Writing – review & editing. Denis Schiozer: Supervision, Writing – review & editing. Anderson Rocha: Conceptualization, Project administration, Supervision, Writing – review & editing.

Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.petrol.2021.109937.

Acknowledgments

This work was conducted in association with the ongoing Project registered under ANP number 21373-6 as “Desenvolvimento de Técnicas de Aprendizado de Máquina para Análise de Dados Complexos de Produção de um Campo do Pre-Sal” (UNICAMP/Shell Brazil/ANP) funded by Shell Brazil, under the ANP R&D levy as “Compromisso de Investimentos com Pesquisa e Desenvolvimento”. The authors also thank Schlumberger and CMG for software licenses and Vitor Ferreira for helping with the PN-DCA method.

Appendix A. Supplementary data

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.petrol.2021.109937.
References Kubota, L., Reinert, D., 2019. Machine learning forecasts oil rate in mature onshore field
jointly driven by water and steam injection. In: SPE Annual Technical Conference
Aizenberg, I., Sheremetov, L., Villa-Vargas, L., noz, J.M.-M., 2016. Multilayer neural and Exhibition. Day 2 Tue, October 01, 2019, pp. 1–18. http://dx.doi.org/10.2118/
network with multi-valued neurons in time series forecasting of oil production. 196152-MS.
Neurocomputing 175, 980–989. http://dx.doi.org/10.1016/j.neucom.2015.06.092. Lacayo, J., Lee, J., 2014. Pressure normalization of production rates improves fore-
Al-Shabandar, R., Jaddoa, A., Liatsis, P., Hussain, A.J., 2021. A deep gated recurrent casting results. In: SPE Unconventional Resources Conference / Gas Technology
neural network for petroleum production forecasting. Mach. Learn. Appl. 3, Symposium, Vol. Day 1 Tue, April 01, 2014. http://dx.doi.org/10.2118/168974-
100013. http://dx.doi.org/10.1016/j.mlwa.2020.100013. MS.
Amirian, E., Fedutenko, E., Yang, C., Chen, Z., Nghiem, L., 2018. Artificial neural Li, X., Chan, C., Nguyen, H., 2013. Application of the neural decision tree approach
network modeling and forecasting of oil reservoir performance. In: Applications of for prediction of petroleum production. J. Pet. Sci. Eng. 104, 11–16. http://dx.doi.
Data Management and Analysis : Case Studies in Social Networks and beyond. org/10.1016/j.petrol.2013.03.018.
Springer International Publishing, Cham, pp. 43–67. http://dx.doi.org/10.1007/ Liu, W., Liu, W.D., Gu, J., 2020. Forecasting oil production using ensemble empirical
978-3-319-95810-1_5. model decomposition based long short-term memory neural network. J. Pet. Sci.
Arps, J., 1945. Analysis of decline curves. Trans. AIME 160 (01), 228–247. http: Eng. 189, 107013. http://dx.doi.org/10.1016/j.petrol.2020.107013.
//dx.doi.org/10.2118/945228-G. Makridakis, S., Hibon, M., 2000. The M3-competition: results, conclusions and im-
Athanasopoulos, G., Hyndman, R.J., Song, H., Wu, D.C., 2011. The tourism forecast- plications. Int. J. Forecast. 16 (4), 451–476. http://dx.doi.org/10.1016/S0169-
ing competition. Int. J. Forecast. 27 (3), 822–844. http://dx.doi.org/10.1016/j. 2070(00)00057-1.
ijforecast.2010.04.009. Makridakis, S., Spiliotis, E., Assimakopoulos, V., 2018. The M4 competition: Results,
Bedi, J., Toshniwal, D., 2019. Deep learning framework to forecast electricity demand. findings, conclusion and way forward. Int. J. Forecast. 34 (4), 802–808. http:
Appl. Energy 238, 1312–1326. http://dx.doi.org/10.1016/j.apenergy.2019.01.113. //dx.doi.org/10.1016/j.ijforecast.2018.06.001.
Belyadi, H., Fathi, E., Belyadi, F., 2019. Chapter seventeen - decline curve analysis. In: Menke, W., Menke, J., 2016. Environmental Data Analysis with Matlab. Academic Press.
Belyadi, H., Fathi, E., Belyadi, F. (Eds.), Hydraulic Fracturing in Unconventional van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A.,
Reservoirs (Second Edition), second ed. Gulf Professional Publishing, pp. 311–340. Kalchbrenner, N., Senior, A., Kavukcuoglu, K., 2016. Wavenet: A generative model
http://dx.doi.org/10.1016/B978-0-12-817665-8.00017-5. for raw audio. arXiv:1609.03499.
Bianchi, F.M., Scardapane, S., Løkse, S., Jenssen, R., 2021. Reservoir computing Oreshkin, B.N., Carpov, D., Chapados, N., Bengio, Y., 2019. N-BEATS: Neural basis
approaches for representation and classification of multivariate time series. IEEE expansion analysis for interpretable time series forecasting. arXiv preprint arXiv:
Trans. Neural Netw. Learn. Syst. 32 (5), 2169–2179. http://dx.doi.org/10.1109/ 1905.10437.
TNNLS.2020.3001377. Pan, Y., Bi, R., Zhou, P., Deng, L., Lee, J., 2019. An effective physics-based deep learn-
Bontempi, G., 2008. Long term time series prediction with multi-input multi-output ing model for enhancing production surveillance and analysis in unconventional
local learning. In: 2nd European Symposium on Time Series Prediction, pp. reservoirs. In: SPE/AAPG/SEG Unconventional Resources Technology Conference.
145–154. OnePetro, http://dx.doi.org/10.15530/urtec-2019-145.
15
R.d.O. Werneck et al. Journal of Petroleum Science and Engineering 210 (2022) 109937
Pao, H.T., 2007. Forecasting electricity market pricing using artificial neural networks. Tadjer, A., Hong, A., Bratvold, R.B., 2021. Machine learning based decline
Energy Convers. Manage. 48 (3), 907–912. http://dx.doi.org/10.1016/j.enconman. curve analysis for short-term oil production forecast. Energy Explor. Exploit.
2006.08.016. 01445987211011784. http://dx.doi.org/10.1177/01445987211011784.
Razak, S.M., Cornelio, J., Cho, Y., Liu, H.-H., Vaidya, R., Jafarpour, B., 2021. Transfer Taylor, S.J., Letham, B., 2018. Forecasting at scale. Amer. Statist. 72 (1), 37–45.
learning with recurrent neural networks for long-term production forecasting in un- Tian, C., Horne, R.N., 2016. Inferring interwell connectivity using production data. In:
conventional reservoirs. In: SPE/AAPG/SEG Unconventional Resources Technology SPE Annual Technical Conference and Exhibition.
Conference. OnePetro, http://dx.doi.org/10.15530/urtec-2021-5687. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, u.,
Salinas, D., Flunkert, V., Gasthaus, J., Januschowski, T., 2020. Deepar: Probabilis- Polosukhin, I., 2017. Attention is all you need. In: Proceedings of the 31st
tic forecasting with autoregressive recurrent networks. Int. J. Forecast. 36 (3), International Conference on Neural Information Processing Systems. In: NIPS’17,
1181–1191. Curran Associates Inc., Red Hook, NY, USA, pp. 6000–6010.
Shen, C., 2015. Analysis of detrended time-lagged cross-correlation between two Yuan, Z., Huang, H., Jiang, Y., Li, J., 2021. Hybrid deep neural networks for reservoir
nonstationary time series. Phys. Lett. A 379 (7), 680–687. production prediction. J. Pet. Sci. Eng. 197, 108111. http://dx.doi.org/10.1016/j.
Song, X., Liu, Y., Xue, L., Wang, J., Zhang, J., Wang, J., Jiang, L., Cheng, Z., 2020. petrol.2020.108111.
Time-series well performance prediction based on long short-term memory (LSTM) Zhan, C., Sankaran, S., LeMoine, V., Graybill, J., Mey, D.O.S., 2019. Application
neural network model. J. Pet. Sci. Eng. 186, 106682. http://dx.doi.org/10.1016/j. of machine learning for production forecasting for unconventional resources. In:
petrol.2019.106682. Unconventional Resources Technology Conference, Denver, Colorado, 22–24 July
Sun, J., Ma, X., Kazi, M., 2018. Comparison of decline curve analysis DCA with 2019. pp. 1945–1954. http://dx.doi.org/10.15530/urtec-2019-47.
recursive neural networks RNN for production forecast of multiple wells. In: SPE Zhong, Z., Sun, A.Y., Wang, Y., Ren, B., 2020. Predicting field production rates for
Western Regional Meeting. Day 4 Wed, April 25, 2018, http://dx.doi.org/10.2118/ waterflooding using a machine learning-based proxy model. J. Pet. Sci. Eng. 194,
190104-MS. 107574. http://dx.doi.org/10.1016/j.petrol.2020.107574.
16