Use of Neural Networks in Log's Data Processing: Prediction and Rebuilding of Lithologic Facies
Abstract
When a log is missing in a drilling hole, geologists hope to deduce it from the other logs available in another part of the hole or in a neighbouring hole, in order to define the lithologic facies of the hole. This paper presents a neural network method to predict the missing log's measurements from the other available logs. The method, based on the Multi-Layer Perceptron (MLP), acts as a nonlinear regression method for the prediction task and as a probability density approximation for the outlier rejection task. The results obtained when the method is applied to actual log data for prediction and rejection are presented in a separate section. The last section is dedicated to an unsupervised neural method used to reconstruct the lithologic facies of the hole under study. This last experiment allows us to validate and interpret the results of the proposed methods.
Introduction
In order to investigate the subsoil's composition, geologists lower sensors into drilling holes to measure physical properties of the rocks. The results of these measurements are called logs. The measurement quality is highly disturbed by technical difficulties which may occur during drilling, so a log may be missing in part of the well. The problem is, knowing p logs, how to deduce the (p+1)th, missing, log. In the following, x = (x1, x2, ..., xp) is an observation of p log values taken at a given depth level, and y is the value of the (p+1)th log we want to determine. In such a case, geologists usually use the correlation existing between the available logs and the missing one to deduce the value they need [references by Louis Briqueu]. We propose neural network methods to help geologists determine the correspondence between several logs. In the following we use multilayer perceptrons (MLPs), which allow nonlinear function approximation. The problem is thus to estimate the MLP's parameters, using a learning set App = {(x1,y1), (x2,y2), ..., (xN,yN)}, where xk =
Petrophysics meets Geophysics - Paris, France, 6 - 8 November 2000
(x1k, x2k, ..., xpk) represents a measurement of the p logs and yk the measurement of the (p+1)th one. The information contained in this learning set corresponds to a particular situation, which depends on geology, drilling tools and drilling conditions. The learning set implicitly determines the domain where the MLP's predictions are valid. In practice, logs may present outlier values because of technical difficulties, or data corresponding to geological situations which are not represented in the learning set. Such data must therefore be discarded when the MLP is used for prediction. A major problem is thus to detect the measurements x which are not in adequacy with the learning set. In this paper, we propose a neural network method which computes and interprets the probability distribution of the random variable Y (the (p+1)th, missing, log), conditioned on an observation x = (x1, x2, ..., xp). The values provided by the neural network allow us to build an accurate rejection method.
In the following we present the neural network method used for prediction and rejection, and the results obtained when it is applied to actual log data. In the last section we use an unsupervised neural method to reconstruct the lithologic facies of the hole under study. This last experiment allows us to validate and interpret the results of the proposed methods.
C = Σk=1..N (yk − F(xk, W))²
We know that minimizing this function implies that F(x,W) ≈ E(Y|x), where E(Y|x) is the statistical mean of the missing log y, knowing the p other observed logs x [Bishop 1995]. Treating the prediction of the missing log as a regression problem supposes that the distribution of the random variable Y|x is unimodal (a normal distribution). So, before the model F(x,W) can be used to predict the missing log, the unimodality of the distribution of Y|x must be checked. It is this idea that we develop in the next paragraph, with the aim of detecting the values which are not in adequacy with the learning set.
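As a sketch of this regression setting (not the authors' code; the synthetic data, network size and learning rate below are illustrative assumptions), a one-hidden-layer MLP F(x, W) can be trained by gradient descent on the quadratic cost C:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the log data: p = 3 input logs and one target log,
# linked by a smooth nonlinear map (hypothetical, for illustration only).
X = rng.uniform(-1, 1, size=(500, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] * X[:, 2]

# One-hidden-layer MLP F(x, W) with tanh hidden units.
H = 16
W1 = rng.normal(0, 0.5, (3, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)

def forward(X):
    A = np.tanh(X @ W1 + b1)            # hidden activations
    return A, (A @ W2 + b2).ravel()     # prediction F(x, W)

def cost(f):
    return np.sum((y - f) ** 2)         # C = sum_k (y_k - F(x_k, W))^2

c_initial = cost(forward(X)[1])

lr = 0.1
for _ in range(500):
    A, f = forward(X)
    err = 2 * (f - y)[:, None] / len(X)   # dC/df, averaged over the set
    gW2 = A.T @ err; gb2 = err.sum(0)
    dA = err @ W2.T * (1 - A ** 2)        # backpropagation through tanh
    gW1 = X.T @ dA; gb1 = dA.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

c_final = cost(forward(X)[1])
```

After training, F(x, W) approximates the conditional mean E(Y|x), which is what makes the regression answer meaningful only when Y|x is unimodal.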
The NPHI values we dispose of range from 0.023 to 0.823 % porosity. We subdivide the range of NPHI values into 16 regular discrete intervals, and we consider an MLP classifier with p = 3 input neurons (PEF, RHOB and GR) and K = 16 output neurons. Each of these 16 neurons represents one of the 16 NPHI intervals. The activation of the output layer corresponds to the histogram of the conditional NPHI random variable, knowing the three measurements (PEF, RHOB and GR). Figure 2 shows two possible situations. Figure 2-a shows the curve we get when the distribution is unimodal, which is the general case: a single activation peak appears and the NPHI value is clearly defined. Figure 2-b shows the curve we get when the distribution is multimodal: several activation peaks point out that the NPHI value may belong to several different intervals, so that several values of the missing NPHI log can be proposed from the three measurements (PEF, RHOB and GR). Such input logs have to be eliminated from the MLP prediction process.
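The discretization of the target log into K intervals and the K-output classifier can be sketched as follows (an illustrative sketch with synthetic data, not the paper's actual network: for brevity a linear softmax classifier stands in for the MLP, which does not change the interpretation of the K outputs as a conditional histogram):

```python
import numpy as np

rng = np.random.default_rng(1)
K = 16  # number of NPHI intervals, as in the paper

# Synthetic stand-in for (PEF, RHOB, GR) -> NPHI: the target is a noisy
# function of the inputs (hypothetical data, for illustration only).
X = rng.uniform(0, 1, size=(2000, 3))
y_cont = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(0, 0.02, 2000)

# Discretize the continuous target into K regular intervals.
edges = np.linspace(y_cont.min(), y_cont.max(), K + 1)
bins = np.clip(np.digitize(y_cont, edges) - 1, 0, K - 1)
T = np.eye(K)[bins]                      # one-hot targets

# Softmax classifier: its K outputs approximate the conditional
# histogram P(interval_j | x), i.e. the curve plotted in Figure 2.
W = np.zeros((3, K)); b = np.zeros(K)

def predict(X):
    z = X @ W + b
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

lr = 0.5
for _ in range(2000):
    P = predict(X)
    G = (P - T) / len(X)                 # cross-entropy gradient
    W -= lr * X.T @ G
    b -= lr * G.sum(axis=0)

P = predict(X)
```

Each row of P sums to 1 and plays the role of the output-layer activation curve of Figure 2: a single dominant peak corresponds to a well-defined NPHI value, several comparable peaks to an ambiguous one.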
Figure 2: The x-axis corresponds to the 16 intervals of NPHI. The y-axis corresponds to the activation value of the neuron associated with each interval. D is the observed NPHI value; R is the NPHI value computed by the MLP predictor. (a): The output curve produced when the probability distribution of Y|x is unimodal. (b): The output curve produced when this distribution is multimodal.
In order to decide whether the conditional random variable Y|x is unimodal or not, we analyse the difference between two peaks of the output curve of the MLP classifier. If this difference is less than a fixed threshold value, we decide that the distribution is multimodal and we eliminate the current vector of log values x from the prediction process. In figure 4, the fifth column shows the measured NPHI and the sixth column the NPHI computed by the MLP predictor. We did not draw the predicted value when the MLP classifier indicates a multimodal distribution. We can see, near 650 meters, that the biggest error made by the MLP predictor is thrown out. We have plotted together the predicted NPHI curve (column 6) and the diametric log (caliper) (column 1). We know that caliper variations reflect deformations of the drilling hole. We find that rejections coincide with caliper disturbances, and therefore with the hole's deformations. The maximum error is located just in front of a caliper break, characteristic of a considerable excavation. In such conditions, logging tools cannot give calibrated measurements, and the values taken in this place are unusable. As the input data has no coherence, it is normal that the neural regression provides an aberrant answer. To better understand what happens, we wished to know the subsoil's composition in the Marcoule sector. This is the main goal of the next section, where we train a topological map on the same MAR203 data; as we have four distinct measurements at each level (PEF, RHOB, GR, NPHI), we use all of them in order to get the most accurate reconstruction.
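The peak-comparison rejection rule described above can be sketched as follows; the threshold value and the simple local-maximum test are illustrative assumptions, not the exact settings used in the paper:

```python
import numpy as np

def is_multimodal(h, threshold=0.2):
    """Reject an observation x when the conditional histogram h (the K
    output activations of the MLP classifier) shows two peaks of
    comparable height. The threshold is a hypothetical choice."""
    h = np.asarray(h, dtype=float)
    # Local maxima: bins strictly higher than both neighbours
    # (the two ends of the curve are padded with -inf).
    padded = np.concatenate(([-np.inf], h, [-np.inf]))
    peaks = sorted(
        (padded[i] for i in range(1, len(padded) - 1)
         if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]),
        reverse=True)
    if len(peaks) < 2:
        return False                 # a single peak: unimodal, keep x
    # Two comparable peaks: the distribution is considered multimodal
    # and x is eliminated from the prediction process.
    return (peaks[0] - peaks[1]) < threshold
```

A histogram with one dominant peak is kept for prediction; a histogram whose two largest peaks differ by less than the threshold is rejected, producing the gaps visible in column 6 of figure 4.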
To represent the facies of the drilling hole MAR402, we used a Self-Organizing Map (SOM). The map is a discrete set C of formal neurons. Each neuron of the map is associated with a referent vector in the data space. The map has a discrete topology defined by an undirected graph, usually a regular grid in one or two dimensions. For each pair of neurons (c,r), the distance δ(c,r) is defined as the length of the shortest path between c and r on the graph. The SOM algorithm makes use of a neighborhood system whose size, controlled by the parameter T, decreases as learning proceeds. At the end of the learning algorithm, two neighboring neurons on the map have close referent vectors in the data space. Each referent vector defines a particular subset of the data space (usually its Voronoï domain). So the data space is divided into several subsets, each one represented by a particular neuron of the map; two neighboring neurons on the map correspond to two close subsets in the data space [Kohonen 1994], [Yacoub et al. 2000]. This partition of the data space makes it possible to assign each vector of the data space to a particular neuron of the map. A generalization of the SOM model, the Probabilistic Self-Organizing Map (PRSOM) [Anouar et al. 1997], uses a probabilistic formalism. PRSOM is a probabilistic model which associates with each neuron c of the map a spherical Gaussian density function fc; it approximates the data's density distribution using a mixture of normal distributions. This algorithm improves the map's spreading over the data space.
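A minimal sketch of the SOM learning loop just described (standard Kohonen updates on synthetic data, not the PRSOM implementation used here; the map size, decay schedules and data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data standing in for the 4 logs: two "rock classes" drawn
# as Gaussian clusters in a 4-dimensional space (illustrative only).
data = np.vstack([
    rng.normal(0.2, 0.05, size=(500, 4)),
    rng.normal(0.8, 0.05, size=(500, 4)),
])

# A small rectangular map: each neuron c has grid coordinates and a
# referent vector w_c in the data space.
rows, cols = 5, 4
grid = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
refs = rng.uniform(0, 1, size=(rows * cols, 4))

def quantization_error(refs):
    d2 = ((data[:, None, :] - refs[None]) ** 2).sum(-1)
    return np.mean(d2.min(axis=1))      # mean distance to nearest referent

qe_before = quantization_error(refs)

n_steps = 4000
for t in range(n_steps):
    x = data[rng.integers(len(data))]
    bmu = np.argmin(((refs - x) ** 2).sum(axis=1))   # best-matching neuron
    T = 2.0 * (1 - t / n_steps) + 0.3                # neighbourhood size shrinks
    d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)       # squared map distance
    h = np.exp(-d2 / (2 * T ** 2))                   # neighbourhood kernel
    lr = 0.5 * (1 - t / n_steps) + 0.01              # learning rate decays
    refs += lr * h[:, None] * (x - refs)             # move referents toward x

qe_after = quantization_error(refs)
```

As learning proceeds, the referent vectors spread over the data and neighboring neurons on the grid end up with close referents, which is the topology-preservation property exploited in the next paragraph for labeling.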
The learning algorithms of SOM and PRSOM are unsupervised algorithms, which adapt the map to a set of learning samples. These algorithms realize a partition of the data space, with each subset associated with one neuron of the map. If we label each neuron of the map with a particular rock class, the map becomes a classifier. The labeling may be done by an expert or by automatic methods, such as hierarchical classification; for more details see [Yacoub et al. 2000], [Frayssinet 2000], [Gottlib-Zeh et al. 1999].
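The labeling step and the resulting classifier can be sketched as follows (a sketch only: function names are illustrative, and majority voting over cored samples stands in for the expert or hierarchical labeling discussed above):

```python
import numpy as np
from collections import Counter

def label_map(refs, X_train, rock_labels):
    """Label each neuron of a trained map with the majority rock class
    among the (cored) training samples assigned to it."""
    assign = np.argmin(((X_train[:, None, :] - refs[None]) ** 2).sum(-1),
                       axis=1)
    neuron_labels = {}
    for c in range(len(refs)):
        hits = [rock_labels[k] for k in np.flatnonzero(assign == c)]
        if hits:                         # neurons with no sample stay unlabeled
            neuron_labels[c] = Counter(hits).most_common(1)[0][0]
    return neuron_labels

def classify(refs, neuron_labels, X):
    """Use the labeled map as a classifier for new data (e.g. a new hole)."""
    assign = np.argmin(((X[:, None, :] - refs[None]) ** 2).sum(-1), axis=1)
    return [neuron_labels.get(c) for c in assign]

# Tiny illustrative example with hypothetical referents and core labels:
refs = np.array([[0.0, 0.0], [1.0, 1.0]])
cores = np.array([[0.1, 0.0], [0.9, 1.0], [0.0, 0.1]])
neuron_labels = label_map(refs, cores, ['limestone', 'lignite', 'limestone'])
facies = classify(refs, neuron_labels,
                  np.array([[0.05, 0.05], [0.95, 0.95]]))
```

Once each neuron carries a rock class, classifying a depth level of a new hole reduces to finding its best-matching neuron and reading off that neuron's label.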
We trained a rectangular map of size 13x7 with the PRSOM algorithm, using the whole set of logging measurements we dispose of in the drilling hole MAR203. We take advantage of the knowledge provided by the coring of this hole to label the map. Figure 3 shows the topological map we get. The graphics were made with the Kohonen package; following its conventions, the grey level between two neighboring neurons represents the distance between them. We can see that the rock classes are represented by neurons gathered together. We note the presence of a gradient between the different sediments: the sediments which come from the mainland, such as sandstone, stand on the left side of the map, and the sediments which come from lakes or oceans, such as limestone, stand on the right side. On the other hand, lignite is dispersed over different places on the map; it seems to be difficult to recognize.
The labeled map, constructed with MAR203's data, is used as an automatic classification tool for MAR402's data. In this way, we draw MAR402's facies (column 7, figure 4). We notice that the biggest MLP prediction error is made in lignite. If we consider the whole facies built by the labeled map, we note that most of the ambiguous rejections are concomitant with lignite. We can thus establish a strong correspondence between caliper disturbances, ambiguous rejections and the presence of lignite. Lignite is a delicate rock which does not withstand drilling tools: they may produce big excavations in lignite banks, and then the log measurements are unusable. That is why lignite's position is not well defined on the map.
Figure 3: The topological map established with the values of 4 logs (PEF, RHOB, GR and NPHI) measured in the drilling hole MAR203.
L = limestone, M = marls, gS = glauconitic sandstone, Sha = shales, Sha1 = others shales, Si =
silts, Si1 = others silts, cS = coarse grained sandstone, S = strict sandstone, SL = sandy
limestone, B = sandy breccia, Lig = lignite.
Conclusion
The combined use of several kinds of neural networks allowed us to build several tools: reconstruction of missing data, assessment of the confidence in the results, probabilistic classification of the lithologic facies, detection of measurements taken in degraded conditions, and detection of input data outside the learning space.
Some points may be improved, such as the selection of the logs, the computation of probabilities before the labeling process, and the modeling of the tools' convolution.
The results described here open the field of neural research for log data: permeability studies, generalization of neural networks to a whole drilling field, construction of a fictitious drilling hole made of the best log measurements of the field, and, why not, construction of a universal classification tool.
Acknowledgments
This work was supported by Elf Production (Pau, France). The simulations were run using SNNS, provided by Stuttgart University, the SN2 software provided by the Neuristic Company, and the Spoutnik software of the LODYC, written by Philippe Daigremont. Figure 3 was made using the Kohonen software. This application was made in collaboration with Philippe Rabiller, geological adviser at ELF Production, and Stephanie Gottlib-Zeh from ELF Exploration.
References
Anouar F., Badran F. and Thiria S. (1997): Self-Organized Map, a probabilistic approach. Proceedings of the Workshop on Self-Organized Maps, Helsinki University of Technology, Espoo, Finland, June 4-6.
Bishop C. (1995): Neural Networks for Pattern Recognition. Oxford University Press.
Frayssinet D. (2000): Utilisation des réseaux de neurones en traitement des données de diagraphies: prédiction et reconstitution de faciès lithologiques. Mémoire d'ingénieur C.N.A.M.
Gottlib-Zeh S., Briqueu L. and Veillerette A. (1999): Indexed Self-Organization Map: a new
calibration system for a geological interpretation of logs. In Proceedings of IAMG'99, VI, pp
183-188,. Lippard, Naess and Sinding-Larsen Eds Norway.
Kohonen T. (1984): Self-Organization and Associative Memory. Springer Series in Information Sciences, 8, Springer Verlag, Berlin (2nd ed. 1988).
Kohonen T. (1995): SOM-PAK. The Self-Organizing Map Program Package. ftp cochlea.hut.fi
(or 130.233.168.48).
Mejia C.: Architectures Neuronales pour l'Approximation des Fonctions de Transfert: application à la télédétection. Thèse de doctorat, Université de Paris-Sud, centre d'Orsay.
Rabaute A. (1999): Obtenir une représentation en continu de la lithologie et de la minéralogie. Exemples d'applications du traitement statistique de données de diagraphie aux structures sédimentaires en régime de convergence de plaques (Leg ODP 134, 156 et 160). Thèse de doctorat, Université de Montpellier II. Mémoires géosciences Montpellier.
Thiria S., Mejia C. and Badran F. (1993): A neural network approach for modeling nonlinear transfer functions: Application for wind retrieval from spaceborne scatterometer data. JGR, vol. 98, No. C12, pages 22,827-22,841, December 15, 1993.
Thiria S., Lechevallier Y., Gascuel O. and Canu S. (1997): Statistiques et méthodes neuronales. Dunod.
Actes des journées scientifiques CNRS/ANDRA, Bagnols-sur-Cèze, 20 et 21 octobre 1997. Étude du Gard Rhodanien. Éditions EDP Sciences.
Figure 4: We show the part of MAR402 drilled between 600 and 800 m. Column 1 shows the caliper. Columns 2, 3, 4 and 5 show PEF, RHOB, GR and NPHI. In column 6, the predicted NPHI contains gaps, which correspond to ambiguous values thrown out by the outlier detection. Column 7 shows the lithologic facies of MAR402. Around 650 meters, in front of the excavation revealed by the caliper (column 1), there is a big gap in the predicted NPHI (column 6), concomitant with a bank of lignite (column 7).