RESEARCH ARTICLE
Abstract
Background: Explaining how brain processing can be so fast remains an open problem (van Hemmen JL, Sejnowski T., 2004). Thus, the analysis of neural transmission processes (Shannon CE, Weaver W., 1963) essentially focuses on the search for effective encoding and decoding schemes. According to the Shannon fundamental theorem, mutual information plays a crucial role in characterizing the efficiency of communication channels. It is well known that this efficiency is determined by the channel capacity, which is defined as the maximal mutual information between the input and output signals. On the other hand, intuitively speaking, when input and output signals are more correlated, the transmission should be more efficient. A natural question therefore arises about the relation between mutual information and correlation. We analyze the relation between these quantities using the binary representation of signals, which is the most common approach taken in studying the neuronal processes of the brain.
Results: We present binary communication channels for which mutual information and correlation coefficients
behave differently both quantitatively and qualitatively. Despite this difference in behavior, we show that the
noncorrelation of binary signals implies their independence, in contrast to the case for general types of signals.
Conclusions: Our research shows that mutual information cannot be replaced by sheer correlations. Our results indicate that neuronal encoding has a more complicated nature, which cannot be captured by straightforward correlations between input and output signals, since the mutual information takes into account the structure and patterns of the signals.
Keywords: Shannon information, Communication channel, Entropy, Mutual information, Correlation, Neuronal
encoding
Background
A huge effort has been undertaken to analyze neuronal coding, its high efficiency, and the mechanisms governing it [1]. Claude Shannon published his famous paper on communication theory in 1948 [2,3]. In that paper, he formulated in a rigorous mathematical way the intuitive concepts concerning the transmission of information in communication channels. The occurrences of the input symbols transmitted via the channel and of the output symbols are described by random variables X (input) and Y (output). An important task is the determination of an efficient decoding scheme; i.e., a procedure that allows a decision to be made about the sequence (message) input to the channel from the output sequence of symbols. This is the essence of the Shannon fundamental theorem.
*Correspondence: jszczepa@ippt.pan.pl
Institute of Fundamental Technological Research, Polish Academy of Sciences,
Pawinskiego 5B, Warsaw, PL
Methods
The communication channel is a device that acts on the input to produce the output [3,17,21]. In mathematical language, the communication channel is defined as a matrix of conditional probabilities describing the transitions between input and output symbols, possibly depending on the internal structure of the channel. In the neuronal communication systems of the brain, information is transmitted by means of small electric currents, and the timing of the action potentials (mV), known in the literature as a spike train [1], plays a crucial role. Spike trains can be encoded in many ways. The most common encoding proposed in the literature is binary encoding, which is the most effective and natural method [11,22-26]. It is physically justified that spike trains, as observed, are detected with some limited time resolution Δt, so that in each time slice (bin) a spike is either present or absent. If we think of a spike as representing a "1" and no spike as representing a "0", then, looking at a time interval of length T, each possible spike train is equivalent to a binary number with T/Δt digits. In [26] it was shown that transient responses in auditory cortex can be described as a binary process, rather than as a highly variable Poisson process.
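As a small illustration of this binning, the following sketch (ours, not from the original article) converts a list of spike times into a binary word for a given time resolution; spike times are assumed to be in seconds.

```python
import numpy as np

def bin_spike_train(spike_times, T, dt):
    """Binary word for a window of length T at resolution dt:
    a bin gets 1 if at least one spike falls into it, otherwise 0."""
    n_bins = int(round(T / dt))
    word = np.zeros(n_bins, dtype=int)
    idx = np.floor(np.asarray(spike_times) / dt).astype(int)
    word[idx[(idx >= 0) & (idx < n_bins)]] = 1
    return word

# A 100 ms window with 5 ms bins yields a 20-digit binary number.
print(bin_spike_train([0.003, 0.012, 0.040, 0.041, 0.087], T=0.1, dt=0.005))
```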
Thus, in this paper, we analyze binary information sources
and binary channels [25]. Such channels are described by
a 2 × 2 matrix:

C = \begin{pmatrix} p_{0|0} & p_{0|1} \\ p_{1|0} & p_{1|1} \end{pmatrix} ,   (1)

where p_{j|i} denotes the conditional probability p(Y = j|X = i) of receiving the symbol j given that the symbol i was sent, so that p_{0|0} + p_{1|0} = 1 and p_{0|1} + p_{1|1} = 1. The joint probabilities of the input and output symbols are

p_{ji} := p(X = i ∧ Y = j) = p(Y = j|X = i)\, p(X = i)  for i, j = 0, 1 ,   (2)

and the marginal distributions of the input and output are

p^X_i = \sum_{j=0}^{1} p_{ji}  for i = 0, 1 ,  and  p^Y_j = \sum_{i=0}^{1} p_{ji}  for j = 0, 1 .   (3)
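To make formulas (1)-(3) concrete, the following sketch (with purely illustrative numbers) builds the joint distribution p_{ji} and the marginals p^X_i, p^Y_j from a channel matrix C and an input distribution.

```python
import numpy as np

# Channel matrix C[j, i] = p(Y = j | X = i); each column sums to 1, cf. (1).
C = np.array([[0.9, 0.2],
              [0.1, 0.8]])
pX = np.array([0.7, 0.3])       # input distribution p(X = i)

P = C * pX                      # joint p_ji = p(Y = j | X = i) p(X = i), cf. (2)
pY = P.sum(axis=1)              # output marginal p^Y_j, cf. (3)
pX_rec = P.sum(axis=0)          # recovers the input marginal p^X_i

print(P)
print(pY, pX_rec)
```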
The quantities pX1 and pY1 can be interpreted as the firing rates of the input and output spike trains. We will use these probability distributions to calculate the mutual information (between input and output signals), which is expressed in terms of the entropies of the input itself, of the output itself, and of the joint probability of input and output, where the entropy of, e.g., the input is

H(X) := -\sum_{i \in Is} p(X = i) \log p(X = i) .   (4)

In the following, we consider two random variables X (the input signal to the channel) and Y (the output from the channel), both assuming only the two values 0 and 1 and formally both defined on the same probability space. It is well known that the correlation coefficient for any independent random variables X and Y is zero [14], but in general it is not true that ρ(X, Y) = 0 implies the independence of the random variables. However, for our specific random variables X and Y, which are of binary type, the most common in communication systems, we show the equivalence of independence and noncorrelation (see Appendix). The basic idea of introducing the concept of mutual information is to determine the reduction of uncertainty (measured by entropy) of the random variable X provided that we know the values of the discrete random variable Y. The mutual information (MI) is defined as
MI(X, Y) := H(X) - H(X|Y) = H(Y) - H(Y|X) ,   (5)

H(Y|X) = \sum_{i \in Is} p(X = i)\, H(Y|X = i) ,   (6)

where

H(Y|X = i) := -\sum_{j \in Os} p(Y = j|X = i) \log p(Y = j|X = i) ,   (7)

Is and Os are, in general, the sets of input and output symbols, p(X = i) and p(Y = j) are the probability distributions of the random variables X and Y, and p(X = i ∧ Y = j) is the joint probability distribution of X and Y. Estimation of mutual information requires knowledge of the probability distributions, which may be easily estimated for two-dimensional binary distributions, but in real applications it poses multiple problems [30]. Since, in practice, the knowledge of the probability distributions is often restricted, more advanced tools must be applied, such as effective entropy estimators [24,30-33].
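As a minimal illustration of the plug-in approach for a fully known two-dimensional binary distribution (not one of the advanced estimators cited in [24,30-33]), the sketch below computes MI as H(X) + H(Y) - H(X, Y); the joint matrix reuses the illustrative numbers from the previous sketch.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits; zero-probability terms are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(P):
    """MI(X, Y) = H(X) + H(Y) - H(X, Y) for P[j, i] = p(X = i, Y = j)."""
    return entropy(P.sum(axis=0)) + entropy(P.sum(axis=1)) - entropy(P)

P = np.array([[0.63, 0.06],
              [0.07, 0.24]])    # illustrative joint distribution
print(mutual_information(P))
```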
The relative mutual information RMI(X, Y ) [34]
between random variables X and Y is defined as the ratio
of MI(X, Y ) and the average of information transmitted
by variables X and Y :
RMI(X, Y) := \frac{MI(X, Y)}{\left[ H(X) + H(Y) \right]/2} ,   (8)

which, written in terms of the probability distributions, reads

RMI(X, Y) = \frac{ -\sum_{i=0}^{1} p^X_i \log p^X_i - \sum_{j=0}^{1} p^Y_j \log p^Y_j + \sum_{i,j=0}^{1} p_{ji} \log p_{ji} }{ \left[ -\sum_{i=0}^{1} p^X_i \log p^X_i - \sum_{j=0}^{1} p^Y_j \log p^Y_j \right] / 2 } .   (9)
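A direct transcription of (8)-(9) as a sketch, with the same illustrative joint matrix as above; the helper names are ours.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def relative_mutual_information(P):
    """RMI(X, Y) = MI(X, Y) / ((H(X) + H(Y)) / 2), cf. (8)-(9)."""
    hx, hy = entropy(P.sum(axis=0)), entropy(P.sum(axis=1))
    return (hx + hy - entropy(P)) / ((hx + hy) / 2.0)

P = np.array([[0.63, 0.06],
              [0.07, 0.24]])
print(relative_mutual_information(P))
```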
The standard definition of the Pearson correlation coefficient ρ(X, Y) of random variables X and Y is

\rho(X, Y) := \frac{E\left[ (X - EX)(Y - EY) \right]}{\sqrt{V(X)\, V(Y)}}   (10)

= \frac{E(XY) - EX \cdot EY}{\sqrt{E\left[ (X - EX)^2 \right] E\left[ (Y - EY)^2 \right]}} .   (11)
It follows that the Pearson correlation coefficient ρ(X, Y) is by no means a general measure of dependence between two random variables X and Y. ρ(X, Y) is connected with the linear dependence of X and Y. That is, the well-known theorem [15] states that the value of this coefficient is always between -1 and 1, and that it assumes -1 or 1 if and only if there exists a linear relation between X and Y.
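For 0/1-valued variables the moments in (10)-(11) reduce to probabilities, E(XY) = p11, EX = pX1 and EY = pY1, so ρ can be evaluated directly from the joint matrix; below is a sketch with illustrative numbers.

```python
import numpy as np

def pearson_binary(P):
    """Pearson rho for 0/1 variables, with P[j, i] = p(X = i, Y = j), cf. (10)-(11)."""
    pX1 = P[:, 1].sum()          # EX = p(X = 1)
    pY1 = P[1, :].sum()          # EY = p(Y = 1)
    cov = P[1, 1] - pX1 * pY1    # E(XY) - EX * EY
    return cov / np.sqrt(pX1 * (1 - pX1) * pY1 * (1 - pY1))

print(pearson_binary(np.array([[0.63, 0.06],
                               [0.07, 0.24]])))
```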
The essence of correlation, when we simultaneously describe the input to and the output from neurons, may be expressed as the difference between the probabilities of coincident and independent spiking, relative to independent spiking. To realize this idea, we use a quantitative neuroscience spike-train correlation (NSTC) coefficient:

NSTC(X, Y) := \frac{p_{11} - p^X_1 p^Y_1}{p^X_1 p^Y_1} .   (12)
Such a correlation coefficient, with this normalization, seems to be more natural than the Pearson coefficient in neuroscience. A similar idea was developed in [35], where the raw cross-correlation of simultaneous spike trains was referred to the square root of the product of the firing rates. Moreover, it turns out that the NSTC coefficient has an important property: once we know the firing rates pX1 and pY1 of the individual neurons and the coefficient, we can determine the joint probabilities of firing:
p_{00} = (1 - p^X_1)(1 - p^Y_1) + NSTC \cdot p^X_1 p^Y_1 ,
p_{01} = (1 - p^Y_1)\, p^X_1 - NSTC \cdot p^X_1 p^Y_1 ,
p_{10} = p^Y_1 (1 - p^X_1) - NSTC \cdot p^X_1 p^Y_1 ,
p_{11} = p^X_1 p^Y_1 + NSTC \cdot p^X_1 p^Y_1 .   (13)
Since p_{11} ≥ 0, formula (12) yields the lower bound NSTC ≥ -1. The upper bound is unlimited for the general class (2) of joint probabilities, though not in the important special case when the communication channel is effective.
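As a consistency check of (12) and (13), the sketch below computes NSTC from an illustrative joint matrix and then rebuilds the four joint probabilities from the two firing rates and the coefficient; function names and numbers are ours.

```python
import numpy as np

def nstc(P):
    """NSTC(X, Y) = (p11 - pX1*pY1) / (pX1*pY1), cf. (12)."""
    pX1, pY1 = P[:, 1].sum(), P[1, :].sum()
    return (P[1, 1] - pX1 * pY1) / (pX1 * pY1)

def joint_from_rates(pX1, pY1, c):
    """Rebuild the joint matrix from the firing rates and NSTC, cf. (13)."""
    p11 = pX1 * pY1 + c * pX1 * pY1
    p01 = (1 - pY1) * pX1 - c * pX1 * pY1
    p10 = pY1 * (1 - pX1) - c * pX1 * pY1
    p00 = (1 - pX1) * (1 - pY1) + c * pX1 * pY1
    return np.array([[p00, p01], [p10, p11]])

P = np.array([[0.63, 0.06],
              [0.07, 0.24]])
c = nstc(P)
print(c)
print(joint_from_rates(P[:, 1].sum(), P[1, :].sum(), c))   # recovers P
```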
Results
As a first example, we consider the family of joint probability matrices

M(\epsilon) = \begin{pmatrix} \frac{7}{15} - \epsilon & \frac{1}{5} + \epsilon \\ \frac{2}{15} - \epsilon & \frac{1}{5} + \epsilon \end{pmatrix} .   (14)

In this case, the family of the communication channels for each parameter 0 < \epsilon < \frac{2}{15} is given by the conditional probability matrix C(\epsilon):

C(\epsilon) = \begin{pmatrix} \dfrac{7/15 - \epsilon}{3/5 - 2\epsilon} & \dfrac{1/5 + \epsilon}{2/5 + 2\epsilon} \\ \dfrac{2/15 - \epsilon}{3/5 - 2\epsilon} & \dfrac{1/5 + \epsilon}{2/5 + 2\epsilon} \end{pmatrix} .   (15)
Next, we consider the family of joint probability matrices

M(\epsilon) = \begin{pmatrix} \frac{1}{4} & \frac{7}{20} - \epsilon \\ \frac{1}{20} + 2\epsilon & \frac{7}{20} - \epsilon \end{pmatrix} ,   (16)

and the information source probabilities are p^X_0 = \frac{3}{10} + 2\epsilon and p^X_1 = \frac{7}{10} - 2\epsilon for 0 < \epsilon < \frac{7}{20}. Here the communication channels C(\epsilon) are of the form

C(\epsilon) = \begin{pmatrix} \dfrac{1/4}{3/10 + 2\epsilon} & \dfrac{7/20 - \epsilon}{7/10 - 2\epsilon} \\ \dfrac{1/20 + 2\epsilon}{3/10 + 2\epsilon} & \dfrac{7/20 - \epsilon}{7/10 - 2\epsilon} \end{pmatrix} .   (17)
Finally, we consider the family of joint probability matrices

M(\epsilon) = \begin{pmatrix} \frac{1}{10} & \frac{1}{20} - \epsilon \\ \frac{4}{5} & \frac{1}{20} + \epsilon \end{pmatrix} ,   (18)
for which the communication channels C(\epsilon) take the form

C(\epsilon) = \begin{pmatrix} \dfrac{1}{9} & \dfrac{1/20 - \epsilon}{1/10} \\ \dfrac{8}{9} & \dfrac{1/20 + \epsilon}{1/10} \end{pmatrix} ,   (19)

and the information source probabilities are p^X_0 = \frac{9}{10} and p^X_1 = \frac{1}{10} for 0 < \epsilon < \frac{1}{20}. It turns out that the NSTC coefficient increases linearly from large negative values below -0.4 to a positive value of 0.1. Simultaneously, ρ is practically zero and RMI is small (below 0.1), but it varies in a non-monotonic way, having a noticeable minimum (Figure 3). Moreover, observe that for small ε the RMI (equal to 0.1) is visibly larger than zero, which suggests that the communication efficiency is relatively good, while at the same time the Pearson correlation coefficient (equal to -0.03) is very close to zero, indicating that the input and output signals are almost uncorrelated (independent for binary channels). This suggests that these measures describe different qualitative properties. Figure 3 shows the behaviors of RMI, ρ, and the NSTC coefficient.
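The kind of comparison reported above can be reproduced numerically by sweeping a channel parameter and evaluating RMI, the Pearson coefficient ρ, and NSTC on each joint matrix. The one-parameter family M(eps) below is purely illustrative (it is not one of the families (14)-(19)), and the helper names are ours.

```python
import numpy as np

def measures(P):
    """Return (RMI, Pearson rho, NSTC) for a binary joint matrix P[j, i] = p(X = i, Y = j)."""
    pX, pY = P.sum(axis=0), P.sum(axis=1)
    h = lambda q: -sum(x * np.log2(x) for x in np.ravel(q) if x > 0)
    rmi = (h(pX) + h(pY) - h(P)) / ((h(pX) + h(pY)) / 2)
    cov = P[1, 1] - pX[1] * pY[1]
    rho = cov / np.sqrt(pX[1] * (1 - pX[1]) * pY[1] * (1 - pY[1]))
    return rmi, rho, cov / (pX[1] * pY[1])

def M(eps):
    """A purely illustrative one-parameter family of joint matrices."""
    return np.array([[0.20, 0.10 - eps],
                     [0.60, 0.10 + eps]])

for eps in np.linspace(0.0, 0.09, 4):
    rmi, rho, c = measures(M(eps))
    print(f"eps={eps:.2f}  RMI={rmi:.3f}  rho={rho:+.3f}  NSTC={c:+.3f}")
```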
Conclusions
To summarize, we show that the straightforward, intuitive approach of estimating the quality of communication channels from the correlations between input and output signals alone is often ineffective. In other words, we refute the intuitive hypothesis that the more the input and output signals are correlated, the more efficient the transmission is (i.e., the more effective a decoding scheme can be found). This intuition could be supported by two facts:
1. for noncorrelated binary variables (ρ(X, Y) = 0), which are shown in the Appendix to be independent, one has RMI = 0;
2. for fully correlated random variables (|ρ(X, Y)| = 1), which are linearly dependent, one has RMI = 1.
Both facts are checked numerically in the sketch below. We introduce a few communication channels for which the correlation coefficients behave completely differently from the mutual information, which shows that this intuition is erroneous.
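The two facts can be verified directly on two illustrative joint matrices, an independent pair and a fully correlated pair; the helper below recomputes RMI and ρ from a joint matrix (names are ours).

```python
import numpy as np

def rmi_and_rho(P):
    """Return (RMI, Pearson rho) for a binary joint matrix P[j, i] = p(X = i, Y = j)."""
    pX, pY = P.sum(axis=0), P.sum(axis=1)
    h = lambda q: -sum(x * np.log2(x) for x in np.ravel(q) if x > 0)
    rmi = (h(pX) + h(pY) - h(P)) / ((h(pX) + h(pY)) / 2)
    rho = (P[1, 1] - pX[1] * pY[1]) / np.sqrt(pX[1] * (1 - pX[1]) * pY[1] * (1 - pY[1]))
    return rmi, rho

# Fact 1: an independent (hence uncorrelated) binary pair gives rho = 0 and RMI = 0.
P_indep = np.outer([0.3, 0.7], [0.6, 0.4])     # p(X = i, Y = j) = p(Y = j) p(X = i)
# Fact 2: a fully correlated pair (Y = X) gives rho = 1 and RMI = 1.
P_equal = np.array([[0.5, 0.0],
                    [0.0, 0.5]])

print(rmi_and_rho(P_indep))   # approximately (0.0, 0.0)
print(rmi_and_rho(P_equal))   # (1.0, 1.0)
```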
In particular, we present the realizations of channels
characterized by high mutual information for input and
output signals but at the same time featuring very low
correlation between these signals. On the other hand, we
find channels featuring quite the opposite behavior; i.e.,
having very high correlation between input and output
signals while the mutual information turns out to be very
low. This is because the mutual information, which in fact
is a crucial parameter characterizing neuronal encoding,
takes into account structures (patterns) of the signals and
not only their statistical properties, described by firing
rates. Our research shows that neuronal encoding has a
much more complicated nature that cannot be captured
by straightforward correlations between input and output
signals.
Appendix
The theorem states that independence and noncorrelation are equivalent for random variables that take only two
values.
Theorem 1. Let X and Y be random variables which take only two real values, ax, bx and ay, by, respectively. Let M be the joint probability matrix

M = \begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix} ,   (20)

where

p_{00} = p(X = ax ∧ Y = ay) , \quad p_{01} = p(X = bx ∧ Y = ay) ,
p_{10} = p(X = ax ∧ Y = by) , \quad p_{11} = p(X = bx ∧ Y = by) ,
p_{00} + p_{01} + p_{10} + p_{11} = 1 ,

and the marginal probabilities are

p^X_i = p(X = ax) for i = 0 , \quad p^X_i = p(X = bx) for i = 1 ,
p^Y_j = p(Y = ay) for j = 0 , \quad p^Y_j = p(Y = by) for j = 1 .   (21)

Then X and Y are independent if and only if ρ(X, Y) = 0.
To prove this Theorem 1, we first show the following
particular case for binary random variables.
Lemma 1. Let X1 and Y1 be two random variables which take only the two values 0 and 1. Let M1 be the joint probability matrix

M_1 = \begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix} ,   (22)

where

p_{ji} = p(X_1 = i ∧ Y_1 = j)  for i, j = 0, 1 ,   (23)

and the marginal probabilities are

p^{X_1}_i = \sum_{j=0}^{1} p_{ji}  for i = 0, 1 , \quad p^{Y_1}_j = \sum_{i=0}^{1} p_{ji}  for j = 0, 1 .   (24)

Then X1 and Y1 are independent if and only if ρ(X1, Y1) = 0.
Proof of Lemma 1. Since X1 and Y1 take only the values 0 and 1, we have E(X1) = p^{X_1}_1, E(Y1) = p^{Y_1}_1 and E(X1 Y1) = p_{11}, so ρ(X1, Y1) = 0 is equivalent to

p_{11} = p^{X_1}_1 p^{Y_1}_1 .

Thus, we have

p_{01} = p^{X_1}_1 - p_{11} = p^{X_1}_1 (1 - p^{Y_1}_1) = p^{X_1}_1 p^{Y_1}_0 .

Similarly, we have p_{10} = p^{Y_1}_1 - p_{11} = p^{X_1}_0 p^{Y_1}_1 and p_{00} = 1 - p_{01} - p_{10} - p_{11} = p^{X_1}_0 p^{Y_1}_0. Hence each joint probability p_{ji} factorizes into the product p^{X_1}_i p^{Y_1}_j of the corresponding marginal probabilities, i.e., X1 and Y1 are independent. Conversely, if X1 and Y1 are independent, then p_{11} = p^{X_1}_1 p^{Y_1}_1, so the covariance, and hence ρ(X1, Y1), equals 0. Theorem 1 now follows by applying Lemma 1 to the variables X1 = (X - ax)/(bx - ax) and Y1 = (Y - ay)/(by - ay), since these affine transformations preserve both independence and the vanishing of the correlation coefficient.
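A small randomized check of Lemma 1 (ours): for 0/1-valued variables, imposing zero covariance on the joint distribution forces it to factorize into the product of its marginals.

```python
import numpy as np

rng = np.random.default_rng(0)

for _ in range(1000):
    pX1, pY1 = rng.uniform(0.05, 0.95, size=2)
    p11 = pX1 * pY1              # zero covariance: E(X1 Y1) = E(X1) E(Y1)
    p01 = pX1 - p11              # marginal constraint p(X1 = 1) = p01 + p11
    p10 = pY1 - p11              # marginal constraint p(Y1 = 1) = p10 + p11
    p00 = 1.0 - p01 - p10 - p11
    P = np.array([[p00, p01],
                  [p10, p11]])
    # Independence: the joint matrix equals the outer product of its marginals.
    assert np.allclose(P, np.outer([1 - pY1, pY1], [1 - pX1, pX1]))

print("zero correlation implied independence in all sampled cases")
```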
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
JS and AP planned the study, participated in the interpretation of data and
were involved in the proof of the Theorem. AP and EW carried out the
implementation and participated in the elaboration of data. EW participated in
the proof of the Theorem. All authors drafted the manuscript and read and
approved the final manuscript.
Acknowledgements
We gratefully acknowledge financial support from the Polish National Science
Centre under grant no. 2012/05/B/ST8/03010.