Stat Mining 22
The summation can be interpreted as a weighted average of the conditional probabilities
P(B | Ai), with the probabilities P(Ai) as weights. Consequently, the marginal probability
P(B) is sometimes called the ‘average probability’ or ‘overall probability’.9
A common application of this law is the case where the events Ai correspond to a discrete
random variable taking each value in its range.
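The weighted-average reading can be checked numerically. The following is a minimal sketch only: the function name `total_probability` and the numbers (three machines with assumed production shares and defect rates) are hypothetical and do not come from the text.

```python
import math

def total_probability(priors, conditionals):
    """Return P(B) = sum_i P(B | A_i) * P(A_i) for a partition A_1..A_k."""
    if len(priors) != len(conditionals):
        raise ValueError("one conditional probability is needed per event A_i")
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("the P(A_i) must sum to 1 for a partition")
    # Each P(B | A_i) is weighted by the probability of its event A_i.
    return sum(p_a * p_b_given_a for p_a, p_b_given_a in zip(priors, conditionals))

# Hypothetical data: A_i = "item comes from machine i", B = "item is defective".
priors = [0.5, 0.3, 0.2]           # P(A_i), assumed values
conditionals = [0.01, 0.02, 0.03]  # P(B | A_i), assumed values

p_b = total_probability(priors, conditionals)
print(round(p_b, 6))  # 0.017
```

The result lies between the smallest and the largest conditional probability, as expected of a weighted average.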
Consider a set {A1, A2, …, Ak} of pairwise disjoint events whose union is the entire space.
If the probabilities P(Ai) and the conditional probabilities P(B | Ai) are known, then,
provided P(B) > 0, the conditional probability P(Ai | B) is given by

$$P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{k} P(A_j)\,P(B \mid A_j)} \tag{1.12}$$

This is the so-called Bayes’ Theorem. The probability P(Ai | B) is called a posteriori, whereas
the probabilities P(Ai) are called a priori.
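As an illustration of how a priori probabilities are updated to a posteriori ones, the sketch below evaluates Bayes’ formula for a two-event partition. The function name `posteriors` and the figures (a test with assumed error rates for a condition with an assumed prevalence of 1%) are hypothetical and chosen only for the example.

```python
def posteriors(priors, likelihoods):
    """Return the list of P(A_i | B) given P(A_i) and P(B | A_i)."""
    # Numerators of Bayes' formula: P(B | A_i) * P(A_i).
    joint = [p_a * p_b_given_a for p_a, p_b_given_a in zip(priors, likelihoods)]
    # Denominator: P(B), obtained by summing the numerators over the partition.
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("P(B) is zero, so P(A_i | B) is undefined")
    return [j / evidence for j in joint]

# Hypothetical data: A_1 = "condition present", A_2 = "condition absent",
# B = "test result is positive".
priors = [0.01, 0.99]       # a priori probabilities P(A_i), assumed
likelihoods = [0.95, 0.05]  # P(B | A_i), assumed

post = posteriors(priors, likelihoods)
print([round(p, 4) for p in post])  # [0.161, 0.839]
```

Even with a positive result, the a posteriori probability of the condition stays near 16% under these assumed numbers, because the a priori probability is small.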
In other words, a measurable function assigning real numbers to every outcome of the
experiment is called a random variable.
Random variables will be marked in bold.
A random variable is discrete if it takes values in a finite or countable set of numbers.
Examples of probability distributions for discrete variables will be given in Chapter 1.2.5.
In order to characterise a random variable, it is necessary to determine a set of its possible
values and the corresponding probabilities.
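A discrete random variable can thus be stored as a table of values and probabilities. The sketch below does this for a fair six-sided die, which is an assumed example rather than one taken from the text. The helper names `is_valid_pmf` and `probability` are likewise hypothetical.

```python
from fractions import Fraction

# Possible values of the random variable and their probabilities (fair die, assumed).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_pmf(pmf):
    """Check that the probabilities are non-negative and sum to exactly 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

def probability(pmf, event):
    """Return the probability that the variable takes a value satisfying `event`."""
    return sum(p for x, p in pmf.items() if event(x))

assert is_valid_pmf(pmf)
print(probability(pmf, lambda x: x % 2 == 0))  # 1/2
print(probability(pmf, lambda x: x <= 2))      # 1/3
```

Exact fractions are used instead of floating-point numbers so that the check that the probabilities sum to 1 is not affected by rounding.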
A function F(x), defined as the probability of the event {X ≤ x}, is called the
distribution (distribution function, cumulative function) of the random variable X, i.e.

$$F_X(x) = P(X \le x).$$

If there exists a non-negative function fX(x) such that

$$F_X(x) = \int_{-\infty}^{x} f_X(u)\,du \tag{1.15}$$

then the random variable X is continuous, its distribution is continuous and the function fX(x)
is called a probability density function. Function fX(x) can be treated as a density mass on the
9 Pfeiffer (1978), Rumsey (2006).