We engineers often ignore the distinctions between joint, marginal, and conditional probabilities - to our detriment.

Figure 1 - How the Joint, Marginal, and Conditional distributions are related.
conditional probability:
$f(x \mid y;\ \theta)$
where $f$ is the probability of x by itself, given a specific value of variable y and the distribution parameters, $\theta$ (see Figure 1). If x and y represent events A and B, then $P(A \mid B) = n_{AB}/n_B$, where $n_{AB}$ is the number of times both A and B occur, and $n_B$ is the number of times B occurs. Equivalently, $P(A \mid B) = P(AB)/P(B)$, since $P(AB) = n_{AB}/N$ and $P(B) = n_B/N$, where N is the total number of trials, so that
$P(AB)/P(B) = (n_{AB}/N)/(n_B/N) = n_{AB}/n_B$.
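The counting argument for conditional probability can be checked with a small sketch. The trial record below is invented purely for illustration; the names `trials`, `n_AB`, and `n_B` are assumptions, not part of the original text.

```python
from fractions import Fraction

# Hypothetical record of N = 10 trials; each pair is (A occurred, B occurred).
trials = [
    (True, True), (True, False), (False, True), (True, True), (False, False),
    (False, True), (True, True), (False, False), (True, False), (False, True),
]

N = len(trials)
n_B = sum(1 for a, b in trials if b)         # times B occurs
n_AB = sum(1 for a, b in trials if a and b)  # times A and B occur together

p_B = Fraction(n_B, N)
p_AB = Fraction(n_AB, N)

# Two routes to the same conditional probability P(A|B).
p_A_given_B_counts = Fraction(n_AB, n_B)  # n_AB / n_B
p_A_given_B_ratio = p_AB / p_B            # P(AB) / P(B)

print(p_A_given_B_counts, p_A_given_B_ratio)
```

Both routes give the same value because the factor N cancels in the ratio.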
Joint probability is the probability of two or more things happening together:
$f(x, y;\ \theta)$
where $f$ is the probability of x and y together as a pair, given the distribution parameters, $\theta$. Often these events are not independent, and sadly this is often ignored. Furthermore, the correlation coefficient itself does NOT adequately describe these interrelationships.
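One way to see why the correlation coefficient alone is not enough is a small constructed case: two variables with zero covariance (hence zero correlation) that are nonetheless fully dependent. The distribution below, with Y = X squared and X uniform on {-1, 0, 1}, is a textbook-style assumption chosen for illustration, not an example from the original text.

```python
from fractions import Fraction

# Hypothetical joint pmf: X is uniform on {-1, 0, 1} and Y = X**2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(g):
    """Expected value of g(x, y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

# Covariance is zero, so the correlation coefficient is zero as well.
cov_xy = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Yet knowing X changes the probability of Y: the variables are dependent.
p_y1 = sum(p for (x, y), p in joint.items() if y == 1)
p_x1 = sum(p for (x, y), p in joint.items() if x == 1)
p_y1_given_x1 = joint[(1, 1)] / p_x1

print(cov_xy, p_y1, p_y1_given_x1)
```

Here the unconditional probability of Y = 1 is 2/3, while given X = 1 it is 1, so the zero correlation hides a complete functional dependence.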
Consider first the idea of a probability density or distribution:
$f(x;\ \theta)$
where $f$ is the probability density of x, given the distribution parameters, $\theta$. For a normal distribution, $\theta = (\mu, \sigma)$, where $\mu$ is the mean and $\sigma$ is the standard deviation. This is sometimes called a pdf, probability density function. The integral of a pdf from $-\infty$ up to x is the cdf, cumulative distribution function, $F(x;\ \theta)$; the area under the pdf between two specified values of x (the probability of falling between them) is the difference of the cdf at those values. For discrete $f$, $F$ is the corresponding summation.
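The relationship between a pdf and its cdf can be sketched numerically for the normal distribution. This is a minimal illustration using only the standard library; the helper names (`normal_pdf`, `normal_cdf`, `integrate`) and the midpoint-rule integrator are assumptions introduced here.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    """Normal cumulative distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 0.0, 1.0
a, b = -1.0, 1.0

# Area under the pdf between a and b ...
area = integrate(lambda x: normal_pdf(x, mu, sigma), a, b)
# ... equals the difference of the cdf at the two end points.
prob = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(round(area, 6), round(prob, 6))
```

The two numbers agree to within the integration error, which is the sense in which the cdf is the integral of the pdf.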
A joint probability density of two or more variables is called a multivariate distribution. It is often summarized by a vector of parameters, which may or may not be sufficient to characterize the distribution completely. For example, the multivariate normal is summarized (sufficiently) by a mean vector and covariance matrix.
marginal probability:
$f(x;\ \theta)$
where $f$ is the probability density of x, for all possible values of y, given the distribution parameters, $\theta$. The marginal probability is determined from the joint distribution of x and y by integrating over all values of y,
$f(x;\ \theta) = \int f(x, y;\ \theta)\, dy$
called "integrating out" the variable y. In applications of Bayes's Theorem, y is often a matrix of possible parameter values. The figure illustrates joint, marginal, and conditional probability relationships.
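For a discrete distribution, "integrating out" y becomes a sum over y. The sketch below uses an invented 2 x 2 joint table to show how the marginal and a conditional distribution both come from the joint; the table values and function names are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y) on a 2 x 2 grid; the entries sum to 1.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2),
}

def marginal_x(joint):
    """Marginal of x: sum the joint over every value of y."""
    out = {}
    for (x, y), p in joint.items():
        out[x] = out.get(x, 0) + p
    return out

def conditional_x_given_y(joint, y0):
    """Conditional of x given y = y0: a slice of the joint, renormalized."""
    p_y0 = sum(p for (x, y), p in joint.items() if y == y0)
    return {x: p / p_y0 for (x, y), p in joint.items() if y == y0}

f_x = marginal_x(joint)
f_x_given_y1 = conditional_x_given_y(joint, 1)

print(f_x)
print(f_x_given_y1)
```

The marginal collapses the table along y, while the conditional keeps one column of the table and rescales it so that it sums to 1.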
Note that in general the conditional probability of A given B is not the same as that of B given A. The probability of both A and B together is P(AB) = P(A|B) x P(B) = P(B|A) x P(A), and if both P(A) and P(B) are non-zero this leads to a statement of Bayes's Theorem:
P(A|B) = P(B|A) x P(A) / P(B) and
P(B|A) = P(A|B) x P(B) / P(A)
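A small numeric sketch shows how different P(A|B) and P(B|A) can be. The scenario (A = "part is flawed", B = "inspection signals a flaw") and all three input probabilities are hypothetical, and P(B) is obtained from the law of total probability, a step not spelled out in the text above.

```python
from fractions import Fraction

# Hypothetical inputs, chosen only for illustration.
p_A = Fraction(1, 100)             # P(A): part is flawed
p_B_given_A = Fraction(9, 10)      # P(B|A): signal when a flaw is present
p_B_given_notA = Fraction(1, 20)   # P(B|not A): false-call rate

# Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A).
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes's Theorem: P(A|B) = P(B|A) x P(A) / P(B).
p_A_given_B = p_B_given_A * p_A / p_B

print(p_B, p_A_given_B, float(p_A_given_B))
```

With these made-up numbers P(B|A) is 0.9 but P(A|B) is only 2/13, about 0.15, because flawed parts are rare to begin with.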
Conditional probability is also the basis for statistical dependence and statistical independence.
Independence: Two variables, A and B, are independent if their conditional probability is equal to their unconditional probability. In other words, A and B are independent if, and only if, P(A|B)=P(A), and P(B|A)=P(B). In engineering terms, A and B are independent if knowing something about one tells you nothing about the other. This is the origin of the familiar, but often misused, formula P(AB) = P(A) x P(B), which is true only when A and B are independent.
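The product rule can be tested directly by enumeration. The sketch below uses two fair dice as an assumed example (the events and helper names are not from the original text) and shows one pair of events for which P(AB) = P(A) x P(B) holds and one for which it fails.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_independent(ev_a, ev_b):
    """True when P(AB) equals P(A) x P(B) exactly."""
    return prob(lambda o: ev_a(o) and ev_b(o)) == prob(ev_a) * prob(ev_b)

def first_even(o):
    return o[0] % 2 == 0

def sum_is_7(o):
    return sum(o) == 7

def sum_is_8(o):
    return sum(o) == 8

print(is_independent(first_even, sum_is_7))  # product rule holds
print(is_independent(first_even, sum_is_8))  # product rule fails
```

For the second pair, P(AB) is 1/12 while P(A) x P(B) is 5/72, so applying the product formula there would give the wrong joint probability.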
conditional independence: A and B are conditionally independent, given C, if
Prob(A=a, B=b | C=c) = Prob(A=a | C=c) x Prob(B=b | C=c) whenever Prob(C=c) > 0.
So the joint probability of ABC, when A and B are conditionally independent, given C, is then Prob(C) x Prob(A | C) x Prob(B | C). A directed graph illustrating this conditional independence is A <- C -> B.
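The factorization for A <- C -> B can be sketched by building a joint distribution from Prob(C), Prob(A | C), and Prob(B | C) and then checking it. All probability values and helper names below are hypothetical. The sketch also shows that A and B, although conditionally independent given C, need not be independent once C is summed out.

```python
from fractions import Fraction
from itertools import product

# Hypothetical inputs for binary A, B, C.
p_c = {0: Fraction(1, 2), 1: Fraction(1, 2)}             # Prob(C=c)
p_a1_given_c = {0: Fraction(1, 10), 1: Fraction(9, 10)}  # Prob(A=1 | C=c)
p_b1_given_c = {0: Fraction(1, 10), 1: Fraction(9, 10)}  # Prob(B=1 | C=c)

def bern(p1, value):
    """Probability of a binary value when P(value = 1) is p1."""
    return p1 if value == 1 else 1 - p1

# Joint built as Prob(C) x Prob(A | C) x Prob(B | C).
joint = {
    (a, b, c): p_c[c] * bern(p_a1_given_c[c], a) * bern(p_b1_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}

def prob(pred):
    """Probability that pred(a, b, c) is true under the joint."""
    return sum(p for abc, p in joint.items() if pred(*abc))

def conditionally_independent():
    """Check Prob(A,B | C) == Prob(A | C) x Prob(B | C) for every a, b, c."""
    for a, b, c in product((0, 1), repeat=3):
        pc = prob(lambda A, B, C: C == c)
        lhs = joint[(a, b, c)] / pc
        rhs = (prob(lambda A, B, C: A == a and C == c) / pc) * \
              (prob(lambda A, B, C: B == b and C == c) / pc)
        if lhs != rhs:
            return False
    return True

p_a1 = prob(lambda A, B, C: A == 1)
p_b1 = prob(lambda A, B, C: B == 1)
p_a1_b1 = prob(lambda A, B, C: A == 1 and B == 1)

print(conditionally_independent(), p_a1_b1, p_a1 * p_b1)
```

With these numbers Prob(A=1, B=1) is 41/100 while Prob(A=1) x Prob(B=1) is 1/4: the common cause C makes A and B dependent unless C is known.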