🧪 Probability Theory

Probability of an event

Any subset $E$ of the sample space is known as an event.

That is, an event is a set of possible outcomes of the experiment. If the outcome of the experiment is contained in $E$, then we say that $E$ has occurred.

Axioms of probability

For each event $E$, we denote by $\operatorname{P}(E)$ the probability that $E$ occurs.

The probability that at least one of the elementary events in the sample space will occur is $1$.

Every probability value is between $0$ and $1$ inclusive.

For any sequence of mutually exclusive events $E_1, \dots, E_n$ we have:

$$\operatorname{P} \left( \bigcup_{i=1}^n E_i \right) = \sum_{i = 1}^n \operatorname{P} (E_i)$$
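
The axioms can be checked on a small finite example. The sketch below assumes a fair six-sided die with equally likely outcomes; the names `SAMPLE_SPACE` and `prob` are illustrative and not part of the original notes.

```python
from fractions import Fraction

# Sample space of one roll of a fair six-sided die (an assumption for illustration).
SAMPLE_SPACE = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of SAMPLE_SPACE) under equally likely outcomes."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(SAMPLE_SPACE))

even = {2, 4, 6}
one = {1}

# The whole sample space has probability 1.
total = prob(SAMPLE_SPACE)

# For mutually exclusive events, the probability of the union is the sum.
union_prob = prob(even | one)
sum_prob = prob(even) + prob(one)

print(total, union_prob, sum_prob)
```

Exact fractions are used instead of floats so that the equality in the additivity axiom can be compared without rounding error.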


A permutation is an arrangement of $k$ objects from a pool of $n$ objects, in a given order. The number of such arrangements is:

$$P(n, k) = \frac{n!}{(n - k)!}$$

A combination is a selection of $k$ objects from a pool of $n$ objects, where the order does not matter. The number of such selections is:

$$C(n, k) = \frac{n!}{k! \cdot (n - k)!}$$

We note that for $0 \le k \le n$, we have $P(n, k) \ge C(n, k)$, since $P(n, k) = k! \cdot C(n, k)$.
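
Both counts are available in Python's standard library (Python 3.8 or later) as `math.perm` and `math.comb`. A minimal sketch with illustrative values $n = 5$ and $k = 3$:

```python
import math

n, k = 5, 3  # illustrative values, not from the original notes

permutations = math.perm(n, k)   # n! / (n - k)!
combinations = math.comb(n, k)   # n! / (k! * (n - k)!)

# Each combination can be ordered in k! ways, so P(n, k) = k! * C(n, k).
relation_holds = permutations == math.factorial(k) * combinations

print(permutations, combinations, relation_holds)
```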

Conditional probabilities

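Conditional probability is the probability of one event occurring given that another event has occurred. For events $A$ and $B$ with $\operatorname{P}(B) > 0$, it is defined as $\operatorname{P}(A | B) = \operatorname{P}(A \cap B) / \operatorname{P}(B)$.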

Independence. Two events $A$ and $B$ are independent if and only if we have:

$$\operatorname{P} (A \cap B) = \operatorname{P}(A) \cdot \operatorname{P}(B)$$
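
The product criterion can be tested by enumeration. The sketch below assumes two rolls of a fair die; the events and helper names are hypothetical and chosen only for illustration. It shows one pair of events that is independent and one that is not.

```python
from fractions import Fraction
from itertools import product

# Two rolls of a fair die: 36 equally likely ordered outcomes (assumed setup).
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(predicate):
    """Probability of the event described by predicate(outcome)."""
    favorable = sum(1 for o in OUTCOMES if predicate(o))
    return Fraction(favorable, len(OUTCOMES))

def first_is_even(o):
    return o[0] % 2 == 0

def sum_is_seven(o):
    return o[0] + o[1] == 7

def sum_is_eight(o):
    return o[0] + o[1] == 8

def independent(a, b):
    """Check P(A and B) == P(A) * P(B) exactly."""
    p_joint = prob(lambda o: a(o) and b(o))
    return p_joint == prob(a) * prob(b)

seven_case = independent(first_is_even, sum_is_seven)
eight_case = independent(first_is_even, sum_is_eight)
print(seven_case, eight_case)
```

Knowing that the first roll is even does not change the chance of a total of seven, but it does change the chance of a total of eight, so only the first pair satisfies the product rule.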

Law of total probability. Let the events $B_1, \dots, B_n$ form a partition of the sample space (they are mutually exclusive and together cover it), each with a known probability $\operatorname{P}(B_i)$. Given an event $A$ with known conditional probabilities $\operatorname{P}(A | B_i)$, the total probability that $A$ will happen is:

$$\operatorname{P}(A) = \sum_{i = 1}^n \operatorname{P}(A | B_i) \cdot \operatorname{P}(B_i)$$
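
As a worked example of the sum, consider a hypothetical factory in which two machines produce all items; the shares and defect rates below are made-up numbers for illustration only.

```python
from fractions import Fraction

# Hypothetical setup: machines B1 and B2 produce all items, so they form a partition.
prior = {"B1": Fraction(3, 5), "B2": Fraction(2, 5)}             # P(B_i)
defect_given = {"B1": Fraction(1, 100), "B2": Fraction(1, 20)}   # P(A | B_i)

# A partition must have probabilities summing to 1.
partition_ok = sum(prior.values()) == 1

# Law of total probability: P(A) = sum over i of P(A | B_i) * P(B_i).
p_defect = sum(defect_given[b] * prior[b] for b in prior)

print(partition_ok, p_defect)
```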

Bayes' rule. For events $A$ and $B$ such that $\operatorname{P}(B) > 0$, we have:

$$\operatorname{P} (A | B) = \frac{\operatorname{P}(B | A) \cdot \operatorname{P}(A)}{\operatorname{P}(B)}$$
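
A minimal sketch of Bayes' rule as a function, using made-up numbers: $A$ is "the item came from a particular machine" and $B$ is "the item is defective". The function name `bayes` and all values are assumptions for illustration.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A | B) = P(B | A) * P(A) / P(B); requires P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b

# Hypothetical numbers for illustration.
p_a = Fraction(2, 5)                 # P(A): share of items from the machine
p_b_given_a = Fraction(1, 20)        # P(B | A): defect rate of that machine
p_b_given_not_a = Fraction(1, 100)   # P(B | not A): defect rate elsewhere

# P(B) obtained from the law of total probability over A and its complement.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes(p_b_given_a, p_a, p_b)
print(p_b, posterior)
```

Although the machine produces only 40% of the items in this example, it accounts for most of the defective ones, which is what the posterior expresses.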

Expectation and Moments of the Distribution

The formulas below are given for the discrete (D) and continuous (C) cases, where $f$ denotes the probability mass function (D) or the probability density function (C) of $X$.

Expected value. The expected value of a random variable, also known as the mean value or the first moment, is often noted $\operatorname{E}[X]$ and is the value that the average of the results approaches as the experiment is repeated more and more times.

$$\textbf{(D) } \operatorname{E} [X] = \sum_{i = 1}^n x_i \cdot f(x_i) \hspace{5em} \textbf{(C) } \operatorname{E} [X]= \int_{-\infty}^{+\infty} x \cdot f(x) \ dx$$
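
The discrete formula and the "long-run average" interpretation can be compared directly. The sketch below assumes a fair six-sided die; `expected_value` is an illustrative helper, and the simulation uses a fixed seed only so that the run is reproducible.

```python
import random
from fractions import Fraction

# Probability mass function of a fair six-sided die (illustrative assumption).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(pmf):
    """Discrete case (D): sum of x_i * f(x_i) over all values x_i."""
    return sum(x * p for x, p in pmf.items())

mean = expected_value(pmf)

# Averaging many simulated rolls approximates the expected value.
rng = random.Random(0)  # fixed seed for reproducibility
rolls = [rng.randint(1, 6) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)

print(mean, sample_mean)
```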

Variance. The variance of a random variable, often noted $\operatorname{Var}[X]$, is a measure of the spread of its distribution around the mean.

$$\operatorname{Var}[X] = \operatorname{E}[(X - \operatorname{E}[X])^2] = \operatorname{E}[X^2] - \operatorname{E}[X]^2$$
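
Both expressions for the variance give the same value, which can be confirmed exactly on a small example. The sketch assumes a fair six-sided die; `expectation` is a hypothetical helper that computes $\operatorname{E}[g(X)]$ for a discrete distribution.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (illustrative assumption).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete random variable with the given pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expectation(pmf)

# Definition: E[(X - E[X])^2]
var_definition = expectation(pmf, lambda x: (x - mu) ** 2)

# Shortcut: E[X^2] - E[X]^2
var_shortcut = expectation(pmf, lambda x: x ** 2) - mu ** 2

print(var_definition, var_shortcut)
```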

Some Inequalities

Markov's inequality. Let $X$ be a non-negative random variable and $a > 0$. Then:

$$\operatorname{P} (X \geq a) \leq \frac{\operatorname{E}[X]}{a}$$
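
Markov's bound is often loose, which a small check makes visible. The sketch below assumes a fair six-sided die (a non-negative random variable) and compares the exact tail probability with the bound for several thresholds; the names are illustrative.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (illustrative assumption).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())

def tail(a):
    """Exact P(X >= a)."""
    return sum(p for x, p in pmf.items() if x >= a)

# For each threshold a, store (exact tail probability, Markov bound E[X] / a).
rows = {a: (tail(a), mean / a) for a in range(1, 7)}

for a, (exact, bound) in rows.items():
    print(a, exact, bound, exact <= bound)
```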

Chebyshev's inequality. Let $X$ be a random variable with expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real $k > 0$:

$$\operatorname{P} (| X - \mu| \ge k \sigma) \leq \frac{1}{k^2}$$
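
The same kind of check works for Chebyshev's inequality. The sketch assumes a fair six-sided die and the illustrative value $k = 6/5$; to keep the arithmetic exact, the condition $|X - \mu| \ge k\sigma$ is tested in the equivalent squared form $(X - \mu)^2 \ge k^2 \sigma^2$.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (illustrative assumption).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())

k = Fraction(6, 5)  # illustrative choice of k > 0

# |X - mu| >= k * sigma is equivalent to (X - mu)^2 >= k^2 * sigma^2.
threshold_sq = k ** 2 * variance
exact_tail = sum(p for x, p in pmf.items() if (x - mu) ** 2 >= threshold_sq)

bound = 1 / k ** 2
print(exact_tail, bound, exact_tail <= bound)
```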