
Section 12. Probability theory.

1. Introduction

2. The simplest concepts of probability theory

3. Algebra of events

4. Probability of a random event

5. Geometric probabilities

6. Classical probabilities. Combinatorics formulas.

7. Conditional probability. Independence of events.

8. Total probability formula and Bayes formula

9. Repeated test scheme. Bernoulli formula and its asymptotics

10. Random variables (RV)

11. Distribution series of a discrete random variable

12. Cumulative distribution function

13. Distribution function of a continuous random variable

14. Probability density of a continuous random variable

15. Numerical characteristics of random variables

16. Examples of important random variable distributions

16.1. Binomial distribution (discrete).

16.2. Poisson distribution

16.3. Uniform distribution (continuous).

16.4. Normal distribution.

17. Limit theorems of probability theory.

Introduction

Probability theory, like many other mathematical disciplines, grew out of the needs of practice. To study a real process one must build an abstract mathematical model of it. Usually only the main, most significant driving forces of the process are taken into account, while the secondary ones, which are called random, are discarded. Deciding what is main and what is secondary is a separate task; its solution determines the level of abstraction, the simplicity or complexity of the mathematical model, and how adequately the model reflects the real process. In essence, any abstract model is a compromise between two opposing aspirations: simplicity and fidelity to reality.

For example, in the theory of artillery fire, fairly simple and convenient formulas have been developed for determining the flight path of a projectile fired from a gun located at a given point (Fig. 1).


Under certain conditions, the mentioned theory is sufficient, for example, during massive artillery preparation.

However, it is clear that if several shots are fired from the same gun under identical conditions, the trajectories will be close but still different. And if the target is small compared to the scattering area, specific questions arise that concern precisely the influence of the factors the model ignores. At the same time, taking those additional factors into account would lead to an overly complex model that is almost impossible to use. Besides, there are many such random factors, and their nature is most often unknown.



In the above example, specific questions that go beyond the deterministic model include the following: how many shots must be fired to hit the target with a given level of confidence? How should ranging shots be carried out so as to hit the target with the fewest shells? And so on.

As we will see later, the words “random” and “probability” will become precise mathematical terms, although they are very common in everyday speech. The adjective “random” is often believed to be the opposite of “regular”. However, this is not so: nature is arranged in such a way that random processes reveal patterns, but only under certain conditions.

The main condition is called mass character.

For example, if you toss a coin once, you cannot predict whether heads or tails will come up; you can only guess. However, if the coin is tossed a large number of times, the proportion of heads will not differ much from a certain number close to 0.5 (later we will call this number the probability). Moreover, as the number of tosses increases, the deviation from this number decreases. This property is called the stability of average indicators (in this case, the share of heads). It must be said that in the early days of probability theory, when the stability property had to be verified in practice, even great scientists did not consider it beneath them to carry out their own checks. The famous experiment of Buffon, who tossed a coin 4040 times and obtained heads 2048 times, gives a relative frequency of heads of 2048/4040 ≈ 0.507, close to the intuitively expected number 0.5.
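The stability of the relative frequency is easy to observe in a simulation. Below is a minimal sketch in Python; the choice of a fair coin, the toss counts, and the fixed seed are our own illustrative assumptions, not from the text:

```python
import random

def heads_frequency(n_tosses, seed=0):
    """Toss a fair coin n_tosses times; return the relative frequency of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# As the number of tosses grows, the frequency settles near 0.5,
# just as Buffon's 2048/4040 did.
for n in (100, 10_000, 1_000_000):
    print(n, heads_frequency(n))
```

Running it shows the deviation from 0.5 shrinking as n grows, which is exactly the mass-character condition discussed above.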

This is why the subject of probability theory is usually defined as the branch of mathematics that studies the patterns of mass random processes.

It must be said that, although the greatest achievements of probability theory date back to the beginning of the last century, especially thanks to the axiomatic construction of the theory in the works of A.N. Kolmogorov (1903-1987), interest in the study of randomness appeared long ago.

The initial interest came from attempts to apply a numerical approach to gambling. The first quite interesting results of probability theory are usually associated with the works of L. Pacioli (1494), G. Cardano (1526) and N. Tartaglia (1556).

Later, B. Pascal (1623-1662), P. Fermat (1601-1665) and C. Huygens (1629-1695) laid the foundations of the classical theory of probability. At the beginning of the 18th century, J. Bernoulli (1654-1705) formed the concept of the probability of a random event as the ratio of the number of favorable chances to the number of all possible ones. E. Borel (1871-1956), A. Lomnicki (1881-1941) and R. von Mises (1883-1953) built their theories on the concept of the measure of a set.

The set-theoretic point of view was presented in its most complete form in 1933 by A.N. Kolmogorov in his monograph “Basic Concepts of Probability Theory”. From that moment on, probability theory became a rigorous mathematical science.

The Russian mathematicians P.L. Chebyshev (1821-1894), A.A. Markov (1856-1922), S.N. Bernstein (1880-1968) and others made a great contribution to the development of probability theory.

Probability theory is developing rapidly at the present time.

The simplest concepts of probability theory

Like any mathematical discipline, probability theory begins with the introduction of the simplest concepts that are not defined, but only explained.

One of the main primary concepts is that of an experiment. An experiment is understood as a certain set of conditions that can be reproduced an unlimited number of times; each realization of this set of conditions is called an experiment, or a trial. The results of an experiment may differ, and this is where the element of chance appears. The various results, or outcomes, of an experiment are called events (more precisely, random events). Thus, when the experiment is carried out, one or another event may occur. In other words, a random event is an outcome of an experiment that may or may not occur when the experiment is performed.

An experiment (more precisely, its space of elementary events, introduced below) will be denoted by the letter Ω, and random events are usually denoted by capital Latin letters A, B, C, …

Often one can identify in advance the simplest outcomes of an experiment, those that cannot be decomposed into simpler ones. Such events are called elementary events (or cases).

Example 1. Let a coin be tossed. The outcomes of the experiment are: the appearance of heads (we denote this event by the letter G) and the appearance of tails (denoted by C). Then we can write: experiment = (coin toss), outcomes G and C. Clearly, G and C are the elementary events of this experiment. In other words, listing all the elementary events of an experiment describes it completely. In this regard, we will say that the experiment is its space of elementary events, and in our case the experiment can be written briefly as: Ω = (coin toss) = {G; C}.

Example 2. Ω = (a coin is tossed twice) = {GG; GC; CG; CC}. Here a verbal description of the experiment is followed by a listing of all its elementary events: GG means that heads came up on the first toss and heads on the second; GC means heads on the first toss and tails on the second, etc.

Example 3. Points are thrown into a square in the coordinate system, say the unit square. In this example, the elementary events are points whose coordinates satisfy the given inequalities. Briefly this is written as follows: Ω = {(x; y): 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}.

A colon inside curly brackets means that the set consists of points, not arbitrary ones, but only those satisfying the condition (or conditions) written after the colon (in our example, the inequalities).

Example 4. A coin is tossed until heads first appears. In other words, the tossing continues until heads comes up. In this example the elementary events can be listed, although their number is infinite: Ω = {G; CG; CCG; CCCG; …}.

Note that in Examples 3 and 4 the space of elementary events has infinitely many outcomes. In Example 4 they can be listed, i.e., enumerated one by one; such a set is called countable. In Example 3 the space is uncountable.

Let us introduce two more events that are present in any experiment and are of great theoretical significance.

We call an event impossible if it cannot occur as a result of the experiment; it is denoted by the empty-set sign ∅. On the contrary, an event that is certain to occur as a result of the experiment is called certain (reliable). A certain event is denoted in the same way as the space of elementary events itself, by the letter Ω.

For example, when a die is rolled, the event (fewer than 9 points are rolled) is certain, while the event (exactly 9 points are rolled) is impossible.

So, the space of elementary events can be specified by a verbal description, by listing all of its elementary events, or by stating the rules or conditions by which all of its elementary events are obtained.

Algebra of events

Until now we have spoken only about elementary events as direct results of an experiment. However, within an experiment we can also talk about other random events besides the elementary ones.

Example 5. When a die is rolled, in addition to the elementary events one, two, …, six, we can speak of other events: A = (an even number is rolled), B = (an odd number), C = (a number that is a multiple of three), D = (a number less than 4), and so on. In this example the indicated events, besides their verbal description, can be specified by listing elementary events: A = {2; 4; 6}, B = {1; 3; 5}, C = {3; 6}, D = {1; 2; 3}.

The formation of new events from elementary ones, as well as from other events, is carried out using operations (or actions) on events.

Definition. The product A·B of two events A and B is the event that, as a result of the experiment, both event A and event B occur, i.e., the two events occur together (simultaneously).

The product sign (dot) is often omitted: A·B = AB.

Definition. The sum A + B of two events A and B is the event that, as a result of the experiment, either event A, or event B, or both together (simultaneously) occur.

In both definitions we deliberately emphasized the conjunctions and and or, to draw the reader's attention to his own speech when solving problems. If we pronounce the conjunction “and”, we are talking about a product of events; if the conjunction “or” is pronounced, the events must be added. Note also that in everyday speech the conjunction “or” is often used in the exclusive sense: “either only one or only the other”. In probability theory no such exclusion is assumed: the occurrence of A alone, of B alone, or of both together all mean the occurrence of the event A + B.

If Ω is given by enumerating its elementary events, then complex events are easily obtained using these operations. To obtain A·B, find all the elementary events that belong to both events; if there are none, then A·B = ∅. The sum of events is also easy to compose: take either of the two events and add to it those elementary events of the other event that are not already included in the first.

In Example 5 we obtain, in particular: A·C = {6}, A + C = {2; 3; 4; 6}, B·D = {1; 3}.

The introduced operations are called binary because they are defined for two events. The following unary operation (defined for a single event) is of great importance: the event Ā is called the event opposite to A if it consists in the fact that, in the given experiment, event A did not occur. From the definition it is clear that every event and its opposite satisfy A·Ā = ∅ and A + Ā = Ω. The introduced operation is called the complement of the event A.

It follows that if Ω is given by a listing of elementary events, then, knowing the specification of event A, it is easy to obtain Ā: it consists of all the elementary events of the space that do not belong to A. In particular, for Example 5 the event Ā = {1; 3; 5} = B.

If there are no parentheses, operations are performed in the following order of priority: complement, then multiplication, then addition.

So, with the help of the introduced operations, the space of elementary events is replenished with other random events that form the so-called algebra of events.

Example 6. A shooter fires three shots at a target. Consider the events Ai = (the shooter hits the target with the i-th shot), i = 1, 2, 3.

Let us compose some events from these (not forgetting the opposite ones). We do not provide lengthy comments; we trust the reader to supply them independently.

Event B = (all three shots hit the target). In more detail: B = (both the first, and the second, and the third shot hit the target). The conjunction and is used, therefore the events are multiplied: B = A1·A2·A3.

Likewise:

C = (none of the shots hits the target) = Ā1·Ā2·Ā3;

E = (exactly one shot hits the target) = A1·Ā2·Ā3 + Ā1·A2·Ā3 + Ā1·Ā2·A3;

D = (the target is hit on the second shot) = A2;

F = (the target is hit by exactly two shots) = A1·A2·Ā3 + A1·Ā2·A3 + Ā1·A2·A3;

N = (at least one shot hits the target) = A1 + A2 + A3.
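On a finite space the event algebra of Example 6 can be verified mechanically. Here is a small sketch in Python; encoding each outcome as a hit/miss triple is our own choice, not from the text:

```python
from itertools import product

# Each outcome is a triple (h1, h2, h3); hi is True if the i-th shot hits.
omega = set(product([False, True], repeat=3))

B = {o for o in omega if all(o)}          # all three shots hit
C = {o for o in omega if not any(o)}      # no shot hits
E = {o for o in omega if sum(o) == 1}     # exactly one hit
F = {o for o in omega if sum(o) == 2}     # exactly two hits
N = {o for o in omega if any(o)}          # at least one hit

# "At least one hit" is the complement of "no hits": N = omega \ C
print(N == omega - C, len(B), len(E), len(F))
```

The printed check confirms that N is the complement of C, just as the formulas above state.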

As is known, in mathematics the geometric interpretation of analytical objects, concepts and formulas is of great importance.

In probability theory it is convenient to represent visually (give a geometric interpretation of) an experiment, random events and operations on them in the form of so-called Euler-Venn diagrams. The idea is that every experiment is identified (interpreted) with throwing points into a certain square. The points are thrown at random, so that every point has an equal chance of landing anywhere in that square. The square sets the framework of the experiment in question. Each event within the experiment is identified with a certain region of the square; in other words, the occurrence of event A means that a random point lands inside the region labeled A. Operations on events are then easily interpreted geometrically (Fig. 2).


In Fig. 2 a), for clarity, event A is marked by vertical hatching and event B by horizontal hatching. The multiplication operation then corresponds to double hatching: the event A·B is the part of the square covered by both hatchings. If A·B = ∅, the events A and B are called incompatible. Accordingly, the addition operation corresponds to any hatching: the event A + B is the part of the square shaded by any hatching, vertical, horizontal or double. Fig. 2 b) shows the event Ā: it corresponds to the shaded part of the square, that is, everything not included in the region A. The introduced operations have the following basic properties; some of them also hold for the operations of the same name on numbers, but there are specific ones as well.

1°. A·B = B·A (commutativity of multiplication);

2°. A + B = B + A (commutativity of addition);

3°. (A·B)·C = A·(B·C) (associativity of multiplication);

4°. (A + B) + C = A + (B + C) (associativity of addition);

5°. A·(B + C) = A·B + A·C (distributivity of multiplication relative to addition);

6°. A + B·C = (A + B)·(A + C) (distributivity of addition relative to multiplication);

9°. de Morgan's laws of duality: the event opposite to A + B is Ā·B̄, and the event opposite to A·B is Ā + B̄;

10°. A + ∅ = A, A·∅ = ∅, A + Ω = Ω, A·Ω = A.

Example 7. Ivan and Peter agree to meet within a time interval of T hours, say (0, T). They also agree that each of them, on arriving at the meeting place, will wait for the other no more than t hours (t < T).

Let us give this example a geometric interpretation. Denote by x the time of Ivan's arrival at the meeting and by y the time of Peter's arrival. By the agreement, 0 ≤ x ≤ T and 0 ≤ y ≤ T. Then in the coordinate system we get Ω = {(x; y): 0 ≤ x ≤ T, 0 ≤ y ≤ T}. It is easy to see that in our example the space of elementary events is a square.


The meeting takes place if |x − y| ≤ t, that is, if both inequalities y ≤ x + t and y ≥ x − t hold. The first inequality corresponds to the part of the square lying below the line y = x + t, and similarly the second to the part above the line y = x − t. The same operations on events also describe technical devices: a series connection of elements works only if every element works, while a parallel connection fails only if all of its elements fail. The latter is exactly de Morgan's second law of duality, and it is realized when elements are connected in parallel.
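The meeting probability can be checked by simulation. A sketch, assuming for concreteness T = 1 hour and a waiting time t = 0.25 hour (these particular values are illustrative, not from the text); the exact answer for comparison is 1 − (1 − t/T)², the area of the strip |x − y| ≤ t inside the square:

```python
import random

def meeting_probability(t, T, trials=200_000, seed=1):
    """Monte Carlo estimate of P(|x - y| <= t) with x, y uniform on [0, T]."""
    rng = random.Random(seed)
    hits = sum(abs(rng.uniform(0, T) - rng.uniform(0, T)) <= t
               for _ in range(trials))
    return hits / trials

def exact(t, T):
    # Area of the strip |x - y| <= t inside the T-by-T square, divided by T**2.
    return 1 - (1 - t / T) ** 2

print(meeting_probability(0.25, 1.0), exact(0.25, 1.0))  # both near 0.4375
```

The simulated frequency approaches the geometric probability as the number of trials grows, which is the geometric analogue of the frequency stability discussed in the Introduction.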

The above example shows why probability theory is widely used in physics and engineering, in particular in calculating the reliability of real technical devices.

“Accidents are not accidental”... It sounds like something a philosopher would say, but in fact studying randomness is the business of the great science of mathematics. In mathematics, chance is handled by probability theory. Formulas, examples of tasks and the basic definitions of this science are presented below.

What is probability theory?

Probability theory is one of the mathematical disciplines that studies random events.

To make it a little clearer, here is a small example: if you toss a coin, it can land heads or tails. While the coin is in the air, both outcomes are possible; that is, the odds of the possible outcomes are 1:1. If one card is drawn from a deck of 36 cards, the probability of a particular card is 1/36. It would seem that there is nothing to explore or predict here, especially with mathematical formulas. However, if you repeat a certain action many times, you can identify a certain pattern and, based on it, predict the outcome of events under other conditions.

To summarize all of the above, probability theory in the classical sense studies, in numerical form, the possibility of the occurrence of one of a set of possible events.

From the pages of history

The theory of probability, formulas and examples of the first tasks appeared in the distant Middle Ages, when attempts to predict the outcome of card games first arose.

Initially, probability theory had nothing to do with mathematics. It was justified by empirical facts or properties of an event that could be reproduced in practice. The first works in this area as a mathematical discipline appeared in the 17th century. The founders were Blaise Pascal and Pierre Fermat. They studied gambling for a long time and saw certain patterns, which they decided to tell the public about.

The same technique was invented by Christiaan Huygens, although he was not familiar with the results of the research of Pascal and Fermat. The concept of “probability theory”, formulas and examples, which are considered the first in the history of the discipline, were introduced by him.

The works of Jacob Bernoulli, Laplace's and Poisson's theorems are also of no small importance. They made probability theory more like a mathematical discipline. Probability theory, formulas and examples of basic tasks received their current form thanks to Kolmogorov’s axioms. As a result of all the changes, probability theory became one of the mathematical branches.

Basic concepts of probability theory. Events

The main concept of this discipline is “event”. There are three types of events:

  • Reliable. Those that will happen anyway (the coin will fall).
  • Impossible. Events that will not happen under any circumstances (the coin will remain hanging in the air).
  • Random. The ones that will happen or won't happen. They can be influenced by various factors that are very difficult to predict. If we talk about a coin, then there are random factors that can affect the result: the physical characteristics of the coin, its shape, its original position, the force of the throw, etc.

All events in the examples are indicated in capital Latin letters, with the exception of P, which has a different role. For example:

  • A = “students came to lecture.”
  • Ā = “students did not come to the lecture.”

In practical tasks, events are usually written down in words.

One of the most important characteristics of events is their equipossibility. That is, if you toss a coin, all variants of the initial fall are equally possible until it lands. But events can also be non-equipossible. This happens when someone deliberately influences the outcome: for example, with “marked” playing cards or loaded dice in which the center of gravity is shifted.

Events can also be compatible and incompatible. Compatible events do not exclude each other's occurrence. For example:

  • A = “the student came to the lecture.”
  • B = “the student came to the seminar.”

These events are independent of each other, and the occurrence of one of them does not affect the occurrence of the other. Incompatible events are defined by the fact that the occurrence of one excludes the occurrence of the other. If we talk about the same coin, landing “tails” makes the appearance of “heads” in the same experiment impossible.

Actions on events

Events can be multiplied and added; accordingly, logical connectives “AND” and “OR” are introduced in the discipline.

The sum of events A and B is the event that either A, or B, or both occur simultaneously. If the events are incompatible, the last option is impossible: either A or B will occur.

Multiplication of events consists in the appearance of A and B at the same time.

Now let us give several examples to help remember the basics of probability theory and its formulas. Examples of problem solving follow.

Exercise 1: The company takes part in a competition to receive contracts for three types of work. Possible events that may occur:

  • A = “the firm will receive the first contract.”
  • A1 = “the firm will not receive the first contract.”
  • B = “the firm will receive the second contract.”
  • B1 = “the firm will not receive the second contract.”
  • C = “the firm will receive the third contract.”
  • C1 = “the firm will not receive the third contract.”

Using actions on events, we will try to express the following situations:

  • K = “the company will receive all contracts.”

In mathematical form, the equation will have the following form: K = ABC.

  • M = “the company will not receive a single contract.”

M = A1B1C1.

Let us complicate the task: H = “the firm will receive exactly one contract.” Since it is not known which contract the firm will receive (the first, the second or the third), the whole series of possible events must be written out:

H = A1BC1 υ AB1C1 υ A1B1C.

A1BC1 is the event in which the firm does not receive the first and third contracts but does receive the second. The other possible events are written in the same way. The symbol υ in the discipline denotes the connective “OR”. Translated into plain language, the firm will receive only the third contract, or only the second, or only the first. In a similar way you can write down other conditions in the discipline “Probability Theory”; the formulas and examples of problem solving presented above will help you do this yourself.
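The event H can also be handled numerically once each contract is assigned a win probability. A sketch with hypothetical, illustrative probabilities (the task itself gives none), assuming the three contests are independent:

```python
from itertools import product

# Hypothetical probabilities of winning contracts A, B, C (not from the task)
p = {"A": 0.5, "B": 0.4, "C": 0.3}

def prob_exactly_one(p):
    """P(H): exactly one of the three independent contracts is won."""
    total = 0.0
    for wins in product([True, False], repeat=3):
        if sum(wins) == 1:
            pr = 1.0
            for won, name in zip(wins, "ABC"):
                pr *= p[name] if won else 1 - p[name]
            total += pr
    return total

print(prob_exactly_one(p))  # 0.21 + 0.14 + 0.09 = 0.44
```

The loop enumerates all eight outcomes and keeps exactly those that make up H, mirroring the sum A1BC1 υ AB1C1 υ A1B1C above.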

Actually, the probability

Perhaps, in this mathematical discipline, the probability of an event is the central concept. There are 3 definitions of probability:

  • classic;
  • statistical;
  • geometric.

Each has its place in the study of probability. Probability theory, formulas and examples (9th grade) mainly use the classical definition, which sounds like this:

  • The probability of situation A is equal to the ratio of the number of outcomes that favor its occurrence to the number of all possible outcomes.

The formula looks like this: P(A)=m/n.

A is the event itself. If an event opposite to A appears, it is written as Ā or A1.

m is the number of favorable cases.

n is the number of all outcomes that can happen.

For example, A = “draw a card of the heart suit.” There are 36 cards in a standard deck, 9 of them are of hearts. Accordingly, the formula for solving the problem will look like:

P(A)=9/36=0.25.

As a result, the probability that a card of the heart suit will be drawn from the deck will be 0.25.
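The classical formula P(A) = m/n is a one-liner in code. A small sketch using exact fractions:

```python
from fractions import Fraction

def classical(m, n):
    """P(A) = m/n: favorable outcomes over all equally possible outcomes."""
    return Fraction(m, n)

# 9 hearts in a 36-card deck
p = classical(9, 36)
print(p, float(p))  # 1/4 0.25
```

Using Fraction keeps the answer exact (1/4) instead of a rounded decimal.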

Towards higher mathematics

Now it has become a little clearer what probability theory is, with the formulas and examples of problem solving that appear in the school curriculum. But probability theory is also found in higher mathematics as taught in universities, where the geometric and statistical definitions of probability and more complex formulas are mostly used.

The theory of probability is very interesting. It is better to start studying its formulas and examples (higher mathematics) small: with the statistical (or frequency) definition of probability.

The statistical approach does not contradict the classical one but slightly expands it. If in the first case it was necessary to determine with what probability an event will occur, in this method one indicates how often it actually occurs. Here a new concept of “relative frequency” is introduced, denoted W n (A). The formula does not differ from the classical one: W n (A) = m/n, where m is the number of trials in which the event occurred and n is the total number of trials.

If the classical formula is calculated for prediction, then the statistical one is calculated according to the results of the experiment. Let's take a small task for example.

The technological control department checks products for quality. Among 100 products, 3 were found to be of poor quality. How to find the frequency probability of a quality product?

A = “the appearance of a quality product.”

W n (A)=97/100=0.97

Thus, the frequency of quality products is 0.97. Where does 97 come from? Out of the 100 products checked, 3 were found to be of poor quality; subtracting 3 from 100 gives 97, the number of quality products.

A little about combinatorics

Another tool of probability theory is combinatorics. Its basic principle is that if a certain choice A can be made in m different ways, and a choice B can be made in n different ways, then the choice of A and B together can be made in m×n ways.

For example, there are 5 roads leading from city A to city B. There are 4 paths from city B to city C. In how many ways can you get from city A to city C?

It's simple: 5x4=20, that is, in twenty different ways you can get from point A to point C.

Let's complicate the task. How many ways are there to lay out cards in solitaire? There are 36 cards in the deck - this is the starting point. To find out the number of ways, you need to “subtract” one card at a time from the starting point and multiply.

That is, 36×35×34×33×32×...×2×1 = a number that does not fit on a calculator screen, so it is simply denoted 36!. The sign “!” after a number indicates that all the natural numbers from 1 up to that number are multiplied together.

In combinatorics there are such concepts as permutation, placement and combination. Each of them has its own formula.

An ordered selection of elements of a set is called an arrangement (placement). Arrangements can be with repetition, when one element may be used several times, or without repetition, when elements are not repeated. Here n is the total number of elements and m is the number of elements taking part in the arrangement. The formula for arrangements without repetition is:

A n m =n!/(n-m)!

Arrangements of n elements that differ only in the order of the elements are called permutations. In mathematics this is written: P n = n!

Combinations of n elements taken m at a time are selections in which only the composition matters, that is, which elements are chosen but not their order. The formula is:

C n m = n!/(m!(n-m)!)
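Python's standard library implements all three formulas, which makes them easy to check. A sketch:

```python
import math

n, m = 6, 2

# Arrangements without repetition: A(n, m) = n! / (n - m)!
assert math.perm(n, m) == math.factorial(n) // math.factorial(n - m)

# Permutations: P(n) = n!
assert math.perm(n, n) == math.factorial(n)

# Combinations: C(n, m) = n! / (m! * (n - m)!)
assert math.comb(n, m) == math.factorial(n) // (
    math.factorial(m) * math.factorial(n - m))

print(math.perm(6, 2), math.factorial(3), math.comb(6, 2))  # 30 6 15
```

Note that C(6, 2) = 15 is exactly the coefficient that will appear in the Bernoulli example below with six store visitors.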

Bernoulli's formula

In probability theory, as in every discipline, there are works by outstanding researchers that have taken it to a new level. One such work is the Bernoulli formula, which allows you to determine the probability that a certain event occurs a given number of times in independent trials. Independence means that the occurrence of A in one trial does not depend on the occurrence or non-occurrence of the same event in earlier or later trials.

Bernoulli's equation:

P n (m) = C n m ×p m ×q n-m.

The probability (p) of the occurrence of event (A) is constant for each trial. The probability that the situation will occur exactly m times in n number of experiments will be calculated by the formula presented above. Accordingly, the question arises of how to find out the number q.

If event A occurs with probability p in each trial, then with some probability it may fail to occur. The probabilities of all the outcomes of a trial sum to one; therefore q = 1 − p is the probability that the event does not occur.

Now you know Bernoulli's formula (probability theory). We will consider examples of problem solving (first level) below.

Task 2: A store visitor makes a purchase with probability 0.2. Six visitors entered the store independently. What are the probabilities of the various possible numbers of purchases?

Solution: Since it is unknown how many visitors should make a purchase, one or all six, it is necessary to calculate all possible probabilities using the Bernoulli formula.

A = “the visitor will make a purchase.”

In this case: p = 0.2 (as indicated in the task). Accordingly, q=1-0.2 = 0.8.

n = 6 (since there are 6 customers in the store). The number m will vary from 0 (not a single customer will make a purchase) to 6 (all visitors to the store will purchase something). As a result, we get the solution:

P 6 (0) = C 6 0 ×p 0 ×q 6 = q 6 = (0.8) 6 ≈ 0.2621.

None of the buyers will make a purchase with probability 0.2621.

How else is Bernoulli's formula (probability theory) used? Examples of problem solving (second level) below.

After the above example, the question arises where C and p went. As for p: any number raised to the power 0 equals one. As for C, it can be found by the formula:

C n m = n! /m!(n-m)!

Since in the first example m = 0, we get C = 1, which does not affect the result. Using the new formula, let us find the probability that exactly two visitors purchase goods.

P 6 (2) = C 6 2 ×p 2 ×q 4 = (6×5×4×3×2×1) / ((2×1)×(4×3×2×1)) × (0.2) 2 × (0.8) 4 = 15 × 0.04 × 0.4096 ≈ 0.246.

The theory of probability is not that complicated. Bernoulli's formula, examples of which are presented above, is direct proof of this.
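Both computations above can be reproduced with a short helper function. A sketch:

```python
from math import comb

def bernoulli(n, m, p):
    """P_n(m) = C(n, m) * p**m * (1 - p)**(n - m)."""
    return comb(n, m) * p ** m * (1 - p) ** (n - m)

# Task 2: six visitors, purchase probability 0.2
print(round(bernoulli(6, 0, 0.2), 4))  # 0.2621  (no purchases)
print(round(bernoulli(6, 2, 0.2), 4))  # 0.2458  (exactly two purchases)

# Sanity check: the probabilities over m = 0..6 must sum to 1
print(round(sum(bernoulli(6, m, 0.2) for m in range(7)), 10))  # 1.0
```

The sanity check at the end reflects the fact that the seven cases m = 0, …, 6 exhaust all outcomes of the experiment.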

Poisson's formula

Poisson's formula is used to calculate the probabilities of rare (low-probability) random events.

Basic formula:

P n (m)=λ m /m! × e (-λ) .

In this case λ = n x p. Here is a simple Poisson formula (probability theory). We will consider examples of problem solving below.

Task 3: A factory produced 100,000 parts. The probability that a part is defective is 0.0001. What is the probability that there will be exactly 5 defective parts in the batch?

As you can see, a defect is an unlikely event, and therefore the Poisson formula (probability theory) is used for the calculation. Examples of solving problems of this kind are no different from the other tasks of the discipline; we substitute the necessary data into the formula:

A = “a randomly selected part will be defective.”

p = 0.0001 (according to the task conditions).

n = 100000 (number of parts).

m = 5 (defective parts). We substitute the data into the formula and get:

P 100000 (5) = 10 5 /5! × e (-10) ≈ 0.0378.

Just like the Bernoulli formula (probability theory), whose example solutions are written above, the Poisson equation contains the constant e. It can be found by the formula:

e -λ = lim n ->∞ (1-λ/n) n .

However, there are special tables containing the values of e (-λ) for practically all needed λ.
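For Task 3 the Poisson value can be compared directly with the exact Bernoulli (binomial) probability it approximates. A sketch:

```python
from math import comb, exp, factorial

def poisson(m, lam):
    """P(m) = lam**m / m! * e**(-lam)."""
    return lam ** m / factorial(m) * exp(-lam)

def binomial(n, m, p):
    return comb(n, m) * p ** m * (1 - p) ** (n - m)

n, p, m = 100_000, 0.0001, 5
lam = n * p  # 10

print(round(poisson(m, lam), 4))    # 0.0378
print(round(binomial(n, m, p), 4))  # 0.0378  (the approximation is very close)
```

With n this large and p this small the two values agree to four decimal places, which is exactly why the Poisson formula is preferred here: it avoids the enormous binomial coefficient.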

De Moivre-Laplace theorem

If in the Bernoulli scheme the number of trials is sufficiently large, and the probability of occurrence of event A in all schemes is the same, then the probability of occurrence of event A a certain number of times in a series of tests can be found by Laplace’s formula:

P n (m) = 1/√(npq) × ϕ(x m ), where

x m = (m − np)/√(npq).

To better remember Laplace's formula (probability theory), here is an example problem.

Task 4: An event occurs in each of 800 independent trials with probability p = 1/3. Find the probability that it occurs exactly 267 times.

First we find x m : substituting the data (n = 800, p = 1/3, q = 2/3, m = 267) into the formula gives x m ≈ 0.025. Using tables, we find ϕ(0.025) ≈ 0.3988. Now we can substitute all the data into the formula:

P_800(267) = (1/√(800 · 1/3 · 2/3)) · 0.3988 = (3/40) · 0.3988 ≈ 0.03.

Thus, the probability that the event occurs exactly 267 times is approximately 0.03.
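The same calculation can be sketched in Python (the function name local_laplace is ours; φ here is the standard normal density):

```python
import math

def local_laplace(n, m, p):
    """Local de Moivre-Laplace approximation: P_n(m) ≈ φ(x_m) / √(npq)."""
    q = 1 - p
    s = math.sqrt(n * p * q)
    x_m = (m - n * p) / s
    phi = math.exp(-x_m ** 2 / 2) / math.sqrt(2 * math.pi)  # normal density φ
    return phi / s

# Task 4 data: n = 800, p = 1/3, m = 267
print(round(local_laplace(800, 267, 1 / 3), 3))  # ≈ 0.03
```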

Bayes formula

The Bayes formula (probability theory), examples of solving problems with the help of which will be given below, is an equation that describes the probability of an event based on the circumstances that could be associated with it. The basic formula is as follows:

P(A|B) = P(B|A) · P(A) / P(B).

A and B are events, with P(B) ≠ 0.

P(A|B) is the conditional probability of event A, that is, the probability that A occurs given that event B has occurred.

P(B|A) is the conditional probability of event B given that A has occurred.

So, the final part of the short course “Probability Theory” is the Bayes formula, examples of solutions to problems with which are below.

Task 5: Phones from three companies were brought to the warehouse. At the same time, the share of phones that are manufactured at the first plant is 25%, at the second - 60%, at the third - 15%. It is also known that the average percentage of defective products at the first factory is 2%, at the second - 4%, and at the third - 1%. You need to find the probability that a randomly selected phone will be defective.

A = “randomly picked phone.”

B_1 is the event that the phone was produced at the first factory. The hypotheses B_2 and B_3 are introduced similarly for the second and third factories.

As a result we get:

P(B_1) = 25%/100% = 0.25; P(B_2) = 0.6; P(B_3) = 0.15. Thus we have found the probability of each hypothesis.

Now you need to find the conditional probabilities of the desired event, that is, the probability of defective products in companies:

P(A|B_1) = 2%/100% = 0.02;

P(A|B_2) = 0.04;

P(A|B_3) = 0.01.

Now let's substitute the data into the total probability formula (which is the denominator of the Bayes formula) and get:

P(A) = 0.25 · 0.02 + 0.6 · 0.04 + 0.15 · 0.01 = 0.005 + 0.024 + 0.0015 = 0.0305.
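This calculation is easy to script. The sketch below (variable names are ours) also applies the Bayes formula itself, asking which factory most likely produced a phone that turned out to be defective:

```python
priors = [0.25, 0.60, 0.15]   # P(B1), P(B2), P(B3): shares of the three factories
defects = [0.02, 0.04, 0.01]  # P(A|B1), P(A|B2), P(A|B3): defect rates

# Total probability formula: P(A) = sum of P(Bi) · P(A|Bi)
p_a = sum(b * d for b, d in zip(priors, defects))
print(round(p_a, 4))  # 0.0305

# Bayes formula: P(Bi|A) = P(A|Bi) · P(Bi) / P(A)
posteriors = [b * d / p_a for b, d in zip(priors, defects)]
print([round(x, 3) for x in posteriors])  # [0.164, 0.787, 0.049]
```

A defective phone most likely came from the second factory, even though its defect rate is only 4%, because it supplies 60% of the stock.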

The article presents probability theory, formulas and examples of problem solving, but this is only the tip of the iceberg of a vast discipline. And after everything that has been written, it will be logical to ask the question of whether the theory of probability is needed in life. It’s difficult for an ordinary person to answer; it’s better to ask someone who has used it to win the jackpot more than once.

Probability theory is a branch of mathematics that studies the patterns of random phenomena: random events, random variables, their properties and operations on them.

For a long time, probability theory did not have a clear definition; one was formulated only in 1929. The emergence of probability theory as a science dates back to the Middle Ages and the first attempts at mathematical analysis of gambling (coin tossing, dice, roulette). The 17th-century French mathematicians Blaise Pascal and Pierre Fermat, while studying the prediction of winnings in gambling, discovered the first probabilistic patterns that arise when throwing dice.

Probability theory arose as a science from the belief that mass random events are based on certain patterns. Probability theory studies these patterns.

Probability theory deals with the study of events whose occurrence is not known with certainty. It allows you to judge the degree of probability of the occurrence of some events compared to others.

For example: the outcome of a single coin toss, “heads” or “tails”, cannot be determined in advance, but over many tosses approximately equal numbers of “heads” and “tails” appear, which means that the probability of “heads” (or of “tails”) equals 50%.
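This stabilization of frequencies is easy to observe numerically; here is a minimal simulation sketch using Python's standard random module (the seed is fixed only so the run is reproducible):

```python
import random

random.seed(1)  # fixed seed for a reproducible experiment
for n in (10, 100, 10000):
    heads = sum(random.randint(0, 1) for _ in range(n))
    print(n, heads / n)  # the frequency of heads approaches 0.5 as n grows
```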

A trial in this case is the realization of a certain set of conditions, here the toss of a coin. The trial can be repeated an unlimited number of times, and the set of conditions includes random factors.

The result of a trial is an event. Events are of three kinds:

  1. Certain (always occurs as a result of the trial).
  2. Impossible (never occurs).
  3. Random (may or may not occur as a result of the trial).

For example, when tossing a coin: an impossible event is that the coin lands on its edge; a random event is the appearance of “heads” or “tails”. A specific outcome of a trial is called an elementary event. As a result of a trial, only elementary events occur. The set of all possible, distinct, specific outcomes of a trial is called the space of elementary events.

Basic concepts of the theory

Probability is the degree of possibility of the occurrence of an event. When the reasons for some possible event actually to occur outweigh the opposing reasons, the event is called probable; otherwise, unlikely or improbable.

A random variable is a quantity that, as a result of a trial, takes one or another value, and it is not known in advance which one. For example: the number of calls received by a fire station per day, the number of hits in 10 shots, etc.

Random variables can be divided into two categories.

  1. Discrete random variable is a quantity that, as a result of testing, can take on certain values ​​with a certain probability, forming a countable set (a set whose elements can be numbered). This set can be either finite or infinite. For example, the number of shots before the first hit on the target is a discrete random variable, because this quantity can take on an infinite, albeit countable, number of values.
  2. Continuous random variable is a quantity that can take any value from some finite or infinite interval. Obviously, the number of possible values ​​of a continuous random variable is infinite.

Probability space- concept introduced by A.N. Kolmogorov in the 30s of the 20th century to formalize the concept of probability, which gave rise to the rapid development of probability theory as a strict mathematical discipline.

A probability space is a triple (Ω, Σ, P) (sometimes written in angle brackets: ⟨Ω, Σ, P⟩), where

Ω is an arbitrary set whose elements are called elementary events, outcomes or points;
Σ is a sigma-algebra of subsets of Ω, called (random) events;
P is a probability measure, or probability, i.e. a sigma-additive finite measure such that P(Ω) = 1.

De Moivre-Laplace theorem- one of the limit theorems of probability theory, established by Laplace in 1812. It states that the number of successes when repeating the same random experiment over and over again with two possible outcomes is approximately normally distributed. It allows you to find an approximate probability value.

If, for each of n independent trials, the probability of occurrence of some random event is equal to p (0 < p < 1), and m is the number of trials in which it actually occurs, then for large n the probability of the inequality a < (m − np)/√(np(1 − p)) < b is close to the value of the Laplace integral taken from a to b.

The distribution function in probability theory is a function characterizing the distribution of a random variable or random vector: it gives the probability that the random variable X takes a value less than (or equal to) x, where x is an arbitrary real number. It completely determines the distribution of the random variable.

The expected value is the average value of a random variable with respect to its probability distribution, as considered in probability theory. In English-language literature it is denoted E[X], in Russian, M[X]. In statistics, the notation μ is often used.

Let a probability space (Ω, Σ, P) and a random variable X defined on it be given; that is, by definition, X: Ω → R is a measurable function. Then, if the Lebesgue integral of X over the space Ω exists, it is called the mathematical expectation, or mean value, and is denoted M[X].

The variance of a random variable is a measure of the spread of the random variable, i.e. of its deviation from the mathematical expectation. It is denoted D[X] in Russian literature and Var(X) in foreign literature; in statistics the notation σ² is often used. The square root of the variance is called the standard deviation, or standard spread.

Let X be a random variable defined on some probability space. Then

D[X] = M[(X − M[X])²],

where the symbol M denotes the mathematical expectation.

In probability theory, two random events are called independent, if the occurrence of one of them does not change the probability of the occurrence of the other. Similarly, two random variables are called dependent, if the value of one of them affects the probability of the values ​​of the other.

The simplest form of the law of large numbers is Bernoulli's theorem, which states that if the probability of an event is the same in all trials, then as the number of trials increases, the frequency of the event tends to the probability of the event and ceases to be random.

The law of large numbers in probability theory states that the arithmetic mean of a finite sample from a fixed distribution is close to the theoretical mean of that distribution. Depending on the type of convergence, a distinction is made between the weak law of large numbers, when convergence occurs by probability, and the strong law of large numbers, when convergence is almost certain.
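A minimal sketch of the law of large numbers in action, using draws uniform on [0, 1), whose theoretical mean is 0.5 (the seed is fixed only for reproducibility):

```python
import random

random.seed(7)
# Sample means of uniform [0, 1) draws; the theoretical mean is 0.5
for n in (10, 1000, 100000):
    mean = sum(random.random() for _ in range(n)) / n
    print(n, round(mean, 4))  # the sample mean drifts toward 0.5
```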

The general meaning of the law of large numbers is that the joint action of a large number of identical and independent random factors leads to a result that, in the limit, does not depend on chance.

Methods for estimating probability based on finite sample analysis are based on this property. A clear example is the forecast of election results based on a survey of a sample of voters.

Central limit theorems- a class of theorems in probability theory stating that the sum of a sufficiently large number of weakly dependent random variables that have approximately the same scales (none of the terms dominates or makes a determining contribution to the sum) has a distribution close to normal.

Since many random variables in applications are formed under the influence of several weakly dependent random factors, their distribution is considered normal. In this case, the condition must be met that none of the factors is dominant. Central limit theorems in these cases justify the use of the normal distribution.
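A classic illustration, sketched below under our own naming: the sum of 12 independent uniform [0, 1) variables, shifted by 6, has mean 0 and variance 1, and by the central limit theorem its distribution is approximately standard normal:

```python
import random

random.seed(3)

def sum12():
    """Sum of 12 independent uniform [0, 1) variables, shifted by 6.
    Each uniform has variance 1/12, so the sum has variance 1 and mean 0."""
    return sum(random.random() for _ in range(12)) - 6.0

samples = [sum12() for _ in range(100000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(round(mean, 3), round(var, 3))  # close to 0 and 1
```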

Some programmers, after working in the field of developing regular commercial applications, think about mastering machine learning and becoming a data analyst. They often don't understand why certain methods work, and most machine learning methods seem like magic. In fact, machine learning is based on mathematical statistics, which in turn is based on probability theory. Therefore, in this article we will pay attention to the basic concepts of probability theory: we will touch on the definitions of probability, distribution and analyze several simple examples.

You may know that probability theory is conventionally divided into 2 parts. Discrete probability theory studies phenomena that can be described by a distribution with a finite (or countable) number of possible behavior options (throwing dice, coins). Continuous probability theory studies phenomena distributed over some dense set, for example, on a segment or in a circle.

We can consider the subject of probability theory using a simple example. Imagine yourself as a shooter developer. An integral part of developing games in this genre is the shooting mechanics. It is clear that a shooter in which all weapons shoot absolutely accurately will be of little interest to players, so weapons must be given spread. But simply randomizing the weapon's impact points does not allow fine tuning, so adjusting the game balance would be difficult. Using random variables and their distributions, however, we can analyze how a weapon will behave with a given spread and make the necessary adjustments.

Space of elementary outcomes

Let's say that from some random experiment that we can repeat many times (for example, tossing a coin), we can extract some formalized information (it came up heads or tails). This information is called an elementary outcome, and it is useful to consider the set of all elementary outcomes, often denoted by the letter Ω (Omega).

The structure of this space depends entirely on the nature of the experiment. For example, if we consider shooting at a sufficiently large circular target, the space of elementary outcomes will be a circle, for convenience, placed with the center at zero, and the outcome will be a point in this circle.

In addition, sets of elementary outcomes - events are considered (for example, hitting the top ten is a concentric circle of small radius with a target). In the discrete case, everything is quite simple: we can get any event, including or excluding elementary outcomes in a finite time. In the continuous case, everything is much more complicated: we need some fairly good family of sets to consider, called algebra by analogy with simple real numbers that can be added, subtracted, divided and multiplied. Sets in algebra can be intersected and combined, and the result of the operation will be in the algebra. This is a very important property for the mathematics that lies behind all these concepts. A minimal family consists of only two sets - the empty set and the space of elementary outcomes.

Measure and probability

Probability is a way of making inferences about the behavior of very complex objects without understanding how they work. Thus, probability is defined as a function of an event (from that same good family of sets) that returns a number: some characteristic of how often such an event can occur in reality. Mathematicians have agreed that this number should lie between zero and one. In addition, this function is subject to requirements: the probability of an impossible event is zero, the probability of the entire set of outcomes is one, and the probability of the union of two incompatible events (disjoint sets) is equal to the sum of their probabilities. Another name for probability is a probability measure. Most often, the Lebesgue measure is used, which generalizes the concepts of length, area, and volume to any dimension (n-dimensional volume) and is thus applicable to a wide class of sets.

Together, the collection of a set of elementary outcomes, a family of sets, and a probability measure is called probability space. Let's consider how we can construct a probability space for the example of shooting at a target.

Consider shooting at a large round target of radius R, which is impossible to miss. As the set of elementary events we take the disk of radius R centered at the origin of coordinates. Since we are going to use area (the Lebesgue measure for two-dimensional sets) to describe the probability of an event, we use the family of measurable sets (those for which this measure exists).

Note: in fact, this is a technical point, and in simple problems the process of choosing a measure and a family of sets does not play a special role. But it is necessary to understand that these two objects exist, because many theorems in books on probability theory begin with the words: “Let (Ω, Σ, P) be a probability space...”.

As mentioned above, the probability of the entire space of elementary outcomes must be equal to one. The area of the disk (its two-dimensional Lebesgue measure, which we denote λ₂(A), where A is an event) is, by the familiar school formula, π·R². Then we can introduce the probability P(A) = λ₂(A)/(π·R²), and this value already lies between 0 and 1 for any event A.

If we assume that hitting any point on the target is equally probable, the search for the probability of a shooter hitting some area of ​​the target comes down to finding the area of ​​this set (from here we can conclude that the probability of hitting a specific point is zero, because the area of ​​the point is zero).

For example, suppose we want to find the probability that the shooter hits the “ten” (event A: the shooter hits the desired set). In our model, the “ten” is represented by a circle centered at zero with radius r. Then the probability of hitting this circle is P(A) = λ₂(A)/(π·R²) = (π·r²)/(π·R²) = (r/R)².
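This geometric probability can be cross-checked by simulation. The sketch below (function and variable names are ours) estimates the same value by throwing uniformly random points at the target:

```python
import random

def p_bullseye(r, R):
    """Exact geometric probability of hitting the inner circle: (r/R)^2."""
    return (r / R) ** 2

# Monte Carlo check: sample points uniformly from the disk of radius R
# by rejection sampling from the enclosing square.
random.seed(5)
R, r = 10.0, 3.0
n, count, hits = 100000, 0, 0
while count < n:
    x = random.uniform(-R, R)
    y = random.uniform(-R, R)
    if x * x + y * y <= R * R:      # the point landed on the target
        count += 1
        if x * x + y * y <= r * r:  # the point landed in the "ten"
            hits += 1

print(round(p_bullseye(r, R), 2))  # 0.09
print(round(hits / n, 2))          # close to 0.09
```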

This is one of the simplest types of "geometric probability" problems - most of these problems require finding an area.

Random variables

A random variable is a function that maps elementary outcomes to real numbers. For example, in the problem considered we can introduce the random variable ρ(ω), the distance from the point of impact to the center of the target. The simplicity of our model allows the space of elementary outcomes to be defined explicitly: Ω = {ω = (x, y) : x² + y² ≤ R²}. Then the random variable is ρ(ω) = ρ(x, y) = √(x² + y²).

Means of abstraction from probabilistic space. Distribution function and density

It's good when the structure of the space is well known, but in reality this is not always the case. Even if the structure of a space is known, it can be complex. To describe random variables when their explicit expression is unknown, there is the concept of a distribution function, denoted F_ξ(x) = P(ξ < x) (the subscript ξ denotes the random variable). That is, this is the probability of the set of all elementary outcomes for which the value of the random variable ξ is less than the parameter x.

The distribution function has several properties:

  1. Firstly, it is between 0 and 1.
  2. Secondly, it does not decrease when its argument x increases.
  3. Third, as x → −∞ the distribution function approaches 0, and as x → +∞ it approaches 1.

Probably, the meaning of this construction is not very clear on first reading. One useful property is that the distribution function allows you to find the probability that a value falls in an interval: P(the random variable ξ takes a value from the interval [a, b)) = F_ξ(b) − F_ξ(a). Based on this equality, we can study how this quantity changes when the boundaries a and b of the interval are close together.

Let d = b − a; then b = a + d, and therefore F_ξ(b) − F_ξ(a) = F_ξ(a + d) − F_ξ(a). For small values of d this difference is also small (if the distribution is continuous). It makes sense to consider the ratio p_ξ(a, d) = (F_ξ(a + d) − F_ξ(a))/d. If, for sufficiently small values of d, this ratio differs little from some constant p_ξ(a) independent of d, then at this point the random variable has a density equal to p_ξ(a).

Note Readers who have previously encountered the concept of derivative may notice that p ξ (a) is the derivative of the function F ξ (x) at point a. In any case, you can study the concept of a derivative in an article on this topic on the Mathprofi website.

Now the meaning of the distribution function can be defined as follows: its derivative (density p ξ, which we defined above) at point a describes how often a random variable will fall into a small interval centered at point a (the neighborhood of point a) compared to the neighborhoods of other points . In other words, the faster the distribution function grows, the more likely it is that such a value will appear in a random experiment.

Let's go back to the example. We can calculate the distribution function for the random variable ρ(ω) = ρ(x, y) = √(x² + y²), the distance from the center to a random hit point on the target. By definition, F_ρ(t) = P(ρ(x, y) < t), i.e. the set {ρ(x, y) < t} consists of the points (x, y) whose distance to zero is less than t. We already computed the probability of such an event when we calculated the probability of hitting the “ten”: it equals t²/R². Thus F_ρ(t) = P(ρ(x, y) < t) = t²/R² for 0 ≤ t ≤ R, while F_ρ(t) = 0 for t < 0 and F_ρ(t) = 1 for t > R.

We can find the density p_ρ of this random variable. Note first that outside the interval (0, R) it is zero, because the distribution function is constant there. At the ends of the interval the density is not defined. Inside the interval it can be found using a table of derivatives (for example, from the Mathprofi website) and elementary rules of differentiation. The derivative of t²/R² is 2t/R². Thus we have found the density on the entire axis of real numbers.
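The relationship between the distribution function and the density can be checked numerically. The sketch below (function names are ours) compares the difference quotient from the previous section with the exact derivative 2t/R²:

```python
def F_rho(t, R):
    """Distribution function of the distance ρ: t²/R² on [0, R]."""
    if t < 0:
        return 0.0
    if t > R:
        return 1.0
    return (t / R) ** 2

def density_rho(t, R, d=1e-6):
    """Difference quotient (F(t + d) − F(t)) / d, which approaches 2t/R²."""
    return (F_rho(t + d, R) - F_rho(t, R)) / d

R = 10.0
for t in (2.0, 5.0, 8.0):
    # numerical estimate vs. exact density 2t/R²
    print(t, round(density_rho(t, R), 4), 2 * t / R ** 2)
```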

Another useful property of the density: the probability that a random variable takes a value from an interval is calculated as the integral of the density over this interval (you can find out what an integral is in the articles about definite, improper, and indefinite integrals on the Mathprofi website).

On first reading, the integral of a function f(x) over an interval [a, b] can be thought of as the area of a curvilinear trapezoid. Its sides are the segment [a, b] of the Ox axis (the horizontal coordinate axis), the vertical segments connecting the points (a, f(a)) and (b, f(b)) on the curve with the points (a, 0) and (b, 0) on the Ox axis, and the fragment of the graph of f from (a, f(a)) to (b, f(b)). One can also speak of the integral over the interval (−∞, b]: for sufficiently large negative values of a, the value of the integral over [a, b] changes negligibly as a decreases further. Integrals over other unbounded intervals are defined similarly.

Consider an example: four independent shots are fired at a target, each hitting it with probability 0.2. Denoting a hit by “y” and a miss by “n”, each outcome of the experiment is a sequence of four letters, so in total there are 2 × 2 × 2 × 2 = 16 outcomes. In accordance with the assumption that the results of individual shots are independent, formula (3) and the note to it should be used to determine the probabilities of these outcomes. Thus, the probability of the outcome (y, n, n, n) should be set equal to 0.2 × 0.8 × 0.8 × 0.8 = 0.1024; here 0.8 = 1 − 0.2 is the probability of a miss in a single shot. The event “the target is hit three times” is favored by the outcomes (y, y, y, n), (y, y, n, y), (y, n, y, y), (n, y, y, y), and the probability of each is the same:

0.2 × 0.2 × 0.2 × 0.8 = ... = 0.8 × 0.2 × 0.2 × 0.2 = 0.0064;

therefore, the required probability is equal to

4 × 0.0064 = 0.0256.
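The enumeration of all 16 outcomes can be automated. The sketch below (names are ours) recomputes the same probability by brute force:

```python
from itertools import product

p_hit = 0.2  # probability of a hit in a single shot

def outcome_prob(outcome):
    """Probability of one particular sequence of hits (True) and misses (False),
    using independence: multiply the per-shot probabilities."""
    prob = 1.0
    for hit in outcome:
        prob *= p_hit if hit else 1 - p_hit
    return prob

outcomes = list(product((True, False), repeat=4))  # 2 × 2 × 2 × 2 = 16 outcomes
print(len(outcomes))  # 16

p_three_hits = sum(outcome_prob(o) for o in outcomes if sum(o) == 3)
print(round(p_three_hits, 4))  # 0.0256
```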

Generalizing the reasoning of the analyzed example, we can derive one of the basic formulas of probability theory: if events A1, A2,..., An are independent and each have a probability p, then the probability of exactly m of them occurring is equal to

P_n(m) = C_n^m · p^m · (1 − p)^(n − m); (4)

here C_n^m denotes the number of combinations of n elements taken m at a time. For large n, calculations using formula (4) become difficult. Let the number of shots in the previous example be 100, and let the question be to find the probability x that the number of hits lies in the range from 8 to 32. Applying formula (4) and the addition theorem gives an exact, but practically unusable, expression for the desired probability:

x = Σ (m = 8 to 32) C_100^m · (0.2)^m · (0.8)^(100 − m).


The approximate value of the probability x can be found using Laplace's theorem: x ≈ 0.9973, and the error does not exceed 0.0009. The result shows that the event 8 ≤ m ≤ 32 is practically certain. This is the simplest, but typical, example of the application of the limit theorems of probability theory.

The basic formulas of elementary probability theory also include the so-called total probability formula: if the events A1, A2, ..., Ar are pairwise incompatible and their union is a certain event, then for any event B its probability is equal to the sum

P(B) = P(B|A1) · P(A1) + P(B|A2) · P(A2) + ... + P(B|Ar) · P(Ar).
The probability multiplication theorem is particularly useful when considering compound trials. A trial T is said to be composed of trials T1, T2, ..., Tn−1, Tn if each outcome of T is a combination of some outcomes Ai, Bj, ..., Xk, Yl of the corresponding trials T1, T2, ..., Tn−1, Tn. For one reason or another, the probabilities are often known