Probability
It is hard to understand machine learning and data science without knowledge of probability and its mathematics.
For example, machine learning models are predictive models whose outputs are expressed in the language of probability, which means they carry no certainty. The results depend mostly on the model and on the data (experience) it has learned from. So it is important to have a basic understanding of probability to understand the mechanics of machine learning algorithms.
Definition: Probability is a number that expresses the likelihood of an event.
For example, tossing a coin is a random experiment, and its possible results, heads or tails, are events. For a fair coin, the likelihood of each event (heads or tails) is 1/2.
Sample Space: In probability theory, the sample space of an experiment or random trial is the set of all possible outcomes or results of that experiment. For a coin toss it is denoted by the set {H, T}.
Random Experiment: A random experiment is one whose outcome cannot be predicted with certainty in advance. If the experiment is unbiased, as with a fair coin, every outcome has an equal chance of occurring.
Note: The value of a probability always lies in the interval [0, 1].
0 means the event will not happen and 1 means the event will certainly happen.
Values close to 1, such as 0.9, indicate a high likelihood, and values close to 0, such as 0.1, indicate a low likelihood.
The probabilities of all outcomes sum to 1. These are basic points that we study in middle school.
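As a small illustration of these points, here is a minimal Python sketch that lists the sample space of a coin toss and assigns each outcome an equal probability. The fair-coin assumption and the variable names are only illustrative.

```python
from fractions import Fraction

# Sample space of a single coin toss (assumed fair for this sketch).
sample_space = ["H", "T"]

# Each outcome is equally likely, so each one gets probability 1/2.
probabilities = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}

# Every probability lies in [0, 1], and together they sum to 1.
total = sum(probabilities.values())

print(probabilities["H"], total)
```

Using Fraction instead of floats keeps the arithmetic exact, which makes checks like "the probabilities sum to 1" reliable.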
Dependence and Independence
Roughly speaking, two events E and F are dependent if knowing that E occurred gives information about F or changes the probability of F, and vice versa.
Two events E and F are independent if E gives no information about F and has no influence on it. Events that are not dependent are independent.
For example, a car accident (E) depends on your driving (F): dependent events.
Growing a plant (E) has nothing to do with your driving (F): independent events.
Types of Probability
Marginal Probability: The probability of a single event occurring, regardless of other events. For example, the probability of drawing a heart from a standard deck of 52 cards. If this is event A, then P(A) = 13/52 = 1/4.
Joint Probability: The probability of event A and event B occurring at the same time. For example, the probability that a card is a four and red: P(four and red) = P(A ∩ B) = 2/52 = 1/26.
Conditional Probability: The probability of event A occurring, given that event B occurs. For example, given that you drew a red card, what is the probability that it is a four? P(four|red) = P(A|B) = 2/26 = 1/13.
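The three card examples can be checked by counting cards directly. The sketch below builds a standard 52-card deck and computes each probability as "favourable cards / total cards". The helper names (prob, is_red and so on) are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def is_heart(card):
    return card[1] == "hearts"


def is_red(card):
    return card[1] in ("hearts", "diamonds")


def is_four(card):
    return card[0] == "4"


def prob(event, cards):
    """Fraction of the given cards for which event(card) is true."""
    return Fraction(sum(1 for card in cards if event(card)), len(cards))


# Marginal: P(heart), counted over the whole deck.
p_heart = prob(is_heart, deck)

# Joint: P(four and red), counted over the whole deck.
p_four_and_red = prob(lambda card: is_four(card) and is_red(card), deck)

# Conditional: P(four | red), counted over the red cards only.
red_cards = [card for card in deck if is_red(card)]
p_four_given_red = prob(is_four, red_cards)

print(p_heart, p_four_and_red, p_four_given_red)
```

Note that the conditional probability is computed by shrinking the sample space to the red cards, which is exactly what "given that you drew a red card" means.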
Expressions of conditional probability:
For any two events A and B with P(B) > 0, including dependent events, the conditional probability is given by:
P(A|B) = P(A and B) / P(B)
If A and B are independent events, this simplifies to:
P(A|B) = P(A)
For independent events, the probability that both occur is P(A and B) = P(A) * P(B).
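To see these expressions at work, the following sketch tests the product rule P(A and B) = P(A) * P(B) on two pairs of card events. It assumes a standard 52-card deck, and the helper functions are hypothetical names chosen for this example.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]


def prob(event):
    """Probability of an event when one card is drawn from the full deck."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_four(card):
    return card[0] == "4"


def is_red(card):
    return card[1] in ("hearts", "diamonds")


def is_heart(card):
    return card[1] == "hearts"


p_four, p_red, p_heart = prob(is_four), prob(is_red), prob(is_heart)

# "Four" and "red": the joint probability equals the product, so they are independent.
p_four_and_red = prob(lambda card: is_four(card) and is_red(card))
four_red_independent = p_four_and_red == p_four * p_red

# "Heart" and "red": the joint probability differs from the product, so they are dependent.
p_heart_and_red = prob(lambda card: is_heart(card) and is_red(card))
heart_red_independent = p_heart_and_red == p_heart * p_red

# For the dependent pair, P(heart | red) = P(heart and red) / P(red) differs from P(heart).
p_heart_given_red = p_heart_and_red / p_red

print(four_red_independent, heart_red_independent, p_heart_given_red)
```

Knowing that a card is red tells you nothing about whether it is a four, but it doubles the chance that it is a heart, which is why only the first pair passes the product test.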
Bayes' Theorem: Bayes' theorem is an alternative way to calculate a conditional probability. Rearranged, it can also be used to calculate marginal probabilities of events.
We use Bayes' theorem when the joint probability of the events is not easy to calculate, or when the inverse conditional probability is easier to calculate.
Bayes' theorem is also known as Bayes' law or Bayes' rule, and it is the basis of Bayesian reasoning.
Conditional probability of A when B occurs: P(A|B) = P(A and B) / P(B)
Conditional probability of B when A occurs: P(B|A) = P(A and B) / P(A)
Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
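As a worked example, Bayes' theorem in the form P(B|A) = P(A|B) * P(A) ... with the roles swapped, P(B|A) = P(A|B) * P(B) / P(A), can be applied to the card example from earlier. Here A is "the card is a four" and B is "the card is red". We already know P(four|red) = 1/13, and the sketch below inverts it to get P(red|four). The variable names are only illustrative.

```python
from fractions import Fraction

p_four = Fraction(4, 52)            # P(A): four fours in the deck
p_red = Fraction(26, 52)            # P(B): twenty-six red cards in the deck
p_four_given_red = Fraction(2, 26)  # P(A|B): two of the red cards are fours

# Bayes' theorem: P(B|A) = P(A|B) * P(B) / P(A)
p_red_given_four = p_four_given_red * p_red / p_four

# Direct count for comparison: two of the four fours are red.
p_red_given_four_direct = Fraction(2, 4)

print(p_red_given_four, p_red_given_four_direct)
```

Both routes give the same answer, which is the point of the theorem: when one conditional probability is easy to obtain, the inverse one follows from it and the two marginals.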
Probability Distribution
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of the different possible outcomes of an experiment. In simple terms, it describes the likelihood of every possible value of a random variable in a given range.
Depending on the type of data a random variable takes, there are different types of probability distribution. Based on the data, probability distributions can be divided into discrete distributions and continuous distributions.
Random Variable: A random variable is a variable whose possible values have an associated probability distribution.
For example, a random variable that equals 1 if a coin flip turns up heads and 0 if it turns up tails.
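The coin-flip random variable can be simulated to see its (discrete) distribution emerge from repeated trials. This is a rough sketch that assumes a fair coin. The function name flip, the seed and the number of trials are arbitrary choices for the illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch gives the same result on every run


def flip():
    """Random variable X: 1 if the flip turns up heads, 0 if tails (fair coin assumed)."""
    return 1 if random.random() < 0.5 else 0


n = 10_000
samples = [flip() for _ in range(n)]

# Estimated distribution of X: the share of flips that gave each value.
p_heads = sum(samples) / n
p_tails = 1 - p_heads

print(p_heads, p_tails)
```

With many flips, both estimates should land close to 0.5, and by construction they always sum to 1.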
Continuous Distribution of a random variable
Normal Distribution
The normal distribution is also known as the Gaussian distribution. It is symmetric about its mean, and values near the mean are more frequent than values far from it.
Because of its shape, this distribution is also called the bell curve.
For a continuous random variable x, the probability density function of the normal distribution is:
f(x) = 1 / (σ √(2π)) * exp(-(x - μ)² / (2σ²))
where x = continuous random variable
σ = standard deviation and σ² = variance
μ (mu) = mean
PDF (Probability Density Function): A probability density function is a function used to compute probabilities for a continuous random variable. Some properties of a PDF:
- A PDF is defined over a continuous range of values.
- The area bounded by the PDF curve and the x-axis is equal to 1.
- The probability that the random variable falls between two points a and b is equal to the area under the curve between a and b.
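The area properties listed above can be checked numerically. The sketch below assumes the standard normal density formula, f(x) = exp(-(x - μ)² / (2σ²)) / (σ √(2π)), and approximates areas with a simple midpoint rule. The function names and the integration limits are choices made for this example only.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# Total area under the curve; the range [-10, 10] covers practically all of it.
total_area = area(normal_pdf, -10, 10)

# P(-1 < X < 1) as the area between a = -1 and b = 1.
p_between = area(normal_pdf, -1, 1)

print(round(total_area, 4), round(p_between, 4))
```

The total comes out very close to 1, and the area between -1 and 1 is roughly 0.68, the familiar "about 68% of values lie within one standard deviation of the mean".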
The normal distribution is completely determined by two parameters: its mean μ (mu) and its standard deviation σ (sigma). The mean indicates where the bell is centered, and the standard deviation how wide it is.
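To get a feel for the two parameters, this sketch uses NormalDist from Python's standard statistics module (available from Python 3.8) to compare a few bells. The particular means and standard deviations are arbitrary values picked for the comparison.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)
shifted = NormalDist(mu=5, sigma=1)

# The peak of the bell sits at the mean; a larger sigma gives a lower, wider bell.
narrow_peak = narrow.pdf(narrow.mean)
wide_peak = wide.pdf(wide.mean)

# Changing the mean only moves the bell; its height stays the same.
shifted_peak = shifted.pdf(shifted.mean)

# The probability of falling within one standard deviation of the mean
# is the same for every normal distribution.
within_narrow = narrow.cdf(1) - narrow.cdf(-1)
within_wide = wide.cdf(2) - wide.cdf(-2)

print(round(narrow_peak, 4), round(wide_peak, 4), round(within_narrow, 4))
```

Doubling σ halves the height of the peak, because the total area under the curve has to stay equal to 1.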
Question: What is the Central Limit Theorem?