Posts

Poisson Distribution

In probability theory and statistics, the Poisson distribution, named after French mathematician Siméon Denis Poisson, is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event. The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume. For instance, a call centre receives an average of 180 calls per hour, 24 hours a day, which is an average of 3 calls per minute. The calls are independent; receiving one does not change the probability of when the next one will arrive. The number of calls received during any minute has a Poisson probability distribution: the most likely numbers are 2 and 3, but 1 and 4 are also likely, and there is a small probability of it being as low as zero and a very small probability it...
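The call-centre figures can be checked with a minimal sketch. It assumes the standard Poisson probability mass function P(k) = e^(-λ) λ^k / k! with λ = 180 / 60 = 3 calls per minute; the function name `poisson_pmf` is chosen here for illustration and is not from the original post.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the mean count per interval is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# 180 calls per hour is a mean of 3 calls per minute.
lam = 180 / 60

# Probability of receiving 0..6 calls in a given minute.
for k in range(7):
    print(k, round(poisson_pmf(k, lam), 4))
```

With λ = 3, the counts 2 and 3 are equally likely (about 0.224 each), 1 and 4 are somewhat less likely, and zero calls has a probability of about 0.05.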

Logistic Regression Model

The linear regression model can work well for regression but fails for classification. A solution for classification is logistic regression. Instead of fitting a straight line or hyperplane, the logistic regression model uses the logistic function to squeeze the output of a linear equation between 0 and 1.
References:
1. Vineet Maheshwari (Dec 21, 2018), Logistic Regression, https://medium.datadriveninvestor.com/logistic-regression-18afd48779ce
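As a rough illustration of the squeezing step, the sketch below passes a linear equation through the logistic function 1 / (1 + e^(-z)). The names `logistic` and `predict_probability` and the coefficient values are hypothetical, picked only for the example.

```python
import math


def logistic(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    # Branch on the sign of z so math.exp never overflows for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(x: float, beta0: float, beta1: float) -> float:
    """Logistic regression with one feature: logistic(beta0 + beta1 * x)."""
    return logistic(beta0 + beta1 * x)


# Hypothetical coefficients, for illustration only.
beta0, beta1 = -1.0, 0.5
for x in (-10.0, 0.0, 2.0, 10.0):
    print(x, predict_probability(x, beta0, beta1))
```

However large or small the linear part becomes, the result never leaves the range from 0 to 1, so it can be read as a class probability.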

Generalized linear models (GLMs)

Part of a series on regression analysis. Generalized linear models are a flexible class of models that generalize the linear model to more types of response variables, such as count, binary, and proportion data. The data Y1, Y2, ..., Yn are independently distributed, i.e., cases are independent, so the errors are independent but NOT necessarily normally distributed. The dependent variable Yi does NOT need to be normally distributed, but it is assumed to follow a distribution, typically from the exponential family (e.g. binomial, Poisson, gamma, ...). A GLM does NOT assume a linear relationship between the dependent variable and the independent variables, but it does assume a linear relationship between the transformed expected response (in terms of the link function) and the explanatory variables; e.g., for binary logistic regression, logit(p) = β0 + β1X. Homogeneity of variance does NOT need to be satisfied. It uses maximum likelihood estimation (MLE) rather than ordinary least squares (OLS) ...
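The link-function idea can be sketched for the binary case mentioned above, logit(p) = β0 + β1X. The relationship is linear on the logit scale, and the inverse link maps the linear predictor back to a probability. The helper names and coefficient values below are assumptions made for the example, not part of the original post.

```python
import math


def logit(p: float) -> float:
    """Link function for binary logistic regression: log-odds of p."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta: float) -> float:
    """Inverse link: map the linear predictor eta back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def linear_predictor(x: float, beta0: float, beta1: float) -> float:
    """The part of a GLM that is linear in the explanatory variable."""
    return beta0 + beta1 * x


# Hypothetical coefficients, for illustration only.
beta0, beta1 = -2.0, 0.8
for x in (0.0, 1.0, 2.0, 3.0):
    eta = linear_predictor(x, beta0, beta1)  # changes by beta1 per unit of x
    p = inverse_logit(eta)                   # not linear in x
    print(x, eta, p)
```

Each unit increase in x changes the linear predictor by the same amount, while the fitted probability changes by a different amount at each step; this is the sense in which the model is linear only after the link transformation.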