Generalized linear models (GLMs)
Generalized linear models are a flexible class of models that let us generalize from the linear model to include more types of response variables, such as count, binary, and proportion data.
The data Y1, Y2, ..., Yn are independently distributed, i.e., the cases are independent.
Thus the errors are independent, but they are NOT necessarily normally distributed.
The dependent variable Yi does NOT need to be normally distributed, but it is assumed to follow some distribution, typically one from the exponential family (e.g., binomial, Poisson, gamma, ...).
A GLM does NOT assume a linear relationship between the dependent variable and the independent variables, but it does assume a linear relationship between the transformed mean response (transformed by the link function) and the explanatory variables; e.g., for binary logistic regression, logit(p) = β0 + β1X.
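To make the logistic example concrete, here is a minimal sketch of the logit link and its inverse. The coefficients β0 = -1.0 and β1 = 0.5 and the x values are hypothetical, chosen only for illustration. It shows that equal steps in X give equal steps in logit(p) but unequal steps in p itself.

```python
import math

def logit(p):
    """Link function: map a probability in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse link: map a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.0, 0.5

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
etas = [beta0 + beta1 * x for x in xs]      # linear in x
probs = [inv_logit(eta) for eta in etas]    # S-shaped in x

# Equal steps in x give equal steps on the logit scale ...
logit_steps = [logit(probs[i + 1]) - logit(probs[i]) for i in range(len(xs) - 1)]
# ... but unequal steps on the probability scale.
prob_steps = [probs[i + 1] - probs[i] for i in range(len(xs) - 1)]

for x, eta, p in zip(xs, etas, probs):
    print(f"x={x:.0f}  logit(p)={eta:+.2f}  p={p:.3f}")
```

Every entry of `logit_steps` equals β1, while the entries of `prob_steps` differ. This is what "linear on the transformed scale" means.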
Homogeneity of variance does NOT need to be satisfied; in many GLMs the variance depends on the mean (for a Poisson response, the variance equals the mean).
The parameters are estimated by maximum likelihood estimation (MLE) rather than ordinary least squares (OLS), so inference relies on large-sample approximations.
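As a rough illustration of maximum likelihood estimation in a GLM, the sketch below fits a Poisson regression with a log link, log(μ) = b0 + b1·x, by Newton-Raphson. The data, the function name `fit_poisson`, and the starting values are all assumptions made for this example. Statistical software typically uses a closely related iterative scheme, but this is not meant to reproduce any particular package.

```python
import math

# Made-up count data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 2, 2, 4, 6, 9]

def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Maximize the Poisson log-likelihood for log(mu) = b0 + b1 * x."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the log-likelihood.
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix: negative Hessian of the log-likelihood.
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system I * delta = U.
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

b0, b1 = fit_poisson(x, y)
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]
print(f"b0={b0:.4f}, b1={b1:.4f}")
```

At the maximum, the score equations hold: the residuals `y - mu_hat` sum to zero and are uncorrelated with `x`. No closed-form solution exists here, which is why the estimate is found iteratively rather than by the OLS formula.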
Generalized linear models have three parts:
1. random component: the response and an associated probability distribution
2. systematic component: explanatory variables and relationships among them (e.g., interaction terms)
3. link function, which tells us about the relationship between the systematic component (or linear predictor) and the mean of the response
It is the link function that allows us to generalize linear models to count, binomial, and percentage data. It keeps the model linear on the transformed scale and constrains the predicted means to the range of possible values (for example, positive for counts and between 0 and 1 for proportions).
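The three components and the range constraint can be sketched together as follows. The dictionary `LINKS`, the helper `linear_predictor`, and the coefficients are hypothetical names and values introduced for this example. The same linear predictor is passed through three inverse links to show which means stay in a valid range.

```python
import math

# Each entry pairs a response distribution (random component) with a commonly
# used link and its inverse. The linear predictor is shared by all of them.
LINKS = {
    "normal / identity": (lambda mu: mu, lambda eta: eta),
    "Poisson / log": (math.log, math.exp),
    "binomial / logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}

def linear_predictor(x, beta0, beta1):
    """Systematic component: eta = beta0 + beta1 * x."""
    return beta0 + beta1 * x

# Hypothetical coefficients, chosen so that eta takes both negative and large values.
beta0, beta1 = -2.0, 1.0
xs = [-3.0, 0.0, 3.0, 6.0]
etas = [linear_predictor(x, beta0, beta1) for x in xs]

means = {}
for name, (link, inv_link) in LINKS.items():
    # The inverse link turns the linear predictor into the mean of the response.
    means[name] = [inv_link(eta) for eta in etas]
    # Applying the link to the mean recovers the linear predictor.
    assert all(abs(link(m) - eta) < 1e-9 for m, eta in zip(means[name], etas))
    print(name, [round(m, 3) for m in means[name]])
```

With the identity link the means run below 0 and above 1, which is impossible for a count or a proportion. The log link keeps every mean positive, and the logit link keeps every mean strictly between 0 and 1.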
