# Interpreting Logistic Regression Coefficients

### Intro

I was recently asked to interpret coefficient estimates from a logistic regression model. It turns out, I'd forgotten how to. I knew the log odds were involved, but I couldn't find the words to explain it. Part of that has to do with my recent focus on prediction accuracy rather than inference. Still, it's an important concept to understand and this is a good opportunity to refamiliarize myself with it.

Logistic regression models are used when the outcome of interest is binary. (There are ways to handle multi-class classification, too.) The predicted values, which are between zero and one, can be interpreted as probabilities of being in the positive class, the one labeled `1`.

### Logistic Function to Logit

To model the probability when \(y\) is binary—that is, \(p(X) = p(y=1 \mid X)\)—we use the logistic function defined as:

\[p(X) = \frac{e^t}{1 + e^t}\text{,}\]

where \(t\) is some function of the covariates, \(X\). Let's define \(t\) using matrix notation such that \(t = X\beta\), where \(\beta\) is a vector of coefficients.

Solving for the ratio of \(p(X)\) to its complement (note that \(1 - p(X) = \frac{1}{1 + e^{X\beta}}\)), this can be rewritten as:

\[\frac{p(X)}{1-p(X)} = e^{X\beta}\text{.}\]

This is known as the *odds*.

Finally, we can take the log of both sides to get:

\[\log \left(\frac{p(X)}{1-p(X)}\right) = X\beta\text{.}\]

The left-hand side is known as the log-odds or
*logit*.
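To make the relationship above concrete, here's a minimal sketch (not from the original post) showing that the logistic function and the logit are inverses of one another: pushing a log-odds value through the logistic function and then through the logit recovers the original value.

```python
import numpy as np

def logistic(t):
    """Map log-odds t to a probability in (0, 1)."""
    return np.exp(t) / (1 + np.exp(t))

def logit(p):
    """Map a probability p back to log-odds."""
    return np.log(p / (1 - p))

t = 1.5
p = logistic(t)
print(logit(p))  # recovers 1.5 (up to floating-point error)
```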

### Odds

Before we consider the coefficient estimates, let's take a moment to discuss odds. The odds of an event is the probability of that event divided by its complement:

\[\frac{p}{1 - p}\text{.}\]

For an event with probability 0.75, the odds are:

\[\frac{0.75}{1 - 0.75} = \frac{0.75}{0.25} = 3\text{.}\]

This means that the event is three times as likely to occur than not. As another example, consider an event with a 50% chance of happening. In this case, the odds are one to one—there is an equal chance of either event happening, which makes sense given the probability.
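As a quick sanity check (not in the original post), both examples above are easy to verify in Python:

```python
def odds(p):
    """Odds of an event: its probability divided by its complement."""
    return p / (1 - p)

print(odds(0.75))  # 3.0: three times as likely to occur as not
print(odds(0.5))   # 1.0: even odds
```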

### Coefficients

Let's look at an example using Python. For this, we'll load the `ccard` data set from Statsmodels. (Note: all of the code for this example can be found here.)

```python
import numpy as np
import statsmodels.api as sm

df = sm.datasets.ccard.load_pandas().data
```

In this example, we'll use age and income to predict home ownership. The income variable, `INCOME`, is in 10,000s of dollars. (Note: we also add an intercept term.)

Let's fit the model and view the summary output.

```python
df['intercept'] = 1.0  # add a constant term for the intercept

model = sm.Logit(df.OWNRENT, df[['intercept', 'AGE', 'INCOME']])
result = model.fit()
result.summary()
```

```
                           Logit Regression Results
==============================================================================
Dep. Variable:                OWNRENT   No. Observations:                   72
Model:                          Logit   Df Residuals:                       69
Method:                           MLE   Df Model:                            2
Date:                                   Pseudo R-squ.:                  0.2561
Time:                                   Log-Likelihood:                -35.434
converged:                       True   LL-Null:                       -47.633
                                        LLR p-value:                 5.039e-06
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
intercept     -6.0978      1.570     -3.885      0.000        -9.174    -3.021
AGE            0.1056      0.046      2.300      0.021         0.016     0.196
INCOME         0.6411      0.246      2.605      0.009         0.159     1.123
==============================================================================
```

The estimated coefficients are on the log-odds scale. By exponentiating them, we get odds ratios, which are easier to interpret.

```python
np.exp(result.params)
```

```
intercept    0.002248
AGE          1.111398
INCOME       1.898642
dtype: float64
```

The exponentiated coefficients for both age and income are above one, meaning that they are positively associated with home ownership in this small data set. Let's focus on income. We can interpret this as follows. For a $10,000 increase in income (recall that this corresponds to one unit), we expect the odds of home ownership to be multiplied by about 1.9, that is, to increase by roughly 90%, holding everything else constant.
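To verify this interpretation numerically, here's a small sketch using the coefficient estimates from the summary above (the particular age and income values are arbitrary, chosen only for illustration). For a one-unit increase in income, the ratio of the odds matches \(e^{0.6411}\) regardless of the starting point:

```python
import numpy as np

# Coefficient estimates from the fitted model above
b0, b_age, b_income = -6.0978, 0.1056, 0.6411

def prob_own(age, income):
    """Predicted probability of home ownership from the fitted model."""
    t = b0 + b_age * age + b_income * income
    return np.exp(t) / (1 + np.exp(t))

def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)

# Compare odds before and after a one-unit ($10,000) income increase
p_lo = prob_own(age=35, income=3.0)
p_hi = prob_own(age=35, income=4.0)
print(odds(p_hi) / odds(p_lo))  # ≈ exp(0.6411) ≈ 1.899
```

Because the log-odds are linear in the covariates, this ratio is constant: a one-unit change in income always multiplies the odds by the same factor, whatever the baseline.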

### Final Thoughts

Interpreting logistic regression coefficients amounts to exponentiating them to get odds ratios, which describe how much more likely the event is to occur, relative to it not occurring, for a one-unit change in a predictor.

Special thanks to UCLA's Institute for Digital Research and Education for the excellent post on this topic.