Fitting and Visualizing a Negative Binomial Distribution in Python

Some theory and a Python tutorial

Anton Granik
5 min read · Jun 3, 2021

Introduction

In this short article I will discuss the process of fitting a negative binomial (NB) distribution to a random variable. There are plenty of articles, posts and other Web resources discussing this distribution and its applications. Yet, for many beginners in statistics, econometrics and machine learning, it may still be a real challenge to understand the link between the numerous parameterizations in which the NB distribution is presented.

All the simulations, estimations and visualizations will be implemented in Python.

The Context

The NB distribution belongs to the class of so-called discrete probability distributions. In other words, it may be used to describe or model discrete random phenomena, i.e. those whose realizations are counts: 0, 1, 2, …


Suppose that you are in a bar trying to strike up a conversation with random (or maybe not so random…) people. Let’s say your goal is to have a conversation or, even better, a certain number n of conversations lasting longer than 30 minutes. Each of these long conversations can be called a “success”.

The Math

From a more mathematical perspective, we have a sequence of Bernoulli trials with the probability of “success” given by p, and we continue these trials for as long as necessary until we obtain a total of n successes. Well, the “as long as necessary” part may be somewhat unrealistic, but you get the idea. The total number of trials needed will then be a random variable equal to k + n, with k being the total number of failures observed before reaching n successes.

You know, I did you a favor and interviewed all the guys in this joint and they’re all losers (source: duckhits.com)

The corresponding distribution of the number of failures X before n successes are observed is given by the negative binomial probability mass function:

P(X = k) = C(k + n − 1, k) · pⁿ · (1 − p)ᵏ,   k = 0, 1, 2, …

where C(k + n − 1, k) denotes the binomial coefficient,

with the first two moments (mean and variance) being:

μ = n(1 − p)/p   and   σ² = n(1 − p)/p²

From the above equations it follows that the parameters p and n can be expressed in terms of the distribution moments as:

p = μ/σ²   and   n = μ²/(σ² − μ)

Finally, from the last equation, the variance may be expressed in a slightly re-parameterized way:

σ² = μ + αμ²,   where α = 1/n

The last result is quite useful, as α is often referred to as the “overdispersion” parameter. The logic behind this terminology can be seen if we notice that when α is (close to) zero, the mean and variance are the same, and, therefore, our distribution collapses into the well-known Poisson distribution with the single parameter μ. (The exact requirement is more subtle: n goes to infinity and p goes to 1 in such a way that μ remains constant.)

Python Implementation

In what follows, I show the process of simulating and estimating the parameters of a negative binomial distribution using Python and some of its libraries.

First, start by importing the required libraries:
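Something along these lines will do (a minimal sketch; only numpy, statsmodels, scipy and matplotlib are needed for what follows):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
import matplotlib.pyplot as plt
```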

We will now generate 10,000 random observations from a NB distribution with parameters p = 0.25 and n = 3. Consistent with our intuition, the empirical (sample) mean and variance are extremely close to their theoretical counterparts implied by p and n (μ = 9 and σ² = 36):
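A sketch of the simulation step (the seed is arbitrary; any value works):

```python
# Draw 10,000 observations from NB(n=3, p=0.25).
# numpy's negative_binomial counts failures before the n-th success,
# matching the parameterization used above.
rng = np.random.default_rng(42)
n, p = 3, 0.25
sample = rng.negative_binomial(n, p, size=10_000)

# Theoretical moments: mu = n(1-p)/p = 9, sigma^2 = n(1-p)/p^2 = 36
print("sample mean:    ", sample.mean())
print("sample variance:", sample.var())
```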

We will next fit the negative binomial probability mass function to our sample using the negative binomial regression framework provided by the statsmodels library. This can be done by specifying a model with no explanatory variables except for the constant term (i.e. a column of ones). As is the case with many non-linear models, the method of Maximum Likelihood (ML) is used to estimate the unknown model (in our case, distribution) parameters.
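In code, this amounts to something like the following (a sketch using statsmodels’ NegativeBinomial model, which defaults to the NB2 parameterization, i.e. σ² = μ + αμ²):

```python
# Intercept-only model: the design matrix is just a column of ones,
# so the "regression" simply fits the marginal NB distribution by ML.
endog = sample
exog = np.ones((len(sample), 1))

nb_results = sm.NegativeBinomial(endog, exog).fit()
```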

It’s important to understand what parameters are estimated by the Negative Binomial regression. Let’s examine the output:
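Continuing the sketch above, the estimation results can be printed as:

```python
# Shows the estimated coefficients (const and alpha) with standard errors
print(nb_results.summary())
```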

The term const is, in fact, the estimate of log(μ), while alpha is the overdispersion parameter discussed above. The fact that alpha is statistically different from zero is consistent with the fact that the mean and variance of the true distribution aren’t close to each other.

Compare our last results to what the output of the NB regression would be if the simulated random variable came from a Poisson distribution with λ = 9 (i.e. where μ = σ² = 9):
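A sketch of that comparison, reusing the generator from before:

```python
# Fit the same intercept-only NB model to Poisson(9) data.
# The estimated alpha should now be statistically indistinguishable
# from zero (the optimizer may warn, since alpha sits at the boundary).
poisson_sample = rng.poisson(lam=9, size=10_000)
poisson_exog = np.ones((len(poisson_sample), 1))

poisson_results = sm.NegativeBinomial(poisson_sample, poisson_exog).fit()
print(poisson_results.summary())
```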

We can now recover the estimate of μ and those of the parameters p and n:
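Using the relations derived earlier (μ = exp(const), n = 1/α, p = n/(n + μ)), something like:

```python
# params[0] is the constant (log mu), params[1] is alpha
mu_hat = np.exp(nb_results.params[0])
alpha_hat = nb_results.params[1]

n_hat = 1 / alpha_hat              # since alpha = 1/n
p_hat = n_hat / (n_hat + mu_hat)   # equivalent to p = mu / sigma^2

print(f"mu = {mu_hat:.2f}, n = {n_hat:.2f}, p = {p_hat:.2f}")
```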

Finally, using these results, we can juxtapose the histogram with the fitted NB distribution:
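One way to do this (a sketch; scipy’s nbinom uses the same (n, p) convention as above):

```python
k = np.arange(sample.max() + 1)
pmf = stats.nbinom.pmf(k, n_hat, p_hat)

# Center one histogram bin on each integer value
bins = np.arange(sample.max() + 2) - 0.5
plt.hist(sample, bins=bins, density=True, alpha=0.5, label="simulated sample")
plt.plot(k, pmf, "o-", color="red", label="fitted NB pmf")
plt.xlabel("number of failures before n successes")
plt.ylabel("probability")
plt.legend()
plt.show()
```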

Conclusion

This short article explained the process of fitting a negative binomial distribution to an arbitrary column of data and interpreting the estimated parameters using the statsmodels Python library. The reader may now be ready to explore the topic of negative binomial regression analysis even further by understanding how additional explanatory variables (or “features” for machine learning practitioners) can be incorporated in the modelling process.
