Bayesian Inference with Count Data
If we model the count of events over a certain interval as a Poisson random variable with rate , then . This can be thought of as a prior model for , where is itself a random variable. We can adopt a prior for to allow for conjugacy of the prior and posterior.
To derive the posterior and predictive distributions, we first need to find the joint distribution of and , . Using the shape-scale parameterization of the Gamma, the joint distribution is
The prior predictive distribution of (the marginal of ) is given by
Inside this last integral is the kernel of another random variable, a , which integrates over its whole support to .
This final expression for is the pmf of a Negative Binomial random variable with parameters and that models the number of failures before the success.
If we were to have used the shape-rate parameterization of the Gamma, this prior predictive would end up being a Negative Binomial with and .
The posterior distribution for after observing using the shape-scale parameterization is
which is the pdf of a random variable.
Under the shape-rate parameterization this posterior would be a .
The final distribution of interest is that of the posterior predictive of a new observation denoted after already observing that . It is a bit simpler to see the derivation using the shape-rate parameterization of the Gamma distribution.
After integrating the kernel of a density contained in the above expression, the posterior predictive simplifies to the pmf of a Negative Binomial with and .
Random Sample of Size n
The above derivations deal with observing a single value of a random variable . These results can generalized to observing values of a random sample of identically distributed random variables .
The posterior distribution of after observing the random sample is a , using the shape-rate Gamma parameterization.
The posterior predictive distribution of a new variable after observing the random sample is a Negative Binomial with , using the shape-rate Gamma parameterization.
Useful R Functions
Some useful functions for plotting the Gamma, Negative Binomial, and Poisson variables that arise in Gamma Poisson modeling are included in the gammaR
function, installable with the following command.
devtools::install_github("carter-allen/gammaR")
The functions dgam_plot()
, dnb_plot()
, and dpois_plot()
can be used to quickly plot Gamma, Negative Binomial, and Poisson distributions, respectively.
Example
library(gammaR)
Imagine the situation of modeling the number of unhappy customers that will arrive at an ice cream shop per day at the beginning of the week, and then updating this model for the rest of the week after Monday. We can adopt a likelihood model for the number of unhappy customers arriving per hour on any given day. If we know that on Friday 6 unhappy customers came to the shop, we can adopt a . Here alpha = 6
can be interpreted as the sum of prior counts and beta = 1
can be interpreted as the number of trials. Note that the shape-rate parameterization is being used here for the Gamma distribution. The prior distribution has the following shape.
dgam_plot(alpha,beta)
From the derivations above, we can see that the marginal distribution of , the prior predictive distribution of before observing any new data, is a Negative Binomial distribution with and and the following shape.
dnb_plot(alpha,beta/(1+beta))
Now if at the end of the day on Monday we record that only x = 3
unhappy customers visited the shop, we can use this information to update our belief about the distribution of , the average number of grumpy customers per day. The posterior distribution of is a , where n = 1
corresponds to the single day observed. The posterior has the following shape.
dgam_plot(alpha + x, beta + n)
Note that the scaling of this posterior is different from the prior and the posterior has more density on lower values of .
Finally, the posterior predictive distribution of the number of unhappy customers that will come to the ice cream shop tomorrow is a Negative Binomial with and , which has the following shape.
dnb_plot(alpha + x,(beta+n)/(beta+n+1))
Shiny App
A online applet for tweaking these parameters and visualizing the models in this distribution can be found at https://carter-allen.shinyapps.io/GammaPoisson/.