4 min read

Beta Binomial

Bayesian Inference with Binary Data

The Beta-Binomial model is a classic Bayesian framework for working with dichotomous data. If we observe a Bernoulli process over a total of n trials, where each trial has “success” probability θ, then the total number of successes in n trials X is a binomial random variable where X=0,1,2,...,n.

p(X=x|θ)=(nx)θx(1θ)nx

The binomial random variable is known as the likelihood of this model. Now, if there is uncertainty about the parameter θ in the binomial model, we can treat θ as an additional random variable that follows a beta(α,β) distribution where 0θ1.

p(θ)=Γ(α,β)Γ(α)Γ(β)θα1(1θ)β1

The beta distribution is the prior for this model. Given we know X=x successes occured over n trials of the Bernoulli process with success probablility θ, we can derive the posterior distribution for θ, that is, the distribution of θ that incorporates prior information and observed information, namely the quantity X=x.

p(θ|X=x)θx(1θ)nxθα1(1θ)β1

=θx+α1(1θ)nx+β1

This expression has the form of the kernel of another beta distribution with parameters α=x+α and β=nx+β

To find the predictive distribution of X, fX(x), we first need the joint distribution of X and θ, fX,θ(x,θ).

fX,θ(x,θ)=fθ(θ)fX|θ(x|θ)=Γ(α,β)Γ(α)Γ(β)(nx)θα+x1(1θ)β+nx1

The prior predictive distribution fX(x) = θfX,θ(x,θ)dθ.

θΓ(α,β)Γ(α)Γ(β)(nx)θα+x1(1θ)β+nx1dθ =Γ(α,β)Γ(α)Γ(β)(nx)Γ(α+x)Γ(β+nx)Γ(α+x,β+nx)

=(nk)B(α+x,β+nx)B(α,β)

The above expression is the probability density function of a beta-binomial random variable with parameters n,α,β. This is the distribution of X averaged over all possible values of θ.

The last distribution of interest in this model is the posterior predictive distribution, of the predictive distribution of a new X (call it X) an observation X=x.

fX|X=x=θf(x|θ,x)f(θ|x)dθ Since X and X are conditionally independent given θ,

=θf(x|θ)f(θ|x)dθ

Following a similar derivation as that of the prior predictive, Xbeta binom(n,x+α,nx+β).

Useful R Functions

The betaR package includes some useful functions for working with the beta binomial model. The package can be found at https://github.com/carter-allen/betaR, and installed using the following command.

devtools::install_github("carter-allen/betaR")

Example

library(betaR)

Consider the situation in which we are we are modeling a team’s number of wins over a n=12 game season. The probability that the team wins any given game θ is probably not a fixed value, with variability coming from the quality of the opposing team, home-court advantage, and countless other factors. If the team is relatively good though, we might expect their probability of winning any given game to be around θ=0.75, but not fixed at θ=0.75.

We can represent this prior belief by a beta distribtibution for θ with parameters α=9 and β=3. These parameters can be thought of as the number of prior wins and prior losses respectively, and would be a reasonable assumption if the team went 9-3 last season. The prior distribution of θ is shown below.

dbeta_plot(alpha = 9, beta = 3)

Now if we wanted to make a prediction about how the team will do this season, we can form a prior predictive distribution of X, that is, the number of games we expect the team to win this season, given their 9-3 record last season. If we model each game by a binomial(n=12,p=θ) distribution and keep our beta(α=9,β=3) prior for θ, the prior predictive distribution of X is a beta binomial(n=12,α=9,β=3). The prior predictive distribution of X is shown below. Note the similarity in shape to the prior.

dbb_plot(n = 12, alpha = 9, beta = 3)

Let’s say that the season we are predicting ends up being much less successful than we expected, with the team going 4-8 over 12 games. Given this information, we can update our knowledge about θ, the probability that this team wins any given game. θ|X=xbeta(x+α,nx+β)=beta(4+9,124+3), which has the following distribution.

dbeta_plot(alpha = 4+9,beta = 12-4+3)

Our belief about the probability this team wins any given game has become a bit less optimistic given their dissapointing last season.

Finally, lets take into account all of this information from the past two seasons to get a posterior predictive distribution of a new 12 game season. From earlier, Xbeta binom(n,x+α,nx+β)=beta binom(12,4+9,124+3)

dbb_plot(n = 12, alpha = 4+9, beta = 12-4+3)

Shiny App

The reactive app for visualizing the distributions discussed here under different parameter sets can be found at https://carter-allen.shinyapps.io/BetaBinomial/.