Generalized linear models using JAGS and R
Andrew O. Finley January 25, 2013
Generalized linear models
We make use of several libraries in the following example session, including:
library(coda)
library(rjags)
library(fields)
library(MBA)
library(geoR)
Let’s consider the logistic model where the outcome of the i-th sample unit, yiis either 1 or 0. In addition to the outcome, we often have a set of covariates observed at the sample units. Given these data, we can write the model as
π(yi| ηi) ∼ Ber(p(ηi)), for i = 1, . . . , n (1) where p(ηi) = exp(ηi)/(1 + exp(ηi)) is the probability of yi= 1 and
ηi= x0iβ. (2)
As in the ordinary linear regression model, xi = (xi,0, xi,1, . . . , xi,p)0 is a p × 1 vector (with xi0 set to one, i.e., the intercept) and β = (β0, β1, . . . , βp)0 is a p × 1 vector of regression coefficients.
1 Simulated data analysis
Let’s begin by simulating some data from model (1).
> set.seed(1)
> n <- 100
> X <- cbind(1, rnorm(n))
> beta <- c(0.1, -0.5)
> eta <- X %*% beta
> p <- 1/(1 + exp(-eta))
> y <- rbinom(n, size = 1, prob = p)
Note, exp(η)/(1 + exp(η)) and 1/(1 + exp(−η)) are mathematically equivalent forms of the inverse logit function.
We would like to generate samples from the posterior distributions of β0 and β1. Again, it is useful to provide JAGS with some reasonable starting values for these parameters. Note, the call to glm includes a -1 that indicates the intercept is already included in x.
> glm.m <- glm(y ~ X - 1, family = "binomial")
> summary(glm.m)
Call:
glm(formula = y ~ X - 1, family = "binomial") Deviance Residuals:
Min 1Q Median 3Q Max
-2.1663 -1.0622 0.5172 1.0552 1.7453 Coefficients:
Estimate Std. Error z value Pr(>|z|) X1 0.1521 0.2179 0.698 0.485090 X2 -0.9454 0.2779 -3.401 0.000671 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1)
Null deviance: 138.63 on 100 degrees of freedom Residual deviance: 124.33 on 98 degrees of freedom AIC: 128.33
Number of Fisher Scoring iterations: 4 Now we formulate the JAGS model file ex-3a.jag.
model{
f o r ( i in 1 : n ) {
e t a [ i ] ← b e t a . 0 + b e t a . 1*x [ i ] p i [ i ] ← 1/(1+ exp(− e t a [ i ] ) ) y [ i ] ∼ dbern ( p i [ i ] )
}
b e t a . 0 ∼ dnorm ( 0 , 0 . 0 0 0 0 0 1 ) b e t a . 1 ∼ dnorm ( 0 , 0 . 0 0 0 0 0 1 ) }
As in the previous exercises, we define the data objects needed in the JAGS model along with parameter starting values. Our call to jags.model requests three MCMC chains of 5000 iterations.
> x <- X[, 2]
> n <- length(y)
> data <- list(y = y, x = x, n = n)
> inits <- list(beta.0 = coefficients(glm.m)[1], beta.1 = coefficients(glm.m)[2])
> jags.m <- jags.model(file = "ex-3a.jag", data = data, + inits = inits, n.chains = 3, n.adapt = 1000) Compiling model graph
Resolving undeclared variables Allocating nodes
Graph Size: 806 Initializing model
Now we collect the posterior samples.
> params <- c("beta.0", "beta.1")
> samps <- coda.samples(jags.m, params, n.iter = 5000)
> plot(samps, density = FALSE)
1000 2000 3000 4000 5000 6000
−0.50.00.51.0
Iterations Trace of beta.0
1000 2000 3000 4000 5000 6000
−2.0−1.5−1.0−0.50.0
Iterations Trace of beta.1
Figure 1: Posterior sample chain and density plots for β.
We will also check the Gelman-Rubin Diagnostic to assess convergence, Figure 2. Given the potential scale reduction factors are all less then ∼1.1 and visual inspection of the chains suggests they are mixing well, we can concluded the chains have converged and we can turn our attention to summarizing the posterior samples.
> gelman.diag(samps)
Potential scale reduction factors:
Point est. Upper C.I.
beta.0 1 1
beta.1 1 1
Multivariate psrf 1
> gelman.plot(samps)
> burn.in <- 2000
> round(summary(window(samps, start = burn.in))$quantiles[, + c(3, 1, 5)], 2)
50% 2.5% 97.5%
beta.0 0.16 -0.27 0.60 beta.1 -0.97 -1.57 -0.45
1000 2000 3000 4000 5000 6000
1.001.051.101.151.201.25
last iteration in chain
shrink factor
median 97.5%
beta.0
1000 2000 3000 4000 5000 6000
1.001.011.021.031.041.051.06
last iteration in chain
shrink factor
median 97.5%
beta.1
Figure 2: Gelman-Rubin diagnostic plots for β.
Considering the addition of a random intercept
Say our data fall within three groups and we want to explore y differs with regard to group membership.
We could set this model up using dummy variables or use a multilevel model. When the number of groups is large or we expect the model to become more complicated, then the multilevel approach is desirable. For the multilevel model specification, we modify model (1) by adding an additional term to η. Specifically,
ηij = x0iβ + αj, for i = 1 . . . n and j = 1 . . . 3, (3) where αj is the group specific intercept offset. We assume αj∼ N (0, σα2).
Let’s simulate some data to try this model out.
> set.seed(1)
> n <- 200
> X <- cbind(1, rnorm(n))
> beta <- c(0.1, -0.5)
> alpha.groups <- sample(1:3, size = n, replace = TRUE)
> Z.alpha <- matrix(0, n, 3)
> Z.alpha[alpha.groups == 1, 1] <- 1
> Z.alpha[alpha.groups == 2, 2] <- 1
> Z.alpha[alpha.groups == 3, 3] <- 1
> head(Z.alpha) [,1] [,2] [,3]
[1,] 0 1 0
[2,] 1 0 0
[3,] 0 0 1
[4,] 0 0 1
[5,] 0 0 1
[6,] 0 0 1
> sigma.sq.alpha <- 1
> alpha <- rnorm(3, 0, sqrt(sigma.sq.alpha))
> p <- 1/(1 + exp(-(X %*% beta + Z.alpha %*% alpha)))
> y <- rbinom(n, size = 1, prob = p)
Now we formulate the JAGS model file ex-3a.jag.
model{
f o r ( i in 1 : n ) {
e t a [ i ] ← b e t a . 0 + b e t a . 1*x [ i ] + alpha [ alpha.indx [ i ] ] p i [ i ] ← 1/(1+ exp(− e t a [ i ] ) )
y [ i ] ∼ dbern ( p i [ i ] ) }
b e t a . 0 ∼ dnorm ( 0 , 0 . 0 0 0 0 0 1 ) b e t a . 1 ∼ dnorm ( 0 , 0 . 0 0 0 0 0 1 ) f o r ( j in 1 : q ) {
a l p h a [ j ] ∼ dnorm ( 0 , t a u . s q . a l p h a ) }
t a u . s q . a l p h a ∼ dgamma( 0 . 0 1 , 0 . 0 1 ) s i g m a . s q . a l p h a ← 1/ t a u . s q . a l p h a }
As in the previous exercises, we define the data objects needed in the JAGS model along with parameter starting values. Our call to jags.model requests three MCMC chains of 10000 iterations.
> x <- X[, 2]
> n <- length(y)
> data <- list(y = y, x = x, n = n, q = 3, alpha.indx = alpha.groups)
> inits <- list(beta.0 = coefficients(glm.m)[1], beta.1 = coefficients(glm.m)[2])
> jags.m <- jags.model(file = "ex-3b.jag", data = data, + n.chains = 3, n.adapt = 1000)
Compiling model graph
Resolving undeclared variables Allocating nodes
Graph Size: 1813 Initializing model
> params <- c("beta.0", "beta.1", "alpha", "sigma.sq.alpha")
> samps <- coda.samples(jags.m, params, n.iter = 10000)
> plot(samps, density = FALSE)
> burn.in <- 2000
> round(summary(window(samps, start = burn.in))$quantiles[, + c(3, 1, 5)], 2)
50% 2.5% 97.5%
alpha[1] 0.19 -3.43 3.80
alpha[2] -1.69 -5.34 1.88
alpha[3] 1.40 -2.13 5.16
beta.0 0.67 -2.94 4.26
beta.1 -0.37 -0.77 0.02
sigma.sq.alpha 3.50 0.60 82.77
So, are these parameter values correct? How is chain convergence?
Try your hand at writing some more complex models by, e.g., 1) adding another random offset to the intercept, and 2) allowing the covariate to vary by group, i.e., an intercept and slope varying model.