COUNTREG Procedure

Marginal Likelihood

Bayes' theorem states that

$$p(\theta \mid \mathbf{y}) \propto \pi(\theta)\, L(y \mid \theta)$$

where $\theta$ is a vector of parameters and $\pi(\theta)$ is the product of the prior densities, which are specified in the PRIOR statement. The term $L(y \mid \theta)$ is the likelihood associated with the MODEL statement. The function $\pi(\theta)\, L(y \mid \theta)$ is the nonnormalized posterior distribution over the parameter vector $\theta$. The normalized posterior distribution, or simply the posterior distribution, is

$$p(\theta \mid \mathbf{y}) = \frac{\pi(\theta)\, L(y \mid \theta)}{\int_{\theta} \pi(\theta)\, L(y \mid \theta)\, d\theta}$$

The denominator $m(y) = \int_{\theta} \pi(\theta)\, L(y \mid \theta)\, d\theta$, also called the "marginal likelihood," is a quantity of interest because it represents the probability of the data after the effect of the parameter vector has been averaged out. Due to its interpretation, the marginal likelihood can be used in various applications, including model averaging and variable or model selection.

A natural estimate of the marginal likelihood is provided by the harmonic mean,

$$m(y) = \left\{ \frac{1}{n} \sum_{i=1}^{n} \frac{1}{L(y \mid \theta_i)} \right\}^{-1}$$

where $\theta_i$ is a sample draw from the posterior distribution. This estimator has proven to be unstable in practical applications.
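
As a minimal illustration (not PROC COUNTREG's internal code), the harmonic mean estimate can be computed on the log scale from log-likelihood values evaluated at posterior draws. The function name `log_harmonic_mean` is hypothetical, and working with log-sum-exp is an assumption added here for numerical stability.

```python
import math

def log_harmonic_mean(log_liks):
    """Log of the harmonic mean estimate of m(y).

    log_liks holds log L(y | theta_i) for posterior draws theta_1..theta_n.
    (Hypothetical helper for illustration only.)
    """
    n = len(log_liks)
    neg = [-ll for ll in log_liks]          # log of 1 / L(y | theta_i)
    top = max(neg)
    # log-sum-exp of the inverse likelihoods
    lse = top + math.log(sum(math.exp(v - top) for v in neg))
    # m(y) = { (1/n) * sum_i 1/L_i }^{-1}
    return -(lse - math.log(n))

# Two draws with likelihoods 1 and 1/3: harmonic mean is 1 / ((1 + 3) / 2) = 0.5
value = math.exp(log_harmonic_mean([0.0, math.log(1.0 / 3.0)]))
```

Because the sum is over inverse likelihoods, a single draw with a very small likelihood can dominate the estimate, which is one way to see why the estimator tends to be unstable.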

An alternative and more stable estimator can be obtained by using an importance sampling scheme. The auxiliary distribution for the importance sampler can be chosen through the cross-entropy theory (Chan and Eisenstat 2015). In particular, given a parametric family of distributions, the auxiliary density function is chosen to be the one that is closest, in terms of the Kullback-Leibler divergence, to the probability density that would give a zero-variance estimate of the marginal likelihood. In practical terms, this is equivalent to the following algorithm:

  1. Choose a parametric family, $f(\cdot, \beta)$, for the parameters of the model: $f(\theta \mid \beta)$.

  2. Evaluate the maximum likelihood estimator of $\beta$ by using the posterior samples $\theta_1, \ldots, \theta_n$ as data.

  3. Use $f(\theta^{*} \mid \hat{\beta}_{mle})$ to generate the importance samples: $\theta_1^{*}, \ldots, \theta_{n^{*}}^{*}$.

  4. Estimate the marginal likelihood:

    $$m(y) = \frac{1}{n^{*}} \sum_{j=1}^{n^{*}} \frac{L(y \mid \theta_j^{*})\, \pi(\theta_j^{*})}{f(\theta_j^{*} \mid \hat{\beta}_{mle})}$$
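
The four steps can be sketched on a toy problem where the exact answer is known. The example below is an assumption-laden illustration, not the procedure's implementation: it uses an intercept-only Poisson likelihood with a gamma prior (so the posterior and $m(y)$ have closed forms), draws the posterior samples exactly instead of by MCMC, and takes the parametric family to be Gaussian on $\log \theta$. All function names and the data are hypothetical.

```python
import math
import random

def log_poisson_lik(theta, y):
    # log L(y | theta) for i.i.d. Poisson counts with mean theta
    return sum(yi * math.log(theta) - theta - math.lgamma(yi + 1) for yi in y)

def log_gamma_prior(theta, a, b):
    # log pi(theta) for a Gamma(shape=a, scale=b) prior
    return (a - 1) * math.log(theta) - theta / b - a * math.log(b) - math.lgamma(a)

def log_normal_pdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

def ce_log_marginal(y, a, b, posterior_draws, n_star, rng):
    # Step 1: Gaussian family on q = log(theta) (assumed, since theta > 0)
    q = [math.log(t) for t in posterior_draws]
    # Step 2: Gaussian MLE from the posterior draws (divisor n)
    mu = sum(q) / len(q)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in q) / len(q))
    # Steps 3 and 4: draw importance samples and average the weights
    log_w = []
    for _ in range(n_star):
        qs = rng.gauss(mu, sigma)
        theta = math.exp(qs)
        # density of theta implied by the Gaussian on log(theta): N(q) / theta
        log_f = log_normal_pdf(qs, mu, sigma) - qs
        log_w.append(log_poisson_lik(theta, y) + log_gamma_prior(theta, a, b) - log_f)
    top = max(log_w)
    return top + math.log(sum(math.exp(v - top) for v in log_w) / n_star)

def exact_log_marginal(y, a, b):
    # Closed form for the Poisson likelihood with a gamma prior
    n, s = len(y), sum(y)
    return (math.lgamma(a + s) - math.lgamma(a) - a * math.log(b)
            - sum(math.lgamma(yi + 1) for yi in y)
            + (a + s) * math.log(b / (1 + n * b)))

rng = random.Random(0)
y = [2, 4, 3, 0, 5, 3, 2, 4]          # made-up counts
a, b = 1.0, 1.0                       # prior shape and scale
n, s = len(y), sum(y)
# Exact posterior is Gamma(shape=a+s, scale=b/(1+n*b)); stands in for MCMC output
draws = [rng.gammavariate(a + s, b / (1 + n * b)) for _ in range(5000)]
est = ce_log_marginal(y, a, b, draws, 5000, rng)
exact = exact_log_marginal(y, a, b)
```

Comparing `est` with `exact` gives a quick check that the importance weights are being formed correctly, including the change-of-variable term for the log transformation.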

The parametric family for the auxiliary distribution is chosen to be Gaussian. The parameters that are subject to bounds are transformed as follows:

  • If $-\infty < \theta < \infty$, then $p = \theta$.

  • If $m \le \theta < \infty$, then $q = \log(\theta - m)$.

  • If $-\infty < \theta \le M$, then $r = \log(M - \theta)$.

  • If $m \le \theta \le M$, then $s = \log(\theta - m) - \log(M - \theta)$.
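
The four cases, together with their inverses, can be written as a pair of small functions. This is a sketch with hypothetical names (`to_unbounded`, `from_unbounded`); the inverse formulas are derived here from the transformations above and are not quoted from the procedure.

```python
import math

def to_unbounded(theta, lower=None, upper=None):
    """Map a possibly bounded parameter to the real line.

    Requires lower < theta < upper strictly, because the logarithm
    is not finite at a boundary value.
    """
    if lower is None and upper is None:
        return theta                                  # p = theta
    if upper is None:
        return math.log(theta - lower)                # q = log(theta - m)
    if lower is None:
        return math.log(upper - theta)                # r = log(M - theta)
    return math.log(theta - lower) - math.log(upper - theta)   # s

def from_unbounded(z, lower=None, upper=None):
    """Inverse of to_unbounded."""
    if lower is None and upper is None:
        return z
    if upper is None:
        return lower + math.exp(z)
    if lower is None:
        return upper - math.exp(z)
    # Solving s = log((theta - m) / (M - theta)) for theta
    return lower + (upper - lower) / (1.0 + math.exp(-z))

# Round trip for a parameter bounded on [1, 4]
s_value = to_unbounded(2.0, lower=1.0, upper=4.0)
theta_back = from_unbounded(s_value, lower=1.0, upper=4.0)
```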

Assuming independence for the parameters that are subject to bounds, the auxiliary distribution to generate importance samples is

$$\begin{pmatrix} \mathbf{p} \\ \mathbf{q} \\ \mathbf{r} \\ \mathbf{s} \end{pmatrix} \sim \mathbf{N}\left[ \begin{pmatrix} \mu_p \\ \mu_q \\ \mu_r \\ \mu_s \end{pmatrix}, \begin{pmatrix} \Sigma_p & 0 & 0 & 0 \\ 0 & \Sigma_q & 0 & 0 \\ 0 & 0 & \Sigma_r & 0 \\ 0 & 0 & 0 & \Sigma_s \end{pmatrix} \right]$$

where $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$, and $\mathbf{s}$ are vectors that contain the transformations of the unbounded, bounded-below, bounded-above, and bounded-above-and-below parameters, respectively. Also, given the imposed independence structure, $\Sigma_p$ can be a nondiagonal matrix, whereas $\Sigma_q$, $\Sigma_r$, and $\Sigma_s$ are constrained to be diagonal matrices.
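
One way to picture this covariance structure is to estimate it from transformed posterior draws: covariances are kept only among the unbounded parameters, and every bounded parameter keeps just its own variance. The sketch below is illustrative; the function name, the `is_unbounded` flag list, and the use of the divisor $n$ (the Gaussian maximum likelihood estimate) are assumptions.

```python
def fit_auxiliary_gaussian(draws, is_unbounded):
    """Mean vector and block-structured covariance of transformed draws.

    draws        : list of rows, each row a list of transformed parameters
    is_unbounded : is_unbounded[k] is True if parameter k belongs to the p block
    """
    n = len(draws)
    d = len(is_unbounded)
    mean = [sum(row[k] for row in draws) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(d):
            # Keep variances always; keep covariances only inside the p block
            if j == k or (is_unbounded[j] and is_unbounded[k]):
                cov[j][k] = sum((row[j] - mean[j]) * (row[k] - mean[k])
                                for row in draws) / n
    return mean, cov

# Three made-up draws: parameters 0 and 1 unbounded, parameter 2 bounded
toy_draws = [[0.0, 0.0, 0.0], [1.0, 2.0, 1.0], [2.0, 4.0, 4.0]]
mean, cov = fit_auxiliary_gaussian(toy_draws, [True, True, False])
```

In this toy input the bounded parameter is correlated with the other two, yet its off-diagonal entries in `cov` stay at zero, which mirrors the imposed independence.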

Standard Distributions

Table 5 through Table 10 show all the distribution density functions that PROC COUNTREG recognizes. You specify these distribution densities in the PRIOR statement.

Table 5: Beta Distribution

PRIOR statement        BETA(SHAPE1=a, SHAPE2=b, MIN=m, MAX=M)
                       Note: Commonly $m = 0$ and $M = 1$.
Density                $\frac{(\theta - m)^{a-1} (M - \theta)^{b-1}}{B(a, b)\, (M - m)^{a+b-1}}$
Parameter restriction  $a > 0$, $b > 0$, $-\infty < m < M < \infty$
Range                  $\begin{cases} [m, M] & \text{when } a = 1,\ b = 1 \\ [m, M) & \text{when } a = 1,\ b \ne 1 \\ (m, M] & \text{when } a \ne 1,\ b = 1 \\ (m, M) & \text{otherwise} \end{cases}$
Mean                   $\frac{a}{a+b} \times (M - m) + m$
Variance               $\frac{ab}{(a+b)^2 (a+b+1)} \times (M - m)^2$
Mode                   $\begin{cases} \frac{a-1}{a+b-2} \times M + \frac{b-1}{a+b-2} \times m & a > 1,\ b > 1 \\ m \text{ and } M & a < 1,\ b < 1 \\ m & \begin{cases} a < 1,\ b \ge 1 \\ a = 1,\ b > 1 \end{cases} \\ M & \begin{cases} a \ge 1,\ b < 1 \\ a > 1,\ b = 1 \end{cases} \\ \text{not unique} & a = b = 1 \end{cases}$
Defaults               SHAPE1=SHAPE2=1, MIN $\to -\infty$, MAX $\to \infty$
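
The density, mean, and variance rows of Table 5 can be cross-checked numerically. The sketch below evaluates the density on $[m, M]$ and integrates it with a midpoint rule; the function names and the choice $a = 2$, $b = 3$, $m = 1$, $M = 4$ are hypothetical and serve only as an illustration.

```python
import math

def beta_prior_pdf(theta, a, b, lo, hi):
    """Beta density on [lo, hi] as written in Table 5.

    For simplicity the endpoints are treated as outside the support.
    """
    if not (lo < theta < hi):
        return 0.0
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_pdf = ((a - 1) * math.log(theta - lo) + (b - 1) * math.log(hi - theta)
               - log_B - (a + b - 1) * math.log(hi - lo))
    return math.exp(log_pdf)

def midpoint_moments(pdf, lo, hi, steps=20000):
    """Total mass, mean, and variance of pdf on [lo, hi] by the midpoint rule."""
    h = (hi - lo) / steps
    total = first = second = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        w = pdf(x) * h
        total += w
        first += x * w
        second += x * x * w
    return total, first, second - first ** 2

a, b, lo, hi = 2.0, 3.0, 1.0, 4.0
total, mean, var = midpoint_moments(lambda t: beta_prior_pdf(t, a, b, lo, hi), lo, hi)
table_mean = a / (a + b) * (hi - lo) + lo                          # 2.2
table_var = a * b / ((a + b) ** 2 * (a + b + 1)) * (hi - lo) ** 2  # 0.36
```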


Table 6: Gamma Distribution

PRIOR statement        GAMMA(SHAPE=a, SCALE=b)
Density                $\frac{1}{b^{a}\, \Gamma(a)}\, \theta^{a-1} e^{-\theta/b}$
Parameter restriction  $a > 0$, $b > 0$
Range                  $[0, \infty)$
Mean                   $ab$
Variance               $ab^2$
Mode                   $(a - 1)\, b$
Defaults               SHAPE=SCALE=1
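
Because Table 6 uses a scale (not rate) parameterization, it can help to confirm the mean and variance by simulation. The sketch below relies on Python's `random.gammavariate(alpha, beta)`, whose second argument is also a scale; the values $a = 3$ and $b = 2$ are arbitrary illustrative choices.

```python
import random

rng = random.Random(1)
a, b = 3.0, 2.0                       # shape and scale
n_draws = 50000
draws = [rng.gammavariate(a, b) for _ in range(n_draws)]

sample_mean = sum(draws) / n_draws
sample_var = sum((x - sample_mean) ** 2 for x in draws) / n_draws

table_mean = a * b                    # Mean row of Table 6
table_var = a * b ** 2                # Variance row of Table 6
```

With this many draws, `sample_mean` and `sample_var` should land close to `table_mean` and `table_var`; a rate parameterization would instead give a mean of $a / b$.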


Table 7: Inverse Gamma Distribution

PRIOR statement        IGAMMA(SHAPE=a, SCALE=b)
Density                $\frac{b^{a}}{\Gamma(a)}\, \theta^{-(a+1)} e^{-b/\theta}$
Parameter restriction  $a > 0$, $b > 0$
Range                  $0 < \theta < \infty$
Mean                   $\frac{b}{a - 1}$, $a > 1$
Variance               $\frac{b^2}{(a-1)^2 (a-2)}$, $a > 2$
Mode                   $\frac{b}{a + 1}$
Defaults               SHAPE=2.000001, SCALE=1


Table 8: Normal Distribution

PRIOR statement        NORMAL(MEAN=$\mu$, VAR=$\sigma^2$)
Density                $\frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(\theta - \mu)^2}{2\sigma^2} \right)$
Parameter restriction  $\sigma^2 > 0$
Range                  $-\infty < \theta < \infty$
Mean                   $\mu$
Variance               $\sigma^2$
Mode                   $\mu$
Defaults               MEAN=0, VAR=1000000


Table 9: t Distribution

PRIOR statement        T(LOCATION=$\mu$, DF=$\nu$)
Density                $\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) \sqrt{\pi\nu}} \left[ 1 + \frac{(\theta - \mu)^2}{\nu} \right]^{-\frac{\nu+1}{2}}$
Parameter restriction  $\nu > 0$
Range                  $-\infty < \theta < \infty$
Mean                   $\mu$, for $\nu > 1$
Variance               $\frac{\nu}{\nu - 2}$, for $\nu > 2$
Mode                   $\mu$
Defaults               LOCATION=0, DF=3


Table 10: Uniform Distribution

PRIOR statement        UNIFORM(MIN=m, MAX=M)
Density                $\frac{1}{M - m}$
Parameter restriction  $-\infty < m < M < \infty$
Range                  $\theta \in [m, M]$
Mean                   $\frac{m + M}{2}$
Variance               $\frac{(M - m)^2}{12}$
Mode                   Not unique
Defaults               MIN $\to -\infty$, MAX $\to \infty$


Last updated: June 19, 2025