Consider the effect of age on an individual’s health self-assessment that is recorded on an ordered scale on which 0 indicates the poorest health. You can model the self-assessment outcome by an ordered probit or logit in PROC QLIM by using the option DISCRETE(D=NORMAL) or DISCRETE(D=LOGISTIC) in the MODEL or ENDOGENOUS statement.
One important shortcoming of this traditional way of modeling is the underlying assumption that, for all individuals, the explanatory variables have fixed constant coefficients. This assumption implies that the impact of the explanatory variables on the dependent variable is the same for all individuals. However, the assumption might not be realistic, because individuals are usually heterogeneous and hence the coefficient values are expected to vary across the individual observations. In the health self-assessment example, aging involves cognitive and physical decline, so on average the relationship between age and health is expected to be negative. However, assuming that this negative relationship is the same for every individual ignores the fact that for some individuals aging brings wiser life choices, including a healthier lifestyle and improved emotional well-being, and hence even improved health. Thus, enforcing a negative relationship can cause misleading inferences for this subgroup of individuals with a positive coefficient. Similarly, the effect might be negative for every individual, but its magnitude can vary across observations. In either case, if you are modeling such behavior, then taking into account the unobserved heterogeneity, where parameter values vary across the observations because of unobserved factors, is likely to give you more realistic results.
Random-parameters models accommodate such heterogeneity by allowing the coefficients to vary randomly across individuals based on some prespecified distribution, $\beta_i \sim N(\beta, \Sigma)$. The set of parameters $(\beta, \Sigma)$ defines the unobserved heterogeneity. Therefore, the goal is to estimate those parameters to define the individual heterogeneity.
If you have panel data, you can include random parameters by using the RANDOM statement for all the single-equation models of PROC QLIM—binary probit or logit, ordered probit or logit, Tobit (censored and truncated), stochastic frontier production and cost, and linear regression models—to generalize these models further in order to obtain more realistic results. However, the observations do not have to be collected in a panel data setting for you to apply random-parameters models in PROC QLIM. Random-parameters models can also be applied to cross-sectional data as long as you specify the group or subject variable across which the parameter heterogeneity occurs.
Random-parameters models allow individual heterogeneity in the coefficients in the latent process

$$ y_{it}^{*} = \mathbf{x}_{it}'\beta_i + \epsilon_{it} $$

where $y_{it}^{*}$ is a latent variable, $\mathbf{x}_{it}$ is a vector of covariates, and $\epsilon_{it}$ is the error term. In the applications for a panel data set, the subscript $i$ represents individuals and $t$ represents the time period.
The model assumes that the parameters $\beta_i$ are randomly distributed with mean $E[\beta_i] = \beta$ and variance $\mathrm{Var}[\beta_i] = \Sigma$, where $\Sigma$ is a positive definite matrix. If the random parameters are not correlated with one another, then $\Sigma$ becomes a diagonal matrix. Let $\Gamma$ be the Cholesky factorization of the covariance matrix of the random parameters, $\Sigma$. In other words, $\Gamma$ is the lower triangular matrix that produces $\Sigma = \Gamma\Gamma'$. By construction,

$$ \beta_i = \beta + \Gamma \mathbf{v}_i $$

where $\mathbf{v}_i$ is a random vector with zero means and unit standard deviations. In the no-correlation case, $\Gamma$ is also a diagonal matrix with the standard deviations of $\beta_i$ on the diagonal.

PROC QLIM assumes that the $\mathbf{v}_i$ are normally distributed; hence $\beta_i$ is normally distributed with mean vector $\beta$ and covariance matrix $\Sigma$.
Some of the explanatory variables in the latent model might have fixed (nonrandom) coefficients. In this case the latent process can be written conveniently as

$$ y_{it}^{*} = \mathbf{w}_{it}'\alpha + \mathbf{x}_{it}'(\beta + \Gamma \mathbf{v}_i) + \epsilon_{it} $$

where $\alpha$ is the vector of nonrandom (fixed) coefficients and $\beta$ is the vector of the means of the random coefficients.
The general form of the conditional density for the observed response can be written as

$$ f(y_{it} \mid \mathbf{x}_{it}, \mathbf{v}_i) = g(y_{it}, \mathbf{x}_{it}'\beta_i, \theta) $$

where $\theta$ is the parameter vector that includes the elements of $\beta$ and $\Gamma$; the standard deviation of $\epsilon_{it}$, $\sigma$; and other parameters specified by the model.
The joint density for the $i$th group conditional on $\mathbf{x}_i$ and $\mathbf{v}_i$ is

$$ f(\mathbf{y}_i \mid \mathbf{x}_i, \mathbf{v}_i) = \prod_{t=1}^{T_i} g(y_{it}, \mathbf{x}_{it}'\beta_i, \theta) $$

Because $\mathbf{v}_i$ is unobserved, it is necessary to obtain the unconditional likelihood by taking the expectation of this likelihood over the distribution of $\mathbf{v}_i$. Thus

$$ L_i = \int \left[ \prod_{t=1}^{T_i} g(y_{it}, \mathbf{x}_{it}'\beta_i, \theta) \right] h(\mathbf{v}_i)\, d\mathbf{v}_i $$

where $h(\mathbf{v}_i)$ is the probability density function of $\mathbf{v}_i$. Under the normality assumption, $h(\mathbf{v}_i) = \prod_{k=1}^{K} \phi(v_{ik})$, where $\phi$ is the probability density function of the standard normal distribution. The true log-likelihood function is obtained by summing $\ln L_i$, the log of the contribution of the $i$th individual to the total, over the individuals:

$$ \ln L = \sum_{i=1}^{N} \ln \left[ \int \left( \prod_{t=1}^{T_i} g(y_{it}, \mathbf{x}_{it}'\beta_i, \theta) \right) h(\mathbf{v}_i)\, d\mathbf{v}_i \right] $$
The integral in the square brackets does not have a closed form, so it is difficult to perform maximum likelihood estimation. However, this integration can be approximated and likelihood estimation is still possible. The subsection Estimation discusses various methods of approximation for this integral.
The nature of the dependent variable specifies the log-likelihood function. For example, if the dependent variable is binary and its probability is defined by a normal distribution (a probit model), then

$$ g(y_{it}, \mathbf{x}_{it}'\beta_i, \theta) = \Phi\left[(2y_{it} - 1)\,\mathbf{x}_{it}'\beta_i\right] $$

where $\Phi$ is the cumulative distribution function of the standard normal distribution. If the dependent variable is modeled by a logit, then

$$ g(y_{it}, \mathbf{x}_{it}'\beta_i, \theta) = \Lambda\left[(2y_{it} - 1)\,\mathbf{x}_{it}'\beta_i\right] $$

where $\Lambda$ is the cumulative distribution function of the standard logistic distribution.
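The binary-outcome contribution $g$ can be sketched directly in Python (an illustration of the formulas above, not PROC QLIM internals; the function names are made up). The $(2y - 1)$ device flips the sign of the index so that one expression covers both $y = 1$ and $y = 0$:

```python
from scipy.stats import norm, logistic

def g_probit(y, xb):
    """Contribution Phi[(2y - 1) x'beta] for a binary probit."""
    return norm.cdf((2 * y - 1) * xb)

def g_logit(y, xb):
    """Contribution Lambda[(2y - 1) x'beta] for a binary logit."""
    return logistic.cdf((2 * y - 1) * xb)

# For y = 1 the contribution is P(y = 1 | x); for y = 0 it is 1 - P(y = 1 | x),
# so the two contributions for the same index sum to one.
print(g_probit(1, 0.3) + g_probit(0, 0.3))
```

By the symmetry of the normal and logistic distributions, $\Phi(-z) = 1 - \Phi(z)$, which is what makes the compact $(2y-1)$ form valid.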
The likelihood function is maximized by solving the likelihood equations

$$ \frac{\partial \ln L}{\partial \theta} = \mathbf{0} $$
These derivatives involve integration. The integration is approximated by the same method that is used to calculate the likelihood.
When you use one of the simulation methods that are described in the subsections Monte Carlo Integration and QMC Method Using the Halton Sequence, the log likelihood to be optimized becomes

$$ \ln L_S = \sum_{i=1}^{N} \ln \left[ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} g(y_{it}, \mathbf{x}_{it}'\beta_{ir}, \theta) \right] $$

The general formulation of the gradients is

$$ \frac{\partial \ln L_S}{\partial \theta} = \sum_{i=1}^{N} \frac{ \frac{1}{R} \sum_{r=1}^{R} \frac{\partial}{\partial \theta} \prod_{t=1}^{T_i} g(y_{it}, \mathbf{x}_{it}'\beta_{ir}, \theta) }{ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} g(y_{it}, \mathbf{x}_{it}'\beta_{ir}, \theta) } $$

The formulation of the derivatives with respect to each type of parameter differs from model to model.
Note that $\theta$ includes the elements of $\Gamma$ rather than $\Sigma$. That is, the optimization is performed with respect to the elements of $\Gamma$. Therefore, when you use the ITPRINT option, the resulting output is based on the parameters that construct the lower triangular matrix from the Cholesky factorization of the covariance matrix of the random parameters. These parameters are labeled starting with _CHOL. For example, if two of the explanatory variables, $x_1$ and $x_2$, in your model have random coefficients, then the parameters that construct the diagonal of $\Gamma$ are _CHOL.x1.x1 and _CHOL.x2.x2, and the lower part of $\Gamma$ is _CHOL.x1.x2. If you use the NOCORR option, then the optimization is based on only the diagonal elements of $\Gamma$, and in this case _CHOL.x1.x1 and _CHOL.x2.x2 are the standard deviations of the coefficients of $x_1$ and $x_2$, respectively. Although the optimization is performed with respect to $\theta$, which includes the elements of $\Gamma$ rather than $\Sigma$, the results are transformed to obtain the elements of $\Sigma$ and their corresponding standard errors.
Random-effects models are a special case in which only the constant term is random. For these models, the parameter heterogeneity across individuals can be formulated as

$$ \beta_{0i} = \beta_0 + u_i $$

where $u_i$ has mean 0 and variance $\sigma_u^2$.
In most applications of random-effects models, this type of parameter heterogeneity is modeled as a group-specific unobservable heterogeneity in the error term as

$$ y_{it}^{*} = \mathbf{x}_{it}'\beta + \epsilon_{it} + u_i $$

where

$$ E[u_i] = 0, \qquad \mathrm{Var}[u_i] = \sigma_u^2 $$

The density of an observed random variable, $y_{it}$, is $f(y_{it} \mid \mathbf{x}_{it}, u_i)$. The density of the group-specific heterogeneity is $h(u_i)$.
For example, in the case of a random-effects Tobit model, $y_{it}$ is specified as

$$ y_{it} = \begin{cases} y_{it}^{*} & \text{if } y_{it}^{*} > 0 \\ 0 & \text{if } y_{it}^{*} \le 0 \end{cases} $$

where

$$ y_{it}^{*} = \mathbf{x}_{it}'\beta + \epsilon_{it} + u_i $$

where $\mathbf{y}_i$ contains $y_{it}$ for all $t$ and the parameter vector consists of $\beta$ and $\sigma$. Therefore, for this model,

$$ f(y_{it} \mid \mathbf{x}_{it}, u_i) = \left[ 1 - \Phi\!\left( \frac{\mathbf{x}_{it}'\beta + u_i}{\sigma} \right) \right]^{\mathbb{1}(y_{it} = 0)} \left[ \frac{1}{\sigma}\,\phi\!\left( \frac{y_{it} - \mathbf{x}_{it}'\beta - u_i}{\sigma} \right) \right]^{\mathbb{1}(y_{it} > 0)} $$

and

$$ h(u_i) = \frac{1}{\sigma_u}\,\phi\!\left( \frac{u_i}{\sigma_u} \right) $$

where $\Phi$ is the cumulative distribution function of the standard normal distribution, $\phi$ is the probability density function of the standard normal distribution, and $\mathbb{1}(\cdot)$ is the indicator function.
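As a numerical sanity check of the indicator-function form of the Tobit density (an illustrative sketch with made-up parameter values, not PROC QLIM code), the point mass at the censoring point plus the continuous density over $y > 0$ should integrate to one:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def tobit_density(y, xb, u, sigma):
    """Conditional density f(y | x, u) for a Tobit model left-censored at zero."""
    if y == 0:
        # probability mass at the censoring point
        return 1.0 - norm.cdf((xb + u) / sigma)
    # continuous density on y > 0
    return norm.pdf((y - xb - u) / sigma) / sigma

# Mass at zero plus the integral over the uncensored region equals one.
mass0 = tobit_density(0.0, 0.5, 0.2, 1.0)
cont, _ = quad(tobit_density, 0.0, np.inf, args=(0.5, 0.2, 1.0))
print(mass0 + cont)
```

This mixed discrete-continuous structure is why the Tobit contribution is written with the two indicator exponents: exactly one factor is active for each observation.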
For random-effects models, the unobserved component, $u_i$, must be integrated out in order to form the likelihood function for the observed data. For individual $i$,

$$ L_i = \int \left[ \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_i) \right] h(u_i)\, du_i $$

Therefore, the log-likelihood function for the observed data becomes

$$ \ln L = \sum_{i=1}^{N} \ln L_i $$
The notation for the likelihood function of a random-effects model is not much different from that of the random-parameters model discussed in the section General Models with Random Parameters. However, there is a substantial difference in the formulation of the likelihood function of the random-parameters model: the integration in $L_i$ is a multidimensional integral. More specifically, if the number of random parameters is $K$, then it is a $K$-dimensional integral.
The integral in the log-likelihood function for random-parameters models does not have a closed form; that is, it is difficult to integrate out the random parameters. However, the integral can be approximated, and the usual likelihood estimation can be pursued based on the approximated log-likelihood function. PROC QLIM offers three methods of approximation: Monte Carlo (MC) integration, the quasi–Monte Carlo (QMC) method using the Halton sequences, and approximation by Hermite quadrature. The first two methods are simulation methods, and hence the likelihood method based on the resulting simulated log-likelihood function is called the simulated maximum likelihood. The third method fails to provide a good approximation when the dimensionality of the random parameters, K, is high. The Hermite quadrature method can be used only for random-effects models or random-parameters models that have a single random coefficient (that is, $K = 1$).
Consider the random-effects model defined in the section Random-Effects Models. First, note that

$$ L_i = \int \left[ \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_i) \right] h(u_i)\, du_i = E_u\!\left[ \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_i) \right] $$

The function $\prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_i)$ is smooth, continuous, and continuously differentiable. By the law of large numbers, if $(u_{i1}, \dots, u_{iR})$ is a sample of iid draws from $h(u_i)$, then

$$ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_{ir}) \;\xrightarrow{\;p\;}\; E_u\!\left[ \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_i) \right] $$

This operation is implemented by simulation that uses a random number generator. PROC QLIM inserts the simulated integral in the log likelihood to obtain the simulated log likelihood

$$ \ln L_S = \sum_{i=1}^{N} \ln \left[ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_{ir}) \right] $$

and maximizes the simulated log likelihood with respect to the parameter set that includes $\beta$ and $\sigma_u$.
Under certain assumptions (Greene 2001), the simulated likelihood estimator and the maximum likelihood estimator are equivalent. For this equivalence result to hold, the number of draws, $R$, must increase faster than the square root of the number of observations, $\sqrt{N}$. For this reason, if the NDRAW= option is not specified, then by default the number of draws is tied to the sample size so that $R$ grows with $N$.
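The Monte Carlo approximation of a single likelihood contribution can be illustrated in Python. The sketch below (illustrative only; the parameter values are made up, and this is a one-period random-effects probit rather than the Tobit above) exploits the fact that for the probit the expectation $E[\Phi(\mathbf{x}'\beta + u)]$ with $u \sim N(0, \sigma_u^2)$ has the closed form $\Phi(\mathbf{x}'\beta / \sqrt{1 + \sigma_u^2})$, which makes the simulation easy to check:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

xb, sigma_u, R = 0.4, 0.8, 200_000

# Simulated likelihood contribution for one observation with y = 1:
# (1/R) sum_r Phi(x'beta + u_r), where u_r are iid N(0, sigma_u^2) draws
u = sigma_u * rng.standard_normal(R)
L_sim = norm.cdf(xb + u).mean()

# Closed-form value of E[Phi(x'beta + u)] for the probit case
L_exact = norm.cdf(xb / np.sqrt(1.0 + sigma_u**2))
print(L_sim, L_exact)
```

The simulation error shrinks at the usual $1/\sqrt{R}$ rate, which is why the number of draws must grow with the sample size for the simulated estimator to inherit the properties of maximum likelihood.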
Generalization of the log-likelihood function for random-parameters models is

$$ \ln L_S = \sum_{i=1}^{N} \ln \left[ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} g(y_{it}, \mathbf{x}_{it}'\beta_{ir}, \theta) \right] $$

where

$$ \beta_{ir} = \beta + \Gamma \mathbf{v}_{ir} $$

In this more general case, $\mathbf{v}_{ir}$ is the $r$th $K$-variate vector of random draws for individual $i$. The random draws come from the distribution with the probability density function $h(\cdot)$. PROC QLIM specifies $h(\cdot)$ as the probability density function of the standard normal distribution.
The use of independent random draws in simulation is conceptually straightforward, and the statistical properties of the simulated maximum likelihood estimator are easy to derive. However, simulation is a very computationally intensive technique. Moreover, the simulation method itself contributes to the variation of the simulated maximum likelihood estimator (see, for example, Geweke 1995). There are other ways to take draws that can provide greater accuracy by covering the domain of the integral more uniformly and by lowering the simulation variance (Train 2009, section 9.3). For example, quasi–Monte Carlo methods are based on an integration technique that replaces the pseudorandom draws of MC integration with a sequence of judiciously selected nonrandom points that provide more uniform coverage of the domain of the integral. Therefore, the advantage of QMC integration over MC integration is that for some types of sequences, the accuracy is far greater, convergence is much faster, and the simulation variance is smaller. QMC methods are surveyed in Bhat (2001), Sloan and Woźniakowski (1998), and Morokoff and Caflisch (1995). In addition to MC simulation, PROC QLIM offers the QMC integration method that uses Halton sequences.
Halton sequences (Halton 1960) provide uniform coverage for each observation’s integral, and they decrease the simulation variance by inducing a negative correlation over the draws for each observation. A Halton sequence is constructed deterministically in terms of a prime number as its base. For example, the following sequence is the Halton sequence for 2:

$$ \tfrac{1}{2},\ \tfrac{1}{4},\ \tfrac{3}{4},\ \tfrac{1}{8},\ \tfrac{5}{8},\ \tfrac{3}{8},\ \tfrac{7}{8},\ \tfrac{1}{16},\ \tfrac{9}{16},\ \dots $$
For more information about how to generate a Halton sequence, see Train (2009), section 9.3.3.
If you use the QMC method, first, K Halton sequences are created—that is, one Halton sequence for each random parameter, with each sequence corresponding to a different prime number between 2 and the Kth prime number. Then for each sequence, part of the sequence (or the whole sequence, depending on whether you decide to discard the initial elements of the sequences[13]) is used in groups. For a given sequence, each group of consecutive elements constitutes the "draws" for each cross-sectional observation. This way, each subsequence fills in the gaps left by the previous subsequences, and the draws for one observation tend to be negatively correlated with those for the previous observation.
When the number of draws that are used for each observation rises, the coverage for each observation improves. This improvement in turn improves the accuracy; however, the negative covariance across observations diminishes. Because Halton draws are far more effective than random draws in Monte Carlo simulation, a small number of Halton draws provide relatively good integration (Spanier and Maize 1991).
The Halton draws are for a uniform density. PROC QLIM obtains $\mathbf{v}_{ir}$ by evaluating the inverse cumulative standard normal distribution function at each element of the $r$th $K$-variate draw for the $i$th group.
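A Halton sequence can be generated with the standard radical-inverse construction, and the resulting uniform points can then be pushed through the inverse normal CDF, as described above. The following Python sketch is illustrative (the function name is made up, and this is not the PROC QLIM implementation):

```python
from fractions import Fraction
from scipy.stats import norm

def halton(base, n):
    """First n elements of the Halton sequence for a given base
    (radical-inverse / van der Corput construction)."""
    seq = []
    for i in range(1, n + 1):
        f, x, k = Fraction(1), Fraction(0), i
        while k > 0:
            f /= base           # next negative power of the base
            x += f * (k % base) # add the next base-`base` digit of i
            k //= base
        seq.append(x)
    return seq

h2 = halton(2, 7)
print(h2)  # 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8

# Halton points lie in (0, 1); the inverse standard normal CDF maps them
# to the normal draws used in the simulation.
draws = norm.ppf([float(x) for x in h2])
```

Note how each successive point falls in the largest remaining gap of the unit interval, which is the uniform-coverage property that makes Halton draws more efficient than pseudorandom draws.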
Consider the random-effects model that is defined in the section Random-Effects Models. This method is the Butler and Moffitt (1982) approach, which is based on models in which $u_i$ has a normal distribution. If $u_i$ is normally distributed with zero mean, then

$$ L_i = \int_{-\infty}^{\infty} \frac{1}{\sigma_u \sqrt{2\pi}} \exp\!\left( -\frac{u_i^2}{2\sigma_u^2} \right) \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, u_i)\, du_i $$

Let $r_i = u_i / (\sigma_u \sqrt{2})$. Then $u_i = \sigma_u \sqrt{2}\, r_i$ and $du_i = \sigma_u \sqrt{2}\, dr_i$. Making the change of variable and letting the error effects be additive produce

$$ L_i = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-r_i^2} \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, \sigma_u \sqrt{2}\, r_i)\, dr_i $$

This likelihood function is in a form that can be approximated accurately by using Gauss-Hermite quadrature, which eliminates the integration. Thus, the log-likelihood function can be approximated with

$$ \ln L \approx \sum_{i=1}^{N} \ln \left[ \frac{1}{\sqrt{\pi}} \sum_{h=1}^{H} w_h \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{it}, \sigma_u \sqrt{2}\, a_h) \right] $$

where $w_h$ and $a_h$ are the weights and nodes for the Hermite quadrature of degree H. PROC QLIM maximizes this approximated log-likelihood function when the Hermite quadrature option is specified (METHOD=HERMITE in the RANDOM statement).
[13] When sequences are created in multiple dimensions, the initial part of the series is usually eliminated because the initial terms of multiple Halton sequences are highly correlated. However, there is no such correlation for a single dimension.