MODEL Procedure

Estimation Methods

Consider the general nonlinear model:

\[
\begin{aligned}
\boldsymbol{\epsilon}_t &= \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \\
\mathbf{z}_t &= Z(\mathbf{x}_t)
\end{aligned}
\]

where $\mathbf{q} \in R^g$ is a real vector valued function of $\mathbf{y}_t \in R^g$, $\mathbf{x}_t \in R^l$, and $\boldsymbol{\theta} \in R^p$; $g$ is the number of equations, $l$ is the number of exogenous variables (lagged endogenous variables are considered exogenous here), $p$ is the number of parameters, and $t$ ranges from 1 to $n$. $\mathbf{z}_t \in R^k$ is a vector of instruments. $\boldsymbol{\epsilon}_t$ is an unobservable disturbance vector with the following properties:

\[
\begin{aligned}
E(\boldsymbol{\epsilon}_t) &= 0 \\
E(\boldsymbol{\epsilon}_t \boldsymbol{\epsilon}_t') &= \boldsymbol{\Sigma}
\end{aligned}
\]

All of the methods implemented in PROC MODEL aim to minimize an objective function. Table 2 summarizes the objective functions that define the estimators and the corresponding estimator of the covariance of the parameter estimates for each method.

Table 2: Summary of PROC MODEL Estimation Methods

| Method | Instruments | Objective Function | Covariance of $\boldsymbol{\theta}$ |
|---|---|---|---|
| OLS | No | $\mathbf{r}'\mathbf{r}/n$ | $(\mathbf{X}'(\mathrm{diag}(\mathbf{S})^{-1} \otimes \mathbf{I})\mathbf{X})^{-1}$ |
| ITOLS | No | $\mathbf{r}'(\mathrm{diag}(\mathbf{S})^{-1} \otimes \mathbf{I})\mathbf{r}/n$ | $(\mathbf{X}'(\mathrm{diag}(\mathbf{S})^{-1} \otimes \mathbf{I})\mathbf{X})^{-1}$ |
| SUR | No | $\mathbf{r}'(\mathbf{S}_{\mathrm{OLS}}^{-1} \otimes \mathbf{I})\mathbf{r}/n$ | $(\mathbf{X}'(\mathbf{S}^{-1} \otimes \mathbf{I})\mathbf{X})^{-1}$ |
| ITSUR | No | $\mathbf{r}'(\mathbf{S}^{-1} \otimes \mathbf{I})\mathbf{r}/n$ | $(\mathbf{X}'(\mathbf{S}^{-1} \otimes \mathbf{I})\mathbf{X})^{-1}$ |
| N2SLS | Yes | $\mathbf{r}'(\mathbf{I} \otimes \mathbf{W})\mathbf{r}/n$ | $(\mathbf{X}'(\mathrm{diag}(\mathbf{S})^{-1} \otimes \mathbf{W})\mathbf{X})^{-1}$ |
| IT2SLS | Yes | $\mathbf{r}'(\mathrm{diag}(\mathbf{S})^{-1} \otimes \mathbf{W})\mathbf{r}/n$ | $(\mathbf{X}'(\mathrm{diag}(\mathbf{S})^{-1} \otimes \mathbf{W})\mathbf{X})^{-1}$ |
| N3SLS | Yes | $\mathbf{r}'(\mathbf{S}_{\mathrm{N2SLS}}^{-1} \otimes \mathbf{W})\mathbf{r}/n$ | $(\mathbf{X}'(\mathbf{S}^{-1} \otimes \mathbf{W})\mathbf{X})^{-1}$ |
| IT3SLS | Yes | $\mathbf{r}'(\mathbf{S}^{-1} \otimes \mathbf{W})\mathbf{r}/n$ | $(\mathbf{X}'(\mathbf{S}^{-1} \otimes \mathbf{W})\mathbf{X})^{-1}$ |
| GMM | Yes | $[n\mathbf{m}_n(\boldsymbol{\theta})]'\hat{\mathbf{V}}_{\mathrm{N2SLS}}^{-1}[n\mathbf{m}_n(\boldsymbol{\theta})]/n$ | $[(\mathbf{Y}\mathbf{X})'\hat{\mathbf{V}}^{-1}(\mathbf{Y}\mathbf{X})]^{-1}$ |
| ITGMM | Yes | $[n\mathbf{m}_n(\boldsymbol{\theta})]'\hat{\mathbf{V}}^{-1}[n\mathbf{m}_n(\boldsymbol{\theta})]/n$ | $[(\mathbf{Y}\mathbf{X})'\hat{\mathbf{V}}^{-1}(\mathbf{Y}\mathbf{X})]^{-1}$ |
| FIML | No | $\mathrm{constant} + \frac{n}{2}\ln(\det(\mathbf{S})) - \sum_{t=1}^{n} \ln\lvert(\mathbf{J}_t)\rvert$ | $[\hat{\mathbf{Z}}'(\mathbf{S}^{-1} \otimes \mathbf{I})\hat{\mathbf{Z}}]^{-1}$ |


The Instruments column identifies the estimation methods that require instruments. The variables used in this table and the remainder of this chapter are defined as follows:

n is the number of nonmissing observations.

g is the number of equations.

k is the number of instrumental variables.

$\mathbf{r} = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_g \end{bmatrix}$ is the $ng \times 1$ vector of residuals for the $g$ equations stacked together.

$\mathbf{r}_i = \begin{bmatrix} q_i(\mathbf{y}_1, \mathbf{x}_1, \boldsymbol{\theta}) \\ q_i(\mathbf{y}_2, \mathbf{x}_2, \boldsymbol{\theta}) \\ \vdots \\ q_i(\mathbf{y}_n, \mathbf{x}_n, \boldsymbol{\theta}) \end{bmatrix}$ is the $n \times 1$ column vector of residuals for the $i$th equation.

$\mathbf{S}$ is a $g \times g$ matrix that estimates $\boldsymbol{\Sigma}$, the covariances of the errors across equations (referred to as the S matrix).

$\mathbf{X}$ is an $ng \times p$ matrix of partial derivatives of the residual with respect to the parameters.

$\mathbf{W}$ is an $n \times n$ matrix, $\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$.

$\mathbf{Z}$ is an $n \times k$ matrix of instruments.

$\mathbf{Y}$ is a $gk \times ng$ matrix of instruments. $\mathbf{Y} = \mathbf{I}_g \otimes \mathbf{Z}'$.

$\hat{\mathbf{Z}} = (\hat{Z}_1, \hat{Z}_2, \ldots, \hat{Z}_p)$ is an $ng \times p$ matrix. $\hat{Z}_i$ is an $ng \times 1$ column vector obtained from stacking the columns of

\[
\mathbf{U} \frac{1}{n} \sum_{t=1}^{n} \left( \frac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'}{\partial \mathbf{y}_t} \right)^{-1} \frac{\partial^2 \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'}{\partial \mathbf{y}_t \, \partial \theta_i} - \mathbf{Q}_i
\]

$\mathbf{U}$ is an $n \times g$ matrix of residual errors. $\mathbf{U} = (\boldsymbol{\epsilon}_1, \boldsymbol{\epsilon}_2, \ldots, \boldsymbol{\epsilon}_n)'$.

$\mathbf{Q}$ is the $n \times g$ matrix $(\mathbf{q}(\mathbf{y}_1, \mathbf{x}_1, \boldsymbol{\theta}), \mathbf{q}(\mathbf{y}_2, \mathbf{x}_2, \boldsymbol{\theta}), \ldots, \mathbf{q}(\mathbf{y}_n, \mathbf{x}_n, \boldsymbol{\theta}))'$.

$\mathbf{Q}_i$ is an $n \times g$ matrix $\dfrac{\partial \mathbf{Q}}{\partial \theta_i}$.

$\mathbf{I}$ is an $n \times n$ identity matrix.

$\mathbf{J}_t$ is $\dfrac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})}{\partial \mathbf{y}_t'}$, which is a $g \times g$ Jacobian matrix.

$\mathbf{m}_n$ is the first moment of the crossproduct $\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t$,

\[
\mathbf{m}_n = \frac{1}{n} \sum_{t=1}^{n} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t
\]

$\mathbf{z}_t$ is a $k \times 1$ column vector of instruments for observation $t$. $\mathbf{z}_t'$ is also the $t$th row of $\mathbf{Z}$.

$\hat{\mathbf{V}}$ is the $gk \times gk$ matrix that represents the variance of the moment functions.

$k$ is the number of instrumental variables used.

constant is the constant $\dfrac{ng}{2}(1 + \ln(2\pi))$.

$\otimes$ is the notation for a Kronecker product.

All vectors are column vectors unless otherwise noted. Other estimates of the covariance matrix for FIML are also available.

Dependent Regressors and Two-Stage Least Squares

Ordinary regression analysis is based on several assumptions. A key assumption is that the independent variables are in fact statistically independent of the unobserved error component of the model. If this assumption is not true (if the regressor varies systematically with the error), then ordinary regression produces inconsistent results. The parameter estimates are biased.

Regressors might fail to be independent variables because they are dependent variables in a larger simultaneous system. For this reason, the problem of dependent regressors is often called simultaneous equation bias. For example, consider the following two-equation system:

\[
\begin{aligned}
y_1 &= a_1 + b_1 y_2 + c_1 x_1 + \epsilon_1 \\
y_2 &= a_2 + b_2 y_1 + c_2 x_2 + \epsilon_2
\end{aligned}
\]

In the first equation, $y_2$ is a dependent, or endogenous, variable. As shown by the second equation, $y_2$ is a function of $y_1$, which by the first equation is a function of $\epsilon_1$, and therefore $y_2$ depends on $\epsilon_1$. Likewise, $y_1$ depends on $\epsilon_2$ and is a dependent regressor in the second equation. This is an example of a simultaneous equation system; $y_1$ and $y_2$ are functions of all the variables in the system.

Using the ordinary least squares (OLS) estimation method to estimate these equations produces biased estimates. One solution to this problem is to replace $y_1$ and $y_2$ on the right-hand side of the equations with predicted values, thus changing the regression problem to the following:

\[
\begin{aligned}
y_1 &= a_1 + b_1 \hat{y}_2 + c_1 x_1 + \epsilon_1 \\
y_2 &= a_2 + b_2 \hat{y}_1 + c_2 x_2 + \epsilon_2
\end{aligned}
\]

This method requires estimating the predicted values $\hat{y}_1$ and $\hat{y}_2$ through a preliminary, or "first stage," instrumental regression. An instrumental regression is a regression of the dependent regressors on a set of instrumental variables, which can be any independent variables useful for predicting the dependent regressors. In this example, the equations are linear and the exogenous variables for the whole system are known. Thus, the best choices for instruments (of the variables in the model) are the variables $x_1$ and $x_2$.

This method is known as two-stage least squares or 2SLS, or more generally as the instrumental variables method. The 2SLS method for linear models is discussed in Pindyck and Rubinfeld (1981, pp. 191–192). For nonlinear models this situation is more complex, but the idea is the same. In nonlinear 2SLS, the derivatives of the model with respect to the parameters are replaced with predicted values. For further discussion of the use of instrumental variables in nonlinear regression, see the section Choice of Instruments.

To perform nonlinear 2SLS estimation with PROC MODEL, specify the instrumental variables with an INSTRUMENTS statement and specify the 2SLS or N2SLS option in the FIT statement. The following statements show how to estimate the first equation in the preceding example with PROC MODEL:

proc model data=in;
   y1 = a1 + b1 * y2 + c1 * x1;
   fit y1 / 2sls;
   instruments x1 x2;
run;

The 2SLS or instrumental variables estimator can be computed by using a first-stage regression on the instrumental variables as described previously. However, PROC MODEL actually uses the equivalent but computationally more appropriate technique of projecting the regression problem into the linear space defined by the instruments. Thus, PROC MODEL does not produce any "first stage" results when you use 2SLS. If you specify the FSRSQ option in the FIT statement, PROC MODEL prints the "First-Stage $R^2$" statistic for each parameter estimate.
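For a linear equation, the equivalence between the first-stage replacement and the projection can be checked in a few lines. The sketch below assumes a single linear equation $\mathbf{y} = \mathbf{G}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where the regressor matrix $\mathbf{G}$ and coefficient vector $\boldsymbol{\beta}$ are notation introduced here only for illustration, and $\mathbf{W} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ is the projection onto the instruments.

```latex
\begin{aligned}
\hat{\mathbf{G}} &= \mathbf{W}\mathbf{G}
  && \text{(first-stage fitted values)} \\
\hat{\boldsymbol{\beta}}_{\text{two-step}}
  &= (\hat{\mathbf{G}}'\hat{\mathbf{G}})^{-1}\hat{\mathbf{G}}'\mathbf{y}
   = (\mathbf{G}'\mathbf{W}'\mathbf{W}\mathbf{G})^{-1}\mathbf{G}'\mathbf{W}'\mathbf{y} \\
  &= (\mathbf{G}'\mathbf{W}\mathbf{G})^{-1}\mathbf{G}'\mathbf{W}\mathbf{y}
  && \text{(since } \mathbf{W}' = \mathbf{W} \text{ and } \mathbf{W}\mathbf{W} = \mathbf{W}\text{)}
\end{aligned}
```

The last expression is also the minimizer of the projected criterion $(\mathbf{y} - \mathbf{G}\boldsymbol{\beta})'\mathbf{W}(\mathbf{y} - \mathbf{G}\boldsymbol{\beta})$, so in this linear case both routes give the same estimate without the first-stage regression ever being reported.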

Formally, the $\hat{\boldsymbol{\theta}}$ that minimizes

\[
\hat{S}_n = \frac{1}{n} \left( \sum_{t=1}^{n} \left( \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t \right) \right)' \left( \sum_{t=1}^{n} I \otimes \mathbf{z}_t \mathbf{z}_t' \right)^{-1} \left( \sum_{t=1}^{n} \left( \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t \right) \right)
\]

is the N2SLS estimator of the parameters. The estimate of $\boldsymbol{\Sigma}$ at the final iteration is used in the covariance of the parameters given in Table 2. For more information about the properties of nonlinear two-stage least squares, see Amemiya (1985, p. 250).
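The summation form of the N2SLS objective and the stacked form listed for N2SLS in Table 2 describe the same quantity. The following sketch uses the stacked residual vector $\mathbf{r}$, the instrument matrix $\mathbf{Z}$, and $\mathbf{W} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ as defined for Table 2, and assumes that the identity matrix in this expression is $g \times g$ (written $\mathbf{I}_g$ here) so that the dimensions conform.

```latex
\begin{aligned}
\sum_{t=1}^{n} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t
  &= (\mathbf{I}_g \otimes \mathbf{Z}')\,\mathbf{r} \\
\sum_{t=1}^{n} \mathbf{I}_g \otimes \mathbf{z}_t \mathbf{z}_t'
  &= \mathbf{I}_g \otimes \mathbf{Z}'\mathbf{Z} \\
\hat{S}_n
  &= \frac{1}{n}\,\mathbf{r}'(\mathbf{I}_g \otimes \mathbf{Z})
     \left(\mathbf{I}_g \otimes (\mathbf{Z}'\mathbf{Z})^{-1}\right)
     (\mathbf{I}_g \otimes \mathbf{Z}')\,\mathbf{r}
   = \mathbf{r}'(\mathbf{I}_g \otimes \mathbf{W})\,\mathbf{r}/n
\end{aligned}
```

The first line holds because the $i$th block of the sum is $\sum_t q_i(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\,\mathbf{z}_t = \mathbf{Z}'\mathbf{r}_i$.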

Seemingly Unrelated Regression

If the regression equations are not simultaneous (so there are no dependent regressors), seemingly unrelated regression (SUR) can be used to estimate systems of equations with correlated random errors. The large-sample efficiency of an estimation can be improved if these cross-equation correlations are taken into account. SUR is also known as joint generalized least squares or Zellner regression. Formally, the $\hat{\boldsymbol{\theta}}$ that minimizes

\[
\hat{S}_n = \frac{1}{n} \sum_{t=1}^{n} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})' \, \hat{\boldsymbol{\Sigma}}^{-1} \, \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})
\]

is the SUR estimator of the parameters.

The SUR method requires an estimate of the cross-equation covariance matrix, $\boldsymbol{\Sigma}$. PROC MODEL first performs an OLS estimation, computes an estimate, $\hat{\boldsymbol{\Sigma}}$, from the OLS residuals, and then performs the SUR estimation based on $\hat{\boldsymbol{\Sigma}}$. The OLS results are not printed unless you specify the OLS option in addition to the SUR option.

You can specify the $\hat{\boldsymbol{\Sigma}}$ to use for SUR by storing the matrix in a SAS data set and naming that data set in the SDATA= option. You can also feed the $\hat{\boldsymbol{\Sigma}}$ computed from the SUR residuals back into the SUR estimation process by specifying the ITSUR option. You can print the estimated covariance matrix $\hat{\boldsymbol{\Sigma}}$ by using the COVS option in the FIT statement.

The SUR method requires estimation of the $\boldsymbol{\Sigma}$ matrix, and this increases the sampling variability of the estimator for small sample sizes. The efficiency gain that SUR has over OLS is a large sample property, and you must have a reasonable amount of data to realize this gain. For a more detailed discussion of SUR, see Pindyck and Rubinfeld (1981, pp. 331–333).
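The SUR options described above might be combined as in the following sketch. The data set IN, the variables y1, y2, x1, and x2, and the parameter names are hypothetical, and the two equations are chosen so that neither contains a dependent regressor.

```sas
/* Sketch only: data set IN and all variable names are hypothetical. */
proc model data=in;
   parms a1 c1 a2 c2;
   y1 = a1 + c1 * x1;
   y2 = a2 + c2 * x2;
   /* SUR based on the S matrix from a preliminary OLS step; */
   /* COVS prints the estimated cross-equation covariance matrix */
   fit y1 y2 / sur covs;
   /* ITSUR feeds the S matrix from the SUR residuals back into the estimation */
   fit y1 y2 / itsur covs;
run;
```

Adding the OLS option to the first FIT statement would also print the preliminary OLS results.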

Three-Stage Least Squares Estimation

If the equation system is simultaneous, you can combine the 2SLS and SUR methods to take into account both dependent regressors and cross-equation correlation of the errors. This is called three-stage least squares (3SLS).

Formally, the $\hat{\boldsymbol{\theta}}$ that minimizes

\[
\hat{S}_n = \frac{1}{n} \left( \sum_{t=1}^{n} \left( \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t \right) \right)' \left( \sum_{t=1}^{n} \left( \hat{\boldsymbol{\Sigma}} \otimes \mathbf{z}_t \mathbf{z}_t' \right) \right)^{-1} \left( \sum_{t=1}^{n} \left( \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t \right) \right)
\]

is the 3SLS estimator of the parameters. For more information about 3SLS, see Gallant (1987, p. 435).

Residuals from the 2SLS method are used to estimate the $\boldsymbol{\Sigma}$ matrix required for 3SLS. The results of the preliminary 2SLS step are not printed unless the 2SLS option is also specified.

To use the three-stage least squares method, specify an INSTRUMENTS statement and use the 3SLS or N3SLS option in either the PROC MODEL statement or a FIT statement.
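As an illustration, the two-equation simultaneous system from the section on dependent regressors might be estimated by 3SLS as follows. This is a sketch: the data set IN and the variable names are hypothetical, and the PARMS statement is added here to declare the parameters explicitly.

```sas
/* Sketch only: data set IN and all variable names are hypothetical. */
proc model data=in;
   parms a1 b1 c1 a2 b2 c2;
   y1 = a1 + b1 * y2 + c1 * x1;
   y2 = a2 + b2 * y1 + c2 * x2;
   /* the exogenous variables of the system serve as instruments */
   instruments x1 x2;
   /* adding the 2SLS option would also print the preliminary 2SLS step */
   fit y1 y2 / 3sls;
run;
```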

Generalized Method of Moments (GMM)

For systems of equations with heteroscedastic errors, generalized method of moments (GMM) can be used to obtain efficient estimates of the parameters. For alternatives to GMM, see the section Heteroscedasticity.

Consider the nonlinear model

\[
\begin{aligned}
\boldsymbol{\epsilon}_t &= \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \\
\mathbf{z}_t &= Z(\mathbf{x}_t)
\end{aligned}
\]

where $\mathbf{z}_t$ is a vector of instruments and $\boldsymbol{\epsilon}_t$ is an unobservable disturbance vector that can be serially correlated and nonstationary.

In general, the following orthogonality condition is desired:

\[
E(\boldsymbol{\epsilon}_t \otimes \mathbf{z}_t) = 0
\]

This condition states that the expected crossproducts of the unobservable disturbances, $\boldsymbol{\epsilon}_t$, and functions of the observable variables are set to 0. The first moment of the crossproducts is

\[
\begin{aligned}
\mathbf{m}_n &= \frac{1}{n} \sum_{t=1}^{n} \mathbf{m}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \\
\mathbf{m}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) &= \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t
\end{aligned}
\]

where $\mathbf{m}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \in R^{gk}$.

The case where $gk > p$ is considered here, where $p$ is the number of parameters.

Estimate the true parameter vector $\boldsymbol{\theta}^0$ by the value of $\hat{\boldsymbol{\theta}}$ that minimizes

\[
S(\boldsymbol{\theta}, \mathbf{V}) = [n\mathbf{m}_n(\boldsymbol{\theta})]' \, \mathbf{V}^{-1} \, [n\mathbf{m}_n(\boldsymbol{\theta})] / n
\]

where

\[
\mathbf{V} = \mathrm{Cov}\left( [n\mathbf{m}_n(\boldsymbol{\theta}^0)], [n\mathbf{m}_n(\boldsymbol{\theta}^0)]' \right)
\]

The parameter vector that minimizes this objective function is the GMM estimator. GMM estimation is requested in the FIT statement with the GMM option.

The variance of the moment functions, $\mathbf{V}$, can be expressed as

\[
\begin{aligned}
\mathbf{V} &= E \left( \sum_{t=1}^{n} \boldsymbol{\epsilon}_t \otimes \mathbf{z}_t \right) \left( \sum_{s=1}^{n} \boldsymbol{\epsilon}_s \otimes \mathbf{z}_s \right)' \\
&= \sum_{t=1}^{n} \sum_{s=1}^{n} E \left[ (\boldsymbol{\epsilon}_t \otimes \mathbf{z}_t)(\boldsymbol{\epsilon}_s \otimes \mathbf{z}_s)' \right] \\
&= n \mathbf{S}_n^0
\end{aligned}
\]

where $\mathbf{S}_n^0$ is estimated as

\[
\hat{\mathbf{S}}_n = \frac{1}{n} \sum_{t=1}^{n} \sum_{s=1}^{n} \left( \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t \right) \left( \mathbf{q}(\mathbf{y}_s, \mathbf{x}_s, \boldsymbol{\theta}) \otimes \mathbf{z}_s \right)'
\]

Note that $\hat{\mathbf{S}}_n$ is a $gk \times gk$ matrix. Because $\mathrm{Var}(\hat{\mathbf{S}}_n)$ does not decrease with increasing $n$, you consider estimators of $\mathbf{S}_n^0$ of the form

\[
\begin{aligned}
\hat{\mathbf{S}}_n(l(n)) &= \sum_{\tau=-n+1}^{n-1} \hat{w}\left( \frac{\tau}{l(n)} \right) \mathbf{D} \hat{\mathbf{S}}_{n,\tau} \mathbf{D} \\
\hat{\mathbf{S}}_{n,\tau} &=
\begin{cases}
\sum_{t=1+\tau}^{n} \left[ \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}^{\#}) \otimes \mathbf{z}_t \right] \left[ \mathbf{q}(\mathbf{y}_{t-\tau}, \mathbf{x}_{t-\tau}, \boldsymbol{\theta}^{\#}) \otimes \mathbf{z}_{t-\tau} \right]' & \tau \ge 0 \\
(\hat{\mathbf{S}}_{n,-\tau})' & \tau < 0
\end{cases} \\
\hat{w}\left( \frac{\tau}{l(n)} \right) &=
\begin{cases}
w\left( \frac{\tau}{l(n)} \right) & l(n) > 0 \\
\delta_{\tau,0} & l(n) = 0
\end{cases}
\end{aligned}
\]

where $l(n)$ is a scalar function that computes the bandwidth parameter, $w(\cdot)$ is a scalar valued kernel, and the Kronecker delta function, $\delta_{i,j}$, is 1 if $i = j$ and 0 otherwise. The diagonal matrix $\mathbf{D}$ is used for a small sample degrees of freedom correction (Gallant 1987). The initial $\boldsymbol{\theta}^{\#}$ used for the estimation of $\hat{\mathbf{S}}_n$ is obtained from a 2SLS estimation of the system. The degrees of freedom correction is handled by the VARDEF= option as it is for the S matrix estimation.

The following kernels are supported by PROC MODEL. They are listed with their default bandwidth functions.

Bartlett: KERNEL=BART

\[
\begin{aligned}
w(x) &=
\begin{cases}
1 - |x| & |x| \le 1 \\
0 & \text{otherwise}
\end{cases} \\
l(n) &= \tfrac{1}{2} n^{1/3}
\end{aligned}
\]

Parzen: KERNEL=PARZEN

\[
\begin{aligned}
w(x) &=
\begin{cases}
1 - 6|x|^2 + 6|x|^3 & 0 \le |x| \le \tfrac{1}{2} \\
2(1 - |x|)^3 & \tfrac{1}{2} \le |x| \le 1 \\
0 & \text{otherwise}
\end{cases} \\
l(n) &= n^{1/5}
\end{aligned}
\]

Quadratic spectral: KERNEL=QS

\[
\begin{aligned}
w(x) &= \frac{25}{12 \pi^2 x^2} \left( \frac{\sin(6 \pi x / 5)}{6 \pi x / 5} - \cos(6 \pi x / 5) \right) \\
l(n) &= \tfrac{1}{2} n^{1/5}
\end{aligned}
\]

Figure 23: Kernels for Smoothing



For more information about the properties of these and other kernels, see Andrews (1991). Kernels are selected with the KERNEL= option; KERNEL=PARZEN is the default. The general form of the KERNEL= option is

KERNEL=( PARZEN | QS | BART, c, e )

where $e \ge 0$ and $c \ge 0$ are used to compute the bandwidth parameter as

\[
l(n) = c \, n^{e}
\]

The bias of the standard error estimates increases for large bandwidth parameters. A warning message is produced for bandwidth parameters greater than $n^{1/3}$. For a discussion of the computation of the optimal $l(n)$, see Andrews (1991).
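As a worked example of the bandwidth arithmetic, take a hypothetical sample of $n = 1000$ observations (a value chosen here only because the roots are easy to evaluate). The default bandwidth functions listed with each kernel then give approximately:

```latex
\begin{aligned}
\text{Bartlett:} \quad & l(n) = \tfrac{1}{2}\, n^{1/3} = \tfrac{1}{2}(10) = 5 \\
\text{Parzen:} \quad & l(n) = n^{1/5} = 10^{0.6} \approx 3.98 \\
\text{Quadratic spectral:} \quad & l(n) = \tfrac{1}{2}\, n^{1/5} \approx 1.99 \\
\text{Warning threshold:} \quad & n^{1/3} = 10
\end{aligned}
```

All three defaults fall below the threshold at this sample size. By contrast, a user-specified choice such as $c = 2$, $e = 0.5$ would give $l(n) = 2\sqrt{1000} \approx 63.2$, which exceeds $n^{1/3}$.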

The "Newey-West" kernel (Newey and West 1987) corresponds to the Bartlett kernel with bandwidth parameter $l(n) = L + 1$. That is, if the "lag length" for the Newey-West kernel is $L$, then the corresponding MODEL procedure syntax is KERNEL=(bart, L+1, 0).
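For instance, a Newey-West lag length of 2 would correspond to KERNEL=(bart, 3, 0). The following sketch applies that setting to a single-equation GMM fit. The data set IN, the variable names, and in particular the extra instrument x3 are hypothetical; x3 is included only so that the number of moment conditions exceeds the number of parameters.

```sas
/* Sketch only: data set IN and all variable names are hypothetical. */
proc model data=in;
   parms a1 b1 c1;
   y1 = a1 + b1 * y2 + c1 * x1;
   /* x3 is an assumed additional exogenous variable */
   instruments x1 x2 x3;
   /* Bartlett kernel with c = L + 1 = 3 and e = 0, so l(n) = 3 for every n */
   fit y1 / gmm kernel=(bart,3,0);
run;
```

Because the exponent is 0, the bandwidth does not grow with the sample size, which matches the fixed lag length of the Newey-West form.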

Andrews and Monahan (1992) show that using prewhitening in combination with GMM can improve confidence interval coverage and reduce overrejection of $t$ statistics at the cost of inflating the variance and MSE of the estimator. Prewhitening can be performed by using the %AR macros.

For the special case in which the errors are not serially correlated, that is,

\[ E\,(e_t \otimes \mathbf{z}_t)(e_s \otimes \mathbf{z}_s) = 0, \qquad t \ne s \]

the estimate for $\mathbf{S}_n^0$ reduces to

\[
\hat{\mathbf{S}}_n = \frac{1}{n} \sum_{t=1}^{n}
\left[\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t\right]
\left[\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t\right]'
\]

The option KERNEL=(kernel,0,) is used to select this type of estimation when using GMM.
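The estimate for the serially uncorrelated case is an average of outer products of the per-observation moment vectors. The following pure-Python sketch shows that computation for small inputs; `q_rows` (residuals by observation) and `z_rows` (instruments by observation) are hypothetical inputs, and the code illustrates the formula rather than PROC MODEL's internal implementation.

```python
def kron_vec(q, z):
    """Kronecker product q (x) z of two vectors, returned as a flat list."""
    return [qi * zj for qi in q for zj in z]

def s_hat(q_rows, z_rows):
    """(1/n) * sum over t of (q_t (x) z_t)(q_t (x) z_t)'."""
    n = len(q_rows)
    dim = len(q_rows[0]) * len(z_rows[0])
    S = [[0.0] * dim for _ in range(dim)]
    for q, z in zip(q_rows, z_rows):
        m = kron_vec(q, z)
        for i in range(dim):
            for j in range(dim):
                S[i][j] += m[i] * m[j] / n
    return S

if __name__ == "__main__":
    # One equation, two instruments, two observations.
    print(s_hat([[1.0], [-1.0]], [[1.0, 2.0], [1.0, 0.0]]))
```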

Covariance of GMM estimators

The covariance of GMM estimators, given a general weighting matrix $\mathbf{V}_{\mathrm{G}}^{-1}$, is

\[
\left[(\mathbf{Y}\mathbf{X})' \mathbf{V}_{\mathrm{G}}^{-1} (\mathbf{Y}\mathbf{X})\right]^{-1}
(\mathbf{Y}\mathbf{X})' \mathbf{V}_{\mathrm{G}}^{-1} \hat{\mathbf{V}} \mathbf{V}_{\mathrm{G}}^{-1} (\mathbf{Y}\mathbf{X})
\left[(\mathbf{Y}\mathbf{X})' \mathbf{V}_{\mathrm{G}}^{-1} (\mathbf{Y}\mathbf{X})\right]^{-1}
\]

This estimator is used by default or when GENGMMV is specified.

If the weighting matrix is the same as $\hat{\mathbf{V}}$, then the covariance of GMM estimators becomes

\[
\left[(\mathbf{Y}\mathbf{X})' \hat{\mathbf{V}}^{-1} (\mathbf{Y}\mathbf{X})\right]^{-1}
\]

If NOGENGMMV is specified, this is used as the covariance estimator.
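The reduction from the general form to the simpler one follows by direct substitution. Writing $\mathbf{A} = \mathbf{Y}\mathbf{X}$ as shorthand (a symbol introduced only for this sketch) and setting $\mathbf{V}_{\mathrm{G}} = \hat{\mathbf{V}}$:

```latex
\begin{aligned}
&\left[\mathbf{A}'\hat{\mathbf{V}}^{-1}\mathbf{A}\right]^{-1}
 \mathbf{A}'\hat{\mathbf{V}}^{-1}\,\hat{\mathbf{V}}\,\hat{\mathbf{V}}^{-1}\mathbf{A}
 \left[\mathbf{A}'\hat{\mathbf{V}}^{-1}\mathbf{A}\right]^{-1} \\
&\quad= \left[\mathbf{A}'\hat{\mathbf{V}}^{-1}\mathbf{A}\right]^{-1}
 \left(\mathbf{A}'\hat{\mathbf{V}}^{-1}\mathbf{A}\right)
 \left[\mathbf{A}'\hat{\mathbf{V}}^{-1}\mathbf{A}\right]^{-1} \\
&\quad= \left[\mathbf{A}'\hat{\mathbf{V}}^{-1}\mathbf{A}\right]^{-1}
\end{aligned}
```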

Testing Overidentifying Restrictions

Let $r$ be the number of unique instruments times the number of equations, and let $p$ be the number of parameters. The value $r$ represents the number of orthogonality conditions imposed by the GMM method. The GMM estimates are computed by setting $p$ linear combinations of the orthogonality conditions to zero; under the assumptions of the GMM method, the remaining $r - p$ linearly independent combinations should also be close to zero. When $r$ exceeds the number of parameters to be estimated, OBJECTIVE*N, reported at the end of the estimation, is an asymptotically valid statistic for testing the null hypothesis that the overidentifying restrictions of the model are valid. Under the null hypothesis, OBJECTIVE*N is asymptotically distributed as a chi-square with $r - p$ degrees of freedom (Hansen 1982, p. 1049). When the GMM method is selected, the value of the overidentifying restrictions test statistic, also known as Hansen's J test statistic, and its associated number of degrees of freedom are reported together with the probability under the null hypothesis.
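The degrees of freedom and p-value of the test can be reproduced by hand from the reported OBJECTIVE*N value. The sketch below is a standalone illustration with hypothetical function names; it computes the chi-square upper-tail probability for integer degrees of freedom by using the standard recurrence for the regularized incomplete gamma function. The caller supplies the instrument count (whether an intercept is counted is left to the caller).

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)              # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))   # Q(1/2, h) = erfc(sqrt(h))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

def j_test(objective_times_n: float, n_instruments: int,
           n_equations: int, n_params: int):
    """Return (degrees of freedom, p-value) for the overidentifying restrictions test."""
    r = n_instruments * n_equations
    df = r - n_params
    if df <= 0:
        raise ValueError("the model is not overidentified (r <= p)")
    return df, chi2_sf(objective_times_n, df)

if __name__ == "__main__":
    # Two equations, two instruments, three parameters: one overidentifying restriction.
    print(j_test(2.5, n_instruments=2, n_equations=2, n_params=3))
```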

Iterated Generalized Method of Moments (ITGMM)

Iterated generalized method of moments is similar to the iterated versions of 2SLS, SUR, and 3SLS. The variance matrix for GMM estimation is reestimated at each iteration with the parameters determined by the GMM estimation. The iteration terminates when the variance matrix for the equation errors changes by less than the CONVERGE= value. Iterated generalized method of moments is selected by the ITGMM option in the FIT statement. For some indication of the small sample properties of ITGMM, see Ferson and Foerster (1993).

Simulated Method of Moments (SMM)

The SMM method uses simulation techniques in model inference and estimation. It is appropriate for estimating models in which integrals appear in the objective function, and these integrals can be approximated by simulation. There might be various reasons for integrals to appear in an objective function (for example, transformation of a latent model into an observable model, missing data, random coefficients, heterogeneity, and so on).

This simulation method can be used with all the estimation methods except full information maximum likelihood (FIML) in PROC MODEL. SMM, also known as simulated generalized method of moments (SGMM), is the default estimation method because of its nice properties.

Estimation Details

A general nonlinear model can be described as

\[ \boldsymbol{\epsilon}_t = \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \]

where $\mathbf{q} \in R^g$ is a real vector valued function of $\mathbf{y}_t \in R^g$, $\mathbf{x}_t \in R^l$, and $\boldsymbol{\theta} \in R^p$; $g$ is the number of equations; $l$ is the number of exogenous variables (lagged endogenous variables are considered exogenous here); $p$ is the number of parameters; and $t$ ranges from 1 to $n$. $\boldsymbol{\epsilon}_t$ is an unobservable disturbance vector with the following properties:

\[
\begin{aligned}
E(\boldsymbol{\epsilon}_t) &= 0 \\
E(\boldsymbol{\epsilon}_t \boldsymbol{\epsilon}_t') &= \boldsymbol{\Sigma}
\end{aligned}
\]

In many cases, it is not possible to write $\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})$ in a closed form. Instead, $\mathbf{q}$ is expressed as an integral of a function $\mathbf{f}$; that is,

\[ \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) = \int \mathbf{f}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}, \mathbf{u}_t)\, dP(\mathbf{u}) \]

where $\mathbf{f} \in R^g$ is a real vector valued function of $\mathbf{y}_t \in R^g$, $\mathbf{x}_t \in R^l$, $\boldsymbol{\theta} \in R^p$, and $\mathbf{u}_t \in R^m$; $m$ is the number of stochastic variables with a known distribution $P(\mathbf{u})$. Because the distribution of $\mathbf{u}$ is completely known, it is possible to simulate artificial draws from this distribution. Using such independent draws $\mathbf{u}_{ht}$, $h = 1, \ldots, H$, and the strong law of large numbers, $\mathbf{q}$ can be approximated by

\[ \frac{1}{H} \sum_{h=1}^{H} \mathbf{f}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}, \mathbf{u}_{ht}). \]
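A minimal Python sketch of this approximation follows. It uses a hypothetical scalar function f(y, theta, u) = y - exp(theta * u) with u drawn from a standard normal distribution, chosen only because its integral has the closed form y - exp(theta**2 / 2), so the simulated average can be compared with the exact value. Nothing here is specific to PROC MODEL.

```python
import math
import random

def simulate_q(y: float, theta: float, H: int, rng: random.Random) -> float:
    """Average f(y, theta, u_h) = y - exp(theta * u_h) over H standard normal draws."""
    total = 0.0
    for _ in range(H):
        u = rng.gauss(0.0, 1.0)
        total += y - math.exp(theta * u)
    return total / H

def exact_q(y: float, theta: float) -> float:
    """Closed form of the integral: E[exp(theta * u)] = exp(theta**2 / 2)."""
    return y - math.exp(theta**2 / 2.0)

if __name__ == "__main__":
    rng = random.Random(12345)
    for H in (10, 1000, 100000):
        print(H, simulate_q(2.0, 0.5, H, rng), exact_q(2.0, 0.5))
```

The simulated value approaches the exact value as H grows, at the usual Monte Carlo rate of one over the square root of H.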

Simulated Generalized Method of Moments (SGMM)

Generalized method of moments (GMM) is widely used to obtain efficient estimates for general model systems. When the moment conditions are not readily available in closed form but can be approximated by simulation, simulated generalized method of moments (SGMM) can be used. The SGMM estimators are consistent and asymptotically normally distributed even if the number of draws $H$ is fixed (see McFadden 1989; Pakes and Pollard 1989).

Consider the nonlinear model

\[
\begin{aligned}
\boldsymbol{\epsilon}_t &= \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})
 = \frac{1}{H} \sum_{h=1}^{H} \mathbf{f}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}, \mathbf{u}_{ht}) \\
\mathbf{z}_t &= Z(\mathbf{x}_t)
\end{aligned}
\]

where $\mathbf{z}_t \in R^k$ is a vector of $k$ instruments and $\boldsymbol{\epsilon}_t$ is an unobservable disturbance vector that can be serially correlated and nonstationary. In the case of no instrumental variables, $\mathbf{z}_t$ is 1. $\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})$ is the vector of moment conditions, and it is approximated by simulation.

In general, theory suggests the following orthogonality condition,

\[ E(\boldsymbol{\epsilon}_t \otimes \mathbf{z}_t) = 0 \]

which states that the expected crossproducts of the unobservable disturbances, $\boldsymbol{\epsilon}_t$, and functions of the observable variables are set to 0. The sample means of the crossproducts are

\[
\begin{aligned}
\mathbf{m}_n &= \frac{1}{n} \sum_{t=1}^{n} \mathbf{m}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \\
\mathbf{m}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) &= \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \otimes \mathbf{z}_t
\end{aligned}
\]

where $\mathbf{m}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \in R^{gk}$. The case where $gk > p$, where $p$ is the number of parameters, is considered here. An estimate of the true parameter vector $\theta^0$ is the value of $\hat{\theta}$ that minimizes

\[ S(\theta, V) = [n \mathbf{m}_n(\theta)]' \mathbf{V}^{-1} [n \mathbf{m}_n(\theta)] / n \]

where

\[ \mathbf{V} = \mathrm{Cov}\left(\mathbf{m}(\theta^0), \mathbf{m}(\theta^0)'\right). \]

The steps for SGMM are as follows:

1. Start with a positive definite $\hat{\mathbf{V}}$ matrix. This $\hat{\mathbf{V}}$ matrix can be estimated from a consistent estimator of $\theta$. If $\hat{\theta}$ is a consistent estimator, then $\mathbf{u}_t$ for $t = 1, \ldots, n$ can be simulated $H'$ number of times. A consistent estimator of $\mathbf{V}$ is obtained as

\[
\hat{\mathbf{V}} = \frac{1}{n} \sum_{t=1}^{n}
\left[\frac{1}{H'} \sum_{h=1}^{H'} \mathbf{f}(\mathbf{y}_t, \mathbf{x}_t, \hat{\boldsymbol{\theta}}, \mathbf{u}_{ht}) \otimes \mathbf{z}_t\right]
\left[\frac{1}{H'} \sum_{h=1}^{H'} \mathbf{f}(\mathbf{y}_t, \mathbf{x}_t, \hat{\boldsymbol{\theta}}, \mathbf{u}_{ht}) \otimes \mathbf{z}_t\right]'
\]

$H'$ must be large so that this is a consistent estimator of $\mathbf{V}$.

2. Simulate $H$ number of $\mathbf{u}_t$ for $t = 1, \ldots, n$. As shown by Gourieroux and Monfort (1993), the number of simulations $H$ does not need to be very large. For $H = 10$, the SGMM estimator achieves 90% of the efficiency of the corresponding GMM estimator. Find the $\hat{\theta}$ that minimizes the quadratic product of the moment conditions again, with the weight matrix being $\hat{\mathbf{V}}^{-1}$.

\[ \min_{\theta}\; [n \mathbf{m}_n(\theta)]' \hat{\mathbf{V}}^{-1} [n \mathbf{m}_n(\theta)] / n \]

3. The covariance matrix of $\sqrt{n}\,\theta$ is given as (Gourieroux and Monfort 1993)

\[
\boldsymbol{\Sigma}_1^{-1} \mathbf{D} \hat{\mathbf{V}}^{-1} \mathbf{V}(\hat{\theta}) \hat{\mathbf{V}}^{-1} \mathbf{D}' \boldsymbol{\Sigma}_1^{-1}
+ \frac{1}{H}\, \boldsymbol{\Sigma}_1^{-1} \mathbf{D} \hat{\mathbf{V}}^{-1}
E\left[\mathbf{z} \otimes \mathrm{Var}(\mathbf{f} \mid \mathbf{x}) \otimes \mathbf{z}\right]
\hat{\mathbf{V}}^{-1} \mathbf{D}' \boldsymbol{\Sigma}_1^{-1}
\]

where $\boldsymbol{\Sigma}_1 = \mathbf{D} \hat{\mathbf{V}}^{-1} \mathbf{D}$, $\mathbf{D}$ is the matrix of partial derivatives of the residuals with respect to the parameters, $\mathbf{V}(\hat{\theta})$ is the covariance of moments from estimated parameters $\hat{\theta}$, and $\mathrm{Var}(\mathbf{f} \mid \mathbf{x})$ is the covariance of moments for each observation from simulation. The first term is the variance-covariance matrix of the exact GMM estimator, and the second term accounts for the variation contributed by simulating the moments.

Implementation in PROC MODEL

In PROC MODEL, if the user specifies the GMM and NDRAW options in the FIT statement, PROC MODEL first fits the model by using N2SLS and computes $\hat{\mathbf{V}}$ by using the estimates from N2SLS and $H'$ simulations. If NO2SLS is specified in the FIT statement, $\hat{\mathbf{V}}$ is read from the VDATA= data set. If the user does not provide a $\hat{\mathbf{V}}$ matrix, the initial starting value of $\theta$ is used as the estimator for computing the $\hat{\mathbf{V}}$ matrix in step 1. If the ITGMM option is specified instead of GMM, then PROC MODEL iterates from step 1 to step 3 until the $\mathbf{V}$ matrix converges.

The consistency of the parameter estimates is not affected by the variance correction shown in the second term in step 3. The correction on the variance of parameter estimates is not computed by default. To add the adjustment, use the ADJSMMV option in the FIT statement. This correction is of the order of $1/H$ and is small even for moderate $H$.
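To get a feel for the size of a correction of order 1/H, the sketch below makes a simplifying assumption that is not stated in the text: the simulation term has the same magnitude as the exact GMM term, so the total variance is (1 + 1/H) times the exact GMM variance. Under that assumption the relative efficiency is 1 / (1 + 1/H), which gives about 0.91 for H = 10, in line with the 90% figure quoted in step 2. The function name is illustrative.

```python
def relative_efficiency(H: int) -> float:
    """Efficiency relative to exact GMM, assuming variance inflation of (1 + 1/H)."""
    if H < 1:
        raise ValueError("H must be at least 1")
    return 1.0 / (1.0 + 1.0 / H)

if __name__ == "__main__":
    for H in (1, 5, 10, 50, 100):
        print(H, round(relative_efficiency(H), 4))
```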

The following example illustrates how to use SMM to estimate a simple regression model. Suppose the model is

\[ y = a + b x + u, \qquad u \sim \text{iid } N(0, s^2). \]

First, consider the problem in a GMM context. The first two moments of $y$ are easily derived:

\[
\begin{aligned}
E(y) &= a + b x \\
E(y^2) &= (a + b x)^2 + s^2
\end{aligned}
\]

Rewrite the moment conditions in a form similar to the preceding discussion:

\[
\begin{aligned}
\epsilon_{1t} &= y_t - (a + b x_t) \\
\epsilon_{2t} &= y_t^2 - (a + b x_t)^2 - s^2
\end{aligned}
\]

Then you can estimate this model by using GMM with the following statements:

proc model data=a;
   parms a b s;
   instrument x;
   eq.m1 = y-(a+b*x);
   eq.m2 = y*y - (a+b*x)**2 - s*s;
   bound s > 0;
   fit m1 m2 / gmm;
run;

Now suppose you do not have the closed form for the moment conditions. Instead you can simulate the moment conditions by generating H number of simulated samples based on the parameters. Then the simulated moment conditions are

\[
\begin{aligned}
\epsilon_{1t} &= \frac{1}{H} \sum_{h=1}^{H} \left\{ y_t - (a + b x_t + s u_{t,h}) \right\} \\
\epsilon_{2t} &= \frac{1}{H} \sum_{h=1}^{H} \left\{ y_t^2 - (a + b x_t + s u_{t,h})^2 \right\}
\end{aligned}
\]

This model can be estimated by using SGMM with the following statements:

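proc model data=_tmpdata;
   parms a b s;
   instrument x;
   /* simulated response; RANNOR supplies the standard normal draw u */
   ysim = (a+b*x) + s * rannor( 98711 );
   eq.m1 = y-ysim;             /* first simulated moment  */
   eq.m2 = y*y - ysim*ysim;    /* second simulated moment */
   bound s > 0;
   fit m1 m2 / gmm ndraw=10;   /* NDRAW=10: 10 draws per observation */
run;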

You can use the following MOMENT statement instead of specifying the two moment equations shown earlier:

moment ysim=(1, 2);

In cases where you require a large number of moment equations, using the MOMENT statement to specify them is more efficient.

The NDRAW= option tells PROC MODEL that this is a simulation-based estimation, so the random number function RANNOR returns random numbers during the estimation process. During the simulation, 10 draws of m1 and m2 are generated for each observation, and their averages enter the objective function in the same way as the equations specified previously.
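The averaging described above can be mimicked outside SAS. The following Python sketch computes the sample means of the simulated moments for the regression example, crossed with the instruments (1, x); it assumes an intercept is included among the instruments, and all names are illustrative. It demonstrates the moment construction only and does not reproduce PROC MODEL's optimizer or random number stream.

```python
import random

def simulated_moments(data, a, b, s, H, rng):
    """Sample means of (m1, m1*x, m2, m2*x), where m1 and m2 average over H draws."""
    n = len(data)
    sums = [0.0, 0.0, 0.0, 0.0]
    for x, y in data:
        m1 = m2 = 0.0
        for _ in range(H):
            ysim = a + b * x + s * rng.gauss(0.0, 1.0)
            m1 += (y - ysim) / H
            m2 += (y * y - ysim * ysim) / H
        for i, v in enumerate((m1, m1 * x, m2, m2 * x)):
            sums[i] += v / n
    return sums

if __name__ == "__main__":
    gen = random.Random(7)
    xs = [gen.random() for _ in range(2000)]
    data = [(x, 1.0 + 2.0 * x + 0.5 * gen.gauss(0.0, 1.0)) for x in xs]
    # Near zero at the true parameters, clearly nonzero at wrong ones.
    print(simulated_moments(data, 1.0, 2.0, 0.5, 10, random.Random(11)))
    print(simulated_moments(data, 2.0, 2.0, 0.5, 10, random.Random(11)))
```

Reusing the same seed for each evaluation keeps the draws fixed across parameter values, which makes the simulated moments a smooth function of the parameters; this is a design choice of the sketch.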

Other Estimation Methods

The simulation method can be used not only with GMM and ITGMM, but also with OLS, ITOLS, SUR, ITSUR, N2SLS, IT2SLS, N3SLS, and IT3SLS. These simulation-based methods are similar to the corresponding methods in PROC MODEL; the only difference is that the objective functions include the average of the H simulations.

Full Information Maximum Likelihood Estimation (FIML)

A different approach to the simultaneous equation bias problem is the full information maximum likelihood (FIML) estimation method (Amemiya 1977).

Compared to the instrumental variables methods (2SLS and 3SLS), the FIML method has these advantages and disadvantages:

  • FIML does not require instrumental variables.

  • FIML requires that the model include the full equation system, with as many equations as there are endogenous variables. With 2SLS or 3SLS, you can estimate some of the equations without specifying the complete system.

  • FIML assumes that the equations errors have a multivariate normal distribution. If the errors are not normally distributed, the FIML method might produce poor results. 2SLS and 3SLS do not assume a specific distribution for the errors.

  • The FIML method is computationally expensive.

The full information maximum likelihood estimators of $\theta$ and $\sigma$ are the $\hat{\theta}$ and $\hat{\sigma}$ that minimize the negative log-likelihood function:

\[
\begin{aligned}
\mathbf{l}_n(\boldsymbol{\theta}, \boldsymbol{\sigma}) ={}& \frac{ng}{2} \ln(2\pi)
 - \sum_{t=1}^{n} \ln\left(\left|\frac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})}{\partial \mathbf{y}_t'}\right|\right)
 + \frac{n}{2} \ln\left(\left|\boldsymbol{\Sigma}(\sigma)\right|\right) \\
&+ \frac{1}{2}\, \mathrm{trace}\left(\boldsymbol{\Sigma}(\sigma)^{-1} \sum_{t=1}^{n} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\, \mathbf{q}'(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\right)
\end{aligned}
\]

The option FIML requests full information maximum likelihood estimation. If the errors are distributed normally, FIML produces efficient estimators of the parameters. If instrumental variables are not provided, the starting values for the estimation are obtained from a SUR estimation. If instrumental variables are provided, then the starting values are obtained from a 3SLS estimation. The log-likelihood value and the $l_2$ norm of the gradient of the negative log-likelihood function are shown in the estimation summary.

FIML Details

To compute the minimum of $\mathbf{l}_n(\boldsymbol{\theta}, \boldsymbol{\sigma})$, this function is concentrated using the relation

\[ \boldsymbol{\Sigma}(\theta) = \frac{1}{n} \sum_{t=1}^{n} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\, \mathbf{q}'(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) \]

This results in the concentrated negative log-likelihood function discussed in Davidson and MacKinnon (1993):

\[
\mathbf{l}_n(\boldsymbol{\theta}) = \frac{ng}{2}\left(1 + \ln(2\pi)\right)
 - \sum_{t=1}^{n} \ln\left|\frac{\partial}{\partial \mathbf{y}_t'} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\right|
 + \frac{n}{2} \ln\left|\boldsymbol{\Sigma}(\theta)\right|
\]
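The constant $ng/2$ that appears in the concentrated function comes from the trace term of the full negative log-likelihood. Substituting the relation for $\boldsymbol{\Sigma}(\theta)$ gives the following short derivation, where $\mathbf{q}_t$ abbreviates $\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})$ and $\mathbf{I}_g$ is the $g \times g$ identity matrix (both are shorthand introduced here):

```latex
\begin{aligned}
\frac{1}{2}\,\mathrm{trace}\!\left(\boldsymbol{\Sigma}(\theta)^{-1} \sum_{t=1}^{n} \mathbf{q}_t \mathbf{q}_t'\right)
 &= \frac{1}{2}\,\mathrm{trace}\!\left(\boldsymbol{\Sigma}(\theta)^{-1}\, n\, \boldsymbol{\Sigma}(\theta)\right)
  = \frac{n}{2}\,\mathrm{trace}(\mathbf{I}_g)
  = \frac{ng}{2}, \\
\frac{ng}{2}\ln(2\pi) + \frac{ng}{2} &= \frac{ng}{2}\left(1 + \ln(2\pi)\right).
\end{aligned}
```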

The gradient of the negative log-likelihood function is

\[ \frac{\partial}{\partial \theta_i} \mathbf{l}_n(\boldsymbol{\theta}) = \sum_{t=1}^{n} \nabla_i(t) \]

\[
\begin{aligned}
\nabla_i(t) ={}& -\mathrm{trace}\left(\left(\frac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})}{\partial \mathbf{y}_t'}\right)^{-1}
 \frac{\partial^2 \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})}{\partial \mathbf{y}_t'\, \partial \theta_i}\right) \\
&+ \frac{1}{2}\, \mathrm{trace}\left(\boldsymbol{\Sigma}(\theta)^{-1} \frac{\partial \boldsymbol{\Sigma}(\theta)}{\partial \theta_i}
 \left[\mathbf{I} - \boldsymbol{\Sigma}(\theta)^{-1} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\, \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'\right]\right) \\
&+ \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'\, \boldsymbol{\Sigma}(\theta)^{-1}
 \frac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})}{\partial \theta_i}
\end{aligned}
\]

where

\[
\frac{\partial \boldsymbol{\Sigma}(\theta)}{\partial \theta_i} = \frac{2}{n} \sum_{t=1}^{n}
 \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\,
 \frac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'}{\partial \theta_i}
\]

The estimator of the variance-covariance of $\hat{\theta}$ (COVB) for FIML can be selected with the COVBEST= option with the following arguments:

CROSS

selects the crossproducts estimator of the covariance matrix (Gallant 1987, p. 473),

\[ C = \left(\frac{1}{n} \sum_{t=1}^{n} \nabla(t)\, \nabla'(t)\right)^{-1} \]

where $\nabla(t) = [\nabla_1(t), \nabla_2(t), \ldots, \nabla_p(t)]'$. This is the default.

GLS

selects the generalized least squares estimator of the covariance matrix. This is computed as (Dagenais 1978)

\[ C = \left[\hat{\mathbf{Z}}' \left(\boldsymbol{\Sigma}(\theta)^{-1} \otimes I\right) \hat{\mathbf{Z}}\right]^{-1} \]

where $\hat{\mathbf{Z}} = (\hat{Z}_1, \hat{Z}_2, \ldots, \hat{Z}_p)$ is $ng \times p$ and each $\hat{Z}_i$ column vector is obtained from stacking the columns of

\[
\mathbf{U}\, \frac{1}{n} \sum_{t=1}^{n}
\left(\frac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'}{\partial y}\right)^{-1}
\frac{\partial^2 \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'}{\partial \mathbf{y}_n'\, \partial \theta_i} - Q_i
\]

$\mathbf{U}$ is an $n \times g$ matrix of residuals and $Q_i$ is the $n \times g$ matrix $\partial \mathbf{Q} / \partial \theta_i$.

FDA

selects the inverse of concentrated likelihood Hessian as an estimator of the covariance matrix. The Hessian is computed numerically, so for a large problem this is computationally expensive.

The HESSIAN= option controls which approximation to the Hessian is used in the minimization procedure. Alternate approximations are used to improve convergence and execution time. The choices are as follows:

CROSS

The crossproducts approximation is used.

GLS

The generalized least squares approximation is used (default).

FDA

The Hessian is computed numerically by finite differences.

HESSIAN=GLS has better convergence properties in general, but COVBEST=CROSS produces the most pessimistic standard error bounds. When the HESSIAN= option is used, the default estimator of the variance-covariance of $\hat{\theta}$ is the inverse of the Hessian selected.

Multivariate t Distribution Estimation

The multivariate t distribution is specified by using the ERRORMODEL statement with the T option. Other method specifications (FIML and OLS, for example) are ignored when the ERRORMODEL statement is used for a distribution other than normal.

The probability density function for the multivariate t distribution is

upper P Subscript q Baseline equals StartStartFraction normal upper Gamma left-parenthesis StartFraction d f plus m Over 2 EndFraction right-parenthesis OverOver left-parenthesis pi asterisk d f right-parenthesis Superscript StartFraction m Over 2 EndFraction Baseline asterisk normal upper Gamma left-parenthesis StartFraction d f Over 2 EndFraction right-parenthesis StartAbsoluteValue bold upper Sigma left-parenthesis sigma right-parenthesis EndAbsoluteValue Superscript one-half Baseline EndEndFraction asterisk left-parenthesis 1 plus StartFraction bold q prime left-parenthesis bold y Subscript t Baseline comma bold x Subscript t Baseline comma bold-italic theta right-parenthesis bold upper Sigma left-parenthesis sigma right-parenthesis Superscript negative 1 Baseline bold q left-parenthesis bold y Subscript t Baseline comma bold x Subscript t Baseline comma bold-italic theta right-parenthesis Over d f EndFraction right-parenthesis Superscript minus StartFraction d f plus m Over 2 EndFraction

where $m$ is the number of equations and $\mathit{df}$ is the degrees of freedom.
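The density above can be checked numerically. The following is a minimal sketch in Python (not SAS, and not the procedure's own implementation) that evaluates the logarithm of the density for one observation. The function names `cholesky` and `mvt_logpdf` are hypothetical, and the residual vector `q` and matrix `sigma` are assumed to be supplied by the caller.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def mvt_logpdf(q, sigma, df):
    """Log of the multivariate t density for residual vector q (hypothetical helper)."""
    m = len(q)
    low = cholesky(sigma)
    # Forward substitution solves low * z = q, so that z'z = q' sigma^{-1} q.
    z = []
    for i in range(m):
        z.append((q[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    # log|sigma| from the diagonal of the Cholesky factor.
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(m))
    return (math.lgamma((df + m) / 2) - math.lgamma(df / 2)
            - 0.5 * m * math.log(math.pi * df)
            - 0.5 * log_det
            - 0.5 * (df + m) * math.log1p(quad / df))
```

As a sanity check, with $m = 1$, $\mathit{df} = 1$, and $\boldsymbol{\Sigma} = 1$ the density at $\mathbf{q} = 0$ reduces to the Cauchy value $1/\pi$. Working on the log scale with `math.lgamma` avoids overflow in the gamma functions for large degrees of freedom.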

The maximum likelihood estimators of $\theta$ and $\sigma$ are the $\hat{\theta}$ and $\hat{\sigma}$ that minimize the negative log-likelihood function:

\[
\begin{aligned}
\mathbf{l}_n(\boldsymbol{\theta}, \boldsymbol{\sigma}) = {} & -\sum_{t=1}^{n} \ln\left( \frac{\Gamma\left(\frac{\mathit{df}+m}{2}\right)}{(\pi \cdot \mathit{df})^{\frac{m}{2}} \cdot \Gamma\left(\frac{\mathit{df}}{2}\right)} \cdot \left(1 + \frac{q_t'\, \boldsymbol{\Sigma}^{-1}\, q_t}{\mathit{df}}\right)^{-\frac{\mathit{df}+m}{2}} \right) \\
& + \frac{n}{2} \cdot \ln\left(\left|\boldsymbol{\Sigma}\right|\right) - \sum_{t=1}^{n} \ln\left( \left| \frac{\partial q_t}{\partial y_t'} \right| \right)
\end{aligned}
\]

The ERRORMODEL statement is used to request the t distribution maximum likelihood estimation. An OLS estimation is done to obtain initial parameter estimates and MSE.var estimates. Use NOOLS to turn off this initial estimation. If the errors are distributed normally, t distribution estimation produces results similar to FIML.

The multivariate model has a single shared degrees-of-freedom parameter, which is estimated. The degrees-of-freedom parameter can also be set to a fixed value. The log-likelihood value and the $l_2$ norm of the gradient of the negative log-likelihood function are shown in the estimation summary.

t Distribution Details

Since a variance term is explicitly specified by using the ERRORMODEL statement, $\boldsymbol{\Sigma}(\theta)$ is estimated as a correlation matrix and $\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})$ is normalized by the variance. The gradient of the negative log-likelihood function with respect to the degrees of freedom is

\[
\begin{aligned}
\frac{\partial l_n}{\partial \mathit{df}} = {} & \frac{nm}{2\,\mathit{df}} - \frac{n}{2}\, \frac{\Gamma'\left(\frac{\mathit{df}+m}{2}\right)}{\Gamma\left(\frac{\mathit{df}+m}{2}\right)} + \frac{n}{2}\, \frac{\Gamma'\left(\frac{\mathit{df}}{2}\right)}{\Gamma\left(\frac{\mathit{df}}{2}\right)} \\
& + 0.5 \log\left(1 + \frac{\mathbf{q}' \boldsymbol{\Sigma}^{-1} \mathbf{q}}{\mathit{df}}\right) - \frac{0.5\,(\mathit{df}+m)}{\left(1 + \frac{\mathbf{q}' \boldsymbol{\Sigma}^{-1} \mathbf{q}}{\mathit{df}}\right)}\, \frac{\mathbf{q}' \boldsymbol{\Sigma}^{-1} \mathbf{q}}{\mathit{df}^2}
\end{aligned}
\]
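One way to gain confidence in the degrees-of-freedom gradient is to compare it with a finite-difference derivative of the negative log-likelihood. The sketch below is in Python (not SAS) and rests on stated assumptions: the last two terms of the gradient are read as summed over the observations, $\boldsymbol{\Sigma}$ is held fixed, and the per-observation quadratic forms $\mathbf{q}_t' \boldsymbol{\Sigma}^{-1} \mathbf{q}_t$ are passed in as the list `quad_forms`. All function names are hypothetical.

```python
import math

def digamma(x):
    """Gamma'(x) / Gamma(x) for x > 0, by recurrence plus an asymptotic series."""
    acc = 0.0
    while x < 6.0:          # shift x upward until the series is accurate
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series

def neg_loglik(df, quad_forms, m):
    """df-dependent part of the negative log-likelihood (Sigma and Jacobian terms omitted)."""
    n = len(quad_forms)
    log_const = (math.lgamma((df + m) / 2) - math.lgamma(df / 2)
                 - 0.5 * m * math.log(math.pi * df))
    kernel = sum(math.log1p(qf / df) for qf in quad_forms)
    return -n * log_const + 0.5 * (df + m) * kernel

def grad_df(df, quad_forms, m):
    """Analytic derivative of neg_loglik with respect to df."""
    n = len(quad_forms)
    g = (n * m / (2 * df)
         - 0.5 * n * digamma((df + m) / 2)
         + 0.5 * n * digamma(df / 2))
    for qf in quad_forms:
        u = 1.0 + qf / df
        g += 0.5 * math.log(u) - 0.5 * (df + m) / u * qf / df ** 2
    return g

def central_difference(df, quad_forms, m, h=1e-5):
    """Numerical derivative used only as a cross-check."""
    return (neg_loglik(df + h, quad_forms, m)
            - neg_loglik(df - h, quad_forms, m)) / (2 * h)
```

Under these assumptions, `grad_df` and `central_difference` should agree to several decimal places for any positive `df`; a disagreement would point to a sign or indexing slip in one of the terms.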

The gradient of the negative log-likelihood function with respect to the parameters is

\[
\frac{\partial l_n}{\partial \theta_i} = \frac{0.5\,(\mathit{df}+m)}{\left(1 + \mathbf{q}' \boldsymbol{\Sigma}^{-1} \mathbf{q} / \mathit{df}\right)} \left[ \frac{\left(2\, \mathbf{q}' \boldsymbol{\Sigma}^{-1} \frac{\partial \mathbf{q}}{\partial \theta_i}\right)}{\mathit{df}} + \mathbf{q}' \boldsymbol{\Sigma}^{-1} \frac{\partial \boldsymbol{\Sigma}}{\partial \theta_i} \boldsymbol{\Sigma}^{-1} \mathbf{q} \right] - \frac{n}{2}\, \mathrm{trace}\left(\boldsymbol{\Sigma}^{-1} \frac{\partial \boldsymbol{\Sigma}}{\partial \theta_i}\right)
\]

where

\[
\frac{\partial \boldsymbol{\Sigma}(\theta)}{\partial \theta_i} = \frac{2}{n} \sum_{t=1}^{n} \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})\, \frac{\partial \mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta})'}{\partial \theta_i}
\]

and

\[
\mathbf{q}(\mathbf{y}_t, \mathbf{x}_t, \boldsymbol{\theta}) = \frac{\epsilon(\theta)}{\sqrt{h(\theta)}} \in R^{m \times n}
\]

The estimator of the variance-covariance of $\hat{\theta}$ (COVB) for the t distribution is the inverse of the likelihood Hessian. The gradient is computed analytically, and the Hessian is computed numerically.

Empirical Distribution Estimation and Simulation


The following SAS statements fit a model that uses least squares as the likelihood function, but represent the distribution of the residuals with an empirical cumulative distribution function (CDF). The plot of the empirical probability distribution is shown in Figure 24.

data t;  /* Sum of two normals  */
   format date monyy.;
   do t = 0 to 9.9 by 0.1;
      date = intnx( 'month', '1jun90'd, (t*10)-1 );
      y =  0.1 * (rannor(123)-10) +
            .5 * (rannor(123)+10);
      output;
   end;
run;
ods select Model.Liklhood.ResidSummary
           Model.Liklhood.ParameterEstimates;

proc model data=t time=t itprint;
   dependent y;
   parm a 5;

   y = a;
   obj = resid.y * resid.y;
   errormodel y ~ general( obj )
   cdf=(empirical=(tails=( normal percent=10)));

   fit y / outsn=s out=r;
   id  date;

   solve y / data=t(where=(date='1aug98'd))
             residdata=r sdata=s
             random=200 seed=6789 out=monte ;
run;



proc kde data=monte;
   univar y / plots=density;
run;

Figure 24: Empirical PDF Plot


For simulation, if the CDF for the model is not built in to the procedure, you can use the CDF=EMPIRICAL() option. This uses the sorted residual data to create an empirical CDF. For computing the inverse CDF, the program needs to know how to handle the tails. For continuous data, the tail distribution is generally poorly determined. To counter this, the PERCENT= option specifies the percentage of the observations to use in constructing each tail. The default for the PERCENT= option is 10.

A normal distribution or a t distribution is used to extrapolate the tails to infinity. The standard errors for this extrapolation are obtained from the data so that the empirical CDF is continuous.
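The idea of an empirical inverse CDF with extrapolated tails can be sketched as follows. This is a Python illustration (not SAS) and not the procedure's actual algorithm: the interpolation rule, the use of the sample mean as the tail center, and the function name `make_inverse_cdf` are all assumptions made for this example. The only features taken from the text are that sorted residuals define the body of the distribution, that `percent` of the observations define each tail, and that the normal tail scale is chosen from the data so that the pieces join continuously.

```python
from statistics import NormalDist, mean

def make_inverse_cdf(residuals, percent=10.0):
    """Build an inverse CDF: empirical in the middle, normal in each tail (illustrative)."""
    xs = sorted(residuals)
    n = len(xs)
    p_tail = percent / 100.0
    center = mean(xs)
    std_normal = NormalDist()

    def empirical_quantile(p):
        # Linear interpolation between adjacent order statistics (an assumed rule).
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    # Junction points where the empirical body hands over to the tails.
    x_lo = empirical_quantile(p_tail)
    x_hi = empirical_quantile(1.0 - p_tail)
    # Tail scales chosen so each normal tail passes through its junction point.
    # This requires x_lo < center < x_hi, which holds for non-degenerate data.
    sd_lo = (x_lo - center) / std_normal.inv_cdf(p_tail)
    sd_hi = (x_hi - center) / std_normal.inv_cdf(1.0 - p_tail)

    def inverse_cdf(p):
        if not 0.0 < p < 1.0:
            raise ValueError("p must lie strictly between 0 and 1")
        if p < p_tail:
            return center + sd_lo * std_normal.inv_cdf(p)
        if p > 1.0 - p_tail:
            return center + sd_hi * std_normal.inv_cdf(p)
        return empirical_quantile(p)

    return inverse_cdf
```

Feeding uniform random numbers through the returned function yields draws that follow the residuals in the central region and a normal shape beyond the outer `percent` of the data, which is the role the inverse CDF plays in a Monte Carlo simulation.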

Last updated: June 19, 2025