SYSLIN Procedure

Computational Details

This section discusses various computational details.

Computation of Least Squares–Based Estimators

Let the system be composed of G equations, and let the ith equation be expressed in the form

\[ y_i = Y_i \boldsymbol{\beta}_i + X_i \boldsymbol{\gamma}_i + \mathbf{u} \]

where

$y_i$ is the vector of observations on the dependent variable
$Y_i$ is the matrix of observations on the endogenous variables included in the equation
$\boldsymbol{\beta}_i$ is the vector of parameters associated with $Y_i$
$X_i$ is the matrix of observations on the predetermined variables included in the equation
$\boldsymbol{\gamma}_i$ is the vector of parameters associated with $X_i$
$\mathbf{u}$ is a vector of errors

Let $\hat{V}_i = Y_i - \hat{Y}_i$, where $\hat{Y}_i$ is the projection of $Y_i$ onto the space spanned by the instruments matrix $Z$.

Let

\[ \boldsymbol{\delta}_i = \begin{pmatrix} \boldsymbol{\beta}_i \\ \boldsymbol{\gamma}_i \end{pmatrix} \]

be the vector of parameters associated with both the endogenous and exogenous variables.

The K-class of estimators (Theil 1971) is defined by

\[ \hat{\boldsymbol{\delta}}_{i,k} = \begin{pmatrix} Y_i' Y_i - k \hat{V}_i' \hat{V}_i & Y_i' X_i \\ X_i' Y_i & X_i' X_i \end{pmatrix}^{-1} \begin{pmatrix} (Y_i - k \hat{V}_i)' y_i \\ X_i' y_i \end{pmatrix} \]

where k is a user-defined value.
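The K-class formula can be sketched numerically as follows. This is a minimal illustration in plain Python, not the procedure's internal implementation: the helper names (`transpose`, `matmul`, `solve`, `k_class`) and the small data set are hypothetical, and the sketch assumes $\hat{V}_i = Y_i - \hat{Y}_i$ with $\hat{Y}_i$ obtained by regressing each column of $Y_i$ on the instruments $Z$.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b (a is n x n, b is n x m) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))  # partial pivoting
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def k_class(y, Y, X, Z, k):
    """Return [beta..., gamma...] for one equation.

    y: list of T values; Y: T x g endogenous regressors;
    X: T x m predetermined regressors; Z: T x K instruments.
    """
    Zt = transpose(Z)
    first_stage = solve(matmul(Zt, Z), matmul(Zt, Y))
    Y_hat = matmul(Z, first_stage)                       # projection of Y onto span(Z)
    V_hat = [[a - b for a, b in zip(ry, rh)] for ry, rh in zip(Y, Y_hat)]
    W = [ry + rx for ry, rx in zip(Y, X)]                # [Y X]
    ycol = [[v] for v in y]
    A = matmul(transpose(W), W)
    b = matmul(transpose(W), ycol)
    VtV = matmul(transpose(V_hat), V_hat)
    Vty = matmul(transpose(V_hat), ycol)
    g = len(Y[0])
    for r in range(g):                                   # subtract k * V'V and k * V'y
        for c in range(g):
            A[r][c] -= k * VtV[r][c]
        b[r][0] -= k * Vty[r][0]
    return [row[0] for row in solve(A, b)]

# Hypothetical data: one endogenous regressor, an intercept, two extra instruments.
Z = [[1.0, 1.0, 2.0], [1.0, 2.0, 1.0], [1.0, 3.0, 5.0],
     [1.0, 4.0, 3.0], [1.0, 5.0, 6.0], [1.0, 6.0, 4.0]]
X = [[1.0] for _ in range(6)]
Y = [[2.1], [2.9], [6.2], [5.8], [9.1], [8.0]]
y = [3.0, 4.1, 8.3, 7.7, 12.2, 10.4]

print("k = 0:", k_class(y, Y, X, Z, 0.0))
print("k = 1:", k_class(y, Y, X, Z, 1.0))
```

With `k = 0` the correction terms vanish and the result is ordinary least squares on $[Y_i \; X_i]$; other values of `k` shrink the contribution of the first-stage residuals.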

Let

\[ R_i = [\, Y_i \;\; X_i \,] \]

and

\[ \hat{R}_i = [\, \hat{Y}_i \;\; X_i \,] \]

The 2SLS estimator is defined as

\[ \hat{\boldsymbol{\delta}}_{i,\mathrm{2SLS}} = [\hat{R}_i' \hat{R}_i]^{-1} \hat{R}_i' y_i \]
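As a worked check, the K-class formula with $k = 1$ can be shown to reduce to the 2SLS expression. The derivation below is a sketch under two assumptions that are not stated explicitly in this section: $\hat{V}_i = Y_i - \hat{Y}_i$, and every column of $X_i$ belongs to the instrument set $Z$, so that $\hat{Y}_i' \hat{V}_i = 0$ and $X_i' \hat{V}_i = 0$.

```latex
\begin{align*}
Y_i' Y_i - \hat{V}_i' \hat{V}_i
  &= (\hat{Y}_i + \hat{V}_i)'(\hat{Y}_i + \hat{V}_i) - \hat{V}_i' \hat{V}_i
   = \hat{Y}_i' \hat{Y}_i, \\
Y_i' X_i
  &= (\hat{Y}_i + \hat{V}_i)' X_i
   = \hat{Y}_i' X_i, \\
(Y_i - \hat{V}_i)' y_i
  &= \hat{Y}_i' y_i, \\
\hat{\boldsymbol{\delta}}_{i,1}
  &= \begin{pmatrix}
       \hat{Y}_i' \hat{Y}_i & \hat{Y}_i' X_i \\
       X_i' \hat{Y}_i & X_i' X_i
     \end{pmatrix}^{-1}
     \begin{pmatrix} \hat{Y}_i' y_i \\ X_i' y_i \end{pmatrix}
   = [\hat{R}_i' \hat{R}_i]^{-1} \hat{R}_i' y_i .
\end{align*}
```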

Let $\mathbf{y}$ and $\boldsymbol{\delta}$ be the vectors obtained by stacking the vectors of dependent variables and parameters for all G equations, and let $\mathbf{R}$ and $\hat{\mathbf{R}}$ be the block diagonal matrices formed by $R_i$ and $\hat{R}_i$, respectively.

The SUR and ITSUR estimators are defined as

\[ \hat{\boldsymbol{\delta}}_{\mathrm{(IT)SUR}} = [\mathbf{R}' (\hat{\Sigma}^{-1} \otimes \mathbf{I}) \mathbf{R}]^{-1} \mathbf{R}' (\hat{\Sigma}^{-1} \otimes \mathbf{I}) \mathbf{y} \]

while the 3SLS and IT3SLS estimators are defined as

\[ \hat{\boldsymbol{\delta}}_{\mathrm{(IT)3SLS}} = [\hat{\mathbf{R}}' (\hat{\Sigma}^{-1} \otimes \mathbf{I}) \hat{\mathbf{R}}]^{-1} \hat{\mathbf{R}}' (\hat{\Sigma}^{-1} \otimes \mathbf{I}) \mathbf{y} \]

where $\mathbf{I}$ is the identity matrix and $\hat{\Sigma}$ is an estimator of the cross-equation covariance matrix. For 3SLS, $\hat{\Sigma}$ is obtained from the 2SLS estimation, while for SUR it is derived from the OLS estimation. For IT3SLS and ITSUR, it is obtained iteratively from the previous estimation step, until convergence.

Computation of Standard Errors

The VARDEF= option in the PROC SYSLIN statement controls the denominator used in calculating the cross-equation covariance estimates and the parameter standard errors and covariances. The values of the VARDEF= option and the resulting denominator are as follows:

N: uses the number of nonmissing observations.
DF: uses the number of nonmissing observations less the degrees of freedom in the model.
WEIGHT: uses the sum of the observation weights given by the WEIGHT statement.
WDF: uses the sum of the observation weights given by the WEIGHT statement less the degrees of freedom in the model.
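The four denominators can be summarized in a short sketch. The function name `vardef_denominator` and its arguments are hypothetical, and "degrees of freedom in the model" is passed in as a plain number (`model_df`) rather than derived from a fitted model; this is an illustration of the rule, not the procedure's code.

```python
def vardef_denominator(vardef, n_obs, model_df, weights=None):
    """Return the divisor implied by a VARDEF= value.

    vardef   : "N", "DF", "WEIGHT", or "WDF" (case-insensitive)
    n_obs    : number of nonmissing observations
    model_df : degrees of freedom used by the model (assumed known)
    weights  : observation weights, required for WEIGHT and WDF
    """
    option = vardef.upper()
    if option == "N":
        return float(n_obs)
    if option == "DF":
        return float(n_obs - model_df)
    if option in ("WEIGHT", "WDF"):
        if weights is None:
            raise ValueError("weights are required for VARDEF=" + option)
        total = float(sum(weights))
        return total if option == "WEIGHT" else total - model_df
    raise ValueError("unknown VARDEF= value: " + vardef)

# Hypothetical example: 20 observations, 3 model degrees of freedom.
for opt in ("N", "DF"):
    print(opt, vardef_denominator(opt, n_obs=20, model_df=3))
```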

The VARDEF= option does not affect the model mean squared error, root mean squared error, or $R^2$ statistics. These statistics are always based on the error degrees of freedom, regardless of the VARDEF= option. The VARDEF= option also does not affect the dependent variable coefficient of variation (CV).

Reduced Form Estimates

The REDUCED option in the PROC SYSLIN statement computes estimates of the reduced form coefficients. The REDUCED option requires that the equation system be square. If there are fewer models than endogenous variables, IDENTITY statements can be used to complete the equation system.

The reduced form coefficients are computed as follows. Represent the equation system, with all endogenous variables moved to the left-hand side of the equations and identities, as

\[ \mathbf{B} \mathbf{Y} = \boldsymbol{\Gamma} \mathbf{X} \]

Here $\mathbf{B}$ is the estimated coefficient matrix for the endogenous variables $\mathbf{Y}$, and $\boldsymbol{\Gamma}$ is the estimated coefficient matrix for the exogenous (or predetermined) variables $\mathbf{X}$.

The system can be solved for Y as follows, provided B is square and nonsingular:

\[ \mathbf{Y} = \mathbf{B}^{-1} \boldsymbol{\Gamma} \mathbf{X} \]

The reduced form coefficients are the matrix $\mathbf{B}^{-1} \boldsymbol{\Gamma}$.
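To make the reduced form computation concrete, the following sketch solves $\mathbf{B} \boldsymbol{\Pi} = \boldsymbol{\Gamma}$ for a small system. The coefficient values and the names `solve`, `matmul`, and `reduced_form` are hypothetical; a linear solve is used instead of forming the inverse explicitly, which gives the same matrix $\mathbf{B}^{-1} \boldsymbol{\Gamma}$.

```python
def matmul(a, b):
    bt = [list(col) for col in zip(*b)]
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b (a is n x n, b is n x m) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))  # partial pivoting
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("B is singular; the reduced form does not exist")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def reduced_form(B, Gamma):
    """Return the reduced form coefficient matrix, i.e. the solution of B @ Pi = Gamma."""
    if any(len(row) != len(B) for row in B):
        raise ValueError("B must be square (one equation per endogenous variable)")
    return solve(B, Gamma)

# Hypothetical two-equation system with two predetermined variables.
B = [[1.0, -0.5],
     [-0.2, 1.0]]
Gamma = [[2.0, 0.0],
         [0.0, 3.0]]

Pi = reduced_form(B, Gamma)
for row in Pi:
    print([round(v, 4) for v in row])
```

The squareness check mirrors the requirement that the number of equations, including identities, equals the number of endogenous variables.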

Uncorrelated Errors across Equations

The SDIAG option in the PROC SYSLIN statement computes estimates by assuming uncorrelated errors across equations. As a result, when the SDIAG option is used, the 3SLS estimates are identical to 2SLS estimates, and the SUR estimates are the same as the OLS estimates.
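The equivalence can be seen from a short derivation. The sketch assumes that, under SDIAG, the estimated cross-equation matrix is diagonal, written here as $\hat{\Sigma} = \operatorname{diag}(\hat{\sigma}_{11}, \dots, \hat{\sigma}_{GG})$, and that $\mathbf{R}$ is block diagonal with blocks $R_i$; the notation $\hat{\sigma}_{ii}$ is introduced only for this illustration.

```latex
\begin{align*}
\mathbf{R}' (\hat{\Sigma}^{-1} \otimes \mathbf{I}) \mathbf{R}
  &= \operatorname{blockdiag}\!\left(
       \hat{\sigma}_{11}^{-1} R_1' R_1, \; \dots, \;
       \hat{\sigma}_{GG}^{-1} R_G' R_G \right), \\
\mathbf{R}' (\hat{\Sigma}^{-1} \otimes \mathbf{I}) \mathbf{y}
  &= \begin{pmatrix}
       \hat{\sigma}_{11}^{-1} R_1' y_1 \\ \vdots \\ \hat{\sigma}_{GG}^{-1} R_G' y_G
     \end{pmatrix}, \\
\hat{\boldsymbol{\delta}}_i
  &= \left( \hat{\sigma}_{ii}^{-1} R_i' R_i \right)^{-1} \hat{\sigma}_{ii}^{-1} R_i' y_i
   = (R_i' R_i)^{-1} R_i' y_i .
\end{align*}
```

The scalar $\hat{\sigma}_{ii}$ cancels, so each equation is estimated separately by least squares. Replacing $R_i$ with $\hat{R}_i$ throughout gives the same cancellation for 3SLS, leaving the 2SLS estimator.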

Overidentification Restrictions

The OVERID option in the MODEL statement can be used to test for overidentifying restrictions on parameters of each equation. The null hypothesis is that the predetermined variables that do not appear in any equation have zero coefficients. The alternative hypothesis is that at least one of the assumed zero coefficients is nonzero. The test is approximate and rejects the null hypothesis too frequently for small sample sizes.

The formula for the test is given as follows. Let $y_i = Y_i \boldsymbol{\beta}_i + Z_i \boldsymbol{\gamma}_i + e_i$ be the ith equation. $Y_i$ are the endogenous variables that appear as regressors in the ith equation, and $Z_i$ are the instrumental variables that appear as regressors in the ith equation. Let $N_i$ be the number of variables in $Y_i$ and $Z_i$.

Let $v_i = y_i - Y_i \hat{\boldsymbol{\beta}}_i$. Let $\mathbf{Z}$ represent all instrumental variables, $T$ be the total number of observations, and $K$ be the total number of instrumental variables. Define $\hat{l}$ as follows:

\[ \hat{l} = \frac{v_i' \left( \mathbf{I} - \mathbf{Z}_i (\mathbf{Z}_i' \mathbf{Z}_i)^{-1} \mathbf{Z}_i' \right) v_i}{v_i' \left( \mathbf{I} - \mathbf{Z} (\mathbf{Z}' \mathbf{Z})^{-1} \mathbf{Z}' \right) v_i} \]

Then the test statistic

\[ \frac{T - K}{K - N_i} \left( \hat{l} - 1 \right) \]

is distributed approximately as an F with $K - N_i$ and $T - K$ degrees of freedom. For more information, see Basmann (1960).
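The last step of the test is simple arithmetic once the ratio of the two quadratic forms is known. The sketch below assumes that ratio has already been computed and takes the degrees of freedom to be K − N_i (numerator) and T − K (denominator), as the scaling factor suggests; the function name `overid_f` and the sample values are hypothetical.

```python
def overid_f(l_hat, T, K, N_i):
    """Return (F statistic, numerator df, denominator df).

    l_hat : ratio of the two residual quadratic forms (assumed precomputed)
    T     : total number of observations
    K     : total number of instrumental variables
    N_i   : number of regressors in the ith equation
    """
    num_df = K - N_i
    den_df = T - K
    if num_df <= 0:
        raise ValueError("no overidentifying restrictions: K must exceed N_i")
    if den_df <= 0:
        raise ValueError("T must exceed K")
    f_stat = (den_df / num_df) * (l_hat - 1.0)
    return f_stat, num_df, den_df

# Hypothetical values: 30 observations, 6 instruments, 4 regressors.
f_stat, df1, df2 = overid_f(1.5, T=30, K=6, N_i=4)
print(f_stat, df1, df2)
```

A value of the ratio close to 1 gives a statistic near zero, which is consistent with the excluded instruments adding little explanatory power for the residuals.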

Fuller’s Modification to LIML

The ALPHA= option in the PROC SYSLIN and MODEL statements parameterizes Fuller’s modification to LIML. This modification is $k = \gamma - \alpha / (n - g)$, where $\alpha$ is the value of the ALPHA= option, $\gamma$ is the LIML k value, n is the number of observations, and g is the number of predetermined variables. Fuller’s modification is not used unless the ALPHA= option is specified. For more information, see Fuller (1977).
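As a small numeric illustration, the adjustment can be written as a one-line function. This sketch assumes the modification has the form k = γ − α / (n − g), with γ the LIML k value; the function name `fuller_k` and the sample inputs are hypothetical.

```python
def fuller_k(liml_k, alpha, n, g):
    """Return Fuller's modified k, assuming k = liml_k - alpha / (n - g).

    liml_k : the k value from LIML estimation
    alpha  : value of the ALPHA= option
    n      : number of observations
    g      : number of predetermined variables
    """
    if n <= g:
        raise ValueError("n must exceed g")
    return liml_k - alpha / (n - g)

# Hypothetical inputs: LIML k of 1.2, ALPHA=1, 50 observations, 10 predetermined variables.
print(fuller_k(1.2, 1.0, 50, 10))
```

Under this form, the adjustment shrinks toward zero as the sample size grows relative to the number of predetermined variables, so the modified estimator approaches LIML in large samples.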

Last updated: June 19, 2025