STATESPACE Procedure

Parameter Estimation

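The model is $\mathbf{z}_{t+1} = \mathbf{F}\mathbf{z}_t + \mathbf{G}\mathbf{e}_{t+1}$, where $\mathbf{e}_t$ is a sequence of independent multivariate normal innovations with mean vector $\mathbf{0}$ and variance $\boldsymbol{\Sigma}_{\mathbf{ee}}$. The observed sequence $\mathbf{x}_t$ composes the first $r$ components of $\mathbf{z}_t$, and thus $\mathbf{x}_t = \mathbf{H}\mathbf{z}_t$, where $\mathbf{H}$ is the $r \times s$ matrix $[\mathbf{I}_r \;\; \mathbf{0}]$ and $s$ is the dimension of the state vector $\mathbf{z}_t$.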

Let $\mathbf{E}$ be the $r \times n$ matrix of innovations:

\[ \mathbf{E} = \begin{bmatrix} \mathbf{e}_1 & \cdots & \mathbf{e}_n \end{bmatrix} \]

If the number of observations n is reasonably large, the log likelihood L can be approximated up to an additive constant as follows:

\[ L = -\frac{n}{2} \ln\left(|\boldsymbol{\Sigma}_{\mathbf{ee}}|\right) - \frac{1}{2}\,\mathrm{trace}\left(\boldsymbol{\Sigma}_{\mathbf{ee}}^{-1} \mathbf{E}\mathbf{E}'\right) \]

The elements of $\boldsymbol{\Sigma}_{\mathbf{ee}}$ are taken as free parameters and are estimated as follows:

\[ \mathbf{S}_0 = \frac{1}{n} \mathbf{E}\mathbf{E}' \]

Replacing $\boldsymbol{\Sigma}_{\mathbf{ee}}$ by $\mathbf{S}_0$ in the likelihood equation, the log likelihood, up to an additive constant, is

\[ L = -\frac{n}{2} \ln\left(|\mathbf{S}_0|\right) \]
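
A brief worked step, added here for clarity, shows why the trace term drops out of the likelihood. It uses only the definitions above; $r$ denotes the number of components of each innovation vector, which is also the number of observed series.

```latex
\mathrm{trace}\left(\mathbf{S}_0^{-1}\mathbf{E}\mathbf{E}'\right)
  = \mathrm{trace}\left(\mathbf{S}_0^{-1}\, n\,\mathbf{S}_0\right)
  = n\,\mathrm{trace}(\mathbf{I}_r) = nr
\quad\Longrightarrow\quad
L = -\frac{n}{2}\ln\left(|\mathbf{S}_0|\right) - \frac{nr}{2}
```

The term $-nr/2$ does not depend on the parameters, so it is absorbed into the additive constant.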

Letting $B$ be the backshift operator, the formal relation between $\mathbf{x}_t$ and $\mathbf{e}_t$ is

\[ \mathbf{x}_t = \mathbf{H}(\mathbf{I} - B\mathbf{F})^{-1}\mathbf{G}\,\mathbf{e}_t \]

\[ \mathbf{e}_t = \left(\mathbf{H}(\mathbf{I} - B\mathbf{F})^{-1}\mathbf{G}\right)^{-1}\mathbf{x}_t = \sum_{i=0}^{\infty} \boldsymbol{\Xi}_i \mathbf{x}_{t-i} \]
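
The weights $\boldsymbol{\Xi}_i$ in the expansion above can be illustrated numerically. The following is a minimal sketch, not the procedure's actual implementation: it expands $\mathbf{H}(\mathbf{I} - B\mathbf{F})^{-1}\mathbf{G}$ as $\sum_i \boldsymbol{\Psi}_i B^i$ with $\boldsymbol{\Psi}_i = \mathbf{H}\mathbf{F}^i\mathbf{G}$ and inverts that power series term by term. The function names, the recursion, and the example matrices are all assumptions introduced for illustration.

```python
import numpy as np

def impulse_weights(F, G, H, klag):
    """Psi_i = H F^i G for i = 0..klag (weight of e_{t-i} in x_t)."""
    psi, power = [], np.eye(F.shape[0])
    for _ in range(klag + 1):
        psi.append(H @ power @ G)
        power = power @ F
    return psi

def inverse_weights(F, G, H, klag):
    """Xi_i for i = 0..klag, from (sum_i Psi_i B^i)(sum_j Xi_j B^j) = I."""
    psi = impulse_weights(F, G, H, klag)
    psi0_inv = np.linalg.inv(psi[0])
    xi = [psi0_inv]
    for k in range(1, klag + 1):
        # Coefficient of B^k must vanish: Psi_0 Xi_k + sum_{j>=1} Psi_j Xi_{k-j} = 0
        acc = sum(psi[j] @ xi[k - j] for j in range(1, k + 1))
        xi.append(-psi0_inv @ acc)
    return xi

# Made-up example with r = 2 observed series and a state vector of dimension 3.
F = np.array([[0.5, 0.1, 0.0],
              [0.0, 0.3, 0.5],
              [0.2, -0.1, 0.4]])
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.3, 0.6]])
H = np.hstack([np.eye(2), np.zeros((2, 1))])  # H = [I_r 0]

klag = 10
psi = impulse_weights(F, G, H, klag)
xi = inverse_weights(F, G, H, klag)
```

Because the first $r$ rows of the example $\mathbf{G}$ form an identity block, $\boldsymbol{\Psi}_0 = \mathbf{H}\mathbf{G} = \mathbf{I}_r$ and therefore $\boldsymbol{\Xi}_0 = \mathbf{I}_r$ in this sketch.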

Letting $\mathbf{C}_i$ be the $i$th lagged sample covariance of $\mathbf{x}_t$ and neglecting end effects, the matrix $\mathbf{S}_0$ is

\[ \mathbf{S}_0 = \sum_{i,j=0}^{\infty} \boldsymbol{\Xi}_i \mathbf{C}_{-i+j} \boldsymbol{\Xi}_j' \]

For the computation of $\mathbf{S}_0$, the infinite sum is truncated at the value of the KLAG= option. The value of the KLAG= option should be large enough that the sequence $\boldsymbol{\Xi}_i$ is approximately $\mathbf{0}$ beyond that point.
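
The truncated double sum can be sketched as follows. This is an illustration under stated assumptions rather than the procedure's code: the lagged covariance is taken to be $\mathbf{C}_k = \frac{1}{n}\sum_t \mathbf{x}_t \mathbf{x}_{t-k}'$ on mean-centered data with $\mathbf{C}_{-k} = \mathbf{C}_k'$, the truncation point is the length of the supplied weight list, and the function names and the AR(1) example are hypothetical.

```python
import numpy as np

def lagged_cov(x, k):
    """C_k = (1/n) sum_t x_t x_{t-k}' for centered x of shape (n, r).
    Negative lags use C_{-k} = C_k' (an assumed convention)."""
    n = x.shape[0]
    if k < 0:
        return lagged_cov(x, -k).T
    return x[k:].T @ x[:n - k] / n

def s0_truncated(xi, x):
    """S_0 = sum_{i,j=0}^{klag} Xi_i C_{j-i} Xi_j', with klag = len(xi) - 1."""
    r = x.shape[1]
    s0 = np.zeros((r, r))
    for i, xi_i in enumerate(xi):
        for j, xi_j in enumerate(xi):
            s0 += xi_i @ lagged_cov(x, j - i) @ xi_j.T
    return s0

# Made-up example: univariate AR(1), x_t = phi * x_{t-1} + e_t, for which
# Xi_0 = 1, Xi_1 = -phi, and every later weight is exactly zero.
rng = np.random.default_rng(0)
n, phi = 2000, 0.6
shocks = rng.standard_normal(n)
x = np.zeros((n, 1))
x[0, 0] = shocks[0]
for t in range(1, n):
    x[t, 0] = phi * x[t - 1, 0] + shocks[t]
x -= x.mean(axis=0)  # center before forming covariances

xi = [np.array([[1.0]]), np.array([[-phi]])]
s0 = s0_truncated(xi, x)
```

In this example the weights vanish after lag 1, so truncating there loses nothing. The result differs from the directly computed residual variance only by the end-effect terms that the text says are neglected.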

Let $\boldsymbol{\theta}$ be the vector of free parameters in the $\mathbf{F}$ and $\mathbf{G}$ matrices. The derivative of the log likelihood with respect to the parameter $\boldsymbol{\theta}$ is

\[ \frac{\partial L}{\partial \boldsymbol{\theta}} = -\frac{n}{2}\,\mathrm{trace}\left(\mathbf{S}_0^{-1} \frac{\partial \mathbf{S}_0}{\partial \boldsymbol{\theta}}\right) \]

The second derivative is

\[ \frac{\partial^2 L}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}'} = \frac{n}{2}\left(\mathrm{trace}\left(\mathbf{S}_0^{-1} \frac{\partial \mathbf{S}_0}{\partial \boldsymbol{\theta}'}\,\mathbf{S}_0^{-1} \frac{\partial \mathbf{S}_0}{\partial \boldsymbol{\theta}}\right) - \mathrm{trace}\left(\mathbf{S}_0^{-1} \frac{\partial^2 \mathbf{S}_0}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}'}\right)\right) \]

Near the maximum, the first term is unimportant and the second term can be approximated to give the following second derivative approximation:

\[ \frac{\partial^2 L}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}'} \approx -n\,\mathrm{trace}\left(\mathbf{S}_0^{-1} \frac{\partial \mathbf{E}}{\partial \boldsymbol{\theta}} \frac{\partial \mathbf{E}'}{\partial \boldsymbol{\theta}'}\right) \]

The first derivative matrix and this second derivative matrix approximation are computed from the sample covariance matrix $\mathbf{C}_0$ and the truncated sequence $\boldsymbol{\Xi}_0, \ldots, \boldsymbol{\Xi}_{\mathit{klag}}$. The approximate likelihood function is maximized by a modified Newton-Raphson algorithm that employs these derivative matrices.

The matrix $\mathbf{S}_0$ is used as the estimate of the innovation covariance matrix, $\boldsymbol{\Sigma}_{\mathbf{ee}}$. The negative of the inverse of the second derivative matrix at the maximum is used as an approximate covariance matrix for the parameter estimates. The standard errors of the parameter estimates printed in the parameter estimates tables are taken from the diagonal of this covariance matrix. The parameter covariance matrix is printed when the COVB option is specified.
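
The relationship between the second derivative matrix and the reported standard errors can be sketched in a few lines. The Hessian values below are made up purely for illustration, and the function name is hypothetical; the square root of each diagonal element is taken to turn a variance into a standard error.

```python
import numpy as np

def parameter_covariance(hessian):
    """Approximate covariance of the estimates: the negative of the inverse
    of the second derivative matrix of the log likelihood at the maximum."""
    return -np.linalg.inv(hessian)

# Made-up second derivative matrix for two parameters (negative definite,
# as expected at a maximum of the log likelihood).
hessian = np.array([[-200.0, 40.0],
                    [40.0, -100.0]])

covb = parameter_covariance(hessian)   # analogous to what COVB prints
std_err = np.sqrt(np.diag(covb))       # one standard error per parameter
```

A flatter likelihood (a Hessian closer to zero) gives larger diagonal entries in `covb` and hence larger standard errors.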

If the data are nearly nonstationary, a better estimate of $\boldsymbol{\Sigma}_{\mathbf{ee}}$ and the other parameters can sometimes be obtained by specifying the RESIDEST option. The RESIDEST option estimates the parameters by using conditional least squares instead of maximum likelihood.

The residuals are computed using the state space equation and the sample mean values of the variables in the model as start-up values. The estimate of $\mathbf{S}_0$ is then computed using the residuals from the $i$th observation on, where $i$ is the maximum number of times any variable occurs in the state vector. A multivariate Gauss-Marquardt algorithm is used to minimize $|\mathbf{S}_0|$. For a further description of this method, see Harvey (1981a).
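
The residual computation and the criterion $|\mathbf{S}_0|$ can be sketched as below. This is a hedged illustration, not the procedure's algorithm: it assumes $\mathbf{H}\mathbf{G} = \mathbf{I}_r$, starts the state at zero for mean-centered data (that is, at the sample means), divides by the number of residuals actually used, and leaves out the Gauss-Marquardt minimization itself. The function names and the example matrices are hypothetical.

```python
import numpy as np

def residuals(F, G, H, xc):
    """One-step residuals e_t = x_t - H F z_{t-1}, with z_t = F z_{t-1} + G e_t.
    xc is centered data of shape (n, r); the state starts at zero.
    Assumes H G = I_r."""
    z = np.zeros(F.shape[0])
    e = np.empty_like(xc)
    for t in range(xc.shape[0]):
        pred = F @ z                 # predicted state before seeing x_t
        e[t] = xc[t] - H @ pred      # one-step prediction error
        z = pred + G @ e[t]          # state update from the state equation
    return e

def resid_criterion(F, G, H, x, start):
    """det(S_0) and S_0 from residuals at 0-based index `start` onward."""
    xc = x - x.mean(axis=0)
    e = residuals(F, G, H, xc)[start:]
    s0 = e.T @ e / e.shape[0]        # divisor is an assumption of this sketch
    return np.linalg.det(s0), s0

# Made-up example: r = 2 series, state dimension 3, simulated from a zero state.
F = np.array([[0.5, 0.1, 0.0],
              [0.0, 0.3, 0.5],
              [0.2, -0.1, 0.4]])
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.3, 0.6]])
H = np.hstack([np.eye(2), np.zeros((2, 1))])

rng = np.random.default_rng(1)
shocks = rng.standard_normal((200, 2))
z = np.zeros(3)
x = np.empty((200, 2))
for t in range(200):
    z = F @ z + G @ shocks[t]
    x[t] = H @ z

i = 2  # one variable appears twice in this 3-element state vector
crit, s0 = resid_criterion(F, G, H, x, start=i - 1)
```

A conditional least squares fit would repeat the `resid_criterion` evaluation for candidate values of the free parameters in $\mathbf{F}$ and $\mathbf{G}$ and keep the values that make `crit` smallest.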

Last updated: June 19, 2025