AUTOREG Procedure

Predicted Values

The AUTOREG procedure can produce two kinds of predicted values for the response series and corresponding residuals and confidence limits. The residuals in both cases are computed as the actual value minus the predicted value. In addition, when GARCH models are estimated, the AUTOREG procedure can output predictions of the conditional error variance.

Predicting the Unconditional Mean

The first type of predicted value is obtained from only the structural part of the model, $\mathbf{x}_t'\mathbf{b}$. These predicted values are useful in predicting values of new response time series, which are assumed to be described by the same model as the current response time series. The predicted values, residuals, standard errors, and upper and lower confidence limits for the structural predictions are requested by specifying the PREDICTEDM=, RESIDUALM=, STDERRM=, UCLM=, or LCLM= option in the OUTPUT statement. The ALPHACLM= option controls the confidence level for UCLM= and LCLM=. These confidence limits are for estimation of the mean of the dependent variable, $\mathbf{x}_t'\mathbf{b}$, where $\mathbf{x}_t$ is the column vector of independent variables at observation t.

The predicted values are computed as

$$\hat{y}_t = \mathbf{x}_t'\mathbf{b}$$

and the upper and lower confidence limits as

$$\hat{u}_t = \hat{y}_t + t_{\alpha/2}\,\mathrm{v}$$

$$\hat{l}_t = \hat{y}_t - t_{\alpha/2}\,\mathrm{v}$$

where $\mathrm{v}^2$ is an estimate of the variance of $\hat{y}_t$ and $t_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the t distribution:

$$\mathrm{Prob}(T > t_{\alpha/2}) = \alpha/2$$

where T is an observation from a t distribution with q degrees of freedom. The value of $\alpha$ can be set with the ALPHACLM= option. The degrees of freedom parameter, q, is taken to be the number of observations minus the number of free parameters in the final model. For the YW estimation method, the value of v is calculated as

$$\mathrm{v} = \sqrt{s^2\,\mathbf{x}_t'\left(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X}\right)^{-1}\mathbf{x}_t}$$

where $s^2$ is the error sum of squares divided by q. For the ULS and ML methods, it is calculated as

$$\mathrm{v} = \sqrt{s^2\,\mathbf{x}_t'\mathbf{W}\mathbf{x}_t}$$

where $\mathbf{W}$ is the submatrix of $(\mathbf{J}'\mathbf{J})^{-1}$ that corresponds to the regression parameters. For more information, see the section Computational Methods.
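As a rough numerical illustration of the confidence limits for the mean, the following Python sketch evaluates the predicted value plus or minus the critical value times v from quantities that are assumed to be available already. The function name, the argument names, and the numbers are hypothetical and are not part of the AUTOREG procedure. The argument cov_factor stands for whichever matrix appears inside the quadratic form (the YW or the ULS/ML version), and the t critical value is supplied by the caller rather than computed.

```python
import math

def mean_confidence_limits(x_t, b, cov_factor, s2, t_crit):
    """Return (y_hat, lower, upper) for the structural prediction x_t'b.

    cov_factor is the matrix inside the quadratic form for v (assumed to
    have been computed elsewhere), passed as a nested list.
    """
    k = len(x_t)
    y_hat = sum(x_t[i] * b[i] for i in range(k))
    # Quadratic form x_t' A x_t
    quad = sum(x_t[i] * cov_factor[i][j] * x_t[j]
               for i in range(k) for j in range(k))
    v = math.sqrt(s2 * quad)
    return y_hat, y_hat - t_crit * v, y_hat + t_crit * v

# Hypothetical inputs: intercept plus one regressor, q = 10, alpha = 0.05
# (2.228 is the upper 2.5% point of the t distribution with 10 df).
y_hat, lcl, ucl = mean_confidence_limits(
    x_t=[1.0, 2.0], b=[0.5, 1.5],
    cov_factor=[[0.25, 0.0], [0.0, 0.0625]],
    s2=4.0, t_crit=2.228)
```

The limits are symmetric about the predicted value, and their half-width scales with both the critical value and v.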

Predicting Future Series Realizations

The other predicted values use both the structural part of the model and the predicted values of the error process. These conditional mean values are useful in predicting future values of the current response time series. The predicted values, residuals, standard errors, and upper and lower confidence limits for future observations conditional on past values are requested by the PREDICTED=, RESIDUAL=, STDERR=, UCL=, or LCL= option in the OUTPUT statement. The ALPHACLI= option controls the confidence level for UCL= and LCL=. These confidence limits are for the predicted value,

$$\tilde{y}_t = \mathbf{x}_t'\mathbf{b} + \nu_{t|t-1}$$

where $\mathbf{x}_t$ is the vector of independent variables if all independent variables at time t are nonmissing, and $\nu_{t|t-1}$ is the minimum variance linear predictor of the error term, which is defined recursively as follows, given the AR(m) model for $\nu_t$:

$$\nu_{s|t} = \begin{cases} -\sum_{i=1}^{m} \hat{\varphi}_i\, \nu_{s-i|t} & s > t \text{ or observation } s \text{ is missing} \\ y_s - \mathbf{x}_s'\mathbf{b} & 0 < s \le t \text{ and observation } s \text{ is nonmissing} \\ 0 & s \le 0 \end{cases}$$

where $\hat{\varphi}_i$, $i = 1, \ldots, m$, are the estimated AR parameters. Observation s is considered to be missing if the dependent variable or at least one independent variable is missing. If some of the independent variables at time t are missing, the predicted $\tilde{y}_t$ is also missing. With the same definition of $\nu_{s|t}$, the prediction method can be easily extended to the multistep forecast of $\tilde{y}_{t+d}$, $d > 0$:

$$\tilde{y}_{t+d} = \mathbf{x}_{t+d}'\mathbf{b} + \nu_{t+d|t-1}$$

The prediction method is implemented through the Kalman filter.
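The recursive definition of the predicted error term can be sketched directly in Python. This is only an illustration of the recursion, not the Kalman filter implementation that the procedure uses, and it is not efficient for long forecast horizons. The function names, the dictionary layout for the structural residuals, and the sample values are assumptions made for this example.

```python
def predicted_noise(s, t, resid, phi):
    """Predictor nu_{s|t} of the error term, by direct recursion.

    resid maps a 1-based observation index s to y_s - x_s'b, or to None
    when observation s is missing; phi[i-1] holds the estimated AR
    parameter phi_i.
    """
    if s <= 0:
        return 0.0
    value = resid.get(s) if s <= t else None
    if value is not None:
        return value
    # s > t, or observation s is missing: apply the AR(m) recursion
    return -sum(phi[i - 1] * predicted_noise(s - i, t, resid, phi)
                for i in range(1, len(phi) + 1))

def forecast(t, d, x_future, b, resid, phi):
    """y-tilde at time t+d: structural part plus nu_{t+d|t-1}."""
    structural = sum(xi * bi for xi, bi in zip(x_future, b))
    return structural + predicted_noise(t + d, t - 1, resid, phi)

# Hypothetical AR(1) example; observation 2 is missing.
resid = {1: 1.0, 2: None, 3: 2.0}
phi = [-0.5]
nu_4_given_3 = predicted_noise(4, 3, resid, phi)  # uses the residual at 3
nu_2_given_2 = predicted_noise(2, 2, resid, phi)  # falls back to obs. 1
```

Because a missing observation is replaced by its own prediction, the recursion reaches further back into the series after a gap, which is one way to see why the prediction variance is larger there.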

If $\tilde{y}_t$ is not missing, the upper and lower confidence limits are computed as

$$\tilde{u}_t = \tilde{y}_t + t_{\alpha/2}\,\mathrm{v}$$

$$\tilde{l}_t = \tilde{y}_t - t_{\alpha/2}\,\mathrm{v}$$

where v, in this case, is computed as

$$\mathrm{v} = \sqrt{\mathbf{z}_t'\mathbf{V}_{\beta}\mathbf{z}_t + s^2 r}$$

where $\mathbf{V}_{\beta}$ is the variance-covariance matrix of the estimate of the regression parameter $\beta$; $\mathbf{z}_t$ is defined as

$$\mathbf{z}_t = \mathbf{x}_t + \sum_{i=1}^{m} \hat{\varphi}_i\, \mathbf{x}_{t-i|t-1}$$

and $\mathbf{x}_{s|t}$ is defined in a similar way as $\nu_{s|t}$:

$$\mathbf{x}_{s|t} = \begin{cases} -\sum_{i=1}^{m} \hat{\varphi}_i\, \mathbf{x}_{s-i|t} & s > t \text{ or observation } s \text{ is missing} \\ \mathbf{x}_s & 0 < s \le t \text{ and observation } s \text{ is nonmissing} \\ 0 & s \le 0 \end{cases}$$

The formula for computing the prediction variance v is derived from Baillie (1979).

The value $s^2 r$ is the estimate of the conditional prediction error variance. At the start of the series, and after missing values, r is usually greater than 1. For the computational details of r, see the section Predicting the Conditional Variance. The plot of residuals and confidence limits in Example 8.4 illustrates this behavior.

Except to adjust the degrees of freedom for the error sum of squares, the preceding formulas do not account for the fact that the autoregressive parameters are estimated. In particular, the confidence limits are likely to be somewhat too narrow. In large samples, this is probably not an important effect, but it might be appreciable in small samples. For some discussion of this problem for AR(1) models, see Harvey (1981).

At the beginning of the series (the first m observations, where m is the value of the NLAG= option) and after missing values, these residuals do not match the residuals obtained by using OLS on the transformed variables. This is because, in these cases, the predicted noise values must be based on less than a complete set of past noise values and, thus, have larger variance. The GLS transformation for these observations includes a scale factor in addition to a linear combination of past values. Put another way, the matrix defined in the section Computational Methods has the value 1 along the diagonal, except for the first m observations and after missing values.

Predicting the Conditional Variance

The GARCH process can be written as

$$\epsilon_t^2 = \omega + \sum_{i=1}^{n} (\alpha_i + \gamma_i)\,\epsilon_{t-i}^2 - \sum_{j=1}^{p} \gamma_j\, \eta_{t-j} + \eta_t$$

where $\eta_t = \epsilon_t^2 - h_t$ and $n = \max(p, q)$. This representation shows that the squared residual $\epsilon_t^2$ follows an ARMA$(n, p)$ process. Then for any $d > 0$, the conditional expectations are as follows:

$$\mathbf{E}(\epsilon_{t+d}^2 \mid \Psi_t) = \omega + \sum_{i=1}^{n} (\alpha_i + \gamma_i)\,\mathbf{E}(\epsilon_{t+d-i}^2 \mid \Psi_t) - \sum_{j=1}^{p} \gamma_j\, \mathbf{E}(\eta_{t+d-j} \mid \Psi_t)$$
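The conditional expectation recursion can be illustrated with a short Python sketch. It assumes that the innovation in the ARMA representation is the squared residual minus the conditional variance, that coefficients beyond the model orders are zero, that expectations of future innovations are zero, and that quantities dated t or earlier are known. The function name, the argument layout, and the sample values are hypothetical.

```python
def garch_variance_forecasts(omega, alpha, gamma, eps2, h, horizon):
    """Return [E(eps^2_{t+1}|Psi_t), ..., E(eps^2_{t+horizon}|Psi_t)].

    alpha[i-1] and gamma[j-1] hold alpha_i and gamma_j. eps2 and h hold
    squared residuals and conditional variances up to time t (most recent
    last); they must have equal length of at least max(p, q).
    """
    q, p = len(alpha), len(gamma)
    n = max(p, q)
    a = list(alpha) + [0.0] * (n - q)   # alpha_i = 0 for i > q
    g = list(gamma) + [0.0] * (n - p)   # gamma_i = 0 for i > p
    eta = [e - v for e, v in zip(eps2, h)]  # known innovations, s <= t
    forecasts = []
    for d in range(1, horizon + 1):
        value = omega
        for i in range(1, n + 1):
            lag = d - i  # position of eps^2_{t+d-i} relative to t
            past = forecasts[lag - 1] if lag > 0 else eps2[lag - 1]
            value += (a[i - 1] + g[i - 1]) * past
        for j in range(1, p + 1):
            lag = d - j
            # Future innovations have expectation 0; past ones are known
            if lag <= 0:
                value -= gamma[j - 1] * eta[lag - 1]
        forecasts.append(value)
    return forecasts

# Hypothetical GARCH(1,1) inputs
f = garch_variance_forecasts(0.1, [0.2], [0.7], eps2=[1.5], h=[1.0],
                             horizon=2)
```

For a GARCH(1,1) model the first forecast reduces to the familiar one-step conditional variance, and later forecasts depend only on the preceding forecast.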

The d-step-ahead prediction error, $\xi_{t+d} = y_{t+d} - y_{t+d|t}$, has the conditional variance

$$\mathbf{V}(\xi_{t+d} \mid \Psi_t) = \sum_{j=0}^{d-1} g_j^2\, \sigma_{t+d-j|t}^2$$

where

$$\sigma_{t+d-j|t}^2 = \mathbf{E}(\epsilon_{t+d-j}^2 \mid \Psi_t)$$

Coefficients in the conditional d-step prediction error variance are calculated recursively using the formula

$$g_j = -\varphi_1 g_{j-1} - \cdots - \varphi_m g_{j-m}$$

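where $g_0 = 1$ and $g_j = 0$ if $j < 0$; $\varphi_1, \ldots, \varphi_m$ are autoregressive parameters. Since the parameters are not known, the conditional variance is computed using the estimated autoregressive parameters. The d-step-ahead prediction error variance is simplified when there are no autoregressive terms, because then $g_j = 0$ for all $j > 0$:

$$\mathbf{V}(\xi_{t+d} \mid \Psi_t) = \sigma_{t+d|t}^2$$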

Therefore, the one-step-ahead prediction error variance is equivalent to the conditional error variance defined in the GARCH process:

$$h_t = \mathbf{E}(\epsilon_t^2 \mid \Psi_{t-1}) = \sigma_{t|t-1}^2$$
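A minimal Python sketch of the weight recursion and the resulting d-step prediction error variance follows. It takes the first weight to be 1 and weights with negative index to be 0 as starting values (assumed here as the usual convention). The function names and the sample numbers are hypothetical, and the variance forecasts are treated as given inputs.

```python
def prediction_error_weights(phi, d):
    """Return [g_0, ..., g_{d-1}], with g_0 = 1 and g_j = 0 for j < 0."""
    g = []
    for j in range(d):
        if j == 0:
            g.append(1.0)
        else:
            # g_j = -phi_1 g_{j-1} - ... - phi_m g_{j-m}
            g.append(-sum(phi[i - 1] * g[j - i]
                          for i in range(1, len(phi) + 1) if j - i >= 0))
    return g

def prediction_error_variance(phi, sigma2_forecasts):
    """Sum over j of g_j^2 * sigma^2_{t+d-j|t}.

    sigma2_forecasts[k-1] holds sigma^2_{t+k|t} for k = 1, ..., d.
    """
    d = len(sigma2_forecasts)
    g = prediction_error_weights(phi, d)
    return sum(g[j] ** 2 * sigma2_forecasts[d - j - 1] for j in range(d))

# Hypothetical inputs: AR(1) parameter and three variance forecasts
weights = prediction_error_weights([-0.5], 3)
with_ar = prediction_error_variance([-0.5], [1.1, 1.09, 1.081])
no_ar = prediction_error_variance([], [1.1, 1.09, 1.081])
```

With an empty list of autoregressive parameters, only the first weight is nonzero, so the result is just the last variance forecast, matching the simplification described above.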

The multistep forecast of conditional error variance of the EGARCH, QGARCH, TGARCH, PGARCH, and GARCH-M models cannot be calculated using the preceding formula for the GARCH model. The following formulas are recursively implemented to obtain the multistep forecast of conditional error variance of these models:

for the EGARCH(p, q) model:

where
for the QGARCH(p, q) model:
for the TGARCH(p, q) model:
for the PGARCH(p, q) model:
for the GARCH-M model: ignoring the mean effect and directly using the formula of the corresponding GARCH model.

If the conditional error variance is homoscedastic, the conditional prediction error variance is identical to the unconditional prediction error variance

$$\mathbf{V}(\xi_{t+d} \mid \Psi_t) = \mathbf{V}(\xi_{t+d}) = \sigma^2 \sum_{j=0}^{d-1} g_j^2$$

since $\sigma_{t+d-j|t}^2 = \sigma^2$. You can compute $s^2 r$ (which is the second term of the variance for the predicted value $\tilde{y}_t$ explained in the section Predicting Future Series Realizations) by using the formula $\sigma^2 \sum_{j=0}^{d-1} g_j^2$, and r is estimated from $\sum_{j=0}^{d-1} g_j^2$ by using the estimated autoregressive parameters.

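Consider the following conditional prediction error variance:

$$\mathbf{V}(\xi_{t+d} \mid \Psi_t) = \sigma^2 \sum_{j=0}^{d-1} g_j^2 + \sum_{j=0}^{d-1} g_j^2 \left(\sigma_{t+d-j|t}^2 - \sigma^2\right)$$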

The second term in the preceding equation can be interpreted as the noise from using the homoscedastic conditional variance when the errors follow the GARCH process. However, it is expected that if the GARCH process is covariance stationary, the difference between the conditional prediction error variance and the unconditional prediction error variance disappears as the forecast horizon d increases.
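The convergence claim can be checked numerically for a simple case. The Python sketch below uses a covariance stationary GARCH(1,1) process with hypothetical parameter values and compares the conditional variance forecast with the unconditional variance, taken here as omega divided by one minus the sum of the ARCH and GARCH coefficients. The function name and all numbers are assumptions for illustration only.

```python
def garch11_forecast(omega, alpha, gamma, eps2_t, h_t, d):
    """Conditional variance forecast d steps ahead (d >= 1), GARCH(1,1)."""
    value = omega + alpha * eps2_t + gamma * h_t   # one step ahead
    for _ in range(d - 1):
        value = omega + (alpha + gamma) * value    # further steps
    return value

# Hypothetical parameters with alpha + gamma < 1
omega, alpha, gamma = 0.1, 0.2, 0.7
sigma2 = omega / (1.0 - alpha - gamma)  # unconditional variance

# Absolute gap between conditional and unconditional variance by horizon
gaps = [abs(garch11_forecast(omega, alpha, gamma, 1.5, 1.0, d) - sigma2)
        for d in (1, 5, 20, 50)]
```

In this example the gap shrinks geometrically at rate alpha plus gamma, so the term that distinguishes the conditional from the unconditional prediction error variance becomes negligible at long horizons.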

Last updated: June 19, 2025