VARMAX Procedure

Vector Error Correction Model

A vector error correction model (VECM) can lead to a better understanding of the nature of any nonstationarity among the different component series and can also improve longer-term forecasting compared to an unconstrained model.

The VECM(p) form with the cointegration rank $r\ (\le k)$ is written as

$$
\Delta \mathbf{y}_t = \boldsymbol{\delta} + \Pi \mathbf{y}_{t-1} + \sum_{i=1}^{p-1} \Phi^*_i \Delta \mathbf{y}_{t-i} + \boldsymbol{\epsilon}_t
$$

where $\Delta$ is the differencing operator, such that $\Delta \mathbf{y}_t = \mathbf{y}_t - \mathbf{y}_{t-1}$; $\Pi = \alpha\beta'$, where $\alpha$ and $\beta$ are $k \times r$ matrices; and $\Phi^*_i$ is a $k \times k$ matrix.

The VECM(p) form has an equivalent VAR(p) representation as described in the section Vector Autoregressive Model.

$$
\mathbf{y}_t = \boldsymbol{\delta} + (I_k + \Pi + \Phi^*_1)\,\mathbf{y}_{t-1} + \sum_{i=2}^{p-1} (\Phi^*_i - \Phi^*_{i-1})\,\mathbf{y}_{t-i} - \Phi^*_{p-1}\,\mathbf{y}_{t-p} + \boldsymbol{\epsilon}_t
$$

where $I_k$ is a $k \times k$ identity matrix.
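For the case $p = 2$, which is the case used in the example that follows, the sum over $i = 2, \ldots, p-1$ is empty. The following worked special case is derived from the two forms above; the symbols $\Phi_1$ and $\Phi_2$ for the VAR(2) coefficient matrices are notation assumed here for illustration.

```latex
\begin{aligned}
\mathbf{y}_t &= \boldsymbol{\delta} + (I_k + \Pi + \Phi^*_1)\,\mathbf{y}_{t-1} - \Phi^*_1\,\mathbf{y}_{t-2} + \boldsymbol{\epsilon}_t \\
             &= \boldsymbol{\delta} + \Phi_1\,\mathbf{y}_{t-1} + \Phi_2\,\mathbf{y}_{t-2} + \boldsymbol{\epsilon}_t \\[4pt]
\Rightarrow\quad \Phi^*_1 &= -\Phi_2, \qquad \Pi = \Phi_1 + \Phi_2 - I_k
\end{aligned}
```

Read in this direction, the relations give the VECM(2) coefficients directly from a known VAR(2) model.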

An example of the second-order nonstationary vector autoregressive model is

$$
\mathbf{y}_t = \begin{pmatrix} -0.2 & 0.1 \\ 0.5 & 0.2 \end{pmatrix} \mathbf{y}_{t-1} + \begin{pmatrix} 0.8 & 0.7 \\ -0.4 & 0.6 \end{pmatrix} \mathbf{y}_{t-2} + \boldsymbol{\epsilon}_t
$$

with

$$
\Sigma = \begin{pmatrix} 100 & 0 \\ 0 & 100 \end{pmatrix} \quad \text{and} \quad \mathbf{y}_{-1} = \mathbf{y}_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
$$

This process can be given the following VECM(2) representation with the cointegration rank one:

$$
\Delta \mathbf{y}_t = \begin{pmatrix} -0.4 \\ 0.1 \end{pmatrix} (1, -2)\,\mathbf{y}_{t-1} - \begin{pmatrix} 0.8 & 0.7 \\ -0.4 & 0.6 \end{pmatrix} \Delta \mathbf{y}_{t-1} + \boldsymbol{\epsilon}_t
$$
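As a quick consistency check, the following PROC IML sketch recomputes the VECM(2) coefficients from the VAR(2) coefficient matrices of the example. It assumes the relations $\Pi = \Phi_1 + \Phi_2 - I_2$ and $\Phi^*_1 = -\Phi_2$, which follow from the equivalent VAR(p) representation with p = 2. The matrix names (a1, a2, pimat, phistar, alphavec, betavec, check) are illustrative and are not part of the original example.

```sas
proc iml;
   /* VAR(2) coefficient matrices from the example */
   a1 = {-0.2 0.1,  0.5 0.2};
   a2 = { 0.8 0.7, -0.4 0.6};

   /* Implied VECM(2) coefficients for p = 2 (assumed relations) */
   pimat   = a1 + a2 - i(2);   /* Pi     = A1 + A2 - I */
   phistar = -a2;              /* Phi*_1 = -A2         */

   /* Factorization stated in the text: Pi = alpha * beta' */
   alphavec = {-0.4, 0.1};
   betavec  = {1, -2};
   check    = pimat - alphavec * t(betavec);   /* zero up to floating-point rounding */

   print pimat, phistar, check;
quit;
```

Because pimat factors into a 2 x 1 vector times a 1 x 2 vector, it has rank one, which is the cointegration rank stated for this process.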

The following PROC IML statements generate simulated data for this VECM(2) form and the PROC SGPLOT statements plot the data, as shown in Figure 12:

proc iml;
   sig = 100*i(2);
   phi = {-0.2 0.1, 0.5 0.2, 0.8 0.7, -0.4 0.6};
   call varmasim(y,phi) sigma=sig n=100 initial=0
                        seed=45876;
   cn = {'y1' 'y2'};
   create simul2 from y[colname=cn];
   append from y;
quit;

data simul2;
   set simul2;
   date = intnx( 'year', '01jan1900'd, _n_-1 );
   format date year4. ;
run;
proc sgplot data=simul2;
   series x=date y=y1 / lineattrs=(pattern=solid);
   series x=date y=y2 / lineattrs=(pattern=dash);
   yaxis label="Series";
run;

Figure 12: Plot of Generated Data Process



Cointegration Testing

The following statements use the Johansen cointegration rank test. The COINTTEST=(JOHANSEN) option performs the Johansen trace test and is equivalent to specifying the COINTTEST option with no additional suboptions or specifying the COINTTEST=(JOHANSEN=(TYPE=TRACE)) option.

/*--- Cointegration Test ---*/

proc varmax data=simul2;
   model y1 y2 / p=2 noint dftest cointtest=(johansen);
run;

Figure 13 shows the output for Dickey-Fuller tests for the nonstationarity of each series and the Johansen cointegration rank test between series.

Figure 13: Dickey-Fuller Tests and Cointegration Rank Test

The VARMAX Procedure

Unit Root Test
Variable  Type            Rho  Pr < Rho    Tau  Pr < Tau
y1        Zero Mean      1.47    0.9628   1.65    0.9755
          Single Mean   -0.80    0.9016  -0.47    0.8916
          Trend        -10.88    0.3573  -2.20    0.4815
y2        Zero Mean     -0.05    0.6692  -0.03    0.6707
          Single Mean   -6.03    0.3358  -1.72    0.4204
          Trend        -50.49    0.0003  -4.92    0.0006

Cointegration Rank Test Using Trace
H0: Rank=r  H1: Rank>r  Eigenvalue    Trace  Pr > Trace  Drift in ECM  Drift in Process
         0           0      0.5086  70.7279      <.0001  NOINT         Constant
         1           1      0.0111   1.0921      0.3441


In Dickey-Fuller tests, the second column specifies three types of models, which are zero mean, single mean, or trend. The third column (Rho) and the fifth column (Tau) are the test statistics that are used to test the null hypothesis that the series has a unit root. The fourth and sixth columns are the corresponding p-values. For the zero-mean and single-mean models, all p-values are well above conventional significance levels, so the null hypothesis of a unit root is not rejected for either series; only the trend model for y2 rejects it. For a description of Dickey-Fuller tests, see the section PROBDF Function for Dickey-Fuller Tests in Chapter 5, SAS Macros and Functions.

In the "Cointegration Rank Test Using Trace" table, the last two columns explain the drift in the model or process. Because the NOINT option is specified, the model is

$$
\Delta \mathbf{y}_t = \Pi \mathbf{y}_{t-1} + \Phi^*_1 \Delta \mathbf{y}_{t-1} + \boldsymbol{\epsilon}_t
$$

The column Drift in ECM indicates that there is no separate drift in the error correction model, and the column Drift in Process indicates that the process has a constant drift before differencing.

H0 is the null hypothesis, and H1 is the alternative hypothesis. The first row tests the cointegration rank $r = 0$ against $r > 0$, and the second row tests $r = 1$ against $r > 1$. The trace test statistics in the fourth column are computed by $-T \sum_{i=r+1}^{k} \log(1 - \lambda_i)$, where $T$ is the available number of observations and $\lambda_i$ is the eigenvalue in the third column. The p-values for these statistics are output in the fifth column. If you compare the p-value in each row to the significance level of interest (such as 5%), the null hypothesis that there is no cointegrated process (H0: $r = 0$) is rejected, whereas the null hypothesis that there is at most one cointegrated process (H0: $r = 1$) cannot be rejected.
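The trace statistics can be reproduced approximately from the eigenvalues reported in Figure 13. The sketch below assumes T = 98 (the 100 simulated observations less p = 2 presample values); that value is an inference from the example and is not printed in the output. The variable names (lambda, nobs, r0, tracestat) are illustrative.

```sas
proc iml;
   lambda = {0.5086, 0.0111};   /* eigenvalues as printed in Figure 13 (rounded) */
   nobs   = 98;                 /* assumed T: 100 observations minus p = 2 */
   k      = nrow(lambda);

   tracestat = j(k, 1, .);
   do r = 0 to k-1;
      /* -T * sum over i = r+1..k of log(1 - lambda_i) */
      tracestat[r+1] = -nobs * sum(log(1 - lambda[(r+1):k]));
   end;

   r0 = t(0:(k-1));             /* hypothesized ranks 0, 1 */
   print r0 tracestat[format=8.4];
quit;
```

Under these assumptions the computed values are approximately 70.72 and 1.09, compared with 70.7279 and 1.0921 in the table. The small differences come from using eigenvalues rounded to four decimal places.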

The following statements fit a VECM(2) form to the simulated data:

/*--- Vector Error Correction Model ---*/

proc varmax data=simul2;
   model y1 y2 / p=2 noint lagmax=3
                 print=(iarr estimates);
   cointeg rank=1 normalize=y1;
run;

The results in Figure 13 indicate that the time series are cointegrated with rank 1, so the RANK=1 option is specified in the COINTEG statement. To normalize the cointegrating vector, you specify the normalized variable by using the NORMALIZE= option in the COINTEG statement. The COINTEG statement produces the estimates of the long-run parameter, $\boldsymbol{\beta}$, and the adjustment coefficient, $\boldsymbol{\alpha}$. The PRINT=(IARR) option provides the VAR(2) representation.

The VARMAX procedure output is shown in Figure 14 through Figure 17. In Figure 14, "1" indicates the first column of the $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ matrices. Because the cointegration rank is 1 in the bivariate system, $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are two-dimensional vectors. The estimated cointegrating vector is $\hat{\boldsymbol{\beta}} = (1, -1.96)'$. Therefore, the long-run relationship between $y_{1t}$ and $y_{2t}$ is $y_{1t} = 1.96\, y_{2t}$. The first element of $\hat{\boldsymbol{\beta}}$ is 1 because $y_1$ is specified as the normalized variable. Asymptotically, $\boldsymbol{\alpha}$ follows a normal distribution, and the t values and p-values of its elements are shown in the "Alpha and Beta Parameter Estimates" table; however, because $\boldsymbol{\beta}$ follows a nonnormal distribution, the corresponding standard errors, t values, and p-values are missing. The Variable column shows the variables that correspond to the coefficients. For example, for the coefficient $\boldsymbol{\alpha}_{ij}$ (the ith element in the jth column of $\boldsymbol{\alpha}$), ALPHAi_j, the variable is the inner product of the transpose of the jth column of $\boldsymbol{\beta}$ (Beta[,j]') and the vector of lag 1 dependent variables $\mathbf{y}_{t-1}$ (_DEP_(t-1)).

Figure 14: Parameter Estimates for the VECM(2) Form

The VARMAX Procedure

Type of Model        VECM(2)
Estimation Method    Maximum Likelihood Estimation
Cointegrated Rank    1

Beta
Variable         1
y1         1.00000
y2        -1.95575

Alpha
Variable         1
y1        -0.46680
y2         0.10667

Alpha and Beta Parameter Estimates
Equation  Parameter  Estimate  Standard Error  t Value  Pr > |t|  Variable
D_y1      ALPHA1_1   -0.46680         0.04786    -9.75    <.0001  Beta[,1]'*_DEP_(t-1)
          BETA1_1     1.00000                                     y1(t-1)
D_y2      ALPHA2_1    0.10667         0.05146     2.07    0.0409  Beta[,1]'*_DEP_(t-1)
          BETA2_1    -1.95575                                     y2(t-1)
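One way to inspect the estimated long-run relationship is to compute the error correction term $\hat{\boldsymbol{\beta}}'\mathbf{y}_t = y_{1t} - 1.95575\, y_{2t}$ from the Beta estimates in Figure 14 and plot it. If the series are cointegrated, this combination should fluctuate around a constant level instead of wandering like the individual series. The following statements are a minimal sketch; the data set name ect_check and the variable name ect are illustrative and are not part of the original example.

```sas
/* Error correction term built from the Beta estimates in Figure 14 */
data ect_check;
   set simul2;
   ect = y1 - 1.95575*y2;
run;

/* Plot the term with a reference line at zero */
proc sgplot data=ect_check;
   series x=date y=ect;
   refline 0 / axis=y;
   yaxis label="Error Correction Term";
run;
```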


Figure 15 shows the parameter estimates in terms of lag 1 coefficients, $\mathbf{y}_{t-1}$, and lag 1 first-differenced coefficients, $\Delta \mathbf{y}_{t-1}$, and their significance. "Alpha * Beta'" indicates the coefficients of $\mathbf{y}_{t-1}$ and is obtained by multiplying the Alpha and Beta estimates in Figure 14. The parameter AR1_i_j (which is shown in the "Model Parameter Estimates" table) corresponds to the elements in the "Alpha * Beta'" matrix. The parameter AR2_i_j corresponds to the elements in the differenced lagged AR coefficient matrix. The "D_" prefixed to a variable name in Figure 15 implies differencing.

Figure 15: Parameter Estimates for the VECM(2) Form, Continued

Parameter Alpha * Beta' Estimates
Variable        y1        y2
y1        -0.46680   0.91295
y2         0.10667  -0.20862

AR Coefficients of Differenced Lag
DIF Lag  Variable        y1        y2
1        y1        -0.74332  -0.74621
         y2         0.40493  -0.57157

Model Parameter Estimates
Equation  Parameter  Estimate  Standard Error  t Value  Pr > |t|  Variable
D_y1      AR1_1_1    -0.46680         0.04786    -9.75    <.0001  y1(t-1)
          AR1_1_2     0.91295         0.09359     9.75    <.0001  y2(t-1)
          AR2_1_1    -0.74332         0.04526   -16.42    <.0001  D_y1(t-1)
          AR2_1_2    -0.74621         0.04769   -15.65    <.0001  D_y2(t-1)
D_y2      AR1_2_1     0.10667         0.05146     2.07    0.0409  y1(t-1)
          AR1_2_2    -0.20862         0.10064    -2.07    0.0409  y2(t-1)
          AR2_2_1     0.40493         0.04867     8.32    <.0001  D_y1(t-1)
          AR2_2_2    -0.57157         0.05128   -11.15    <.0001  D_y2(t-1)
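The "Alpha * Beta'" matrix can be checked by hand from the Alpha and Beta estimates in Figure 14. The following worked product uses the estimates as printed (five decimal places), so it matches the table only up to rounding in the last digit.

```latex
\hat{\boldsymbol{\alpha}}\hat{\boldsymbol{\beta}}'
  = \begin{pmatrix} -0.46680 \\ 0.10667 \end{pmatrix}
    \begin{pmatrix} 1 & -1.95575 \end{pmatrix}
  = \begin{pmatrix}
      -0.46680 & (-0.46680)(-1.95575) \\
       0.10667 & (0.10667)(-1.95575)
    \end{pmatrix}
  \approx \begin{pmatrix} -0.4668 & 0.9129 \\ 0.1067 & -0.2086 \end{pmatrix}
```

The first column reproduces the Alpha estimates because the cointegrating vector is normalized so that its first element is 1.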


Figure 16 shows the parameter estimates of the innovations covariance matrix and their significance.

Figure 16: Parameter Estimates for the VECM(2) Form, Continued

Covariance Parameter Estimates
Parameter   Estimate  Standard Error  t Value  Pr > |t|
COV1_1      94.75575        13.53654     7.00    <.0001
COV1_2       4.52684        10.30302     0.44    0.6614
COV2_2     109.57038        15.65291     7.00    <.0001


The fitted model is represented as

$$
\Delta \mathbf{y}_t =
\begin{pmatrix}
-0.467 & 0.913 \\
(0.048) & (0.094) \\
0.107 & -0.209 \\
(0.051) & (0.100)
\end{pmatrix} \mathbf{y}_{t-1}
+
\begin{pmatrix}
-0.743 & -0.746 \\
(0.045) & (0.048) \\
0.405 & -0.572 \\
(0.049) & (0.051)
\end{pmatrix} \Delta \mathbf{y}_{t-1}
+ \boldsymbol{\epsilon}_t
$$

Figure 17: Change the VECM(2) Form to the VAR(2) Model

Infinite Order AR Representation
Lag  Variable        y1        y2
1    y1        -0.21013   0.16674
     y2         0.51160   0.21980
2    y1         0.74332   0.74621
     y2        -0.40493   0.57157
3    y1         0.00000   0.00000
     y2         0.00000   0.00000


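The PRINT=(IARR) option in the previous SAS statements prints the reparameterized coefficient estimates. Because LAGMAX=3 is specified in those statements, coefficient matrices are printed through lag 3; the coefficient matrix of lag 3 is zero because the VECM(2) form is equivalent to a VAR(2) model.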

The VECM(2) form in Figure 17 can be rewritten as the following second-order vector autoregressive model:

$$
\mathbf{y}_t = \begin{pmatrix} -0.210 & 0.167 \\ 0.512 & 0.220 \end{pmatrix} \mathbf{y}_{t-1} + \begin{pmatrix} 0.743 & 0.746 \\ -0.405 & 0.572 \end{pmatrix} \mathbf{y}_{t-2} + \boldsymbol{\epsilon}_t
$$
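The VAR(2) coefficients can also be recovered by hand from the estimates in Figure 15 by applying the equivalent VAR(p) representation with p = 2: the lag 1 matrix is $I_2 + \Pi + \Phi^*_1$ and the lag 2 matrix is $-\Phi^*_1$. The following PROC IML statements are a minimal sketch of that calculation; the matrix names (pimat, phistar, a1, a2) are illustrative, and the inputs are the estimates as printed, so the results agree with Figure 17 only up to rounding in the last decimal place.

```sas
proc iml;
   /* Estimates from Figure 15 */
   pimat   = {-0.46680  0.91295,
               0.10667 -0.20862};   /* Alpha * Beta' */
   phistar = {-0.74332 -0.74621,
               0.40493 -0.57157};   /* AR coefficients of differenced lag 1 */

   /* VAR(2) coefficients implied by the VECM(2) form */
   a1 = i(2) + pimat + phistar;     /* lag 1: I + Pi + Phi*_1 */
   a2 = -phistar;                   /* lag 2: -Phi*_1         */

   print a1[format=9.5], a2[format=9.5];
quit;
```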
Last updated: June 19, 2025