AUTOREG Procedure

Testing

The modeling process consists of four stages: identification, specification, estimation, and diagnostic checking (Cromwell, Labys, and Terraza 1994). The AUTOREG procedure supports tens of statistical tests for identification and diagnostic checking. Figure 17 illustrates how to incorporate these statistical tests into the modeling process.

Figure 17: Statistical Tests in the AUTOREG Procedure


Testing for Stationarity

Most of the theories of time series require stationarity; therefore, it is critical to determine whether a time series is stationary. Two examples of nonstationary time series are fractionally integrated time series and autoregressive series with random coefficients. More often, however, a time series is nonstationary because of an upward trend over time. The trend can be captured by either of the following two models:

  • The difference stationary process

    \[ (1 - L) y_t = \delta + \psi(L) \epsilon_t \]

    where $L$ is the lag operator, $\psi(1) \neq 0$, and $\epsilon_t$ is a white noise sequence with mean zero and variance $\sigma^2$. Hamilton (1994) also refers to this model as the unit root process.

  • The trend stationary process

    \[ y_t = \alpha + \delta t + \psi(L) \epsilon_t \]
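The two models above can be simulated side by side to see why they call for different transformations. The following is a minimal sketch, assuming $\psi(L) = 1$ (so the errors are plain white noise) and illustrative parameter values; the function names are hypothetical and are not part of the AUTOREG procedure.

```python
import random

def simulate_difference_stationary(eps, delta, y0=0.0):
    """(1 - L) y_t = delta + eps_t: a random walk with drift delta (psi(L) = 1 assumed)."""
    y, level = [], y0
    for e in eps:
        level = level + delta + e
        y.append(level)
    return y

def simulate_trend_stationary(eps, alpha, delta):
    """y_t = alpha + delta * t + eps_t for t = 1, ..., T (psi(L) = 1 assumed)."""
    return [alpha + delta * t + e for t, e in enumerate(eps, start=1)]

random.seed(0)
eps = [random.gauss(0.0, 1.0) for _ in range(200)]
ds = simulate_difference_stationary(eps, delta=0.1)
ts = simulate_trend_stationary(eps, alpha=1.0, delta=0.1)

# Differencing the difference stationary series leaves delta + eps_t.
ds_diff = [ds[t] - ds[t - 1] for t in range(1, len(ds))]

# Removing the (here known) linear trend from the trend stationary series leaves eps_t.
ts_detrended = [y - (1.0 + 0.1 * t) for t, y in enumerate(ts, start=1)]
```

In this sketch, differencing is the transformation that recovers a stationary series from `ds`, while subtracting the linear trend is the one that does so for `ts`.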

When a process has a unit root, it is said to be integrated of order one, or I(1). An I(1) process is stationary after differencing once. The trend stationary process and the difference stationary process require different treatments to transform them into a stationary process for analysis. Therefore, it is important to distinguish between the two processes. Bhargava (1986) nested the two processes into the following general model:

\[ y_t = \gamma_0 + \gamma_1 t + \alpha \left( y_{t-1} - \gamma_0 - \gamma_1 (t - 1) \right) + \psi(L) \epsilon_t \]

However, a difficulty is that the right-hand side is nonlinear in the parameters. Therefore, it is convenient to use a different parameterization:

\[ y_t = \beta_0 + \beta_1 t + \alpha y_{t-1} + \psi(L) \epsilon_t \]

The test of the null hypothesis that $\alpha = 1$ against the one-sided alternative $\alpha < 1$ is called a unit root test.
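As a worked step, expanding the Bhargava (1986) nesting model shows how its parameters map to the linear parameterization. The expansion below is a sketch that carries the error term through unchanged:

\[
\begin{aligned}
y_t &= \gamma_0 + \gamma_1 t + \alpha y_{t-1} - \alpha \gamma_0 - \alpha \gamma_1 (t - 1) + \psi(L) \epsilon_t \\
    &= \left[ \gamma_0 (1 - \alpha) + \gamma_1 \alpha \right] + \gamma_1 (1 - \alpha)\, t + \alpha y_{t-1} + \psi(L) \epsilon_t
\end{aligned}
\]

so that $\beta_0 = \gamma_0 (1 - \alpha) + \gamma_1 \alpha$ and $\beta_1 = \gamma_1 (1 - \alpha)$. When $\alpha = 1$, these reduce to $\beta_0 = \gamma_1$ and $\beta_1 = 0$: the trend coefficient vanishes and the intercept plays the role of the drift.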

Dickey-Fuller unit root tests are based on regression models similar to the previous model,

\[ y_t = \beta_0 + \beta_1 t + \alpha y_{t-1} + \epsilon_t \]

where $\epsilon_t$ is assumed to be white noise. The t statistic of the coefficient $\alpha$ does not follow the normal distribution asymptotically. Instead, its distribution can be derived using the functional central limit theorem. Three types of regression models including the preceding one are considered by the Dickey-Fuller test. The deterministic terms that are included in the other two types of regressions are either null or constant only.

The Dickey-Fuller unit root test requires the errors in the autoregressive model to be white noise, an assumption that often does not hold. There are two popular ways to account for general serial correlation in the errors. One is the augmented Dickey-Fuller (ADF) test, which adds lagged differences to the regression model. It was originally proposed by Dickey and Fuller (1979) and later studied by Said and Dickey (1984) and Phillips and Perron (1988). The other, proposed by Phillips and Perron (1988), is called the Phillips-Perron (PP) test. The PP tests adopt the original Dickey-Fuller regression with intercept but modify the test statistics to take account of serial correlation and heteroscedasticity. The approach is called nonparametric because no specific form of the serial correlation of the errors is assumed.

A problem with the augmented Dickey-Fuller and Phillips-Perron unit root tests is that they are subject to size distortion and low power. Schwert (1989) reports that the size distortion is significant when the series contains a large moving average (MA) parameter. DeJong et al. (1992) find that, in some common settings, the ADF test has power around one third and the PP test has power less than 0.1 against the trend stationary alternative. More recent unit root tests that reduce the size distortion and improve the power include those described by Elliott, Rothenberg, and Stock (1996) and Ng and Perron (2001). These tests involve a detrending step before the test statistics are constructed and are demonstrated to perform better than the traditional ADF and PP tests.

Most testing procedures specify the unit root processes as the null hypothesis. Tests of the null hypothesis of stationarity have also been studied, among which Kwiatkowski et al. (1992) is very popular.

Economic theories often dictate that a group of economic time series are linked together by some long-run equilibrium relationship. Statistically, this phenomenon can be modeled by cointegration. When several nonstationary processes $\mathbf{z}_t = (z_{1t}, \ldots, z_{kt})'$ are cointegrated, there exists a $(k \times 1)$ cointegrating vector $\mathbf{c}$ such that $\mathbf{c}'\mathbf{z}_t$ is stationary and $\mathbf{c}$ is a nonzero vector. One way to test the relationship of cointegration is the residual-based cointegration test, which assumes the regression model

\[ y_t = \beta_1 + \mathbf{x}_t' \beta + u_t \]

where $y_t = z_{1t}$, $\mathbf{x}_t = (z_{2t}, \ldots, z_{kt})'$, and $\beta = (\beta_2, \ldots, \beta_k)'$. The OLS residuals from the regression model are used to test for the null hypothesis of no cointegration. Engle and Granger (1987) suggest using ADF on the residuals, while Phillips and Ouliaris (1990) study the tests using PP and other related test statistics.

Augmented Dickey-Fuller Unit Root and Engle-Granger Cointegration Testing

Common unit root tests have the null hypothesis that there is an autoregressive unit root, $H_0: \alpha = 1$, and the alternative is $H_a: |\alpha| < 1$, where $\alpha$ is the autoregressive coefficient of the time series

\[ y_t = \alpha y_{t-1} + \epsilon_t \]

This is referred to as the zero mean model. The standard Dickey-Fuller (DF) test assumes that errors $\epsilon_t$ are white noise. There are two other types of regression models that include a constant or a time trend as follows:

\[
\begin{aligned}
y_t &= \mu + \alpha y_{t-1} + \epsilon_t \\
y_t &= \mu + \beta t + \alpha y_{t-1} + \epsilon_t
\end{aligned}
\]

These two models are referred to as the constant mean model and the trend model, respectively. The constant mean model includes a constant mean $\mu$ of the time series. However, the interpretation of $\mu$ depends on the stationarity in the following sense: the mean in the stationary case when $\alpha < 1$ is the trend in the integrated case when $\alpha = 1$. Therefore, the null hypothesis should be the joint hypothesis that $\alpha = 1$ and $\mu = 0$. However, for the unit root tests, the test statistics are concerned with the null hypothesis of $\alpha = 1$. The joint null hypothesis is not commonly used. This issue is addressed in Bhargava (1986) with a different nesting model.

There are two types of test statistics. The conventional t ratio is

\[ DF_\tau = \frac{\hat{\alpha} - 1}{sd(\hat{\alpha})} \]

and the second test statistic, called the $\rho$-test, is

\[ T(\hat{\alpha} - 1) \]
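The two statistics can be computed directly from an OLS fit of the zero mean model. The following is a minimal sketch in plain Python, not the AUTOREG implementation; the function name is hypothetical, and $T$ is taken to be the number of observations used in the regression, which is an assumption.

```python
import random

def dickey_fuller_zero_mean(y):
    """Return (alpha_hat, rho_stat, tau_stat) for y_t = alpha * y_{t-1} + e_t."""
    lagged, current = y[:-1], y[1:]
    n = len(current)                              # usable observations (T, by assumption)
    sxx = sum(x * x for x in lagged)
    alpha_hat = sum(x * v for x, v in zip(lagged, current)) / sxx
    resid = [v - alpha_hat * x for x, v in zip(lagged, current)]
    s2 = sum(e * e for e in resid) / (n - 1)      # one estimated coefficient
    sd_alpha = (s2 / sxx) ** 0.5                  # OLS standard error of alpha_hat
    rho_stat = n * (alpha_hat - 1.0)              # T(alpha_hat - 1)
    tau_stat = (alpha_hat - 1.0) / sd_alpha       # DF_tau
    return alpha_hat, rho_stat, tau_stat

# Illustrative data: a random walk (unit root) and a stationary AR(1) with alpha = 0.5.
random.seed(1)
shocks = [random.gauss(0.0, 1.0) for _ in range(500)]
random_walk, ar_half = [0.0], [0.0]
for e in shocks:
    random_walk.append(random_walk[-1] + e)
    ar_half.append(0.5 * ar_half[-1] + e)

rw_stats = dickey_fuller_zero_mean(random_walk)
ar_stats = dickey_fuller_zero_mean(ar_half)
```

Both statistics are expected to be far more negative for the stationary series than for the random walk. They are compared against Dickey-Fuller critical values rather than normal or Student's t tables.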

For the zero mean model, the asymptotic distributions of the Dickey-Fuller test statistics are

\[
\begin{aligned}
T(\hat{\alpha} - 1) &\Rightarrow \left( \int_0^1 W(r)\,dW(r) \right) \left( \int_0^1 W(r)^2\,dr \right)^{-1} \\
DF_\tau &\Rightarrow \left( \int_0^1 W(r)\,dW(r) \right) \left( \int_0^1 W(r)^2\,dr \right)^{-1/2}
\end{aligned}
\]

For the constant mean model, the asymptotic distributions are

\[
\begin{aligned}
T(\hat{\alpha} - 1) &\Rightarrow \left( \frac{W(1)^2 - 1}{2} - W(1) \int_0^1 W(r)\,dr \right) \left( \int_0^1 W(r)^2\,dr - \left( \int_0^1 W(r)\,dr \right)^2 \right)^{-1} \\
DF_\tau &\Rightarrow \left( \frac{W(1)^2 - 1}{2} - W(1) \int_0^1 W(r)\,dr \right) \left( \int_0^1 W(r)^2\,dr - \left( \int_0^1 W(r)\,dr \right)^2 \right)^{-1/2}
\end{aligned}
\]

For the trend model, the asymptotic distributions are

\[
\begin{aligned}
T(\hat{\alpha} - 1) &\Rightarrow \left[ \int_0^1 W(r)\,dW(r) + 12 \left( \int_0^1 r W(r)\,dr - \tfrac{1}{2} \int_0^1 W(r)\,dr \right) \left( \int_0^1 W(r)\,dr - \tfrac{1}{2} W(1) \right) \right. \\
&\qquad \left. {} - W(1) \int_0^1 W(r)\,dr \right] D^{-1} \\
DF_\tau &\Rightarrow \left[ \int_0^1 W(r)\,dW(r) + 12 \left( \int_0^1 r W(r)\,dr - \tfrac{1}{2} \int_0^1 W(r)\,dr \right) \left( \int_0^1 W(r)\,dr - \tfrac{1}{2} W(1) \right) \right. \\
&\qquad \left. {} - W(1) \int_0^1 W(r)\,dr \right] D^{-1/2}
\end{aligned}
\]

where

\[ D = \int_0^1 W(r)^2\,dr - 12 \left( \int_0^1 r W(r)\,dr \right)^2 + 12 \int_0^1 W(r)\,dr \int_0^1 r W(r)\,dr - 4 \left( \int_0^1 W(r)\,dr \right)^2 \]

One problem of the Dickey-Fuller and similar tests that employ three types of regressions is the difficulty in the specification of the deterministic trends. Campbell and Perron (1991) claimed that "the proper handling of deterministic trends is a vital prerequisite for dealing with unit roots." However, the "proper handling" is not obvious since the distribution theory of the relevant statistics about the deterministic trends is not available. Hayashi (2000) suggests using the constant mean model when you think there is no trend, and using the trend model when you think otherwise. However, no formal procedure is provided.

The null hypothesis of the Dickey-Fuller test is a random walk, possibly with drift. The differenced process is not serially correlated under the null of I(1). There is a great need for the generalization of this specification. The augmented Dickey-Fuller (ADF) test, originally proposed in Dickey and Fuller (1979), adjusts for the serial correlation in the time series by adding lagged first differences to the autoregressive model,

\[ \Delta y_t = \mu + \delta t + \alpha y_{t-1} + \sum_{j=1}^{p} \alpha_j \Delta y_{t-j} + \epsilon_t \]

where the deterministic terms $\delta t$ and $\mu$ can be absent for the models without drift or linear trend. As previously, there are two types of test statistics. One is the OLS t value

\[ \frac{\hat{\alpha}}{sd(\hat{\alpha})} \]

and the other is given by

\[ \frac{T \hat{\alpha}}{1 - \hat{\alpha}_1 - \cdots - \hat{\alpha}_p} \]

The asymptotic distributions of the test statistics are the same as those of the standard Dickey-Fuller test statistics.
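The ADF regression and its two statistics can be sketched with ordinary least squares. The code below is a minimal illustration in plain Python, assuming the constant mean variant (intercept $\mu$, no trend term) and taking $T$ to be the number of rows in the regression; the helper names are hypothetical, and this is not how the AUTOREG procedure is implemented.

```python
import random

def ols_with_inverse(X, y):
    """Return OLS coefficients and (X'X)^{-1}, using Gauss-Jordan elimination."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    return beta, inv

def adf_constant_mean(y, p):
    """ADF regression: dy_t = mu + alpha*y_{t-1} + sum_j alpha_j*dy_{t-j} + e_t."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    X, target = [], []
    for i in range(p, len(dy)):
        # Row for dy[i]: intercept, lagged level y[i], then p lagged differences.
        X.append([1.0, y[i]] + [dy[i - j] for j in range(1, p + 1)])
        target.append(dy[i])
    beta, inv = ols_with_inverse(X, target)
    n, k = len(target), len(beta)
    resid = [v - sum(b * x for b, x in zip(beta, row)) for row, v in zip(X, target)]
    s2 = sum(e * e for e in resid) / (n - k)
    alpha_hat = beta[1]
    tau_stat = alpha_hat / (s2 * inv[1][1]) ** 0.5        # OLS t value of alpha_hat
    rho_stat = n * alpha_hat / (1.0 - sum(beta[2:]))      # T*alpha_hat / (1 - sum alpha_j)
    return tau_stat, rho_stat

# Illustrative data: a random walk and a stationary AR(1) with coefficient 0.5.
random.seed(2)
rw, ar = [0.0], [0.0]
for _ in range(1000):
    e = random.gauss(0.0, 1.0)
    rw.append(rw[-1] + e)
    ar.append(0.5 * ar[-1] + e)

tau_rw, rho_rw = adf_constant_mean(rw, p=2)
tau_ar, rho_ar = adf_constant_mean(ar, p=2)
```

Because the regression is written in differences, the coefficient on $y_{t-1}$ is tested against zero rather than one, which is why no "minus 1" appears in either statistic.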

Nonstationary multivariate time series can be tested for cointegration, which means that a linear combination of these time series is stationary. Formally, denote the series by $\mathbf{z}_t = (z_{1t}, \ldots, z_{kt})'$. Cointegration means that there exists a nonzero vector $\mathbf{c}$ such that $\mathbf{c}'\mathbf{z}_t$ is stationary. Residual-based cointegration tests were studied in Engle and Granger (1987) and Phillips and Ouliaris (1990). The latter are described in the next subsection. The first step regression is

\[ y_t = \mathbf{x}_t' \beta + u_t \]

where $y_t = z_{1t}$, $\mathbf{x}_t = (z_{2t}, \ldots, z_{kt})'$, and $\beta = (\beta_2, \ldots, \beta_k)'$. This regression can also include an intercept or an intercept with a linear trend. The residuals are used to test for the existence of an autoregressive unit root. Engle and Granger (1987) proposed augmented Dickey-Fuller type regression without an intercept on the residuals to test the unit root. When the first step OLS does not include an intercept, the asymptotic distribution of the ADF test statistic $DF_\tau$ is given by

\[
\begin{aligned}
DF_\tau &\Longrightarrow \int_0^1 \frac{Q(r)}{\left( \int_0^1 Q^2 \right)^{1/2}} \, dS \\
Q(r) &= W_1(r) - \int_0^1 W_1 W_2' \left( \int_0^1 W_2 W_2' \right)^{-1} W_2(r) \\
S(r) &= \frac{Q(r)}{(\kappa' \kappa)^{1/2}} \\
\kappa' &= \left( 1, \; -\int_0^1 W_1 W_2' \left( \int_0^1 W_2 W_2' \right)^{-1} \right)
\end{aligned}
\]

where $W(r)$ is a $k$-vector standard Brownian motion and

\[ W(r) = \left( W_1(r), W_2(r) \right) \]

is a partition such that $W_1(r)$ is a scalar and $W_2(r)$ is $(k - 1)$-dimensional. The asymptotic distributions of the test statistics in the other two cases have the same form as the preceding formula. If the first step regression includes an intercept, then $W(r)$ is replaced by the de-meaned Brownian motion $\bar{W}(r) = W(r) - \int_0^1 W(r)\,dr$. If the first step regression includes a time trend, then $W(r)$ is replaced by the detrended Brownian motion. The critical values of the asymptotic distributions are tabulated in Phillips and Ouliaris (1990) and MacKinnon (1991).
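The two-step residual-based procedure can be sketched for the simplest case of two series. The code below is a hedged illustration in plain Python, assuming $k = 2$, no intercept in the first step, and no augmentation lags ($p = 0$) in the second step; the function name and simulated data are hypothetical.

```python
import random

def engle_granger_tau(y, x):
    """Two-step residual-based test statistic for two series (k = 2, no lags)."""
    # Step 1: OLS of y_t on x_t without intercept.
    beta_hat = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    u = [b - beta_hat * a for a, b in zip(x, y)]
    # Step 2: Dickey-Fuller regression on the residuals, u_t = alpha*u_{t-1} + e_t.
    lagged, current = u[:-1], u[1:]
    n = len(current)
    sxx = sum(v * v for v in lagged)
    alpha_hat = sum(a * b for a, b in zip(lagged, current)) / sxx
    resid = [b - alpha_hat * a for a, b in zip(lagged, current)]
    s2 = sum(e * e for e in resid) / (n - 1)
    tau_stat = (alpha_hat - 1.0) / (s2 / sxx) ** 0.5
    return beta_hat, tau_stat

random.seed(3)
x, w = [], []
level_x, level_w = 0.0, 0.0
for _ in range(500):
    level_x += random.gauss(0.0, 1.0)
    level_w += random.gauss(0.0, 1.0)
    x.append(level_x)          # random walk
    w.append(level_w)          # an independent random walk

y_coint = [2.0 * v + random.gauss(0.0, 1.0) for v in x]   # cointegrated with x

beta_c, tau_c = engle_granger_tau(y_coint, x)
beta_s, tau_s = engle_granger_tau(w, x)                    # no cointegration
```

The resulting statistic is compared with the tabulated cointegration critical values mentioned above, not with the ordinary Dickey-Fuller tables, because the residuals come from an estimated regression.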

The residual-based cointegration tests have a major shortcoming. Different choices of the dependent variable in the first step OLS might produce contradictory results. This can be explained theoretically. If the dependent variable is in the cointegration relationship, then the test is consistent against the alternative that there is cointegration. On the other hand, if the dependent variable is not in the cointegration system, the OLS residuals $y_t - \mathbf{x}_t' \beta$ do not converge to a stationary process. Changing the dependent variable is more likely to produce conflicting results in finite samples.

Phillips-Perron Unit Root and Cointegration Testing

Besides the ADF test, there is another popular unit root test that is valid under general serial correlation and heteroscedasticity, developed by Phillips (1987) and Phillips and Perron (1988). The tests are constructed using the AR(1) type regressions, unlike ADF tests, with corrected estimation of the long run variance of $\Delta y_t$. In the case without intercept, consider the driftless random walk process

\[ y_t = y_{t-1} + u_t \]

where the disturbances might be serially correlated with possible heteroscedasticity. Phillips and Perron (1988) proposed the unit root test of the OLS regression model,

\[ y_t = \rho y_{t-1} + u_t \]

Denote the OLS residual by $\hat{u}_t$. The asymptotic variance of $\frac{1}{T} \sum_{t=1}^{T} \hat{u}_t^2$ can be estimated by using the truncation lag $l$,

\[ \hat{\lambda} = \sum_{j=0}^{l} \kappa_j \left[ 1 - j / (l + 1) \right] \hat{\gamma}_j \]

where $\kappa_0 = 1$, $\kappa_j = 2$ for $j > 0$, and $\hat{\gamma}_j = \frac{1}{T} \sum_{t=j+1}^{T} \hat{u}_t \hat{u}_{t-j}$. This is a consistent estimator suggested by Newey and West (1987).

The variance of $u_t$ can be estimated by $s^2 = \frac{1}{T - k} \sum_{t=1}^{T} \hat{u}_t^2$. Let $\hat{\sigma}^2$ be the variance estimate of the OLS estimator $\hat{\rho}$. Then the Phillips-Perron $\hat{Z}_\rho$ test (zero mean case) is written

\[ \hat{Z}_\rho = T(\hat{\rho} - 1) - \tfrac{1}{2} T^2 \hat{\sigma}^2 (\hat{\lambda} - \hat{\gamma}_0) / s^2 \]

The $\hat{Z}_\rho$ statistic is just the ordinary Dickey-Fuller $\hat{Z}_\alpha$ statistic with a correction term that accounts for the serial correlation. The correction term goes to zero asymptotically if there is no serial correlation.

Note that $P(\hat{\rho} < 1) \approx 0.68$ as $T \rightarrow \infty$, which shows that the limiting distribution is skewed to the left.
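The skewness claim can be checked with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions: Gaussian white noise shocks, $y_0 = 0$, 200 observations per series, and 2000 replications, all of which are arbitrary choices; the function name is hypothetical.

```python
import random

def fraction_rho_below_one(n_obs, n_reps, seed=0):
    """Share of simulated driftless random walks whose OLS rho_hat is below one."""
    rng = random.Random(seed)
    below = 0
    for _ in range(n_reps):
        y_prev, num, den = 0.0, 0.0, 0.0
        for _ in range(n_obs):
            y = y_prev + rng.gauss(0.0, 1.0)
            num += y_prev * y            # sum of y_{t-1} * y_t
            den += y_prev * y_prev       # sum of y_{t-1}^2
            y_prev = y
        # rho_hat = num / den, so rho_hat < 1 exactly when num < den.
        if num < den:
            below += 1
    return below / n_reps

estimate = fraction_rho_below_one(n_obs=200, n_reps=2000)
print(round(estimate, 3))
```

The simulated share should land near 0.68, well above one half, which is what a left-skewed limiting distribution of $\hat{\rho} - 1$ implies.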

Let $t_{\hat{\rho}}$ be the $\tau$ (t ratio) statistic for $\hat{\rho}$. The Phillips-Perron $\hat{Z}_t$ (defined here as $\hat{Z}_\tau$) test is written

\[ \hat{Z}_\tau = (\hat{\gamma}_0 / \hat{\lambda})^{1/2} \, t_{\hat{\rho}} - \tfrac{1}{2} T \hat{\sigma} (\hat{\lambda} - \hat{\gamma}_0) / (s \hat{\lambda}^{1/2}) \]
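The pieces of the Phillips-Perron statistics for the zero mean case can be assembled as follows. This is a minimal sketch in plain Python rather than the procedure's implementation; it assumes $k = 1$ in the variance estimate $s^2$ (one estimated coefficient) and takes $T$ to be the number of observations in the regression. The function name and return layout are hypothetical.

```python
import random

def phillips_perron_zero_mean(y, lag):
    """Return (z_rho, z_tau, df_rho, t_rho) for the regression y_t = rho*y_{t-1} + u_t."""
    lagged, current = y[:-1], y[1:]
    n = len(current)
    sxx = sum(x * x for x in lagged)
    rho_hat = sum(x * v for x, v in zip(lagged, current)) / sxx
    u = [v - rho_hat * x for x, v in zip(lagged, current)]

    def gamma(j):
        # Sample autocovariance of the residuals at lag j, with divisor T.
        return sum(u[t] * u[t - j] for t in range(j, n)) / n

    gamma0 = gamma(0)
    # Newey-West estimate: weight 1 at lag 0, weight 2*(1 - j/(l+1)) at lag j > 0.
    lam = gamma0 + sum(2.0 * (1.0 - j / (lag + 1)) * gamma(j) for j in range(1, lag + 1))
    s2 = sum(e * e for e in u) / (n - 1)        # k = 1 assumed
    var_rho = s2 / sxx                          # sigma_hat^2, variance estimate of rho_hat
    df_rho = n * (rho_hat - 1.0)
    t_rho = (rho_hat - 1.0) / var_rho ** 0.5
    z_rho = df_rho - 0.5 * n * n * var_rho * (lam - gamma0) / s2
    z_tau = ((gamma0 / lam) ** 0.5 * t_rho
             - 0.5 * n * var_rho ** 0.5 * (lam - gamma0) / (s2 ** 0.5 * lam ** 0.5))
    return z_rho, z_tau, df_rho, t_rho

random.seed(4)
walk = [0.0]
for _ in range(300):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

no_correction = phillips_perron_zero_mean(walk, lag=0)
with_correction = phillips_perron_zero_mean(walk, lag=4)
```

With a truncation lag of zero, $\hat{\lambda} = \hat{\gamma}_0$, so both correction terms vanish and the statistics coincide with the uncorrected Dickey-Fuller values. This gives a convenient sanity check on the sketch.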

To incorporate a constant intercept, the regression model $y_t = \mu + \rho y_{t-1} + u_t$ is used (single mean case), and under the null hypothesis the series is a driftless random walk with nonzero unconditional mean. To incorporate a time trend, the regression model $y_t = \mu + \delta t + \rho y_{t-1} + u_t$ is used, and under the null hypothesis the series is a random walk with drift.

The limiting distributions of the test statistics for the zero mean case are

\[
\begin{aligned}
\hat{Z}_\rho &\Rightarrow \frac{\tfrac{1}{2} \left\{ B(1)^2 - 1 \right\}}{\int_0^1 [B(x)]^2\,dx} \\
\hat{Z}_\tau &\Rightarrow \frac{\tfrac{1}{2} \left\{ [B(1)]^2 - 1 \right\}}{\left\{ \int_0^1 [B(x)]^2\,dx \right\}^{1/2}}
\end{aligned}
\]

where $B(\cdot)$ is a standard Brownian motion.

The limiting distributions of the test statistics for the intercept case are

\[
\begin{aligned}
\hat{Z}_\rho &\Rightarrow \frac{\tfrac{1}{2} \left\{ [B(1)]^2 - 1 \right\} - B(1) \int_0^1 B(x)\,dx}{\int_0^1 [B(x)]^2\,dx - \left[ \int_0^1 B(x)\,dx \right]^2} \\
\hat{Z}_\tau &\Rightarrow \frac{\tfrac{1}{2} \left\{ [B(1)]^2 - 1 \right\} - B(1) \int_0^1 B(x)\,dx}{\left\{ \int_0^1 [B(x)]^2\,dx - \left[ \int_0^1 B(x)\,dx \right]^2 \right\}^{1/2}}
\end{aligned}
\]

Finally, the limiting distributions of the test statistics for the trend case can be derived as

\[
\begin{bmatrix} 0 & c & 0 \end{bmatrix} V^{-1}
\begin{bmatrix} B(1) \\ \left( B(1)^2 - 1 \right) / 2 \\ B(1) - \int_0^1 B(x)\,dx \end{bmatrix}
\]

where $c = 1$ for $\hat{Z}_\rho$ and $c = \frac{1}{\sqrt{Q}}$ for $\hat{Z}_\tau$,

\[
V = \begin{bmatrix}
1 & \int_0^1 B(x)\,dx & 1/2 \\
\int_0^1 B(x)\,dx & \int_0^1 B(x)^2\,dx & \int_0^1 x B(x)\,dx \\
1/2 & \int_0^1 x B(x)\,dx & 1/3
\end{bmatrix}
\]

\[
Q = \begin{bmatrix} 0 & c & 0 \end{bmatrix} V^{-1} \begin{bmatrix} 0 & c & 0 \end{bmatrix}^{T}
\]

The finite sample performance of the PP test is not satisfactory (see Hayashi 2000).

When several variables $\mathbf{z}_t = (z_{1t}, \ldots, z_{kt})'$ are cointegrated, there exists a $(k \times 1)$ cointegrating vector $\mathbf{c}$ such that $\mathbf{c}'\mathbf{z}_t$ is stationary and $\mathbf{c}$ is a nonzero vector. The residual-based cointegration test assumes the following regression model,

\[ y_t = \beta_1 + \mathbf{x}_t' \beta + u_t \]

where $y_t = z_{1t}$, $\mathbf{x}_t = (z_{2t}, \ldots, z_{kt})'$, and $\beta = (\beta_2, \ldots, \beta_k)'$. You can estimate the consistent cointegrating vector by using OLS if all variables are difference stationary, that is, I(1). The estimated cointegrating vector is $\hat{\mathbf{c}} = (1, -\hat{\beta}_2, \ldots, -\hat{\beta}_k)'$. The Phillips-Ouliaris test is computed using the OLS residuals from the preceding regression model, and it uses the PP unit root tests $\hat{Z}_\rho$ and $\hat{Z}_\tau$ developed in Phillips (1987), although in Phillips and Ouliaris (1990) the asymptotic distributions of some other leading unit root tests are also derived. The null hypothesis is no cointegration.

You need to refer to the tables by Phillips and Ouliaris (1990) to obtain the p-value of the cointegration test. Before you apply the cointegration test, you might want to perform the unit root test for each variable (see the option STATIONARITY=).

As in the Engle-Granger cointegration tests, the Phillips-Ouliaris test can give conflicting results for different choices of the regressand. There are other cointegration tests that are invariant to the order of the variables, including those of Johansen (1988), Johansen (1991), and Stock and Watson (1988).

ERS and Ng-Perron Unit Root Tests

As mentioned earlier, ADF and PP both suffer severe size distortion and low power. There is a class of newer tests that improve both size and power. These are sometimes called efficient unit root tests, and among them tests by Elliott, Rothenberg, and Stock (1996) and Ng and Perron (2001) are prominent.

Elliott, Rothenberg, and Stock (1996) consider the data generating process

StartLayout 1st Row 1st Column y Subscript t 2nd Column equals beta prime z Subscript t Baseline plus u Subscript t Baseline 2nd Row 1st Column u Subscript t 2nd Column equals alpha u Subscript t minus 1 Baseline plus v Subscript t Baseline comma t equals 1 comma ellipsis comma upper T EndLayout

where StartSet z Subscript t Baseline EndSet is either StartSet 1 EndSet or StartSet left-parenthesis 1 comma t right-parenthesis EndSet and StartSet v Subscript t Baseline EndSet is an unobserved stationary zero-mean process with positive spectral density at zero frequency. The null hypothesis is upper H 0 colon alpha equals 1, and the alternative is upper H Subscript a Baseline colon StartAbsoluteValue alpha EndAbsoluteValue less-than 1. The key idea of Elliott, Rothenberg, and Stock (1996) is to study the asymptotic power and asymptotic power envelope of some new tests. Asymptotic power is defined with a sequence of local alternatives. For a fixed alternative hypothesis, the power of a test usually goes to one when sample size goes to infinity; however, this says nothing about the finite sample performance. On the other hand, when the data generating process under the alternative moves closer to the null hypothesis as the sample size increases, the power does not necessarily converge to one. The local-to-unity alternatives in ERS are

$$\alpha = 1 + \frac{c}{T}$$

and the power against the local alternatives has a limit as $T$ goes to infinity, which is called asymptotic power. This value is strictly between 0 and 1. Asymptotic power indicates how well a test distinguishes small deviations from the null hypothesis.

Define

$$
\begin{aligned}
y_\alpha &= (y_1, (1 - \alpha L) y_2, \ldots, (1 - \alpha L) y_T) \\
z_\alpha &= (z_1, (1 - \alpha L) z_2, \ldots, (1 - \alpha L) z_T)
\end{aligned}
$$

Let $S(\alpha)$ be the sum of squared residuals from a least squares regression of $y_\alpha$ on $z_\alpha$. Then the point optimal test against the local alternative $\bar\alpha = 1 + \bar c / T$ has the form

$$P_T^{GLS} = \frac{S(\bar\alpha) - \bar\alpha S(1)}{\hat\omega^2}$$

where $\hat\omega^2$ is an estimator for $\omega^2 = \sum_{k=-\infty}^{\infty} E\, v_t v_{t-k}$. The autoregressive (AR) estimator is used for $\hat\omega^2$ (Elliott, Rothenberg, and Stock 1996, equations 13 and 14),

$$\hat\omega^2 = \frac{\hat\sigma_\eta^2}{\left(1 - \sum_{i=1}^{p} \hat a_i\right)^2}$$

where $\hat\sigma_\eta^2$ and $\hat a_i$ are OLS estimates from the regression

$$\Delta y_t = a_0 y_{t-1} + \sum_{i=1}^{p} a_i \Delta y_{t-i} + a_{p+1} + \eta_t$$

where $p$ is selected according to the Schwarz Bayesian information criterion. The test rejects the null when $P_T^{GLS}$ is small. The asymptotic power function for the point optimal test that is constructed with $\bar c$ under local alternatives with $c$ is denoted by $\pi(c, \bar c)$. Then the power envelope is $\pi(c) = \pi(c, c)$ because the test formed with $\bar c$ is the most powerful against the alternative $c = \bar c$. In other words, the asymptotic power function $\pi(c, \bar c)$ is always below the power envelope $\pi(c)$ except at the one point $c = \bar c$, where they are tangent. Elliott, Rothenberg, and Stock (1996) show that choosing some specific values for $\bar c$ can cause the asymptotic power function $\pi(c, \bar c)$ of the point optimal test to be very close to the power envelope. The optimal $\bar c$ is $-7$ when $z_t = 1$, and $-13.5$ when $z_t = (1, t)'$. This choice of $\bar c$ corresponds to the tangent point where $\pi = 0.5$. The same choice of $\bar c$ is used for the DF-GLS test.
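As a rough illustration of the construction above, the following Python sketch quasi-differences a series and evaluates the point optimal statistic for the constant-only case z_t = 1. The function names are hypothetical and are not part of PROC AUTOREG, and the long-run variance estimate `omega2` is assumed to be supplied by the caller instead of being computed from the autoregressive estimator.

```python
def quasi_difference(x, alpha):
    """Return (x_1, x_2 - alpha*x_1, ..., x_T - alpha*x_{T-1})."""
    return [x[0]] + [x[t] - alpha * x[t - 1] for t in range(1, len(x))]


def ssr_constant_only(y, alpha):
    """S(alpha): sum of squared residuals from regressing y_alpha on z_alpha, z_t = 1."""
    y_a = quasi_difference(y, alpha)
    z_a = quasi_difference([1.0] * len(y), alpha)
    # Single-regressor least squares (no extra intercept, as in the text).
    beta = sum(yi * zi for yi, zi in zip(y_a, z_a)) / sum(zi * zi for zi in z_a)
    return sum((yi - beta * zi) ** 2 for yi, zi in zip(y_a, z_a))


def point_optimal_stat(y, omega2, c_bar=-7.0):
    """P_T = (S(alpha_bar) - alpha_bar * S(1)) / omega2 with alpha_bar = 1 + c_bar / T.

    omega2 is an externally supplied long-run variance estimate (an assumption
    of this sketch); c_bar = -7 is the value the text gives for z_t = 1.
    """
    T = len(y)
    alpha_bar = 1.0 + c_bar / T
    s_bar = ssr_constant_only(y, alpha_bar)
    s_one = ssr_constant_only(y, 1.0)
    return (s_bar - alpha_bar * s_one) / omega2
```

For the trend case z_t = (1, t), the same idea applies with a two-regressor least squares fit and c_bar = -13.5; that extension is omitted here to keep the sketch short.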

Elliott, Rothenberg, and Stock (1996) also propose the DF-GLS test, given by the t statistic for testing $\psi_0 = 0$ in the regression

$$\Delta y_t^d = \psi_0 y_{t-1}^d + \sum_{j=1}^{p} \psi_j \Delta y_{t-j}^d + \epsilon_{tp}$$

where $y_t^d$ is obtained in a first-step detrending

$$y_t^d = y_t - \hat\beta'_{\bar\alpha} z_t$$

and $\hat\beta_{\bar\alpha}$ is the least squares regression coefficient of $y_\alpha$ on $z_\alpha$, evaluated at $\alpha = \bar\alpha$. Regarding the lag length selection, Elliott, Rothenberg, and Stock (1996) favor the Schwarz Bayesian information criterion. The optimal selection of the lag length $p$ and the estimation of $\omega^2$ are further discussed in Ng and Perron (2001). The lag length is selected from the interval $[0, p_{max}]$ for some fixed $p_{max}$ by using the modified Akaike's information criterion,

$$\mathrm{MAIC}(p) = \log(\hat\sigma_p^2) + \frac{2(\tau_T(p) + p)}{T - p_{max}}$$

where $\tau_T(p) = (\hat\sigma_p^2)^{-1} \hat\psi_0^2 \sum_{t=p_{max}+1}^{T-1} (y_t^d)^2$ and $\hat\sigma_p^2 = (T - p_{max} - 1)^{-1} \sum_{t=p_{max}+1}^{T-1} \hat\epsilon_{tp}^2$. For fixed lag length $p$, an estimate of $\omega^2$ is given by

$$\hat\omega^2 = \frac{(T - 1 - p)^{-1} \sum_{t=p+2}^{T} \hat\epsilon_{tp}^2}{\left(1 - \sum_{j=1}^{p} \hat\psi_j\right)^2}$$

DF-GLS is indeed a superior unit root test, according to Stock (1994), Schwert (1989), and Elliott, Rothenberg, and Stock (1996). In terms of the size of the test, DF-GLS is almost as good as the ADF t test $DF_\tau$ and better than the PP $\hat Z_\rho$ and $\hat Z_\tau$ tests. In addition, the power of the DF-GLS test is greater than that of both the ADF t test and the ADF $\rho$-test.

Ng and Perron (2001) also apply GLS detrending to obtain the following M-tests:

$$
\begin{aligned}
MZ_\alpha &= \left( (T-1)^{-1} (y_T^d)^2 - \hat\omega^2 \right) \left( 2 (T-1)^{-2} \sum_{t=1}^{T-1} (y_t^d)^2 \right)^{-1} \\
MSB &= \left( \frac{\sum_{t=1}^{T-1} (y_t^d)^2}{(T-1)^2 \hat\omega^2} \right)^{1/2} \\
MZ_t &= MZ_\alpha \times MSB
\end{aligned}
$$

The first one, $MZ_\alpha$, is a modified version of the Phillips-Perron $Z_\rho$ test and is written here as $MZ_\rho$,

$$MZ_\rho = Z_\rho + \frac{T}{2} (\hat\alpha - 1)^2$$

where the detrended data $\{y_t^d\}$ is used. The second is a modified Bhargava (1986) $R_1$ test statistic. The third can be perceived as a modified Phillips-Perron $Z_\tau$ statistic because of the relationship $Z_\tau = MSB \times Z_\rho$.
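The three M-statistics are simple functions of the detrended series and the long-run variance estimate. The sketch below is a minimal illustration under the assumption that both inputs have already been computed; the function name `m_tests` and its arguments are hypothetical, not part of PROC AUTOREG.

```python
import math


def m_tests(y_d, omega2):
    """Return (MZ_alpha, MSB, MZ_t) from a GLS-detrended series.

    y_d    : list holding y^d_1, ..., y^d_T (assumed already detrended)
    omega2 : long-run variance estimate (assumed given)
    """
    T = len(y_d)
    # Sum of squares over t = 1, ..., T-1, as in the formulas above.
    ssq = sum(v * v for v in y_d[:-1])
    mz_alpha = (y_d[-1] ** 2 / (T - 1) - omega2) / (2.0 * ssq / (T - 1) ** 2)
    msb = math.sqrt(ssq / ((T - 1) ** 2 * omega2))
    mz_t = mz_alpha * msb
    return mz_alpha, msb, mz_t
```

For example, `m_tests([1.0, 2.0, 3.0], 1.0)` has T = 3 and a sum of squares of 5, which gives MZ_alpha = 3.5 / 2.5 = 1.4 and MSB = sqrt(1.25).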

The modified point optimal tests that use the GLS detrended data are

$$
MP_T^{GLS} =
\begin{cases}
\dfrac{\bar c^2 (T-1)^{-2} \sum_{t=1}^{T-1} (y_t^d)^2 - \bar c (T-1)^{-1} (y_T^d)^2}{\hat\omega^2} & \text{for } z_t = 1 \\[2ex]
\dfrac{\bar c^2 (T-1)^{-2} \sum_{t=1}^{T-1} (y_t^d)^2 + (1 - \bar c) (T-1)^{-1} (y_T^d)^2}{\hat\omega^2} & \text{for } z_t = (1, t)
\end{cases}
$$
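The two cases of the modified point optimal statistic differ only in the coefficient on the end-point term. A minimal sketch, again assuming the detrended series and the long-run variance estimate are supplied by the caller (the function name and the `trend` flag are hypothetical):

```python
def modified_point_optimal(y_d, omega2, c_bar, trend=False):
    """MP_T^GLS for z_t = 1 (trend=False) or z_t = (1, t) (trend=True).

    y_d    : list holding the detrended values y^d_1, ..., y^d_T (assumed given)
    omega2 : long-run variance estimate (assumed given)
    c_bar  : local-to-unity constant, e.g. -7 or -13.5 as stated in the text
    """
    T = len(y_d)
    ssq = sum(v * v for v in y_d[:-1])      # sum over t = 1, ..., T-1
    first = c_bar ** 2 * ssq / (T - 1) ** 2
    last = y_d[-1] ** 2 / (T - 1)
    if trend:
        numerator = first + (1.0 - c_bar) * last
    else:
        numerator = first - c_bar * last
    return numerator / omega2
```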

The DF-GLS test and the $MZ_t$ test have the same limiting distribution:

$$
\begin{aligned}
\text{DF-GLS} \approx MZ_t &\Rightarrow 0.5\, \frac{J_c(1)^2 - 1}{\left( \int_0^1 J_c(r)^2 \, dr \right)^{1/2}} && \text{for } z_t = 1 \\
\text{DF-GLS} \approx MZ_t &\Rightarrow 0.5\, \frac{V_{c,\bar c}(1)^2 - 1}{\left( \int_0^1 V_{c,\bar c}(r)^2 \, dr \right)^{1/2}} && \text{for } z_t = (1, t)
\end{aligned}
$$

The point optimal test and the modified point optimal test have the same limiting distribution,

$$
\begin{aligned}
P_T^{GLS} \approx MP_T^{GLS} &\Rightarrow \bar c^2 \int_0^1 J_c(r)^2 \, dr - \bar c\, J_c(1)^2 && \text{for } z_t = 1 \\
P_T^{GLS} \approx MP_T^{GLS} &\Rightarrow \bar c^2 \int_0^1 V_{c,\bar c}(r)^2 \, dr + (1 - \bar c)\, V_{c,\bar c}(1)^2 && \text{for } z_t = (1, t)
\end{aligned}
$$

where $W(r)$ is a standard Brownian motion and $J_c(r)$ is an Ornstein-Uhlenbeck process defined by $dJ_c(r) = c J_c(r)\, dr + dW(r)$ with $J_c(0) = 0$, $V_{c,\bar c}(r) = J_c(r) - r \left[ \lambda J_c(1) + 3 (1 - \lambda) \int_0^1 s J_c(s)\, ds \right]$, and $\lambda = (1 - \bar c) / (1 - \bar c + \bar c^2 / 3)$.

Overall, the M-tests have the smallest size distortion, with the ADF t test having the next smallest. The ADF $\rho$-test, $\hat Z_\rho$, and $\hat Z_\tau$ have the largest size distortion. In addition, the power of the DF-GLS and M-tests is greater than that of the ADF t test and $\rho$-test. The ADF $\hat Z_\rho$ has more severe size distortion than the ADF $\hat Z_\tau$, but it has more power for a fixed lag length.

Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) Unit Root Test and Shin Cointegration Test

There are fewer tests available for the null hypothesis of trend stationarity I(0). The main reason is the difficulty of theoretical development. The KPSS test was introduced in Kwiatkowski et al. (1992) to test the null hypothesis that an observable series is stationary around a deterministic trend. For consistency, the notation used here differs from the notation in the original paper. The setup of the problem is as follows: it is assumed that the series is expressed as the sum of the deterministic trend, random walk $r_t$, and stationary error $u_t$; that is,

$$
\begin{aligned}
y_t &= \mu + \delta t + r_t + u_t \\
r_t &= r_{t-1} + e_t
\end{aligned}
$$

where $e_t \sim \mathrm{iid}(0, \sigma_e^2)$ and $\mu$ is an intercept (in the original paper, the authors use $r_0$ instead of $\mu$; here it is assumed that $r_0 = 0$). The null hypothesis of trend stationarity is specified by $H_0\colon \sigma_e^2 = 0$, while the null of level stationarity is the same as above with the model restriction $\delta = 0$. Under the alternative that $\sigma_e^2 \neq 0$, there is a random walk component in the observed series $y_t$.

Under the stronger assumption that $u_t$ and $e_t$ are iid normal, a one-sided LM test of the null that there is no random walk ($e_t = 0, \forall t$) can be constructed as follows:

$$
\begin{aligned}
\widehat{LM} &= \frac{1}{T^2} \sum_{t=1}^{T} \frac{S_t^2}{s^2(l)} \\
s^2(l) &= \frac{1}{T} \sum_{t=1}^{T} \hat u_t^2 + \frac{2}{T} \sum_{s=1}^{l} w(s, l) \sum_{t=s+1}^{T} \hat u_t \hat u_{t-s} \\
S_t &= \sum_{\tau=1}^{t} \hat u_\tau
\end{aligned}
$$

Under the null hypothesis, $\hat u_t$ can be estimated by ordinary least squares regression of $y_t$ on an intercept and the time trend. Following the original work of Kwiatkowski et al. (1992), under the null ($\sigma_e^2 = 0$), the $\widehat{LM}$ statistic converges asymptotically to three different distributions depending on whether the model is trend-stationary, level-stationary ($\delta = 0$), or zero-mean stationary ($\delta = 0$, $\mu = 0$). The trend-stationary model is denoted by subscript $\tau$ and the level-stationary model is denoted by subscript $\mu$. The case when there is no trend and zero intercept is denoted by subscript 0. The last case, although rarely used in practice, is considered in Hobijn, Franses, and Ooms (2004),

$$
\begin{aligned}
y_t = u_t &: \quad \widehat{LM}_0 \overset{D}{\longrightarrow} \int_0^1 B^2(r)\, dr \\
y_t = \mu + u_t &: \quad \widehat{LM}_\mu \overset{D}{\longrightarrow} \int_0^1 V^2(r)\, dr \\
y_t = \mu + \delta t + u_t &: \quad \widehat{LM}_\tau \overset{D}{\longrightarrow} \int_0^1 V_2^2(r)\, dr
\end{aligned}
$$

with

$$
\begin{aligned}
V(r) &= B(r) - r B(1) \\
V_2(r) &= B(r) + (2r - 3r^2) B(1) + (-6r + 6r^2) \int_0^1 B(s)\, ds
\end{aligned}
$$

where $B(r)$ is a Brownian motion (Wiener process) and $\overset{D}{\longrightarrow}$ denotes convergence in distribution. $V(r)$ is a standard Brownian bridge, and $V_2(r)$ is a second-level Brownian bridge.
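To make the statistic concrete, here is a minimal Python sketch of the level-stationary case (delta = 0), where the residuals are simply deviations from the sample mean. It assumes the Bartlett weight 1 - s/(l + 1) for the long-run variance; the function name and argument names are hypothetical, and the sketch is not the procedure's implementation.

```python
def kpss_level_stat(y, lags):
    """LM statistic for the level-stationary model y_t = mu + u_t.

    y    : list of observations
    lags : truncation lag l for the long-run variance s^2(l)
    """
    T = len(y)
    mean = sum(y) / T
    u = [v - mean for v in y]          # OLS residuals from an intercept-only fit

    # Sum of squared partial sums S_t = u_1 + ... + u_t.
    partial = 0.0
    sum_s2 = 0.0
    for v in u:
        partial += v
        sum_s2 += partial * partial

    # Long-run variance s^2(l) with Bartlett weights (an assumption of this sketch).
    lrv = sum(v * v for v in u) / T
    for s in range(1, lags + 1):
        weight = 1.0 - s / (lags + 1.0)
        cross = sum(u[t] * u[t - s] for t in range(s, T))
        lrv += 2.0 * weight * cross / T

    return sum_s2 / (T * T * lrv)
```

For the toy series `[1, -1, 1, -1]` the partial sums are 1, 0, 1, 0, so with `lags=0` the statistic is 2 / (16 * 1) = 0.125. The trend-stationary case would replace the demeaning step with a regression on an intercept and a time trend.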

Using the notation of Kwiatkowski et al. (1992), the $\widehat{LM}$ statistic is named $\hat\eta$. This test depends on the computational method used to compute the long-run variance $s^2(l)$; that is, the window width $l$ and the kernel type $w(\cdot, \cdot)$. You can specify the kernel used in the test by using the KERNEL option:

  • Newey-West/Bartlett (KERNEL=NW | BART) (this is the default)

    $$w(s, l) = 1 - \frac{s}{l + 1}$$
  • quadratic spectral (KERNEL=QS)

    $$w(s, l) = \tilde w\left( \frac{s}{l} \right) = \tilde w(x) = \frac{25}{12 \pi^2 x^2} \left( \frac{\sin(6 \pi x / 5)}{6 \pi x / 5} - \cos\left( \frac{6}{5} \pi x \right) \right)$$
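The two kernel weight functions can be written directly from the formulas above. This is a small sketch with hypothetical function names; the value returned at x = 0 for the quadratic spectral kernel is the limit of the formula and is handled as a special case.

```python
import math


def bartlett_weight(s, l):
    """Newey-West/Bartlett weight w(s, l) = 1 - s / (l + 1)."""
    return 1.0 - s / (l + 1.0)


def qs_weight(s, l):
    """Quadratic spectral weight evaluated at x = s / l (requires l > 0)."""
    x = s / float(l)
    if x == 0.0:
        return 1.0                      # limiting value of the kernel at x = 0
    a = 6.0 * math.pi * x / 5.0
    return 25.0 / (12.0 * math.pi ** 2 * x ** 2) * (math.sin(a) / a - math.cos(a))
```

As a check on the formula, at x = 5/6 the argument 6*pi*x/5 equals pi, so the sine term vanishes and the weight reduces to 3 / pi^2.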

You can specify the number of lags, l, in three different ways:

  • Schwert (SCHW = c) (default for NW, c=12)

    $$l = \max\left\{ 1, \left\lfloor c \left( \frac{T}{100} \right)^{1/4} \right\rfloor \right\}$$
  • manual (LAG = l)

  • automatic selection (AUTO) (default for QS), from Hobijn, Franses, and Ooms (2004). The number of lags, l, is calculated as in the following table:

    KERNEL=NW:
        $l = \min(T, \lfloor \hat\gamma T^{1/3} \rfloor)$
        $\hat\gamma = 1.1447 \left\{ \left( \hat s^{(1)} / \hat s^{(0)} \right)^2 \right\}^{1/3}$
        $n = \lfloor T^{2/9} \rfloor$

    KERNEL=QS:
        $l = \min(T, \lfloor \hat\gamma T^{1/5} \rfloor)$
        $\hat\gamma = 1.3221 \left\{ \left( \hat s^{(2)} / \hat s^{(0)} \right)^2 \right\}^{1/5}$
        $n = \lfloor T^{2/25} \rfloor$

    For both kernels, $\hat s^{(j)} = \delta_{0,j} \hat\gamma_0 + 2 \sum_{i=1}^{n} i^j \hat\gamma_i$, where $T$ is the number of observations, $\delta_{0,j} = 1$ if $j = 0$ and 0 otherwise, and $\hat\gamma_i = \frac{1}{T} \sum_{t=1}^{T-i} u_t u_{t+i}$.
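The Schwert rule and the automatic selection rule can be sketched in a few lines of Python. The function names are hypothetical, `u` is assumed to hold the residuals from the KPSS regression, and the sketch does not guard against a zero denominator in the ratio of the two s-hat terms.

```python
import math


def schwert_lag(T, c=12.0):
    """l = max(1, floor(c * (T / 100)^(1/4)))."""
    return max(1, math.floor(c * (T / 100.0) ** 0.25))


def auto_lag(u, kernel="NW"):
    """Automatic lag selection following the formulas above (NW or QS kernel)."""
    T = len(u)
    if kernel == "NW":
        n, j, const, power = math.floor(T ** (2.0 / 9.0)), 1, 1.1447, 1.0 / 3.0
    else:
        n, j, const, power = math.floor(T ** (2.0 / 25.0)), 2, 1.3221, 1.0 / 5.0

    # Sample autocovariances gamma_i = (1/T) * sum_{t=1}^{T-i} u_t * u_{t+i}.
    gamma = [sum(u[t] * u[t + i] for t in range(T - i)) / T for i in range(n + 1)]

    # s^(0) includes gamma_0; s^(j) for j >= 1 does not (delta_{0,j} = 0).
    s0 = gamma[0] + 2.0 * sum(gamma[1:])
    sj = 2.0 * sum(i ** j * gamma[i] for i in range(1, n + 1))

    g = const * ((sj / s0) ** 2) ** power
    return min(T, math.floor(g * T ** power))
```

For instance, `schwert_lag(100)` returns 12 with the default c, which matches the default constant stated for the Newey-West kernel.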

Simulation evidence shows that the KPSS test has size distortion in finite samples. For an example, see Caner and Kilian (2001). The power is reduced when the sample size is large; this can be derived theoretically (see Breitung 1995). Another problem of the KPSS test is that the power depends on the truncation lag used in the Newey-West estimator of the long-run variance $s^2(l)$.

Shin (1994) extends the KPSS test to a cointegration test by incorporating regressors. The cointegrating regression becomes

$$
\begin{aligned}
y_t &= \mu + \delta t + X_t' \beta + r_t + u_t \\
r_t &= r_{t-1} + e_t
\end{aligned}
$$

where $y_t$ and $X_t$ are scalar and $m$-vector $I(1)$ variables. There are still three cases of cointegrating regressions: without intercept and trend, with intercept only, and with intercept and trend. The null hypothesis of the cointegration test is the same as that for the KPSS test, $H_0\colon \sigma_e^2 = 0$. The test statistics for cointegration in the three cases of cointegrating regressions are exactly the same as those in the KPSS test, so they are not repeated here. Under the null hypothesis, the statistics converge asymptotically to three different distributions,

$$
\begin{aligned}
y_t = X_t' \beta + u_t &: \quad \widehat{LM}_0 \overset{D}{\longrightarrow} \int_0^1 Q_1^2(r)\, dr \\
y_t = \mu + X_t' \beta + u_t &: \quad \widehat{LM}_\mu \overset{D}{\longrightarrow} \int_0^1 Q_2^2(r)\, dr \\
y_t = \mu + \delta t + X_t' \beta + u_t &: \quad \widehat{LM}_\tau \overset{D}{\longrightarrow} \int_0^1 Q_3^2(r)\, dr
\end{aligned}
$$

with

$$
\begin{aligned}
Q_1(r) &= B(r) - \left( \int_0^r \mathbf{B}_m(x)\, dx \right) \left( \int_0^1 \mathbf{B}_m(x) \mathbf{B}_m'(x)\, dx \right)^{-1} \left( \int_0^1 \mathbf{B}_m(x)\, dB(x) \right) \\
Q_2(r) &= V(r) - \left( \int_0^r \bar{\mathbf{B}}_m(x)\, dx \right) \left( \int_0^1 \bar{\mathbf{B}}_m(x) \bar{\mathbf{B}}_m'(x)\, dx \right)^{-1} \left( \int_0^1 \bar{\mathbf{B}}_m(x)\, dB(x) \right) \\
Q_3(r) &= V_2(r) - \left( \int_0^r \mathbf{B}_m^{*}(x)\, dx \right) \left( \int_0^1 \mathbf{B}_m^{*}(x) \mathbf{B}_m^{*\prime}(x)\, dx \right)^{-1} \left( \int_0^1 \mathbf{B}_m^{*}(x)\, dB(x) \right)
\end{aligned}
$$

where $B(\cdot)$ and $\mathbf{B}_m(\cdot)$ are independent scalar and $m$-vector standard Brownian motions, and $\overset{D}{\longrightarrow}$ denotes convergence in distribution. $V(r)$ is a standard Brownian bridge, $V_2(r)$ is a second-level Brownian bridge, $\bar{\mathbf{B}}_m(r) = \mathbf{B}_m(r) - \int_0^1 \mathbf{B}_m(x)\, dx$ is an $m$-vector standard de-meaned Brownian motion, and $\mathbf{B}_m^{*}(r) = \mathbf{B}_m(r) + (6r - 4) \int_0^1 \mathbf{B}_m(x)\, dx + (-12r + 6) \int_0^1 x \mathbf{B}_m(x)\, dx$ is an $m$-vector standard de-meaned and detrended Brownian motion.

The p-values that are reported for the KPSS test and Shin test are calculated via a Monte Carlo simulation of the limiting distributions, using a sample size of 2,000 and 1,000,000 replications.
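The idea behind such simulated p-values can be sketched for the level-stationary KPSS limit, the integral of a squared Brownian bridge. The Python code below is an illustration only: it uses far fewer steps and replications than the values quoted above, the function names are hypothetical, and it is not the simulation design used by the procedure.

```python
import random


def simulate_kpss_level_limit(n_steps, n_reps, seed=0):
    """Draw from a discretized version of the integral of V(r)^2 over [0, 1].

    V(r) = B(r) - r * B(1) is built from a Gaussian random walk with
    n_steps increments of variance 1 / n_steps.
    """
    rng = random.Random(seed)
    scale = n_steps ** -0.5
    draws = []
    for _ in range(n_reps):
        level = 0.0
        path = []
        for _ in range(n_steps):
            level += rng.gauss(0.0, scale)
            path.append(level)
        end = path[-1]
        total = 0.0
        for i, b in enumerate(path, start=1):
            bridge = b - (i / n_steps) * end
            total += bridge * bridge
        draws.append(total / n_steps)   # Riemann sum for the integral
    return draws


def mc_p_value(stat, draws):
    """Upper-tail p-value: share of simulated draws at or above the statistic."""
    return sum(1 for d in draws if d >= stat) / len(draws)
```

A typical use is `draws = simulate_kpss_level_limit(200, 2000)` followed by `mc_p_value(observed_stat, draws)`; increasing both arguments trades run time for accuracy. The simulated mean should be close to 1/6, the expected value of the integrated squared Brownian bridge.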

Testing for Statistical Independence

Independence tests are widely used in model selection, residual analysis, and model diagnostics because models are usually based on the assumption of independently distributed errors. If a given time series (for example, a series of residuals) is independent, then no deterministic model is necessary for this completely random process; otherwise, there must exist some relationship in the series to be addressed. In the following section, four independence tests are introduced: the BDS test, the runs test, the turning point test, and the rank version of the von Neumann ratio test.

BDS Test

Brock, Dechert, and Scheinkman (1987) propose a test (BDS test) of independence based on the correlation dimension. Brock et al. (1996) show that the first-order asymptotic distribution of the test statistic is independent of the estimation error provided that the parameters of the model under test can be estimated $\sqrt{n}$-consistently. Hence, the BDS test can be used as a model selection tool and as a specification test.

Given the sample size T, the embedding dimension m, and the value of the radius r, the BDS statistic is

\[ S_{\mathrm{BDS}}(T, m, r) = \sqrt{T - m + 1}\; \frac{c_{m,m,T}(r) - c_{1,m,T}^{m}(r)}{\sigma_{m,T}(r)} \]

where

\[
\begin{aligned}
c_{m,n,N}(r) &= \frac{2}{(N - n + 1)(N - n)} \sum_{s=n}^{N} \sum_{t=s+1}^{N} \prod_{j=0}^{m-1} I_r(z_{s-j}, z_{t-j}) \\
I_r(z_s, z_t) &= \begin{cases} 1 & \text{if } |z_s - z_t| < r \\ 0 & \text{otherwise} \end{cases} \\
\sigma_{m,T}^{2}(r) &= 4\left( k^{m} + 2 \sum_{j=1}^{m-1} k^{m-j} c^{2j} + (m - 1)^2 c^{2m} - m^2 k c^{2m-2} \right) \\
c &= c_{1,1,T}(r) \\
k &= k_T(r) = \frac{6}{T(T - 1)(T - 2)} \sum_{t=1}^{T} \sum_{s=t+1}^{T} \sum_{l=s+1}^{T} h_r(z_t, z_s, z_l) \\
h_r(z_t, z_s, z_l) &= \frac{1}{3}\left( I_r(z_t, z_s) I_r(z_s, z_l) + I_r(z_t, z_l) I_r(z_l, z_s) + I_r(z_s, z_t) I_r(z_t, z_l) \right)
\end{aligned}
\]

Under the null hypothesis of independence, the statistic is approximately standard normal if the sample size is large enough. For small samples, the distribution can be approximated through simulation. Kanzler (1999) gives a comprehensive discussion of the implementation and empirical performance of the BDS test.
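The BDS formulas can be traced with a small brute-force sketch. It is written only to mirror the definitions above (1-based indices s and t, triple sum for k), not to reflect how the AUTOREG procedure computes the statistic; the function names are hypothetical, and the triple loop makes it practical only for short series.

```python
from math import sqrt

def indicator(a, b, r):
    """I_r(a, b): 1 if |a - b| < r, else 0."""
    return 1 if abs(a - b) < r else 0

def corr_integral(z, m, n, r):
    """c_{m,n,N}(r); s and t are 1-based as in the text, and n >= m is assumed."""
    N = len(z)
    count = 0
    for s in range(n, N + 1):
        for t in range(s + 1, N + 1):
            if all(indicator(z[s - 1 - j], z[t - 1 - j], r) for j in range(m)):
                count += 1
    return 2.0 * count / ((N - n + 1) * (N - n))

def k_stat(z, r):
    """k_T(r): scaled sum of h_r over all triples t < s < l."""
    T = len(z)
    total = 0.0
    for t in range(T):
        for s in range(t + 1, T):
            for l in range(s + 1, T):
                its = indicator(z[t], z[s], r)
                isl = indicator(z[s], z[l], r)
                itl = indicator(z[t], z[l], r)
                total += (its * isl + itl * isl + its * itl) / 3.0
    return 6.0 * total / (T * (T - 1) * (T - 2))

def bds_statistic(z, m, r):
    """S_BDS(T, m, r) assembled from the pieces defined above."""
    T = len(z)
    c = corr_integral(z, 1, 1, r)
    k = k_stat(z, r)
    var = 4.0 * (k ** m
                 + 2.0 * sum(k ** (m - j) * c ** (2 * j) for j in range(1, m))
                 + (m - 1) ** 2 * c ** (2 * m)
                 - m ** 2 * k * c ** (2 * m - 2))
    num = corr_integral(z, m, m, r) - corr_integral(z, 1, m, r) ** m
    return sqrt(T - m + 1) * num / sqrt(var)

# A strictly alternating (clearly dependent) toy series.
z = [0, 1, 0, 1, 0, 1]
s_bds = bds_statistic(z, m=2, r=0.5)
```

For this toy series the pieces can be checked by hand: c = 0.4, k = 0.1, and the statistic is large and positive, which is what you would expect for a deterministic pattern.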

Runs Test and Turning Point Test

The runs test and turning point test are two widely used tests for independence (Cromwell, Labys, and Terraza 1994).

The runs test needs several steps. First, convert the original time series into a sequence of signs, $\{+ + - \cdots + -\}$; that is, map $\{z_t\}$ into $\{\mathrm{sign}(z_t - z_M)\}$, where $z_M$ is the sample mean of $z_t$ and $\mathrm{sign}(x)$ is "plus" if $x$ is nonnegative and "minus" if $x$ is negative. Second, count the number of runs, $R$, in the sequence. A run of a sequence is a maximal non-empty segment of the sequence that consists of adjacent equal elements. For example, the following sequence contains $R = 8$ runs:

\[ \underbrace{+++}_{1}\ \underbrace{---}_{1}\ \underbrace{++}_{1}\ \underbrace{--}_{1}\ \underbrace{+}_{1}\ \underbrace{-}_{1}\ \underbrace{+++++}_{1}\ \underbrace{--}_{1} \]

Third, count the number of pluses and minuses in the sequence and denote them as $N_+$ and $N_-$, respectively. In the preceding example sequence, $N_+ = 11$ and $N_- = 8$. Note that the sample size is $T = N_+ + N_-$. Finally, compute the statistic of the runs test,

\[ S_{\mathrm{runs}} = \frac{R - \mu}{\sigma} \]

where

\[ \mu = \frac{2 N_+ N_-}{T} + 1 \]
\[ \sigma^2 = \frac{(\mu - 1)(\mu - 2)}{T - 1} \]

The statistic of the turning point test is defined as

\[ S_{\mathrm{TP}} = \frac{\sum_{t=2}^{T-1} TP_t - 2(T - 2)/3}{\sqrt{(16T - 29)/90}} \]

where the indicator function of the turning point, $TP_t$, is 1 if $z_t > z_{t \pm 1}$ or $z_t < z_{t \pm 1}$ (that is, the previous and next values are both less than or both greater than the current value); otherwise, it is 0.

The statistics of both the runs test and the turning point test have the standard normal distribution under the null hypothesis of independence.
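As a worked illustration, the following sketch evaluates both statistics directly from the formulas above. The function names are hypothetical, and the sign string is the example sequence from the runs test description, so R, N_+, and N_- can be compared with the values quoted there.

```python
from math import sqrt

def signs_about_mean(z):
    """Map a series to '+'/'-' signs of (z_t - sample mean); zero counts as '+'."""
    mean = sum(z) / len(z)
    return ["+" if x - mean >= 0 else "-" for x in z]

def runs_statistic(signs):
    """Return (R, N_plus, N_minus, S_runs) for a sequence of '+'/'-' signs."""
    T = len(signs)
    n_plus = sum(1 for s in signs if s == "+")
    n_minus = T - n_plus
    # The first element starts a run; every sign change starts another.
    R = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mu = 2.0 * n_plus * n_minus / T + 1.0
    sigma2 = (mu - 1.0) * (mu - 2.0) / (T - 1.0)
    return R, n_plus, n_minus, (R - mu) / sqrt(sigma2)

def turning_point_statistic(z):
    """S_TP: standardized count of interior local maxima and minima."""
    T = len(z)
    tp = sum(1 for t in range(1, T - 1)
             if (z[t] > z[t - 1] and z[t] > z[t + 1])
             or (z[t] < z[t - 1] and z[t] < z[t + 1]))
    return (tp - 2.0 * (T - 2) / 3.0) / sqrt((16.0 * T - 29.0) / 90.0)

# Example sign sequence from the text: 8 runs, 11 pluses, 8 minuses.
R, n_plus, n_minus, s_runs = runs_statistic("+++---++--+-+++++--")

# A made-up zigzag series in which every interior point is a turning point.
s_tp = turning_point_statistic([1, 3, 2, 4, 3, 5, 4])
```

Both results are then compared with standard normal critical values, as stated above.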

Rank Version of the von Neumann Ratio Test

Because the runs test completely ignores the magnitudes of the observations, Bartels (1982) proposes a rank version of the von Neumann ratio test for independence,

\[ S_{\mathrm{RVN}} = \frac{\sqrt{T}}{2} \left( \frac{\sum_{t=1}^{T-1} (R_{t+1} - R_t)^2}{T(T^2 - 1)/12} - 2 \right) \]

where $R_t$ is the rank of the $t$th observation in the sequence of $T$ observations. For large samples, the statistic follows the standard normal distribution under the null hypothesis of independence. For small samples of size between 11 and 100, simulated critical values are more precise. For samples of size less than or equal to 10, the exact CDF of the statistic is available. Hence, the VNRRANK=(PVALUE=SIM) option is recommended for small samples whose size is no more than 100, although it might take longer to obtain the p-value than if you use the VNRRANK=(PVALUE=DIST) option.
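The statistic itself is straightforward to evaluate. The following is a minimal sketch of the formula above, assuming the observations contain no ties (tie handling is not described in this section); the helper names are hypothetical, and the sketch returns only the statistic, not the simulated or exact p-values that the VNRRANK= option provides.

```python
from math import sqrt

def ranks(z):
    """Ranks 1..T of the observations, assuming there are no ties."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    r = [0] * len(z)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def rank_von_neumann(z):
    """S_RVN computed from successive rank differences."""
    T = len(z)
    R = ranks(z)
    num = sum((R[t + 1] - R[t]) ** 2 for t in range(T - 1))
    den = T * (T ** 2 - 1) / 12.0
    return sqrt(T) / 2.0 * (num / den - 2.0)

# A monotone series has the smallest possible rank differences,
# so the statistic is strongly negative.
s_rvn = rank_von_neumann(list(range(10)))
```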

Testing for Normality

Based on skewness and kurtosis, Jarque and Bera (1980) calculated the test statistic

\[ T_N = \left[ \frac{N}{6} b_1^2 + \frac{N}{24} (b_2 - 3)^2 \right] \]

where

\[ b_1 = \frac{\sqrt{N} \sum_{t=1}^{N} \hat{u}_t^3}{\left( \sum_{t=1}^{N} \hat{u}_t^2 \right)^{3/2}} \]
\[ b_2 = \frac{N \sum_{t=1}^{N} \hat{u}_t^4}{\left( \sum_{t=1}^{N} \hat{u}_t^2 \right)^{2}} \]

The $\chi^2(2)$ distribution gives an approximation to the distribution of the normality test statistic $T_N$.

When the GARCH model is estimated, the normality test is obtained using the standardized residuals $\hat{u}_t = \hat{\epsilon}_t / \sqrt{h_t}$. The normality test can be used to detect misspecification of the family of ARCH models.
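The following sketch evaluates the normality statistic exactly as the formulas above are written, that is, from raw (not re-centered) residual moments. The function names and the sample residuals are illustrative assumptions.

```python
from math import sqrt

def jarque_bera(u):
    """T_N = N/6 * b1^2 + N/24 * (b2 - 3)^2 from residuals u."""
    N = len(u)
    s2 = sum(x ** 2 for x in u)
    s3 = sum(x ** 3 for x in u)
    s4 = sum(x ** 4 for x in u)
    b1 = sqrt(N) * s3 / s2 ** 1.5   # skewness term
    b2 = N * s4 / s2 ** 2           # kurtosis term
    return N / 6.0 * b1 ** 2 + N / 24.0 * (b2 - 3.0) ** 2

def standardize(eps, h):
    """GARCH case: u_t = eps_t / sqrt(h_t), given conditional variances h_t."""
    return [e / sqrt(v) for e, v in zip(eps, h)]

# Symmetric toy residuals: b1 = 0, b2 = 1.7.
t_n = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

The result is compared with a chi-square distribution with 2 degrees of freedom.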

Testing for Linear Dependence

Generalized Durbin-Watson Tests

Consider the linear regression model

\[ \mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\nu} \]

where $\mathbf{X}$ is an $N \times k$ data matrix, $\boldsymbol{\beta}$ is a $k \times 1$ coefficient vector, and $\boldsymbol{\nu}$ is an $N \times 1$ disturbance vector. The error term $\boldsymbol{\nu}$ is assumed to be generated by the $j$th-order autoregressive process $\nu_t = \epsilon_t - \phi_j \nu_{t-j}$, where $|\phi_j| < 1$ and $\epsilon_t$ is a sequence of independent normal error terms with mean 0 and variance $\sigma^2$. Usually, the Durbin-Watson statistic is used to test the null hypothesis $H_0\colon \phi_1 = 0$ against $H_1\colon -\phi_1 > 0$. Vinod (1973) generalized the Durbin-Watson statistic,

\[ d_j = \frac{\sum_{t=j+1}^{N} (\hat{\nu}_t - \hat{\nu}_{t-j})^2}{\sum_{t=1}^{N} \hat{\nu}_t^2} \]

where $\hat{\boldsymbol{\nu}}$ are OLS residuals. Using the matrix notation,

\[ d_j = \frac{\mathbf{Y}' \mathbf{M} \mathbf{A}_j' \mathbf{A}_j \mathbf{M} \mathbf{Y}}{\mathbf{Y}' \mathbf{M} \mathbf{Y}} \]

where $\mathbf{M} = \mathbf{I}_N - \mathbf{X} (\mathbf{X}' \mathbf{X})^{-1} \mathbf{X}'$ and $\mathbf{A}_j$ is an $(N - j) \times N$ matrix,

\[
\mathbf{A}_j = \begin{bmatrix}
-1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\
0 & -1 & 0 & \cdots & 0 & 1 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & \cdots & 0 & -1 & 0 & \cdots & 0 & 1
\end{bmatrix}
\]

and there are $j - 1$ zeros between $-1$ and $1$ in each row of matrix $\mathbf{A}_j$.
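To make the two forms of the statistic concrete, the following sketch computes d_j once from the summation formula and once by applying the differencing matrix A_j to the residual vector. It assumes the OLS residuals are already available; the function names are hypothetical.

```python
def generalized_dw(resid, j):
    """d_j from the summation form: lag-j squared differences over sum of squares."""
    num = sum((resid[t] - resid[t - j]) ** 2 for t in range(j, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

def a_matrix(N, j):
    """(N - j) x N matrix A_j: row i has -1 in column i and 1 in column i + j."""
    A = [[0] * N for _ in range(N - j)]
    for i in range(N - j):
        A[i][i] = -1
        A[i][i + j] = 1
    return A

def dw_via_matrix(resid, j):
    """d_j as (A_j e)'(A_j e) / e'e, where e is the residual vector."""
    A = a_matrix(len(resid), j)
    diffs = [sum(a * e for a, e in zip(row, resid)) for row in A]
    return sum(d ** 2 for d in diffs) / sum(e ** 2 for e in resid)

# Perfectly alternating residuals: strong negative first-order autocorrelation.
resid = [1.0, -1.0, 1.0, -1.0]
d1 = generalized_dw(resid, 1)
d2 = generalized_dw(resid, 2)
```

Values of d_1 well above 2 point to negative first-order autocorrelation and values well below 2 to positive autocorrelation; here the second-order statistic is 0 because the residuals repeat exactly every two periods.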

The QR factorization of the design matrix $\mathbf{X}$ yields an $N \times N$ orthogonal matrix $\mathbf{Q}$,

\[ \mathbf{X} = \mathbf{Q} \mathbf{R} \]

where $\mathbf{R}$ is an $N \times k$ upper triangular matrix. There exists an $N \times (N - k)$ submatrix $\mathbf{Q}_1$ of $\mathbf{Q}$ such that $\mathbf{Q}_1 \mathbf{Q}_1' = \mathbf{M}$ and $\mathbf{Q}_1' \mathbf{Q}_1 = \mathbf{I}_{N-k}$. Consequently, the generalized Durbin-Watson statistic is stated as a ratio of two quadratic forms,

\[ d_j = \frac{\sum_{l=1}^{n} \lambda_{jl} \xi_l^2}{\sum_{l=1}^{n} \xi_l^2} \]

where $\lambda_{j1}, \ldots, \lambda_{jn}$ are the upper $n$ eigenvalues of $\mathbf{M} \mathbf{A}_j' \mathbf{A}_j \mathbf{M}$, $\xi_l$ is a standard normal variate, and $n = \min(N - k, N - j)$. These eigenvalues are obtained by a singular value decomposition of $\mathbf{Q}_1' \mathbf{A}_j'$ (Golub and Van Loan 1989; Savin and White 1978).

The marginal probability (or p-value) for $d_j$ given $c_0$ is

\[ \mathrm{Prob}\left( \frac{\sum_{l=1}^{n} \lambda_{jl} \xi_l^2}{\sum_{l=1}^{n} \xi_l^2} < c_0 \right) = \mathrm{Prob}(q_j < 0) \]

where

\[ q_j = \sum_{l=1}^{n} (\lambda_{jl} - c_0) \xi_l^2 \]

When the null hypothesis $H_0\colon \phi_j = 0$ holds, the quadratic form $q_j$ has the characteristic function

\[ \phi_j(t) = \prod_{l=1}^{n} \left( 1 - 2 (\lambda_{jl} - c_0) i t \right)^{-1/2} \]

The distribution function is uniquely determined by this characteristic function:

\[ F(x) = \frac{1}{2} + \frac{1}{2\pi} \int_0^{\infty} \frac{e^{itx} \phi_j(-t) - e^{-itx} \phi_j(t)}{it} \, dt \]

For example, to test $H_0\colon \phi_4 = 0$ given $\phi_1 = \phi_2 = \phi_3 = 0$ against $H_1\colon -\phi_4 > 0$, the marginal probability (p-value) can be used,

\[ F(0) = \frac{1}{2} + \frac{1}{2\pi} \int_0^{\infty} \frac{\phi_4(-t) - \phi_4(t)}{it} \, dt \]

where

\[ \phi_4(t) = \prod_{l=1}^{n} \left( 1 - 2 (\lambda_{4l} - \hat{d}_4) i t \right)^{-1/2} \]

and $\hat{d}_4$ is the calculated value of the fourth-order Durbin-Watson statistic.

In the Durbin-Watson test, the marginal probability indicates positive autocorrelation ($-\phi_j > 0$) if it is less than the level of significance ($\alpha$), while you can conclude that a negative autocorrelation ($-\phi_j < 0$) exists if the marginal probability based on the computed Durbin-Watson statistic is greater than $1 - \alpha$. Wallis (1972) presented tables for bounds tests of fourth-order autocorrelation, and Vinod (1973) has given tables for a 5% significance level for orders two to four. Using the AUTOREG procedure, you can calculate the exact p-values for the general order of Durbin-Watson test statistics. Tests for the absence of autocorrelation of order $p$ can be performed sequentially; at the $j$th step, test $H_0\colon \phi_j = 0$ given $\phi_1 = \cdots = \phi_{j-1} = 0$ against $\phi_j \neq 0$. However, the size of the sequential test is not known.

The Durbin-Watson statistic is computed from the OLS residuals, while that of the autoregressive error model uses residuals that are the difference between the predicted values and the actual values. When you use the Durbin-Watson test from the residuals of the autoregressive error model, you must be aware that this test is only an approximation. See the section Autoregressive Error Model. If there are missing values, the Durbin-Watson statistic is computed using all the nonmissing values and ignoring the gaps caused by missing residuals. This does not affect the significance level of the resulting test, although the power of the test against certain alternatives may be adversely affected. Savin and White (1978) have examined the use of the Durbin-Watson statistic with missing values.

The Durbin-Watson probability calculations have been enhanced to compute the p-value of the generalized Durbin-Watson statistic for large sample sizes. Previously, the Durbin-Watson probabilities were only calculated for small sample sizes.

Consider the linear regression model

\[ \mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \mathbf{u} \]
\[ u_t + \phi_j u_{t-j} = \epsilon_t, \quad t = 1, \ldots, N \]

where $\mathbf{X}$ is an $N \times k$ data matrix, $\boldsymbol{\beta}$ is a $k \times 1$ coefficient vector, $\mathbf{u}$ is an $N \times 1$ disturbance vector, and $\epsilon_t$ is a sequence of independent normal error terms with mean 0 and variance $\sigma^2$.

The generalized Durbin-Watson statistic is written as

\[ \mathrm{DW}_j = \frac{\hat{\mathbf{u}}' \mathbf{A}_j' \mathbf{A}_j \hat{\mathbf{u}}}{\hat{\mathbf{u}}' \hat{\mathbf{u}}} \]

where $\hat{\mathbf{u}}$ is a vector of OLS residuals and $\mathbf{A}_j$ is an $(N - j) \times N$ matrix. The generalized Durbin-Watson statistic $\mathrm{DW}_j$ can be rewritten as

\[ \mathrm{DW}_j = \frac{\mathbf{Y}' \mathbf{M} \mathbf{A}_j' \mathbf{A}_j \mathbf{M} \mathbf{Y}}{\mathbf{Y}' \mathbf{M} \mathbf{Y}} = \frac{\boldsymbol{\eta}' (\mathbf{Q}_1' \mathbf{A}_j' \mathbf{A}_j \mathbf{Q}_1) \boldsymbol{\eta}}{\boldsymbol{\eta}' \boldsymbol{\eta}} \]

where $\mathbf{Q}_1' \mathbf{Q}_1 = \mathbf{I}_{N-k}$, $\mathbf{Q}_1' \mathbf{X} = 0$, and $\boldsymbol{\eta} = \mathbf{Q}_1' \mathbf{u}$.

The marginal probability for the Durbin-Watson statistic is

\[ \mathrm{Prob}(\mathrm{DW}_j < c) = \mathrm{Prob}(h < 0) \]

where $h = \boldsymbol{\eta}' (\mathbf{Q}_1' \mathbf{A}_j' \mathbf{A}_j \mathbf{Q}_1 - c \mathbf{I}) \boldsymbol{\eta}$.

The p-value or the marginal probability for the generalized Durbin-Watson statistic is computed by numerical inversion of the characteristic function $\phi(u)$ of the quadratic form $h = \boldsymbol{\eta}' (\mathbf{Q}_1' \mathbf{A}_j' \mathbf{A}_j \mathbf{Q}_1 - c \mathbf{I}) \boldsymbol{\eta}$. The trapezoidal rule approximation to the marginal probability $\mathrm{Prob}(h < 0)$ is

\[ \mathrm{Prob}(h < 0) = \frac{1}{2} - \sum_{k=0}^{K} \frac{\mathrm{Im}\left[ \phi\left( (k + \tfrac{1}{2}) \Delta \right) \right]}{\pi (k + \tfrac{1}{2})} + \mathrm{E}_I(\Delta) + \mathrm{E}_T(K) \]

where $\mathrm{Im}[\phi(\cdot)]$ is the imaginary part of the characteristic function, and $\mathrm{E}_I(\Delta)$ and $\mathrm{E}_T(K)$ are integration and truncation errors, respectively. For numerical inversion of the characteristic function, see Davies (1973).
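The inversion sum can be sketched for a quadratic form whose shifted eigenvalues (eigenvalue minus c) are already known. This is only an illustration of the summation formula, with the two error terms dropped; it is not the O(N) algorithm that is described next, and the function names, the step size delta, and the truncation point K are assumptions chosen for the toy cases.

```python
from math import pi

def char_fn(u, shifted_eigs):
    """phi(u) = prod_l (1 - 2 i u d_l)^(-1/2), where d_l = lambda_l - c."""
    value = 1.0 + 0.0j
    for d in shifted_eigs:
        # Each factor has real part 1, so taking the principal branch
        # factor by factor keeps phi continuous in u.
        value *= (1.0 - 2.0j * u * d) ** -0.5
    return value

def prob_h_negative(shifted_eigs, delta=0.05, K=20000):
    """Approximate Prob(h < 0), ignoring the integration and truncation errors."""
    total = 0.0
    for k in range(K + 1):
        u = (k + 0.5) * delta
        total += char_fn(u, shifted_eigs).imag / (pi * (k + 0.5))
    return 0.5 - total

# h = 2*(xi1^2 + xi2^2) - (xi3^2 + xi4^2); the exact probability is 1/3.
p_mixed = prob_h_negative([2.0, 2.0, -1.0, -1.0])
# All shifted eigenvalues positive: h is never negative, so the result is near 0.
p_positive = prob_h_negative([1.0, 1.0])
```

The mixed case has a closed form because each pair of squared normals is twice a standard exponential variable, which gives a convenient check on the step size and truncation point.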

Ansley, Kohn, and Shively (1992) proposed a numerically efficient algorithm that requires O(N) operations for evaluation of the characteristic function $\phi(u)$. The characteristic function is denoted as

\[
\begin{aligned}
\phi(u) &= \left| \mathbf{I} - 2 i u \left( \mathbf{Q}_1' \mathbf{A}_j' \mathbf{A}_j \mathbf{Q}_1 - c \mathbf{I}_{N-k} \right) \right|^{-1/2} \\
&= |\mathbf{V}|^{-1/2} \left| \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \right|^{-1/2} \left| \mathbf{X}' \mathbf{X} \right|^{1/2}
\end{aligned}
\]

where $\mathbf{V} = (1 + 2 i u c) \mathbf{I} - 2 i u \mathbf{A}_j' \mathbf{A}_j$ and $i = \sqrt{-1}$. By applying the Cholesky decomposition to the complex matrix $\mathbf{V}$, you can obtain the lower triangular matrix $\mathbf{G}$ that satisfies $\mathbf{V} = \mathbf{G} \mathbf{G}'$. Therefore, the characteristic function can be evaluated in O(N) operations by using the formula

\[ \phi(u) = |\mathbf{G}|^{-1} \left| \mathbf{X}^{*\prime} \mathbf{X}^{*} \right|^{-1/2} \left| \mathbf{X}' \mathbf{X} \right|^{1/2} \]

where $\mathbf{X}^{*} = \mathbf{G}^{-1} \mathbf{X}$. For more information about evaluation of the characteristic function, see Ansley, Kohn, and Shively (1992).

Tests for Serial Correlation with Lagged Dependent Variables

When regressors contain lagged dependent variables, the Durbin-Watson statistic ($d_1$) for the first-order autocorrelation is biased toward 2 and has reduced power. Wallis (1972) shows that the bias in the Durbin-Watson statistic ($d_4$) for the fourth-order autocorrelation is smaller than the bias in $d_1$ in the presence of a first-order lagged dependent variable. Durbin (1970) proposes two alternative statistics (Durbin h and t) that are asymptotically equivalent. The h statistic is written as

\[ h = \hat{\rho} \sqrt{N / (1 - N \hat{V})} \]

where $\hat{\rho} = \sum_{t=2}^{N} \hat{\nu}_t \hat{\nu}_{t-1} / \sum_{t=1}^{N} \hat{\nu}_t^2$ and $\hat{V}$ is the least squares variance estimate for the coefficient of the lagged dependent variable. Durbin’s t test consists of regressing the OLS residuals $\hat{\nu}_t$ on explanatory variables and $\hat{\nu}_{t-1}$ and testing the significance of the estimate for the coefficient of $\hat{\nu}_{t-1}$.
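The h statistic is simple to evaluate once the residuals and the variance estimate are available. The sketch below follows the formula above; the function name is hypothetical, the inputs are made-up values, and returning None when 1 - N*V is not positive is a choice made here because the square root is then undefined (in that situation the t form of the test is the usable alternative).

```python
from math import sqrt

def durbin_h(resid, var_lag_coef):
    """Durbin's h from OLS residuals and the variance estimate V of the
    lagged-dependent-variable coefficient."""
    N = len(resid)
    rho = (sum(resid[t] * resid[t - 1] for t in range(1, N))
           / sum(e ** 2 for e in resid))
    denom = 1.0 - N * var_lag_coef
    if denom <= 0.0:
        return None  # square root undefined; h cannot be computed
    return rho * sqrt(N / denom)

# Toy residuals with rho = 0.25 and an assumed variance estimate of 0.05.
h_stat = durbin_h([1.0, 1.0, -1.0, -1.0], 0.05)
```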

Inder (1984) shows that the Durbin-Watson test for the absence of first-order autocorrelation is generally more powerful than the h test in finite samples. For information about the Durbin-Watson test in the presence of lagged dependent variables, see Inder (1986) and King and Wu (1991).

Godfrey LM Test

The GODFREY= option in the MODEL statement produces the Godfrey Lagrange multiplier test for serially correlated residuals for each equation (Godfrey 1978a, 1978b). The value r that you specify for the option is the maximum autoregressive order; Godfrey’s tests are computed for lags 1 through r. The default number of lags is four.

Testing for Nonlinear Dependence: Ramsey’s Reset Test

Ramsey’s reset test is a misspecification test associated with the functional form of models to check whether power transforms need to be added to a model. The original linear model, henceforth called the restricted model, is

\[ y_t = \mathbf{x}_t \boldsymbol{\beta} + u_t \]

To test for misspecification in the functional form, the unrestricted model is

\[ y_t = \mathbf{x}_t \boldsymbol{\beta} + \sum_{j=2}^{p} \phi_j \hat{y}_t^{\,j} + u_t \]

where $\hat{y}_t$ is the predicted value from the linear model and $p$ is the highest power of $\hat{y}_t$ in the unrestricted model equation; the powers start from 2. The number of higher-order terms to include is left to the discretion of the analyst. The RESET option produces test results for $p = 2$, 3, and 4.

The reset test is an F statistic for testing $H_0\colon \phi_j = 0$ for all $j = 2, \ldots, p$, against $H_1\colon \phi_j \neq 0$ for at least one $j = 2, \ldots, p$ in the unrestricted model and is computed as

\[ F_{(p-1,\, n-k-p+1)} = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_U) / (p - 1)}{\mathrm{SSE}_U / (n - k - p + 1)} \]

where $\mathrm{SSE}_R$ is the sum of squared errors due to the restricted model, $\mathrm{SSE}_U$ is the sum of squared errors due to the unrestricted model, $n$ is the total number of observations, and $k$ is the number of parameters in the original linear model.

Ramsey’s test can be viewed as a linearity test that checks whether any nonlinear transformation of the specified independent variables has been omitted, but it need not help in identifying a new relevant variable other than those already specified in the current model.
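As a small arithmetic illustration, the F statistic and its degrees of freedom follow directly from the two sums of squared errors. The function name and the input numbers below are made up for the example; they do not come from any fitted model.

```python
def reset_f(sse_r, sse_u, n, k, p):
    """Return (F, df1, df2) for the reset test with powers 2..p of the
    predicted value added to a model that has k parameters."""
    df1 = p - 1            # number of added power terms
    df2 = n - k - p + 1    # residual degrees of freedom, unrestricted model
    f_stat = ((sse_r - sse_u) / df1) / (sse_u / df2)
    return f_stat, df1, df2

# Hypothetical values: 50 observations, 3 parameters, powers 2 and 3 added.
f_stat, df1, df2 = reset_f(sse_r=120.0, sse_u=100.0, n=50, k=3, p=3)
```

The statistic is then compared with an F distribution that has df1 and df2 degrees of freedom.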

Testing for Nonlinear Dependence: Heteroscedasticity Tests

Portmanteau Q Test

For nonlinear time series models, the portmanteau test statistic based on squared residuals is used to test for independence of the series (McLeod and Li 1983),

\[ Q(q) = N (N + 2) \sum_{i=1}^{q} \frac{r^2(i; \hat{\nu}_t^2)}{N - i} \]

where

\[ r(i; \hat{\nu}_t^2) = \frac{\sum_{t=i+1}^{N} (\hat{\nu}_t^2 - \hat{\sigma}^2)(\hat{\nu}_{t-i}^2 - \hat{\sigma}^2)}{\sum_{t=1}^{N} (\hat{\nu}_t^2 - \hat{\sigma}^2)^2} \]
\[ \hat{\sigma}^2 = \frac{1}{N} \sum_{t=1}^{N} \hat{\nu}_t^2 \]

This Q statistic is used to test the nonlinear effects (for example, GARCH effects) present in the residuals. The GARCH$(p, q)$ process can be considered as an ARMA$(\max(p, q), p)$ process. See the section Predicting the Conditional Variance. Therefore, the Q statistic calculated from the squared residuals can be used to identify the order of the GARCH process.
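A minimal sketch of the Q statistic follows. It uses the squared autocorrelations of the squared residuals, as in the McLeod and Li (1983) statistic; the function name and the toy residuals are assumptions for illustration.

```python
def mcleod_li_q(resid, q):
    """Portmanteau Q(q) computed from autocorrelations of squared residuals."""
    N = len(resid)
    sq = [e ** 2 for e in resid]
    sigma2 = sum(sq) / N                     # mean of the squared residuals
    dev = [s - sigma2 for s in sq]
    denom = sum(d ** 2 for d in dev)
    total = 0.0
    for i in range(1, q + 1):
        r_i = sum(dev[t] * dev[t - i] for t in range(i, N)) / denom
        total += r_i ** 2 / (N - i)
    return N * (N + 2) * total

# Squared residuals alternate 1, 4, 1, 4, so the lag-1 autocorrelation is -0.75.
q1 = mcleod_li_q([1.0, 2.0, 1.0, 2.0], 1)
```

Large values of Q(q), relative to a chi-square distribution with q degrees of freedom, suggest dependence in the squared residuals.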

Engle’s Lagrange Multiplier Test for ARCH Disturbances

Engle (1982) proposed a Lagrange multiplier test for ARCH disturbances. The test statistic is asymptotically equivalent to the test used by Breusch and Pagan (1979). Engle’s Lagrange multiplier test for the qth order ARCH process is written

\[
LM(q) = \frac{N \mathbf{W}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{W}}{\mathbf{W}'\mathbf{W}}
\]

where

\[
\mathbf{W} = \left( \frac{\hat{\nu}_1^2}{\hat{\sigma}^2} - 1, \ldots, \frac{\hat{\nu}_N^2}{\hat{\sigma}^2} - 1 \right)'
\]

and

\[
\mathbf{Z} = \begin{bmatrix}
1 & \hat{\nu}_0^2 & \cdots & \hat{\nu}_{-q+1}^2 \\
\vdots & \vdots & \vdots & \vdots \\
\vdots & \vdots & \vdots & \vdots \\
1 & \hat{\nu}_{N-1}^2 & \cdots & \hat{\nu}_{N-q}^2
\end{bmatrix}
\]

The presample values ($\nu_0^2, \ldots, \nu_{-q+1}^2$) have been set to 0. Note that the LM$(q)$ tests might have different finite-sample properties depending on the presample values, though they are asymptotically equivalent regardless of the presample values.
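The matrix expression for the LM statistic can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the AUTOREG code: the helper `solve` and the function `engle_lm` are hypothetical names, the presample squared residuals are set to 0 as described above, and the linear system is solved with a small Gaussian elimination instead of a numerical library.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def engle_lm(resid, q):
    """LM(q) = N W'Z (Z'Z)^{-1} Z'W / W'W (illustrative sketch)."""
    n = len(resid)
    sq = [e * e for e in resid]
    sigma2 = sum(sq) / n
    w = [s / sigma2 - 1.0 for s in sq]
    # Row t of Z: constant, then q lagged squared residuals (presample = 0).
    z = [[1.0] + [sq[t - j] if t - j >= 0 else 0.0 for j in range(1, q + 1)]
         for t in range(n)]
    k = q + 1
    ztz = [[sum(row[a] * row[b] for row in z) for b in range(k)] for a in range(k)]
    ztw = [sum(z[t][a] * w[t] for t in range(n)) for a in range(k)]
    coef = solve(ztz, ztw)                    # (Z'Z)^{-1} Z'W
    return n * sum(c * v for c, v in zip(coef, ztw)) / sum(v * v for v in w)


# Made-up residuals with an obvious change in volatility.
resid = [0.1, -0.1] * 20 + [2.0, -2.0] * 20
print(engle_lm(resid, 1))
```

Because $\mathbf{W}$ has mean zero and $\mathbf{Z}$ contains a constant column, the statistic equals $N$ times the $R^2$ from regressing $\mathbf{W}$ on $\mathbf{Z}$, so it lies between 0 and $N$.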

Lee and King’s Test for ARCH Disturbances

Engle’s Lagrange multiplier test for ARCH disturbances is a two-sided test; that is, it ignores the inequality constraints for the coefficients in ARCH models. Lee and King (1993) propose a one-sided test and prove that the test is locally most mean powerful. Let $\epsilon_t, t = 1, \ldots, T$, denote the residuals to be tested. Lee and King’s test checks

\[
H_0: \alpha_i = 0, \quad i = 1, \ldots, q
\]
\[
H_1: \alpha_i > 0, \quad i = 1, \ldots, q
\]

where $\alpha_i, i = 1, \ldots, q$, are in the following ARCH($q$) model:

\[
\epsilon_t = \sqrt{h_t}\, e_t, \quad e_t \ \mathrm{iid}(0, 1)
\]
\[
h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2
\]

The statistic is written as

\[
S = \frac{\sum_{t=q+1}^{T} \left( \frac{\epsilon_t^2}{h_0} - 1 \right) \sum_{i=1}^{q} \epsilon_{t-i}^2}{\left[ 2 \sum_{t=q+1}^{T} \left( \sum_{i=1}^{q} \epsilon_{t-i}^2 \right)^2 - \frac{2 \left( \sum_{t=q+1}^{T} \sum_{i=1}^{q} \epsilon_{t-i}^2 \right)^2}{T-q} \right]^{1/2}}
\]

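The statistic $S$ can be sketched in Python as shown next. This is only an illustration: the function name `lee_king_s` is made up, and because the text does not define $h_0$, the sketch assumes by default that it is the mean of the squared residuals (the variance estimate under the null hypothesis); pass a different value if another definition applies.

```python
import math


def lee_king_s(eps, q, h0=None):
    """Lee and King's one-sided statistic S (illustrative sketch)."""
    t_len = len(eps)
    sq = [e * e for e in eps]
    if h0 is None:
        h0 = sum(sq) / t_len                  # ASSUMPTION: h_0 = mean squared residual
    # For t = q+1..T (0-based index t >= q): sum of the q lagged squared residuals.
    lag_sums = [sum(sq[t - i] for i in range(1, q + 1)) for t in range(q, t_len)]
    num = sum((sq[t] / h0 - 1.0) * s for t, s in zip(range(q, t_len), lag_sums))
    den = (2.0 * sum(s * s for s in lag_sums)
           - 2.0 * sum(lag_sums) ** 2 / (t_len - q))
    return num / math.sqrt(den)


# Made-up residuals: a calm stretch followed by a volatile stretch.
eps = [0.1, -0.1] * 20 + [2.0, -2.0] * 20
s_stat = lee_king_s(eps, 1)
p_value = 0.5 * math.erfc(s_stat / math.sqrt(2.0))   # one-sided, standard normal
print(s_stat, p_value)
```

The one-sided p-value uses the upper tail of the standard normal distribution, which matches the direction of the alternative hypothesis $\alpha_i > 0$.
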
Wong and Li’s Test for ARCH Disturbances

Wong and Li (1995) propose a rank portmanteau statistic to reduce the effect of outliers in the test for ARCH disturbances. They first rank the squared residuals; that is, $R_t = \mathrm{rank}(\epsilon_t^2)$. Then they calculate the rank portmanteau statistic

\[
Q_R = \sum_{i=1}^{q} \frac{(r_i - \mu_i)^2}{\sigma_i^2}
\]

where $r_i$, $\mu_i$, and $\sigma_i^2$ are defined as follows:

\[
r_i = \frac{\sum_{t=i+1}^{T} \left( R_t - (T+1)/2 \right) \left( R_{t-i} - (T+1)/2 \right)}{T(T^2-1)/12}
\]
\[
\mu_i = -\frac{T-i}{T(T-1)}
\]
\[
\sigma_i^2 = \frac{5T^4 - (5i+9)T^3 + 9(i-2)T^2 + 2i(5i+8)T + 16i^2}{5(T-1)^2 T^2 (T+1)}
\]
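The three definitions translate directly into code. The following Python sketch is illustrative only; the name `wong_li_qr` is hypothetical, and the sketch assumes there are no ties among the squared residuals (ties are broken by position, which is a simplification).

```python
def wong_li_qr(eps, q):
    """Rank portmanteau statistic Q_R (illustrative sketch, assumes no ties)."""
    t_len = len(eps)
    sq = [e * e for e in eps]
    order = sorted(range(t_len), key=lambda k: sq[k])
    rank = [0] * t_len
    for pos, k in enumerate(order):
        rank[k] = pos + 1                     # ranks 1..T
    mean_rank = (t_len + 1) / 2.0
    scale = t_len * (t_len ** 2 - 1) / 12.0
    stat = 0.0
    for i in range(1, q + 1):
        r_i = sum((rank[t] - mean_rank) * (rank[t - i] - mean_rank)
                  for t in range(i, t_len)) / scale
        mu_i = -(t_len - i) / (t_len * (t_len - 1))
        var_i = ((5 * t_len ** 4 - (5 * i + 9) * t_len ** 3
                  + 9 * (i - 2) * t_len ** 2 + 2 * i * (5 * i + 8) * t_len
                  + 16 * i ** 2)
                 / (5 * (t_len - 1) ** 2 * t_len ** 2 * (t_len + 1)))
        stat += (r_i - mu_i) ** 2 / var_i
    return stat


# Made-up residuals whose magnitude grows steadily over time.
eps = [0.1 * (k + 1) for k in range(50)]
print(wong_li_qr(eps, 1))
```

Because only the ranks enter the statistic, a single extreme residual changes the result far less than it would change a statistic based on the squared values themselves.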

The Q, Engle’s LM, Lee and King’s, and Wong and Li’s statistics are computed from the OLS residuals, or residuals if the NLAG= option is specified, assuming that disturbances are white noise. The Q, Engle’s LM, and Wong and Li’s statistics have an approximate $\chi^2_{(q)}$ distribution under the white-noise null hypothesis, whereas Lee and King’s statistic has a standard normal distribution under the white-noise null hypothesis.

Testing for Structural Change

Chow Test

Consider the linear regression model

\[
\mathbf{y} = \mathbf{X}\beta + \mathbf{u}
\]

where the parameter vector $\beta$ contains $k$ elements.

Split the observations for this model into two subsets at the break point specified by the CHOW= option, so that

\[
\begin{aligned}
\mathbf{y} &= (\mathbf{y}'_1, \mathbf{y}'_2)' \\
\mathbf{X} &= (\mathbf{X}'_1, \mathbf{X}'_2)' \\
\mathbf{u} &= (\mathbf{u}'_1, \mathbf{u}'_2)'
\end{aligned}
\]

Now consider the two linear regressions for the two subsets of the data modeled separately,

\[
\mathbf{y}_1 = \mathbf{X}_1 \beta_1 + \mathbf{u}_1
\]
\[
\mathbf{y}_2 = \mathbf{X}_2 \beta_2 + \mathbf{u}_2
\]

where the number of observations from the first set is $n_1$ and the number of observations from the second set is $n_2$.

The Chow test statistic is used to test the null hypothesis $H_0: \beta_1 = \beta_2$ conditional on the same error variance $V(\mathbf{u}_1) = V(\mathbf{u}_2)$. The Chow test is computed using three sums of squared errors,

\[
F_{chow} = \frac{(\hat{\mathbf{u}}'\hat{\mathbf{u}} - \hat{\mathbf{u}}'_1 \hat{\mathbf{u}}_1 - \hat{\mathbf{u}}'_2 \hat{\mathbf{u}}_2)/k}{(\hat{\mathbf{u}}'_1 \hat{\mathbf{u}}_1 + \hat{\mathbf{u}}'_2 \hat{\mathbf{u}}_2)/(n_1 + n_2 - 2k)}
\]

where $\hat{\mathbf{u}}$ is the regression residual vector from the full set model, $\hat{\mathbf{u}}_1$ is the regression residual vector from the first set model, and $\hat{\mathbf{u}}_2$ is the regression residual vector from the second set model. Under the null hypothesis, the Chow test statistic has an F distribution with $k$ and $(n_1 + n_2 - 2k)$ degrees of freedom, where $k$ is the number of elements in $\beta$.

Chow (1960) suggested another test statistic that tests the hypothesis that the mean of prediction errors is 0. The predictive Chow test can also be used when $n_2 < k$.

The PCHOW= option computes the predictive Chow test statistic

\[
F_{pchow} = \frac{(\hat{\mathbf{u}}'\hat{\mathbf{u}} - \hat{\mathbf{u}}'_1 \hat{\mathbf{u}}_1)/n_2}{\hat{\mathbf{u}}'_1 \hat{\mathbf{u}}_1/(n_1 - k)}
\]

The predictive Chow test has an F distribution with $n_2$ and $(n_1 - k)$ degrees of freedom.
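Both Chow statistics depend only on sums of squared residuals, which the following Python sketch makes explicit. To stay self-contained it handles only the simplest case, an intercept plus one regressor ($k = 2$), using the closed-form OLS solution; the function names `ssr_simple` and `chow_tests` and the data are made up for illustration and do not reflect how PROC AUTOREG computes the tests.

```python
def ssr_simple(x, y):
    """SSR from OLS of y on an intercept and one regressor x (k = 2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))


def chow_tests(x, y, n1, k=2):
    """Return (F_chow, F_pchow) for a break after the first n1 observations."""
    n2 = len(x) - n1
    ssr_full = ssr_simple(x, y)               # full-sample residual sum of squares
    ssr1 = ssr_simple(x[:n1], y[:n1])         # first subset
    ssr2 = ssr_simple(x[n1:], y[n1:])         # second subset
    f_chow = ((ssr_full - ssr1 - ssr2) / k) / ((ssr1 + ssr2) / (n1 + n2 - 2 * k))
    f_pchow = ((ssr_full - ssr1) / n2) / (ssr1 / (n1 - k))
    return f_chow, f_pchow


# Made-up data: the intercept and slope both change after observation 10.
x = [float(t) for t in range(20)]
y = [(1 + 2 * t if t < 10 else 10 - t) + 0.1 * (-1) ** t for t in range(20)]
print(chow_tests(x, y, 10))
```

The predictive version never fits the second subset on its own, which is why it remains usable when $n_2 < k$.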

Bai and Perron’s Multiple Structural Change Tests

Bai and Perron (1998) propose several kinds of multiple structural change tests: (1) the test of no break versus a fixed number of breaks ($\sup F$ test), (2) the equal and unequal weighted versions of double maximum tests of no break versus an unknown number of breaks given some upper bound ($UD\max F$ test and $WD\max F$ test), and (3) the test of $l$ versus $l+1$ breaks ($\sup F_{l+1|l}$ test). Bai and Perron (2003a, 2003b, 2006) also show how to implement these tests, provide commonly used critical values, and report simulation analyses of the tests.

Consider the following partial structural change model with $m$ breaks ($m+1$ regimes):

\[
y_t = x'_t \beta + z'_t \delta_j + u_t, \quad t = T_{j-1}+1, \ldots, T_j, \quad j = 1, \ldots, m+1
\]

Here, $y_t$ is the dependent variable observed at time $t$, $x_t$ $(p \times 1)$ is a vector of covariates with coefficients $\beta$ unchanged over time, and $z_t$ $(q \times 1)$ is a vector of covariates with coefficients $\delta_j$ at regime $j$, $j = 1, \ldots, m+1$. If $p = 0$ (that is, there are no $x$ regressors), the regression model becomes the pure structural change model. The indices $(T_1, \ldots, T_m)$ (that is, the break dates or break points) are unknown, and the convenient notation $T_0 = 0$ and $T_{m+1} = T$ applies. For any given $m$-partition $(T_1, \ldots, T_m)$, the associated least squares estimates of $\beta$ and $\delta_j, j = 1, \ldots, m+1$, are obtained by minimizing the sum of squared residuals (SSR),

\[
S_T(T_1, \ldots, T_m) = \sum_{i=1}^{m+1} \sum_{t=T_{i-1}+1}^{T_i} (y_t - x'_t \beta - z'_t \delta_i)^2
\]

Let $\hat{S}_T(T_1, \ldots, T_m)$ denote the minimized SSR for a given $(T_1, \ldots, T_m)$. The estimated break dates $(\hat{T}_1, \ldots, \hat{T}_m)$ are such that

\[
(\hat{T}_1, \ldots, \hat{T}_m) = \arg\min_{T_1, \ldots, T_m} \hat{S}_T(T_1, \ldots, T_m)
\]

where the minimization is taken over all partitions $(T_1, \ldots, T_m)$ such that $T_i - T_{i-1} \geq T\epsilon$. Bai and Perron (2003a) propose an efficient algorithm, based on the principle of dynamic programming, to estimate the preceding model.
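To make the minimization concrete, the following Python sketch estimates a single break date ($m = 1$) in the simplest pure structural change model, where the only regressor is a constant, so each regime is fitted by its own mean. It uses an exhaustive search rather than the dynamic programming algorithm of Bai and Perron (2003a); the names `ssr_mean`, `estimate_single_break`, and `min_len` (the minimum number of observations per regime) are hypothetical.

```python
def ssr_mean(seg):
    """SSR of a segment around its own mean (regression on a constant)."""
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def estimate_single_break(y, min_len):
    """Return (T1_hat, minimized SSR) for one break in the mean.

    T1 is the last observation of the first regime (1-based), and each
    regime must contain at least min_len observations.
    """
    t_len = len(y)
    best = None
    for t1 in range(min_len, t_len - min_len + 1):
        ssr = ssr_mean(y[:t1]) + ssr_mean(y[t1:])
        if best is None or ssr < best[1]:
            best = (t1, ssr)
    return best


# Made-up series whose mean shifts after observation 12.
y = [0.0] * 12 + [5.0] * 8
print(estimate_single_break(y, 3))   # prints (12, 0.0)
```

An exhaustive search over $m$ breaks requires on the order of $T^m$ least squares fits, which is the cost that the dynamic programming approach avoids.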

In the case that the data are nontrending, as stated in Bai and Perron (1998), the limiting distribution of the break dates is

\[
\frac{(\Delta'_i Q_i \Delta_i)^2}{(\Delta'_i \Omega_i \Delta_i)} (\hat{T}_i - T_i^0) \Rightarrow \arg\max_s V^{(i)}(s), \quad i = 1, \ldots, m
\]

where

\[
V^{(i)}(s) = \begin{cases}
W_1^{(i)}(-s) - |s|/2 & \text{if } s \leq 0 \\
\sqrt{\eta_i} (\phi_{i,2}/\phi_{i,1}) W_2^{(i)}(s) - \eta_i |s|/2 & \text{if } s > 0
\end{cases}
\]

and

\[
\begin{aligned}
\Delta T_i^0 &= T_i^0 - T_{i-1}^0 \\
\Delta_i &= \delta_{i+1}^0 - \delta_i^0 \\
Q_i &= \lim (\Delta T_i^0)^{-1} \sum_{t=T_{i-1}^0+1}^{T_i^0} E(z_t z'_t) \\
\Omega_i &= \lim (\Delta T_i^0)^{-1} \sum_{r=T_{i-1}^0+1}^{T_i^0} \sum_{t=T_{i-1}^0+1}^{T_i^0} E(z_r z'_t u_r u_t) \\
\eta_i &= \Delta'_i Q_{i+1} \Delta_i / \Delta'_i Q_i \Delta_i \\
\phi_{i,1}^2 &= \Delta'_i \Omega_i \Delta_i / \Delta'_i Q_i \Delta_i \\
\phi_{i,2}^2 &= \Delta'_i \Omega_{i+1} \Delta_i / \Delta'_i Q_{i+1} \Delta_i
\end{aligned}
\]

Also, $W_1^{(i)}(s)$ and $W_2^{(i)}(s)$ are independent standard Wiener processes that are defined on $[0, \infty)$, starting at the origin when $s = 0$; these processes are also independent across $i$. The cumulative distribution function of $\arg\max_s V^{(i)}(s)$ is shown in Bai (1997). Hence, with the estimates of $\Delta_i$, $Q_i$, and $\Omega_i$, the relevant critical values for the confidence intervals of the break dates $T_i$ can be calculated. The estimate of $\Delta_i$ is $\hat{\delta}_{i+1} - \hat{\delta}_i$. The estimate of $Q_i$ is either

\[
\hat{Q}_i = (\Delta \hat{T}_i)^{-1} \sum_{t=\hat{T}_{i-1}^0+1}^{\hat{T}_i^0} z_t z'_t
\]

if the regressors are assumed to have heterogeneous distributions across regimes (that is, the HQ option is specified), or

\[
\hat{Q}_i = \hat{Q} = T^{-1} \sum_{t=1}^{T} z_t z'_t
\]

if the regressors are assumed to have identical distributions across regimes (that is, the HQ option is not specified). The estimate of $\Omega_i$ can also be constructed with data over regime $i$ only or the whole sample, depending on whether the vectors $z_t \hat{u}_t$ are heterogeneously distributed across regimes (that is, the HO option is specified). If the HAC option is specified, $\hat{\Omega}_i$ is estimated through the heteroscedasticity- and autocorrelation-consistent (HAC) covariance matrix estimator applied to vectors $z_t \hat{u}_t$.

The $\sup F$ test of no structural break $(m = 0)$ versus the alternative hypothesis that there are a fixed number, $m = k$, of breaks is defined as

\[
\sup F(k) = \frac{1}{T} \left( \frac{T - (k+1)q - p}{kq} \right) (R\hat{\theta})' \left( R \hat{V}(\hat{\theta}) R' \right)^{-1} (R\hat{\theta})
\]

where

\[
R_{(kq) \times (p + (k+1)q)} = \begin{pmatrix}
0_{q \times p} & I_q & -I_q & 0 & 0 & \cdots & 0 \\
0_{q \times p} & 0 & I_q & -I_q & 0 & \cdots & 0 \\
\vdots & \cdots & \ddots & \ddots & \ddots & \ddots & \cdots \\
0_{q \times p} & 0 & \cdots & \cdots & 0 & I_q & -I_q
\end{pmatrix}
\]

and $I_q$ is the $q \times q$ identity matrix; $\hat{\theta}$ is the coefficient vector $(\hat{\beta}' \ \hat{\delta}'_1 \ \ldots \ \hat{\delta}'_{k+1})'$, which together with the break dates $(\hat{T}_1 \ldots \hat{T}_k)$ minimizes the global sum of squared residuals; and $\hat{V}(\hat{\theta})$ is an estimate of the variance-covariance matrix of $\hat{\theta}$, which could be estimated by using the HAC estimator or in another way, depending on how the HAC, HR, and HE options are specified. The output $\sup F$ test statistics are scaled by $q$, the number of regressors, to be consistent with the limiting distribution; Bai and Perron (2003b, 2006) take the same action.

There are two versions of double maximum tests of no break against an unknown number of breaks given some upper bound $M$: the $UD\max F$ test,

\[
UD\max F(M) = \max_{1 \leq m \leq M} \sup F(m)
\]

and the $WD\max F$ test,

\[
WD\max F(M, \alpha) = \max_{1 \leq m \leq M} \frac{c_\alpha(1)}{c_\alpha(m)} \sup F(m)
\]

where $\alpha$ is the significance level and $c_\alpha(m)$ is the critical value of the $\sup F(m)$ test given the significance level $\alpha$. Four kinds of $WD\max F$ tests that correspond to $\alpha = 0.100, 0.050, 0.025$, and $0.010$ are implemented.
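Given the $\sup F(m)$ statistics and their critical values, the two double maximum statistics reduce to a weighted maximum, as the following Python sketch shows. The function name `double_max` is hypothetical, and the input numbers are invented purely to exercise the code; they are not actual test statistics or tabulated critical values.

```python
def double_max(sup_f, crit):
    """Return (UDmax F, WDmax F) for m = 1..M.

    sup_f[m-1] holds supF(m) and crit[m-1] holds c_alpha(m) at one
    significance level alpha.
    """
    ud = max(sup_f)
    # Weight each supF(m) by c_alpha(1) / c_alpha(m) before taking the maximum.
    wd = max(crit[0] / c * f for f, c in zip(sup_f, crit))
    return ud, wd


# Invented inputs for M = 3 (NOT real statistics or critical values).
sup_f = [8.0, 9.0, 7.5]
crit = [8.0, 7.0, 6.0]
print(double_max(sup_f, crit))
```

The weights $c_\alpha(1)/c_\alpha(m)$ put the individual $\sup F(m)$ statistics on a comparable scale, which is why the weighted and unweighted versions can select different values of $m$ as the maximizer.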

The $\sup F(l+1|l)$ test of $l$ versus $l+1$ breaks is calculated in two ways that are asymptotically the same. In the first calculation, the method amounts to the application of $(l+1)$ tests of the null hypothesis of no structural change versus the alternative hypothesis of a single change. The test is applied to each segment that contains the observations $\hat{T}_{i-1}$ to $\hat{T}_i$ $(i = 1, \ldots, l+1)$. The $\sup F(l+1|l)$ test statistic is the maximum of these $(l+1)$ $\sup F$ test statistics. In the second calculation, for the given $l$ breaks $(\hat{T}_1, \ldots, \hat{T}_l)$, the new break $\hat{T}^{(N)}$ is chosen to minimize the global SSR:

\[
\hat{T}^{(N)} = \arg\min_{T^{(N)}} SSR(\hat{T}_1, \ldots, \hat{T}_l; T^{(N)})
\]

Then,

\[
\sup F(l+1|l) = (T - (l+1)q - p) \frac{SSR(\hat{T}_1, \ldots, \hat{T}_l) - SSR(\hat{T}_1, \ldots, \hat{T}_l; \hat{T}^{(N)})}{SSR(\hat{T}_1, \ldots, \hat{T}_l)}
\]

The p-value of each test is based on the simulation of the limiting distribution of that test.

Last updated: June 19, 2025