PANEL Procedure

Panel Data Unit Root Tests

Unit roots are a major concern in dynamic processes because they have important implications for the stationarity of a process and hence for estimation. Applying standard estimation techniques while ignoring the presence of unit roots can lead to spurious regressions and nonsensical results. Detecting unit roots so that stationary processes can be analyzed is therefore of vital concern for dynamic processes. One of the most widely used tests in the time series literature is the augmented Dickey-Fuller (ADF) test. This section introduces and briefly reviews the tests developed for dynamic panel data, which in most cases are enhancements of the ADF test.

Levin, Lin, and Chu Test

Levin, Lin, and Chu (2002) propose a panel data unit root test for the null hypothesis of a unit root against a hypothesis of homogeneous stationarity. The model is specified as

$$\Delta y_{it} = \delta y_{i,t-1} + \sum_{L=1}^{p_i} \theta_{iL}\,\Delta y_{i,t-L} + \alpha_{mi} d_{mt} + \epsilon_{it}, \quad m = 1, 2, 3$$

The panel data unit root test evaluates the null hypothesis $H_0\colon \delta = 0$ for all $i$ against the alternative hypothesis $H_1\colon \delta < 0$ for all $i$. Three models are considered: (1) $d_{1t} = \emptyset$ (the empty set), with no individual effects; (2) $d_{2t} = \{1\}$, in which the series $y_{it}$ has an individual-specific mean but no time trend; and (3) $d_{3t} = \{1, t\}$, in which the series $y_{it}$ has an individual-specific mean and a linear, individual-specific time trend. The lag order $p_i$ is unknown and is allowed to vary across individuals. It can be selected by the methods that are described in the section Lag Order Selection in the ADF Regression; the selected lag order is denoted $\hat{p}_i$. The necessary condition for the test is that $\sqrt{N}/T \to 0$. An important assumption is that the errors $\epsilon_{it}$ are $\mathrm{iid}(0, \sigma^2_{i,t})$. In other words, cross-sectional independence is assumed. The test is implemented in the following three steps:

Step 1

The ADF regressions are implemented for each individual i, and then the orthogonalized residuals are generated and normalized. That is, the following model is estimated:

$$\Delta y_{it} = \delta_i y_{i,t-1} + \sum_{L=1}^{\hat{p}_i} \theta_{iL}\,\Delta y_{i,t-L} + \alpha_{mi} d_{mt} + \epsilon_{it}, \quad m = 1, 2, 3$$

Then, two orthogonalized residuals are generated by the following two auxiliary regressions:

$$\Delta y_{it} = \sum_{L=1}^{\hat{p}_i} \theta_{iL}\,\Delta y_{i,t-L} + \alpha_{mi} d_{mt} + e_{it}$$
$$y_{i,t-1} = \sum_{L=1}^{\hat{p}_i} \theta_{iL}\,\Delta y_{i,t-L} + \alpha_{mi} d_{mt} + v_{i,t-1}$$

The residuals are then saved as $\hat{e}_{it}$ and $\hat{v}_{i,t-1}$, respectively, and normalized using the regression standard error from the ADF regression in order to remove heteroscedasticity. Let $\hat\sigma_{\epsilon i}$ denote the standard error from each of the previous ADF regressions, where $\hat\sigma^2_{\epsilon i} = \sum_{t=\hat{p}_i+2}^{T} (\hat{e}_{it} - \hat\delta_i \hat{v}_{i,t-1})^2 / (T - \hat{p}_i - 1)$. The normalized residuals are then

$$\tilde{e}_{it} = \frac{\hat{e}_{it}}{\hat\sigma_{\epsilon i}}, \qquad \tilde{v}_{i,t-1} = \frac{\hat{v}_{i,t-1}}{\hat\sigma_{\epsilon i}}$$
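The two auxiliary regressions and the normalization can be sketched in NumPy for a single individual. This is a minimal illustration of step 1 for model (2) (intercept only); the function name and interface are assumptions for illustration, not part of PROC PANEL:

```python
import numpy as np

def llc_step1(y, p_hat):
    """Step 1 sketch for one individual in model (2): run the two
    auxiliary regressions on the lagged differences plus an intercept,
    and return the normalized residuals (e_tilde, v_tilde).
    `y` is the level series; `p_hat` is the selected lag order."""
    T = len(y)
    dy = np.diff(y)                          # dy[j] = Delta y_{j+2}
    rows = np.arange(p_hat, len(dy))         # sample t = p_hat+2, ..., T
    cols = [dy[rows - L] for L in range(1, p_hat + 1)]
    cols.append(np.ones(len(rows)))          # intercept (model 2)
    X = np.column_stack(cols)

    def ols_resid(b):
        coef, *_ = np.linalg.lstsq(X, b, rcond=None)
        return b - X @ coef

    e_hat = ols_resid(dy[rows])              # residual of Delta y_t on X
    v_hat = ols_resid(y[rows])               # residual of y_{t-1} on X
    # regression standard error from the ADF regression of e_hat on v_hat
    delta_i = (e_hat @ v_hat) / (v_hat @ v_hat)
    u = e_hat - delta_i * v_hat
    sig = np.sqrt((u @ u) / (T - p_hat - 1))
    return e_hat / sig, v_hat / sig
```

With $T = 50$ observations and $\hat{p}_i = 2$, the effective sample runs over $t = 4, \ldots, 50$, so both residual vectors have 47 elements.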
Step 2

The ratios of long-run to short-run standard deviations of $\Delta y_{it}$ are estimated. Denote the ratios and the long-run variances as $s_i$ and $\sigma^2_{yi}$, respectively. The long-run variances are estimated by the heteroscedasticity- and autocorrelation-consistent (HAC) estimators that are described in the section Long-Run Variance Estimation. The ratios are then estimated by $\hat{s}_i = \hat\sigma_{yi}/\hat\sigma_{\epsilon i}$. Let the average standard deviation ratio be $S_N = (1/N)\sum_{i=1}^{N} s_i$, and let its estimator be $\hat{S}_N = (1/N)\sum_{i=1}^{N} \hat{s}_i$. As the authors note in their paper (Levin, Lin, and Chu 2002), use of the long-run variance based on first differences results in lower bias in finite samples.

Step 3

The panel test statistics are calculated. To calculate the t statistic and the adjusted t statistic, the following equation is estimated:

$$\tilde{e}_{it} = \delta\,\tilde{v}_{i,t-1} + \tilde\epsilon_{it}$$

The total number of observations is $N\tilde{T}$, where $\bar{\hat{p}} = \sum_{i=1}^{N} \hat{p}_i / N$ and $\tilde{T} = T - \bar{\hat{p}} - 1$. The standard $t$ statistic for testing $H_0\colon \delta = 0$ is $t_\delta = \hat\delta / \hat\sigma_\delta$, with the OLS estimator $\hat\delta$ and standard deviation $\hat\sigma_\delta$,

$$\hat\delta = \frac{\sum_{i=1}^{N} \sum_{t=2+\hat{p}_i}^{T} \tilde{e}_{it}\,\tilde{v}_{i,t-1}}{\sum_{i=1}^{N} \sum_{t=2+\hat{p}_i}^{T} \tilde{v}^{\,2}_{i,t-1}}$$
$$\hat\sigma_\delta = \hat\sigma_{\tilde\epsilon} \left[\sum_{i=1}^{N} \sum_{t=2+\hat{p}_i}^{T} \tilde{v}^{\,2}_{i,t-1}\right]^{-1/2}$$

where $\hat\sigma^2_{\tilde\epsilon}$ is the estimated variance of $\tilde\epsilon_{it}$ from the step 3 regression,

$$\hat\sigma^2_{\tilde\epsilon} = \frac{1}{N\tilde{T}} \sum_{i=1}^{N} \sum_{t=2+\hat{p}_i}^{T} \left(\tilde{e}_{it} - \hat\delta\,\tilde{v}_{i,t-1}\right)^2$$

However, the standard t statistic diverges to negative infinity for models (2) and (3). Levin, Lin, and Chu (2002) therefore propose the following adjusted t statistic:

$$t^{*}_{\delta} = \frac{t_\delta - N\tilde{T}\,\hat{S}_N\,\hat\sigma^{-2}_{\tilde\epsilon}\,\hat\sigma_\delta\,\mu^{*}_{m\tilde{T}}}{\sigma^{*}_{m\tilde{T}}}$$

The mean and standard deviation adjustments $(\mu^{*}_{m\tilde{T}}, \sigma^{*}_{m\tilde{T}})$ depend on the time series dimension $\tilde{T}$ and the model specification $m$; their values can be found in Table 2 of Levin, Lin, and Chu (2002). The adjusted $t$ statistic converges to the standard normal distribution; therefore, standard normal critical values are used in hypothesis testing.
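Given the step 1 residuals and step 2 ratios, the step 3 computation can be sketched in NumPy. The function name and interface are assumptions, and the adjustment moments must be supplied by the caller from Table 2 of Levin, Lin, and Chu (2002); they are not reproduced here:

```python
import numpy as np

def llc_adjusted_t(e_list, v_list, s_hat, mu_star, sigma_star):
    """Step 3 sketch: pooled regression of e_tilde on v_tilde and the
    LLC adjusted t statistic. `e_list`/`v_list` hold each individual's
    normalized residuals from step 1, `s_hat` the step-2 ratios;
    `mu_star` and `sigma_star` are the Table 2 adjustment moments."""
    e = np.concatenate(e_list)
    v = np.concatenate(v_list)
    NT = len(e)                             # N * T_tilde observations
    delta = (e @ v) / (v @ v)               # pooled OLS slope
    s2 = np.sum((e - delta * v) ** 2) / NT  # variance of step-3 residuals
    se = np.sqrt(s2) / np.sqrt(v @ v)       # standard deviation of delta
    t_delta = delta / se
    S_N = np.mean(s_hat)                    # average std-deviation ratio
    t_star = (t_delta - NT * S_N * se * mu_star / s2) / sigma_star
    return t_delta, t_star
```

For example, with strongly mean-reverting residuals the standard $t$ statistic is negative, and `t_star` is then compared with standard normal critical values (the `mu_star` and `sigma_star` values passed in any example are illustrative placeholders, not the tabulated ones).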

Lag Order Selection in the ADF Regression

The methods of selecting the individual lag orders in the ADF regressions can be divided into two categories: selection based on information criteria and selection via sequential testing.

Lag Selection Based on Information Criteria

In this method, the following information criteria can be applied to lag order selection: Akaike's information criterion (AIC), the Schwarz Bayesian criterion (SBC), the Hannan-Quinn information criterion (HQIC or HQC), and the modified AIC (MAIC). As with other model selection applications, the lag order is selected from 0 to the maximum $p_{\max}$ to minimize the objective function plus a penalty term, which is a function of the number of parameters in the regression. Let $k$ be the number of parameters and $T_o$ be the number of effective observations. For regression models, the objective function is $T_o \log(\mathrm{SSR}/T_o)$, where SSR is the sum of squared residuals. For AIC, the penalty term equals $2k$. For SBC, it is $k\log(T_o)$. For HQIC, it is $2ck\log[\log(T_o)]$, where $c$ is a constant greater than 1.[8] For MAIC, the penalty term equals $2(\tau_T(k) + k)$, where

$$\tau_T(k) = (\mathrm{SSR}/T_o)^{-1}\,\hat\delta^2 \sum_{t=p_{\max}+2}^{T} y^2_{t-1}$$

and $\hat\delta$ is the estimated coefficient of the lagged dependent variable $y_{t-1}$ in the ADF regression.
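Criterion-based selection can be sketched as follows for a single series with an intercept-only ADF regression; AIC and SBC are shown, all candidate orders are fit on the common effective sample so that the criteria are comparable, and the function name is an assumption:

```python
import numpy as np

def select_lag_ic(y, p_max, criterion="sbc"):
    """Pick the ADF lag order by information criterion (sketch).
    All candidate orders 0..p_max are fit on the common sample
    t = p_max+2, ..., T so that the criteria are comparable."""
    dy = np.diff(y)
    rows = np.arange(p_max, len(dy))        # common effective sample
    T_o = len(rows)                         # number of effective observations
    best_p, best_ic = 0, np.inf
    for p in range(p_max + 1):
        cols = [y[rows]]                    # y_{t-1}
        cols += [dy[rows - L] for L in range(1, p + 1)]
        cols += [np.ones(T_o)]              # intercept (model 2)
        X = np.column_stack(cols)
        coef, *_ = np.linalg.lstsq(X, dy[rows], rcond=None)
        ssr = np.sum((dy[rows] - X @ coef) ** 2)
        k = X.shape[1]
        obj = T_o * np.log(ssr / T_o)       # objective function
        penalty = 2 * k if criterion == "aic" else k * np.log(T_o)
        if obj + penalty < best_ic:
            best_p, best_ic = p, obj + penalty
    return best_p
```

The MAIC penalty can be added in the same loop by computing $\tau_T(k)$ from the fitted coefficient of $y_{t-1}$.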

Lag Selection via Sequential Testing

In this method, the lag order estimation is based on the statistical significance of the estimated AR coefficients. Hall (1994) proposed the general-to-specific (GS) and specific-to-general (SG) modeling strategies. Levin, Lin, and Chu (2002) recommend the GS strategy, following Campbell and Perron (1991). In the GS modeling strategy, starting with the maximum lag order $p_{\max}$, a $t$ test on the coefficient of the largest lag is performed to determine whether a smaller lag order is preferred. Specifically, when the null hypothesis $\theta_{iL} = 0$ is not rejected at the given significance level (5%), a smaller lag order is preferred. This procedure continues until a statistically significant lag order is reached. The SG modeling strategy, in contrast, starts with lag order 0 and moves toward the maximum lag order $p_{\max}$.
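The GS strategy can be sketched as follows for one series, again with an intercept-only ADF regression. The function name and the use of the 1.96 critical value (approximate 5% two-sided normal) are illustrative assumptions:

```python
import numpy as np

def select_lag_gs(y, p_max, crit=1.96):
    """General-to-specific sketch: starting at p_max, drop the longest
    lag while its t statistic is insignificant (|t| < crit, roughly a
    5% level); return the first order whose longest lag is significant."""
    dy = np.diff(y)
    for p in range(p_max, 0, -1):
        rows = np.arange(p, len(dy))
        X = np.column_stack([y[rows]]                         # y_{t-1}
                            + [dy[rows - L] for L in range(1, p + 1)]
                            + [np.ones(len(rows))])           # intercept
        b, *_ = np.linalg.lstsq(X, dy[rows], rcond=None)
        u = dy[rows] - X @ b
        s2 = (u @ u) / (len(rows) - X.shape[1])
        XtX_inv = np.linalg.inv(X.T @ X)
        t_p = b[p] / np.sqrt(s2 * XtX_inv[p, p])  # t stat of the lag-p coef
        if abs(t_p) >= crit:
            return p                              # keep this order
    return 0
```

The SG variant simply reverses the loop, stopping at the first order whose added lag is insignificant.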

Long-Run Variance Estimation

The long-run variance of $\Delta y_{it}$ is estimated by an HAC-type estimator. For model (1), given the lag truncation parameter $\bar{K}$ and kernel weights $w_{\bar{K}L}$, the formula is

$$\hat\sigma^2_{yi} = \frac{1}{T-1}\sum_{t=2}^{T} \Delta y^2_{it} + 2\sum_{L=1}^{\bar{K}} w_{\bar{K}L} \left[\frac{1}{T-1}\sum_{t=2+L}^{T} \Delta y_{it}\,\Delta y_{i,t-L}\right]$$

To achieve consistency, the lag truncation parameter must satisfy $\bar{K}/T \to 0$ and $\bar{K} \to \infty$ as $T \to \infty$. Levin, Lin, and Chu (2002) suggest $\bar{K} = \lfloor 3.21\,T^{1/3} \rfloor$. The weights $w_{\bar{K}L}$ depend on the kernel function; specifically, $w_{\bar{K}L} = k(L/(\bar{K}+1))$ with kernel function $k(\cdot)$. Andrews (1991) proposes data-driven bandwidth (lag truncation parameter + 1, if integer-valued) selection procedures to minimize the asymptotic mean square error (MSE) criterion. For more information about the kernel functions and Andrews's (1991) data-driven bandwidth selection procedure, see the section Heteroscedasticity- and Autocorrelation-Consistent Covariance Matrices. Because Levin, Lin, and Chu (2002) truncate the bandwidth at an integer, specifying LLCBAND in the BANDWIDTH option corresponds to BANDWIDTH $= \lfloor 3.21\,T^{1/3} \rfloor + 1$.

For model (2), the series $\Delta y_{it}$ is first de-meaned individual by individual; that is, $\Delta y_{it}$ is replaced by $\Delta y_{it} - \overline{\Delta y}_i$, where $\overline{\Delta y}_i$ is the mean of $\Delta y_{it}$ for individual $i$. For model (3), which has individual fixed effects and a time trend, both the individual mean and the trend should be removed before the long-run variance is estimated. That is, you first regress $\Delta y_{it}$ on $\{1, t\}$ for each individual and save the residual $\widetilde{\Delta y}_{it}$, and then you replace $\Delta y_{it}$ with that residual.
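A minimal NumPy sketch of this estimator for a single individual follows, using the Bartlett kernel (one of several possible kernel choices, picked here for illustration) and the LLC truncation lag; the function name is an assumption:

```python
import numpy as np

def long_run_variance(dy, model=1):
    """HAC sketch of the long-run variance of Delta y for one individual,
    with the Bartlett kernel and LLC truncation K = floor(3.21*T^(1/3)).
    model=2 demeans; model=3 removes an intercept and linear trend."""
    T = len(dy) + 1                          # dy holds t = 2, ..., T
    if model == 2:
        dy = dy - dy.mean()
    elif model == 3:
        t = np.arange(len(dy), dtype=float)
        Z = np.column_stack([np.ones_like(t), t])
        dy = dy - Z @ np.linalg.lstsq(Z, dy, rcond=None)[0]
    K = int(np.floor(3.21 * T ** (1.0 / 3.0)))
    s2 = np.sum(dy ** 2) / (T - 1)           # short-run (lag-0) term
    for L in range(1, K + 1):
        w = 1.0 - L / (K + 1.0)              # Bartlett weight k(L/(K+1))
        gamma = np.sum(dy[L:] * dy[:-L]) / (T - 1)
        s2 += 2.0 * w * gamma
    return s2
```

For serially uncorrelated differences, the estimate should be close to the ordinary variance, since the weighted autocovariance terms are small.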

Cross-Sectional Dependence via Time-Specific Aggregate Effects

The Levin, Lin, and Chu (2002) testing procedure is based on the assumption of cross-sectional independence. It is possible to relax this assumption and allow for a limited degree of dependence via time-specific aggregate effects. Let $\theta_t$ denote the time-specific aggregate effects; then the data generating process becomes

$$\Delta y_{it} = \delta y_{i,t-1} + \sum_{L=1}^{p_i} \theta_{iL}\,\Delta y_{i,t-L} + \alpha_{mi} d_{mt} + \theta_t + \epsilon_{it}, \quad m = 4, 5$$

Two more models are considered: (4) $d_{1t} = \emptyset$ (the empty set), with no individual effects but with time effects; and (5) $d_{2t} = \{1\}$, in which the series $y_{it}$ has an individual-specific mean and a time-specific mean.

By subtracting the time averages $\bar{y}_t = (1/N)\sum_{i=1}^{N} y_{it}$ from the observed dependent variable $y_{it}$, or equivalently, by including the time-specific intercepts $\theta_t$ in the ADF regression, the cross-sectional dependence is removed. The impact of a single aggregate common factor that affects all individuals identically but changes over time can also be removed in this way. After the cross-sectional dependence is removed, the three-step procedure is applied to calculate the Levin, Lin, and Chu (2002) adjusted $t$ statistic.
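The time-demeaning step can be sketched for a balanced panel stored as an $N \times T$ array (the helper name is an assumption):

```python
import numpy as np

def remove_time_effects(Y):
    """Subtract the cross-sectional average at each date from a balanced
    N x T panel, removing a common time-specific effect theta_t."""
    return Y - Y.mean(axis=0, keepdims=True)
```

After this transformation, the cross-sectional mean of the panel is exactly zero at every date, so a common factor that shifts all individuals equally in a given period is eliminated.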

Deterministic Variables

Three deterministic variables can be included in the model for the first-stage estimation: CS_FixedEffects (cross-sectional fixed effects), TS_FixedEffects (time series fixed effects), and TimeTrend (individual linear time trend). When a linear time trend is included, the individual fixed effects are also included; otherwise, the time trend is not identified. Moreover, if the time series fixed effects are included, the time trend is again not identified. Therefore, there are five identified models: model (1), no deterministic variables; model (2), CS_FixedEffects; model (3), CS_FixedEffects and TimeTrend; model (4), TS_FixedEffects; and model (5), CS_FixedEffects and TS_FixedEffects. PROC PANEL outputs the test results for all five model specifications.

Im, Pesaran, and Shin Test

To test for a unit root in heterogeneous panels, Im, Pesaran, and Shin (2003) propose a standardized $\bar{t}$ ($t$-bar) test statistic based on averaging the (augmented) Dickey-Fuller statistics across the groups. The limiting distribution is standard normal. The stochastic process $y_{it}$ is generated by a first-order autoregressive process. With $\Delta y_{it} = y_{it} - y_{i,t-1}$, the data generating process can be expressed as in the Levin, Lin, and Chu (LLC) test,

$$\Delta y_{it} = \beta_i y_{i,t-1} + \sum_{j=1}^{p_i} \rho_{ij}\,\Delta y_{i,t-j} + \alpha_{mi} d_{mt} + \epsilon_{it}, \quad m = 1, 2, 3$$

where $p_i$ is the lag order in the ADF regression, as in the LLC test. In contrast to the data generating process in the LLC test, $\beta_i$ is allowed to differ across groups. The null hypothesis of unit roots is

$$H_0\colon \beta_i = 0 \quad \text{for all } i$$

against the heterogeneous alternative,

$$H_1\colon \beta_i < 0 \text{ for } i = 1, \ldots, N_1; \qquad \beta_i = 0 \text{ for } i = N_1 + 1, \ldots, N$$

The Im, Pesaran, and Shin (2003) test thus allows some (but not all) of the individual series to have unit roots under the alternative hypothesis. However, the fraction of the individual processes that are stationary must be positive: $\lim_{N\to\infty} N_1/N = \delta \in (0, 1]$. The $\bar{t}$ statistic, denoted $\bar{t}_{NT}$, is formed as a simple average of the individual $t$ statistics for testing the null hypothesis $\beta_i = 0$. If $t_{iT}(p_i, \beta_i)$ is the standard $t$ statistic, then

$$\bar{t}_{NT} = N^{-1}\sum_{i=1}^{N} t_{iT}(p_i, \beta_i)$$

If $T \to \infty$, then for each $i$ the $t$ statistic (without time trend) converges to the Dickey-Fuller distribution $\eta_i$, defined by

$$\eta_i = \frac{\tfrac{1}{2}\left\{[W_i(1)]^2 - 1\right\} - W_i(1)\int_0^1 W_i(u)\,du}{\int_0^1 [W_i(u)]^2\,du - \left[\int_0^1 W_i(u)\,du\right]^2}$$

where $W_i$ is standard Brownian motion. The limiting distribution is different when a time trend is included in the regression (Hamilton 1994, p. 499). The mean and variance of the limiting distributions are reported in Nabeya (1999). The standardized $\bar{t}$ statistic satisfies

$$Z_{\bar{t}}(p, \beta) = \frac{\sqrt{N}\left\{\bar{t}_{NT} - E(\eta)\right\}}{\sqrt{\mathrm{Var}(\eta)}} \Longrightarrow \mathcal{N}(0, 1)$$

where the standard normal is the sequential limit with $T \to \infty$ followed by $N \to \infty$. To obtain better finite-sample approximations, Im, Pesaran, and Shin (2003) propose standardizing the $\bar{t}$ statistic by the means and variances of $t_{iT}(p_i, 0)$ under the null hypothesis $\beta_i = 0$. The alternative standardized $\bar{t}$ statistic is

$$W_{\bar{t}}(p, \beta) = \frac{\sqrt{N}\left\{\bar{t}_{NT} - N^{-1}\sum_{i=1}^{N} E\left[t_{iT}(p_i, 0) \mid \beta_i = 0\right]\right\}}{\left\{N^{-1}\sum_{i=1}^{N} \mathrm{Var}\left[t_{iT}(p_i, 0) \mid \beta_i = 0\right]\right\}^{1/2}} \Longrightarrow \mathcal{N}(0, 1)$$

Im, Pesaran, and Shin (2003) simulate the values of $E[t_{iT}(p_i, 0) \mid \beta_i = 0]$ and $\mathrm{Var}[t_{iT}(p_i, 0) \mid \beta_i = 0]$ for different values of $T$ and $p$. The lag order in the ADF regression can be selected by the same methods as in Levin, Lin, and Chu (2002). For more information, see the section Lag Order Selection in the ADF Regression.
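Given the individual ADF $t$ statistics and the simulated moments from the IPS tables (which the caller must supply; they are not reproduced here), the $W_{\bar{t}}$ statistic is a straightforward average-and-standardize computation. A sketch with an assumed function name:

```python
import numpy as np

def ips_w_tbar(t_stats, means, variances):
    """Sketch of the IPS W_t-bar statistic: average the individual ADF
    t statistics and standardize with the simulated moments
    E[t_iT | beta_i=0] and Var[t_iT | beta_i=0], supplied per individual
    from the tables in Im, Pesaran, and Shin (2003)."""
    t_stats = np.asarray(t_stats, dtype=float)
    N = len(t_stats)
    t_bar = t_stats.mean()                       # the t-bar average
    num = np.sqrt(N) * (t_bar - np.mean(means))  # center by average mean
    den = np.sqrt(np.mean(variances))            # scale by average variance
    return num / den       # compare with N(0,1) critical values
```

The moment values used in any call are table lookups keyed by each individual's $T$ and $p_i$; any numbers shown in examples here are illustrative placeholders rather than the tabulated values.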

When $T$ is fixed, Im, Pesaran, and Shin (2003) assume serially uncorrelated errors ($p_i = 0$); $t_{iT}$ is likely to have a finite second moment, although this is not established in their paper. The $t$ statistic is modified by imposing the null hypothesis of a unit root. Let $\tilde\sigma_{iT}$ denote the estimated standard error from the restricted regression ($\beta_i = 0$); then

$$\tilde{\bar{t}}_{NT} = N^{-1}\sum_{i=1}^{N} \tilde{t}_{iT} = N^{-1}\sum_{i=1}^{N} \left[\hat\beta_{iT}\,(\boldsymbol{y}_{i,-1}' M_\tau \boldsymbol{y}_{i,-1})^{1/2} / \tilde\sigma_{iT}\right]$$

where $\hat\beta_{iT}$ is the OLS estimator of $\beta_i$ (in the unrestricted model), $\tau_T = (1, 1, \ldots, 1)'$, $M_\tau = I_T - \tau_T(\tau_T'\tau_T)^{-1}\tau_T'$, and $\boldsymbol{y}_{i,-1} = (y_{i0}, y_{i1}, \ldots, y_{i,T-1})'$, where $y_{i0}$ is a given initial value (fixed or random). Under the null hypothesis, the standardized $\tilde{\bar{t}}$ statistic converges to a standard normal variate,

$$Z_{\tilde{\bar{t}}} = \frac{\sqrt{N}\left\{\tilde{\bar{t}}_{NT} - E(\tilde{t}_{T})\right\}}{\sqrt{\mathrm{Var}(\tilde{t}_{T})}} \Longrightarrow \mathcal{N}(0, 1)$$

where $E(\tilde{t}_{T})$ and $\mathrm{Var}(\tilde{t}_{T})$ are the mean and variance of $\tilde{t}_{iT}$, respectively. The limit is taken as $N \to \infty$ with $T$ fixed. Their values are simulated for finite samples without a time trend. $Z_{\tilde{\bar{t}}}$ is also likely to converge to the standard normal distribution.

When $N$ and $T$ are both finite, an exact test that assumes no serial correlation can be used. The critical values of $\bar{t}_{NT}$ and $\tilde{\bar{t}}_{NT}$ are simulated.

As in the section Levin, Lin, and Chu Test, it is possible to relax the assumption of cross-sectional independence and allow for a limited degree of dependence via time-specific aggregate effects, which yields two more models (model 4 and model 5) with time fixed effects. For more information, see the section Cross-Sectional Dependence via Time-Specific Aggregate Effects.

Combination Tests

Maddala and Wu (1999) and Choi (2001) propose combining the observed significance levels ($p$-values) from $N$ independent tests of the unit root null hypothesis. Suppose $G_i$ is the test statistic for testing the unit root null hypothesis for individual $i = 1, \ldots, N$, and $F(\cdot)$ is the cumulative distribution function (CDF) of its asymptotic distribution as $T \to \infty$. Then the asymptotic $p$-value is defined as

$$p_i = F(G_i)$$

There are different ways to combine these $p$-values. The first is the inverse chi-square test (Fisher 1932), referred to as the $P$ test in Choi (2001) and the $\lambda$ test in Maddala and Wu (1999):

$$P = -2\sum_{i=1}^{N} \ln(p_i)$$

When the test statistics $\{G_i\}_{i=1,\ldots,N}$ are continuous, $\{p_i\}_{i=1,\ldots,N}$ are independent uniform$(0, 1)$ variables. Therefore, $P \Rightarrow \chi^2_{2N}$ as $T \to \infty$ with $N$ fixed. But as $N \to \infty$, $P$ diverges to infinity in probability, so it is not applicable for large $N$. To obtain a nondegenerate limiting distribution, the $P$ test (the Fisher test with $N \to \infty$) is modified to

$$P_m = \sum_{i=1}^{N} \left(-2\ln(p_i) - 2\right) \big/ \left(2\sqrt{N}\right) = -\sum_{i=1}^{N} \left(\ln(p_i) + 1\right) \big/ \sqrt{N}$$

Under the null hypothesis, as $T_i \to \infty$[9] and then $N \to \infty$, $P_m \Rightarrow \mathcal{N}(0, 1)$.[10]

The second way of combining individual p-values is the inverse normal test,

$$Z = \frac{1}{\sqrt{N}}\sum_{i=1}^{N} \Phi^{-1}(p_i)$$

where $\Phi(\cdot)$ is the standard normal CDF. When $T_i \to \infty$ with $N$ fixed, $Z \Rightarrow \mathcal{N}(0, 1)$. When $N$ and $T_i$ are both large, the sequential limit is also standard normal if $T_i \to \infty$ first and $N \to \infty$ next.

The third way of combining p-values is the logit test,

$$L^{*} = \sqrt{k}\,L = \sqrt{k} \sum_{i=1}^{N} \ln\left(\frac{p_i}{1 - p_i}\right)$$

where $k = 3(5N+4)/\left(\pi^2 N(5N+2)\right)$. When $T_i \to \infty$ and $N$ is fixed, $L^{*} \Rightarrow t_{5N+4}$; that is, the limiting distribution is the $t$ distribution with $5N+4$ degrees of freedom. The sequential limit is $L^{*} \Rightarrow N(0,1)$ as $T_i \to \infty$ and then $N \to \infty$. Simulation results in Choi (2001) suggest that the $Z$ test outperforms the other combination tests. For the time series unit root test $G_i$, Maddala and Wu (1999) apply the augmented Dickey-Fuller test. According to Choi (2006), the Elliott, Rothenberg, and Stock (1996) Dickey-Fuller generalized least squares (DF-GLS) test offers significant size and power advantages in finite samples.
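As an illustration, the three combination statistics can be computed directly from a set of individual p-values. The following is a minimal sketch, not the procedure's implementation; the function name and the sample p-values are hypothetical, and the individual p-values are assumed to come from a time series unit root test such as the ADF test.

```python
# Sketch of the p-value combination statistics P_m, Z, and L* for a panel
# unit root test, given individual unit root test p-values p_i.
import numpy as np
from scipy import stats

def combination_stats(pvals):
    """Return (P_m, Z, L_star) computed from individual p-values."""
    p = np.asarray(pvals, dtype=float)
    N = p.size
    # Modified Fisher statistic: P_m = -sum(ln p_i + 1)/sqrt(N) -> N(0,1)
    P_m = -np.sum(np.log(p) + 1.0) / np.sqrt(N)
    # Inverse normal statistic: Z = sum(Phi^{-1}(p_i))/sqrt(N) -> N(0,1)
    Z = np.sum(stats.norm.ppf(p)) / np.sqrt(N)
    # Logit statistic: L* = sqrt(k) * sum(ln(p_i/(1-p_i))) -> t_{5N+4}
    k = 3.0 * (5 * N + 4) / (np.pi**2 * N * (5 * N + 2))
    L_star = np.sqrt(k) * np.sum(np.log(p / (1.0 - p)))
    return P_m, Z, L_star

# Hypothetical individual p-values for N = 5 cross sections
P_m, Z, L_star = combination_stats([0.01, 0.20, 0.03, 0.50, 0.08])
```

Small p-values push $P_m$ up and $Z$ and $L^{*}$ down, so all three statistics reject the unit root null in the same direction.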

As in the section Levin, Lin, and Chu Test, it is possible to relax this assumption of cross-sectional independence and allow for a limited degree of dependence via time-specific aggregate effects. In that section, two more models (model 4 and model 5) with time fixed effects are considered. For more information, see the section Cross-Sectional Dependence via Time-Specific Aggregate Effects.

Breitung’s Unbiased Tests

To account for the nonzero mean of the $t$ statistic in the OLS detrending case, bias-adjusted $t$ statistics are proposed by Levin, Lin, and Chu (2002) and Im, Pesaran, and Shin (2003). The bias corrections, however, imply a severe loss of power. Breitung and his associates take an alternative approach that avoids the bias by using alternative estimates of the deterministic terms (Breitung and Meyer 1994; Breitung 2000; Breitung and Das 2005). The data generating process is the same as in the Im, Pesaran, and Shin (IPS) test (2003), and the three models described in the section Levin, Lin, and Chu Test are considered. When serial correlation is absent, for model (2) with individual-specific means, the constant terms are estimated by the initial values $y_{i1}$. Therefore, the series $y_{it}$ is adjusted by subtracting the initial value, and the equation becomes

$$\Delta y_{it} = \delta^{*}\left(y_{i,t-1} - y_{i1}\right) + v_{it}$$

For model (3) with individual-specific means and time trends, the time trend can be estimated by $\hat{\beta}_i = (y_{iT} - y_{i1})/(T-1)$. The levels can be transformed as

$$\tilde{y}_{it} = y_{it} - y_{i1} - \hat{\beta}_i t = y_{it} - y_{i1} - \frac{t\left(y_{iT} - y_{i1}\right)}{T-1}$$

The Helmert transformation is applied to the dependent variable to remove the mean of the differenced variable:

$$\Delta y_{it}^{*} = \sqrt{\frac{T-t}{T-t+1}} \left(\Delta y_{it} - \frac{\Delta y_{i,t+1} + \cdots + \Delta y_{iT}}{T-t}\right)$$

The transformed model is

$$\Delta y_{it}^{*} = \delta^{*}\,\tilde{y}_{i,t-1} + v_{it}$$

The pooled $t$ statistic has a standard normal distribution, so no adjustment of the $t$ statistic is needed. To adjust for heteroscedasticity across cross sections, Breitung (2000) proposes a UB (unbiased) statistic based on the transformed data,

$$UB = \frac{\sum_{i=1}^{N} \sum_{t=2}^{T} \Delta y_{it}^{*}\,\tilde{y}_{i,t-1} / \sigma_i^{2}}{\left(\sum_{i=1}^{N} \sum_{t=2}^{T} \tilde{y}_{i,t-1}^{2} / \sigma_i^{2}\right)^{1/2}}$$

where $\sigma_i^{2} = E\left(\Delta y_{it} - \beta_i\right)^2$. When $\sigma_i^{2}$ is unknown, it can be estimated as

$$\hat{\sigma}_i^{2} = \frac{1}{T-2} \sum_{t=2}^{T} \left(\Delta y_{it} - \frac{\sum_{t=2}^{T} \Delta y_{it}}{T-1}\right)^{2}$$

The UB statistic has a standard normal limiting distribution as $T \to \infty$ followed by $N \to \infty$ sequentially.
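Under these definitions, the UB statistic for model (3) can be sketched in a few lines. This is an illustrative implementation under the no-serial-correlation assumption, not the procedure's own code; the function name and the (N, T) array layout are assumptions.

```python
# Sketch of Breitung's (2000) UB statistic for model (3) (trend case) when
# serial correlation is absent. Y is an (N, T) array of level data.
import numpy as np

def breitung_ub(Y):
    N, T = Y.shape
    dY = np.diff(Y, axis=1)                   # dY[:, j] = Delta y_{i, j+2}
    times = np.arange(1, T + 1)
    beta = (Y[:, -1] - Y[:, 0]) / (T - 1)     # slope (y_iT - y_i1)/(T - 1)
    Ytil = Y - Y[:, [0]] - beta[:, None] * times   # tilde y_{it}
    sig2 = dY.var(axis=1, ddof=1)             # hat sigma_i^2 (divisor T - 2)
    num, den = 0.0, 0.0
    for j in range(T - 2):                    # t = j + 2 runs over 2..T-1
        t = j + 2
        c = np.sqrt((T - t) / (T - t + 1.0))  # Helmert weight
        fwd_mean = dY[:, j + 1:].mean(axis=1) # mean of Delta y_{i,t+1..T}
        dstar = c * (dY[:, j] - fwd_mean)     # Delta y*_{it}
        ylag = Ytil[:, t - 2]                 # tilde y_{i,t-1}
        num += np.sum(dstar * ylag / sig2)
        den += np.sum(ylag**2 / sig2)
    return num / np.sqrt(den)
```

The Helmert transformation uses only future differences, so the last time period carries no transformed observation and the sums effectively run over $t = 2, \ldots, T-1$.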
To account for short-run dynamics, Breitung and Das (2005) suggest applying the test to the prewhitened series $\hat{y}_{it}$. For model (1) and model (2) (the constant-only case), they suggest the same method as in step 1 of the Levin, Lin, and Chu (LLC) test (2002).[11] For model (3) (with a constant and a linear time trend), the prewhitened series can be obtained by running the following restricted ADF regression under the null hypothesis of a unit root ($\delta = 0$) and no intercept or linear time trend ($\mu_i = 0$, $\beta_i = 0$),

$$\Delta y_{it} = \sum_{L=1}^{\hat{p}_i} \theta_{iL}\,\Delta y_{i,t-L} + \mu_i + \epsilon_{it}$$

where $\hat{p}_i$ is a consistent estimator of the true lag order $p_i$ and can be estimated by the procedures listed in the section Lag Order Selection in the ADF Regression. For the LLC and IPS tests, the lag orders are selected by running the ADF regressions; for the tests of Breitung and his associates, the restricted ADF regressions are used instead, to be consistent with the prewhitening method. Let $(\hat{\mu}_i, \hat{\theta}_{iL})$ be the estimated coefficients.[12] The prewhitened series can be obtained by

$$\Delta \hat{y}_{it} = \Delta y_{it} - \sum_{L=1}^{\hat{p}_i} \hat{\theta}_{iL}\,\Delta y_{i,t-L}$$

and

$$\hat{y}_{it} = y_{it} - \sum_{L=1}^{\hat{p}_i} \hat{\theta}_{iL}\,y_{i,t-L}$$
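For a single series, the prewhitening step amounts to filtering both the differences and the levels with the estimated lag coefficients. The following sketch assumes the coefficients $\hat{\theta}_{iL}$ have already been obtained from the restricted ADF regression; the function name is illustrative.

```python
# Prewhiten one series y (length T) given AR coefficients theta estimated
# from the restricted ADF regression (theta is assumed given).
import numpy as np

def prewhiten(y, theta):
    """Return (dy_hat, y_hat), the prewhitened differences and levels."""
    y = np.asarray(y, dtype=float)
    theta = np.asarray(theta, dtype=float)
    p = len(theta)
    dy = np.diff(y)
    dy_hat = dy[p:].copy()           # Delta y_hat_t, valid once p lags exist
    y_hat = y[p:].copy()             # y_hat_t
    for L in range(1, p + 1):
        dy_hat -= theta[L - 1] * dy[p - L:-L]  # subtract theta_L * Delta y_{t-L}
        y_hat -= theta[L - 1] * y[p - L:-L]    # subtract theta_L * y_{t-L}
    return dy_hat, y_hat
```

The first $p$ observations are lost to the filter, which is why both returned series start at $t = p + 1$.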

The transformed series are random walks under the null hypothesis,

$$\Delta \hat{y}_{it} = \delta\,\hat{y}_{i,t-1} + v_{it}$$

where $y_{is} = 0$ for $s < 0$. When the cross-sectional units are independent, the $t$ statistic converges to standard normal under the null as $T \to \infty$ followed by $N \to \infty$,

$$t_{\mathrm{OLS}} = \frac{\sum_{i=1}^{N} \sum_{t=2}^{T} y_{i,t-1}\,\Delta y_{it}}{\hat{\sigma} \sqrt{\sum_{i=1}^{N} \sum_{t=2}^{T} y_{i,t-1}^{2}}} \Longrightarrow N(0,1)$$

where $\hat{\sigma}^2 = \sum_{i=1}^{N} \sum_{t=2}^{T} \left(\Delta y_{it} - \hat{\delta} y_{i,t-1}\right)^2 / \left[N(T-1)\right]$ and $\hat{\delta}$ is the OLS estimator.
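A minimal sketch of this pooled $t$ statistic, assuming the series have already been prewhitened and stored in an (N, T) array (the function name is illustrative):

```python
# Pooled OLS t statistic for delta in Delta y_{it} = delta * y_{i,t-1} + v_{it}.
import numpy as np

def pooled_t_ols(Y):
    N, T = Y.shape
    ylag = Y[:, :-1].ravel()                # y_{i,t-1} for t = 2..T
    dy = np.diff(Y, axis=1).ravel()         # Delta y_{it}
    delta = ylag @ dy / (ylag @ ylag)       # pooled OLS estimator of delta
    resid = dy - delta * ylag
    sigma2 = resid @ resid / (N * (T - 1))  # hat sigma^2
    return (ylag @ dy) / (np.sqrt(sigma2) * np.sqrt(ylag @ ylag))
```

Stacking all cross sections into one long regression is what makes the statistic "pooled": a single $\delta$ is estimated from all $N(T-1)$ observations.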
To account for cross-sectional dependence, Breitung and Das (2005) propose a robust $t$ statistic and a GLS version of the test statistic. Let $v_t = (v_{1t}, \ldots, v_{Nt})'$ be the error vector for time $t$, and let $\Omega = E(v_t v_t')$ be a positive definite matrix with eigenvalues $\lambda_1 \ge \cdots \ge \lambda_N$. Let $y_t = (y_{1t}, \ldots, y_{Nt})'$ and $\Delta y_t = (\Delta y_{1t}, \ldots, \Delta y_{Nt})'$. The model can be written as a system of equations of the seemingly unrelated regressions (SUR) type:

$$\Delta y_t = \delta\,y_{t-1} + v_t$$

The unknown covariance matrix normal upper Omega can be estimated by its sample counterpart,

$$\hat{\Omega} = \frac{1}{T-1} \sum_{t=2}^{T} \left(\Delta y_t - \hat{\delta} y_{t-1}\right)\left(\Delta y_t - \hat{\delta} y_{t-1}\right)'$$

The sequential limit ($T \to \infty$ followed by $N \to \infty$) of the standard $t$ statistic $t_{\mathrm{OLS}}$ is normal with mean 0 and variance $v_{\Omega} = \lim_{N \to \infty} \operatorname{tr}\left(\Omega^2/N\right) / \left(\operatorname{tr}(\Omega)/N\right)^2$. The variance $v_{\Omega}$ can be consistently estimated by $\hat{v}_{\hat{\delta}} = \left(\sum_{t=2}^{T} y_{t-1}' \hat{\Omega} y_{t-1}\right) / \left(\sum_{t=2}^{T} y_{t-1}' y_{t-1}\right)^2$. Thus the robust $t$ statistic can be calculated as

$$t_{\mathrm{rob}} = \frac{\hat{\delta}}{\sqrt{\hat{v}_{\hat{\delta}}}} = \frac{\sum_{t=2}^{T} y_{t-1}'\,\Delta y_t}{\sqrt{\sum_{t=2}^{T} y_{t-1}' \hat{\Omega}\,y_{t-1}}} \Longrightarrow N(0,1)$$

as $T \to \infty$ followed by $N \to \infty$ under the null hypothesis of a random walk. Because the finite-sample distribution can be quite different, Breitung and Das (2005) list the 1%, 5%, and 10% critical values for various values of $N$.
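The robust statistic can be sketched directly from these formulas. In this illustrative implementation (function name and layout assumed), the data are arranged as a (T, N) array so that each row is the vector $y_t$:

```python
# Robust t statistic of Breitung and Das (2005) under cross-sectional
# dependence; a minimal sketch. Y is (T, N) with rows indexed by time.
import numpy as np

def robust_t(Y):
    T, N = Y.shape
    ylag = Y[:-1, :]                             # y_{t-1}, t = 2..T
    dy = np.diff(Y, axis=0)                      # Delta y_t
    delta = np.sum(ylag * dy) / np.sum(ylag**2)  # pooled OLS estimator
    resid = dy - delta * ylag                    # residual vectors v_t
    Omega = resid.T @ resid / (T - 1)            # hat Omega, N x N
    num = np.sum(ylag * dy)                      # sum_t y'_{t-1} Delta y_t
    den = np.sqrt(np.einsum('ti,ij,tj->', ylag, Omega, ylag))
    return num / den
```

The `einsum` call evaluates $\sum_{t} y_{t-1}' \hat{\Omega} y_{t-1}$ without forming intermediate matrices explicitly.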

When $T > N$, a (feasible) GLS estimator can be applied; it is asymptotically more efficient than the OLS estimator. The data are transformed by premultiplying by $\hat{\Omega}^{-1/2}$, with $\hat{\Omega}$ as defined before: $\hat{z}_t = \hat{\Omega}^{-1/2} y_t$. Thus the model is transformed into

$$\Delta \hat{z}_t = \delta\,\hat{z}_{t-1} + e_t$$

The feasible GLS (FGLS) estimator of $\delta$ and the corresponding $t$ statistic are obtained by estimating the transformed model by OLS; they are denoted by $\hat{\delta}_{\mathrm{GLS}}$ and $t_{\mathrm{GLS}}$, respectively:

$$t_{\mathrm{GLS}} = \frac{\sum_{t=2}^{T} y_{t-1}' \hat{\Omega}^{-1} \Delta y_t}{\sqrt{\sum_{t=2}^{T} y_{t-1}' \hat{\Omega}^{-1} y_{t-1}}} \Longrightarrow N(0,1)$$
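A matching sketch of the FGLS statistic; it presumes $T > N$ so that $\hat{\Omega}$ is invertible (again, the function name and layout are illustrative):

```python
# Feasible GLS t statistic of Breitung and Das (2005); requires T > N so
# that hat Omega is invertible. Y is (T, N); a minimal sketch.
import numpy as np

def gls_t(Y):
    T, N = Y.shape
    assert T > N, "FGLS requires T > N"
    ylag = Y[:-1, :]
    dy = np.diff(Y, axis=0)
    delta = np.sum(ylag * dy) / np.sum(ylag**2)  # OLS estimator for residuals
    resid = dy - delta * ylag
    Omega = resid.T @ resid / (T - 1)            # hat Omega
    Oinv = np.linalg.inv(Omega)
    num = np.einsum('ti,ij,tj->', ylag, Oinv, dy)   # sum_t y'_{t-1} Omega^{-1} Dy_t
    den = np.sqrt(np.einsum('ti,ij,tj->', ylag, Oinv, ylag))
    return num / den
```

Weighting by $\hat{\Omega}^{-1}$ downweights the directions of strongest cross-sectional correlation, which is the source of the efficiency gain over OLS.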

As in the section Levin, Lin, and Chu Test, it is possible to relax this assumption of cross-sectional independence and allow for a limited degree of dependence via time-specific aggregate effects. In that section, two more models (model 4 and model 5) with time fixed effects are considered. For more information, see the section Cross-Sectional Dependence via Time-Specific Aggregate Effects.

Hadri Stationarity Test

Hadri (2000) adopts a components representation in which an individual time series is written as the sum of a deterministic trend, a random walk, and a white-noise disturbance. Under the null hypothesis of stationarity, the variance of the random walk is 0. Specifically, two models are considered:

  • For model (1), the time series $y_{it}$ is stationary around a level $r_{i0}$,

    $$y_{it} = r_{it} + \epsilon_{it}, \quad i = 1, \ldots, N,\; t = 1, \ldots, T$$
  • For model (2), $y_{it}$ is trend stationary,

    $$y_{it} = r_{it} + \beta_i t + \epsilon_{it}, \quad i = 1, \ldots, N,\; t = 1, \ldots, T$$

    where $r_{it}$ is the random walk component,

    $$r_{it} = r_{i,t-1} + u_{it}, \quad i = 1, \ldots, N,\; t = 1, \ldots, T$$

    The initial values of the random walks, $\{r_{i0}\}_{i=1,\ldots,N}$, are assumed to be fixed unknowns and can be considered heterogeneous intercepts. The errors $\epsilon_{it}$ and $u_{it}$ satisfy $\epsilon_{it} \sim \mathrm{iid}\; N(0, \sigma_{\epsilon}^2)$ and $u_{it} \sim \mathrm{iid}\; N(0, \sigma_u^2)$ and are mutually independent.

The null hypothesis of stationarity is $H_0\colon \sigma_u^2 = 0$, against the random walk alternative $H_1\colon \sigma_u^2 > 0$.

In matrix form, the models can be written as

$$y_i = X_i \beta_i + e_i$$

where $y_i' = (y_{i1}, \ldots, y_{iT})$; $e_i' = (e_{i1}, \ldots, e_{iT})$, with $e_{it} = \sum_{j=1}^{t} u_{ij} + \epsilon_{it}$; and $X_i = (\iota_T, a_T)$, where $\iota_T$ is a $T \times 1$ vector of ones, $a_T' = (1, \ldots, T)$, and $\beta_i' = (r_{i0}, \beta_i)$.

Let $\hat{\epsilon}_{it}$ be the residuals from the regression of $y_i$ on $X_i$; then the LM statistic is

$$LM = \frac{\sum_{i=1}^{N} \frac{1}{T^2} \sum_{t=1}^{T} S_{it}^{2}}{N \hat{\sigma}_{\epsilon}^{2}}$$

where $S_{it} = \sum_{j=1}^{t} \hat{\epsilon}_{ij}$ is the partial sum of the residuals and $\hat{\sigma}_{\epsilon}^2$ is a consistent estimator of $\sigma_{\epsilon}^2$ under the null hypothesis of stationarity. Under some regularity conditions,

$$LM \xrightarrow{p} E\left[\int_0^1 V^2(r)\,dr\right]$$

where $V(r)$ is a standard Brownian bridge in model (1) and a second-level Brownian bridge in model (2). Let $W(r)$ be a standard Wiener process (Brownian motion); then

$$V(r) = \begin{cases} W(r) - rW(1) & \text{for model (1)} \\ W(r) + \left(2r - 3r^2\right)W(1) + 6r(r-1)\int_0^1 W(s)\,ds & \text{for model (2)} \end{cases}$$

The mean and variance of the random variable $\int_0^1 V^2(r)\,dr$ can be calculated by using the characteristic functions,

$$\xi = E\left[\int_0^1 V^2(r)\,dr\right] = \begin{cases} \frac{1}{6} & \text{for model (1)} \\ \frac{1}{15} & \text{for model (2)} \end{cases}$$

and

$$\zeta^2 = \mathrm{Var}\left[\int_0^1 V^2(r)\,dr\right] = \begin{cases} \frac{1}{45} & \text{for model (1)} \\ \frac{11}{6300} & \text{for model (2)} \end{cases}$$

The LM statistic can be standardized to obtain the standard normal limiting distribution,

$$Z = \frac{\sqrt{N}\left(LM - \xi\right)}{\zeta} \Longrightarrow N(0,1)$$
Consistent Estimator of $\sigma_{\epsilon}^2$

Hadri’s (2000) test can be applied in the general case of heteroscedastic and serially correlated disturbances. Under homoscedasticity and serially uncorrelated errors, $\sigma_{\epsilon}^2$ can be estimated as

$$\hat{\sigma}_{\epsilon}^{2} = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T} \hat{\epsilon}_{it}^{2}}{N(T-k)}$$

where $k$ is the number of regressors: $k = 1$ for model (1) and $k = 2$ for model (2).
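For model (1) with homoscedastic errors, the LM statistic and its standardized version can be sketched as follows (illustrative function name; Y is an (N, T) panel):

```python
# Hadri (2000) LM statistic and standardized Z for model (1) in the
# homoscedastic case; a minimal sketch, not the procedure's implementation.
import numpy as np

def hadri_z(Y):
    N, T = Y.shape
    resid = Y - Y.mean(axis=1, keepdims=True)   # residuals from a constant-only fit
    S = np.cumsum(resid, axis=1)                # partial sums S_{it}
    k = 1                                       # one regressor (the constant)
    sigma2 = np.sum(resid**2) / (N * (T - k))   # hat sigma_eps^2
    LM = np.sum(S**2) / (T**2 * N * sigma2)
    xi, zeta2 = 1.0 / 6.0, 1.0 / 45.0           # moments of int V^2 for model (1)
    return np.sqrt(N) * (LM - xi) / np.sqrt(zeta2)
```

Because the null here is stationarity, large positive values of the statistic reject in favor of a random walk, the opposite direction from the unit root tests above.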

When the errors are heteroscedastic across individuals, the variances $\sigma_{\epsilon,i}^2$ can be estimated by $\hat{\sigma}_{\epsilon,i}^2 = \sum_{t=1}^{T} \hat{\epsilon}_{it}^2 / (T-k)$ for each individual $i$, and the LM statistic must be modified to

$$LM = \frac{1}{N} \sum_{i=1}^{N} \left(\frac{\frac{1}{T^2} \sum_{t=1}^{T} S_{it}^{2}}{\hat{\sigma}_{\epsilon,i}^{2}}\right)$$

To allow for temporal dependence over $t$, $\sigma_{\epsilon}^2$ must be replaced by the long-run variance of $\epsilon_{it}$, which is defined as $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \lim_{T \to \infty} T^{-1} E\left(S_{iT}^2\right)$. An HAC estimator can be used to estimate the long-run variance $\sigma^2$ consistently. For more information, see the section Long-Run Variance Estimation.
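As a hedged sketch of HAC estimation for a single residual series, a Bartlett-kernel (Newey-West type) long-run variance estimator can be written as follows; the function name and the bandwidth choice are illustrative, and the section Long-Run Variance Estimation describes the estimators the procedure actually supports.

```python
# Bartlett-kernel long-run variance of one residual series e_t;
# a minimal sketch with an illustrative bandwidth q.
import numpy as np

def long_run_variance(e, q):
    e = np.asarray(e, dtype=float)
    T = len(e)
    lrv = e @ e / T                        # gamma_0, the lag-0 autocovariance
    for j in range(1, q + 1):
        w = 1.0 - j / (q + 1.0)            # Bartlett weight, declines to 0
        gamma_j = e[j:] @ e[:-j] / T       # autocovariance at lag j
        lrv += 2.0 * w * gamma_j
    return lrv
```

The Bartlett weights guarantee a nonnegative estimate, which matters here because the estimate appears in the denominator of the LM statistic.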

As in the section Levin, Lin, and Chu Test, it is possible to relax this assumption of cross-sectional independence and allow for a limited degree of dependence via time-specific aggregate effects. In that section, two more models (model 4 and model 5) with time fixed effects are considered. For more information, see the section Cross-Sectional Dependence via Time-Specific Aggregate Effects.

Harris and Tzavalis Panel Unit Root Test

Harris and Tzavalis (1999) derive a panel unit root test for fixed $T$ and large $N$. Five models are considered, as in Levin, Lin, and Chu (2002). Model (1) is the homogeneous panel,

$$y_{it} = \phi\,y_{i,t-1} + v_{it}$$

Under the null hypothesis, $\phi = 1$. For model (2), each series is a unit root process with a heterogeneous drift,

$$y_{it} = \alpha_i + \phi\,y_{i,t-1} + v_{it}$$

Model (3) includes heterogeneous drifts and linear time trends,

$$y_{it} = \alpha_i + \beta_i t + \phi\,y_{i,t-1} + v_{it}$$

As in the section Levin, Lin, and Chu Test, it is possible to relax this assumption of cross-sectional independence and allow for a limited degree of dependence via time-specific aggregate effects. In that section, two more models (model 4 and model 5) with time fixed effects are considered. For more information, see the section Cross-Sectional Dependence via Time-Specific Aggregate Effects.

Let $\hat{\phi}$ be the OLS estimator of $\phi$; then

$$\hat{\phi} - 1 = \left[\sum_{i=1}^{N} y_{i,-1}' Q_T\, y_{i,-1}\right]^{-1} \left[\sum_{i=1}^{N} y_{i,-1}' Q_T\, v_i\right]$$

where $y_{i,-1} = (y_{i0}, \ldots, y_{i,T-1})'$, $v_i' = (v_{i1}, \ldots, v_{iT})$, and $Q_T$ is the projection matrix. For model (1), there are no regressors other than the lagged dependent variable, so $Q_T$ is the identity matrix $I_T$. For model (2), a constant is included, so $Q_T = I_T - e_T e_T'/T$, where $e_T$ is a $T \times 1$ column of ones. For model (3), a constant and a time trend are included; thus $Q_T = I_T - Z_T\left(Z_T' Z_T\right)^{-1} Z_T'$, where $Z_T = (e_T, \tau_T)$ and $\tau_T = (1, \ldots, T)'$.

When $y_{i0} = 0$ in model (1), under the null hypothesis, as $N \to \infty$,

$$\sqrt{\frac{NT(T-1)}{2}}\left(\hat{\phi} - 1\right) \xrightarrow{\,y_{i0}=0,\;H_0\,} N(0,1)$$

As $T \to \infty$, this becomes $T\sqrt{N}\left(\hat{\phi} - 1\right) \Longrightarrow N(0,2)$ under $H_0$.

When the drift is absent in model (2) ($\alpha_i = 0$), under the null hypothesis, as $N \to \infty$,

$$\sqrt{\frac{5N(T+1)^3(T-1)}{3\left(17T^2 - 20T + 17\right)}}\left(\hat{\phi} - 1 + \frac{3}{T+1}\right) \xrightarrow{\,\alpha_i=0,\;H_0\,} N(0,1)$$

As $T \to \infty$, $\left(T\sqrt{N}\left(\hat{\phi} - 1\right) + 3\sqrt{N}\right)/\sqrt{51/5} \Longrightarrow N(0,1)$ under $H_0$.

When the time trend is absent in model (3) ($\beta_i = 0$), under the null hypothesis, as $N \to \infty$,

$$\sqrt{\frac{112N(T+2)^3(T-2)}{15\left(193T^2 - 728T + 1147\right)}}\left[\hat{\phi} - 1 + \frac{15}{2(T+2)}\right] \xrightarrow{\,\beta_i=0,\;H_0\,} N(0,1)$$

As $T \to \infty$, $\left(T\sqrt{N}\left(\hat{\phi} - 1\right) + 7.5\sqrt{N}\right)/\sqrt{2895/112} \Longrightarrow N(0,1)$ under $H_0$.
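The three standardized statistics can be sketched directly from the projection formulas above. This is an illustrative implementation (function name and array layout assumed); Y holds $y_{i0}, \ldots, y_{iT}$ for each cross section.

```python
# Harris-Tzavalis (1999) fixed-T standardized statistics for models (1)-(3);
# a minimal sketch. Y is (N, T+1) and holds y_{i0}, ..., y_{iT}.
import numpy as np

def ht_stat(Y, model=1):
    N, Tp1 = Y.shape
    T = Tp1 - 1
    ylag = Y[:, :-1]                       # y_{i0}, ..., y_{i,T-1}
    ycur = Y[:, 1:]                        # y_{i1}, ..., y_{iT}
    if model == 1:
        Q = np.eye(T)                      # no deterministic regressors
    elif model == 2:
        e = np.ones((T, 1))
        Q = np.eye(T) - e @ e.T / T        # demeaning projection
    else:
        Z = np.column_stack([np.ones(T), np.arange(1.0, T + 1)])
        Q = np.eye(T) - Z @ np.linalg.solve(Z.T @ Z, Z.T)  # detrending projection
    phi = (np.einsum('it,ts,is->', ylag, Q, ycur)
           / np.einsum('it,ts,is->', ylag, Q, ylag))       # pooled OLS estimator
    if model == 1:
        return np.sqrt(N * T * (T - 1) / 2.0) * (phi - 1.0)
    if model == 2:
        c = np.sqrt(5.0 * N * (T + 1) ** 3 * (T - 1)
                    / (3.0 * (17.0 * T**2 - 20.0 * T + 17.0)))
        return c * (phi - 1.0 + 3.0 / (T + 1.0))
    c = np.sqrt(112.0 * N * (T + 2) ** 3 * (T - 2)
                / (15.0 * (193.0 * T**2 - 728.0 * T + 1147.0)))
    return c * (phi - 1.0 + 15.0 / (2.0 * (T + 2.0)))
```

Note that the corrections $3/(T+1)$ and $15/(2(T+2))$ recentre $\hat{\phi}$, reflecting the downward bias of the OLS estimator when deterministic terms are estimated with $T$ fixed.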



[8] In practice $c$ is set to 1, following the literature (Hannan and Quinn 1979; Hall 1994).

[9] The time series length $T$ is subscripted by $i = 1, \ldots, N$ because the panel can be unbalanced.

[10] Choi (2001) also points out that the joint limit, in which $N$ and $\{T_i\}_{i=1,\ldots,N}$ go to infinity simultaneously, is the same as the sequential limit, but it requires more moment conditions.

[11] For more information, see the section Levin, Lin, and Chu Test. The only difference is the standard error estimate $\hat{\sigma}_{\epsilon i}^2$: Breitung suggests using $T - p_i - 2$ instead of $T - p_i - 1$ (as in the LLC test) to normalize the standard error.

[12] Breitung (2000) suggests the approach in step 1 of Levin, Lin, and Chu (2002), whereas Breitung and Das (2005) suggest the prewhitening method described in this section. In Breitung's code, to be consistent with the papers, different approaches are adopted for model (2) and model (3). Regarding the order of variable transformation and prewhitening: in model (2), the initial values are subtracted (variable transformation) first, and then the prewhitening is applied; in model (3), the order is reversed, so the series is prewhitened first and then transformed to remove the mean and linear time trend.

Last updated: June 19, 2025