PANEL Procedure

Heteroscedasticity-Corrected Covariance Matrices

The HCCME= option in the MODEL statement selects the type of heteroscedasticity-consistent covariance matrix. In the presence of heteroscedasticity, the covariance matrix has a complicated structure that can result in inefficient OLS estimates and biased estimates of the covariance matrix. The variances for cross-sectional and time dummy variables, and the covariances with or between the dummy variables, are not corrected for heteroscedasticity in the one-way and two-way models. Whether or not the HCCME= option is specified, these variances are the same. For the two-way models, the variance and the covariances for the intercept are not corrected.[5]

Consider the simple linear model:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$$

This discussion parallels the discussion in Davidson and MacKinnon (1993, pp. 548–562). For panel data models, heteroscedasticity-corrected covariance matrix estimation (HCCME) is applied to the transformed data ($\tilde{\mathbf{y}}$ and $\tilde{\mathbf{X}}$). In other words, the random or fixed effects are first removed by transforming the data,[6] and then the heteroscedasticity (and, with the HAC option, also the autocorrelation) is corrected in the residual. The assumptions that make the linear regression estimator best linear unbiased (BLUE) are $E(\boldsymbol{\epsilon}) = 0$ and $E(\boldsymbol{\epsilon}\boldsymbol{\epsilon}') = \Omega$, where $\Omega$ has the simple structure $\sigma^2\mathbf{I}$. Heteroscedasticity results in a general covariance structure, and it is not possible to simplify $\Omega$. The result is the following:

$$\tilde{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}) = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\epsilon}$$

As long as the following holds, you are assured that the OLS estimate is consistent and unbiased:

$$\mathrm{plim}_{n\rightarrow\infty}\left(\frac{1}{n}\mathbf{X}'\boldsymbol{\epsilon}\right) = 0$$

If the regressors are nonrandom, then it is possible to write the variance of the estimated $\boldsymbol{\beta}$ as

$$\mathrm{Var}(\boldsymbol{\beta} - \tilde{\boldsymbol{\beta}}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\Omega\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$$
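This sandwich form can be computed directly. The following sketch uses simulated data (the dimensions and the diagonal, known $\Omega$ are illustrative assumptions, not anything prescribed by PROC PANEL):

```python
import numpy as np

# Sketch: the sandwich variance of OLS under a known (hypothetical)
# heteroscedastic error covariance Omega. Simulated data for illustration.
rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
omega = np.diag(rng.uniform(0.5, 2.0, size=n))  # heteroscedastic, diagonal

XtX_inv = np.linalg.inv(X.T @ X)
# Var(beta_ols) = (X'X)^{-1} X' Omega X (X'X)^{-1}
var_ols = XtX_inv @ X.T @ omega @ X @ XtX_inv
```

When $\Omega = \sigma^2\mathbf{I}$, this collapses to the familiar $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$.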

You can ameliorate the effect of structure in the covariance matrix by using generalized least squares (GLS), provided that $\Omega^{-1}$ can be calculated. You premultiply both sides of the regression equation by $L^{-1}$,

$$L^{-1}\mathbf{y} = L^{-1}\mathbf{X}\boldsymbol{\beta} + L^{-1}\boldsymbol{\epsilon}$$

where $L$ denotes the Cholesky root of $\Omega$ (that is, $\Omega = LL'$ with $L$ lower triangular).
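The premultiplication step can be sketched numerically: on the transformed (whitened) data, ordinary least squares reproduces the closed-form GLS estimator. The data below are simulated and purely illustrative:

```python
import numpy as np

# Sketch: GLS as OLS on data premultiplied by L^{-1}, where Omega = L L'.
# Simulated example; not the PROC PANEL implementation.
rng = np.random.default_rng(1)
n, k = 100, 2
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0])
omega = np.diag(rng.uniform(0.5, 2.0, size=n))
y = X @ beta + rng.multivariate_normal(np.zeros(n), omega)

L = np.linalg.cholesky(omega)          # Omega = L L', L lower triangular
Linv = np.linalg.inv(L)
y_t, X_t = Linv @ y, Linv @ X          # transformed (whitened) data

# OLS on the transformed data ...
beta_gls_transform = np.linalg.lstsq(X_t, y_t, rcond=None)[0]
# ... versus the closed-form GLS estimator
omega_inv = np.linalg.inv(omega)
beta_gls_closed = np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ y)
```

The two estimates agree, which is exactly why the premultiplication restores the BLUE property: the transformed errors $L^{-1}\boldsymbol{\epsilon}$ have covariance $\mathbf{I}$.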

The resulting GLS $\boldsymbol{\beta}$ is

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}\mathbf{y}$$

Using the GLS $\boldsymbol{\beta}$, you can write

$$\begin{aligned}
\hat{\boldsymbol{\beta}} &= (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}\mathbf{y} \\
&= (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\left(\Omega^{-1}\mathbf{X}\boldsymbol{\beta} + \Omega^{-1}\boldsymbol{\epsilon}\right) \\
&= \boldsymbol{\beta} + (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}\boldsymbol{\epsilon}
\end{aligned}$$

The resulting variance expression for the GLS estimator is

$$\begin{aligned}
\mathrm{Var}(\boldsymbol{\beta} - \hat{\boldsymbol{\beta}}) &= (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}\boldsymbol{\epsilon}\boldsymbol{\epsilon}'\Omega^{-1}\mathbf{X}(\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1} \\
&= (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}\Omega\,\Omega^{-1}\mathbf{X}(\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1} \\
&= (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}
\end{aligned}$$

The difference in variance between the OLS estimator and the GLS estimator can be written as

$$(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\Omega\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} - (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}$$
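The Gauss-Markov claim about this difference can be checked numerically. The sketch below (simulated data, a hypothetical diagonal $\Omega$) confirms that the OLS sandwich variance minus the GLS variance has no negative eigenvalues:

```python
import numpy as np

# Sketch: Var(OLS) - Var(GLS) is positive semidefinite under
# heteroscedasticity. All quantities here are simulated illustrations.
rng = np.random.default_rng(2)
n, k = 150, 3
X = rng.normal(size=(n, k))
omega = np.diag(rng.uniform(0.2, 3.0, size=n))
omega_inv = np.linalg.inv(omega)

XtX_inv = np.linalg.inv(X.T @ X)
var_ols = XtX_inv @ X.T @ omega @ X @ XtX_inv       # OLS sandwich variance
var_gls = np.linalg.inv(X.T @ omega_inv @ X)        # GLS variance

diff = var_ols - var_gls
# Eigenvalues of the (symmetrized) difference are all nonnegative
eigs = np.linalg.eigvalsh((diff + diff.T) / 2)
```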

By the Gauss-Markov theorem, the difference matrix must be positive semidefinite (it is zero only when OLS and GLS coincide, which happens when the usual classical regression assumptions are met). Thus, OLS is not efficient under a general error structure, although it is crucial to realize that OLS does not produce biased results. It would therefore suffice to use the OLS $\boldsymbol{\beta}$ together with a consistent estimate of its covariance matrix. Estimation of the $\Omega$ matrix is certainly not simple: the matrix is square with $M^2$ elements, so unless some sort of structure is assumed, the problem is intractable. However, the heteroscedasticity can have quite a general structure. White (1980) shows that it is not necessary to have a consistent estimate of $\Omega$; on the contrary, it suffices to calculate an estimate of the middle expression. That is, you need an estimate of

$$\Lambda = \mathbf{X}'\Omega\mathbf{X}$$

This matrix, $\Lambda$, is easier to estimate because its dimension is only $K \times K$. PROC PANEL provides the following classical HCCME estimators for $\Lambda$.

The matrix is approximated as follows:

  • HCCME=N0:

    $$\sigma^2\mathbf{X}'\mathbf{X}$$

    This is the simple OLS estimator. If you do not specify the HCCME= option, PROC PANEL defaults to this estimator.

  • HCCME=0:

    $$\sum_{i=1}^{N}\sum_{t=1}^{T_i}\hat{\epsilon}_{it}^2\,\mathbf{x}_{it}\mathbf{x}_{it}'$$

    Here $N$ is the number of cross sections and $T_i$ is the number of observations in the $i$th cross section. The $\mathbf{x}_{it}'$ is from the $t$th observation in the $i$th cross section, constituting the $\left(\sum_{j=1}^{i-1}T_j + t\right)$th row of the matrix $\mathbf{X}$. If the CLUSTER option is specified, one extra term is added to the preceding equation so that the estimator of matrix $\Lambda$ is

    $$\sum_{i=1}^{N}\sum_{t=1}^{T_i}\hat{\epsilon}_{it}^2\,\mathbf{x}_{it}\mathbf{x}_{it}' + \sum_{i=1}^{N}\sum_{t=1}^{T_i}\sum_{s=1}^{t-1}\hat{\epsilon}_{it}\hat{\epsilon}_{is}\left(\mathbf{x}_{it}\mathbf{x}_{is}' + \mathbf{x}_{is}\mathbf{x}_{it}'\right)$$

    The formula is the same as the robust variance matrix estimator in Wooldridge (2002, p. 152), and it is derived under the assumptions of section 7.3.2 of Wooldridge (2002).

  • HCCME=1:

    $$\frac{M}{M-K}\sum_{i=1}^{N}\sum_{t=1}^{T_i}\hat{\epsilon}_{it}^2\,\mathbf{x}_{it}\mathbf{x}_{it}'$$

    Here $M = \sum_{j=1}^{N}T_j$ is the total number of observations and $K$ is the number of parameters. If the CLUSTER option is specified, the estimator becomes

    $$\frac{M}{M-K}\sum_{i=1}^{N}\sum_{t=1}^{T_i}\hat{\epsilon}_{it}^2\,\mathbf{x}_{it}\mathbf{x}_{it}' + \frac{M}{M-K}\sum_{i=1}^{N}\sum_{t=1}^{T_i}\sum_{s=1}^{t-1}\hat{\epsilon}_{it}\hat{\epsilon}_{is}\left(\mathbf{x}_{it}\mathbf{x}_{is}' + \mathbf{x}_{is}\mathbf{x}_{it}'\right)$$

    The formula is similar to the robust variance matrix estimator in Wooldridge (2002, p. 152), with the heteroscedasticity adjustment term $M/(M-K)$.

  • HCCME=2:

    $$\sum_{i=1}^{N}\sum_{t=1}^{T_i}\frac{\hat{\epsilon}_{it}^2}{1-\hat{h}_{it}}\,\mathbf{x}_{it}\mathbf{x}_{it}'$$

    The $\hat{h}_{it}$ term is the $\left(\sum_{j=1}^{i-1}T_j + t\right)$th diagonal element of the hat matrix, $\hat{h}_{it} = \mathbf{x}_{it}'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_{it}$. The hat matrix attempts to adjust the estimates for the presence of influence or leverage points. If the CLUSTER option is specified, the estimator becomes

    $$\sum_{i=1}^{N}\sum_{t=1}^{T_i}\frac{\hat{\epsilon}_{it}^2}{1-\hat{h}_{it}}\,\mathbf{x}_{it}\mathbf{x}_{it}' + 2\sum_{i=1}^{N}\sum_{t=1}^{T_i}\sum_{s=1}^{t-1}\frac{\hat{\epsilon}_{it}}{\sqrt{1-\hat{h}_{it}}}\,\frac{\hat{\epsilon}_{is}}{\sqrt{1-\hat{h}_{is}}}\left(\mathbf{x}_{it}\mathbf{x}_{is}' + \mathbf{x}_{is}\mathbf{x}_{it}'\right)$$

    The formula is similar to the robust variance matrix estimator in Wooldridge (2002, p. 152) with the heteroscedasticity adjustment.

  • HCCME=3:

    $$\sum_{i=1}^{N}\sum_{t=1}^{T_i}\frac{\hat{\epsilon}_{it}^2}{(1-\hat{h}_{it})^2}\,\mathbf{x}_{it}\mathbf{x}_{it}'$$

    If the CLUSTER option is specified, the estimator becomes

    $$\sum_{i=1}^{N}\sum_{t=1}^{T_i}\frac{\hat{\epsilon}_{it}^2}{(1-\hat{h}_{it})^2}\,\mathbf{x}_{it}\mathbf{x}_{it}' + 2\sum_{i=1}^{N}\sum_{t=1}^{T_i}\sum_{s=1}^{t-1}\frac{\hat{\epsilon}_{it}}{1-\hat{h}_{it}}\,\frac{\hat{\epsilon}_{is}}{1-\hat{h}_{is}}\left(\mathbf{x}_{it}\mathbf{x}_{is}' + \mathbf{x}_{is}\mathbf{x}_{it}'\right)$$

    The formula is similar to the robust variance matrix estimator in Wooldridge (2002, p. 152) with the heteroscedasticity adjustment.

  • HCCME=4: PROC PANEL includes this option for the calculation of the Arellano (1987) version of the White (1980) HCCME in the panel setting. Arellano’s insight is that there are $N$ covariance matrices in a panel, one for each cross section. Forming the White HCCME for each cross section, you need to take only the average of those $N$ estimators. The details of the estimation follow. First, you arrange the data such that the first cross section occupies the first $T_1$ observations. Then, you treat the cross sections as separate regressions with the form

    $$\mathbf{y}_i = \alpha_i\mathbf{i} + \mathbf{X}_{is}\tilde{\boldsymbol{\beta}} + \boldsymbol{\epsilon}_i$$

    where the parameter estimates $\tilde{\boldsymbol{\beta}}$ and $\alpha_i$ are the result of least squares dummy variables (LSDV) or within-estimator regressions, and $\mathbf{i}$ is a vector of ones of length $T_i$. The estimate of the $i$th cross section’s $\mathbf{X}'\Omega\mathbf{X}$ matrix (where the $s$ subscript indicates that no constant column has been suppressed, to avoid confusion) is $\mathbf{X}_i'\Omega\mathbf{X}_i$. The estimate for the whole sample is

    $$\mathbf{X}_s'\Omega\mathbf{X}_s = \sum_{i=1}^{N}\mathbf{X}_i'\Omega\mathbf{X}_i$$

    The Arellano standard error is in fact a White-Newey-West estimator with constant and equal weight on each component. For the between estimators, specifying HCCME=4 returns the HCCME=0 result because there is no "other" variable to group by.
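The point estimators above can be sketched compactly with simulated data. The panel shape, regressors, and residuals below are illustrative assumptions; this is a sketch of the formulas for $\Lambda$, not PROC PANEL's implementation:

```python
import numpy as np

# Sketch: HCCME=0..3 estimates of Lambda = X' Omega X from OLS residuals,
# plus an Arellano-style (HCCME=4) cluster-by-cross-section sum.
# Balanced simulated panel, purely for illustration.
rng = np.random.default_rng(3)
N, T, k = 10, 5, 2                      # cross sections, periods, regressors
M = N * T                               # total observations
X = rng.normal(size=(M, k))
y = X @ np.array([1.0, 0.5]) + rng.normal(size=M)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta_ols                    # OLS residuals
# Diagonal of the hat matrix: h_it = x_it' (X'X)^{-1} x_it
h = np.einsum('ij,jk,ik->i', X, np.linalg.inv(X.T @ X), X)

lam0 = (X * e[:, None]**2).T @ X                 # HCCME=0
lam1 = M / (M - k) * lam0                        # HCCME=1
lam2 = (X * (e**2 / (1 - h))[:, None]).T @ X     # HCCME=2
lam3 = (X * (e**2 / (1 - h)**2)[:, None]).T @ X  # HCCME=3

# HCCME=4 (Arellano): form X_i' e_i e_i' X_i per cross section and sum
lam4 = np.zeros((k, k))
for i in range(N):
    Xi, ei = X[i*T:(i+1)*T], e[i*T:(i+1)*T]
    s = Xi.T @ ei
    lam4 += np.outer(s, s)
```

Each `lamJ` then goes in the middle of the sandwich $(\mathbf{X}'\mathbf{X})^{-1}\Lambda(\mathbf{X}'\mathbf{X})^{-1}$ to produce the corresponding covariance estimate.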

In their discussion, Davidson and MacKinnon (1993, p. 554) argue that HCCME=1 should always be preferred to HCCME=0. Although HCCME=3 is generally preferred to HCCME=2, and HCCME=2 to HCCME=1, the calculation of HCCME=1 is as simple as that of HCCME=0. Therefore, HCCME=1 is preferred when calculation of the hat matrix is too tedious.

All HCCMEs have well-defined asymptotic properties. The small-sample properties are not well known, and care must be exercised when sample sizes are small.

The HCCME of $\mathrm{Var}(\boldsymbol{\beta})$ is used to derive the covariance matrices for the fixed effects and the Lagrange multiplier standard errors. Robust estimates of the covariance matrix for $\boldsymbol{\beta}$ imply robust covariance matrices for all other parameters.



[5] The dummy variables are removed by the within transformations, so their variances and covariances cannot be calculated the same way as those of the other regressors. They are recovered by the formulas in the sections One-Way Fixed-Effects Model (FIXONE and FIXONETIME Options) and Two-Way Fixed-Effects Model (FIXTWO Option). These formulas assume homoscedasticity, so they do not apply when HCCME is used. Therefore, standard errors, variances, and covariances for the dummy variables are reported only when the HCCME= option is not specified. HCCME standard errors for dummy variables and the intercept can be calculated by the dummy variable approach with the pooled model.

Last updated: June 19, 2025