(View the complete code for this example.)
Sometimes data sets are stored in compressed (or wide) form, where each record contains all observations for the entire cross section. Although the PANEL procedure requires data in uncompressed (long) form, sometimes it is easier to create new variables or summary statistics if the data are in wide form.
To illustrate, suppose you have a simulated data set that contains 20 cross sections measured over six time periods. Each time period has values for dependent and independent variables, , …,
and
, …,
. The
cs and num variables are constant across each cross section.
The observations for the first five cross sections along with other variables are shown in Output 25.6.1. In this example, i represents the cross section. The time period is identified by the subscript of the Y and X variables, which ranges from 1 to 6.
Output 25.6.1: Compressed Data Set
| Obs | i | cs | num | X_1 | X_2 | X_3 | X_4 | X_5 | X_6 | Y_1 | Y_2 | Y_3 | Y_4 | Y_5 | Y_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | CS1 | -1.56058 | 0.40268 | 0.91951 | 0.69482 | -2.28899 | -1.32762 | 1.92348 | 2.30418 | 2.11850 | 2.66009 | -4.94104 | -0.83053 | 5.01359 |
| 2 | 2 | CS2 | 0.30989 | 1.01950 | -0.04699 | -0.96695 | -1.08345 | -0.05180 | 0.30266 | 4.50982 | 3.73887 | 1.44984 | -1.02996 | 2.78260 | 1.73856 |
| 3 | 3 | CS3 | 0.85054 | 0.60325 | 0.71154 | 0.66168 | -0.66823 | -1.87550 | 0.55065 | 4.07276 | 4.89621 | 3.90470 | 1.03437 | 0.54598 | 5.01460 |
| 4 | 4 | CS4 | -0.18885 | -0.64946 | -1.23355 | 0.04554 | -0.24996 | 0.09685 | -0.92771 | 2.40304 | 1.48182 | 2.70579 | 3.82672 | 4.01117 | 1.97639 |
| 5 | 5 | CS5 | -0.04761 | -0.79692 | 0.63445 | -2.23539 | -0.37629 | -0.82212 | -0.70566 | 3.58092 | 6.08917 | 3.08249 | 4.26605 | 3.65452 | 0.81826 |
When the data are in this form, it is easy to create other variables that are combinations of the existing variables. For example, you can calculate the within-cross-section mean of X by simply summing across the X_i variables and dividing by six. It is easier to perform this kind of data manipulation when the data are in compressed (wide) form instead of uncompressed (long) form.
On the other hand, the PANEL procedure cannot work directly with the data in wide form. You can use the FLATDATA statement to transform wide data into long form “on the fly” for performing a panel data analysis. You can also use the OUT= option to output the transformed data to a new data set, to use for further analysis.
The following code reshapes the data and performs fixed-effects estimation:
proc panel data=flattest;
flatdata indid=i tsname="t" base=(X Y)
keep=( cs num seed ) / out=flat_out;
id i t;
model y = x / fixone noint;
run;
The first six observations in the uncompressed (long) data set and the results for the one-way fixed-effects model are shown in Output 25.6.2 and Output 25.6.3, respectively.
Output 25.6.2: Uncompressed Data Set
| Obs | I | t | X | Y | CS | NUM |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0.40268 | 2.30418 | CS1 | -1.56058 |
| 2 | 1 | 2 | 0.91951 | 2.11850 | CS1 | -1.56058 |
| 3 | 1 | 3 | 0.69482 | 2.66009 | CS1 | -1.56058 |
| 4 | 1 | 4 | -2.28899 | -4.94104 | CS1 | -1.56058 |
| 5 | 1 | 5 | -1.32762 | -0.83053 | CS1 | -1.56058 |
| 6 | 1 | 6 | 1.92348 | 5.01359 | CS1 | -1.56058 |
Output 25.6.3: Estimation with the FLATDATA Statement
| Parameter Estimates | ||||||
|---|---|---|---|---|---|---|
| Variable | DF | Estimate | Standard Error |
t Value | Pr > |t| | Label |
| X | 1 | 2.010753 | 0.1217 | 16.52 | <.0001 | |
Now, suppose you have long data that you want to reshape into wide form. The following DATA step performs this task:
data wide;
set flat_out;
by i;
keep i num cs X_1-X_6 Y_1-Y_6;
retain X_1-X_6 Y_1-Y_6;
array ax(1:6) X_1-X_6;
array ay(1:6) Y_1-Y_6;
if first.i then do;
do j = 1 to 6;
ax(j) = 0;
ay(j) = 0;
end;
end;
ax(t) = X;
ay(t) = Y;
if last.i then output;
run;
As a check, Output 25.6.4 lists the newly compressed data, which match the original data from this example.
Output 25.6.4: Recompressed Data Set
| Obs | I | CS | NUM | X_1 | X_2 | X_3 | X_4 | X_5 | X_6 | Y_1 | Y_2 | Y_3 | Y_4 | Y_5 | Y_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | CS1 | -1.56058 | 0.40268 | 0.91951 | 0.69482 | -2.28899 | -1.32762 | 1.92348 | 2.30418 | 2.11850 | 2.66009 | -4.94104 | -0.83053 | 5.01359 |
| 2 | 2 | CS2 | 0.30989 | 1.01950 | -0.04699 | -0.96695 | -1.08345 | -0.05180 | 0.30266 | 4.50982 | 3.73887 | 1.44984 | -1.02996 | 2.78260 | 1.73856 |
| 3 | 3 | CS3 | 0.85054 | 0.60325 | 0.71154 | 0.66168 | -0.66823 | -1.87550 | 0.55065 | 4.07276 | 4.89621 | 3.90470 | 1.03437 | 0.54598 | 5.01460 |
| 4 | 4 | CS4 | -0.18885 | -0.64946 | -1.23355 | 0.04554 | -0.24996 | 0.09685 | -0.92771 | 2.40304 | 1.48182 | 2.70579 | 3.82672 | 4.01117 | 1.97639 |
| 5 | 5 | CS5 | -0.04761 | -0.79692 | 0.63445 | -2.23539 | -0.37629 | -0.82212 | -0.70566 | 3.58092 | 6.08917 | 3.08249 | 4.26605 | 3.65452 | 0.81826 |