Panel data are identified by both a cross section identification (ID) variable and a time variable. Suppose that you have a data set Sample, where cross sections are identified by the variable State and time periods are identified by the variable Date. The input data set that PROC PANEL uses must be sorted by cross section and by time within each cross section. As PROC PANEL steps through the observations in the data, it treats any change in the value of the cross section ID variable as a new cross section, regardless of whether it has encountered that value previously. If you do not sort your data, a single state whose observations are not adjacent can be treated as several cross sections, and the results might not be what you expect. Therefore, the first step in using PROC PANEL is to make sure that the input data set is sorted. The following statements sort the data set Sample appropriately:
proc sort data=sample;
by state date;
run;
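Because any change in the ID value is treated as a new cross section, one way to check the data before you invoke the procedure is to compare the number of contiguous State groups with the number of distinct State values. The following statements are a minimal sketch of such a check and are not part of PROC PANEL. They assume the Sample data set and the State variable described previously; the names NGroups and NStates are illustrative.

```sas
/* Count contiguous runs of State. Each run corresponds to what  */
/* would be read as a separate cross section.                    */
data _null_;
   set sample end=eof;
   by state notsorted;                 /* groups are runs of equal values */
   if first.state then ngroups + 1;    /* sum statement: retained counter */
   if eof then put 'Contiguous State groups: ' ngroups;
run;

/* Count distinct State values for comparison */
proc sql;
   select count(distinct state) as nstates
      from sample;
quit;
```

If the two counts differ, at least one State value appears in more than one block of observations, and the data set should be sorted again. This check covers only the grouping of cross sections; it does not verify that Date is in order within each state.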
The next step is to invoke the PANEL procedure and specify the cross-sectional and time series variables in an ID statement. The following statements show the correct syntax:
proc panel data=sample;
id state date;
model y = x1 x2;
run;
Alternatively, PROC PANEL has the capability to read flat (or wide) data. Suppose you are using the data set Flat, which has observations on states. Specifically, the data are composed of observations on Y, X1, and X2. Unlike the data in the Sample data set, these data are not long. Instead, you have all of a state’s information in a single row. The time observations for the Y variable are recorded horizontally. So the variable Y_1 is the first period’s time observation, and the variable Y_10 is the tenth period’s observation for some state. The same is true of the other variables. You have the variables X1_1 through X1_10 and X2_1 through X2_10. For such data, use the following syntax:
proc panel data=flat;
flatdata indid = state base = (Y X1 X2) tsname = t;
id state t;
model Y = X1 X2;
run;
For more information about the FLATDATA statement, see the section FLATDATA Statement and Example 25.6.
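To illustrate the relationship between the flat layout and the long layout, the following DATA step reshapes Flat into long form by hand. It is a sketch under the assumptions stated in this section (ten time periods and variables named Y_1 through Y_10, X1_1 through X1_10, and X2_1 through X2_10, with no existing variables named Y, X1, X2, or T). It is not a description of how the FLATDATA statement works internally. The data set name Long and the array names are illustrative.

```sas
/* Reshape one row per state into one row per state and period */
data long(keep=state t y x1 x2);
   set flat;
   array ys{10}  y_1-y_10;
   array x1s{10} x1_1-x1_10;
   array x2s{10} x2_1-x2_10;
   do t = 1 to 10;
      y  = ys{t};
      x1 = x1s{t};
      x2 = x2s{t};
      output;            /* write one observation per time period */
   end;
run;

/* Sort by cross section and by time within each cross section */
proc sort data=long;
   by state t;
run;

proc panel data=long;
   id state t;
   model y = x1 x2;
run;
```

The resulting data set Long has the same structure as Sample, with T playing the role of the time variable.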