COUNTREG Procedure
The CLASS statement names the classification variables that are used to group (classify) data in the analysis. Classification variables can be either character or numeric.
Class levels are determined from the formatted values of the CLASS variables. Thus, you can use formats to group values into levels. For more information, see the discussion of the FORMAT procedure in the Base SAS Procedures Guide. The CLASS statement must precede the MODEL statement.
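As a minimal sketch of using a format to define class levels, the following steps group a numeric variable into three levels before it is named in the CLASS statement. The data set work.visits, its variables (n_visits, age, income), and the format name agegrp are hypothetical names chosen for illustration only.

```sas
/* Hypothetical format that groups age into three levels */
proc format;
   value agegrp
      low -< 30  = 'Under 30'
      30  -< 60  = '30 to 59'
      60 - high  = '60 and over';
run;

/* Attach the format so that the formatted values define the levels */
data work.visits_fmt;
   set work.visits;              /* hypothetical input data set */
   format age agegrp.;
run;

proc countreg data=work.visits_fmt;
   class age;                    /* CLASS precedes MODEL */
   model n_visits = age income / dist=poisson;
run;
```

Under these assumptions, age contributes three class levels (one for each formatted value) rather than one level for each distinct age.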
Most options can be specified either as individual variable options or as global-options. You can specify options for each variable by enclosing the options in parentheses after the variable name. You can also specify global-options for the CLASS statement by placing them after a slash (/). Global-options are applied to all the variables that are specified in the CLASS statement. If you specify more than one CLASS statement, the global-options specified in any one CLASS statement apply to all CLASS statements. However, individual CLASS variable options override the global-options. You can specify the following values for either an option or a global-option:
MISSING
treats missing values (., ._, .A, …, .Z for numeric variables and blanks for character variables) as valid values for the CLASS variable.
ORDER=DATA | FORMATTED | FREQ | INTERNAL
specifies the sort order for the levels of classification variables. This ordering determines which parameters in the model correspond to each level in the data. You can specify the following values:
- DATA
sorts levels by the order of appearance in the input data set.
- FORMATTED
sorts levels by external formatted values, except for numeric variables that have no explicit format. Those variables are sorted by their unformatted (internal) values. This sort order is machine-dependent.
- FREQ
sorts levels by descending frequency count; levels that have more observations come earlier in the order.
- INTERNAL
sorts levels by unformatted value. This sort order is machine-dependent.
For more information about sort order, see the chapter on the SORT procedure in the Base SAS Procedures Guide and the discussion of BY-group processing in SAS Programmer's Guide: Essentials. By default, ORDER=FORMATTED.
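The following sketch shows ORDER= specified once as an individual variable option and once as a global-option. The data set work.visits and the variables n_visits, region, and plan are hypothetical.

```sas
proc countreg data=work.visits;
   /* region: levels ordered by descending frequency (individual option) */
   /* plan:   levels ordered by first appearance in the data (global-option) */
   class region(order=freq) plan / order=data;
   model n_visits = region plan / dist=poisson;
run;
```

Because the level order determines which parameter corresponds to which level, changing ORDER= can also change which level ends up last in the order. With ORDER=FREQ, for example, the least frequent level of region is ordered last.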
PARAM=EFFECT | GLM | REFERENCE
specifies the parameterization method for the classification variable or variables. You can specify the following values:
- EFFECT
uses effect coding to create design matrix columns from the CLASS variables.
- GLM
uses less-than-full-rank reference cell coding to create design matrix columns from the CLASS variables. This value can be used only as a global option.
- REFERENCE
uses reference cell coding to create design matrix columns from the CLASS variables. You can abbreviate this value as REF.
All parameterizations are full rank, except for the GLM parameterization. The REF= option in the CLASS statement determines the reference level for effect and reference coding and for their orthogonal parameterizations. It also indirectly determines the reference level for a singular GLM parameterization through the order of levels. By default, PARAM=GLM.
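To make the three codings concrete, the sketch below assumes a hypothetical variable region that has exactly three levels (East, North, West) in a hypothetical data set work.visits. The comment block lists the design-matrix columns that the standard definitions of these codings produce when West, the last-ordered level, is the reference level.

```sas
proc countreg data=work.visits;
   class region / param=effect;   /* alternatives: param=reference, param=glm */
   model n_visits = region / dist=poisson;
run;

/* Design-matrix columns for region, with West as the reference level:

   Level    EFFECT     REFERENCE    GLM
   East      1   0       1   0      1  0  0
   North     0   1       0   1      0  1  0
   West     -1  -1       0   0      0  0  1
*/
```

The GLM columns sum to a column of ones, so together with an intercept they are linearly dependent. That dependence is what makes the GLM parameterization less than full rank.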
REF='level' | FIRST | LAST
specifies the reference level for PARAM=EFFECT, PARAM=REFERENCE, and their orthogonalizations. When PARAM=GLM, the REF= option specifies a level of the classification variable to be put at the end of the list of levels. This level thus corresponds to the reference level in the usual interpretation of the linear estimates with a singular parameterization.
For an individual variable REF= option (but not for a global REF= option), you can specify the level of the variable to use as the reference level. Specify the formatted value of the variable if a format is assigned. For a global or individual variable REF= option, you can use one of the following keywords:
- FIRST
designates the first-ordered level as reference.
- LAST
designates the last-ordered level as reference.
By default, REF=LAST.
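The following sketch combines individual variable options with global-options. The data set work.visits, the variables n_visits, region, and plan, and the level value 'North' are hypothetical.

```sas
proc countreg data=work.visits;
   /* Global-options after the slash apply to both variables.        */
   /* The individual REF='North' on region overrides the global      */
   /* REF=FIRST; plan has no individual option, so it uses REF=FIRST. */
   class region(ref='North') plan / param=reference ref=first missing;
   model n_visits = region plan / dist=poisson;
run;
```

In this sketch the global MISSING option means that observations with a missing region or plan value are treated as belonging to a valid class level.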
Last updated: June 19, 2025