Shared Concepts

class Parameter

This section applies to actions in the following action sets: gam, mixed, modelMatrix, pca, phreg, pls, quantreg, regression, sandwich, and varReduce.

The class parameter names the classification variables to be used as explanatory variables in the analysis. These variables enter the analysis not through their values, but through levels to which the unique values are mapped. For more information about these mappings, see the section Levelization of Classification Variables.

If the action permits a classification variable as a response (dependent variable or target), the response does not need to be specified in the class parameter.

You can specify subparameters for one or more variables in the class parameter, or as global-subparameters for all variables in the classGlobalOpts parameter. Global-subparameters apply to all variables that are specified in the class parameter. Subparameters that are specified for individual variables override the global-subparameters.
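The override rule can be pictured with a small sketch. It is written in Python rather than CASL, it is not how the actions are implemented, and the function name `resolve_class_options` and the dictionary layout are hypothetical. Special cases such as the global-only GLM parameterization are not modeled.

```python
def resolve_class_options(class_specs, global_opts=None):
    """Return {variable: effective options}; per-variable settings win."""
    global_opts = global_opts or {}
    resolved = {}
    for spec in class_specs:
        # Everything except 'vars' is a subparameter for this group of variables.
        local_opts = {k: v for k, v in spec.items() if k != "vars"}
        for var in spec["vars"]:
            # Later keys override earlier ones, so local options take precedence.
            resolved[var] = {**global_opts, **local_opts}
    return resolved


# Hypothetical stand-ins for the class and classGlobalOpts parameters.
class_specs = [
    {"vars": ["temp"], "order": "FREQ"},
    {"vars": ["gender"]},
]
global_opts = {"order": "INTERNAL", "descending": True}

effective = resolve_class_options(class_specs, global_opts)
print(effective["temp"])    # order comes from the individual specification
print(effective["gender"])  # order comes from the global options
```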

Table 1 summarizes the values you can use for either a subparameter or a global-subparameter. The subparameters are described in detail in the list that follows Table 1.

Table 1: Classification Variable Subparameters

Subparameter	Description
`countMissing`	Treats missing values as valid levels
`descending`	Reverses the sort order
`ignoreMissing`	Honors nonmissing values even if an observation also has missing values
`levelizeRaw`	Bases levelization on unformatted values of the variable
`maxLev`	Specifies the maximum number of levels
`order`	Specifies the sort order for the levels
`param`	Specifies the parameterization of the variable
`ref`	Specifies the reference level of the variable
`split`	Splits levels of the classification variable into independent effects
`vars`	Specifies the classification variables

countMissing=TRUE | FALSE

when set to True, treats missing values (".", and, depending on the programming language, ".A", …, ".Z" for numeric variables and blanks for character variables) as valid values of the classification variable. If you omit this subparameter, observations that have missing values for classification variables are removed from the analysis.
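As an illustration only, the following Python sketch mimics the filtering described above. Missing values are represented by `None`, and the helper name `usable_rows` is hypothetical. The real actions work on CAS tables, not Python lists.

```python
def usable_rows(rows, class_vars, count_missing=False):
    """Keep the observations that would take part in the analysis."""
    if count_missing:
        # Missing values count as a valid level, so nothing is dropped.
        return list(rows)
    # Otherwise drop any row with a missing classification value.
    return [r for r in rows if all(r[v] is not None for v in class_vars)]


rows = [
    {"temp": "hot", "gender": "M"},
    {"temp": None, "gender": "F"},   # missing temp
    {"temp": "cold", "gender": "F"},
]

kept_default = usable_rows(rows, ["temp", "gender"])
kept_counting = usable_rows(rows, ["temp", "gender"], count_missing=True)
print(len(kept_default), len(kept_counting))
```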

descending=TRUE | FALSE

when set to True, reverses the sort order of the classification variable. If you specify both the descending and order subparameters, the action orders the categories according to the order subparameter and then reverses that order.

ignoreMissing=TRUE | FALSE

when set to True, ignores the fact that some variables in the observation have missing values and honors the nonmissing values of other variables in the observation. In particular, this subparameter affects the identification of valid levels of a variable and various counts of such levels.
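One possible reading of this behavior is sketched below in Python. It is an assumption about the semantics, not a description of the actual implementation. Missing values are represented by `None`, and `valid_levels` is a hypothetical helper.

```python
def valid_levels(rows, var, class_vars, ignore_missing=False):
    """Collect the levels of `var` under one reading of ignoreMissing."""
    levels = set()
    for row in rows:
        if row[var] is None:
            continue  # a missing value of var itself never contributes a level
        complete = all(row[v] is not None for v in class_vars)
        # Assumption: without ignoreMissing, only complete rows contribute.
        if ignore_missing or complete:
            levels.add(row[var])
    return sorted(levels)


rows = [
    {"temp": "hot", "gender": "M"},
    {"temp": "cold", "gender": None},  # gender is missing, temp is not
]

strict = valid_levels(rows, "temp", ["temp", "gender"])
lenient = valid_levels(rows, "temp", ["temp", "gender"], ignore_missing=True)
print(strict, lenient)
```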

levelizeRaw=TRUE | FALSE

when set to True, bases levelization of the variables on their unformatted values.

maxLev=n

specifies the maximum number of levels allowed. The default value, 0, allows an unlimited number of levels.

order='FORMATTED' | 'FREQ' | 'INTERNAL'

specifies the sort order for the levels of classification variables. This ordering determines which parameters in the model correspond to each level in the data.

The following table shows how values of the order subparameter are interpreted.

Value of `order`	Levels Sorted By
FORMATTED	External formatted values, except for numeric variables that have no explicit format, which are sorted by their unformatted (internal) values. The sort order is machine-dependent.
FREQ	Descending frequency count (levels that have more observations come earlier in the order)
INTERNAL	Unformatted value. The sort order is machine-dependent.

For more information about sort order, see the chapter about the SORT procedure in Base SAS Procedures Guide and the discussion of BY-group processing in the "Grouping Data" section of SAS Programmer's Guide: Essentials. By default, the order subparameter value is FORMATTED.
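The interaction of the order and descending subparameters can be sketched in Python as follows. This is an illustration under stated assumptions: `order_levels` is a hypothetical helper, a format is modeled as a plain function from internal value to string, and the tie-breaking rule for FREQ is a guess that the text above does not specify.

```python
from collections import Counter


def order_levels(values, order="FORMATTED", descending=False, fmt=None):
    """Return the distinct levels of a variable in sorted order."""
    if order == "FREQ":
        counts = Counter(values)
        # Most frequent first; breaking ties by value is an assumption.
        levels = sorted(counts, key=lambda v: (-counts[v], v))
    elif order == "INTERNAL" or fmt is None:
        # FORMATTED without an explicit format falls back to internal values.
        levels = sorted(set(values))
    elif order == "FORMATTED":
        levels = sorted(set(values), key=fmt)
    else:
        raise ValueError(f"unknown order: {order}")
    if descending:
        levels.reverse()  # applied after the order subparameter
    return levels


values = [3, 1, 3, 2, 3, 2]
labels = {1: "low", 2: "medium", 3: "high"}  # hypothetical format

by_freq = order_levels(values, order="FREQ")
by_internal = order_levels(values, order="INTERNAL")
by_format = order_levels(values, order="FORMATTED", fmt=labels.get)
freq_reversed = order_levels(values, order="FREQ", descending=True)
print(by_freq, by_internal, by_format, freq_reversed)
```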

param='keyword'

specifies the parameterization method for the classification variable or variables. You can specify any of the keywords shown in the following table; design matrix columns are created from classification variables according to the corresponding coding schemes.

Table 2: Parameterization Methods

Value of `param`	Coding
EFFECT	Effect coding. The `ref` subparameter in the `class` parameter determines the reference level.
GLM	Less-than-full-rank reference cell coding. This keyword can be used only as a global-subparameter and is applied to all classification variables; all other individual variable parameterization specifications are ignored. The `ref` subparameter in the `class` parameter indirectly determines the reference level through the order of levels.
ORDINAL \| THERMOMETER	Ordinal coding. Cumulative parameterization for an ordinal classification variable.
POLYNOMIAL \| POLY	Polynomial coding. If the classification variable is numeric, then the `order` subparameter in the `class` parameter is ignored, and the internal unformatted values are used.
REFERENCE \| REF	Reference cell coding. The `ref` subparameter in the `class` parameter determines the reference level.
ORTHEFFECT	Orthogonalizes effect coding. The `ref` subparameter in the `class` parameter determines the reference level.
ORTHORDINAL \| ORTHOTHERM	Orthogonalizes ordinal coding.
ORTHPOLY	Orthogonalizes polynomial coding. If the classification variable is numeric, then the `order` subparameter in the `class` parameter is ignored, and the internal unformatted values are used.
ORTHREF	Orthogonalizes reference cell coding. The `ref` subparameter in the `class` parameter determines the reference level.

All parameterizations are full rank, except for the GLM parameterization. If you specify a full rank parameterization for any classification variable, then every classification variable without a specified coding is given the effect coding.

By default, GLM coding is used. For more information about how parameterization of classification variables affects the construction and interpretation of model effects, see the section Specification and Parameterization of Model Effects.
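To make the difference between three of these codings concrete, the following Python sketch builds the design-matrix row for each level of a single three-level variable. It is a simplified illustration, not the action's code. `design_columns` is a hypothetical helper, the reference level defaults to the last ordered level, and the orthogonalized, ordinal, and polynomial codings are left out.

```python
def design_columns(levels, param="GLM", ref=None):
    """Map each level to its design-matrix row for one classification variable."""
    ref = levels[-1] if ref is None else ref
    if param == "GLM":
        # One indicator column per level: less than full rank with an intercept.
        return {lv: [int(lv == col) for col in levels] for lv in levels}
    non_ref = [lv for lv in levels if lv != ref]
    if param == "REFERENCE":
        # Indicator columns for nonreference levels; the reference row is all 0.
        return {lv: [int(lv == col) for col in non_ref] for lv in levels}
    if param == "EFFECT":
        # Same as reference coding, except that the reference row is all -1.
        return {
            lv: [-1] * len(non_ref) if lv == ref
            else [int(lv == col) for col in non_ref]
            for lv in levels
        }
    raise ValueError(f"coding not covered by this sketch: {param}")


levels = ["cold", "hot", "warm"]  # already in sorted order
glm = design_columns(levels, "GLM")
reference = design_columns(levels, "REFERENCE")
effect = design_columns(levels, "EFFECT")
for lv in levels:
    print(lv, glm[lv], reference[lv], effect[lv])
```

With three levels, GLM coding produces three columns, whereas the two full-rank codings produce two columns each.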

reference='level' | 'keyword'
ref='level' | 'keyword'

specifies the reference level that is used when you specify a nonsingular parameterization. You can specify the following values:

level: specifies the level of the variable to use as the reference level. Specify the formatted value of the variable if a format is assigned. You can specify this value only for an individual variable subparameter.
FIRST: designates the first ordered level as reference. You can specify this value either for an individual variable subparameter or for a global-subparameter.
LAST: designates the last ordered level as reference. You can specify this value either for an individual variable subparameter or for a global-subparameter.

By default, the ref subparameter value is LAST.
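The following Python sketch shows one way to picture how the three kinds of values resolve to a concrete level once the levels are ordered. `resolve_reference` is a hypothetical helper. Treating the keywords as case-sensitive and checking them before level names are both assumptions.

```python
def resolve_reference(levels, ref="LAST"):
    """Pick the reference level from a list of already ordered levels."""
    if ref == "FIRST":
        return levels[0]
    if ref == "LAST":
        return levels[-1]
    if ref in levels:
        return ref  # an explicit (formatted) level value
    raise ValueError(f"{ref!r} is not a level of this variable")


levels = ["cold", "hot", "warm"]
default_ref = resolve_reference(levels)
first_ref = resolve_reference(levels, "FIRST")
named_ref = resolve_reference(levels, "hot")
print(default_ref, first_ref, named_ref)
```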

split=TRUE | FALSE

when set to True, specifies that design matrix columns that correspond to any effect that contains a split classification variable can be selected to enter or leave a model independently of the other design columns of that effect. This subparameter applies to actions that perform model selection.

Suppose that the variable temp has three levels (hot, warm, and cold), that the variable gender has two levels (M and F), and that the variables are used in an action run (displayed in the CASL language) as follows:

proc cas;
   regression.glm table={name='data'},
      class={{vars={'temp', 'gender'}, split=true}},
      model={depVar='y',
             effects={{vars='gender'},
                           {vars={'gender','temp'},interaction='CROSS'}}};
run;

The two effects in the model parameter are split into eight independent effects. The effect "gender" is split into two effects that are labeled "gender_M" and "gender_F". The effect "gender*temp" is split into six effects that are labeled "gender_M*temp_hot", "gender_F*temp_hot", "gender_M*temp_warm", "gender_F*temp_warm", "gender_M*temp_cold", and "gender_F*temp_cold". The previous regression.glm action call is equivalent to the following:

proc cas;
   regression.glm table={name='data'},
      model={depVar='y',
         effects={{vars='gender_M'}, {vars='gender_F'},
            {vars={'gender_M','temp_hot'},interaction='CROSS'},
            {vars={'gender_M','temp_warm'},interaction='CROSS'},
            {vars={'gender_M','temp_cold'},interaction='CROSS'},
            {vars={'gender_F','temp_hot'},interaction='CROSS'},
            {vars={'gender_F','temp_warm'},interaction='CROSS'},
            {vars={'gender_F','temp_cold'},interaction='CROSS'}}};
run;

The split subparameter can be used on individual classification variables. For example, consider the following regression.glm action call:

proc cas;
   regression.glm table={name='data'},
      class={{vars={'temp'}, split=true},{vars={'gender'}}},
      model={depVar='y',
             effects={{vars='gender'},
                           {vars={'gender','temp'},interaction='CROSS'}}};
run;

In this case, the effect "gender" is not split and the effect "gender*temp" is split into three effects, which are labeled "gender*temp_hot", "gender*temp_warm", and "gender*temp_cold". Furthermore, each of these three split effects now has two parameters that correspond to the two levels of "gender." The regression.glm action call is equivalent to the following:

proc cas;
   regression.glm table={name='data'},
      class={{vars={'gender'}}},
      model={depVar='y',
         effects={{vars={'gender'}},
         {vars={'gender','temp_hot'},interaction='CROSS'},
         {vars={'gender','temp_warm'},interaction='CROSS'},
         {vars={'gender','temp_cold'},interaction='CROSS'}}};
run;
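The labels in both examples follow a simple pattern, which the Python sketch below reproduces. It only generates label strings and is not part of any action. `split_effect_labels` is a hypothetical helper, and the convention that the first variable varies fastest is inferred from the order in which the labels are listed above.

```python
from itertools import product


def split_effect_labels(effect_vars, levels, split_vars):
    """Labels of the independent effects produced by splitting one effect."""
    choices = []
    for var in effect_vars:
        if var in split_vars:
            # A split variable contributes one piece per level.
            choices.append([f"{var}_{lv}" for lv in levels[var]])
        else:
            choices.append([var])
    # Reverse before and after product() so the first variable varies fastest.
    return ["*".join(reversed(combo)) for combo in product(*reversed(choices))]


levels = {"temp": ["hot", "warm", "cold"], "gender": ["M", "F"]}

# Both variables split, as in the first example.
both = (split_effect_labels(["gender"], levels, {"temp", "gender"})
        + split_effect_labels(["gender", "temp"], levels, {"temp", "gender"}))

# Only temp split, as in the second example.
temp_only = (split_effect_labels(["gender"], levels, {"temp"})
             + split_effect_labels(["gender", "temp"], levels, {"temp"}))

print(len(both), both)
print(len(temp_only), temp_only)
```

The first list has eight entries and the second has four, matching the counts described in the text.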

vars='variable-name1'<,'variable-name2',…>

specifies the classification variables. This subparameter is available only for the class parameter.

Last updated: March 05, 2026