Provides actions for performing model-based clustering.
Performs model-based clustering using the EM algorithm.
If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.
| Parameter | Subparameter | Description |
|---|---|---|
| table (required) | — | specifies the input data table. |

| Parameter | Subparameter | Description |
|---|---|---|
| output | casOut (required) | creates a table that contains observationwise cluster membership probability estimates. |
| outputTables | names | lists the names of results tables to save as CAS tables on the server. |
| store | — | stores models in a blob (binary large object). |
changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored.
For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Aliases | attribute, attr |
|---|---|

specifies the convergence test to use.
| Default | LOGL |
|---|---|

specifies the covariance model.
| Aliases | covModel, covType |
|---|---|

specifies the model selection criterion.
| Default | BIC |
|---|---|
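The default criterion, BIC, trades goodness of fit against the number of free parameters. The following is a minimal sketch, assuming the common "smaller is better" form BIC = -2 log L + k ln n; the action might report the criterion on a different scale or sign. The function name `bic` and the sample fits are hypothetical and are not output of this action.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion, assumed 'smaller is better' form."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical candidate fits: (label, log likelihood, free parameters).
fits = [
    ("2 clusters", -1250.0, 5),
    ("3 clusters", -1210.0, 8),
    ("4 clusters", -1205.0, 11),
]
n_obs = 500

# Rank candidates from best (lowest BIC) to worst.
ranked = sorted(fits, key=lambda fit: bic(fit[1], fit[2], n_obs))
best = ranked[0][0]
```

In this made-up example the four-cluster fit has the highest log likelihood, but its extra parameters cost more than the fit gains, so the three-cluster fit ranks first.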
specifies a list of results tables to send to the client for display.
For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).
specifies the convergence criterion for the log likelihood in the expectation-maximization (EM) algorithm.
| Aliases | emEps, convergence, conv |
|---|---|
| Default | 1E-05 |
| Range | 0–1 |
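To illustrate how a log-likelihood convergence criterion interacts with an iteration limit, here is a minimal sketch of EM for a one-dimensional Gaussian mixture. It is not the action's implementation. The function names, the use of an absolute (rather than relative) change in the log likelihood, and the small variance floor are all assumptions; the defaults `tol=1e-5` and `max_iter=500` simply mirror the documented defaults.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def fit_em(data, means, variances, weights, tol=1e-5, max_iter=500):
    """Run EM until the log likelihood changes by less than tol (assumed test)."""
    means, variances, weights = list(means), list(variances), list(weights)
    prev_logl = None
    for iteration in range(1, max_iter + 1):
        # E step: membership probabilities and log likelihood for current parameters.
        resp, logl = [], 0.0
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            logl += math.log(total)
            resp.append([d / total for d in dens])
        # Convergence test on the log likelihood.
        if prev_logl is not None and abs(logl - prev_logl) < tol:
            return means, variances, weights, logl, iteration, True
        prev_logl = logl
        # M step: update weights, means, and variances from the memberships.
        for k in range(len(weights)):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, 1e-8)  # guard against a zero variance
    return means, variances, weights, prev_logl, max_iter, False

data = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1]
means, variances, weights, logl, n_iter, converged = fit_em(
    data, [0.0, 4.0], [1.0, 1.0], [0.5, 0.5]
)
```

A tighter tolerance generally costs more iterations; if the limit is reached first, the sketch returns the last estimates with `converged` set to `False`.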
if set to true, causes factor pattern and unique variances to be added to the parameter estimates table.
| Default | FALSE |
|---|---|

suppresses the analysis if the number of BY groups exceeds the specified value.
| Minimum value | 1 |
|---|---|

specifies the initialization method to use if no initialization variables are specified.
| Default | RANDOM |
|---|---|
specifies the maximum number of iterations for the expectation-maximization (EM) algorithm.
| Default | 500 |
|---|---|
| Range | 0–MACINT |
specifies the variables to use for analysis (effects) and the initial cluster membership probability variables (dependents).
The modelStatement value can be one or more of the following:
specifies one or more variables to use as response variables in the model. Not all models support more than one response variable.
| Aliases | depVar, target |
|---|---|
names the response variable.
specifies a list of effects that define the model. Each term in this list is made up of variables specified in the vars parameter and their interaction (which can be NONE, CROSS, or BAR). When the interaction is BAR, it can be limited by the maxInteract parameter.
The effect value can be one or more of the following:
specifies the type of interaction for the variables.
| Alias | interact |
|---|---|
| Default | NONE |
eliminates interaction effects whose order is higher than the specified integer value when used in conjunction with the BAR interaction.
specifies the variables to be nested within the term that is defined by the vars parameter. For terms with a BAR or CROSS interaction, the nest corresponds to the last variable in the vars parameter. For terms with no interaction, the nest is distributed across all variables that are listed in the vars parameter.
specifies the variables to use in defining a term of the effect. You must specify at least one variable.
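The three interaction types can be pictured as different ways of expanding a variable list into model terms. The sketch below is illustrative only: the function `expand_effect`, its argument names, and the order in which terms are returned are assumptions, not the action's behavior. It treats NONE as one term per variable, CROSS as a single term that crosses all variables, and BAR as all main effects and interactions up to an optional maximum order.

```python
from itertools import combinations

def expand_effect(variables, interaction="NONE", max_interact=None):
    """Expand one effect definition into terms; each term is a tuple of variables."""
    if not variables:
        raise ValueError("at least one variable is required")
    kind = interaction.upper()
    if kind == "NONE":
        # Every variable enters the model as its own term.
        return [(v,) for v in variables]
    if kind == "CROSS":
        # One term that crosses all listed variables.
        return [tuple(variables)]
    if kind == "BAR":
        # All main effects and interactions, optionally capped by max_interact.
        top = len(variables) if max_interact is None else min(max_interact, len(variables))
        return [combo
                for order in range(1, top + 1)
                for combo in combinations(variables, order)]
    raise ValueError(f"unknown interaction: {interaction}")

terms = expand_effect(["a", "b", "c"], "BAR", max_interact=2)
```

With three variables and a cap of 2, the three-way term is dropped and six terms remain: three main effects and three two-way interactions.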
specifies the number of Gaussian clusters.
specifies the number of factors to use in parsimonious Gaussian mixture models.
specifies whether to include a noise cluster in the model.
| Alias | hasNoiseCluster |
|---|---|
creates a table that contains observationwise cluster membership probability estimates.
The mbcOutput value can be one or more of the following:
when set to True, adds all statistics to the output table.
| Default | FALSE |
|---|---|
specifies the settings for an output table.
For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
specifies a list of one or more variables to be copied from the input table to the output table. You can alternatively specify the value ALL, ALL_MODEL, or ALL_NUMERIC, which respectively copies all variables, all variables used in the modeling, or all numeric variables from the input table to the output table.
specifies a prefix for naming the cluster membership probability estimates from the expectation (E) step that produced the mean and covariance estimates in the final maximization (M) step.
specifies a prefix for naming the cluster log likelihoods.
specifies a prefix for naming the maximum posterior probability cluster.
specifies a prefix for naming the cluster membership probability estimates from an extra expectation (E) step that uses the mean and covariance estimates from the final maximization (M) step.
| Default | "NEXT" |
|---|---|
specifies a prefix for naming the predicted values.
specifies the name for the column that contains the observation role.
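The output columns described above are built from per-observation cluster membership probabilities and from the cluster with the largest posterior probability. The following sketch shows how such values could be computed for one observation of a one-dimensional Gaussian mixture. The column naming scheme (prefix followed by a 1-based cluster number), the `MAXPOST` column name, and the helper functions are hypothetical; the actual names that the action writes are not stated here.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def membership(x, weights, means, variances):
    """Return posterior membership probabilities and the 1-based most likely cluster."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    probs = [d / total for d in dens]
    map_cluster = max(range(len(probs)), key=probs.__getitem__) + 1
    return probs, map_cluster

def output_row(x, weights, means, variances, prefix="NEXT"):
    """Build one output record; the column names are illustrative assumptions."""
    probs, map_cluster = membership(x, weights, means, variances)
    row = {f"{prefix}{k}": p for k, p in enumerate(probs, start=1)}
    row["MAXPOST"] = map_cluster
    return row

row = output_row(0.1, [0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
```

The probabilities in a row sum to 1, and the maximum posterior cluster is the index of the largest one.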
lists the names of results tables to save as CAS tables on the server.
For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).
specifies the bound below which a mixture weight is treated as zero.
| Alias | parmEps |
|---|---|
| Default | 1E-08 |
| Range | 1E-15–1 |
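One way to read this bound: any cluster whose estimated mixture weight falls below it is treated as having weight zero. The sketch below shows that interpretation only. The function name and the choice to leave the remaining weights unnormalized are assumptions; the action's handling of such clusters is not described here.

```python
def effective_weights(weights, parm_tol=1e-8):
    """Treat mixture weights below parm_tol as zero (assumed interpretation)."""
    return [w if w >= parm_tol else 0.0 for w in weights]

def active_clusters(weights, parm_tol=1e-8):
    """Return 0-based indices of clusters whose weight is not treated as zero."""
    return [k for k, w in enumerate(weights) if w >= parm_tol]

weights = [0.6, 0.4, 3e-12]
kept = active_clusters(weights)
zeroed = effective_weights(weights)
```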
specifies the seed to use for generating initial cluster memberships when initial cluster memberships are not provided.
| Minimum value | 1 |
|---|---|
specifies the singularity criterion for the covariance matrices.
| Alias | singEps |
|---|---|
| Default | 1E-08 |
| Range | 1E-15–1 |
stores models in a blob (binary large object).
| Alias | savestate |
|---|---|
| Long form | store={name="table-name"} |
| Shortcut form | store="table-name" |
The casouttable value can be one or more of the following:
specifies the name of the caslib for the output table.
specifies the descriptive label to associate with the table.
specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.
| Default | 0 |
|---|---|
| Minimum value | 0 |
specifies the memory format for the output table.
| Default | INHERIT |
|---|---|
use the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.
specifies the name for the output table.
when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.
| Default | FALSE |
|---|---|

when set to True, overwrites an existing table that has the same name.
| Default | FALSE |
|---|---|
specifies the input data table.
For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).
specifies the expectation-maximization (EM) technique to use. CEM refers to the classification EM technique.
| Default | EM |
|---|---|
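In the textbook formulation, classification EM differs from standard EM by inserting a classification step: each observation is assigned entirely to its most probable cluster before the parameters are updated. The sketch below contrasts the two updates for a single cluster mean. It illustrates the general technique under that assumption, not this action's implementation, and all names and numbers are hypothetical.

```python
def classify(probs):
    """C step: replace soft memberships with a one-hot (hard) assignment."""
    winner = max(range(len(probs)), key=probs.__getitem__)
    return [1.0 if k == winner else 0.0 for k in range(len(probs))]

def weighted_mean(values, memberships, k):
    """M-step mean update for cluster k using the given membership weights."""
    total = sum(m[k] for m in memberships)
    return sum(m[k] * v for m, v in zip(memberships, values)) / total

values = [0.0, 1.0, 4.0]
soft = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]   # E-step membership probabilities
hard = [classify(p) for p in soft]            # CEM assignments

em_mean = weighted_mean(values, soft, 0)      # every observation contributes
cem_mean = weighted_mean(values, hard, 0)     # only assigned observations contribute
```

With soft memberships, the third observation still pulls the first cluster's mean toward 4; with hard assignments, it has no influence on that cluster.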
specifies the number of fitted models to show in the summary table after model selection.
| Default | 10 |
|---|---|
| Minimum value | 1 |