Provides actions for training and scoring artificial neural networks.
Trains an artificial neural network.
If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.
| Parameter | Subparameter | Description |
|---|---|---|
| modelTable | — | specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network. |
| table (required parameter) | — | specifies the settings for an input table. |
| validTable | — | specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table. |
| Parameter | Subparameter | Description |
|---|---|---|
| casOut | — | |
| code | casOut | requests that the action produce SAS score code. Specify additional parameters. |
| nloOpts | state (and nested parameter table) | specifies the optimization options. |
| saveState | — | Specifies the table in which to save the model state for future model prediction. |
specifies the activation function for the neurons on each hidden layer.
| Alias | act |
|---|---|
requests that the action use a prespecified row ordering.
| Alias | reproducibleRowOrder |
|---|---|
| Default | FALSE |
specifies the network architecture to be trained.
| Default | GLIM |
|---|---|
| DIRECT | specifies to use an architecture that is an extension of MLP with direct connections between the input layer and the output layer. |
| GLIM | specifies to use the generalized linear model architecture. This uses a two-layer perceptron (one is the input layer and the other is the output layer) without hidden layers or units. |
| MLP | specifies to use a multilayer perceptron with one or more hidden layers. |
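To make the difference between the three architectures concrete, the following sketch counts connection weights (biases excluded) for each option. It is only an illustration under the assumption that every layer is fully connected to the next; the function `count_weights` and its arguments are hypothetical and are not part of the action.

```python
def count_weights(arch, n_inputs, n_outputs, hiddens=()):
    """Count connection weights (no biases) for a fully connected network.

    Assumption: DIRECT is treated as MLP plus one extra block of
    input-to-output connections, as the description above suggests.
    """
    arch = arch.upper()
    if arch == "GLIM":
        # Input layer connects straight to the output layer; no hidden units.
        return n_inputs * n_outputs
    if arch not in ("MLP", "DIRECT"):
        raise ValueError(f"unknown architecture: {arch}")
    if not hiddens:
        raise ValueError("MLP and DIRECT need at least one hidden layer")
    sizes = [n_inputs, *hiddens, n_outputs]
    total = sum(a * b for a, b in zip(sizes, sizes[1:]))
    if arch == "DIRECT":
        total += n_inputs * n_outputs  # direct input-to-output connections
    return total


for arch in ("GLIM", "MLP", "DIRECT"):
    hidden_layers = () if arch == "GLIM" else (5, 3)
    print(arch, count_weights(arch, 4, 2, hiddens=hidden_layers))
```

With 4 inputs, 2 outputs, and hidden layers of 5 and 3 units, the counts differ only by the 4 x 2 direct connections that DIRECT adds to MLP.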
specifies temporary attributes, such as a format, to apply to input variables.
For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Aliases | attribute, attrs, attr, varAttrs |
|---|---|
specifies a fixed bias value for all the hidden and output neurons. In this case, the bias parameters are fixed and not optimized.
For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
requests that the action produce SAS score code. Specify additional parameters.
For more information about specifying the code parameter, see the common codegen parameter (Appendix A: Common Parameters).
specifies the combination function for the neurons on each hidden layer.
| Alias | comb |
|---|---|
specifies the annealing parameter when performing a simulated annealing (SA) global optimization. Without this value, the step size and the temperature are used to perform a Monte Carlo (MC) global optimization. When you specify a value, the optimization becomes SA where the temperature is scaled by delta*t at every MC step.
specifies the dropout ratio for the hidden layers. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0–1) |
|---|---|
specifies the dropout ratio for the input layers. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0–1) |
|---|---|
specifies the error function to train the network. If you do not specify this parameter, then the ENTROPY function is used for nominal variables. The NORMAL function is used for interval variables.
specifies a numeric variable that contains the frequency of occurrence of each observation.
generates the full weight model for LBFGS.
| Default | FALSE |
|---|---|
specifies the number of hidden neurons for each hidden layer in the feedforward model. For example, hiddens={5, 3} specifies two hidden layers: one with 5 hidden neurons and the other with 3 hidden neurons. When you specify this parameter, the default architecture is multi-layer perceptron (MLP).
| Alias | hidden |
|---|---|
by default, bias parameters are included for the hidden and output units. When set to False, these parameters are not included.
| Default | TRUE |
|---|---|
specifies the input variables to use in the analysis.
For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | input |
|---|---|
calculates the weight applied to the prediction error of each nominal target variable as the total number of observations divided by the number of observations whose target class is the same as the current observation.
| Default | FALSE |
|---|---|
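The inverse-prior weight described above is the total number of observations divided by the number of observations in the same target class. A minimal sketch of that arithmetic, using a hypothetical helper `inverse_prior_weights` (not part of the action):

```python
from collections import Counter


def inverse_prior_weights(targets):
    """Return one weight per observation: N / (count of that observation's class)."""
    counts = Counter(targets)
    n_total = len(targets)
    return [n_total / counts[t] for t in targets]


# Three observations of class "a" and one of class "b":
# the rare class receives the larger weight.
weights = inverse_prior_weights(["a", "a", "a", "b"])
print(weights)
```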
specifies the nodes to be included in the output table that is generated by the DATA step scoring code. When the autoencoding of input nodes is requested, the default is HIDDEN. This value is particularly useful when autoencoding is applied to reduce the dimension of the input nodes. By reusing the node output values, machine learning algorithms such as neural networks, clustering, decision tree, and forests can use the newly encoded vectors as input.
| Default | HIDDEN |
|---|---|
| ALL | specifies to include all the nodes in the scored output table. |
| HIDDEN | specifies to include the hidden nodes only. |
| INPUT | specifies to include the input nodes only. |
| OUTPUT | specifies to include the output nodes only. |
specifies how to impute missing values for the input or target variables. If you do not specify this parameter or the parameter is NONE, then observations with missing values are ignored. For nominal variables, a new category is created for the missing values.
| MAX | specifies to replace missing values for each variable with its maximum value. |
|---|---|
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
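The imputation options can be sketched for a single interval column as follows. This is an illustration only: the helper `impute` is hypothetical, missing values are represented by `None`, and the statistics are assumed to be computed from the non-missing values. In the action, NONE excludes whole observations; here it simply drops the missing entries of the one column.

```python
def impute(values, method="NONE"):
    """Impute missing entries (None) in one interval column."""
    method = method.upper()
    present = [v for v in values if v is not None]
    if method == "NONE":
        # Observations with missing values are ignored.
        return present
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    if method == "MAX":
        fill = max(present)
    elif method == "MIN":
        fill = min(present)
    elif method == "MEAN":
        fill = sum(present) / len(present)
    else:
        raise ValueError(f"unknown imputation method: {method}")
    return [fill if v is None else v for v in values]


column = [1.0, None, 3.0]
for method in ("MAX", "MEAN", "MIN", "NONE"):
    print(method, impute(column, method))
```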
specifies a model ID variable name that is included in the generated DATA step scoring code. By default, this variable name is the target variable name with ANN_ set as the prefix.
specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network.
| Long form | modelTable={name="table-name"} |
|---|---|
| Shortcut form | modelTable="table-name" |
| Alias | model |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | FALSE |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | FALSE |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
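The matching rule for a filter table can be pictured as a semi-join on the shared column names. The sketch below is an assumption about the general idea, not the server's implementation; tables are modeled as lists of dictionaries and the helper `where_table_filter` is hypothetical.

```python
def where_table_filter(rows, filter_rows, vars=None):
    """Keep the rows whose key values also appear in the filter table."""
    if not rows or not filter_rows:
        return []
    if vars is None:
        # No vars given: use every column name common to both tables.
        vars = [name for name in rows[0] if name in filter_rows[0]]
    keys = {tuple(f[v] for v in vars) for f in filter_rows}
    return [r for r in rows if tuple(r[v] for v in vars) in keys]


data = [{"id": 1, "x": 10}, {"id": 2, "x": 20}, {"id": 3, "x": 30}]
keep = [{"id": 1, "flag": "a"}, {"id": 3, "flag": "b"}]
print(where_table_filter(data, keep))  # matches on the shared column "id"
```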
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options, dataSource |
|---|---|
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies the number of networks to select out of the specified number of tries. The networks with the smallest errors are selected as a set of optimal networks. When data is scored, the most frequent predicted values among the selected networks are used to make the final predictions. Note that you must specify a value to perform Monte Carlo or simulated annealing optimizations, which also use the delta, step, and t parameters (experimental for this release).
| Alias | numAnn |
|---|---|
| Default | 0 |
| Minimum value | 0 |
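The selection and voting behavior described for this parameter can be sketched as follows. The helpers `select_networks` and `vote` are hypothetical, and the tie-breaking behavior of the real action is not documented here, so the example avoids ties.

```python
from collections import Counter


def select_networks(errors, n_anns):
    """Return the indices of the n_anns tries with the smallest errors."""
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    return order[:n_anns]


def vote(predictions):
    """Return the most frequent predicted value among the selected networks."""
    return Counter(predictions).most_common(1)[0][0]


# Four tries with their training errors, and each try's prediction
# for a single observation.
errors = [0.30, 0.10, 0.25, 0.40]
per_try_prediction = ["yes", "no", "no", "yes"]

selected = select_networks(errors, n_anns=3)
final = vote([per_try_prediction[i] for i in selected])
print(selected, final)
```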
specifies the optimization options.
For more information about specifying the nloOpts parameter, see the common casOptml parameter (Appendix A: Common Parameters).
specifies the nominal input and target variables to use in the analysis.
For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | nominal |
|---|---|
specifies the number of tries when training networks with random initial weights. The network with the smallest error is chosen as the optimal network. Note that you must specify a value to perform Monte Carlo or simulated annealing global optimizations, which also use the delta, step, and t parameters.
| Alias | numTries |
|---|---|
| Default | 0 |
specifies the distribution for randomly generating the initial network connection weights. All the weights are in the range [-1.0, 1.0]. The initial bias values are zero. When XAVIER or MSRA is specified, the scaleInit parameter is ignored.
Resumes a training optimization using weights obtained from previous training. The initial weights for resuming the optimization are read from a temporary table with the modelTable= option. The specified framework for the model must be the same as the previous framework.
| Default | FALSE |
|---|---|
specifies the fraction of the data to use for building a neural network.
| Range | (0–1] |
|---|---|
Specifies the table in which to save the model state for future model prediction.
| Long form | saveState={name="table-name"} |
|---|---|
| Shortcut form | saveState="table-name" |
The casouttable value can be one or more of the following:
specifies the name of the caslib for the output table.
specifies the descriptive label to associate with the table.
specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.
| Default | 0 |
|---|---|
| Minimum value | 0 |
specifies the memory format for the output table.
| Default | INHERIT |
|---|---|
use the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.
specifies the name for the output table.
when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.
| Default | FALSE |
|---|---|
when set to True, overwrites an existing table that has the same name.
| Default | FALSE |
|---|---|
specifies how to scale the initial weights. If you specify 1, then the range is scaled to [-1.0 / sqrt(n), 1.0 / sqrt(n)], where n is the number of units in the previous layer. If you specify 2, then the range is scaled to [-6.0 / sqrt(n + n1), 6.0 / sqrt(n + n1)], where n1 is the number of units in the current layer.
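The two ranges stated above can be computed directly. The sketch below transcribes the formulas exactly as written in this description; the helper `init_weight_range` is hypothetical, and values other than 1 and 2 are rejected because the text does not describe them.

```python
import math


def init_weight_range(scale_init, n_prev, n_curr):
    """Return (low, high) for the initial weights, per the formulas above.

    n_prev: number of units in the previous layer (n in the text)
    n_curr: number of units in the current layer (n1 in the text)
    """
    if scale_init == 1:
        bound = 1.0 / math.sqrt(n_prev)
    elif scale_init == 2:
        bound = 6.0 / math.sqrt(n_prev + n_curr)
    else:
        raise ValueError("only the values 1 and 2 are described here")
    return (-bound, bound)


print(init_weight_range(1, n_prev=4, n_curr=3))
print(init_weight_range(2, n_prev=9, n_curr=7))
```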
specifies the random number seed for generating random numbers to initialize the network weights.
| Maximum value | MACINT |
|---|---|
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
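The three standardization options can be sketched for one interval column. This is an illustration with a hypothetical helper `standardize`; it assumes the sample standard deviation for STD (the text does not say whether the sample or population form is used) and does not guard against constant columns, which would divide by zero.

```python
import statistics


def standardize(values, method="NONE"):
    """Scale one interval column according to the chosen method."""
    method = method.upper()
    if method == "NONE":
        return list(values)
    if method == "MIDRANGE":
        low, high = min(values), max(values)
        mid = (low + high) / 2.0    # becomes 0 after scaling
        half = (high - low) / 2.0   # becomes 1 after scaling
        return [(v - mid) / half for v in values]
    if method == "STD":
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)  # assumption: sample standard deviation
        return [(v - mean) / sd for v in values]
    raise ValueError(f"unknown standardization: {method}")


column = [0.0, 5.0, 10.0]
print(standardize(column, "MIDRANGE"))
print(standardize(column, "STD"))
```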
specifies a step size for perturbations on the network weights when performing Monte Carlo or simulated annealing global optimizations.
specifies the artificial temperature parameter when performing Monte Carlo or simulated annealing global optimizations.
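The delta, step, and t parameters work together. The following sketch shows one common way such a search can be organized: perturb the weights by at most the step size, accept or reject the move with a Metropolis rule at temperature t, and, when delta is given, shrink the temperature after every step. It is an assumption about the general technique, not the action's actual algorithm, and every name in it (`global_search`, `error_fn`, `n_steps`, `seed`) is hypothetical.

```python
import math
import random


def global_search(error_fn, weights, step, t, delta=None, n_steps=200, seed=0):
    """Monte Carlo search (delta is None) or simulated annealing (delta given)."""
    rng = random.Random(seed)
    current = list(weights)
    current_err = error_fn(current)
    best, best_err = list(current), current_err
    for _ in range(n_steps):
        # Perturb every weight by a random amount within [-step, step].
        candidate = [w + rng.uniform(-step, step) for w in current]
        cand_err = error_fn(candidate)
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(-(increase in error) / t).
        if cand_err <= current_err or rng.random() < math.exp(
            (current_err - cand_err) / t
        ):
            current, current_err = candidate, cand_err
            if current_err < best_err:
                best, best_err = list(current), current_err
        if delta is not None:
            # Annealing: scale the temperature by delta, keeping it positive.
            t = max(t * delta, 1e-12)
    return best, best_err


def squared_norm(w):
    return sum(x * x for x in w)


best, err = global_search(squared_norm, [2.0, -3.0], step=0.5, t=1.0, delta=0.95)
print(best, err)
```

Omitting `delta` keeps the temperature constant, which corresponds to the plain Monte Carlo case described for the delta parameter.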
specifies the settings for an input table.
| Long form | table={name="table-name"} |
|---|---|
| Shortcut form | table="table-name" |
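The long form and the shortcut form name the same table. A minimal sketch of that equivalence, using a hypothetical helper `as_table_param` on plain Python values (the table name `train` and the caslib `casuser` are made up for illustration):

```python
def as_table_param(value):
    """Normalize a table parameter to its long (dictionary) form."""
    if isinstance(value, str):
        # Shortcut form: only the table name is given.
        return {"name": value}
    # Long form: copy so the caller's dictionary is not modified.
    return dict(value)


print(as_table_param("train"))
print(as_table_param({"name": "train", "caslib": "casuser"}))
```

The long form is needed whenever a subparameter other than the name, such as the caslib, has to be set.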
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | FALSE |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | FALSE |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options, dataSource |
|---|---|
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies the target or response variable for training. If you do not specify a target, then the artificial neural network is trained for autoencoding.
specifies the activation function for the neurons on the output layer. If you do not specify this parameter, then SOFTMAX is used for nominal variables and the IDENTITY function is used for interval variables. If no target variable is provided, so that the network is trained to encode the input nodes, then the SOFTMAX function is used.
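For reference, the two default output activations named above have standard textbook definitions, sketched here in plain Python. The function names are illustrative; the action's internal implementation is not shown in this document.

```python
import math


def softmax(z):
    """Map raw output values to probabilities that sum to 1."""
    shift = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def identity(z):
    """Pass the raw output values through unchanged."""
    return list(z)


print(softmax([2.0, 1.0, 0.1]))   # typical for a nominal target
print(identity([3.7]))            # typical for an interval target
```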
specifies the combination function for the neurons on the target output nodes.
| Default | LINEAR |
|---|---|
| ADD | adds all the incoming values without using any weights or biases. |
| LINEAR | uses a linear combination of the incoming values and weights. |
| RADIAL | uses a radial basis function with equal heights and unequal widths for all units in the layer. |
specifies how to impute missing values for the target variable. If you specify NONE for this parameter, then observations with missing target values are ignored. For nominal variables, a new category is created for the missing values.
| MAX | specifies to replace missing values for each variable with its maximum value. |
|---|---|
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table.
| Long form | validTable={name="table-name"} |
|---|---|
| Shortcut form | validTable="table-name" |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | FALSE |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | FALSE |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options, dataSource |
|---|---|
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies a variable to weight the prediction errors (the difference between the output of the network and the target value specified in the input data set) for each observation during training.
Trains an artificial neural network.
If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.
| Parameter | Subparameter | Description |
|---|---|---|
| modelTable | — | specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network. |
| table (required parameter) | — | specifies the settings for an input table. |
| validTable | — | specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table. |
| Parameter | Subparameter | Description |
|---|---|---|
| casOut | — | |
| code | casOut | requests that the action produce SAS score code. Specify additional parameters. |
| nloOpts | state (and nested parameter table) | specifies the optimization options. |
| saveState | — | Specifies the table in which to save the model state for future model prediction. |
specifies the activation function for the neurons on each hidden layer.
| Alias | act |
|---|---|
requests that the action use a prespecified row ordering.
| Alias | reproducibleRowOrder |
|---|---|
| Default | false |
specifies the network architecture to be trained.
| Default | GLIM |
|---|---|
| DIRECT | specifies to use an architecture that is an extension of MLP with direct connections between the input layer and the output layer. |
| GLIM | specifies to use the generalized linear model architecture. This uses a two-layer perceptron (one is the input layer and the other is the output layer) without hidden layers or units. |
| MLP | specifies to use a multilayer perceptron with one or more hidden layers. |
specifies temporary attributes, such as a format, to apply to input variables.
For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Aliases | attribute, attrs, attr, varAttrs |
|---|---|
specifies a fixed bias value for all the hidden and output neurons. In this case, the bias parameters are fixed and not optimized.
For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
requests that the action produce SAS score code. Specify additional parameters.
For more information about specifying the code parameter, see the common codegen parameter (Appendix A: Common Parameters).
specifies the combination function for the neurons on each hidden layer.
| Alias | comb |
|---|---|
specifies the annealing parameter when performing a simulated annealing (SA) global optimization. Without this value, the step size and the temperature are used to perform a Monte Carlo (MC) global optimization. When you specify a value, the optimization becomes SA where the temperature is scaled by delta*t at every MC step.
specifies the dropout ratio for the hidden layers. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0–1) |
|---|---|
specifies the dropout ratio for the input layers. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0–1) |
|---|---|
specifies the error function to train the network. If you do not specify this parameter, then the ENTROPY function is used for nominal variables. The NORMAL function is used for interval variables.
specifies a numeric variable that contains the frequency of occurrence of each observation.
generates the full weight model for LBFGS.
| Default | false |
|---|---|
specifies the number of hidden neurons for each hidden layer in the feedforward model. For example, hiddens={5, 3} specifies two hidden layers: one with 5 hidden neurons and the other with 3 hidden neurons. When you specify this parameter, the default architecture is multi-layer perceptron (MLP).
| Alias | hidden |
|---|---|
by default, bias parameters are included for the hidden and output units. When set to False, these parameters are not included.
| Default | true |
|---|---|
specifies the input variables to use in the analysis.
For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | input |
|---|---|
calculates the weight applied to the prediction error of each nominal target variable as the total number of observations divided by the number of observations whose target class is the same as the current observation.
| Default | false |
|---|---|
specifies the nodes to be included in the output table that is generated by the DATA step scoring code. When the autoencoding of input nodes is requested, the default is HIDDEN. This value is particularly useful when autoencoding is applied to reduce the dimension of the input nodes. By reusing the node output values, machine learning algorithms such as neural networks, clustering, decision tree, and forests can use the newly encoded vectors as input.
| Default | HIDDEN |
|---|---|
| ALL | specifies to include all the nodes in the scored output table. |
| HIDDEN | specifies to include the hidden nodes only. |
| INPUT | specifies to include the input nodes only. |
| OUTPUT | specifies to include the output nodes only. |
specifies how to impute missing values for the input or target variables. If you do not specify this parameter or the parameter is NONE, then observations with missing values are ignored. For nominal variables, a new category is created for the missing values.
| MAX | specifies to replace missing values for each variable with its maximum value. |
|---|---|
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
specifies a model ID variable name that is included in the generated DATA step scoring code. By default, this variable name is the target variable name with ANN_ set as the prefix.
specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network.
| Long form | modelTable={name="table-name"} |
|---|---|
| Shortcut form | modelTable="table-name" |
| Alias | model |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | false |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | false |
|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options, dataSource |
|---|---|
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
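The filter-table behavior described above can be sketched in plain Python. This is only an illustration of the matching rule (shared column names, filter table first, then the where expression), not the server's implementation; the function name `apply_filter_table`, the list-of-dictionaries layout, and the sample rows are all hypothetical.

```python
def apply_filter_table(rows, filter_rows, var_names=None, where=None):
    """Keep rows that match a filter row on var_names, then apply `where`."""
    if not rows or not filter_rows:
        return []
    if var_names is None:
        # No vars given: match on every column name common to both tables.
        var_names = sorted(set(rows[0]) & set(filter_rows[0]))
    keys = {tuple(f[v] for v in var_names) for f in filter_rows}
    # The filter table is applied first ...
    matched = [r for r in rows if tuple(r[v] for v in var_names) in keys]
    # ... and the where predicate (a stand-in for the where expression) second.
    if where is not None:
        matched = [r for r in matched if where(r)]
    return matched


# Hypothetical sample data.
rows = [
    {"region": "east", "x": 1},
    {"region": "west", "x": 2},
    {"region": "east", "x": 5},
]
filter_rows = [{"region": "east", "note": "keep"}]

kept = apply_filter_table(rows, filter_rows)
kept_large = apply_filter_table(rows, filter_rows, where=lambda r: r["x"] > 2)
```

Here `region` is the only shared column, so both "east" rows survive the filter table, and the predicate then keeps only the row with `x` greater than 2.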
specifies the number of networks to select out of the specified number of tries. The networks with the smallest errors are selected as a set of optimal networks. When data is scored, the most frequent predicted values among the selected networks are used to make the final predictions. Note that you must specify a value to perform Monte Carlo or simulated annealing optimizations, which also use the delta, step, and t parameters (experimental for this release).
| Alias | numAnn |
|---|---|
| Default | 0 |
| Minimum value | 0 |
specifies the optimization options.
For more information about specifying the nloOpts parameter, see the common casOptml parameter (Appendix A: Common Parameters).
specifies the nominal input and target variables to use in the analysis.
For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | nominal |
|---|
specifies the number of tries when training networks with random initial weights. The network with the smallest error is chosen as the optimal network. Note that you must specify a value to perform Monte Carlo or simulated annealing global optimizations, which also use the delta, step, and t parameters.
| Alias | numTries |
|---|---|
| Default | 0 |
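The selection step described above (train several networks, keep the ones with the smallest error) can be illustrated with a short sketch. The function name `select_networks` and the error values are hypothetical; the action performs this selection internally.

```python
def select_networks(errors, n_anns=1):
    """Return the indices of the n_anns tries with the smallest error, best first."""
    ranked = sorted(range(len(errors)), key=lambda i: errors[i])
    return ranked[:n_anns]


# Hypothetical final errors from four tries with random initial weights.
errors = [0.42, 0.17, 0.35, 0.21]

best = select_networks(errors)      # the single optimal network
top2 = select_networks(errors, 2)   # a set of two optimal networks
```

With these values, the second try (index 1) has the smallest error, followed by the fourth (index 3).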
specifies the distribution for randomly generating the initial network connection weights. All the weights are in the range [-1.0, 1.0]. The initial bias values are zero. When XAVIER or MSRA is specified, the scaleinit option is ignored.
Resumes a training optimization using weights obtained from previous training. The initial weights for resuming the optimization are read from a temporary table with the modelTable= option. The specified framework for the model must be the same as the previous framework.
| Default | false |
|---|
specifies the fraction of the data to use for building a neural network.
| Range | (0–1] |
|---|
Specifies the table in which to save the model state for future model prediction.
| Long form | saveState={name="table-name"} |
|---|---|
| Shortcut form | saveState="table-name" |
The casouttable value can be one or more of the following:
specifies the name of the caslib for the output table.
specifies the descriptive label to associate with the table.
specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.
| Default | 0 |
|---|---|
| Minimum value | 0 |
specifies the memory format for the output table.
| Default | INHERIT |
|---|
use the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.
specifies the name for the output table.
when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.
| Default | false |
|---|
when set to True, overwrites an existing table that has the same name.
| Default | false |
|---|
specifies how to scale the initial weights. If you specify 1, then the range is scaled to [-1.0 / sqrt(n), 1.0 / sqrt(n)], where n is the number of units in the previous layer. If you specify 2, then the range is scaled to [-6.0 / sqrt(n + n1), 6.0 / sqrt(n + n1)], where n1 is the number of units in the current layer.
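Taking the two formulas exactly as written above, the resulting weight ranges can be computed as follows. This is a sketch for checking the arithmetic; the function name `init_weight_range` and its argument names are hypothetical, and only the two settings described here are covered.

```python
import math


def init_weight_range(scale_init, n_prev, n_curr):
    """Return (low, high) for the initial weights under the two documented settings.

    n_prev is the number of units in the previous layer (n in the text),
    n_curr the number of units in the current layer (n1 in the text).
    """
    if scale_init == 1:
        bound = 1.0 / math.sqrt(n_prev)
    elif scale_init == 2:
        bound = 6.0 / math.sqrt(n_prev + n_curr)
    else:
        raise ValueError("only settings 1 and 2 are sketched here")
    return (-bound, bound)


range_1 = init_weight_range(1, n_prev=64, n_curr=36)
range_2 = init_weight_range(2, n_prev=64, n_curr=36)
```

For a layer of 36 units fed by 64 units, setting 1 gives a bound of 1/8 and setting 2 gives a bound of 6/10.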
specifies the random number seed for generating random numbers to initialize the network weights.
| Maximum value | MACINT |
|---|
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
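The three standardization settings can be sketched for a single interval variable as shown below. The function name `standardize` is hypothetical, and the use of the sample standard deviation for STD is an assumption; the text above does not say which estimator the action uses.

```python
from statistics import mean, stdev


def standardize(values, method="NONE"):
    """Scale one interval variable according to the chosen method."""
    if method == "NONE":
        return list(values)
    if method == "MIDRANGE":
        lo, hi = min(values), max(values)
        mid, half = (hi + lo) / 2, (hi - lo) / 2
        # Result has midrange 0 and half-range 1.
        return [(v - mid) / half for v in values]
    if method == "STD":
        # Assumption: sample standard deviation (n - 1 denominator).
        m, s = mean(values), stdev(values)
        return [(v - m) / s for v in values]
    raise ValueError(f"unknown method: {method}")


values = [2.0, 4.0, 6.0]
midrange_scaled = standardize(values, "MIDRANGE")
std_scaled = standardize(values, "STD")
```

Both methods divide by a spread measure, so a constant column would need special handling that this sketch omits.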
specifies a step size for perturbations on the network weights when performing Monte Carlo or simulated annealing global optimizations.
specifies the artificial temperature parameter when performing Monte Carlo or simulated annealing global optimizations.
specifies the settings for an input table.
| Long form | table={name="table-name"} |
|---|---|
| Shortcut form | table="table-name" |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | false |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | false |
|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options, dataSource |
|---|---|
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies the target or response variable for training. If you do not specify a target, then the artificial neural network is trained for autoencoding.
specifies the activation function for the neurons on the output layer. If you do not specify this parameter, then the SOFTMAX function is used for nominal variables and the IDENTITY function is used for interval variables. If the target variable is not provided, so that the network is trained to encode the input nodes, then the SOFTMAX function is used.
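For reference, the two default output activations named above have the following textbook definitions. The helper `default_target_act` only restates the default rule from the text; its name and its `target_level` argument are hypothetical.

```python
import math


def softmax(zs):
    """Map raw output values to probabilities that sum to 1."""
    m = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def identity(z):
    """Pass the raw output value through unchanged."""
    return z


def default_target_act(target_level):
    """target_level is "nominal", "interval", or None when no target is given."""
    return "IDENTITY" if target_level == "interval" else "SOFTMAX"


probs = softmax([1.0, 2.0, 3.0])
```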
specifies the combination function for the neurons on the target output nodes.
| Default | LINEAR |
|---|---|
| ADD | adds all the incoming values without using any weights or biases. |
| LINEAR | uses a linear combination of the incoming values and weights. |
| RADIAL | uses a radial basis function with equal heights and unequal widths for all units in the layer. |
specifies how to impute missing values for the target variable. If you specify NONE for this parameter, then observations with missing target values are ignored. For nominal variables, a new category is created for the missing values.
| MAX | specifies to replace missing values for each variable with its maximum value. |
|---|---|
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
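The imputation choices in this table can be illustrated for one interval variable, with `None` standing in for a missing value. The function name `impute` and the sample values are hypothetical, and the new category that is created for nominal variables is not covered.

```python
def impute(values, method="NONE"):
    """Replace missing values (None) in one interval variable."""
    present = [v for v in values if v is not None]
    if method == "NONE":
        # Observations with missing values are excluded.
        return present
    if method == "MAX":
        fill = max(present)
    elif method == "MEAN":
        fill = sum(present) / len(present)
    elif method == "MIN":
        fill = min(present)
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if v is None else v for v in values]


values = [1.0, None, 3.0, 8.0]
filled_mean = impute(values, "MEAN")
```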
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table.
| Long form | validTable={name="table-name"} |
|---|---|
| Shortcut form | validTable="table-name" |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | false |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | false |
|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options, dataSource |
|---|---|
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies a variable to weight the prediction errors (the difference between the output of the network and the target value specified in the input data set) for each observation during training.
Trains an artificial neural network.
If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.
| Parameter | Subparameter | Description |
|---|---|---|
| modelTable | — | specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network. |
| table (required parameter) | — | specifies the settings for an input table. |
| validTable | — | specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table. |

| Parameter | Subparameter | Description |
|---|---|---|
| casOut | — | |
| code | casOut | requests that the action produce SAS score code. Specify additional parameters. |
| nloOpts | state (and nested parameter table) | specifies the optimization options. |
| saveState | — | Specifies the table in which to save the model state for future model prediction. |
specifies the activation function for the neurons on each hidden layer.
| Alias | act |
|---|
specifies that the action uses a prespecified row ordering.
| Alias | reproducibleRowOrder |
|---|---|
| Default | False |
specifies the network architecture to be trained.
| Default | GLIM |
|---|---|
| DIRECT | specifies to use an architecture that is an extension of MLP with direct connections between the input layer and the output layer. |
| GLIM | specifies to use the generalized linear model architecture. This uses a two-layer perceptron (one is the input layer and the other is the output layer) without hidden layers or units. |
| MLP | specifies to use a multilayer perceptron with one or more hidden layers. |
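To make the difference between the three architectures concrete, the sketch below counts connection weights for each one. It assumes fully connected layers and ignores bias terms, so the totals might not match what the action reports (for example, a nominal input can occupy more than one input node). The function name `connection_count` and the layer sizes are hypothetical.

```python
def connection_count(arch, n_inputs, hiddens, n_outputs):
    """Count connection weights (biases excluded) for a fully connected network."""
    direct = n_inputs * n_outputs
    if arch == "GLIM":
        # Input layer connected straight to the output layer, no hidden units.
        return direct
    layers = [n_inputs, *hiddens, n_outputs]
    mlp = sum(a * b for a, b in zip(layers, layers[1:]))
    if arch == "MLP":
        return mlp
    if arch == "DIRECT":
        # MLP plus direct input-to-output connections.
        return mlp + direct
    raise ValueError(f"unknown architecture: {arch}")


glim = connection_count("GLIM", 4, [], 2)
mlp = connection_count("MLP", 4, [5, 3], 2)
direct = connection_count("DIRECT", 4, [5, 3], 2)
```

With 4 inputs, hidden layers of 5 and 3 units, and 2 outputs, MLP has 4*5 + 5*3 + 3*2 = 41 connections, and DIRECT adds the 8 input-to-output connections that GLIM consists of.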
specifies temporary attributes, such as a format, to apply to input variables.
For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Aliases | attribute, attrs, attr, varAttrs |
|---|---|
specifies a fixed bias value for all the hidden and output neurons. In this case, the bias parameters are fixed and not optimized.
For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
requests that the action produce SAS score code. Specify additional parameters.
For more information about specifying the code parameter, see the common codegen parameter (Appendix A: Common Parameters).
specifies the combination function for the neurons on each hidden layer.
| Alias | comb |
|---|
specifies the annealing parameter when performing a simulated annealing (SA) global optimization. Without this value, the step size and the temperature are used to perform a Monte Carlo (MC) global optimization. When you specify a value, the optimization becomes SA where the temperature is scaled by delta*t at every MC step.
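One way to read the description above is that each Monte Carlo step replaces the temperature t with delta*t. The sketch below shows that schedule together with the textbook Metropolis acceptance rule. Both the reading of the update rule and the use of the Metropolis rule are assumptions; this documentation does not state how the action accepts or rejects a perturbation. The function names are hypothetical.

```python
import math


def temperature_schedule(t, delta, n_steps):
    """Temperatures seen at each MC step, assuming t is replaced by delta * t."""
    temps = []
    for _ in range(n_steps):
        temps.append(t)
        t = delta * t
    return temps


def acceptance_probability(error_increase, t):
    """Textbook Metropolis rule (an assumption, not confirmed for this action)."""
    if error_increase <= 0:
        return 1.0  # an improvement is always accepted
    return math.exp(-error_increase / t)


temps = temperature_schedule(1.0, 0.5, 4)
```

Under this reading, a delta below 1 cools the system, so weight perturbations that increase the error are accepted less and less often as the steps proceed.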
specifies the dropout ratio for the hidden layers. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0–1) |
|---|
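As a generic illustration of what a dropout ratio means, the sketch below zeroes each activation with probability equal to the ratio and rescales the survivors ("inverted dropout"). The rescaling convention is an assumption, since the text above does not say how the action compensates for dropped units. The function name `apply_dropout` is hypothetical.

```python
import random


def apply_dropout(activations, ratio, rng):
    """Zero each activation with probability `ratio`; rescale the rest."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("dropout ratio must be in the range [0, 1)")
    keep = 1.0 - ratio
    # Assumption: inverted dropout, so the expected activation is unchanged.
    return [a / keep if rng.random() >= ratio else 0.0 for a in activations]


rng = random.Random(0)  # seeded so that repeated runs drop the same units
hidden = [1.0, 2.0, 3.0, 4.0]
dropped = apply_dropout(hidden, 0.5, rng)
```

A ratio of 0 leaves the layer unchanged, and a ratio of 1 is rejected because it falls outside the documented range.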
specifies the dropout ratio for the input layer. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0–1) |
|---|
specifies the error function to train the network. If you do not specify this parameter, then the ENTROPY function is used for nominal variables. The NORMAL function is used for interval variables.
specifies a numeric variable that contains the frequency of occurrence of each observation.
Generates the full weight model for LBFGS.
| Default | False |
|---|
specifies the number of hidden neurons for each hidden layer in the feedforward model. For example, hiddens={5, 3} specifies two hidden layers: one with 5 hidden neurons and the other with 3 hidden neurons. When you specify this parameter, the default architecture is multilayer perceptron (MLP).
| Alias | hidden |
|---|
by default, bias parameters are included for the hidden and output units. When set to False, these parameters are not included.
| Default | True |
|---|
specifies the input variables to use in the analysis.
For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | input |
|---|
calculates the weight applied to the prediction error of each nominal target variable as the total number of observations divided by the number of observations whose target class is the same as the current observation.
| Default | False |
|---|
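The weight described above (total number of observations divided by the number of observations in the same target class) can be computed as follows. The function name `inverse_prior_weights` and the sample target values are hypothetical.

```python
from collections import Counter


def inverse_prior_weights(targets):
    """Weight each observation by n / (count of observations in its class)."""
    counts = Counter(targets)
    n = len(targets)
    return [n / counts[t] for t in targets]


targets = ["a", "a", "a", "b"]
weights = inverse_prior_weights(targets)
```

The rare class "b" receives a weight of 4, and each observation of the common class "a" receives 4/3, so errors on rare classes count for more during training.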
specifies the nodes to be included in the output table that is generated by the DATA step scoring code. When the autoencoding of input nodes is requested, the default is HIDDEN. This value is particularly useful when autoencoding is applied to reduce the dimension of the input nodes. By reusing the node output values, machine learning algorithms such as neural networks, clustering, decision trees, and forests can use the newly encoded vectors as input.
| Default | HIDDEN |
|---|---|
| ALL | specifies to include all the nodes in the scored output table. |
| HIDDEN | specifies to include the hidden nodes only. |
| INPUT | specifies to include the input nodes only. |
| OUTPUT | specifies to include the output nodes only. |
specifies how to impute missing values for the input or target variables. If you do not specify this parameter or the parameter is NONE, then observations with missing values are ignored. For nominal variables, a new category is created for the missing values.
| MAX | specifies to replace missing values for each variable with its maximum value. |
|---|---|
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
specifies a model ID variable name that is included in the generated DATA step scoring code. By default, this variable name is the target variable name with ANN_ set as the prefix.
specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network.
| Long form | modelTable={"name":"table-name"} |
|---|---|
| Shortcut form | modelTable="table-name" |
| Alias | model |
|---|
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | False |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import_ |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | False |
|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options, dataSource |
|---|---|
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import_ |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies the number of networks to select out of the specified number of tries. The networks with the smallest errors are selected as a set of optimal networks. When data is scored, the most frequent predicted values among the selected networks are used to make the final predictions. Note that you must specify a value to perform Monte Carlo or simulated annealing optimizations, which also use the delta, step, and t parameters (experimental for this release).
| Alias | numAnn |
|---|---|
| Default | 0 |
| Minimum value | 0 |
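The scoring rule described above, in which the most frequent predicted value among the selected networks becomes the final prediction, amounts to a majority vote per observation. The sketch below illustrates it with hypothetical predictions from three networks. How the action breaks ties is not stated here; this sketch keeps the value that was seen first.

```python
from collections import Counter


def vote(predictions_per_network):
    """Return the most frequent predicted value for each observation."""
    n_obs = len(predictions_per_network[0])
    final = []
    for i in range(n_obs):
        votes = Counter(net[i] for net in predictions_per_network)
        final.append(votes.most_common(1)[0][0])
    return final


# Hypothetical predictions for three observations from three selected networks.
predictions = [
    ["yes", "no", "no"],
    ["yes", "yes", "no"],
    ["no", "yes", "no"],
]
final = vote(predictions)
```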
specifies the optimization options.
For more information about specifying the nloOpts parameter, see the common casOptml parameter (Appendix A: Common Parameters).
specifies the nominal input and target variables to use in the analysis.
For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | nominal |
|---|
specifies the number of tries when training networks with random initial weights. The network with the smallest error is chosen as the optimal network. Note that you must specify a value to perform Monte Carlo or simulated annealing global optimizations, which also use the delta, step, and t parameters.
| Alias | numTries |
|---|---|
| Default | 0 |
specifies the distribution for randomly generating the initial network connection weights. All the weights are in the range [-1.0, 1.0]. The initial bias values are zero. When XAVIER or MSRA is specified, the scaleinit option is ignored.
Resumes a training optimization using weights obtained from previous training. The initial weights for resuming the optimization are read from a temporary table with the modelTable= option. The specified framework for the model must be the same as the previous framework.
| Default | False |
|---|
specifies the fraction of the data to use for building a neural network.
| Range | (0–1] |
|---|
Specifies the table in which to save the model state for future model prediction.
| Long form | saveState={"name":"table-name"} |
|---|---|
| Shortcut form | saveState="table-name" |
The casouttable value can be one or more of the following:
specifies the name of the caslib for the output table.
specifies the descriptive label to associate with the table.
specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.
| Default | 0 |
|---|---|
| Minimum value | 0 |
specifies the memory format for the output table.
| Default | INHERIT |
|---|
use the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.
specifies the name for the output table.
when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.
| Default | False |
|---|
when set to True, overwrites an existing table that has the same name.
| Default | False |
|---|
specifies how to scale the initial weights. If you specify 1, then the range is scaled to [-1.0 / sqrt(n), 1.0 / sqrt(n)], where n is the number of units in the previous layer. If you specify 2, then the range is scaled to [-6.0 / sqrt(n + n1), 6.0 / sqrt(n + n1)], where n1 is the number of units in the current layer.
specifies the random number seed for generating random numbers to initialize the network weights.
| Maximum value | MACINT |
|---|
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
specifies a step size for perturbations on the network weights when performing Monte Carlo or simulated annealing global optimizations.
specifies the artificial temperature parameter when performing Monte Carlo or simulated annealing global optimizations.
specifies the settings for an input table.
| Long form | table={"name":"table-name"} |
|---|---|
| Shortcut form | table="table-name" |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | False |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|
specifies data source options.
| Aliases | options, dataSource |
|---|---|
specifies the settings for reading a table from a data source.
| Alias | import_ |
|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | False |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
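The matching rule can be pictured as a semi-join on the shared variable names. The following plain-Python sketch assumes that "matching" means equality on every key variable and represents tables as lists of dicts; `semi_join` is a hypothetical name, and the server's actual matching logic is not described in this reference:

```python
def semi_join(table, filter_table, keys=None):
    """Keep rows of `table` whose key values appear in `filter_table`.

    If `keys` is None, use every variable name common to both tables,
    mirroring the behavior described when vars is not specified.
    """
    if keys is None:
        table_cols = set().union(*(row.keys() for row in table))
        filter_cols = set().union(*(row.keys() for row in filter_table))
        keys = sorted(table_cols & filter_cols)
    wanted = {tuple(row[k] for k in keys) for row in filter_table}
    return [row for row in table if tuple(row[k] for k in keys) in wanted]


rows = [{"id": 1, "x": 10}, {"id": 2, "x": 20}, {"id": 3, "x": 30}]
keep = [{"id": 2}, {"id": 3}]
print(semi_join(rows, keep))
```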
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import_ |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies the target or response variable for training. If you do not specify a target, then the artificial neural network is trained for autoencoding.
specifies the activation function for the neurons on the output layer. If you do not specify this parameter, then the SOFTMAX function is used for nominal variables and the IDENTITY function is used for interval variables. If no target variable is provided (that is, the network is trained to encode the input nodes), then the SOFTMAX function is used.
specifies the combination function for the neurons on the target output nodes.
| Default | LINEAR |
|---|---|
| ADD | adds all the incoming values without using any weights or biases. |
| LINEAR | uses a linear combination of the incoming values and weights. |
| RADIAL | uses a radial basis function with equal heights and unequal widths for all units in the layer. |
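As a rough illustration of the first two options, the sketch below computes the ADD and LINEAR combinations for a single neuron in plain Python. The function names and the optional `bias` argument are assumptions made for this example. RADIAL is omitted because this reference does not give its formula.

```python
def combine_add(values):
    """ADD: sum the incoming values, with no weights or biases."""
    return sum(values)


def combine_linear(values, weights, bias=0.0):
    """LINEAR: weighted sum of the incoming values, plus an optional bias."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values)) + bias


print(combine_add([1.0, 2.0, 3.0]))
print(combine_linear([1.0, 2.0], [0.5, 0.25], bias=1.0))
```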
specifies how to impute missing values for the target variable. If you specify NONE for this parameter, then observations with missing target values are ignored. For nominal variables, a new category is created for the missing values.
| Value | Description |
|---|---|
| MAX | specifies to replace missing values for each variable with its maximum value. |
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
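The four options can be sketched on a single interval column in plain Python, with `None` standing in for a missing value. The helper name `impute` is hypothetical, and the separate handling of nominal variables (a new category for missing values) is not shown:

```python
from statistics import fmean


def impute(values, method):
    """Apply MAX, MEAN, MIN, or NONE imputation to one column."""
    present = [v for v in values if v is not None]
    if method == "NONE":
        # Observations with missing values are excluded
        return present
    fill = {"MAX": max, "MEAN": fmean, "MIN": min}[method](present)
    return [fill if v is None else v for v in values]


column = [1.0, None, 3.0]
for method in ("MAX", "MEAN", "MIN", "NONE"):
    print(method, impute(column, method))
```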
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
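The two scalings can be sketched in plain Python as follows. The function names are hypothetical, the column is assumed to be non-constant, and the use of the sample standard deviation (divisor n - 1) is an assumption, because this reference does not state which divisor is used:

```python
import statistics


def midrange_scale(values):
    """MIDRANGE: map the column so its midrange is 0 and its half-range is 1."""
    lo, hi = min(values), max(values)
    mid = (hi + lo) / 2.0
    half = (hi - lo) / 2.0
    return [(v - mid) / half for v in values]


def std_scale(values):
    """STD: map the column to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return [(v - mean) / sd for v in values]


print(midrange_scale([2.0, 4.0, 10.0]))
print(std_scale([2.0, 4.0, 10.0]))
```

After MIDRANGE scaling, the smallest value maps to -1 and the largest to 1.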
specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table.
| Long form | validTable={"name":"table-name"} |
|---|---|
| Shortcut form | validTable="table-name" |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | False |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
specifies the settings for reading a table from a data source.
| Alias | import_ |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | False |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import_ |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies a variable to weight the prediction errors (the difference between the output of the network and the target value specified in the input data set) for each observation during training.
Trains an artificial neural network.
If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.
| Parameter | Subparameter | Description |
|---|---|---|
| modelTable | — | specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network. |
| table (required parameter) | — | specifies the settings for an input table. |
| validTable | — | specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table. |

| Parameter | Subparameter | Description |
|---|---|---|
| casOut | — | |
| code | casOut | requests that the action produce SAS score code. Specify additional parameters. |
| nloOpts | state (and nested parameter table) | specifies the optimization options. |
| saveState | — | specifies the table in which to save the model state for future model prediction. |
specifies the activation function for the neurons on each hidden layer.
| Alias | act |
|---|---|
specifies that the action uses a prespecified row ordering.
| Alias | reproducibleRowOrder |
|---|---|
| Default | FALSE |
specifies the network architecture to be trained.
| Default | GLIM |
|---|---|
| DIRECT | specifies to use an architecture that is an extension of MLP with direct connections between the input layer and the output layer. |
| GLIM | specifies to use the generalized linear model architecture. This uses a two-layer perceptron (one is the input layer and the other is the output layer) without hidden layers or units. |
| MLP | specifies to use a multilayer perceptron with one or more hidden layers. |
specifies temporary attributes, such as a format, to apply to input variables.
For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Aliases | attribute |
|---|---|
| | attrs |
| | attr |
| | varAttrs |
specifies a fixed bias value for all the hidden and output neurons. In this case, the bias parameters are fixed and not optimized.
For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
requests that the action produce SAS score code. Specify additional parameters.
For more information about specifying the code parameter, see the common codegen parameter (Appendix A: Common Parameters).
specifies the combination function for the neurons on each hidden layer.
| Alias | comb |
|---|---|
specifies the annealing parameter when performing a simulated annealing (SA) global optimization. Without this value, the step size and the temperature are used to perform a Monte Carlo (MC) global optimization. When you specify a value, the optimization becomes SA where the temperature is scaled by delta*t at every MC step.
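One possible reading of this description is that the temperature is multiplied by delta after every Monte Carlo step. The plain-Python sketch below shows only that schedule; the function name and the multiplicative update are assumptions, and the actual update rule used by the action is not spelled out in this reference:

```python
def temperature_schedule(t, delta, steps):
    """Return the temperature at each of `steps` MC steps.

    Assumption: the temperature is multiplied by `delta` after each step.
    """
    temperatures = []
    for _ in range(steps):
        temperatures.append(t)
        t = delta * t
    return temperatures


print(temperature_schedule(1.0, 0.5, 3))
```

Under this reading, a delta below 1 cools the system over time, while leaving the temperature fixed corresponds to the plain Monte Carlo case.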
specifies the dropout ratio for the hidden layers. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0, 1) |
|---|---|
specifies the dropout ratio for the input layers. This parameter is valid only when SGD is used for network layer optimization and all the connections use the linear combination function.
| Range | [0, 1) |
|---|---|
specifies the error function used to train the network. If you do not specify this parameter, then the ENTROPY function is used for nominal variables and the NORMAL function is used for interval variables.
specifies a numeric variable that contains the frequency of occurrence of each observation.
generates the full weight model for LBFGS.
| Default | FALSE |
|---|---|
specifies the number of hidden neurons for each hidden layer in the feedforward model. For example, hiddens={5, 3} specifies two hidden layers: one with 5 hidden neurons and the other with 3 hidden neurons. When you specify this parameter, the default architecture is multi-layer perceptron (MLP).
| Alias | hidden |
|---|---|
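To see what a hidden-layer specification such as {5, 3} implies for model size, the following plain-Python sketch counts the connection weights and bias parameters of a fully connected MLP. It assumes one unit per input and per output (nominal variables typically expand to several units, which is not modeled here), and `count_mlp_parameters` is a hypothetical name:

```python
def count_mlp_parameters(n_inputs, hiddens, n_outputs, include_bias=True):
    """Count weights and biases for layers sized n_inputs -> hiddens -> n_outputs."""
    sizes = [n_inputs, *hiddens, n_outputs]
    # One weight per connection between consecutive layers
    weights = sum(a * b for a, b in zip(sizes, sizes[1:]))
    # One bias per hidden and output unit, if biases are included
    biases = sum(sizes[1:]) if include_bias else 0
    return weights, biases


# 4 inputs, hidden layers of 5 and 3 neurons, 2 outputs
print(count_mlp_parameters(4, [5, 3], 2))
```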
by default, bias parameters are included for the hidden and output units. When set to False, these parameters are not included.
| Default | TRUE |
|---|---|
specifies the input variables to use in the analysis.
For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | input |
|---|---|
calculates the weight applied to the prediction error of each nominal target variable as the total number of observations divided by the number of observations whose target class is the same as the current observation.
| Default | FALSE |
|---|---|
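The weight described above is the total number of observations divided by the size of the observation's own target class. A minimal plain-Python sketch of that arithmetic, using the hypothetical name `inverse_prior_weights`:

```python
from collections import Counter


def inverse_prior_weights(targets):
    """Return one weight per observation: N / (count of its target class)."""
    counts = Counter(targets)
    total = len(targets)
    return [total / counts[t] for t in targets]


# Three observations of class "a" and one of class "b"
print(inverse_prior_weights(["a", "a", "a", "b"]))
```

Observations from rare classes receive larger weights, so their prediction errors count for more during training.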
specifies the nodes to be included in the output table that is generated by the DATA step scoring code. When the autoencoding of input nodes is requested, the default is HIDDEN. This value is particularly useful when autoencoding is applied to reduce the dimension of the input nodes. By reusing the node output values, machine learning algorithms such as neural networks, clustering, decision tree, and forests can use the newly encoded vectors as input.
| Default | HIDDEN |
|---|---|
| ALL | specifies to include all the nodes in the scored output table. |
| HIDDEN | specifies to include the hidden nodes only. |
| INPUT | specifies to include the input nodes only. |
| OUTPUT | specifies to include the output nodes only. |
specifies how to impute missing values for the input or target variables. If you do not specify this parameter or the parameter is NONE, then observations with missing values are ignored. For nominal variables, a new category is created for the missing values.
| Value | Description |
|---|---|
| MAX | specifies to replace missing values for each variable with its maximum value. |
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
specifies a model ID variable name that is included in the generated DATA step scoring code. By default, this variable name is the target variable name with ANN_ set as the prefix.
specifies the table that contains the artificial neural network model. The weights in this table are loaded to initialize the neural network.
| Long form | modelTable=list(name="table-name") |
|---|---|
| Shortcut form | modelTable="table-name" |
| Alias | model |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | FALSE |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | FALSE |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies the number of networks to select out of the specified number of tries. The networks with the smallest errors are selected as a set of optimal networks. When data is scored, the most frequent predicted values among the selected networks are used to make the final predictions. You must specify a value to perform Monte Carlo or simulated annealing optimizations, which also use the delta, step, and t parameters (experimental for this release).
| Alias | numAnn |
|---|---|
| Default | 0 |
| Minimum value | 0 |
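The scoring rule for a set of selected networks amounts to a per-row majority vote. A plain-Python sketch follows; `vote` is a hypothetical name, and the tie-breaking shown (the value seen first among the tied ones) is an assumption, because this reference does not say how ties are resolved:

```python
from collections import Counter


def vote(predictions_per_network):
    """Return the most frequent prediction for each row.

    `predictions_per_network` holds one list of predictions per network,
    all of the same length (one entry per scored row).
    """
    return [
        Counter(row).most_common(1)[0][0]
        for row in zip(*predictions_per_network)
    ]


# Three networks scoring two rows
print(vote([["a", "b"], ["a", "c"], ["b", "c"]]))
```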
specifies the optimization options.
For more information about specifying the nloOpts parameter, see the common casOptml parameter (Appendix A: Common Parameters).
specifies the nominal input and target variables to use in the analysis.
For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).
| Alias | nominal |
|---|---|
specifies the number of tries when training networks with random initial weights. The network with the smallest error is chosen as the optimal network. You must specify a value to perform Monte Carlo or simulated annealing global optimizations, which also use the delta, step, and t parameters.
| Alias | numTries |
|---|---|
| Default | 0 |
specifies the distributions for randomly generating the initial network connection weights. All the weights are in the range [-1.0, 1.0]. The initial bias values are zero. When XAVIER or MSRA is specified, the scaleinit option will be ignored.
Resumes a training optimization using weights obtained from previous training. The initial weights for resuming the optimization are read from a temporary table with the modelTable= option. The specified framework for the model must be the same as the previous framework.
| Default | FALSE |
|---|---|
specifies the fraction of the data to use for building a neural network.
| Range | (0, 1] |
|---|---|
Specifies the table in which to save the model state for future model prediction.
| Long form | saveState=list(name="table-name") |
|---|---|
| Shortcut form | saveState="table-name" |
The casouttable value can be one or more of the following:
specifies the name of the caslib for the output table.
specifies the descriptive label to associate with the table.
specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.
| Default | 0 |
|---|---|
| Minimum value | 0 |
specifies the memory format for the output table.
| Default | INHERIT |
|---|---|
use the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.
specifies the name for the output table.
when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.
| Default | FALSE |
|---|---|
when set to True, overwrites an existing table that has the same name.
| Default | FALSE |
|---|---|
specifies how to scale the initial weights. If you specify 1, then the range is scaled to [-1.0 / sqrt(n), 1.0 / sqrt(n)], where n is the number of units in the previous layer. If you specify 2, then the range is scaled to [-6.0 / sqrt(n + n1), 6.0 / sqrt(n + n1)], where n1 is the number of units in the current layer.
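The two ranges can be computed directly from the layer sizes. The plain-Python sketch below follows the formulas exactly as stated above; `scaled_weight_range` is a hypothetical name, and values other than 1 and 2 are rejected only because this reference does not describe them:

```python
import math


def scaled_weight_range(scale_init, n_prev, n_curr):
    """Return (low, high) for the initial weights.

    n_prev: number of units in the previous layer (n above)
    n_curr: number of units in the current layer (n1 above)
    """
    if scale_init == 1:
        bound = 1.0 / math.sqrt(n_prev)
    elif scale_init == 2:
        bound = 6.0 / math.sqrt(n_prev + n_curr)
    else:
        raise ValueError("only the values 1 and 2 are described here")
    return (-bound, bound)


print(scaled_weight_range(1, 4, 5))
print(scaled_weight_range(2, 4, 5))
```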
specifies the random number seed for generating random numbers to initialize the network weights.
| Maximum value | MACINT |
|---|---|
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
specifies a step size for perturbations on the network weights when performing Monte Carlo or simulated annealing global optimizations.
specifies the artificial temperature parameter when performing Monte Carlo or simulated annealing global optimizations.
specifies the settings for an input table.
| Long form | table=list(name="table-name") |
|---|---|
| Shortcut form | table="table-name" |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | FALSE |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | FALSE |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies the target or response variable for training. If you do not specify a target, then the artificial neural network is trained for autoencoding.
specifies the activation function for the neurons on the output layer. If you do not specify this parameter, then the SOFTMAX function is used for nominal variables and the IDENTITY function is used for interval variables. If no target variable is provided (that is, the network is trained to encode the input nodes), then the SOFTMAX function is used.
specifies the combination function for the neurons on the target output nodes.
| Default | LINEAR |
|---|---|
| ADD | adds all the incoming values without using any weights or biases. |
| LINEAR | uses a linear combination of the incoming values and weights. |
| RADIAL | uses a radial basis function with equal heights and unequal widths for all units in the layer. |
specifies how to impute missing values for the target variable. If you specify NONE for this parameter, then observations with missing target values are ignored. For nominal variables, a new category is created for the missing values.
| Value | Description |
|---|---|
| MAX | specifies to replace missing values for each variable with its maximum value. |
| MEAN | specifies to replace missing values for each variable with its mean value. |
| MIN | specifies to replace missing values for each variable with its minimum value. |
| NONE | specifies to exclude the observations with missing values. |
specifies the standardization to use on the interval variables.
| Default | NONE |
|---|---|
| MIDRANGE | specifies to scale the variables to a midrange of 0 and a half-range of 1. |
| NONE | specifies not to alter the variables. |
| STD | specifies to scale the variables to a mean of 0 and a standard deviation of 1. |
specifies the table with the validation data. Using a validation table enables the early stopping of the iteration process with the nloOpts parameter. The validation table must have the same columns and data types as the training table.
| Long form | validTable=list(name="table-name") |
|---|---|
| Shortcut form | validTable="table-name" |
The castable value can be one or more of the following:
specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.
when set to True, creates the computed variables when the table is loaded instead of when the action begins.
| Alias | compOnDemand |
|---|---|
| Default | FALSE |
specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.
| Alias | compVars |
|---|---|
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for each computed variable that you include in the computedVars parameter.
| Alias | compPgm |
|---|---|
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the input table.
when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.
| Default | FALSE |
|---|---|
specifies the variables to use in the action.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the input data.
specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.
The groupbytable value can be one or more of the following:
specifies the caslib for the filter table. By default, the active caslib is used.
specifies data source options.
| Aliases | options |
|---|---|
| | dataSource |
For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).
specifies the settings for reading a table from a data source.
| Alias | import |
|---|---|
For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).
specifies the name of the filter table.
specifies the variable names to use from the filter table.
The casinvardesc value can be one or more of the following:
specifies the format to apply to the variable.
specifies the length of the format field plus the length of the format precision.
specifies the descriptive label for the variable.
specifies the name for the variable.
specifies the length of the format precision.
specifies the length of the format field.
specifies an expression for subsetting the data from the filter table.
specifies a variable to weight the prediction errors (the difference between the output of the network and the target value specified in the input data set) for each observation during training.