Nonparametric Bayes Action Set

Nonparametric Bayes

gpReg Action

Learns a Gaussian process regression model.

CASL Syntax

nonParametricBayes.gpReg <result=results> <status=rc> /
applyRowOrder=TRUE | FALSE,
attributes={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
display={
caseSensitive=TRUE | FALSE,
exclude=TRUE | FALSE,
excludeAll=TRUE | FALSE,
keyIsPath=TRUE | FALSE,
names={"string-1" <, "string-2", ...>},
pathType="LABEL" | "NAME",
traceNames=TRUE | FALSE
},
fixInducingPoints=TRUE | FALSE,
fixKernelParmFirstIter=TRUE | FALSE,
* inputs={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
jitterMaxIters=integer,
nInducingPoints=integer,
nloOpts={
optmlOpt={
fConv=double
fConvWindow=64-bit-integer
gTol=double
maxEvals=64-bit-integer
maxIters=64-bit-integer
maxTime=double
regL1=double
regL2=double
},
sgdOpt={
adaptiveRate=TRUE | FALSE
commFreq=64-bit-integer
learningRate=double
miniBatchSize=64-bit-integer
momentum=double
seed=64-bit-integer
useLocking=TRUE | FALSE
},
validate={
frequency=64-bit-integer
goal=double
stagnation=64-bit-integer
threshold=double
thresholdIter=64-bit-integer
}
},
outInducingPoints={
caslib="string",
compress=TRUE | FALSE,
indexVars={"variable-name-1" <, "variable-name-2", ...>},
label="string",
lifetime=64-bit-integer,
maxMemSize=64-bit-integer,
memoryFormat="DVR" | "INHERIT" | "STANDARD",
name="table-name",
promote=TRUE | FALSE,
replace=TRUE | FALSE,
replication=integer,
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",
threadBlockSize=64-bit-integer,
timeStamp="string",
where={"string-1" <, "string-2", ...>}
},
output={
* casOut={
caslib="string"
compress=TRUE | FALSE
indexVars={"variable-name-1" <, "variable-name-2", ...>}
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=TRUE | FALSE
replace=TRUE | FALSE
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where={"string-1" <, "string-2", ...>}
},
copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>},
role="string"
},
outputTables={
groupByVarsRaw=TRUE | FALSE,
includeAll=TRUE | FALSE,
names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},
repeated=TRUE | FALSE,
replace=TRUE | FALSE
},
outVariationalCov={
caslib="string",
compress=TRUE | FALSE,
indexVars={"variable-name-1" <, "variable-name-2", ...>},
label="string",
lifetime=64-bit-integer,
maxMemSize=64-bit-integer,
memoryFormat="DVR" | "INHERIT" | "STANDARD",
name="table-name",
promote=TRUE | FALSE,
replace=TRUE | FALSE,
replication=integer,
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",
threadBlockSize=64-bit-integer,
timeStamp="string",
where={"string-1" <, "string-2", ...>}
},
partByFrac={
seed=integer,
test=double,
validate=double
},
partByVar={
* name="variable-name",
test="string",
train="string",
validate="string"
},
saveState={
caslib="string",
label="string",
lifetime=64-bit-integer,
name="table-name",
promote=TRUE | FALSE,
replace=TRUE | FALSE,
},
seed=double,
* table={
caslib="string",
computedOnDemand=TRUE | FALSE,
computedVars={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},
* name="table-name",
singlePass=TRUE | FALSE,
vars={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
where="where-expression",
whereTable={
casLib="string"
dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}
* name="table-name"
vars={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}}
where="where-expression"
}
},
* target="variable-name",
useSimpleInit=TRUE | FALSE
;
* indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

| Parameter | Subparameter | Description |
|---|---|---|
| * table | | specifies the settings for an input table. |

Parameters for Creating Output Tables

| Parameter | Subparameter | Description |
|---|---|---|
| outInducingPoints | | specifies the output data table in which to save the estimated mean and standard deviation at inducing points. |
| outVariationalCov | | specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points. |
| output | * casOut | specifies the output data table in which to save the scored observations. |
| outputTables | names | lists the names of results tables to save as CAS tables on the server. |
| saveState | | specifies the output data table in which to save the state of the Gaussian process regression for future scoring. |

Parameter Descriptions

applyRowOrder=TRUE | FALSE

specifies that the action uses a prespecified row ordering. This requires specifying the orderby and groupby parameters in a preliminary table.partition action call.

Alias reproducibleRowOrder
Default FALSE

attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}

changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias attribute

autoRelevanceDetermination=TRUE | FALSE

when set to True, uses automatic relevance determination in the kernel function.

Alias ard
Default FALSE
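
As a rough illustration of what automatic relevance determination usually means for a kernel, the sketch below gives a radial basis function kernel one length scale per input dimension. The function name rbf_ard, the parameterization, and the default values are assumptions made for illustration; this reference does not state the exact form that the action uses.

```python
import math


def rbf_ard(x, y, variance=1.0, length_scales=None):
    """Textbook RBF kernel with one length scale per input dimension.

    Hypothetical helper: the parameterization is an assumption, not the
    action's documented formula.
    """
    if length_scales is None:
        # Without ARD, every dimension shares the same length scale.
        length_scales = [1.0] * len(x)
    scaled_sq_dist = sum(
        ((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales)
    )
    return variance * math.exp(-0.5 * scaled_sq_dist)
```

A very large length scale in one dimension makes differences along that dimension nearly irrelevant to the kernel value, which is the behavior that gives the technique its name.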

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

fixInducingPoints=TRUE | FALSE

when set to True, fixes inducing points in the optimization.

Default FALSE

fixKernelParmFirstIter=TRUE | FALSE

when set to True, fixes kernel parameters in the first iteration.

Default FALSE

* inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias input

jitterMaxIters=integer

specifies the maximum number of iterations for jitter Cholesky decomposition.

Default 10
Minimum value 0
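
The following sketch shows one common way that a jittered Cholesky decomposition can work: when the factorization fails because the matrix is not numerically positive definite, a small value is added to the diagonal and the factorization is retried, up to a maximum number of attempts. The starting jitter, the growth factor, and the function names are assumptions for illustration; the action's internal procedure is not described in this reference.

```python
import math


def cholesky(a):
    """Return the lower-triangular Cholesky factor of a, or raise ValueError."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l


def jitter_cholesky(a, jitter_max_iters=10, initial_jitter=1e-10, growth=10.0):
    """Factor a, adding growing jitter to the diagonal after each failure.

    initial_jitter and growth are hypothetical values chosen for this sketch.
    Returns the factor and the jitter that was finally applied.
    """
    try:
        return cholesky(a), 0.0
    except ValueError:
        pass
    n = len(a)
    jitter = initial_jitter
    for _ in range(jitter_max_iters):
        shifted = [
            [a[i][j] + (jitter if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        try:
            return cholesky(shifted), jitter
        except ValueError:
            jitter *= growth
    raise ValueError(
        "factorization failed after %d jitter iterations" % jitter_max_iters
    )
```

In this sketch, a value of 0 for jitter_max_iters means that no jittered retries are attempted.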

kernel="LINEAR" | "MATERN32" | "MATERN52" | "PERIODIC" | "RBF"

specifies the kernel function type for Gaussian distributions in the Gaussian process regression model.

Default RBF
LINEAR

uses a linear kernel.

MATERN32

uses a Matern 3/2 kernel.

MATERN52

uses a Matern 5/2 kernel.

PERIODIC

uses a periodic kernel.

RBF

uses a radial-basis function kernel.
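
For orientation, the sketch below writes out the standard textbook forms of the five kernel types listed above. The parameter names (variance, length_scale, period) and the exact parameterizations are assumptions; this reference does not give the formulas that the action uses, so treat the code as a guide to the general shape of each kernel only.

```python
import math


def _dist(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def linear(x, y, variance=1.0):
    return variance * sum(a * b for a, b in zip(x, y))


def rbf(x, y, variance=1.0, length_scale=1.0):
    r = _dist(x, y) / length_scale
    return variance * math.exp(-0.5 * r * r)


def matern32(x, y, variance=1.0, length_scale=1.0):
    s = math.sqrt(3.0) * _dist(x, y) / length_scale
    return variance * (1.0 + s) * math.exp(-s)


def matern52(x, y, variance=1.0, length_scale=1.0):
    s = math.sqrt(5.0) * _dist(x, y) / length_scale
    return variance * (1.0 + s + s * s / 3.0) * math.exp(-s)


def periodic(x, y, variance=1.0, length_scale=1.0, period=1.0):
    r = _dist(x, y)
    return variance * math.exp(
        -2.0 * math.sin(math.pi * r / period) ** 2 / length_scale ** 2
    )


# Maps the documented kernel values to the illustrative functions above.
KERNELS = {
    "LINEAR": linear,
    "MATERN32": matern32,
    "MATERN52": matern52,
    "PERIODIC": periodic,
    "RBF": rbf,
}
```

The Matern kernels produce rougher functions than the RBF kernel, and the periodic kernel returns to its maximum whenever two points are a whole number of periods apart.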

nInducingPoints=integer

specifies the number of inducing points.

Default 100
Minimum value 2

nloOpts={casOptml}

specifies the optimization options.

Alias optimizer
Long form nloOpts={algorithm="ADAM" | "SGD"}
Shortcut form nloOpts="ADAM" | "SGD"

The casOptml value can be one or more of the following:

algorithm="ADAM" | "SGD"

specifies the optimization solver to use.

Alias alg
ADAM

uses the adaptive moments variant of the stochastic gradient descent solver.

SGD

uses the stochastic gradient descent (SGD) solver.

optmlOpt={optmlOptions}

specifies options common to all solvers.

The optmlOptions value can be one or more of the following:

clipWeightMaxNorm=double

specifies the maximum L2 norm of the weight vector. Weight vectors that have a greater L2 norm are scaled to this value.

Default 0
Minimum value 0
fConv=double

specifies a stopping criterion. The LBFGS solver stops when the objective fails to change more than this value for at least as many iterations as are specified in the fConvWindow parameter.

Default 1E-05
Minimum value 0
fConvWindow=64-bit-integer

specifies an iteration window for the LBFGS solver's application of the convergence criterion that is specified in the fConv parameter.

Default 1
Minimum value 1
gTol=double

specifies the stopping tolerance for the first-order optimality error.

Default 1E-05
Minimum value 0
maxEvals=64-bit-integer

specifies the maximum number of function evaluations for a single optimization or training.

Alias maxEval
Default 0
Minimum value 0
maxIters=64-bit-integer

specifies the maximum number of iterations for a single optimization or training.

Alias maxIter
Default 10
Minimum value 0
maxTime=double

specifies the maximum time (in seconds) for a single optimization or training.

Default 0
Minimum value 0
regL1=double

specifies the L1 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
regL2=double

specifies the L2 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
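
The interaction between fConv and fConvWindow can be pictured with the following sketch, which checks whether the objective has failed to change by more than fConv over the last fConvWindow iterations. The function name and the use of an absolute (rather than relative) change are assumptions; the reference text does not say which measure the solver applies.

```python
def objective_has_converged(objective_history, f_conv=1e-5, f_conv_window=1):
    """Return True if the last f_conv_window changes are all within f_conv.

    objective_history holds one objective value per iteration, oldest first.
    Absolute change is an assumption made for this sketch.
    """
    if len(objective_history) < f_conv_window + 1:
        return False  # not enough iterations to fill the window
    recent = objective_history[-(f_conv_window + 1):]
    return all(abs(b - a) <= f_conv for a, b in zip(recent, recent[1:]))
```

With the default window of 1, a single iteration with a small change is enough to stop; a larger window requires the objective to stay flat for several consecutive iterations.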
printOpt={optmlPrintOptions}

specifies options for sending information to the log and printing the iteration history table.

The optmlPrintOptions value can be one or more of the following:

logLevel=64-bit-integer

specifies the output display level.

Default 0
Minimum value 0
printFreq=64-bit-integer

specifies how frequently to print the iteration log.

Default 1
Minimum value 1
printLevel="PRINTBASIC" | "PRINTDETAIL" | "PRINTNONE"

specifies the level of detail in rows of the iteration history table.

Default PRINTBASIC
PRINTBASIC

prints only basic information to the iteration history table.

PRINTDETAIL

prints detailed information to the iteration history table.

PRINTNONE

disables printing of the iteration history table.

sgdOpt={sgdOptions}

specifies options for the stochastic gradient descent (SGD) solver.

The sgdOptions value can be one or more of the following:

adaptiveDecay=double

specifies the rate at which the second moment of the gradient is decayed during each SGD iteration.

Alias beta2
Default 0.95
Range [0, 1)
adaptiveRate=TRUE | FALSE

when set to True, uses the second moment of the gradient vector to scale the learning rate for SGD.

Default FALSE
annealingRate=double

specifies the annealing parameter.

Default 1E-06
Minimum value 0
commFreq=64-bit-integer

specifies the number of minibatches that each computational thread processes before weights are synchronized across all threads and nodes.

Minimum value 0
learningRate=double

specifies the learning rate parameter for SGD.

Default 0.001
Minimum value (exclusive) 0
miniBatchSize=64-bit-integer

specifies the size of the minibatches to use in SGD.

Default 1
Minimum value 1
momentum=double

specifies the momentum for SGD.

Alias beta1
Default 0
Range [0, 1)
seed=64-bit-integer

specifies the seed for random access of observations on each thread for the SGD algorithm.

useLocking=TRUE | FALSE

when set to True, uses locks to perform thread aggregation; when set to False, uses an atomic (nondeterministic) operation.

Default FALSE
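
To make the roles of learningRate, momentum, miniBatchSize, and seed more concrete, the sketch below shows a textbook minibatch split and a textbook SGD-with-momentum update. Both functions are hypothetical helpers written for this illustration; they are not the action's implementation, which also involves threading, annealing, and adaptive scaling that are not modeled here.

```python
import random


def minibatches(n_rows, mini_batch_size=1, seed=0):
    """Shuffle row indices with a seeded generator and cut them into batches."""
    rng = random.Random(seed)
    order = list(range(n_rows))
    rng.shuffle(order)
    return [
        order[i:i + mini_batch_size]
        for i in range(0, n_rows, mini_batch_size)
    ]


def sgd_momentum_step(weights, grads, velocity, learning_rate=0.001, momentum=0.0):
    """Apply one textbook momentum update and return (weights, velocity)."""
    new_velocity = [
        momentum * v - learning_rate * g for v, g in zip(velocity, grads)
    ]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

With momentum set to 0, the update reduces to plain gradient descent with step size learning_rate.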
validate={optmlValidate}

specifies options for validating models.

The optmlValidate value can be one or more of the following:

frequency=64-bit-integer

specifies how frequently (in epochs) validation occurs.

Default 0
Minimum value 0
goal=double

specifies a goal for the validation misclassification rate. When the misclassification rate drops below this goal, the optimization stops.

Default 0
stagnation=64-bit-integer

specifies a number of consecutive validations with increasing misclassification rates that are allowed before optimization terminates early.

Default 0
Minimum value 0
threshold=double

specifies the early stopping threshold for validation error. If the validation error is greater than this value at the iteration specified in the thresholdIter parameter, then the optimization stops.

Default 1
Minimum value 0
thresholdIter=64-bit-integer

specifies the iteration at which the early stopping threshold (specified in the threshold parameter) is checked.

Default 1
Minimum value 1
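
The validation options above describe three separate early-stopping rules. The sketch below combines them in one hypothetical function to show how they might interact. It assumes that validation runs at every iteration, that a stagnation value of 0 disables the stagnation rule, and that optimization stops once the number of consecutive increases exceeds the stagnation value; none of these details is confirmed by this reference.

```python
def validation_stop(errors, goal=0.0, stagnation=0, threshold=1.0, threshold_iter=1):
    """Return (iteration, reason) for the first rule that fires, else None.

    errors holds one validation error per iteration, starting at iteration 1.
    """
    consecutive_increases = 0
    for i, err in enumerate(errors, start=1):
        if err < goal:
            return i, "goal"
        if i == threshold_iter and err > threshold:
            return i, "threshold"
        if i > 1 and err > errors[i - 2]:
            consecutive_increases += 1
        else:
            consecutive_increases = 0
        # Assumption: stagnation=0 means the rule is switched off.
        if stagnation > 0 and consecutive_increases > stagnation:
            return i, "stagnation"
    return None
```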

outInducingPoints={casouttable}

specifies the output data table in which to save the estimated mean and standard deviation at inducing points.

For more information about specifying the outInducingPoints parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

output={gpRegOutputStatement}

specifies the output data table in which to save the scored observations.

The gpRegOutputStatement value can be one or more of the following:

* casOut={casouttable}

specifies the settings for an output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>}

specifies a list of one or more variables to be copied from the input table to the output table. You can alternatively specify the value ALL, ALL_MODEL, or ALL_NUMERIC, which respectively copies all variables, all variables used in the modeling, or all numeric variables from the input table to the output table.

role="string"

renames the generated column _ROLE_ in the output data table to the specified role name.

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

outVariationalCov={casouttable}

specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points.

For more information about specifying the outVariationalCov parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

partByFrac={partByFracStatement}

randomly assigns specified proportions of the observations in the input table to the validation and testing roles. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

The partByFracStatement value can be one or more of the following:

seed=integer

specifies the seed to use in the random number generator that is used for partitioning the data.

Default 0
test=double

randomly assigns the specified proportion of observations in the input table to the testing role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Range 0–1
validate=double

randomly assigns the specified proportion of observations in the input table to the validation role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Alias valid
Range 0–1
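
A fractional partition of this kind can be sketched as follows: each observation draws a uniform random number and is assigned to the testing, validation, or training role according to the requested fractions. The function name and the use of Python's random module are assumptions for illustration. In particular, this sketch treats seed=0 as an ordinary fixed seed, which might differ from how the action interprets the default.

```python
import random


def assign_roles(n_rows, test=0.0, validate=0.0, seed=0):
    """Randomly label each row as 'test', 'validate', or 'train'."""
    if not (0.0 <= test <= 1.0 and 0.0 <= validate <= 1.0):
        raise ValueError("fractions must be between 0 and 1")
    if test + validate >= 1.0:
        raise ValueError("test + validate must be less than 1")
    rng = random.Random(seed)
    roles = []
    for _ in range(n_rows):
        u = rng.random()
        if u < test:
            roles.append("test")
        elif u < test + validate:
            roles.append("validate")
        else:
            roles.append("train")
    return roles
```

Because the fractions must sum to less than 1, some share of the observations always remains for training.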

partByVar={partByVarStatement}

specifies the variable in the input data whose values are used to assign roles to each observation. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

Long form partByVar={name="variable-name"}
Shortcut form partByVar="variable-name"

The partByVarStatement value can be one or more of the following:

* name="variable-name"

names the variable in the input table whose values are used to assign roles to each observation.

test="string"

specifies the formatted value of the variable that is used to assign observations to the testing role.

train="string"

specifies the formatted value of the variable that is used to assign observations to the training role. If you do not specify the train parameter, then all observations whose roles are not determined by the test and validate parameters are assigned to training.

validate="string"

specifies the formatted value of the variable that is used to assign observations to the validation role.

Alias valid
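
The role assignment rules for partByVar can be summarized in a short sketch. The function below is hypothetical and compares the formatted value of the partition variable against the test, train, and validate strings. The treatment of values that match none of the three when train is specified (returning None, meaning the observation is left out) is an assumption; the reference does not describe that case.

```python
def role_for_value(formatted_value, test=None, train=None, validate=None):
    """Map a formatted partition-variable value to a role name."""
    if test is not None and formatted_value == test:
        return "test"
    if validate is not None and formatted_value == validate:
        return "validate"
    # When train is not specified, every remaining observation is training.
    if train is None or formatted_value == train:
        return "train"
    return None  # assumption: unmatched values are not assigned a role
```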

saveState={casouttable}

specifies the output data table in which to save the state of the Gaussian process regression for future scoring.

Long form saveState={name="table-name"}
Shortcut form saveState="table-name"

The casouttable value can be one or more of the following:

caslib="string"

specifies the name of the caslib for the output table.

label="string"

specifies the descriptive label to associate with the table.

lifetime=64-bit-integer

specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.

Default 0
Minimum value 0
memoryFormat="DVR" | "INHERIT" | "STANDARD"

specifies the memory format for the output table.

Default INHERIT
DVR

uses the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.

INHERIT

uses the default memory format that is set for the server. By default, the server uses the standard memory format. If an administrator sets the CAS_DEFAULT_MEMORY_FORMAT environment variable to DVR, then the DVR memory format is set as the default for the server.

STANDARD

uses the standard memory format.

name="table-name"

specifies the name for the output table.

promote=TRUE | FALSE

when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.

Default FALSE
replace=TRUE | FALSE

when set to True, overwrites an existing table that has the same name.

Default FALSE
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

specifies the table redistribution policy to apply when the number of worker pods increases on a running CAS server.

DEFER

defers the selection of the redistribution policy to a higher-level entity.

NOREDIST

does not redistribute table data when the number of worker pods changes on a running CAS server.

REBALANCE

rebalances table data when the number of worker pods changes on a running CAS server.

seed=double

specifies the seed value for random number generation in initializing parameters and clustering.

Default 0

* table={castable}

specifies the settings for an input table.

Long form table={name="table-name"}
Shortcut form table="table-name"

The castable value can be one or more of the following:

caslib="string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

computedOnDemand=TRUE | FALSE

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias compOnDemand
Default FALSE
computedVars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.

Alias compVars

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

computedVarsProgram="string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias compPgm
dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}

specifies data source options.

Aliases options
dataSource
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the input table.

singlePass=TRUE | FALSE

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default FALSE
vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the input data.

whereTable={groupbytable}

specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.

The groupbytable value can be one or more of the following:

casLib="string"

specifies the caslib for the filter table. By default, the active caslib is used.

dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

specifies data source options.

Aliases options
dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the filter table.

vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the data from the filter table.

* target="variable-name"

specifies the target variable to use for analysis.

useSimpleInit=TRUE | FALSE

when set to True, uses simple parameter initialization for the optimization.

Default TRUE
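
Several parameters above accept a shortcut form (for example, table="table-name" in place of table={name="table-name"}), and three parameters are marked as required. The sketch below expands the shortcut forms and checks for the required parameters in a plain Python dictionary. The helper name, the dictionary representation, and the example table and variable names are hypothetical; the code does not call any SAS client library.

```python
# Shortcut forms documented above: parameter -> subparameter that the
# bare string stands for.
SHORTCUT_KEYS = {
    "table": "name",
    "saveState": "name",
    "partByVar": "name",
    "nloOpts": "algorithm",
}

# Parameters marked with * in the parameter descriptions.
REQUIRED = ("table", "inputs", "target")


def normalize_params(params):
    """Expand shortcut forms to long forms and verify required parameters."""
    out = dict(params)
    for key, sub in SHORTCUT_KEYS.items():
        if isinstance(out.get(key), str):
            out[key] = {sub: out[key]}
    missing = [k for k in REQUIRED if k not in out]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return out


# Hypothetical table and variable names, for illustration only.
example = normalize_params({
    "table": "my_table",
    "inputs": [{"name": "x1"}, {"name": "x2"}],
    "target": "y",
    "nloOpts": "ADAM",
    "nInducingPoints": 100,
})
```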

gpReg Action

Learns a Gaussian process regression model.

Lua Syntax

results, info = s:nonParametricBayes_gpReg{
applyRowOrder=true | false,
attributes={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
display={
caseSensitive=true | false,
exclude=true | false,
excludeAll=true | false,
keyIsPath=true | false,
names={"string-1" <, "string-2", ...>},
pathType="LABEL" | "NAME",
traceNames=true | false
},
fixInducingPoints=true | false,
fixKernelParmFirstIter=true | false,
* inputs={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
jitterMaxIters=integer,
nInducingPoints=integer,
nloOpts={
optmlOpt={
fConv=double
fConvWindow=64-bit-integer
gTol=double
maxEvals=64-bit-integer
maxIters=64-bit-integer
maxTime=double
regL1=double
regL2=double
},
sgdOpt={
adaptiveRate=true | false
commFreq=64-bit-integer
learningRate=double
miniBatchSize=64-bit-integer
momentum=double
seed=64-bit-integer
useLocking=true | false
},
validate={
frequency=64-bit-integer
goal=double
stagnation=64-bit-integer
threshold=double
thresholdIter=64-bit-integer
}
},
outInducingPoints={
caslib="string",
compress=true | false,
indexVars={"variable-name-1" <, "variable-name-2", ...>},
label="string",
lifetime=64-bit-integer,
maxMemSize=64-bit-integer,
memoryFormat="DVR" | "INHERIT" | "STANDARD",
name="table-name",
promote=true | false,
replace=true | false,
replication=integer,
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",
threadBlockSize=64-bit-integer,
timeStamp="string",
where={"string-1" <, "string-2", ...>}
},
output={
* casOut={
caslib="string"
compress=true | false
indexVars={"variable-name-1" <, "variable-name-2", ...>}
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=true | false
replace=true | false
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where={"string-1" <, "string-2", ...>}
},
copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>},
role="string"
},
outputTables={
groupByVarsRaw=true | false,
includeAll=true | false,
names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},
repeated=true | false,
replace=true | false
},
outVariationalCov={
caslib="string",
compress=true | false,
indexVars={"variable-name-1" <, "variable-name-2", ...>},
label="string",
lifetime=64-bit-integer,
maxMemSize=64-bit-integer,
memoryFormat="DVR" | "INHERIT" | "STANDARD",
name="table-name",
promote=true | false,
replace=true | false,
replication=integer,
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",
threadBlockSize=64-bit-integer,
timeStamp="string",
where={"string-1" <, "string-2", ...>}
},
partByFrac={
seed=integer,
test=double,
validate=double
},
partByVar={
* name="variable-name",
test="string",
train="string",
validate="string"
},
saveState={
caslib="string",
label="string",
lifetime=64-bit-integer,
name="table-name",
promote=true | false,
replace=true | false,
},
seed=double,
* table={
caslib="string",
computedOnDemand=true | false,
computedVars={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},
* name="table-name",
singlePass=true | false,
vars={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
where="where-expression",
whereTable={
casLib="string"
dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}
* name="table-name"
vars={{
format="string",
formattedLength=integer,
label="string",
* name="variable-name",
nfd=integer,
nfl=integer
}, {...}}
where="where-expression"
}
},
* target="variable-name",
useSimpleInit=true | false
}
* indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

| Parameter | Subparameter | Description |
|---|---|---|
| * table | | specifies the settings for an input table. |

Parameters for Creating Output Tables

| Parameter | Subparameter | Description |
|---|---|---|
| outInducingPoints | | specifies the output data table in which to save the estimated mean and standard deviation at inducing points. |
| outVariationalCov | | specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points. |
| output | * casOut | specifies the output data table in which to save the scored observations. |
| outputTables | names | lists the names of results tables to save as CAS tables on the server. |
| saveState | | specifies the output data table in which to save the state of the Gaussian process regression for future scoring. |

Parameter Descriptions

applyRowOrder=true | false

specifies that the action uses a prespecified row ordering. This requires specifying the orderby and groupby parameters in a preliminary table.partition action call.

Alias reproducibleRowOrder
Default false

attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}

changes the attributes of variables used in this action. Currently, attributes specified in the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias attribute

autoRelevanceDetermination=true | false

when set to True, uses automatic relevance determination in the kernel function.

Alias ard
Default false

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

fixInducingPoints=true | false

when set to True, fixes inducing points in the optimization.

Default false

fixKernelParmFirstIter=true | false

when set to True, fixes kernel parameters in the first iteration.

Default false

* inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias input

jitterMaxIters=integer

specifies the maximum number of iterations for jitter Cholesky decomposition.

Default 10
Minimum value 0

kernel="LINEAR" | "MATERN32" | "MATERN52" | "PERIODIC" | "RBF"

specifies the kernel function type for Gaussian distributions in the Gaussian process regression model.

Default RBF
LINEAR

uses a linear kernel.

MATERN32

uses a Matern 3/2 kernel.

MATERN52

uses a Matern 5/2 kernel.

PERIODIC

uses a periodic kernel.

RBF

uses a radial-basis function kernel.

nInducingPoints=integer

specifies the number of inducing points.

Default 100
Minimum value 2

nloOpts={casOptml}

specifies the optimization options.

Alias optimizer
Long form nloOpts={algorithm="ADAM" | "SGD"}
Shortcut form nloOpts="ADAM" | "SGD"

The casOptml value can be one or more of the following:

algorithm="ADAM" | "SGD"

specifies the optimization solver to use.

Alias alg
ADAM

uses the adaptive moments variant of the stochastic gradient descent solver.

SGD

uses the stochastic gradient descent (SGD) solver.

optmlOpt={optmlOptions}

specifies options common to all solvers.

The optmlOptions value can be one or more of the following:

clipWeightMaxNorm=double

specifies the maximum L2 norm of the weight vector. Weight vectors that have a greater L2 norm are scaled to this value.

Default 0
Minimum value 0
fConv=double

specifies a stopping criterion. The LBFGS solver stops when the objective fails to change more than this value for at least as many iterations as are specified in the fConvWindow parameter.

Default 1E-05
Minimum value 0
fConvWindow=64-bit-integer

specifies an iteration window for the LBFGS solver's application of the convergence criterion that is specified in the fConv parameter.

Default 1
Minimum value 1
gTol=double

specifies the stopping tolerance for the first-order optimality error.

Default 1E-05
Minimum value 0
maxEvals=64-bit-integer

specifies the maximum number of function evaluations for a single optimization or training.

Alias maxEval
Default 0
Minimum value 0
maxIters=64-bit-integer

specifies the maximum number of iterations for a single optimization or training.

Alias maxIter
Default 10
Minimum value 0
maxTime=double

specifies the maximum time (in seconds) for a single optimization or training.

Default 0
Minimum value 0
regL1=double

specifies the L1 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
regL2=double

specifies the L2 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
printOpt={optmlPrintOptions}

specifies options for sending information to the log and printing the iteration history table.

The optmlPrintOptions value can be one or more of the following:

logLevel=64-bit-integer

specifies the output display level.

Default 0
Minimum value 0
printFreq=64-bit-integer

specifies how frequently to print the iteration log.

Default 1
Minimum value 1
printLevel="PRINTBASIC" | "PRINTDETAIL" | "PRINTNONE"

specifies the level of detail in rows of the iteration history table.

Default PRINTBASIC
PRINTBASIC

prints only basic information to the iteration history table.

PRINTDETAIL

prints detailed information to the iteration history table.

PRINTNONE

disables printing of the iteration history table.

sgdOpt={sgdOptions}

specifies options for the stochastic gradient descent (SGD) solver.

The sgdOptions value can be one or more of the following:

adaptiveDecay=double

specifies the rate at which the second moment of the gradient is decayed during each SGD iteration.

Alias beta2
Default 0.95
Range [0–1)
adaptiveRate=true | false

when set to True, uses the second moment of the gradient vector to scale the learning rate for SGD.

Default false
annealingRate=double

specifies the annealing parameter.

Default 1E-06
Minimum value 0
commFreq=64-bit-integer

specifies the number of minibatches that each computational thread processes before weights are synchronized across all threads and nodes.

Minimum value 0
learningRate=double

specifies the learning rate parameter for SGD.

Default 0.001
Minimum value (exclusive) 0
miniBatchSize=64-bit-integer

specifies the size of the minibatches to use in SGD.

Default 1
Minimum value 1
momentum=double

specifies the momentum for SGD.

Alias beta1
Default 0
Range [0–1)
seed=64-bit-integer

specifies the seed for random access of observations on each thread for the SGD algorithm.

useLocking=true | false

when set to True, uses locks to perform thread aggregation; when set to False, uses an atomic (nondeterministic) operation.

Default false
validate={optmlValidate}

specifies options for validating models.

The optmlValidate value can be one or more of the following:

frequency=64-bit-integer

specifies how frequently (in epochs) validation occurs.

Default 0
Minimum value 0
goal=double

specifies a goal for the validation misclassification rate. When the misclassification rate drops below this goal, the optimization stops.

Default 0
stagnation=64-bit-integer

specifies a number of consecutive validations with increasing misclassification rates that are allowed before optimization terminates early.

Default 0
Minimum value 0
threshold=double

specifies the early stopping threshold for validation error. If the validation error is greater than this value at the iteration specified in the thresholdIter parameter, then the optimization stops.

Default 1
Minimum value 0
thresholdIter=64-bit-integer

specifies the iteration at which the early stopping threshold (specified in the threshold parameter) is checked.

Default 1
Minimum value 1

outInducingPoints={casouttable}

specifies the output data table in which to save the estimated mean and standard deviation at inducing points.

For more information about specifying the outInducingPoints parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

output={gpRegOutputStatement}

specifies the output data table in which to save the scored observations.

The gpRegOutputStatement value can be one or more of the following:

* casOut={casouttable}

specifies the settings for an output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>}

specifies a list of one or more variables to be copied from the input table to the output table. You can alternatively specify the value ALL, ALL_MODEL, or ALL_NUMERIC, which respectively copies all variables, all variables used in the modeling, or all numeric variables from the input table to the output table.

role="string"

renames the generated column _ROLE_ in the output data table to the specified role name.

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

outVariationalCov={casouttable}

specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points.

For more information about specifying the outVariationalCov parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

partByFrac={partByFracStatement}

randomly assigns specified proportions of the observations in the input table to training and validation roles. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

The partByFracStatement value can be one or more of the following:

seed=integer

specifies the seed to use in the random number generator that is used for partitioning the data.

Default 0
test=double

randomly assigns the specified proportion of observations in the input table to the testing role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Range 0–1
validate=double

randomly assigns the specified proportion of observations in the input table to the validation role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Alias valid
Range 0–1
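
The constraint on the two fractions and the idea of a seeded random split can be sketched in plain Python. This only illustrates the semantics described above: the function name assign_roles and the use of Python's random module are assumptions, and the server's own random number generator will not produce the same assignments.

```python
import random

def assign_roles(n_rows, test=0.0, validate=0.0, seed=0):
    """Illustrative only: assign each row to TRAIN, VALIDATE, or TEST."""
    for name, frac in (("test", test), ("validate", validate)):
        if not 0.0 <= frac <= 1.0:
            raise ValueError(f"{name} must be in the range 0-1")
    if test + validate >= 1.0:
        raise ValueError("test + validate must be less than 1")
    rng = random.Random(seed)  # stand-in for the server-side generator
    roles = []
    for _ in range(n_rows):
        u = rng.random()
        if u < test:
            roles.append("TEST")
        elif u < test + validate:
            roles.append("VALIDATE")
        else:
            roles.append("TRAIN")  # whatever remains is used for training
    return roles

roles = assign_roles(1000, test=0.2, validate=0.1, seed=42)
counts = {r: roles.count(r) for r in ("TRAIN", "VALIDATE", "TEST")}
```

Because the test and validate fractions must sum to less than 1, some share of the observations always remains for training.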

partByVar={partByVarStatement}

specifies the variable in the input data whose values are used to assign roles to each observation. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

Long form partByVar={name="variable-name"}
Shortcut form partByVar="variable-name"

The partByVarStatement value can be one or more of the following:

* name="variable-name"

names the variable in the input table whose values are used to assign roles to each observation.

test="string"

specifies the formatted value of the variable that is used to assign observations to the testing role.

train="string"

specifies the formatted value of the variable that is used to assign observations to the training role. If you do not specify the train parameter, then all observations whose roles are not determined by the test and validate parameters are assigned to training.

validate="string"

specifies the formatted value of the variable that is used to assign observations to the validation role.

Alias valid
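
The role assignment described for partByVar can be sketched as a small lookup. The function name role_for, the variable name _PartInd_, and the sample values are hypothetical. How the action treats a row that matches none of the specified values when train is given explicitly is not stated here, so the sketch returns None in that case.

```python
def role_for(value, test=None, train=None, validate=None):
    """Illustrative only: map a formatted partition value to a role."""
    if test is not None and value == test:
        return "TEST"
    if validate is not None and value == validate:
        return "VALIDATE"
    # Without an explicit train value, all remaining rows are used for training.
    if train is None or value == train:
        return "TRAIN"
    return None  # behavior for unmatched rows is an assumption of this sketch

# Hypothetical settings, shaped like the partByVar parameter
part_by_var = {"name": "_PartInd_", "test": "2", "validate": "1"}
opts = {k: v for k, v in part_by_var.items() if k != "name"}
roles = [role_for(v, **opts) for v in ["0", "1", "2", "1", "0"]]
```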

saveState={casouttable}

specifies the output data table in which to save the state of the Gaussian process regression for future scoring.

Long form saveState={name="table-name"}
Shortcut form saveState="table-name"

The casouttable value can be one or more of the following:

caslib="string"

specifies the name of the caslib for the output table.

label="string"

specifies the descriptive label to associate with the table.

lifetime=64-bit-integer

specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.

Default 0
Minimum value 0
memoryFormat="DVR" | "INHERIT" | "STANDARD"

specifies the memory format for the output table.

Default INHERIT
DVR

uses the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.

INHERIT

uses the default memory format that is set for the server. By default, the server uses the standard memory format. If an administrator sets the CAS_DEFAULT_MEMORY_FORMAT environment variable to DVR, then the DVR memory format is set as the default for the server.

STANDARD

uses the standard memory format.

name="table-name"

specifies the name for the output table.

promote=true | false

when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.

Default false
replace=true | false

when set to True, overwrites an existing table that has the same name.

Default false
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

specifies the table redistribution policy to apply when the number of worker pods increases on a running CAS server.

DEFER

defers the selection of the redistribution policy to a higher-level entity.

NOREDIST

does not redistribute table data when the number of worker pods changes on a running CAS server.

REBALANCE

rebalances table data when the number of worker pods changes on a running CAS server.

seed=double

specifies the seed value for random number generation in initializing parameters and clustering.

Default 0

* table={castable}

specifies the settings for an input table.

Long form table={name="table-name"}
Shortcut form table="table-name"

The castable value can be one or more of the following:

caslib="string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

computedOnDemand=true | false

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias compOnDemand
Default false
computedVars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.

Alias compVars

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

computedVarsProgram="string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias compPgm
dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}

specifies data source options.

Aliases options, dataSource
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the input table.

singlePass=true | false

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default false
vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the input data.

whereTable={groupbytable}

specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.

The groupbytable value can be one or more of the following:

casLib="string"

specifies the caslib for the filter table. By default, the active caslib is used.

dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

specifies data source options.

Aliases options, dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the filter table.

vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the data from the filter table.
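
The matching rule described for whereTable behaves like a semi-join: an input row is kept when its values on the matching variables equal those of some row in the filter table. The following plain-Python sketch illustrates that idea. The function name, the sample rows, and the use of exact equality on raw values are assumptions, not a description of the server's implementation.

```python
def where_table_filter(input_rows, filter_table_rows, var_names=None):
    """Illustrative only: keep input rows that match a row of the filter table."""
    if not input_rows or not filter_table_rows:
        return []
    if var_names is None:
        # Default described above: variable names common to both tables
        var_names = sorted(set(input_rows[0]) & set(filter_table_rows[0]))
    keys = {tuple(row[v] for v in var_names) for row in filter_table_rows}
    return [row for row in input_rows
            if tuple(row[v] for v in var_names) in keys]

input_rows = [
    {"region": "East", "year": 2020, "y": 1.5},
    {"region": "West", "year": 2020, "y": 2.0},
    {"region": "East", "year": 2021, "y": 1.7},
]
filter_table_rows = [
    {"region": "East", "year": 2020},
    {"region": "West", "year": 2021},
]
kept = where_table_filter(input_rows, filter_table_rows)
kept_by_region = where_table_filter(input_rows, filter_table_rows, ["region"])
```

Restricting the match to region alone keeps all three input rows, because both regions appear in the filter table.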

* target="variable-name"

specifies the target variable to use for analysis.

useSimpleInit=true | false

when set to True, uses simple parameter initialization for the optimization.

Default true

gpReg Action

Learns a Gaussian process regression model.

Python Syntax

results=s.nonParametricBayes.gpReg(
applyRowOrder=True | False,
attributes=[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
display={
"caseSensitive":True | False,
"exclude":True | False,
"excludeAll":True | False,
"keyIsPath":True | False,
"names":["string-1" <, "string-2", ...>],
"pathType":"LABEL" | "NAME",
"traceNames":True | False
},
fixInducingPoints=True | False,
fixKernelParmFirstIter=True | False,
required parameter inputs=[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
jitterMaxIters=integer,
nInducingPoints=integer,
nloOpts={
"optmlOpt":{
"fConv":double
"fConvWindow":64-bit-integer
"gTol":double
"maxEvals":64-bit-integer
"maxIters":64-bit-integer
"maxTime":double
"regL1":double
"regL2":double
},
"printOpt":{
"logLevel":64-bit-integer
"printFreq":64-bit-integer
},
"sgdOpt":{
"adaptiveDecay":double
"adaptiveRate":True | False
"annealingRate":double
"commFreq":64-bit-integer
"learningRate":double
"miniBatchSize":64-bit-integer
"momentum":double
"seed":64-bit-integer
"useLocking":True | False
},
"validate":{
"frequency":64-bit-integer
"goal":double
"stagnation":64-bit-integer
"threshold":double
"thresholdIter":64-bit-integer
}
},
outInducingPoints={
"caslib":"string",
"compress":True | False,
"indexVars":["variable-name-1" <, "variable-name-2", ...>],
"label":"string",
"lifetime":64-bit-integer,
"maxMemSize":64-bit-integer,
"memoryFormat":"DVR" | "INHERIT" | "STANDARD",
"name":"table-name",
"promote":True | False,
"replace":True | False,
"replication":integer,
"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE",
"threadBlockSize":64-bit-integer,
"timeStamp":"string",
"where":["string-1" <, "string-2", ...>]
},
output={
required parameter "casOut":{
"caslib":"string"
"compress":True | False
"indexVars":["variable-name-1" <, "variable-name-2", ...>]
"label":"string"
"lifetime":64-bit-integer
"maxMemSize":64-bit-integer
"memoryFormat":"DVR" | "INHERIT" | "STANDARD"
"name":"table-name"
"promote":True | False
"replace":True | False
"replication":integer
"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE"
"threadBlockSize":64-bit-integer
"timeStamp":"string"
"where":["string-1" <, "string-2", ...>]
},
"copyVars":"ALL" | "ALL_MODEL" | "ALL_NUMERIC" | ["variable-name-1" <, "variable-name-2", ...>],
"role":"string"
},
outputTables={
"groupByVarsRaw":True | False,
"includeAll":True | False,
"names":["string-1" <, "string-2", ...>] | {"key-1":{casouttable-1} <, "key-2":{casouttable-2}, ...>},
"repeated":True | False,
"replace":True | False
},
outVariationalCov={
"caslib":"string",
"compress":True | False,
"indexVars":["variable-name-1" <, "variable-name-2", ...>],
"label":"string",
"lifetime":64-bit-integer,
"maxMemSize":64-bit-integer,
"memoryFormat":"DVR" | "INHERIT" | "STANDARD",
"name":"table-name",
"promote":True | False,
"replace":True | False,
"replication":integer,
"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE",
"threadBlockSize":64-bit-integer,
"timeStamp":"string",
"where":["string-1" <, "string-2", ...>]
},
partByFrac={
"seed":integer,
"test":double,
"validate":double
},
partByVar={
required parameter "name":"variable-name",
"test":"string",
"train":"string",
"validate":"string"
},
saveState={
"caslib":"string",
"label":"string",
"lifetime":64-bit-integer,
"name":"table-name",
"promote":True | False,
"replace":True | False,
},
seed=double,
required parameter table={
"caslib":"string",
"computedOnDemand":True | False,
"computedVars":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
"computedVarsProgram":"string",
"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},
"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},
required parameter "name":"table-name",
"singlePass":True | False,
"vars":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
"where":"where-expression",
"whereTable":{
"casLib":"string"
"dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}
"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}
required parameter "name":"table-name"
"vars":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>]
"where":"where-expression"
}
},
required parameter target="variable-name",
useSimpleInit=True | False
)
The prefix "required parameter" in the syntax, or an asterisk (*) in the parameter descriptions, indicates a required parameter.
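
A minimal call needs only the required parameters: table, inputs, target, and, because the example requests scored output, the casOut subparameter of output. The following sketch assembles them as a Python dictionary. The table and variable names are hypothetical, and the commented call assumes that s is an already-connected SWAT CAS session.

```python
# Hypothetical table and variable names; adjust them to your data.
params = {
    "table": {"name": "train_data"},
    "inputs": [{"name": "x1"}, {"name": "x2"}],
    "target": "y",
    "output": {
        "casOut": {"name": "gp_scored", "replace": True},
        "copyVars": ["y"],
    },
    # Optional settings; the values shown match the documented defaults.
    "kernel": "RBF",
    "nInducingPoints": 100,
}

# Simple client-side check that nothing required was forgotten.
REQUIRED = ("table", "inputs", "target", "output")
missing = [name for name in REQUIRED if name not in params]

# With a connected session `s` (assumed, not created here):
# s.loadactionset("nonParametricBayes")
# results = s.nonParametricBayes.gpReg(**params)
```

Building the parameters as a dictionary first makes it easy to inspect or reuse them before sending the action to the server.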

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

Parameter | Subparameter | Description
required parameter table | | specifies the settings for an input table.

Parameters for Creating Output Tables

Parameter | Subparameter | Description
outInducingPoints | | specifies the output data table in which to save the estimated mean and standard deviation at inducing points.
outVariationalCov | | specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points.
output | required parameter casOut | specifies the output data table in which to save the scored observations.
outputTables | names | lists the names of results tables to save as CAS tables on the server.
saveState | | specifies the output data table in which to save the state of the Gaussian process regression for future scoring.

Parameter Descriptions

applyRowOrder=True | False

specifies that the action uses a prespecified row ordering. This requires specifying the orderby and groupby parameters in a preliminary table.partition action call.

Alias reproducibleRowOrder
Default False
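
The two-step pattern described above, a preliminary table.partition call followed by gpReg with applyRowOrder set to True, might look like the following sketch. All table and variable names are hypothetical, the groupBy and orderBy spellings are assumed to follow the common table parameter, and s is an assumed, already-connected SWAT session. Check the table.partition reference before relying on these parameter names.

```python
# Hypothetical names throughout; nothing here is sent to a server.
partition_params = {
    "table": {
        "name": "train_data",
        "groupBy": [{"name": "block_id"}],
        "orderBy": [{"name": "row_id"}],
    },
    "casout": {"name": "train_ordered", "replace": True},
}

gpreg_params = {
    # Train on the table produced by the partition step.
    "table": {"name": partition_params["casout"]["name"]},
    "inputs": [{"name": "x1"}, {"name": "x2"}],
    "target": "y",
    "output": {"casOut": {"name": "gp_scored", "replace": True}},
    "applyRowOrder": True,
}

# With a connected session `s` (assumed):
# s.table.partition(**partition_params)
# results = s.nonParametricBayes.gpReg(**gpreg_params)
```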

attributes=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

changes the attributes of variables used in this action. Currently, attributes specified in the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias attribute

autoRelevanceDetermination=True | False

when set to True, uses automatic relevance determination in the kernel function.

Alias ard
Default False

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

fixInducingPoints=True | False

when set to True, fixes inducing points in the optimization.

Default False

fixKernelParmFirstIter=True | False

when set to True, fixes kernel parameters in the first iteration.

Default False

* inputs=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias input

jitterMaxIters=integer

specifies the maximum number of iterations for jitter Cholesky decomposition.

Default 10
Minimum value 0

kernel="LINEAR" | "MATERN32" | "MATERN52" | "PERIODIC" | "RBF"

specifies the kernel function type for Gaussian distributions in the Gaussian process regression model.

Default RBF
LINEAR

uses a linear kernel.

MATERN32

uses a Matern 3/2 kernel.

MATERN52

uses a Matern 5/2 kernel.

PERIODIC

uses a periodic kernel.

RBF

uses a radial-basis function kernel.
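
For orientation, the five kernel families can be written in their common textbook forms. This is a sketch under stated assumptions: the parameter names variance, length_scale, and period are illustrative, and the exact parameterization that the action uses is not given in this reference.

```python
import math

def rbf(r, variance=1.0, length_scale=1.0):
    """Radial-basis function (squared exponential) kernel at distance r."""
    return variance * math.exp(-0.5 * (r / length_scale) ** 2)

def matern32(r, variance=1.0, length_scale=1.0):
    """Matern 3/2 kernel at distance r."""
    a = math.sqrt(3.0) * abs(r) / length_scale
    return variance * (1.0 + a) * math.exp(-a)

def matern52(r, variance=1.0, length_scale=1.0):
    """Matern 5/2 kernel at distance r."""
    a = math.sqrt(5.0) * abs(r) / length_scale
    return variance * (1.0 + a + a * a / 3.0) * math.exp(-a)

def periodic(r, variance=1.0, length_scale=1.0, period=1.0):
    """Periodic kernel at distance r; repeats every `period`."""
    s = math.sin(math.pi * abs(r) / period)
    return variance * math.exp(-2.0 * (s / length_scale) ** 2)

def linear(x, z, variance=1.0):
    """Linear kernel on input vectors x and z (not a function of distance)."""
    return variance * sum(a * b for a, b in zip(x, z))

# Distance-based kernels keyed by the documented kernel values
KERNELS = {"RBF": rbf, "MATERN32": matern32,
           "MATERN52": matern52, "PERIODIC": periodic}
values_at_half = {name: k(0.5) for name, k in KERNELS.items()}
```

In these forms, each distance-based kernel equals the variance at distance zero and decreases as the inputs move apart, except the periodic kernel, which returns to the variance at every multiple of the period.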

nInducingPoints=integer

specifies the number of inducing points.

Default 100
Minimum value 2

nloOpts={casOptml}

specifies the optimization options.

Alias optimizer
Long form nloOpts={"algorithm":"ADAM" | "SGD"}
Shortcut form nloOpts="ADAM" | "SGD"
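
As a sketch of how the two forms relate, the shortcut string names only the solver, whereas the long-form dictionary can also carry the nested option groups. The specific option values below are illustrative, and the helper to_long_form is hypothetical client-side code, not part of any SAS package.

```python
# Shortcut form: only the solver is specified.
nlo_shortcut = "ADAM"

# Long form: the same solver plus nested option groups.
nlo_long = {
    "algorithm": "ADAM",
    "optmlOpt": {"maxIters": 50, "fConv": 1e-5},
    "sgdOpt": {"learningRate": 0.001, "miniBatchSize": 64, "seed": 12345},
    "printOpt": {"printFreq": 10},
}

def to_long_form(nlo_opts):
    """Expand the shortcut string into the equivalent long-form dictionary."""
    if isinstance(nlo_opts, str):
        return {"algorithm": nlo_opts}
    return dict(nlo_opts)

expanded = to_long_form(nlo_shortcut)
```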

The casOptml value can be one or more of the following:

"algorithm":"ADAM" | "SGD"

specifies the optimization solver to use.

Alias alg
ADAM

uses the adaptive moments variant of the stochastic gradient descent solver.

SGD

uses the stochastic gradient descent (SGD) solver.

"optmlOpt":{optmlOptions}

specifies options common to all solvers.

The optmlOptions value can be one or more of the following:

"clipWeightMaxNorm":double

specifies the maximum L2 norm of the weight vector. Weight vectors that have a greater L2 norm are scaled to this value.

Default 0
Minimum value 0
"fConv":double

specifies a stopping criterion. The LBFGS solver stops when the objective fails to change more than this value for at least as many iterations as are specified in the fConvWindow parameter.

Default 1E-05
Minimum value 0
"fConvWindow":64-bit-integer

specifies an iteration window for the LBFGS solver's application of the convergence criterion that is specified in the fConv parameter.

Default 1
Minimum value 1
"gTol":double

specifies the stopping tolerance for the first-order optimality error.

Default 1E-05
Minimum value 0
"maxEvals":64-bit-integer

specifies the maximum number of function evaluations for a single optimization or training.

Alias maxEval
Default 0
Minimum value 0
"maxIters":64-bit-integer

specifies the maximum number of iterations for a single optimization or training.

Alias maxIter
Default 10
Minimum value 0
"maxTime":double

specifies the maximum time (in seconds) for a single optimization or training.

Default 0
Minimum value 0
"regL1":double

specifies the L1 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
"regL2":double

specifies the L2 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
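
The interaction between fConv and fConvWindow can be illustrated with a small check over the objective history. This sketch assumes that the change is measured as the absolute difference between consecutive objective values; whether the solver uses an absolute or a relative change is not stated here, and the function name converged is hypothetical.

```python
def converged(objective_history, f_conv=1e-5, f_conv_window=1):
    """Illustrative only: True when each of the last f_conv_window changes
    in the objective is no larger than f_conv."""
    if len(objective_history) < f_conv_window + 1:
        return False  # not enough iterations to fill the window
    recent = objective_history[-(f_conv_window + 1):]
    changes = [abs(b - a) for a, b in zip(recent, recent[1:])]
    return all(c <= f_conv for c in changes)

history = [10.0, 4.0, 3.5, 3.499995, 3.499994]
stop_window_1 = converged(history, f_conv=1e-5, f_conv_window=1)
stop_window_3 = converged(history, f_conv=1e-5, f_conv_window=3)
```

A larger window makes the rule more conservative: with a window of 3, the early drop from 4.0 to 3.5 still falls inside the window, so the check does not yet report convergence.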
"printOpt":{optmlPrintOptions}

specifies options for sending information to the log and printing the iteration history table.

The optmlPrintOptions value can be one or more of the following:

"logLevel":64-bit-integer

specifies the output display level.

Default 0
Minimum value 0
"printFreq":64-bit-integer

specifies how frequently to print the iteration log.

Default 1
Minimum value 1
"printLevel":"PRINTBASIC" | "PRINTDETAIL" | "PRINTNONE"

specifies the level of detail in rows of the iteration history table.

Default PRINTBASIC
PRINTBASIC

prints only basic information to the iteration history table.

PRINTDETAIL

prints detailed information to the iteration history table.

PRINTNONE

disables printing of the iteration history table.

"sgdOpt":{sgdOptions}

specifies options for the stochastic gradient descent (SGD) solver.

The sgdOptions value can be one or more of the following:

"adaptiveDecay":double

specifies the rate at which the second moment of the gradient is decayed during each SGD iteration.

Alias beta2
Default 0.95
Range [0–1)
"adaptiveRate":True | False

when set to True, uses the second moment of the gradient vector to scale the learning rate for SGD.

Default False
"annealingRate":double

specifies the annealing parameter.

Default 1E-06
Minimum value 0
"commFreq":64-bit-integer

specifies the number of minibatches that each computational thread processes before weights are synchronized across all threads and nodes.

Minimum value 0
"learningRate":double

specifies the learning rate parameter for SGD.

Default 0.001
Minimum value (exclusive) 0
"miniBatchSize":64-bit-integer

specifies the size of the minibatches to use in SGD.

Default 1
Minimum value 1
"momentum":double

specifies the momentum for SGD.

Alias beta1
Default 0
Range [0–1)
"seed":64-bit-integer

specifies the seed for random access of observations on each thread for the SGD algorithm.

"useLocking":True | False

when set to True, uses locks to perform thread aggregation; when set to False, uses an atomic (nondeterministic) operation.

Default False
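
The aliases beta1 (momentum) and beta2 (adaptiveDecay) correspond to the two decay rates in the textbook Adam update. The following sketch shows one such update for a scalar weight. It is the standard published form, not necessarily what the server implements, and beta1=0.9 is a common textbook choice rather than the documented momentum default of 0.

```python
import math

def adam_step(w, grad, m, v, t, learning_rate=0.001,
              beta1=0.9, beta2=0.95, eps=1e-8):
    """One textbook Adam update for a scalar weight; returns (w, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad         # first moment (beta1, momentum)
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second moment (beta2, adaptiveDecay)
    m_hat = m / (1.0 - beta1 ** t)               # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    w = w - learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = (w - 3)^2 starting from w = 0.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, learning_rate=0.01)
```

Dividing by the square root of the second moment is what scales the step size per parameter, which is the behavior that the adaptiveRate description refers to.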
"validate":{optmlValidate}

specifies options for validating models.

The optmlValidate value can be one or more of the following:

"frequency":64-bit-integer

specifies how frequently (in epochs) validation occurs.

Default 0
Minimum value 0
"goal":double

specifies a goal for the validation misclassification rate. When the misclassification rate drops below this goal, the optimization stops.

Default 0
"stagnation":64-bit-integer

specifies a number of consecutive validations with increasing misclassification rates that are allowed before optimization terminates early.

Default 0
Minimum value 0
"threshold":double

specifies the early stopping threshold for validation error. If the validation error is greater than this value at the iteration specified in the thresholdIter parameter, then the optimization stops.

Default 1
Minimum value 0
"thresholdIter":64-bit-integer

specifies the iteration at which the early stopping threshold (specified in the threshold parameter) is checked.

Default 1
Minimum value 1
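
The interaction of goal, stagnation, threshold, and thresholdIter can be sketched as a small stopping rule. This is one reading of the descriptions above, not the solver's actual code. It assumes one validation pass per iteration, treats stagnation=0 as disabled, and stops once more than stagnation consecutive increases have been seen. The function name stop_iteration is hypothetical.

```python
def stop_iteration(val_errors, goal=0.0, stagnation=0, threshold=1.0, threshold_iter=1):
    """Return the 1-based validation pass at which optimization would stop, or None.

    val_errors holds the validation error recorded at each validation pass.
    """
    increases = 0
    for i, err in enumerate(val_errors, start=1):
        # goal: stop as soon as the error drops below the goal.
        if err < goal:
            return i
        # threshold: checked only at the pass named by thresholdIter.
        if i == threshold_iter and err > threshold:
            return i
        # stagnation: count consecutive increases over the previous pass.
        if i > 1 and err > val_errors[i - 2]:
            increases += 1
        else:
            increases = 0
        if stagnation > 0 and increases > stagnation:
            return i
    return None


errors = [0.9, 0.5, 0.6, 0.7, 0.8]
print(stop_iteration(errors, stagnation=2))   # stops after three consecutive increases
print(stop_iteration(errors, goal=0.55))      # stops when the error falls below the goal
```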

outInducingPoints={casouttable}

specifies the output data table in which to save the estimated mean and standard deviation at inducing points.

For more information about specifying the outInducingPoints parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

output={gpRegOutputStatement}

specifies the output data table in which to save the scored observations.

The gpRegOutputStatement value can be one or more of the following:

* "casOut":{casouttable}

specifies the settings for an output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

"copyVars":"ALL" | "ALL_MODEL" | "ALL_NUMERIC" | ["variable-name-1" <, "variable-name-2", ...>]

specifies a list of one or more variables to be copied from the input table to the output table. You can alternatively specify the value ALL, ALL_MODEL, or ALL_NUMERIC, which respectively copies all variables, all variables used in the modeling, or all numeric variables from the input table to the output table.

"role":"string"

renames the generated column _ROLE_ in the output data table to the specified role name.

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

outVariationalCov={casouttable}

specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points.

For more information about specifying the outVariationalCov parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

partByFrac={partByFracStatement}

randomly assigns specified proportions of the observations in the input table to training and validation roles. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

The partByFracStatement value can be one or more of the following:

"seed":integer

specifies the seed to use in the random number generator that is used for partitioning the data.

Default 0
"test":double

randomly assigns the specified proportion of observations in the input table to the testing role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Range 0–1
"validate":double

randomly assigns the specified proportion of observations in the input table to the validation role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Alias valid
Range 0–1
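
The constraint on the test and validate fractions, and the disjoint assignment of roles, can be illustrated with a short Python sketch. It uses Python's own random number generator, so it does not reproduce the server's partition for a given seed. The function name assign_roles is hypothetical.

```python
import random


def assign_roles(n_obs, test=0.0, validate=0.0, seed=0):
    """Assign each of n_obs observations to exactly one role."""
    if not (0 <= test <= 1 and 0 <= validate <= 1):
        raise ValueError("test and validate must each be in the range 0-1")
    if test + validate >= 1:
        raise ValueError("test + validate must be less than 1")
    rng = random.Random(seed)
    roles = []
    for _ in range(n_obs):
        u = rng.random()
        # One uniform draw per observation keeps the three subsets disjoint.
        if u < test:
            roles.append("test")
        elif u < test + validate:
            roles.append("validate")
        else:
            roles.append("train")
    return roles


# The corresponding action parameter, with illustrative values.
part_by_frac = {"seed": 1, "test": 0.2, "validate": 0.3}
roles = assign_roles(1000, test=part_by_frac["test"],
                     validate=part_by_frac["validate"], seed=part_by_frac["seed"])
print({r: roles.count(r) for r in ("train", "validate", "test")})
```

Whatever is left after the test and validate fractions are taken goes to training, which is why their sum must stay below 1.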

partByVar={partByVarStatement}

specifies the variable in the input data whose values are used to assign roles to each observation. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

Long form partByVar={"name":"variable-name"}
Shortcut form partByVar="variable-name"

The partByVarStatement value can be one or more of the following:

* "name":"variable-name"

names the variable in the input table whose values are used to assign roles to each observation.

"test":"string"

specifies the formatted value of the variable that is used to assign observations to the testing role.

"train":"string"

specifies the formatted value of the variable that is used to assign observations to the training role. If you do not specify the train parameter, then all observations whose roles are not determined by the test and validate parameters are assigned to training.

"validate":"string"

specifies the formatted value of the variable that is used to assign observations to the validation role.

Alias valid
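
The role assignment described for partByVar can be sketched as a lookup on the formatted value. This is an illustration of the documented rule, not the action's implementation. The variable name part_flag and the function role_for are hypothetical.

```python
def role_for(formatted_value, train=None, validate=None, test=None):
    """Map one formatted value of the partition variable to a role."""
    if test is not None and formatted_value == test:
        return "test"
    if validate is not None and formatted_value == validate:
        return "validate"
    # Without a train value, everything not claimed above is training data.
    if train is None or formatted_value == train:
        return "train"
    # The value matches none of the three; how the action treats such
    # observations is not stated in this section.
    return None


# The corresponding action parameter, with hypothetical names and values.
part_by_var = {"name": "part_flag", "train": "1", "validate": "0", "test": "2"}

print(role_for("2", validate="0", test="2"))
print(role_for("7", validate="0", test="2"))
```

The values are compared as strings because the parameters take the formatted value of the variable, not its raw numeric value.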

saveState={casouttable}

specifies the output data table in which to save the state of the Gaussian process regression for future scoring.

Long form saveState={"name":"table-name"}
Shortcut form saveState="table-name"

The casouttable value can be one or more of the following:

"caslib":"string"

specifies the name of the caslib for the output table.

"label":"string"

specifies the descriptive label to associate with the table.

"lifetime":64-bit-integer

specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.

Default 0
Minimum value 0
"memoryFormat":"DVR" | "INHERIT" | "STANDARD"

specifies the memory format for the output table.

Default INHERIT
DVR

use the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.

INHERIT

use the default memory format that is set for the server. By default, the server uses the standard memory format. If an administrator sets the CAS_DEFAULT_MEMORY_FORMAT environment variable to DVR, then the DVR memory format is set as the default for the server.

STANDARD

use the standard memory format.

"name":"table-name"

specifies the name for the output table.

"promote":True | False

when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.

Default False
"replace":True | False

when set to True, overwrites an existing table that has the same name.

Default False
"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE"

specifies the table redistribution policy to apply when the number of worker pods increases on a running CAS server.

DEFER

defers the choice of redistribution policy to a higher-level entity.

NOREDIST

does not redistribute table data when the number of worker pods changes on a running CAS server.

REBALANCE

rebalances table data when the number of worker pods changes on a running CAS server.

seed=double

specifies the seed value for random number generation in initializing parameters and clustering.

Default 0

* table={castable}

specifies the settings for an input table.

Long form table={"name":"table-name"}
Shortcut form table="table-name"

The castable value can be one or more of the following:

"caslib":"string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

"computedOnDemand":True | False

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias compOnDemand
Default False
"computedVars":[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.

Alias compVars

The casinvardesc value can be one or more of the following:

"format":"string"

specifies the format to apply to the variable.

"formattedLength":integer

specifies the length of the format field plus the length of the format precision.

"label":"string"

specifies the descriptive label for the variable.

* "name":"variable-name"

specifies the name for the variable.

"nfd":integer

specifies the length of the format precision.

"nfl":integer

specifies the length of the format field.

"computedVarsProgram":"string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias compPgm
"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>}

specifies data source options.

Aliases options, dataSource
"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias import_

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* "name":"table-name"

specifies the name of the input table.

"singlePass":True | False

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default False
"vars":[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

"format":"string"

specifies the format to apply to the variable.

"formattedLength":integer

specifies the length of the format field plus the length of the format precision.

"label":"string"

specifies the descriptive label for the variable.

* "name":"variable-name"

specifies the name for the variable.

"nfd":integer

specifies the length of the format precision.

"nfl":integer

specifies the length of the format field.

"where":"where-expression"

specifies an expression for subsetting the input data.

"whereTable":{groupbytable}

specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.

The groupbytable value can be one or more of the following:

"casLib":"string"

specifies the caslib for the filter table. By default, the active caslib is used.

"dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

specifies data source options.

Aliases options, dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias import_

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* "name":"table-name"

specifies the name of the filter table.

"vars":[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

"format":"string"

specifies the format to apply to the variable.

"formattedLength":integer

specifies the length of the format field plus the length of the format precision.

"label":"string"

specifies the descriptive label for the variable.

* "name":"variable-name"

specifies the name for the variable.

"nfd":integer

specifies the length of the format precision.

"nfl":integer

specifies the length of the format field.

"where":"where-expression"

specifies an expression for subsetting the data from the filter table.

* target="variable-name"

specifies the target variable to use for analysis.

useSimpleInit=True | False

when set to True, uses simple parameter initialization for the optimization.

Default True
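
Putting the parameters together, a request might be assembled as a nested dictionary before it is passed to the action. The sketch below assumes the SWAT Python client and an open session object named conn; the table, caslib, and variable names are hypothetical, and the values are illustrative rather than recommended.

```python
# Hypothetical table and variable names; replace them with your own.
gpreg_params = {
    "table": {"name": "my_table", "caslib": "casuser"},
    "inputs": [{"name": "x1"}, {"name": "x2"}],
    "target": "y",
    "nInducingPoints": 50,
    "kernel": "RBF",
    "seed": 12345,
    "nloOpts": {
        "algorithm": "ADAM",
        "optmlOpt": {"maxIters": 100},
        "sgdOpt": {"learningRate": 0.01, "miniBatchSize": 64},
    },
    "partByFrac": {"validate": 0.3, "seed": 1},
    "output": {"casOut": {"name": "gp_scored", "replace": True}, "copyVars": "ALL"},
    "saveState": {"name": "gp_store", "replace": True},
}

# Parameters marked with * in the descriptions above.
REQUIRED = ("table", "inputs", "target")


def missing_required(params):
    """List the required top-level parameters that are absent."""
    return [key for key in REQUIRED if key not in params]


def run_gpreg(conn, params):
    """Submit the request.

    Assumes conn is an open swat.CAS session on which
    conn.loadactionset("nonParametricBayes") has already been called.
    """
    return conn.nonParametricBayes.gpReg(**params)


print(missing_required(gpreg_params))
```

Only build and check steps run here; run_gpreg needs a live CAS server and is left uncalled.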

R Syntax

results <- cas.nonParametricBayes.gpReg(s,
applyRowOrder=TRUE | FALSE,
attributes=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
display=list(
caseSensitive=TRUE | FALSE,
exclude=TRUE | FALSE,
excludeAll=TRUE | FALSE,
keyIsPath=TRUE | FALSE,
names=list("string-1" <, "string-2", ...>),
pathType="LABEL" | "NAME",
traceNames=TRUE | FALSE
),
fixInducingPoints=TRUE | FALSE,
fixKernelParmFirstIter=TRUE | FALSE,
required parameter inputs=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
jitterMaxIters=integer,
nInducingPoints=integer,
nloOpts=list(
optmlOpt=list(
fConv=double
fConvWindow=64-bit-integer
gTol=double
maxEvals=64-bit-integer
maxIters=64-bit-integer
maxTime=double
regL1=double
regL2=double
),
printOpt=list( ),
sgdOpt=list(
adaptiveRate=TRUE | FALSE
commFreq=64-bit-integer
learningRate=double
miniBatchSize=64-bit-integer
momentum=double
seed=64-bit-integer
useLocking=TRUE | FALSE
),
validate=list(
frequency=64-bit-integer
goal=double
stagnation=64-bit-integer
threshold=double
thresholdIter=64-bit-integer
)
),
outInducingPoints=list(
caslib="string",
compress=TRUE | FALSE,
indexVars=list("variable-name-1" <, "variable-name-2", ...>),
label="string",
lifetime=64-bit-integer,
maxMemSize=64-bit-integer,
memoryFormat="DVR" | "INHERIT" | "STANDARD",
name="table-name",
promote=TRUE | FALSE,
replace=TRUE | FALSE,
replication=integer,
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",
threadBlockSize=64-bit-integer,
timeStamp="string",
where=list("string-1" <, "string-2", ...>)
),
output=list(
required parameter casOut=list(
caslib="string"
compress=TRUE | FALSE
indexVars=list("variable-name-1" <, "variable-name-2", ...>)
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=TRUE | FALSE
replace=TRUE | FALSE
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where=list("string-1" <, "string-2", ...>)
),
copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | list("variable-name-1" <, "variable-name-2", ...>),
role="string"
),
outputTables=list(
groupByVarsRaw=TRUE | FALSE,
includeAll=TRUE | FALSE,
names=list("string-1" <, "string-2", ...>) | list(key-1=list(casouttable-1) <, key-2=list(casouttable-2), ...>),
repeated=TRUE | FALSE,
replace=TRUE | FALSE
),
outVariationalCov=list(
caslib="string",
compress=TRUE | FALSE,
indexVars=list("variable-name-1" <, "variable-name-2", ...>),
label="string",
lifetime=64-bit-integer,
maxMemSize=64-bit-integer,
memoryFormat="DVR" | "INHERIT" | "STANDARD",
name="table-name",
promote=TRUE | FALSE,
replace=TRUE | FALSE,
replication=integer,
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",
threadBlockSize=64-bit-integer,
timeStamp="string",
where=list("string-1" <, "string-2", ...>)
),
partByFrac=list(
seed=integer,
test=double,
validate=double
),
partByVar=list(
required parameter name="variable-name",
test="string",
train="string",
validate="string"
),
saveState=list(
caslib="string",
label="string",
lifetime=64-bit-integer,
name="table-name",
promote=TRUE | FALSE,
replace=TRUE | FALSE,
),
seed=double,
required parameter table=list(
caslib="string",
computedOnDemand=TRUE | FALSE,
computedVars=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),
importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters),
required parameter name="table-name",
singlePass=TRUE | FALSE,
vars=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
where="where-expression",
whereTable=list(
casLib="string"
dataSourceOptions=list(adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters)
importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters)
required parameter name="table-name"
vars=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>)
where="where-expression"
)
),
required parameter target="variable-name",
useSimpleInit=TRUE | FALSE
)
The marker "required parameter" (shown as * in the parameter descriptions) indicates a required parameter.

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

Parameter    Subparameter    Description
* table                      specifies the settings for an input table.

Parameters for Creating Output Tables

Parameter            Subparameter    Description
outInducingPoints                    specifies the output data table in which to save the estimated mean and standard deviation at inducing points.
outVariationalCov                    specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points.
output               * casOut        specifies the output data table in which to save the scored observations.
outputTables         names           lists the names of results tables to save as CAS tables on the server.
saveState                            specifies the output data table in which to save the state of the Gaussian process regression for future scoring.

Parameter Descriptions

applyRowOrder=TRUE | FALSE

specifies that the action uses a prespecified row ordering. This requires using the orderby and groupby parameters in a preliminary table.partition action call.

Alias reproducibleRowOrder
Default FALSE

attributes=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

changes the attributes of variables used in this action. Currently, attributes specified in the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias attribute

autoRelevanceDetermination=TRUE | FALSE

when set to True, uses automatic relevance determination in the kernel function.

Alias ard
Default FALSE

display=list(displayTables)

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

fixInducingPoints=TRUE | FALSE

when set to True, fixes inducing points in the optimization.

Default FALSE

fixKernelParmFirstIter=TRUE | FALSE

when set to True, fixes kernel parameters in the first iteration.

Default FALSE

* inputs=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias input

jitterMaxIters=integer

specifies the maximum number of iterations for jitter Cholesky decomposition.

Default 10
Minimum value 0
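
A jittered Cholesky decomposition retries the factorization of a covariance matrix after adding a small value to its diagonal, up to a maximum number of attempts. The sketch below, shown in Python for illustration, demonstrates the general technique. The starting jitter of 1e-10 and the tenfold growth per retry are assumptions, because this section documents only the iteration limit.

```python
import math


def cholesky(a):
    """Plain Cholesky factorization; raises ValueError if a is not positive definite."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l


def jitter_cholesky(a, max_iters=10, jitter=1e-10):
    """Return (factor, jitter_used), retrying with a growing diagonal shift."""
    try:
        return cholesky(a), 0.0
    except ValueError:
        pass
    n = len(a)
    for _ in range(max_iters):
        # Add the current jitter to every diagonal entry and try again.
        shifted = [[a[i][j] + (jitter if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        try:
            return cholesky(shifted), jitter
        except ValueError:
            jitter *= 10.0  # assumed growth factor
    raise ValueError("no positive definite matrix found within max_iters")


singular = [[1.0, 1.0], [1.0, 1.0]]
factor, used = jitter_cholesky(singular)
print(used > 0.0)
```

With max_iters=0 the sketch makes no retries, so a matrix that fails the first factorization raises an error immediately.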

kernel="LINEAR" | "MATERN32" | "MATERN52" | "PERIODIC" | "RBF"

specifies the kernel function type for Gaussian distributions in the Gaussian process regression model.

Default RBF
LINEAR

uses a linear kernel.

MATERN32

uses a Matern 3/2 kernel.

MATERN52

uses a Matern 5/2 kernel.

PERIODIC

uses a periodic kernel.

RBF

uses a radial-basis function kernel.
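
For orientation, the kernel families named above can be written in their common textbook forms. The sketch below is in Python for illustration and is not taken from the action's source. The exact parameterization that the action uses (for example, how the variance and length scales enter) is not stated in this section, so the function signatures are assumptions. Passing one length scale per input dimension corresponds to the usual meaning of automatic relevance determination.

```python
import math


def scaled_distance(x, y, length_scales):
    """Euclidean distance after dividing each dimension by its length scale."""
    return math.sqrt(sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales)))


def rbf(x, y, length_scales, variance=1.0):
    r = scaled_distance(x, y, length_scales)
    return variance * math.exp(-0.5 * r * r)


def matern32(x, y, length_scales, variance=1.0):
    c = math.sqrt(3.0) * scaled_distance(x, y, length_scales)
    return variance * (1.0 + c) * math.exp(-c)


def matern52(x, y, length_scales, variance=1.0):
    c = math.sqrt(5.0) * scaled_distance(x, y, length_scales)
    return variance * (1.0 + c + c * c / 3.0) * math.exp(-c)


def periodic(x, y, length_scale, period, variance=1.0):
    """One-dimensional periodic kernel."""
    d = abs(x - y)
    return variance * math.exp(-2.0 * math.sin(math.pi * d / period) ** 2 / length_scale ** 2)


def linear(x, y, variance=1.0):
    return variance * sum(a * b for a, b in zip(x, y))


# Two inputs, each with its own length scale (ARD-style).
print(rbf([0.0, 1.0], [0.5, 1.0], length_scales=[1.0, 2.0]))
```

All of the stationary kernels here equal the variance when the two points coincide and decay as the scaled distance grows; the Matern variants decay less smoothly than RBF.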

nInducingPoints=integer

specifies the number of inducing points.

Default 100
Minimum value 2

nloOpts=list(casOptml)

specifies the optimization options.

Alias optimizer
Long form nloOpts=list(algorithm="ADAM" | "SGD")
Shortcut form nloOpts="ADAM" | "SGD"

The casOptml value can be one or more of the following:

algorithm="ADAM" | "SGD"

specifies the optimization solver to use.

Alias alg
ADAM

uses the adaptive moments variant of the stochastic gradient descent solver.

SGD

uses the stochastic gradient descent (SGD) solver.

optmlOpt=list(optmlOptions)

specifies options common to all solvers.

The optmlOptions value can be one or more of the following:

clipWeightMaxNorm=double

specifies the maximum L2 norm of the weight vector. Weight vectors that have a greater L2 norm are scaled to this value.

Default 0
Minimum value 0
fConv=double

specifies a stopping criterion. The LBFGS solver stops when the objective fails to change more than this value for at least as many iterations as are specified in the fConvWindow parameter.

Default 1E-05
Minimum value 0
fConvWindow=64-bit-integer

specifies an iteration window for the LBFGS solver's application of the convergence criterion that is specified in the fConv parameter.

Default 1
Minimum value 1
gTol=double

specifies the stopping tolerance for the first-order optimality error.

Default 1E-05
Minimum value 0
maxEvals=64-bit-integer

specifies the maximum number of function evaluations for a single optimization or training.

Alias maxEval
Default 0
Minimum value 0
maxIters=64-bit-integer

specifies the maximum iterations for a single optimization or training.

Alias maxIter
Default 10
Minimum value 0
maxTime=double

specifies the maximum time (in seconds) for a single optimization or training.

Default 0
Minimum value 0
regL1=double

specifies the L1 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
regL2=double

specifies the L2 regularization parameter; the value must be nonnegative.

Default 0
Minimum value 0
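
Two of the options above, clipWeightMaxNorm and the fConv and fConvWindow pair, describe simple numeric rules. The Python sketch below illustrates one reading of them. It assumes that a clipWeightMaxNorm of 0 disables clipping and that the change in the objective is measured as an absolute difference; neither point is confirmed by this section, and the function names are hypothetical.

```python
import math


def clip_weights(weights, max_norm):
    """Scale weights down to max_norm when their L2 norm exceeds it."""
    norm = math.sqrt(sum(w * w for w in weights))
    if max_norm > 0 and norm > max_norm:
        return [w * max_norm / norm for w in weights]
    return list(weights)


def converged(objectives, f_conv=1e-5, f_conv_window=1):
    """True when the last f_conv_window consecutive changes are all <= f_conv."""
    if len(objectives) < f_conv_window + 1:
        return False
    recent = objectives[-(f_conv_window + 1):]
    return all(abs(b - a) <= f_conv for a, b in zip(recent, recent[1:]))


print(clip_weights([3.0, 4.0], 1.0))
print(converged([10.0, 5.0, 5.0], f_conv_window=1))
print(converged([10.0, 5.0, 5.0], f_conv_window=2))
```

A larger window makes the stopping rule more conservative, because a single quiet iteration is no longer enough to end the optimization.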
printOpt=list(optmlPrintOptions)

specifies options for sending information to the log and printing the iteration history table.

The optmlPrintOptions value can be one or more of the following:

logLevel=64-bit-integer

specifies the output display level.

Default 0
Minimum value 0
printFreq=64-bit-integer

specifies how frequently to print the iteration log.

Default 1
Minimum value 1
printLevel="PRINTBASIC" | "PRINTDETAIL" | "PRINTNONE"

specifies the level of detail in rows of the iteration history table.

Default PRINTBASIC
PRINTBASIC

prints only basic information to the iteration history table.

PRINTDETAIL

prints detailed information to the iteration history table.

PRINTNONE

disables printing of the iteration history table.

sgdOpt=list(sgdOptions)

specifies options for the stochastic gradient descent (SGD) solver.

The sgdOptions value can be one or more of the following:

adaptiveDecay=double

specifies the rate at which the second moment of the gradient is decayed during each SGD iteration.

Alias beta2
Default 0.95
Range [0–1)
adaptiveRate=TRUE | FALSE

when set to True, uses the second moment of the gradient vector to scale the learning rate for SGD.

Default FALSE
annealingRate=double

specifies the annealing parameter.

Default 1E-06
Minimum value 0
commFreq=64-bit-integer

specifies the number of minibatches that each computational thread processes before weights are synchronized across all threads and nodes.

Minimum value 0
learningRate=double

specifies the learning rate parameter for SGD.

Default 0.001
Minimum value (exclusive) 0
miniBatchSize=64-bit-integer

specifies the size of the minibatches to use in SGD.

Default 1
Minimum value 1
momentum=double

specifies the momentum for SGD.

Alias beta1
Default 0
Range [0–1)
seed=64-bit-integer

specifies the seed for random access of observations on each thread for the SGD algorithm.

useLocking=TRUE | FALSE

when set to True, uses locks to perform thread aggregation; when set to False, uses an atomic (nondeterministic) operation.

Default FALSE
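
The roles of learningRate, momentum (beta1), and adaptiveDecay (beta2) can be seen in a textbook Adam-style update, sketched here in Python for illustration. This is the standard published form of the algorithm, not the CAS solver's code. Whether the solver applies bias correction is not stated in this section, the eps constant is an assumption, and the momentum value of 0.9 is chosen for the example rather than taken from the documented default of 0.

```python
import math


def adam_step(w, grad, m, v, t, learning_rate=0.001, momentum=0.9,
              adaptive_decay=0.95, eps=1e-8):
    """Apply one Adam-style update to a list of weights.

    m and v are the running first and second moments; t is the 1-based step count.
    """
    new_w, new_m, new_v = [], [], []
    for wi, gi, mi, vi in zip(w, grad, m, v):
        mi = momentum * mi + (1.0 - momentum) * gi                    # first moment
        vi = adaptive_decay * vi + (1.0 - adaptive_decay) * gi * gi   # second moment
        m_hat = mi / (1.0 - momentum ** t)                            # bias correction
        v_hat = vi / (1.0 - adaptive_decay ** t)
        # The second moment scales the effective learning rate per weight.
        new_w.append(wi - learning_rate * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_w, new_m, new_v


# Minimize f(w) = w^2, whose gradient is 2w, starting from w = 1.
w, m, v = [1.0], [0.0], [0.0]
for step in range(1, 201):
    grad = [2.0 * wi for wi in w]
    w, m, v = adam_step(w, grad, m, v, step, learning_rate=0.05)
print(abs(w[0]) < 1.0)
```

Because the step is divided by the square root of the second moment, the first update has a size close to the learning rate regardless of the gradient's magnitude.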
validate=list(optmlValidate)

specifies options for validating models.

The optmlValidate value can be one or more of the following:

frequency=64-bit-integer

specifies how frequently (in epochs) validation occurs.

Default 0
Minimum value 0
goal=double

specifies a goal for the validation misclassification rate. When the misclassification rate drops below this goal, the optimization stops.

Default 0
stagnation=64-bit-integer

specifies a number of consecutive validations with increasing misclassification rates that are allowed before optimization terminates early.

Default 0
Minimum value 0
threshold=double

specifies the early stopping threshold for validation error. If the validation error is greater than this value at the iteration specified in the thresholdIter parameter, then the optimization stops.

Default 1
Minimum value 0
thresholdIter=64-bit-integer

specifies the iteration at which the early stopping threshold (specified in the threshold parameter) is checked.

Default 1
Minimum value 1

outInducingPoints=list(casouttable)

specifies the output data table in which to save the estimated mean and standard deviation at inducing points.

For more information about specifying the outInducingPoints parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

output=list(gpRegOutputStatement)

specifies the output data table in which to save the scored observations.

The gpRegOutputStatement value can be one or more of the following:

* casOut=list(casouttable)

specifies the settings for an output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | list("variable-name-1" <, "variable-name-2", ...>)

specifies a list of one or more variables to be copied from the input table to the output table. You can alternatively specify the value ALL, ALL_MODEL, or ALL_NUMERIC, which respectively copies all variables, all variables used in the modeling, or all numeric variables from the input table to the output table.

role="string"

renames the generated column _ROLE_ in the output data table to the specified role name.

outputTables=list(outputTables)

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

outVariationalCov=list(casouttable)

specifies the output data table in which to save the estimated variational distribution's covariance matrix at inducing points.

For more information about specifying the outVariationalCov parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

partByFrac=list(partByFracStatement)

randomly assigns specified proportions of the observations in the input table to training and validation roles. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

The partByFracStatement value can be one or more of the following:

seed=integer

specifies the seed to use in the random number generator that is used for partitioning the data.

Default 0
test=double

randomly assigns the specified proportion of observations in the input table to the testing role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Range 0–1
validate=double

randomly assigns the specified proportion of observations in the input table to the validation role. The sum of the fractions that are specified in the test and validate parameters must be less than 1.

Alias valid
Range 0–1

partByVar=list(partByVarStatement)

specifies the variable in the input data whose values are used to assign roles to each observation. Observations are logically partitioned into disjoint subsets for model training, validation, and testing.

Long form partByVar=list(name="variable-name")
Shortcut form partByVar="variable-name"

The partByVarStatement value can be one or more of the following:

* name="variable-name"

names the variable in the input table whose values are used to assign roles to each observation.

test="string"

specifies the formatted value of the variable that is used to assign observations to the testing role.

train="string"

specifies the formatted value of the variable that is used to assign observations to the training role. If you do not specify the train parameter, then all observations whose roles are not determined by the test and validate parameters are assigned to training.

validate="string"

specifies the formatted value of the variable that is used to assign observations to the validation role.

Alias valid

saveState=list(casouttable)

specifies the output data table in which to save the state of the Gaussian process regression for future scoring.

Long form saveState=list(name="table-name")
Shortcut form saveState="table-name"

The casouttable value can be one or more of the following:

caslib="string"

specifies the name of the caslib for the output table.

label="string"

specifies the descriptive label to associate with the table.

lifetime=64-bit-integer

specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds.

Default 0
Minimum value 0
memoryFormat="DVR" | "INHERIT" | "STANDARD"

specifies the memory format for the output table.

Default INHERIT
DVR

use the duplicate value reduction memory format. This memory format can reduce the memory consumption and file size when the input data contains duplicate values.

INHERIT

use the default memory format that is set for the server. By default, the server uses the standard memory format. If an administrator sets the CAS_DEFAULT_MEMORY_FORMAT environment variable to DVR, then the DVR memory format is set as the default for the server.

STANDARD

use the standard memory format.

name="table-name"

specifies the name for the output table.

promote=TRUE | FALSE

when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope.

Default FALSE
replace=TRUE | FALSE

when set to True, overwrites an existing table that has the same name.

Default FALSE
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

specifies the table redistribution policy to apply when the number of worker pods increases on a running CAS server.

DEFER

defers the choice of redistribution policy to a higher-level entity.

NOREDIST

does not redistribute table data when the number of worker pods changes on a running CAS server.

REBALANCE

rebalances table data when the number of worker pods changes on a running CAS server.

seed=double

specifies the seed value for random number generation in initializing parameters and clustering.

Default 0

* table=list(castable)

specifies the settings for an input table.

Long form table=list(name="table-name")
Shortcut form table="table-name"

The castable value can be one or more of the following:

caslib="string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

computedOnDemand=TRUE | FALSE

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias compOnDemand
Default FALSE

computedVars=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.

Alias compVars

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

computedVarsProgram="string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias compPgm
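
The relationship between computedVars and computedVarsProgram can be sketched as follows. This is an illustration in Python that uses a plain dict; the table name train, the columns x, log_x, and x_sq, and the assignment-statement syntax of the program string are assumptions made for the example. The check at the end is a rough client-side consistency test and is not something the action itself performs.

```python
# Sketch of a castable-style dict; table and column names are hypothetical.
table = {
    "name": "train",
    # Names of the computed variables to create.
    "computedVars": [{"name": "log_x"}, {"name": "x_sq"}],
    # One expression per computed variable (assumed statement syntax).
    "computedVarsProgram": "log_x = log(x); x_sq = x * x;",
}

declared = [v["name"] for v in table["computedVars"]]

# Collect the left-hand side of each assignment in the program string.
assigned = [
    stmt.split("=")[0].strip()
    for stmt in table["computedVarsProgram"].split(";")
    if "=" in stmt
]

# Declared names that have no matching assignment in the program.
missing = [name for name in declared if name not in assigned]
print(missing)
```

If computedVars is omitted, the description above states that all variables from computedVarsProgram are included automatically.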
dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>)

specifies data source options.

Aliases options, dataSource
importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters)

specifies the settings for reading a table from a data source.

Alias import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the input table.

singlePass=TRUE | FALSE

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default FALSE

vars=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the input data.

whereTable=list(groupbytable)

specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.

The groupbytable value can be one or more of the following:

casLib="string"

specifies the caslib for the filter table. By default, the active caslib is used.

dataSourceOptions=list(adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters)

specifies data source options.

Aliases options, dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters)

specifies the settings for reading a table from a data source.

Alias import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the filter table.

vars=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the data from the filter table.
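
The interaction between whereTable and where that is described above can be approximated in plain Python. The following sketch is an illustration of the documented matching rule only, not the server's implementation: the function apply_where_table, the sample rows, and the table names train and east_only are hypothetical. Rows are first matched against the filter table on the key variables (by default, the variable names common to both tables), and the where expression is applied afterward.

```python
def apply_where_table(rows, filter_rows, key_vars=None):
    """Keep rows whose key values match at least one filter row."""
    if not rows or not filter_rows:
        return []
    if key_vars is None:
        # Documented default: variable names common to both tables.
        key_vars = sorted(set(rows[0]) & set(filter_rows[0]))
    keys = {tuple(f[v] for v in key_vars) for f in filter_rows}
    return [r for r in rows if tuple(r[v] for v in key_vars) in keys]


# Hypothetical input rows and filter rows.
rows = [
    {"region": "east", "x": 1.0, "y": 2.1},
    {"region": "west", "x": 2.0, "y": 3.9},
    {"region": "east", "x": -1.0, "y": 0.3},
]
filter_rows = [{"region": "east"}]

# Step 1: the filter table is applied first.
matched = apply_where_table(rows, filter_rows)
# Step 2: the where expression (here standing in for where="x > 0").
kept = [r for r in matched if r["x"] > 0]
print(kept)

# The equivalent parameter structure, as a plain dict.
table = {
    "name": "train",
    "where": "x > 0",
    "whereTable": {"name": "east_only", "vars": [{"name": "region"}]},
}
```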

* target="variable-name"

specifies the target variable to use for analysis.

useSimpleInit=TRUE | FALSE

when set to True, uses simple parameter initialization for the optimization.

Default TRUE
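
The parameters in this section can be assembled into one structure before the action is called. The following Python sketch assumes that the parameters are held in a plain dict; the helper check_required, the table and variable names, and the seed value are hypothetical. It verifies only that the parameters marked as required (table, inputs, and target) are present, and it does not contact a CAS server.

```python
# Parameters marked as required in the syntax for this action.
REQUIRED = ("table", "inputs", "target")


def check_required(params):
    """Return the required parameter names that are missing."""
    return [key for key in REQUIRED if key not in params]


# Hypothetical parameter values for illustration.
params = {
    "table": {"name": "train"},
    "inputs": [{"name": "x1"}, {"name": "x2"}],
    "target": "y",
    "seed": 12345,
    "useSimpleInit": True,
    "saveState": {"name": "gp_model_state", "replace": True},
}

missing = check_required(params)
print(missing)

# Assumption: with a live CAS session, a client such as the SWAT package
# would typically receive these values as keyword arguments, for example
# conn.nonParametricBayes.gpReg(**params). That call is not executed here.
```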
Last updated: November 23, 2025