Active Machine Learning Action Set

Provides an action set for performing active learning, which interactively queries the user in order to minimize the labeling effort.

iterate Action

Performs active learning iteratively and assesses the model performance.
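As a rough illustration of the loop that this action automates, the following Python sketch trains on the annotated rows, queries the most uncertain unlabeled rows, looks up their labels in a ground-truth dictionary, and assesses the model on a test set after each iteration. It is a toy under stated assumptions, not the action's implementation: the one-dimensional threshold model, the distance-from-0.5 uncertainty score, and every function and variable name are hypothetical.

```python
import math

def train(labeled_rows):
    """Toy model (an assumption): threshold halfway between the class means."""
    pos = [x for x, y in labeled_rows if y == 1]
    neg = [x for x, y in labeled_rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

def event_prob(threshold, x):
    """Predicted probability of the event level (y == 1)."""
    return 1.0 / (1.0 + math.exp(-(x - threshold)))

def accuracy(threshold, rows):
    """Share of test rows whose predicted class matches the label."""
    hits = sum((event_prob(threshold, x) >= 0.5) == (y == 1) for x, y in rows)
    return hits / len(rows)

def iterate(annotated, pool, ground_truth, test, n_iterations=20, top_k=50):
    """annotated: id -> (x, y); pool: id -> x; ground_truth: id -> y."""
    labeled = dict(annotated)
    pool = dict(pool)
    history = []
    for iteration in range(1, n_iterations + 1):
        if not pool:
            break
        model = train(list(labeled.values()))
        # Most uncertain first: predicted probability closest to 0.5.
        ranked = sorted(pool, key=lambda i: abs(event_prob(model, pool[i]) - 0.5))
        queried = ranked[:min(top_k, len(pool))]
        # The ground-truth table plays the role of the human annotator.
        for i in queried:
            labeled[i] = (pool.pop(i), ground_truth[i])
        model = train(list(labeled.values()))
        history.append({
            "iteration": iteration,
            "queried": queried,
            "n_labeled": len(labeled),
            "accuracy": accuracy(model, test),
        })
    return history

# Hypothetical data: the label is 1 when x >= 10.
truth = {i: int(i >= 10) for i in range(20)}
annotated = {0: (0.0, 0), 19: (19.0, 1)}
pool = {i: float(i) for i in range(1, 19)}
test = [(2.0, 0), (8.0, 0), (12.0, 1), (17.0, 1)]
history = iterate(annotated, pool, truth, test, n_iterations=3, top_k=2)
```

In this sketch, the `queried` entries loosely correspond to what outQueryHistory records and the per-iteration accuracy to what outIterationHistory records; the actual table layouts are not described on this page.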

CASL Syntax

activeLearn.iterate <result=results> <status=rc> /

annotatedTable={

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=TRUE | FALSE,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

event="string",

groundTruthTable={

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=TRUE | FALSE,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

id="variable-name",

inputs={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

modelProgram="string",

nBins=integer,

nIterations=integer,

nominals={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

outIterationHistory={

caslib="string",

compress=TRUE | FALSE,

indexVars={"variable-name-1" <, "variable-name-2", ...>},

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=TRUE | FALSE,

replace=TRUE | FALSE,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where={"string-1" <, "string-2", ...>}

},

outQueryHistory={

caslib="string",

compress=TRUE | FALSE,

indexVars={"variable-name-1" <, "variable-name-2", ...>},

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=TRUE | FALSE,

replace=TRUE | FALSE,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where={"string-1" <, "string-2", ...>}

},

selectQuery={method="RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters},

table={

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=TRUE | FALSE,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

target="variable-name",

testTable={

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=TRUE | FALSE,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

topK=64-bit-integer

;

* indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
* annotatedTable	—	specifies the table that contains the labels to start the model training.
* groundTruthTable	—	specifies the table that contains all the ground truth labels.
* table	—	specifies the table that contains all unlabeled data.
* testTable	—	specifies the table to use for model evaluation.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
* outIterationHistory	—	creates a data set that contains the model assessment across iterations.
* outQueryHistory	—	creates a data set that contains the query history across iterations.

Parameter Descriptions

* annotatedTable={castable}

specifies the table that contains the labels to start the model training.

For more information about specifying the annotatedTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* event="string"

specifies the prediction event level for model assessment. If the target has a format, then you must specify the formatted value.

* groundTruthTable={castable}

specifies the table that contains all the ground truth labels.

For more information about specifying the groundTruthTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* id="variable-name"

specifies the ID variable to use for merging the labels from the table specified in the annotatedTable parameter into the input data set. The values of this variable must be unique and nonmissing.

* inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

* modelProgram="string"

specifies user-defined modeling.

nBins=integer

specifies the number of bins to use for model assessment.

Default	20
Range	2–MACINT

nIterations=integer

specifies the number of iterations.

Default	20
Range	0–MACINT

nominals={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies nominal variables to use for analysis.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

* outIterationHistory={casouttable}

creates a data set that contains the model assessment across iterations.

For more information about specifying the outIterationHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* outQueryHistory={casouttable}

creates a data set that contains the query history across iterations.

For more information about specifying the outQueryHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* selectQuery={method="RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters}

specifies the query strategy to use: random sampling, relevance sampling, or uncertainty sampling. By default, the strategy is uncertainty sampling.

The value that you specify for the method parameter determines the other parameters that apply.
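To make the dependency between method and its subparameters concrete, here is a small Python sketch that builds one selectQuery dictionary per method and checks it against the method-specific parameter names listed on this page. The checker, the variable name "P_event", and the exact-case matching are assumptions made for illustration; aliases such as cutoff and predEventProb are deliberately not handled.

```python
# Subparameter names per method, as listed in the method-specific sections.
ALLOWED = {
    "RANDOM": {"minNObs", "seed"},
    "RELEVANCE": {"gamma", "minNObs"},
    "UNCERTAINTY": {"includeAllData", "metric", "probVar"},
}
# probVar is the only method-specific parameter marked as required.
REQUIRED = {"UNCERTAINTY": {"probVar"}}

def check_select_query(query):
    """Return the method name, or raise ValueError for a mismatched dictionary."""
    method = query.get("method", "UNCERTAINTY")  # documented default strategy
    if method not in ALLOWED:
        raise ValueError(f"unknown method: {method}")
    extras = set(query) - {"method"} - ALLOWED[method]
    if extras:
        raise ValueError(f"{sorted(extras)} do not apply to method {method}")
    missing = REQUIRED.get(method, set()) - set(query)
    if missing:
        raise ValueError(f"{sorted(missing)} required for method {method}")
    return method

random_sampling = {"method": "RANDOM", "minNObs": 1000, "seed": 0}
relevance_sampling = {"method": "RELEVANCE", "gamma": 20, "minNObs": 1000}
uncertainty_sampling = {
    "method": "UNCERTAINTY",
    "metric": "ENTROPY",
    "includeAllData": False,
    "probVar": "P_event",  # hypothetical probability variable name
}
```

Any of the three dictionaries could then be supplied as the selectQuery argument in the Python syntax.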

* table={castable}

specifies the table that contains all unlabeled data.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* target="variable-name"

specifies the binary target variable to use for analysis. Only binary targets are currently supported.

* testTable={castable}

specifies the table to use for model evaluation.

For more information about specifying the testTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

topK=64-bit-integer

specifies the top K observations to query. When you specify a value of K greater than the number of observations available for query, the number of observations available is used instead.

Default	50
Minimum value	1
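The capping behavior described for topK can be pictured with a short Python sketch. The function name and the score dictionary are hypothetical, and ranking by descending score is an assumption about how "top" is defined.

```python
def select_top_k(scores, top_k=50):
    """Return the ids of the top_k highest-scoring observations.

    scores maps an observation id to its query score. When top_k exceeds
    the number of available observations, all of them are returned.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    k = min(top_k, len(scores))
    return sorted(scores, key=scores.get, reverse=True)[:k]

scores = {"a": 0.1, "b": 0.9, "c": 0.5}
top_two = select_top_k(scores, top_k=2)
everything = select_top_k(scores, top_k=50)
```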

Parameters for method="RANDOM"

minNObs=64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1

seed=double

specifies the seed for the random number generator that is used for random sampling.

Default	0
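The role of a seed in random sampling can be sketched in Python as follows. This uses Python's own generator rather than the one in the action, so it only illustrates the general idea that a fixed seed makes the draw repeatable; how the action treats the default value of 0 is not stated here, and the function name is hypothetical.

```python
import random

def sample_queries(candidate_ids, top_k=50, seed=0):
    """Draw up to top_k distinct ids using a generator seeded with `seed`."""
    rng = random.Random(seed)
    candidates = sorted(candidate_ids)  # fixed order so the draw is repeatable
    k = min(top_k, len(candidates))
    return rng.sample(candidates, k)

first = sample_queries(range(100), top_k=5, seed=42)
second = sample_queries(range(100), top_k=5, seed=42)
```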

Parameters for method="RELEVANCE"

gamma=double

specifies the inverse of the variance of the Gaussian kernels that are used to calculate the relevance between labeled and unlabeled observations.

Default	20
Minimum value (exclusive)	0
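To show how gamma shapes a Gaussian kernel, the Python sketch below uses one common parameterization, k(x, y) = exp(-gamma * ||x - y||^2). Whether the action uses exactly this form (for example, with or without a factor of 1/2) and how it aggregates kernel values into a relevance score are not stated on this page, so treat the formula and the function name as assumptions.

```python
import math

def gaussian_kernel(x, y, gamma=20.0):
    """Similarity of two numeric vectors; 1.0 for identical points."""
    if gamma <= 0:
        raise ValueError("gamma must be greater than 0")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

labeled = (0.0, 0.0)
unlabeled = (0.1, 0.2)
narrow = gaussian_kernel(labeled, unlabeled, gamma=20.0)  # large gamma, fast decay
wide = gaussian_kernel(labeled, unlabeled, gamma=1.0)     # small gamma, slow decay
```

A larger gamma corresponds to a smaller variance, so only unlabeled observations very close to a labeled one receive a high similarity.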

minNObs=64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1

Parameters for method="UNCERTAINTY"

includeAllData=TRUE | FALSE

when set to True, indicates that observations with and without labels can be used to query. Otherwise, only unlabeled observations can be used to query.

Default	FALSE
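The effect of includeAllData on the pool of query candidates can be sketched in Python as below. The function and argument names are hypothetical, and representing observations by their ID values is an assumption.

```python
def query_candidates(all_ids, labeled_ids, include_all_data=False):
    """Return the ids that may be queried in the next iteration."""
    if include_all_data:
        return list(all_ids)          # labeled and unlabeled observations
    labeled = set(labeled_ids)
    return [i for i in all_ids if i not in labeled]  # unlabeled only

unlabeled_only = query_candidates([1, 2, 3, 4], labeled_ids=[2, 4])
whole_table = query_candidates([1, 2, 3, 4], labeled_ids=[2, 4], include_all_data=True)
```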

metric="ENTROPY" | "LEASTCONFIDENCE" | "RATIOOFCONFIDENCE"

specifies the metric to use for calculating uncertainty.

Default	ENTROPY
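For a binary target with predicted event probability p, the three metric names correspond to standard uncertainty measures from the active learning literature. The Python sketch below uses common textbook definitions; the exact formulas and scaling used by the action are not given on this page, so these are assumptions, as are the function names.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of a binary prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def least_confidence(p):
    """One minus the probability of the most likely class."""
    return 1.0 - max(p, 1.0 - p)

def ratio_of_confidence(p):
    """Probability of the less likely class over that of the more likely class."""
    return min(p, 1.0 - p) / max(p, 1.0 - p)

# All three measures peak at p = 0.5 and shrink as p approaches 0 or 1.
probs = [0.05, 0.5, 0.9]
most_uncertain = max(probs, key=entropy)
```

Because every measure is largest at p = 0.5, all three rank the same observation first in the binary case; they differ in how quickly the score falls off away from 0.5.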

* probVar="variable-name"

specifies the probability variable to use for calculating uncertainty.

Alias	predEventProb

iterate Action

Performs active learning iteratively and assesses the model performance.

Lua Syntax

results, info = s:activeLearn_iterate{

annotatedTable={

caslib="string",

computedOnDemand=true | false,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=true | false,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

event="string",

groundTruthTable={

caslib="string",

computedOnDemand=true | false,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=true | false,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

id="variable-name",

inputs={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

modelProgram="string",

nBins=integer,

nIterations=integer,

nominals={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

outIterationHistory={

caslib="string",

compress=true | false,

indexVars={"variable-name-1" <, "variable-name-2", ...>},

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=true | false,

replace=true | false,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where={"string-1" <, "string-2", ...>}

},

outQueryHistory={

caslib="string",

compress=true | false,

indexVars={"variable-name-1" <, "variable-name-2", ...>},

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=true | false,

replace=true | false,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where={"string-1" <, "string-2", ...>}

},

selectQuery={method="RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters},

table={

caslib="string",

computedOnDemand=true | false,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=true | false,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

target="variable-name",

testTable={

caslib="string",

computedOnDemand=true | false,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=true | false,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

},

topK=64-bit-integer

}

* indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
* annotatedTable	—	specifies the table that contains the labels to start the model training.
* groundTruthTable	—	specifies the table that contains all the ground truth labels.
* table	—	specifies the table that contains all unlabeled data.
* testTable	—	specifies the table to use for model evaluation.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
* outIterationHistory	—	creates a data set that contains the model assessment across iterations.
* outQueryHistory	—	creates a data set that contains the query history across iterations.

Parameter Descriptions

* annotatedTable={castable}

specifies the table that contains the labels to start the model training.

For more information about specifying the annotatedTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* event="string"

specifies the prediction event level for model assessment. If the target has a format, then you must specify the formatted value.

* groundTruthTable={castable}

specifies the table that contains all the ground truth labels.

For more information about specifying the groundTruthTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* id="variable-name"

specifies the ID variable to use for merging the labels from the table specified in the annotatedTable parameter into the input data set. The values of this variable must be unique and nonmissing.

* inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

* modelProgram="string"

specifies user-defined modeling.

nBins=integer

specifies the number of bins to use for model assessment.

Default	20
Range	2–MACINT

nIterations=integer

specifies the number of iterations.

Default	20
Range	0–MACINT

nominals={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies nominal variables to use for analysis.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

* outIterationHistory={casouttable}

creates a data set that contains the model assessment across iterations.

For more information about specifying the outIterationHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* outQueryHistory={casouttable}

creates a data set that contains the query history across iterations.

For more information about specifying the outQueryHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* selectQuery={method="RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters}

specifies the query strategy to use: random sampling, relevance sampling, or uncertainty sampling. By default, the strategy is uncertainty sampling.

The value that you specify for the method parameter determines the other parameters that apply.

* table={castable}

specifies the table that contains all unlabeled data.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* target="variable-name"

specifies the binary target variable to use for analysis. Only binary targets are currently supported.

* testTable={castable}

specifies the table to use for model evaluation.

For more information about specifying the testTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

topK=64-bit-integer

specifies the top K observations to query. When you specify a value of K greater than the number of observations available for query, the number of observations available is used instead.

Default	50
Minimum value	1

Parameters for method="RANDOM"

minNObs=64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1

seed=double

specifies the seed for the random number generator that is used for random sampling.

Default	0

Parameters for method="RELEVANCE"

gamma=double

specifies the inverse of the variance of the Gaussian kernels that are used to calculate the relevance between labeled and unlabeled observations.

Default	20
Minimum value (exclusive)	0

minNObs=64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1

Parameters for method="UNCERTAINTY"

includeAllData=true | false

when set to True, indicates that observations with and without labels can be used to query. Otherwise, only unlabeled observations can be used to query.

Default	false

metric="ENTROPY" | "LEASTCONFIDENCE" | "RATIOOFCONFIDENCE"

specifies the metric to use for calculating uncertainty.

Default	ENTROPY

* probVar="variable-name"

specifies the probability variable to use for calculating uncertainty.

Alias	predEventProb

iterate Action

Performs active learning iteratively and assesses the model performance.

Python Syntax

results=s.activeLearn.iterate(

annotatedTable={

"caslib":"string",

"computedOnDemand":True | False,

"computedVars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"computedVarsProgram":"string",

"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},

"groupBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"groupByMode":"NOSORT" | "REDISTRIBUTE",

"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},

"name":"table-name",

"orderBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"singlePass":True | False,

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"where":"where-expression",

"whereTable":{

"casLib":"string"

"dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

"name":"table-name"

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>]

"where":"where-expression"

}

},

event="string",

groundTruthTable={

"caslib":"string",

"computedOnDemand":True | False,

"computedVars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"computedVarsProgram":"string",

"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},

"groupBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"groupByMode":"NOSORT" | "REDISTRIBUTE",

"name":"table-name",

"orderBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"singlePass":True | False,

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"where":"where-expression",

"whereTable":{

"casLib":"string"

"name":"table-name"

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>]

"where":"where-expression"

}

},

id="variable-name",

inputs=[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

modelProgram="string",

nBins=integer,

nIterations=integer,

nominals=[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

outIterationHistory={

"caslib":"string",

"compress":True | False,

"indexVars":["variable-name-1" <, "variable-name-2", ...>],

"label":"string",

"lifetime":64-bit-integer,

"maxMemSize":64-bit-integer,

"memoryFormat":"DVR" | "INHERIT" | "STANDARD",

"name":"table-name",

"promote":True | False,

"replace":True | False,

"replication":integer,

"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE",

"threadBlockSize":64-bit-integer,

"timeStamp":"string",

"where":["string-1" <, "string-2", ...>]

},

outQueryHistory={

"caslib":"string",

"compress":True | False,

"indexVars":["variable-name-1" <, "variable-name-2", ...>],

"label":"string",

"lifetime":64-bit-integer,

"maxMemSize":64-bit-integer,

"memoryFormat":"DVR" | "INHERIT" | "STANDARD",

"name":"table-name",

"promote":True | False,

"replace":True | False,

"replication":integer,

"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE",

"threadBlockSize":64-bit-integer,

"timeStamp":"string",

"where":["string-1" <, "string-2", ...>]

},

selectQuery={"method":"RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters},

table={

"caslib":"string",

"computedOnDemand":True | False,

"computedVars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"computedVarsProgram":"string",

"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},

"groupBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"groupByMode":"NOSORT" | "REDISTRIBUTE",

"name":"table-name",

"orderBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"singlePass":True | False,

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"where":"where-expression",

"whereTable":{

"casLib":"string"

"name":"table-name"

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>]

"where":"where-expression"

}

},

target="variable-name",

testTable={

"caslib":"string",

"computedOnDemand":True | False,

"computedVars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"computedVarsProgram":"string",

"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},

"groupBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"groupByMode":"NOSORT" | "REDISTRIBUTE",

"name":"table-name",

"orderBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"singlePass":True | False,

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"where":"where-expression",

"whereTable":{

"casLib":"string"

"name":"table-name"

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>]

"where":"where-expression"

}

},

topK=64-bit-integer

)

* indicates a required parameter
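As a hedged illustration of how the template above might be filled in, the following Python sketch assembles the keyword arguments as a dictionary and checks that every parameter marked with * in the Parameter Descriptions is present. All table names, variable names, and the modelProgram placeholder are hypothetical; the format of the modeling program is not described on this page, so the placeholder must be replaced with a real program. The final call is shown only as a comment because it needs a live CAS session `s`.

```python
# Parameters marked with * in the Parameter Descriptions.
REQUIRED = (
    "annotatedTable", "event", "groundTruthTable", "id", "inputs",
    "modelProgram", "outIterationHistory", "outQueryHistory",
    "selectQuery", "table", "target", "testTable",
)

params = {
    "annotatedTable": {"name": "annotated"},        # hypothetical table names
    "groundTruthTable": {"name": "ground_truth"},
    "table": {"name": "unlabeled"},
    "testTable": {"name": "test"},
    "id": "row_id",                                 # hypothetical variable names
    "target": "y",
    "event": "1",
    "inputs": [{"name": "x1"}, {"name": "x2"}],
    "modelProgram": "<your modeling program>",      # placeholder, not valid code
    "selectQuery": {"method": "UNCERTAINTY", "metric": "ENTROPY", "probVar": "P_y1"},
    "outIterationHistory": {"name": "iter_history", "replace": True},
    "outQueryHistory": {"name": "query_history", "replace": True},
    "nIterations": 20,                              # documented default
    "topK": 50,                                     # documented default
}

def missing_required(candidate):
    """List the required parameter names that are absent from `candidate`."""
    return [name for name in REQUIRED if name not in candidate]

# With an open session `s` and the activeLearn action set loaded:
# results = s.activeLearn.iterate(**params)
```

Keeping the arguments in a dictionary makes it easy to validate them or to reuse them across runs that differ only in selectQuery.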

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
* annotatedTable	—	specifies the table that contains the labels to start the model training.
* groundTruthTable	—	specifies the table that contains all the ground truth labels.
* table	—	specifies the table that contains all unlabeled data.
* testTable	—	specifies the table to use for model evaluation.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
* outIterationHistory	—	creates a data set that contains the model assessment across iterations.
* outQueryHistory	—	creates a data set that contains the query history across iterations.

Parameter Descriptions

* annotatedTable={castable}

specifies the table that contains the labels to start the model training.

For more information about specifying the annotatedTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* event="string"

specifies the prediction event level for model assessment. If the target has a format, then you must specify the formatted value.

* groundTruthTable={castable}

specifies the table that contains all the ground truth labels.

For more information about specifying the groundTruthTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* id="variable-name"

specifies the ID variable to use for merging the labels from the table specified in the annotatedTable parameter into the input data set. The values of this variable must be unique and nonmissing.

* inputs=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

* modelProgram="string"

specifies user-defined modeling.

nBins=integer

specifies the number of bins to use for model assessment.

Default	20
Range	2–MACINT

nIterations=integer

specifies the number of iterations.

Default	20
Range	0–MACINT

nominals=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies nominal variables to use for analysis.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

* outIterationHistory={casouttable}

creates a data set that contains the model assessment across iterations.

For more information about specifying the outIterationHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* outQueryHistory={casouttable}

creates a data set that contains the query history across iterations.

For more information about specifying the outQueryHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* selectQuery={"method":"RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters}

specifies the query strategy to use. The strategy can be random sampling, relevance sampling, or uncertainty sampling. By default, the strategy is uncertainty sampling.

The value that you specify for the method parameter determines the other parameters that apply.

* table={castable}

specifies the table that contains all unlabeled data.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* target="variable-name"

specifies the binary target variable to use for analysis. Only binary targets are currently supported.

* testTable={castable}

specifies the table to use for model evaluation.

For more information about specifying the testTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

topK=64-bit-integer

specifies the top K observations to query. When you specify a value of K greater than the number of observations available for query, the number of observations available is used instead.

Default	50
Minimum value	1
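The parameters described above can be collected into a Python dictionary before the action is called. The following is a minimal sketch, not a definitive usage example: every table name, variable name, and the modelProgram placeholder is hypothetical, and the helper check_required is introduced here only for illustration. Only the parameter names and the required markers come from the descriptions above.

```python
# Parameters marked with * (required) in the parameter descriptions.
REQUIRED = (
    "annotatedTable", "event", "groundTruthTable", "id", "inputs",
    "modelProgram", "outIterationHistory", "outQueryHistory",
    "selectQuery", "table", "target", "testTable",
)


def check_required(params):
    """Raise ValueError if a required top-level parameter is absent."""
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return params


# All values below are hypothetical placeholders.
params = check_required({
    "annotatedTable": {"name": "annotated"},
    "groundTruthTable": {"name": "ground_truth"},
    "table": {"name": "unlabeled"},
    "testTable": {"name": "test"},
    "id": "row_id",
    "target": "label",
    "event": "1",
    "inputs": [{"name": "x1"}, {"name": "x2"}],
    "modelProgram": "<user-defined modeling code>",
    "outIterationHistory": {"name": "iter_history", "replace": True},
    "outQueryHistory": {"name": "query_history", "replace": True},
    "selectQuery": {"method": "UNCERTAINTY", "probVar": "p_event"},
    "nIterations": 20,  # documented default
    "topK": 50,         # documented default
})

# With an open CAS session object (here called conn, an assumption), the
# call would presumably take the form:
#     results = conn.activeLearn.iterate(**params)
```

Validating the dictionary locally catches an omitted required parameter before a request is sent to the server.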

Parameters for method="RANDOM"

"minNObs":64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1

"seed":double

specifies the seed for the random number generator that is used for random sampling.

Default	0
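As an illustration of seeded random sampling combined with the topK rule, the following sketch draws up to topK observation IDs from the candidates. It is an assumption-laden illustration, not the action's implementation: the function random_query and its arguments are hypothetical, and the sketch treats every seed (including 0) as a fixed seed, which might differ from how the action handles the default.

```python
import random


def random_query(candidate_ids, top_k=50, seed=0):
    """Pick up to top_k IDs at random; a fixed seed makes the draw repeatable."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    candidates = list(candidate_ids)
    # If top_k exceeds the number of candidates, use the number available.
    k = min(top_k, len(candidates))
    rng = random.Random(seed)
    return rng.sample(candidates, k)


first = random_query(range(100), top_k=5, seed=42)
second = random_query(range(100), top_k=5, seed=42)
everything = random_query(range(10), top_k=50, seed=42)
```

Here first and second contain the same five IDs because the seed is the same, and everything contains all ten candidates because topK exceeds the number available.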

Parameters for method="RELEVANCE"

"gamma":double

specifies the inverse of the variance for the Gaussian kernels that are used to calculate the relevance between labeled and unlabeled observations.

Default	20
Minimum value (exclusive)	0

"minNObs":64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1
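To make the role of gamma concrete, the following sketch computes a Gaussian kernel between two observations and averages it over the labeled observations. The kernel form exp(-gamma * squared distance) and the averaging step are assumptions made for illustration; the source does not state the exact formula that the action uses, and the function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, gamma=20.0):
    """Gaussian kernel with gamma acting as an inverse variance (assumed form)."""
    if gamma <= 0:
        raise ValueError("gamma must be greater than 0")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def relevance(candidate, labeled_points, gamma=20.0):
    """Mean kernel similarity to the labeled points (aggregation is an assumption)."""
    total = sum(gaussian_kernel(candidate, p, gamma) for p in labeled_points)
    return total / len(labeled_points)


labeled_points = [(0.0, 0.0), (1.0, 1.0)]
near = relevance((0.1, 0.0), labeled_points)  # close to a labeled point
far = relevance((0.5, 3.0), labeled_points)   # far from both labeled points
```

A larger gamma makes the kernel fall off faster with distance, so only unlabeled observations very close to a labeled one receive a high score.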

Parameters for method="UNCERTAINTY"

"includeAllData":True | False

when set to True, indicates that both labeled and unlabeled observations can be queried. Otherwise, only unlabeled observations can be queried.

Default	False

"metric":"ENTROPY" | "LEASTCONFIDENCE" | "RATIOOFCONFIDENCE"

specifies the metric to use for calculating uncertainty.

Default	ENTROPY

* "probVar":"variable-name"

specifies the probability variable to use for calculating uncertainty.

Alias	predEventProb
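The three metric names correspond to common uncertainty measures for a binary target. The following sketch uses their textbook definitions, computed from the predicted event probability (the role that probVar plays). These formulas are assumptions about what the names mean, not the action's documented implementation, and the function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy of a binary prediction with event probability p."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)


def least_confidence(p):
    """One minus the probability of the most likely class."""
    return 1.0 - max(p, 1.0 - p)


def ratio_of_confidence(p):
    """Probability of the less likely class divided by that of the more likely class."""
    return min(p, 1.0 - p) / max(p, 1.0 - p)


METRICS = {
    "ENTROPY": entropy,
    "LEASTCONFIDENCE": least_confidence,
    "RATIOOFCONFIDENCE": ratio_of_confidence,
}


def most_uncertain(prob_by_id, metric="ENTROPY", top_k=50):
    """Return up to top_k IDs, most uncertain first."""
    score = METRICS[metric]
    ranked = sorted(prob_by_id, key=lambda i: score(prob_by_id[i]), reverse=True)
    return ranked[:min(top_k, len(ranked))]


probabilities = {"a": 0.95, "b": 0.55, "c": 0.2}
queried = most_uncertain(probabilities, metric="ENTROPY", top_k=2)
```

With all three measures, a probability near 0.5 scores highest, so observation "b" is queried first in this example.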

iterate Action

Performs active learning iteratively and assesses the model performance.

R Syntax
Summary: Input and Output Tables
Parameter Descriptions

R Syntax

results <- cas.activeLearn.iterate(s,

annotatedTable=list(

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

computedVarsProgram="string",

dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),

groupBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

singlePass=TRUE | FALSE,

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression",

whereTable=list(

casLib="string",

name="table-name",

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression"

)

),

event="string",

groundTruthTable=list(

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

computedVarsProgram="string",

dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),

groupBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

singlePass=TRUE | FALSE,

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression",

whereTable=list(

casLib="string",

name="table-name",

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression"

)

),

id="variable-name",

inputs=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

modelProgram="string",

nBins=integer,

nIterations=integer,

nominals=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

outIterationHistory=list(

caslib="string",

compress=TRUE | FALSE,

indexVars=list("variable-name-1" <, "variable-name-2", ...>),

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=TRUE | FALSE,

replace=TRUE | FALSE,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where=list("string-1" <, "string-2", ...>)

),

outQueryHistory=list(

caslib="string",

compress=TRUE | FALSE,

indexVars=list("variable-name-1" <, "variable-name-2", ...>),

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=TRUE | FALSE,

replace=TRUE | FALSE,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where=list("string-1" <, "string-2", ...>)

),

selectQuery=list(method="RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters),

table=list(

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

computedVarsProgram="string",

dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),

groupBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

singlePass=TRUE | FALSE,

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression",

whereTable=list(

casLib="string",

name="table-name",

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression"

)

),

target="variable-name",

testTable=list(

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

computedVarsProgram="string",

dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),

groupBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

singlePass=TRUE | FALSE,

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression",

whereTable=list(

casLib="string",

name="table-name",

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression"

)

),

topK=64-bit-integer

)

* indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
* annotatedTable	—	specifies the table that contains the labels to start the model training.
* groundTruthTable	—	specifies the table that contains all the ground truth labels.
* table	—	specifies the table that contains all unlabeled data.
* testTable	—	specifies the table to use for model evaluation.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
* outIterationHistory	—	creates a data set that contains the model assessment across iterations.
* outQueryHistory	—	creates a data set that contains the query history across iterations.

Parameter Descriptions

* annotatedTable=list(castable)

specifies the table that contains the labels to start the model training.

For more information about specifying the annotatedTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* event="string"

specifies the prediction event level for model assessment. If the target has a format, then you must specify the formatted value.

* groundTruthTable=list(castable)

specifies the table that contains all the ground truth labels.

For more information about specifying the groundTruthTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* id="variable-name"

specifies the ID variable to use for merging the labels from the table specified in the annotatedTable parameter into the input data set. The values of this variable must be unique and nonmissing.

* inputs=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies variables to use for analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

* modelProgram="string"

specifies user-defined modeling.

nBins=integer

specifies the number of bins to use for model assessment.

Default	20
Range	2–MACINT

nIterations=integer

specifies the number of iterations.

Default	20
Range	0–MACINT

nominals=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies nominal variables to use for analysis.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

* outIterationHistory=list(casouttable)

creates a data set that contains the model assessment across iterations.

For more information about specifying the outIterationHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* outQueryHistory=list(casouttable)

creates a data set that contains the query history across iterations.

For more information about specifying the outQueryHistory parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

* selectQuery=list(method="RANDOM" | "RELEVANCE" | "UNCERTAINTY", method-specific-parameters)

specifies the query strategy to use. The strategy can be random sampling, relevance sampling, or uncertainty sampling. By default, the strategy is uncertainty sampling.

The value that you specify for the method parameter determines the other parameters that apply.

* table=list(castable)

specifies the table that contains all unlabeled data.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

* target="variable-name"

specifies the binary target variable to use for analysis. Only binary targets are currently supported.

* testTable=list(castable)

specifies the table to use for model evaluation.

For more information about specifying the testTable parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

topK=64-bit-integer

specifies the top K observations to query. When you specify a value of K greater than the number of observations available for query, the number of observations available is used instead.

Default	50
Minimum value	1
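The way the input tables, nIterations, and topK interact can be pictured as a loop. The following Python sketch is one plausible reading of the parameter descriptions, not the action's implementation: the toy one-dimensional model, the accuracy measure, the record layouts, and all function and variable names are hypothetical. It assumes that annotated observations also appear in the unlabeled table under the same ID.

```python
import math


def fit_midpoint(rows):
    """Toy model: threshold halfway between the class means (needs both classes)."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def predict_proba(threshold, x):
    """Logistic score for the event class, centered on the threshold."""
    return 1.0 / (1.0 + math.exp(-(x - threshold)))


def run_active_learning(pool, annotated, ground_truth, test,
                        n_iterations=20, top_k=50):
    labeled = dict(annotated)  # id -> label, grows as queries are answered
    iteration_history, query_history = [], []
    for iteration in range(1, n_iterations + 1):
        model = fit_midpoint([(pool[i], y) for i, y in labeled.items()])
        # Assess the current model on the test data.
        hits = sum((predict_proba(model, x) >= 0.5) == bool(y) for x, y in test)
        iteration_history.append({"iteration": iteration,
                                  "accuracy": hits / len(test),
                                  "n_labeled": len(labeled)})
        candidates = [i for i in pool if i not in labeled]
        if not candidates:
            break
        # Query the observations whose predicted probability is closest to 0.5.
        candidates.sort(key=lambda i: abs(predict_proba(model, pool[i]) - 0.5))
        for i in candidates[:min(top_k, len(candidates))]:
            labeled[i] = ground_truth[i]  # label looked up by ID
            query_history.append({"iteration": iteration, "id": i})
    return iteration_history, query_history


pool = {i: float(i) for i in range(10)}
ground_truth = {i: int(i >= 5) for i in range(10)}
annotated = {0: 0, 9: 1}
test = [(1.5, 0), (7.5, 1), (4.2, 0), (5.8, 1)]
iteration_history, query_history = run_active_learning(
    pool, annotated, ground_truth, test, n_iterations=3, top_k=2)
```

In this sketch, iteration_history and query_history play the roles of the tables created by outIterationHistory and outQueryHistory; the actual columns of those tables are not specified in the source.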

Parameters for method="RANDOM"

minNObs=64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1

seed=double

specifies the seed for the random number generator that is used for random sampling.

Default	0

Parameters for method="RELEVANCE"

gamma=double

specifies the inverse of the variance for the Gaussian kernels that are used to calculate the relevance between labeled and unlabeled observations.

Default	20
Minimum value (exclusive)	0

minNObs=64-bit-integer

specifies the maximum number of labeled observations.

Alias	cutoff
Default	1000
Minimum value	1

Parameters for method="UNCERTAINTY"

includeAllData=TRUE | FALSE

when set to TRUE, indicates that both labeled and unlabeled observations can be queried. Otherwise, only unlabeled observations can be queried.

Default	FALSE

metric="ENTROPY" | "LEASTCONFIDENCE" | "RATIOOFCONFIDENCE"

specifies the metric to use for calculating uncertainty.

Default	ENTROPY

* probVar="variable-name"

specifies the probability variable to use for calculating uncertainty.

Alias	predEventProb

Last updated: November 23, 2025