Nominal Variables Dimension Reduction Action Set

Provides actions for nominal variables dimension reduction

lpca Action

Reduces the dimensionality of nominal variables by using a logistic principal component analysis.

CASL Syntax
Summary: Input and Output Tables
Parameter Descriptions

CASL Syntax

nominalVarsDimReduction.lpca <result=results> <status=rc> /

dimensions=integer,

display={

caseSensitive=TRUE | FALSE,

exclude=TRUE | FALSE,

excludeAll=TRUE | FALSE,

keyIsPath=TRUE | FALSE,

names={"string-1" <, "string-2", ...>},

pathType="LABEL" | "NAME",

traceNames=TRUE | FALSE

epsilon=double,

freq="variable-name",

id={"variable-name-1" <, "variable-name-2", ...>},

inputs={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

m=double,

maxIter=integer,

nominals={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

output={

casOut={

caslib="string"

compress=TRUE | FALSE

indexVars={"variable-name-1" <, "variable-name-2", ...>}

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

onDemand=TRUE | FALSE

promote=TRUE | FALSE

replace=TRUE | FALSE

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where={"string-1" <, "string-2", ...>}

copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>}

outputTables={

groupByVarsRaw=TRUE | FALSE,

includeAll=TRUE | FALSE,

names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},

repeated=TRUE | FALSE,

replace=TRUE | FALSE

prefix="string",

saveState={

caslib="string",

compress=TRUE | FALSE,

indexVars={"variable-name-1" <, "variable-name-2", ...>},

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=TRUE | FALSE,

replace=TRUE | FALSE,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where={"string-1" <, "string-2", ...>}

table={

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupByMode="NOSORT" | "REDISTRIBUTE",

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},

name="table-name",

singlePass=TRUE | FALSE,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

;

indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
required parametertable	—	specifies the input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
output	required parametercasOut	specifies the output data table that contains the values of the reduced variables for the training nominal data.
outputTables	names	lists the names of results tables to save as CAS tables on the server.
saveState	—	specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

Parameter Descriptions

dimensions=integer

specifies the number of reduced variables.

Aliases	dim
	dimension
	dimens
Minimum value	1

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

epsilon=double

specifies the tolerance to use in determining the convergence of the iterative algorithm.

Alias	tolerance
Default	0.0001
Range	1E-08–0.1

freq="variable-name"

specifies the frequency variable.

Alias	frequency

id={"variable-name-1" <, "variable-name-2", ...>}

specifies the variables to use as record identifiers and to transfer to the output table that you specify in the output parameter.

inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variables to use in the analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

m=double

specifies a finite positive value to approximate the logit function's infinite limits: logit(1) is approximated by m, and logit(0) is approximated by -m.

Default	4
Range	(0–10]

maxIter=integer

specifies the maximum number of iterations for the iterative algorithm.

Aliases	maxIterations
Aliases	maxIters
Default	100
Range	1–100000000

nominals={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the nominal variables to use in the training.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

output={outputStatement}

specifies the output data table that contains the values of the reduced variables for the training nominal data.

For more information about specifying the output parameter, see the common outputStatement parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	DISPLAYOUT

prefix="string"

specifies a prefix to apply to the names of the reduced variables.

Aliases	reducedVarNamePrefix
Aliases	reducedVarsNamePrefix
Default	"rv"

saveState={casouttable}

specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

For more information about specifying the saveState parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	saveModel

* table={castable}

specifies the input table.

Long form	table={name="table-name"}
Shortcut form	table="table-name"

The castable value can be one or more of the following:

caslib="string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

computedOnDemand=TRUE | FALSE

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias	compOnDemand
Default	FALSE

computedVars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included.

Alias	compVars

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

computedVarsProgram="string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias	compPgm

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}

specifies data source options.

Aliases	options
Aliases	dataSource

groupByMode="NOSORT" | "REDISTRIBUTE"

specifies how to create groups.

Default	NOSORT

NOSORT

groups the data without sorting on each machine, and then groups the data again on the controller.

REDISTRIBUTE

transfers rows between nodes to guarantee ordering within groups. This method is slower.

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias	import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the input table.

singlePass=TRUE | FALSE

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default	FALSE

vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the input data.

whereTable={groupbytable}

specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first.

The groupbytable value can be one or more of the following:

casLib="string"

specifies the caslib for the filter table. By default, the active caslib is used.

dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

specifies data source options.

Aliases	options
Aliases	dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias	import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the filter table.

vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the data from the filter table.

lpca Action

Reduces the dimensionality of nominal variables by using a logistic principal component analysis.

Lua Syntax
Summary: Input and Output Tables
Parameter Descriptions

Lua Syntax

results, info = s:nominalVarsDimReduction_lpca{

dimensions=integer,

display={

caseSensitive=true | false,

exclude=true | false,

excludeAll=true | false,

keyIsPath=true | false,

names={"string-1" <, "string-2", ...>},

pathType="LABEL" | "NAME",

traceNames=true | false

epsilon=double,

freq="variable-name",

id={"variable-name-1" <, "variable-name-2", ...>},

inputs={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

m=double,

maxIter=integer,

nominals={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

output={

casOut={

caslib="string"

compress=true | false

indexVars={"variable-name-1" <, "variable-name-2", ...>}

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

onDemand=true | false

promote=true | false

replace=true | false

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where={"string-1" <, "string-2", ...>}

copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>}

outputTables={

groupByVarsRaw=true | false,

includeAll=true | false,

names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},

repeated=true | false,

replace=true | false

prefix="string",

saveState={

caslib="string",

compress=true | false,

indexVars={"variable-name-1" <, "variable-name-2", ...>},

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=true | false,

replace=true | false,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where={"string-1" <, "string-2", ...>}

table={

caslib="string",

computedOnDemand=true | false,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

singlePass=true | false,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
required parametertable	—	specifies the input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
output	required parametercasOut	specifies the output data table that contains the values of the reduced variables for the training nominal data.
outputTables	names	lists the names of results tables to save as CAS tables on the server.
saveState	—	specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

Parameter Descriptions

dimensions=integer

specifies the number of reduced variables.

Aliases	dim
	dimension
	dimens
Minimum value	1

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

epsilon=double

specifies the tolerance to use in determining the convergence of the iterative algorithm.

Alias	tolerance
Default	0.0001
Range	1E-08–0.1

freq="variable-name"

specifies the frequency variable.

Alias	frequency

id={"variable-name-1" <, "variable-name-2", ...>}

specifies the variables to use as record identifiers and to transfer to the output table that you specify in the output parameter.

inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variables to use in the analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

m=double

specifies a finite positive value to approximate the logit function's infinite limits: logit(1) is approximated by m, and logit(0) is approximated by -m.

Default	4
Range	(0–10]

maxIter=integer

specifies the maximum number of iterations for the iterative algorithm.

Aliases	maxIterations
Aliases	maxIters
Default	100
Range	1–100000000

nominals={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the nominal variables to use in the training.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

output={outputStatement}

specifies the output data table that contains the values of the reduced variables for the training nominal data.

For more information about specifying the output parameter, see the common outputStatement parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	DISPLAYOUT

prefix="string"

specifies a prefix to apply to the names of the reduced variables.

Aliases	reducedVarNamePrefix
Aliases	reducedVarsNamePrefix
Default	"rv"

saveState={casouttable}

specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

For more information about specifying the saveState parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	saveModel

* table={castable}

specifies the input table.

Long form	table={name="table-name"}
Shortcut form	table="table-name"

The castable value can be one or more of the following:

caslib="string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

computedOnDemand=true | false

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias	compOnDemand
Default	false

computedVars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

Alias	compVars

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

computedVarsProgram="string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias	compPgm

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}

specifies data source options.

Aliases	options
Aliases	dataSource

groupByMode="NOSORT" | "REDISTRIBUTE"

specifies how to create groups.

Default	NOSORT

NOSORT

groups the data without sorting on each machine, and then groups the data again on the controller.

REDISTRIBUTE

transfers rows between nodes to guarantee ordering within groups. This method is slower.

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias	import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the input table.

singlePass=true | false

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default	false

vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the input data.

whereTable={groupbytable}

The groupbytable value can be one or more of the following:

casLib="string"

specifies the caslib for the filter table. By default, the active caslib is used.

dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

specifies data source options.

Aliases	options
Aliases	dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias	import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the filter table.

vars={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the data from the filter table.

lpca Action

Reduces the dimensionality of nominal variables by using a logistic principal component analysis.

Python Syntax
Summary: Input and Output Tables
Parameter Descriptions

Python Syntax

results=s.nominalVarsDimReduction.lpca(

dimensions=integer,

display={

"caseSensitive":True | False,

"exclude":True | False,

"excludeAll":True | False,

"keyIsPath":True | False,

"names":["string-1" <, "string-2", ...>],

"pathType":"LABEL" | "NAME",

"traceNames":True | False

epsilon=double,

freq="variable-name",

id=["variable-name-1" <, "variable-name-2", ...>],

inputs=[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

m=double,

maxIter=integer,

nominals=[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

output={

"casOut":{

"caslib":"string"

"compress":True | False

"indexVars":["variable-name-1" <, "variable-name-2", ...>]

"label":"string"

"lifetime":64-bit-integer

"maxMemSize":64-bit-integer

"memoryFormat":"DVR" | "INHERIT" | "STANDARD"

"name":"table-name"

"onDemand":True | False

"promote":True | False

"replace":True | False

"replication":integer

"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE"

"threadBlockSize":64-bit-integer

"timeStamp":"string"

"where":["string-1" <, "string-2", ...>]

"copyVars":"ALL" | "ALL_MODEL" | "ALL_NUMERIC" | ["variable-name-1" <, "variable-name-2", ...>]

outputTables={

"groupByVarsRaw":True | False,

"includeAll":True | False,

"names":["string-1" <, "string-2", ...>] | {"key-1":{casouttable-1} <, "key-2":{casouttable-2}, ...>},

"repeated":True | False,

"replace":True | False

prefix="string",

saveState={

"caslib":"string",

"compress":True | False,

"indexVars":["variable-name-1" <, "variable-name-2", ...>],

"label":"string",

"lifetime":64-bit-integer,

"maxMemSize":64-bit-integer,

"memoryFormat":"DVR" | "INHERIT" | "STANDARD",

"name":"table-name",

"promote":True | False,

"replace":True | False,

"replication":integer,

"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE",

"threadBlockSize":64-bit-integer,

"timeStamp":"string",

"where":["string-1" <, "string-2", ...>]

table={

"caslib":"string",

"computedOnDemand":True | False,

"computedVars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"computedVarsProgram":"string",

"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},

"groupByMode":"NOSORT" | "REDISTRIBUTE",

"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},

"name":"table-name",

"singlePass":True | False,

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"where":"where-expression",

"whereTable":{

"casLib":"string"

"dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

"name":"table-name"

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>]

"where":"where-expression"

}

)

indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
required parametertable	—	specifies the input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
output	required parametercasOut	specifies the output data table that contains the values of the reduced variables for the training nominal data.
outputTables	names	lists the names of results tables to save as CAS tables on the server.
saveState	—	specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

Parameter Descriptions

dimensions=integer

specifies the number of reduced variables.

Aliases	dim
	dimension
	dimens
Minimum value	1

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

epsilon=double

specifies the tolerance to use in determining the convergence of the iterative algorithm.

Alias	tolerance
Default	0.0001
Range	1E-08–0.1

freq="variable-name"

specifies the frequency variable.

Alias	frequency

id=["variable-name-1" <, "variable-name-2", ...>]

specifies the variables to use as record identifiers and to transfer to the output table that you specify in the output parameter.

inputs=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the variables to use in the analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

m=double

specifies a finite positive value to approximate the logit function's infinite limits: logit(1) is approximated by m, and logit(0) is approximated by -m.

Default	4
Range	(0–10]

maxIter=integer

specifies the maximum number of iterations for the iterative algorithm.

Aliases	maxIterations
Aliases	maxIters
Default	100
Range	1–100000000

nominals=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the nominal variables to use in the training.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

output={outputStatement}

specifies the output data table that contains the values of the reduced variables for the training nominal data.

For more information about specifying the output parameter, see the common outputStatement parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	DISPLAYOUT

prefix="string"

specifies a prefix to apply to the names of the reduced variables.

Aliases	reducedVarNamePrefix
Aliases	reducedVarsNamePrefix
Default	"rv"

saveState={casouttable}

specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

For more information about specifying the saveState parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	saveModel

* table={castable}

specifies the input table.

Long form	table={"name":"table-name"}
Shortcut form	table="table-name"

The castable value can be one or more of the following:

"caslib":"string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

"computedOnDemand":True | False

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias	compOnDemand
Default	False

"computedVars":[{casinvardesc-1} <, {casinvardesc-2}, ...>]

Alias	compVars

The casinvardesc value can be one or more of the following:

"format":"string"

specifies the format to apply to the variable.

"formattedLength":integer

specifies the length of the format field plus the length of the format precision.

"label":"string"

specifies the descriptive label for the variable.

* "name":"variable-name"

specifies the name for the variable.

"nfd":integer

specifies the length of the format precision.

"nfl":integer

specifies the length of the format field.

"computedVarsProgram":"string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias	compPgm

"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>}

specifies data source options.

Aliases	options
Aliases	dataSource

"groupByMode":"NOSORT" | "REDISTRIBUTE"

specifies how to create groups.

Default	NOSORT

NOSORT

groups the data without sorting on each machine, and then groups the data again on the controller.

REDISTRIBUTE

transfers rows between nodes to guarantee ordering within groups. This method is slower.

"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias	import_

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* "name":"table-name"

specifies the name of the input table.

"singlePass":True | False

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default	False

"vars":[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

"format":"string"

specifies the format to apply to the variable.

"formattedLength":integer

specifies the length of the format field plus the length of the format precision.

"label":"string"

specifies the descriptive label for the variable.

* "name":"variable-name"

specifies the name for the variable.

"nfd":integer

specifies the length of the format precision.

"nfl":integer

specifies the length of the format field.

"where":"where-expression"

specifies an expression for subsetting the input data.

"whereTable":{groupbytable}

The groupbytable value can be one or more of the following:

"casLib":"string"

specifies the caslib for the filter table. By default, the active caslib is used.

"dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

specifies data source options.

Aliases	options
Aliases	dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}

specifies the settings for reading a table from a data source.

Alias	import_

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* "name":"table-name"

specifies the name of the filter table.

"vars":[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

"format":"string"

specifies the format to apply to the variable.

"formattedLength":integer

specifies the length of the format field plus the length of the format precision.

"label":"string"

specifies the descriptive label for the variable.

* "name":"variable-name"

specifies the name for the variable.

"nfd":integer

specifies the length of the format precision.

"nfl":integer

specifies the length of the format field.

"where":"where-expression"

specifies an expression for subsetting the data from the filter table.

lpca Action

Reduces the dimensionality of nominal variables by using a logistic principal component analysis.

R Syntax
Summary: Input and Output Tables
Parameter Descriptions

R Syntax

results <– cas.nominalVarsDimReduction.lpca(s,

dimensions=integer,

display=list(

caseSensitive=TRUE | FALSE,

exclude=TRUE | FALSE,

excludeAll=TRUE | FALSE,

keyIsPath=TRUE | FALSE,

names=list("string-1" <, "string-2", ...>),

pathType="LABEL" | "NAME",

traceNames=TRUE | FALSE

epsilon=double,

freq="variable-name",

id=list("variable-name-1" <, "variable-name-2", ...>),

inputs=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

m=double,

maxIter=integer,

nominals=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

output=list(

casOut=list(

caslib="string"

compress=TRUE | FALSE

indexVars=list("variable-name-1" <, "variable-name-2", ...>)

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

onDemand=TRUE | FALSE

promote=TRUE | FALSE

replace=TRUE | FALSE

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where=list("string-1" <, "string-2", ...>)

copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | list("variable-name-1" <, "variable-name-2", ...>)

outputTables=list(

groupByVarsRaw=TRUE | FALSE,

includeAll=TRUE | FALSE,

names=list("string-1" <, "string-2", ...>) | list(key-1=list(casouttable-1) <, key-2=list(casouttable-2), ...>),

repeated=TRUE | FALSE,

replace=TRUE | FALSE

prefix="string",

saveState=list(

caslib="string",

compress=TRUE | FALSE,

indexVars=list("variable-name-1" <, "variable-name-2", ...>),

label="string",

lifetime=64-bit-integer,

maxMemSize=64-bit-integer,

memoryFormat="DVR" | "INHERIT" | "STANDARD",

name="table-name",

promote=TRUE | FALSE,

replace=TRUE | FALSE,

replication=integer,

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",

threadBlockSize=64-bit-integer,

timeStamp="string",

where=list("string-1" <, "string-2", ...>)

table=list(

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

computedVarsProgram="string",

dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

singlePass=TRUE | FALSE,

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression",

whereTable=list(

casLib="string"

name="table-name"

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>)

where="where-expression"

)

indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
required parametertable	—	specifies the input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
output	required parametercasOut	specifies the output data table that contains the values of the reduced variables for the training nominal data.
outputTables	names	lists the names of results tables to save as CAS tables on the server.
saveState	—	specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

Parameter Descriptions

dimensions=integer

specifies the number of reduced variables.

Aliases	dim
	dimension
	dimens
Minimum value	1

display=list(displayTables)

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

epsilon=double

specifies the tolerance to use in determining the convergence of the iterative algorithm.

Alias	tolerance
Default	0.0001
Range	1E-08–0.1

freq="variable-name"

specifies the frequency variable.

Alias	frequency

id=list("variable-name-1" <, "variable-name-2", ...>)

specifies the variables to use as record identifiers and to transfer to the output table that you specify in the output parameter.

inputs=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the variables to use in the analysis.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	input

m=double

specifies a finite positive value to approximate the logit function's infinite limits: logit(1) is approximated by m, and logit(0) is approximated by -m.

Default	4
Range	(0–10]

maxIter=integer

specifies the maximum number of iterations for the iterative algorithm.

Aliases	maxIterations
Aliases	maxIters
Default	100
Range	1–100000000

nominals=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the nominal variables to use in the training.

For more information about specifying the nominals parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Alias	nominal

output=list(outputStatement)

specifies the output data table that contains the values of the reduced variables for the training nominal data.

For more information about specifying the output parameter, see the common outputStatement parameter (Appendix A: Common Parameters).

outputTables=list(outputTables)

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	DISPLAYOUT

prefix="string"

specifies a prefix to apply to the names of the reduced variables.

Aliases	reducedVarNamePrefix
Aliases	reducedVarsNamePrefix
Default	"rv"

saveState=list(casouttable)

specifies the output data table in which to save the dimensionality reduction model of the nominal variables for future scoring.

For more information about specifying the saveState parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	saveModel

* table=list(castable)

specifies the input table.

Long form	table=list(name="table-name")
Shortcut form	table="table-name"

The castable value can be one or more of the following:

caslib="string"

specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib.

computedOnDemand=TRUE | FALSE

when set to True, creates the computed variables when the table is loaded instead of when the action begins.

Alias	compOnDemand
Default	FALSE

computedVars=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

Alias	compVars

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

computedVarsProgram="string"

specifies an expression for each computed variable that you include in the computedVars parameter.

Alias	compPgm

dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>)

specifies data source options.

Aliases	options
Aliases	dataSource

groupByMode="NOSORT" | "REDISTRIBUTE"

specifies how to create groups.

Default	NOSORT

NOSORT

groups the data without sorting on each machine, and then groups the data again on the controller.

REDISTRIBUTE

transfers rows between nodes to guarantee ordering within groups. This method is slower.

importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters)

specifies the settings for reading a table from a data source.

Alias	import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the input table.

singlePass=TRUE | FALSE

when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs.

Default	FALSE

vars=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the variables to use in the action.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the input data.

whereTable=list(groupbytable)

The groupbytable value can be one or more of the following:

casLib="string"

specifies the caslib for the filter table. By default, the active caslib is used.

dataSourceOptions=list(adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters)

specifies data source options.

Aliases	options
Aliases	dataSource

For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter (Appendix A: Common Parameters).

importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters)

specifies the settings for reading a table from a data source.

Alias	import

For more information about specifying the importOptions parameter, see the common importOptions parameter (Appendix A: Common Parameters).

* name="table-name"

specifies the name of the filter table.

vars=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the variable names to use from the filter table.

The casinvardesc value can be one or more of the following:

format="string"

specifies the format to apply to the variable.

formattedLength=integer

specifies the length of the format field plus the length of the format precision.

label="string"

specifies the descriptive label for the variable.

* name="variable-name"

specifies the name for the variable.

nfd=integer

specifies the length of the format precision.

nfl=integer

specifies the length of the format field.

where="where-expression"

specifies an expression for subsetting the data from the filter table.

Last updated: November 23, 2025