Nonnegative Matrix Factorization Action Set

Provides actions for performing nonnegative matrix factorization.

nmf Action

Performs factorization of a nonnegative data matrix as the product of two low-rank nonnegative matrices.
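
To make the idea of the factorization concrete, the following pure-Python sketch approximates a small nonnegative matrix X (n x p) by the product of W (n x r) and H (r x p). It uses the classic multiplicative update rules as an illustrative stand-in; this is not the APG or RANDOM method that the nmf action provides, and all function names here are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frobenius_error(x, w, h):
    """Squared Frobenius norm of X - W*H."""
    wh = matmul(w, h)
    return sum((x[i][j] - wh[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))

def nmf_multiplicative(x, rank, iterations=200, seed=1, eps=1e-12):
    """Factor nonnegative x (n x p) into w (n x rank) and h (rank x p)."""
    rng = random.Random(seed)
    n, p = len(x), len(x[0])
    # Strictly positive starting values keep every later update nonnegative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(p)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W'X) / (W'WH), element by element
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(p)]
             for k in range(rank)]
        # W <- W * (XH') / (WHH'), element by element
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# A 4 x 3 nonnegative matrix, factored with target rank 2.
X = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [3.0, 6.0, 9.0],
     [1.0, 1.0, 1.0]]
W, H = nmf_multiplicative(X, rank=2)
```

Here W has one row per observation and one column per component, and H has one row per component and one column per input variable, which mirrors the roles of the W and H factor matrices in the output tables described later.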

CASL Syntax

nmf.nmf <result=results> <status=rc> /

attributes={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

display={

caseSensitive=TRUE | FALSE,

exclude=TRUE | FALSE,

excludeAll=TRUE | FALSE,

keyIsPath=TRUE | FALSE,

names={"string-1" <, "string-2", ...>},

pathType="LABEL" | "NAME",

traceNames=TRUE | FALSE

},

groupByLimit=64-bit-integer,

impute="NONE" | {outputX},

inputs={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

iterationDetail=TRUE | FALSE,

method={

delta=double,

maxIter=integer,

name="APG" | "RANDOM",

oversampling=integer,

subIter=integer,

tolerance=double,

updates=integer

},

missing="MEAN" | "NONE",

noScale=TRUE | FALSE,

output={

casOut={

caslib="string"

compress=TRUE | FALSE

indexVars={"variable-name-1" <, "variable-name-2", ...>}

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

promote=TRUE | FALSE

replace=TRUE | FALSE

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where={"string-1" <, "string-2", ...>}

},

component="string",

copyVars={"variable-name-1" <, "variable-name-2", ...>}

},

outputH={

casOut={

caslib="string"

compress=TRUE | FALSE

indexVars={"variable-name-1" <, "variable-name-2", ...>}

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

promote=TRUE | FALSE

replace=TRUE | FALSE

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where={"string-1" <, "string-2", ...>}

}

},

outputTables={

groupByVarsRaw=TRUE | FALSE,

includeAll=TRUE | FALSE,

names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},

repeated=TRUE | FALSE,

replace=TRUE | FALSE

},

prefix="string",

rank=integer,

regularization={

alpha=double,

beta=double,

lcurve=TRUE | FALSE,

name="L1" | "L2"

},

seed=integer,

stopMeasure="OBJFUNC" | "PROJGRAD",

table={

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=TRUE | FALSE,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

}

;

An asterisk (*) in the parameter descriptions indicates a required parameter.

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
table (required)	—	specifies the settings for an input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
impute	—	specifies the settings for low-rank matrix completion.
output	casOut (required)	specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.
outputH	casOut (required)	specifies the output table to be created to contain the factor matrix H.
outputTables	names	lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}

changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	attribute
	attr

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value	1

impute="NONE" | {outputX}

specifies the settings for low-rank matrix completion.

NONE	suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias	copyVar

imputedRowsOnly=TRUE | FALSE

when set to True, keeps only the rows that contain the imputed values in the output table.

Default	FALSE

* output={casouttable}

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	outputX

predOnly=TRUE | FALSE

when set to True, sets the observed values to missing values in the output table.

Default	FALSE

inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	input
	vars
	var

iterationDetail=TRUE | FALSE

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias	iterDetail
Default	FALSE

* method={nmf_method}

specifies the settings for the matrix factorization method.

Long form	method={name="APG" | "RANDOM"}
Shortcut form	method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

delta=double

specifies the coefficient that is used to control the extrapolation weight.

Default	0.9999
Range	(0, 1)

maxIter=integer

specifies the maximum number of iterations to perform.

Default	500
Range	1–MACINT

* name="APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

oversampling=integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias	oversamp
Default	10
Minimum value	0

subIter=integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default	4
Minimum value	0

tolerance=double

specifies the tolerance at which the iteration stops.

Alias	tol
Default	1E-07
Range	0–1

updates=integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default	10
Range	1–MACINT

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default	NONE

MEAN

replaces each missing value with the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.

noScale=TRUE | FALSE

when set to True, suppresses scaling of the numeric analysis variables to the range 0 to 1.

Default	FALSE

output={nmf_outputW}

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias	outputW

The nmf_outputW value can be one or more of the following:

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

component="string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias	comp

copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias	copyVar

outputH={nmf_outputH}

specifies the output table to be created to contain the factor matrix H.

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default	"Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias	r
Range	1–MACINT

regularization={nmf_reg}

specifies the settings for regularization.

Aliases	reg
	penalty

Long form	regularization={name="L1" | "L2"}
Shortcut form	regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

alpha=double

specifies the regularization weight of the factor matrix W.

Default	1
Minimum value	0

beta=double

specifies the regularization weight of the factor matrix H.

Default	1
Minimum value	0

lcurve=TRUE | FALSE

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default	FALSE

* name="L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.
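
The following Python sketch shows how the alpha and beta weights could enter a penalized objective. It uses a common textbook form, half the squared reconstruction error plus alpha times a penalty on W plus beta times a penalty on H. The exact objective that the action minimizes is not stated here, so the 0.5 factor, the squared L2 norm, and the function names are all assumptions made for illustration.

```python
def penalty(matrix, name):
    """L1: sum of absolute entries. L2: sum of squared entries (assumed form)."""
    if name == "L1":
        return sum(abs(v) for row in matrix for v in row)
    if name == "L2":
        return sum(v * v for row in matrix for v in row)
    raise ValueError('name must be "L1" or "L2"')

def regularized_objective(x, w, h, name="L2", alpha=1.0, beta=1.0):
    """0.5 * ||X - WH||_F^2 + alpha * penalty(W) + beta * penalty(H)."""
    wh = [[sum(a * b for a, b in zip(row, col)) for col in zip(*h)] for row in w]
    fit = sum((x[i][j] - wh[i][j]) ** 2
              for i in range(len(x)) for j in range(len(x[0])))
    return 0.5 * fit + alpha * penalty(w, name) + beta * penalty(h, name)

# W*H reproduces X exactly here, so only the penalty terms contribute.
x = [[1.0, 2.0]]
w = [[2.0]]
h = [[0.5, 1.0]]
l1_value = regularized_objective(x, w, h, name="L1")
l2_value = regularized_objective(x, w, h, name="L2")
```

Under this assumed form, a larger alpha pushes the entries of W toward zero and a larger beta does the same for H, and setting both weights to 0 recovers the unpenalized fit.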

seed=integer

specifies the seed value for pseudorandom number generation.

Default	0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases	stop
	stopCriterion
Default	PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.

* table={castable}

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

Lua Syntax

results, info = s:nmf_nmf{

attributes={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

display={

caseSensitive=true | false,

exclude=true | false,

excludeAll=true | false,

keyIsPath=true | false,

names={"string-1" <, "string-2", ...>},

pathType="LABEL" | "NAME",

traceNames=true | false

},

groupByLimit=64-bit-integer,

impute="NONE" | {outputX},

inputs={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

iterationDetail=true | false,

method={

delta=double,

maxIter=integer,

name="APG" | "RANDOM",

oversampling=integer,

subIter=integer,

tolerance=double,

updates=integer

},

missing="MEAN" | "NONE",

noScale=true | false,

output={

casOut={

caslib="string"

compress=true | false

indexVars={"variable-name-1" <, "variable-name-2", ...>}

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

promote=true | false

replace=true | false

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where={"string-1" <, "string-2", ...>}

},

component="string",

copyVars={"variable-name-1" <, "variable-name-2", ...>}

},

outputH={

casOut={

caslib="string"

compress=true | false

indexVars={"variable-name-1" <, "variable-name-2", ...>}

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

promote=true | false

replace=true | false

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where={"string-1" <, "string-2", ...>}

}

},

outputTables={

groupByVarsRaw=true | false,

includeAll=true | false,

names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},

repeated=true | false,

replace=true | false

},

prefix="string",

rank=integer,

regularization={

alpha=double,

beta=double,

lcurve=true | false,

name="L1" | "L2"

},

seed=integer,

stopMeasure="OBJFUNC" | "PROJGRAD",

table={

caslib="string",

computedOnDemand=true | false,

computedVars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

computedVarsProgram="string",

dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},

groupBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

singlePass=true | false,

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}},

where="where-expression",

whereTable={

casLib="string"

name="table-name"

vars={{

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

}, {...}}

where="where-expression"

}

}

}

An asterisk (*) in the parameter descriptions indicates a required parameter.

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
table (required)	—	specifies the settings for an input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
impute	—	specifies the settings for low-rank matrix completion.
output	casOut (required)	specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.
outputH	casOut (required)	specifies the output table to be created to contain the factor matrix H.
outputTables	names	lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}

changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	attribute
	attr

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value	1

impute="NONE" | {outputX}

specifies the settings for low-rank matrix completion.

NONE	suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias	copyVar

imputedRowsOnly=true | false

when set to True, keeps only the rows that contain the imputed values in the output table.

Default	false

* output={casouttable}

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	outputX

predOnly=true | false

when set to True, sets the observed values to missing values in the output table.

Default	false

inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	input
	vars
	var

iterationDetail=true | false

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias	iterDetail
Default	false

* method={nmf_method}

specifies the settings for the matrix factorization method.

Long form	method={name="APG" | "RANDOM"}
Shortcut form	method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

delta=double

specifies the coefficient that is used to control the extrapolation weight.

Default	0.9999
Range	(0, 1)

maxIter=integer

specifies the maximum number of iterations to perform.

Default	500
Range	1–MACINT

* name="APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

oversampling=integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias	oversamp
Default	10
Minimum value	0

subIter=integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default	4
Minimum value	0

tolerance=double

specifies the tolerance at which the iteration stops.

Alias	tol
Default	1E-07
Range	0–1

updates=integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default	10
Range	1–MACINT

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default	NONE

MEAN

replaces each missing value with the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.

noScale=true | false

when set to True, suppresses scaling of the numeric analysis variables to the range 0 to 1.

Default	false

output={nmf_outputW}

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias	outputW

The nmf_outputW value can be one or more of the following:

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

component="string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias	comp

copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias	copyVar

outputH={nmf_outputH}

specifies the output table to be created to contain the factor matrix H.

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default	"Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias	r
Range	1–MACINT

regularization={nmf_reg}

specifies the settings for regularization.

Aliases	reg
	penalty

Long form	regularization={name="L1" | "L2"}
Shortcut form	regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

alpha=double

specifies the regularization weight of the factor matrix W.

Default	1
Minimum value	0

beta=double

specifies the regularization weight of the factor matrix H.

Default	1
Minimum value	0

lcurve=true | false

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default	false

* name="L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.

seed=integer

specifies the seed value for pseudorandom number generation.

Default	0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases	stop
	stopCriterion
Default	PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.

* table={castable}

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

Python Syntax

results=s.nmf.nmf(

attributes=[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

display={

"caseSensitive":True | False,

"exclude":True | False,

"excludeAll":True | False,

"keyIsPath":True | False,

"names":["string-1" <, "string-2", ...>],

"pathType":"LABEL" | "NAME",

"traceNames":True | False

},

groupByLimit=64-bit-integer,

impute="NONE" | {outputX},

inputs=[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

iterationDetail=True | False,

method={

"delta":double,

"maxIter":integer,

"name":"APG" | "RANDOM",

"oversampling":integer,

"subIter":integer,

"tolerance":double,

"updates":integer

},

missing="MEAN" | "NONE",

noScale=True | False,

output={

"casOut":{

"caslib":"string"

"compress":True | False

"indexVars":["variable-name-1" <, "variable-name-2", ...>]

"label":"string"

"lifetime":64-bit-integer

"maxMemSize":64-bit-integer

"memoryFormat":"DVR" | "INHERIT" | "STANDARD"

"name":"table-name"

"promote":True | False

"replace":True | False

"replication":integer

"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE"

"threadBlockSize":64-bit-integer

"timeStamp":"string"

"where":["string-1" <, "string-2", ...>]

},

"component":"string",

"copyVars":["variable-name-1" <, "variable-name-2", ...>]

},

outputH={

"casOut":{

"caslib":"string"

"compress":True | False

"indexVars":["variable-name-1" <, "variable-name-2", ...>]

"label":"string"

"lifetime":64-bit-integer

"maxMemSize":64-bit-integer

"memoryFormat":"DVR" | "INHERIT" | "STANDARD"

"name":"table-name"

"promote":True | False

"replace":True | False

"replication":integer

"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE"

"threadBlockSize":64-bit-integer

"timeStamp":"string"

"where":["string-1" <, "string-2", ...>]

}

},

outputTables={

"groupByVarsRaw":True | False,

"includeAll":True | False,

"names":["string-1" <, "string-2", ...>] | {"key-1":{casouttable-1} <, "key-2":{casouttable-2}, ...>},

"repeated":True | False,

"replace":True | False

},

prefix="string",

rank=integer,

regularization={

"alpha":double,

"beta":double,

"lcurve":True | False,

"name":"L1" | "L2"

},

seed=integer,

stopMeasure="OBJFUNC" | "PROJGRAD",

table={

"caslib":"string",

"computedOnDemand":True | False,

"computedVars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"computedVarsProgram":"string",

"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},

"groupBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"groupByMode":"NOSORT" | "REDISTRIBUTE",

"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},

"name":"table-name",

"orderBy":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"singlePass":True | False,

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>],

"where":"where-expression",

"whereTable":{

"casLib":"string"

"dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}

"name":"table-name"

"vars":[{

"format":"string",

"formattedLength":integer,

"label":"string",

"name":"variable-name",

"nfd":integer,

"nfl":integer

}<, {...}>]

"where":"where-expression"

}

}

)

An asterisk (*) in the parameter descriptions indicates a required parameter.
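
As a minimal sketch of how the Python syntax above might be filled in, the helper below assembles the required parameters (table, rank, and method) plus two output tables into a dictionary of keyword arguments. The helper name, the table names, and the choice of replace=True are hypothetical, and the sketch does not contact a CAS server.

```python
def build_nmf_params(table_name, rank, method="APG", caslib=None,
                     out_w="nmf_w", out_h="nmf_h", seed=0):
    """Assemble keyword arguments for a call of the form s.nmf.nmf(**params)."""
    if rank < 1:
        raise ValueError("rank must be at least 1")
    if method not in ("APG", "RANDOM"):
        raise ValueError('method must be "APG" or "RANDOM"')
    table = {"name": table_name}
    if caslib is not None:
        table["caslib"] = caslib
    return {
        "table": table,                  # required: input table
        "rank": rank,                    # required: target rank
        "method": {"name": method},      # required: long form of method
        "seed": seed,
        # casOut is required inside output and outputH when they are given.
        "output": {"casOut": {"name": out_w, "replace": True}},
        "outputH": {"casOut": {"name": out_h, "replace": True}},
    }

params = build_nmf_params("my_table", rank=5)
# With an open CAS session `s` and the nmf action set loaded, the call
# would then take the form: results = s.nmf.nmf(**params)
```

Building the dictionary separately from the call makes it easy to inspect or log the parameters before the action runs.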

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
table (required)	—	specifies the settings for an input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
impute	—	specifies the settings for low-rank matrix completion.
output	casOut (required)	specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.
outputH	casOut (required)	specifies the output table to be created to contain the factor matrix H.
outputTables	names	lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	attribute
	attr

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value	1

impute="NONE" | {outputX}

specifies the settings for low-rank matrix completion.

NONE	suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

"copyVars":["variable-name-1" <, "variable-name-2", ...>]

copies one or more variables from the input table to the output table.

Alias	copyVar

"imputedRowsOnly":True | False

when set to True, keeps only the rows that contain the imputed values in the output table.

Default	False

* "output":{casouttable}

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	outputX

"predOnly":True | False

when set to True, sets the observed values to missing values in the output table.

Default	False
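
The effect of imputedRowsOnly and predOnly on the imputation output table can be sketched in plain Python as follows. This reflects one reading of the descriptions above: missing cells are marked with None, the `completed` matrix stands in for the values produced by the low-rank model, and the function name is hypothetical.

```python
def shape_impute_output(observed, completed,
                        imputed_rows_only=False, pred_only=False):
    """Return the rows of the imputation output table.

    observed  -- input rows, with None marking a missing value
    completed -- rows of the same shape, every cell filled by the model
    """
    rows = []
    for obs_row, full_row in zip(observed, completed):
        has_missing = any(v is None for v in obs_row)
        if imputed_rows_only and not has_missing:
            continue  # imputedRowsOnly=True drops rows with nothing imputed
        if pred_only:
            # predOnly=True: observed cells become missing, imputed cells stay
            out = [full if obs is None else None
                   for obs, full in zip(obs_row, full_row)]
        else:
            # default: observed cells are kept, missing cells are replaced
            out = [full if obs is None else obs
                   for obs, full in zip(obs_row, full_row)]
        rows.append(out)
    return rows

observed = [[1.0, None], [2.0, 3.0]]
completed = [[1.1, 0.5], [2.1, 2.9]]
default_rows = shape_impute_output(observed, completed)
imputed_only = shape_impute_output(observed, completed, imputed_rows_only=True)
predictions = shape_impute_output(observed, completed, pred_only=True)
```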

inputs=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	input
	vars
	var

iterationDetail=True | False

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias	iterDetail
Default	False

* method={nmf_method}

specifies the settings for the matrix factorization method.

Long form	method={"name":"APG" | "RANDOM"}
Shortcut form	method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

"delta":double

specifies the coefficient that is used to control the extrapolation weight.

Default	0.9999
Range	(0, 1)

"maxIter":integer

specifies the maximum number of iterations to perform.

Default	500
Range	1–MACINT

* "name":"APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

"oversampling":integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias	oversamp
Default	10
Minimum value	0

"subIter":integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default	4
Minimum value	0

"tolerance":double

specifies the tolerance at which the iteration stops.

Alias	tol
Default	1E-07
Range	0–1

"updates":integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default	10
Range	1–MACINT
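
The method subparameters and their documented ranges can be checked on the client before the action is submitted. The following Python sketch builds the long form of the method parameter as a dictionary. The helper make_method is hypothetical, and treating the tolerance range 0–1 as inclusive is an assumption.

```python
# Range checks taken from the subparameter descriptions; MACINT upper bounds are not checked.
RANGE_CHECKS = {
    "delta": lambda v: 0 < v < 1,        # open interval (0, 1)
    "maxIter": lambda v: v >= 1,
    "oversampling": lambda v: v >= 0,
    "subIter": lambda v: v >= 0,
    "tolerance": lambda v: 0 <= v <= 1,  # assumed inclusive
    "updates": lambda v: v >= 1,
}

def make_method(name, **settings):
    """Return a method dictionary, raising ValueError for out-of-range settings."""
    if name not in ("APG", "RANDOM"):
        raise ValueError("name must be 'APG' or 'RANDOM'")
    method = {"name": name}
    for key, value in settings.items():
        check = RANGE_CHECKS.get(key)
        if check is None:
            raise ValueError("unknown method subparameter: " + key)
        if not check(value):
            raise ValueError("value out of range for " + key)
        method[key] = value
    return method

random_method = make_method("RANDOM", oversampling=20, subIter=2)
apg_method = make_method("APG", maxIter=200, tolerance=1e-6)
```

Only the settings that are passed explicitly are included, so the server-side defaults (including the different default for updates when impute is specified) still apply to everything else.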

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default	NONE

MEAN

imputes missing values to the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.
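
The difference between the two settings can be seen on a small example. The following Python sketch is illustrative only: the function handle_missing is hypothetical and operates on in-memory lists rather than on a CAS table.

```python
import math

def handle_missing(rows, missing="NONE"):
    """Apply the documented MEAN or NONE behavior to a list of numeric rows."""
    if missing == "NONE":
        # drop every observation that has at least one missing value
        return [row for row in rows if not any(math.isnan(v) for v in row)]
    if missing == "MEAN":
        means = []
        for j in range(len(rows[0])):
            present = [row[j] for row in rows if not math.isnan(row[j])]
            # a column with no observed values has no mean to impute
            means.append(sum(present) / len(present) if present else math.nan)
        return [[means[j] if math.isnan(v) else v for j, v in enumerate(row)]
                for row in rows]
    raise ValueError("missing must be 'MEAN' or 'NONE'")

data = [[1.0, 2.0], [math.nan, 4.0], [3.0, 6.0]]
kept = handle_missing(data, "NONE")
imputed = handle_missing(data, "MEAN")
```

With NONE, the second observation is removed. With MEAN, its missing value is replaced by 2.0, the mean of the observed values 1.0 and 3.0 in that column.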

noScale=True | False

when set to True, suppresses scaling of the numeric variables to be analyzed to between 0 and 1.

Default	False
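
The description states only that variables are scaled to between 0 and 1. The following Python sketch shows one common way to do this, column-wise min-max scaling. Both the formula and the function name scale_columns are assumptions, not the documented computation.

```python
def scale_columns(rows):
    """Min-max scale each column of a list of rows to the interval [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # a constant column has no spread, so it is mapped to 0.0
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

raw = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
scaled = scale_columns(raw)
```

Setting noScale to True would correspond to skipping this step and factorizing the raw values directly.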

output={nmf_outputW}

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias	outputW

The nmf_outputW value can be one or more of the following:

* "casOut":{casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

"component":"string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias	comp

"copyVars":["variable-name-1" <, "variable-name-2", ...>]

copies one or more variables from the input table to the output table.

Alias	copyVar

outputH={nmf_outputH}

specifies the output table to be created to contain the factor matrix H.

* "casOut":{casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default	"Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias	r
Range	1–MACINT

regularization={nmf_reg}

specifies the settings for regularization.

Aliases	reg
	penalty

Long form	regularization={"name":"L1" | "L2"}
Shortcut form	regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

"alpha":double

specifies the regularization weight of the factor matrix W.

Default	1
Minimum value	0

"beta":double

specifies the regularization weight of the factor matrix H.

Default	1
Minimum value	0

"lcurve":True | False

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default	False

* "name":"L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.
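
To show how alpha and beta weight the penalties on W and H, the following Python sketch evaluates a generic penalized least squares objective. The exact objective that the action minimizes is not stated here, so the form 0.5 * ||X - WH||^2 + alpha * penalty(W) + beta * penalty(H), the squared form of the L2 penalty, and the function name penalized_objective are all assumptions.

```python
def penalized_objective(X, W, H, name="L2", alpha=1.0, beta=1.0):
    """Evaluate an assumed regularized NMF objective on small list-of-lists matrices."""
    rank = len(H)
    fit = 0.0
    for i, x_row in enumerate(X):
        for j, x in enumerate(x_row):
            approx = sum(W[i][k] * H[k][j] for k in range(rank))
            fit += (x - approx) ** 2  # squared reconstruction error
    if name == "L1":
        penalty = lambda M: sum(abs(v) for row in M for v in row)
    elif name == "L2":
        penalty = lambda M: sum(v * v for row in M for v in row)
    else:
        raise ValueError("name must be 'L1' or 'L2'")
    return 0.5 * fit + alpha * penalty(W) + beta * penalty(H)

X = [[2.0, 4.0], [1.0, 2.0]]
W = [[2.0], [1.0]]   # 2 x 1 factor
H = [[1.0, 2.0]]     # 1 x 2 factor, so W times H reproduces X exactly
l1_value = penalized_objective(X, W, H, name="L1")
l2_value = penalized_objective(X, W, H, name="L2")
```

Because the fit term is zero in this example, the values reflect the penalties alone: larger alpha or beta pushes the corresponding factor toward smaller entries.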

seed=integer

specifies the seed value for pseudorandom number generation.

Default	0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases	stop
	stopCriterion
Default	PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.

* table={castable}

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).
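
Putting the required parameters together, the following Python sketch assembles a minimal parameter dictionary for the nmf action. The helper build_nmf_params and the table names are hypothetical, and the commented call assumes an existing SWAT connection object named conn.

```python
def build_nmf_params(table_name, rank, method="APG", seed=0):
    """Assemble the required parameters (table, rank, method) plus two output tables."""
    if rank < 1:
        raise ValueError("rank must be at least 1")
    if method not in ("APG", "RANDOM"):
        raise ValueError("method must be 'APG' or 'RANDOM'")
    return {
        "table": {"name": table_name},
        "rank": rank,
        "method": {"name": method},
        # casOut is required whenever output or outputH is specified
        "output": {"casOut": {"name": table_name + "_w", "replace": True}},
        "outputH": {"casOut": {"name": table_name + "_h", "replace": True}},
        "seed": seed,
    }

params = build_nmf_params("ratings", rank=5, seed=12345)

# With a live CAS session (not run here), the call might look like this:
#   conn.loadactionset("nmf")
#   results = conn.nmf.nmf(**params)
```

Keeping the parameters in a dictionary makes it easy to add optional settings such as regularization or impute before the action is submitted.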

R Syntax

results <- cas.nmf.nmf(s,

attributes=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

display=list(

caseSensitive=TRUE | FALSE,

exclude=TRUE | FALSE,

excludeAll=TRUE | FALSE,

keyIsPath=TRUE | FALSE,

names=list("string-1" <, "string-2", ...>),

pathType="LABEL" | "NAME",

traceNames=TRUE | FALSE

),

groupByLimit=64-bit-integer,

impute="NONE" | list(outputX),

inputs=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

iterationDetail=TRUE | FALSE,

method=list(

delta=double,

maxIter=integer,

name="APG" | "RANDOM",

oversampling=integer,

subIter=integer,

tolerance=double,

updates=integer

),

missing="MEAN" | "NONE",

noScale=TRUE | FALSE,

output=list(

casOut=list(

caslib="string"

compress=TRUE | FALSE

indexVars=list("variable-name-1" <, "variable-name-2", ...>)

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

promote=TRUE | FALSE

replace=TRUE | FALSE

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where=list("string-1" <, "string-2", ...>)

),

component="string",

copyVars=list("variable-name-1" <, "variable-name-2", ...>)

),

outputH=list(

casOut=list(

caslib="string"

compress=TRUE | FALSE

indexVars=list("variable-name-1" <, "variable-name-2", ...>)

label="string"

lifetime=64-bit-integer

maxMemSize=64-bit-integer

memoryFormat="DVR" | "INHERIT" | "STANDARD"

name="table-name"

promote=TRUE | FALSE

replace=TRUE | FALSE

replication=integer

tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"

threadBlockSize=64-bit-integer

timeStamp="string"

where=list("string-1" <, "string-2", ...>)

)

),

outputTables=list(

groupByVarsRaw=TRUE | FALSE,

includeAll=TRUE | FALSE,

names=list("string-1" <, "string-2", ...>) | list(key-1=list(casouttable-1) <, key-2=list(casouttable-2), ...>),

repeated=TRUE | FALSE,

replace=TRUE | FALSE

),

prefix="string",

rank=integer,

regularization=list(

alpha=double,

beta=double,

lcurve=TRUE | FALSE,

name="L1" | "L2"

),

seed=integer,

stopMeasure="OBJFUNC" | "PROJGRAD",

table=list(

caslib="string",

computedOnDemand=TRUE | FALSE,

computedVars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

computedVarsProgram="string",

dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),

groupBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

groupByMode="NOSORT" | "REDISTRIBUTE",

name="table-name",

orderBy=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

singlePass=TRUE | FALSE,

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>),

where="where-expression",

whereTable=list(

casLib="string"

name="table-name"

vars=list( list(

format="string",

formattedLength=integer,

label="string",

name="variable-name",

nfd=integer,

nfl=integer

) <, list(...)>)

where="where-expression"

)

)

)

* indicates a required parameter

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables
Parameter	Subparameter	Description
* table	—	specifies the settings for an input table.

Parameters for Creating Output Tables
Parameter	Subparameter	Description
impute	—	specifies the settings for low-rank matrix completion.
output	* casOut	specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.
outputH	* casOut	specifies the output table to be created to contain the factor matrix H.
outputTables	names	lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	attribute
	attr

display=list(displayTables)

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value	1

impute="NONE" | list(outputX)

specifies the settings for low-rank matrix completion.

NONE	suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

copyVars=list("variable-name-1" <, "variable-name-2", ...>)

copies one or more variables from the input table to the output table.

Alias	copyVar

imputedRowsOnly=TRUE | FALSE

when set to True, keeps only the rows that contain the imputed values in the output table.

Default	FALSE

* output=list(casouttable)

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias	outputX

predOnly=TRUE | FALSE

when set to True, sets the observed values to missing values in the output table.

Default	FALSE

inputs=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases	input
	vars
	var

iterationDetail=TRUE | FALSE

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias	iterDetail
Default	FALSE

* method=list(nmf_method)

specifies the settings for the matrix factorization method.

Long form	method=list(name="APG" | "RANDOM")
Shortcut form	method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

delta=double

specifies the coefficient that is used to control the extrapolation weight.

Default	0.9999
Range	(0, 1)

maxIter=integer

specifies the maximum number of iterations to perform.

Default	500
Range	1–MACINT

* name="APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

oversampling=integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias	oversamp
Default	10
Minimum value	0

subIter=integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default	4
Minimum value	0

tolerance=double

specifies the tolerance at which the iteration stops.

Alias	tol
Default	1E-07
Range	0–1

updates=integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default	10
Range	1–MACINT

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default	NONE

MEAN

imputes missing values to the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.

noScale=TRUE | FALSE

when set to True, suppresses scaling of the numeric variables to be analyzed to between 0 and 1.

Default	FALSE

output=list(nmf_outputW)

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias	outputW

The nmf_outputW value can be one or more of the following:

* casOut=list(casouttable)

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

component="string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias	comp

copyVars=list("variable-name-1" <, "variable-name-2", ...>)

copies one or more variables from the input table to the output table.

Alias	copyVar

outputH=list(nmf_outputH)

specifies the output table to be created to contain the factor matrix H.

* casOut=list(casouttable)

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables=list(outputTables)

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias	displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default	"Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias	r
Range	1–MACINT

regularization=list(nmf_reg)

specifies the settings for regularization.

Aliases	reg
	penalty

Long form	regularization=list(name="L1" | "L2")
Shortcut form	regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

alpha=double

specifies the regularization weight of the factor matrix W.

Default	1
Minimum value	0

beta=double

specifies the regularization weight of the factor matrix H.

Default	1
Minimum value	0

lcurve=TRUE | FALSE

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default	FALSE

* name="L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.

seed=integer

specifies the seed value for pseudorandom number generation.

Default	0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases	stop
	stopCriterion
Default	PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.

* table=list(castable)

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

Last updated: March 05, 2026