Nonnegative Matrix Factorization Action Set

Provides actions for performing nonnegative matrix factorization.

nmf Action

Performs factorization of a nonnegative data matrix as the product of two low-rank nonnegative matrices.

CASL Syntax

nmf.nmf <result=results> <status=rc> /
attributes={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
display={
caseSensitive=TRUE | FALSE,
exclude=TRUE | FALSE,
excludeAll=TRUE | FALSE,
keyIsPath=TRUE | FALSE,
names={"string-1" <, "string-2", ...>},
pathType="LABEL" | "NAME",
traceNames=TRUE | FALSE
},
groupByLimit=64-bit-integer,
impute="NONE" | {outputX},
inputs={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
iterationDetail=TRUE | FALSE,
required parameter method={
delta=double,
maxIter=integer,
required parameter name="APG" | "RANDOM",
oversampling=integer,
subIter=integer,
tolerance=double,
updates=integer
},
noScale=TRUE | FALSE,
output={
required parameter casOut={
caslib="string"
compress=TRUE | FALSE
indexVars={"variable-name-1" <, "variable-name-2", ...>}
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=TRUE | FALSE
replace=TRUE | FALSE
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where={"string-1" <, "string-2", ...>}
},
component="string",
copyVars={"variable-name-1" <, "variable-name-2", ...>}
},
outputH={
required parameter casOut={
caslib="string"
compress=TRUE | FALSE
indexVars={"variable-name-1" <, "variable-name-2", ...>}
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=TRUE | FALSE
replace=TRUE | FALSE
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where={"string-1" <, "string-2", ...>}
}
},
outputTables={
groupByVarsRaw=TRUE | FALSE,
includeAll=TRUE | FALSE,
names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},
repeated=TRUE | FALSE,
replace=TRUE | FALSE
},
prefix="string",
required parameter rank=integer,
regularization={
alpha=double,
beta=double,
lcurve=TRUE | FALSE,
required parameter name="L1" | "L2"
},
seed=integer,
required parameter table={
caslib="string",
computedOnDemand=TRUE | FALSE,
computedVars={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
computedVarsProgram="string",
dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},
groupBy={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
groupByMode="NOSORT" | "REDISTRIBUTE",
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},
required parameter name="table-name",
orderBy={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
singlePass=TRUE | FALSE,
vars={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
where="where-expression",
whereTable={
casLib="string"
dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}
required parameter name="table-name"
vars={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}}
where="where-expression"
}
}
;
In the syntax, the words "required parameter" before a parameter name indicate that the parameter is required.

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

Parameter         Subparameter   Description
table (required)                 specifies the settings for an input table.

Parameters for Creating Output Tables

Parameter     Subparameter       Description
impute                           specifies the settings for low-rank matrix completion.
output        casOut (required)  specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.
outputH       casOut (required)  specifies the output table to be created to contain the factor matrix H.
outputTables  names              lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}

changes the attributes of variables used in this action. Currently, attributes that are specified in the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases attribute, attr

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value 1

impute="NONE" | {outputX}

specifies the settings for low-rank matrix completion.

NONE suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias copyVar
imputedRowsOnly=TRUE | FALSE

when set to True, keeps only the rows that contain the imputed values in the output table.

Default FALSE
* output={casouttable}

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias outputX
predOnly=TRUE | FALSE

when set to True, sets the observed values to missing values in the output table.

Default FALSE

inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases input, vars, var

iterationDetail=TRUE | FALSE

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias iterDetail
Default FALSE

* method={nmf_method}

specifies the settings for the matrix factorization method.

Long form method={name="APG" | "RANDOM"}
Shortcut form method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

delta=double

specifies the coefficient that is used to control the extrapolation weight.

Default 0.9999
Range (0, 1)
maxIter=integer

specifies the maximum number of iterations to perform.

Default 500
Range 1–MACINT
* name="APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

oversampling=integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias oversamp
Default 10
Minimum value 0
subIter=integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default 4
Minimum value 0
tolerance=double

specifies the tolerance at which the iteration stops.

Alias tol
Default 1E-07
Range 0–1
updates=integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default 10
Range 1–MACINT
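
The documented defaults and ranges for the method subparameters can be checked on the client before a request is submitted. The following is a minimal sketch in Python, not part of the action itself; the names METHOD_DEFAULTS and check_method are hypothetical, and the tolerance range is assumed to include both endpoints. The default of 10 for updates does not apply when the impute parameter is specified, which this sketch does not model.

```python
# Defaults as listed in the subparameter descriptions above.
METHOD_DEFAULTS = {
    "delta": 0.9999,
    "maxIter": 500,
    "oversampling": 10,
    "subIter": 4,
    "tolerance": 1e-7,
    "updates": 10,
}


def check_method(method):
    """Fill in documented defaults and verify the documented ranges."""
    merged = dict(METHOD_DEFAULTS)
    merged.update(method)
    if merged.get("name") not in ("APG", "RANDOM"):
        raise ValueError("name must be 'APG' or 'RANDOM'")
    if not 0 < merged["delta"] < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    if merged["maxIter"] < 1:
        raise ValueError("maxIter must be at least 1")
    if merged["oversampling"] < 0:
        raise ValueError("oversampling must be at least 0")
    if merged["subIter"] < 0:
        raise ValueError("subIter must be at least 0")
    if not 0 <= merged["tolerance"] <= 1:
        raise ValueError("tolerance must lie between 0 and 1")
    if merged["updates"] < 1:
        raise ValueError("updates must be at least 1")
    return merged


checked = check_method({"name": "RANDOM", "subIter": 2})
```

Here checked keeps the supplied subIter value and takes every other setting from the documented defaults.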

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default NONE

MEAN

imputes missing values to the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.
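
The difference between the two settings can be illustrated on a small in-memory table. The following Python sketch is only an illustration of the described behavior, not the action's implementation; the function name handle_missing is hypothetical, missing values are represented by None, and each column is assumed to have at least one observed value.

```python
def handle_missing(rows, how="NONE"):
    """Apply the NONE or MEAN treatment to a list of rows (None = missing)."""
    if how == "NONE":
        # Drop every observation that has at least one missing value.
        return [row for row in rows if None not in row]
    if how == "MEAN":
        ncols = len(rows[0])
        means = []
        for j in range(ncols):
            observed = [row[j] for row in rows if row[j] is not None]
            means.append(sum(observed) / len(observed))
        # Replace each missing value with the mean of its column.
        return [[means[j] if v is None else v for j, v in enumerate(row)]
                for row in rows]
    raise ValueError("how must be 'MEAN' or 'NONE'")


data = [[1.0, 2.0],
        [None, 4.0],
        [3.0, None]]

kept = handle_missing(data, "NONE")
filled = handle_missing(data, "MEAN")
```

With NONE, only the first row remains. With MEAN, all three rows remain and the two missing cells are replaced by the column means 2.0 and 3.0.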

noScale=TRUE | FALSE

when set to True, suppresses scaling of the numeric variables to be analyzed to between 0 and 1.

Default FALSE
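
The description above does not say how the scaling to between 0 and 1 is computed. One common choice is column-wise min-max scaling, which the following Python sketch assumes; the function name scale_columns is hypothetical, and mapping a constant column to 0 is a choice made only for this example.

```python
def scale_columns(rows):
    """Min-max scale each column of a list of rows to the interval [0, 1]."""
    ncols = len(rows[0])
    lows = [min(row[j] for row in rows) for j in range(ncols)]
    highs = [max(row[j] for row in rows) for j in range(ncols)]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no spread; map it to 0 (an assumption).
            0.0 if highs[j] == lows[j] else (v - lows[j]) / (highs[j] - lows[j])
            for j, v in enumerate(row)
        ])
    return scaled


data = [[0.0, 10.0],
        [5.0, 20.0],
        [10.0, 30.0]]
scaled = scale_columns(data)
```

Setting noScale to True corresponds to skipping a step of this kind and analyzing the variables on their original scale.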

output={nmf_outputW}

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias outputW

The nmf_outputW value can be one or more of the following:

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

component="string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias comp
copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias copyVar

outputH={nmf_outputH}

specifies the output table to be created to contain the factor matrix H.

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default "Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias r
Range 1–MACINT

regularization={nmf_reg}

specifies the settings for regularization.

Aliases reg, penalty
Long form regularization={name="L1" | "L2"}
Shortcut form regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

alpha=double

specifies the regularization weight of the factor matrix W.

Default 1
Minimum value 0
beta=double

specifies the regularization weight of the factor matrix H.

Default 1
Minimum value 0
lcurve=TRUE | FALSE

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default FALSE
* name="L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.
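
To show how alpha and beta weight the two factor matrices, the following Python sketch computes a penalty term for small example matrices. The exact objective that the action minimizes is not given here, so the form used below (the sum of absolute values for L1, the sum of squares for L2, with no additional constant factor) is an assumption, and the function names are hypothetical.

```python
def penalty(matrix, name):
    """Return the L1 or (squared) L2 penalty of a list-of-rows matrix."""
    if name == "L1":
        return sum(abs(v) for row in matrix for v in row)
    if name == "L2":
        return sum(v * v for row in matrix for v in row)
    raise ValueError("name must be 'L1' or 'L2'")


def regularization_term(w, h, name, alpha=1.0, beta=1.0):
    """Weight the penalty on W by alpha and the penalty on H by beta."""
    return alpha * penalty(w, name) + beta * penalty(h, name)


W = [[1.0], [2.0]]
H = [[3.0, 4.0]]

l1_term = regularization_term(W, H, "L1")
l2_term = regularization_term(W, H, "L2")
weighted = regularization_term(W, H, "L1", alpha=0.5, beta=2.0)
```

A larger alpha penalizes large entries in W more strongly, and a larger beta does the same for H.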

seed=integer

specifies the seed value for pseudorandom number generation.

Default 0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases stop, stopCriterion
Default PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.

* table={castable}

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

nmf Action

Performs factorization of a nonnegative data matrix as the product of two low-rank nonnegative matrices.

Lua Syntax

results, info = s:nmf_nmf{
attributes={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
display={
caseSensitive=true | false,
exclude=true | false,
excludeAll=true | false,
keyIsPath=true | false,
names={"string-1" <, "string-2", ...>},
pathType="LABEL" | "NAME",
traceNames=true | false
},
groupByLimit=64-bit-integer,
impute="NONE" | {outputX},
inputs={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
iterationDetail=true | false,
required parameter method={
delta=double,
maxIter=integer,
required parameter name="APG" | "RANDOM",
oversampling=integer,
subIter=integer,
tolerance=double,
updates=integer
},
noScale=true | false,
output={
required parameter casOut={
caslib="string"
compress=true | false
indexVars={"variable-name-1" <, "variable-name-2", ...>}
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=true | false
replace=true | false
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where={"string-1" <, "string-2", ...>}
},
component="string",
copyVars={"variable-name-1" <, "variable-name-2", ...>}
},
outputH={
required parameter casOut={
caslib="string"
compress=true | false
indexVars={"variable-name-1" <, "variable-name-2", ...>}
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=true | false
replace=true | false
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where={"string-1" <, "string-2", ...>}
}
},
outputTables={
groupByVarsRaw=true | false,
includeAll=true | false,
names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>},
repeated=true | false,
replace=true | false
},
prefix="string",
required parameter rank=integer,
regularization={
alpha=double,
beta=double,
lcurve=true | false,
required parameter name="L1" | "L2"
},
seed=integer,
required parameter table={
caslib="string",
computedOnDemand=true | false,
computedVars={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
computedVarsProgram="string",
dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},
groupBy={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
groupByMode="NOSORT" | "REDISTRIBUTE",
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},
required parameter name="table-name",
orderBy={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
singlePass=true | false,
vars={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}},
where="where-expression",
whereTable={
casLib="string"
dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}
importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}
required parameter name="table-name"
vars={{
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
}, {...}}
where="where-expression"
}
}
}
In the syntax, the words "required parameter" before a parameter name indicate that the parameter is required.

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

Parameter         Subparameter   Description
table (required)                 specifies the settings for an input table.

Parameters for Creating Output Tables

Parameter     Subparameter       Description
impute                           specifies the settings for low-rank matrix completion.
output        casOut (required)  specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.
outputH       casOut (required)  specifies the output table to be created to contain the factor matrix H.
outputTables  names              lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}

changes the attributes of variables used in this action. Currently, attributes that are specified in the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases attribute, attr

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value 1

impute="NONE" | {outputX}

specifies the settings for low-rank matrix completion.

NONE suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias copyVar
imputedRowsOnly=true | false

when set to True, keeps only the rows that contain the imputed values in the output table.

Default false
* output={casouttable}

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias outputX
predOnly=true | false

when set to True, sets the observed values to missing values in the output table.

Default false

inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases input, vars, var

iterationDetail=true | false

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias iterDetail
Default false

* method={nmf_method}

specifies the settings for the matrix factorization method.

Long form method={name="APG" | "RANDOM"}
Shortcut form method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

delta=double

specifies the coefficient that is used to control the extrapolation weight.

Default 0.9999
Range (0, 1)
maxIter=integer

specifies the maximum number of iterations to perform.

Default 500
Range 1–MACINT
* name="APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

oversampling=integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias oversamp
Default 10
Minimum value 0
subIter=integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default 4
Minimum value 0
tolerance=double

specifies the tolerance at which the iteration stops.

Alias tol
Default 1E-07
Range 0–1
updates=integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default 10
Range 1–MACINT

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default NONE

MEAN

imputes missing values to the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.

noScale=true | false

when set to True, suppresses scaling of the numeric variables to be analyzed to between 0 and 1.

Default false

output={nmf_outputW}

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias outputW

The nmf_outputW value can be one or more of the following:

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

component="string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias comp
copyVars={"variable-name-1" <, "variable-name-2", ...>}

copies one or more variables from the input table to the output table.

Alias copyVar

outputH={nmf_outputH}

specifies the output table to be created to contain the factor matrix H.

* casOut={casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default "Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias r
Range 1–MACINT

regularization={nmf_reg}

specifies the settings for regularization.

Aliases reg, penalty
Long form regularization={name="L1" | "L2"}
Shortcut form regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

alpha=double

specifies the regularization weight of the factor matrix W.

Default 1
Minimum value 0
beta=double

specifies the regularization weight of the factor matrix H.

Default 1
Minimum value 0
lcurve=true | false

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default false
* name="L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.

seed=integer

specifies the seed value for pseudorandom number generation.

Default 0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases stop, stopCriterion
Default PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.

* table={castable}

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

nmf Action

Performs factorization of a nonnegative data matrix as the product of two low-rank nonnegative matrices.
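
As a minimal illustration of what the factorization means (not of how the action computes it), the following pure-Python sketch multiplies two small nonnegative factor matrices and measures how closely their product reproduces a data matrix. The matrices X, W, and H and the helper names are made up for this example, and the Frobenius norm is assumed as the error measure.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def frobenius_error(x, w, h):
    """Return the Frobenius norm of X - W H."""
    wh = matmul(w, h)
    return math.sqrt(sum((x[i][j] - wh[i][j]) ** 2
                         for i in range(len(x))
                         for j in range(len(x[0]))))


# Hypothetical 3 x 3 data matrix of rank 1, so rank-1 factors reproduce it exactly.
X = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [3.0, 6.0, 9.0]]
W = [[1.0], [2.0], [3.0]]   # 3 x 1: one row per observation
H = [[1.0, 2.0, 3.0]]       # 1 x 3: one column per variable

error = frobenius_error(X, W, H)
```

In this example the product of W and H equals X, so the error is zero. For real data and a rank smaller than the number of variables, the product is generally only an approximation.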

Python Syntax

results=s.nmf.nmf(
attributes=[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
display={
"caseSensitive":True | False,
"exclude":True | False,
"excludeAll":True | False,
"keyIsPath":True | False,
"names":["string-1" <, "string-2", ...>],
"pathType":"LABEL" | "NAME",
"traceNames":True | False
},
groupByLimit=64-bit-integer,
impute="NONE" | {outputX},
inputs=[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
iterationDetail=True | False,
required parameter method={
"delta":double,
"maxIter":integer,
required parameter "name":"APG" | "RANDOM",
"oversampling":integer,
"subIter":integer,
"tolerance":double,
"updates":integer
},
noScale=True | False,
output={
required parameter "casOut":{
"caslib":"string"
"compress":True | False
"indexVars":["variable-name-1" <, "variable-name-2", ...>]
"label":"string"
"lifetime":64-bit-integer
"maxMemSize":64-bit-integer
"memoryFormat":"DVR" | "INHERIT" | "STANDARD"
"name":"table-name"
"promote":True | False
"replace":True | False
"replication":integer
"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE"
"threadBlockSize":64-bit-integer
"timeStamp":"string"
"where":["string-1" <, "string-2", ...>]
},
"component":"string",
"copyVars":["variable-name-1" <, "variable-name-2", ...>]
},
outputH={
required parameter "casOut":{
"caslib":"string"
"compress":True | False
"indexVars":["variable-name-1" <, "variable-name-2", ...>]
"label":"string"
"lifetime":64-bit-integer
"maxMemSize":64-bit-integer
"memoryFormat":"DVR" | "INHERIT" | "STANDARD"
"name":"table-name"
"promote":True | False
"replace":True | False
"replication":integer
"tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE"
"threadBlockSize":64-bit-integer
"timeStamp":"string"
"where":["string-1" <, "string-2", ...>]
}
},
outputTables={
"groupByVarsRaw":True | False,
"includeAll":True | False,
"names":["string-1" <, "string-2", ...>] | {"key-1":{casouttable-1} <, "key-2":{casouttable-2}, ...>},
"repeated":True | False,
"replace":True | False
},
prefix="string",
required parameter rank=integer,
regularization={
"alpha":double,
"beta":double,
"lcurve":True | False,
required parameter "name":"L1" | "L2"
},
seed=integer,
required parameter table={
"caslib":"string",
"computedOnDemand":True | False,
"computedVars":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
"computedVarsProgram":"string",
"dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>},
"groupBy":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
"groupByMode":"NOSORT" | "REDISTRIBUTE",
"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters},
required parameter "name":"table-name",
"orderBy":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
"singlePass":True | False,
"vars":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>],
"where":"where-expression",
"whereTable":{
"casLib":"string"
"dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}
"importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}
required parameter "name":"table-name"
"vars":[{
"format":"string",
"formattedLength":integer,
"label":"string",
required parameter "name":"variable-name",
"nfd":integer,
"nfl":integer
}<, {...}>]
"where":"where-expression"
}
}
)
In the syntax, the words "required parameter" before a parameter name indicate that the parameter is required.
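
Because table, rank, and method are the required parameters, a call can be assembled from a small dictionary. The following sketch builds such a dictionary without contacting a server; the function name build_nmf_params and the table names my_data, nmf_w, and nmf_h are hypothetical.

```python
def build_nmf_params(table_name, rank, method="APG", w_table=None, h_table=None):
    """Assemble keyword arguments for the nmf action as a plain dictionary."""
    if rank < 1:
        raise ValueError("rank must be at least 1")
    if method not in ("APG", "RANDOM"):
        raise ValueError("method must be 'APG' or 'RANDOM'")
    params = {
        "table": {"name": table_name},    # required
        "rank": rank,                     # required
        "method": {"name": method},       # required, long form
    }
    if w_table is not None:
        params["output"] = {"casOut": {"name": w_table, "replace": True}}
    if h_table is not None:
        params["outputH"] = {"casOut": {"name": h_table, "replace": True}}
    return params


params = build_nmf_params("my_data", rank=3, w_table="nmf_w", h_table="nmf_h")

# With an open CAS session s that has the action set loaded (an assumption
# about your environment), the call would take this form:
#     results = s.nmf.nmf(**params)
```

Keeping the parameters in a dictionary makes it easy to inspect or log a request before it is submitted.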

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

Parameter         Subparameter   Description
table (required)                 specifies the settings for an input table.

Parameters for Creating Output Tables

Parameter     Subparameter       Description
impute                           specifies the settings for low-rank matrix completion.
output        casOut (required)  specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.
outputH       casOut (required)  specifies the output table to be created to contain the factor matrix H.
outputTables  names              lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

changes the attributes of variables used in this action. Currently, attributes that are specified in the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases attribute, attr

display={displayTables}

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value 1

impute="NONE" | {outputX}

specifies the settings for low-rank matrix completion.

NONE suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

"copyVars":["variable-name-1" <, "variable-name-2", ...>]

copies one or more variables from the input table to the output table.

Alias copyVar
"imputedRowsOnly":True | False

when set to True, keeps only the rows that contain the imputed values in the output table.

Default False
* "output":{casouttable}

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias outputX
"predOnly":True | False

when set to True, sets the observed values to missing values in the output table.

Default False
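
The following is a minimal sketch, in plain Python, of how the imputedRowsOnly and predOnly options appear to shape the imputation output as described above. It does not call the action. The function name impute_output, the sample data, and the "completed" matrix of estimates are hypothetical, and None stands in for a missing value.

```python
# Illustrative only: mimics, on plain Python lists, how the impute output
# options are described. None marks a missing value.

def impute_output(observed, completed, imputed_rows_only=False, pred_only=False):
    """Combine observed data with a completed (imputed) matrix.

    observed  -- rows with None where a value is missing
    completed -- same shape, hypothetical low-rank estimates for every cell
    """
    rows = []
    for obs_row, est_row in zip(observed, completed):
        has_missing = any(v is None for v in obs_row)
        if imputed_rows_only and not has_missing:
            continue  # keep only rows that received an imputed value
        out = []
        for obs, est in zip(obs_row, est_row):
            if obs is None:
                out.append(est)    # missing cell: use the imputed value
            elif pred_only:
                out.append(None)   # observed cell: set to missing
            else:
                out.append(obs)    # observed cell: kept unchanged
        rows.append(out)
    return rows


observed = [[1.0, None], [3.0, 4.0]]
completed = [[1.1, 2.2], [2.9, 4.1]]  # hypothetical estimates

default_out = impute_output(observed, completed)
rows_only = impute_output(observed, completed, imputed_rows_only=True)
preds = impute_output(observed, completed, pred_only=True)
```

With the defaults, every row is kept and only the missing cell is filled. With imputedRowsOnly, the fully observed second row is dropped. With predOnly, only the imputed cells carry values.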

inputs=[{casinvardesc-1} <, {casinvardesc-2}, ...>]

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases input, vars, var

iterationDetail=True | False

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias iterDetail
Default False

* method={nmf_method}

specifies the settings for the matrix factorization method.

Long form method={"name":"APG" | "RANDOM"}
Shortcut form method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

"delta":double

specifies the coefficient that is used to control the extrapolation weight.

Default 0.9999
Range (0, 1)
"maxIter":integer

specifies the maximum number of iterations to perform.

Default 500
Range 1–MACINT
* "name":"APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

"oversampling":integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias oversamp
Default 10
Minimum value 0
"subIter":integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default 4
Minimum value 0
"tolerance":double

specifies the tolerance at which the iteration stops.

Alias tol
Default 1E-07
Range 0–1
"updates":integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default 10
Range 1–MACINT
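
As a rough illustration of the long and shortcut forms and of the documented ranges, the sketch below expands a shortcut string into the long form and checks each subparameter on the client side. The helper normalize_method is hypothetical and is not part of any SAS package. Using sys.maxsize as a stand-in for MACINT and treating the tolerance range as inclusive are assumptions.

```python
import sys

MACINT = sys.maxsize  # assumption: stand-in for the largest machine integer


def normalize_method(method):
    """Expand the shortcut form and check the documented ranges."""
    if isinstance(method, str):
        method = {"name": method}  # shortcut form becomes the long form
    spec = dict(method)
    if spec.get("name") not in ("APG", "RANDOM"):
        raise ValueError('name is required and must be "APG" or "RANDOM"')
    if "delta" in spec and not 0 < spec["delta"] < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    for key in ("maxIter", "updates"):
        if key in spec and not 1 <= spec[key] <= MACINT:
            raise ValueError(key + " must be between 1 and MACINT")
    for key in ("oversampling", "subIter"):
        if key in spec and spec[key] < 0:
            raise ValueError(key + " must be at least 0")
    # Assumption: both endpoints of the 0-1 tolerance range are allowed.
    if "tolerance" in spec and not 0 <= spec["tolerance"] <= 1:
        raise ValueError("tolerance must lie between 0 and 1")
    return spec


apg = normalize_method("APG")
rnd = normalize_method({"name": "RANDOM", "oversampling": 10, "subIter": 4})
```

A check like this only catches values outside the documented ranges before a request is sent; the server remains the authority on what it accepts.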

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default NONE

MEAN

imputes missing values to the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.
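
The two choices can be pictured with a small plain-Python sketch. It illustrates the described behavior only and is not the action's implementation; the function handle_missing and the sample rows are hypothetical, and None stands in for a missing value.

```python
def handle_missing(rows, missing="NONE"):
    """Apply the documented handling to a list of rows."""
    if missing == "NONE":
        # Drop every observation that has at least one missing value.
        return [row for row in rows if all(v is not None for v in row)]
    if missing == "MEAN":
        n_cols = len(rows[0])
        means = []
        for j in range(n_cols):
            present = [row[j] for row in rows if row[j] is not None]
            # A column with no observed values has no mean; leave it missing.
            means.append(sum(present) / len(present) if present else None)
        return [
            [means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows
        ]
    raise ValueError('missing must be "MEAN" or "NONE"')


rows = [[1.0, 2.0], [None, 4.0], [3.0, None]]
kept = handle_missing(rows, "NONE")
filled = handle_missing(rows, "MEAN")
```

Here NONE leaves a single complete observation, while MEAN keeps all three and fills each gap with the column mean of the observed values.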

noScale=True | False

when set to True, suppresses scaling of the numeric variables to be analyzed to between 0 and 1.

Default False
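
The description does not state which formula is used for scaling. As one plausible reading, the sketch below applies column-wise min-max scaling, which maps each variable to the interval from 0 to 1. The function min_max_scale is hypothetical, and the choice of min-max scaling is an assumption.

```python
def min_max_scale(rows):
    """Scale each column to [0, 1] by (value - min) / (max - min)."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column has zero span; map it to 0.0 to avoid dividing by 0.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


data = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
scaled = min_max_scale(data)
```

Scaling of this kind also guarantees nonnegative inputs, which the factorization requires. Setting noScale to True would pass the variables through unchanged.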

output={nmf_outputW}

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias outputW

The nmf_outputW value can be one or more of the following:

* "casOut":{casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

"component":"string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias comp
"copyVars":["variable-name-1" <, "variable-name-2", ...>]

copies one or more variables from the input table to the output table.

Alias copyVar

outputH={nmf_outputH}

specifies the output table to be created to contain the factor matrix H.

* "casOut":{casouttable}

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables={outputTables}

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default "Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias r
Range 1–MACINT
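
To make the role of the rank concrete, the sketch below multiplies a small W (one row per observation, one column per component) by a small H (one row per component, one column per input variable). The matrices are made-up values, and the orientation of H is an assumption based on W holding observationwise results.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    inner = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


rank = 2
W = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]            # 3 observations x rank
H = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 0.0]]    # rank x 4 variables

X_hat = matmul(W, H)  # 3 x 4 approximation of the data matrix

# Values stored by the factors versus the full matrix.
factor_cells = len(W) * rank + rank * len(H[0])
full_cells = len(X_hat) * len(X_hat[0])
```

A smaller rank gives a more compact representation at the cost of a coarser approximation.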

regularization={nmf_reg}

specifies the settings for regularization.

Aliases reg, penalty
Long form regularization={"name":"L1" | "L2"}
Shortcut form regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

"alpha":double

specifies the regularization weight of the factor matrix W.

Default 1
Minimum value 0
"beta":double

specifies the regularization weight of the factor matrix H.

Default 1
Minimum value 0
"lcurve":True | False

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default False
* "name":"L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.
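
The exact objective function is not given here. A commonly used form adds weighted penalties on W and H to the squared reconstruction error, and the sketch below evaluates that form for tiny matrices. The function regularized_objective, the factor of one half on the error term, and the unscaled penalty sums are all assumptions made for illustration.

```python
def regularized_objective(X, W, H, name="L2", alpha=1.0, beta=1.0):
    """0.5 * ||X - W H||_F^2 + alpha * pen(W) + beta * pen(H)."""
    inner = len(H)
    error = 0.0
    for i, x_row in enumerate(X):
        for j, x in enumerate(x_row):
            approx = sum(W[i][k] * H[k][j] for k in range(inner))
            error += (x - approx) ** 2

    def penalty(matrix):
        values = [v for row in matrix for v in row]
        if name == "L1":
            return sum(abs(v) for v in values)   # sum of absolute values
        if name == "L2":
            return sum(v * v for v in values)    # sum of squares
        raise ValueError('name must be "L1" or "L2"')

    return 0.5 * error + alpha * penalty(W) + beta * penalty(H)


X = [[1.0, 3.0], [2.0, 5.0]]
W = [[1.0], [2.0]]
H = [[1.0, 3.0]]

obj_l1 = regularized_objective(X, W, H, name="L1")
obj_l2 = regularized_objective(X, W, H, name="L2")
```

In this form, a larger alpha shrinks W more strongly and a larger beta shrinks H more strongly. The L1 penalty tends to push entries to exactly zero, while the L2 penalty shrinks them smoothly.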

seed=integer

specifies the seed value for pseudorandom number generation.

Default 0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases stop, stopCriterion
Default PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.
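
How each measure is compared with the tolerance is not spelled out here. The sketch below shows one conventional reading of both: a relative-change test on the objective value, and the norm of the projected gradient for a nonnegativity constraint. Both function names and the exact formulas are assumptions.

```python
import math


def objective_converged(prev_obj, curr_obj, tolerance=1e-7):
    """True when the relative change in the objective is within tolerance."""
    scale = max(abs(prev_obj), 1.0)
    return abs(prev_obj - curr_obj) / scale <= tolerance


def projected_gradient_norm(values, grads):
    """Norm of the gradient projected onto the constraint values >= 0.

    For an entry that is strictly positive, the gradient is used as is.
    For an entry at zero, only a negative gradient counts, because a
    positive one would push the entry below zero.
    """
    total = 0.0
    for v, g in zip(values, grads):
        pg = g if v > 0 else min(0.0, g)
        total += pg * pg
    return math.sqrt(total)


pg_norm = projected_gradient_norm([0.0, 2.0, 0.0], [1.5, -3.0, -4.0])
small_change = objective_converged(10.0, 10.0 - 1e-9)
large_change = objective_converged(10.0, 9.0)
```

A projected gradient norm near zero indicates that no feasible step reduces the objective to first order, which is why it is often preferred over watching the objective value alone.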

* table={castable}

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).
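
Putting the parameters together, a request might be assembled as a Python dictionary like the one below. This is a sketch under stated assumptions: the table names, column names, and seed are hypothetical, and no server is contacted. With the SWAT package and an open CAS connection (here imagined as conn, with the nmf action set loaded), such a dictionary could be passed as keyword arguments, for example conn.nmf.nmf(**params).

```python
# Hypothetical request parameters for the nmf action; names are made up.
params = {
    "table": {"name": "ratings"},                      # required
    "inputs": [{"name": "item1"}, {"name": "item2"}, {"name": "item3"}],
    "rank": 2,                                         # required
    "method": {"name": "APG", "maxIter": 500, "tolerance": 1e-7},  # required
    "regularization": {"name": "L2", "alpha": 1, "beta": 1},
    "stopMeasure": "PROJGRAD",
    "seed": 12345,
    "output": {
        "casOut": {"name": "nmf_w", "replace": True},  # factor matrix W
        "copyVars": ["user_id"],
    },
    "outputH": {"casOut": {"name": "nmf_h", "replace": True}},  # factor matrix H
}

# Check that the parameters marked as required are present.
required = ("table", "rank", "method")
missing_required = [key for key in required if key not in params]
```

Keeping the request in a dictionary makes it easy to reuse the same settings while varying one value, such as the rank.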

nmf Action

Performs factorization of a nonnegative data matrix as the product of two low-rank nonnegative matrices.

R Syntax

results <- cas.nmf.nmf(s,
attributes=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
display=list(
caseSensitive=TRUE | FALSE,
exclude=TRUE | FALSE,
excludeAll=TRUE | FALSE,
keyIsPath=TRUE | FALSE,
names=list("string-1" <, "string-2", ...>),
pathType="LABEL" | "NAME",
traceNames=TRUE | FALSE
),
groupByLimit=64-bit-integer,
impute="NONE" | list(outputX),
inputs=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
iterationDetail=TRUE | FALSE,
required parameter method=list(
delta=double,
maxIter=integer,
required parameter name="APG" | "RANDOM",
oversampling=integer,
subIter=integer,
tolerance=double,
updates=integer
),
noScale=TRUE | FALSE,
output=list(
required parameter casOut=list(
caslib="string"
compress=TRUE | FALSE
indexVars=list("variable-name-1" <, "variable-name-2", ...>)
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=TRUE | FALSE
replace=TRUE | FALSE
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where=list("string-1" <, "string-2", ...>)
),
component="string",
copyVars=list("variable-name-1" <, "variable-name-2", ...>)
),
outputH=list(
required parameter casOut=list(
caslib="string"
compress=TRUE | FALSE
indexVars=list("variable-name-1" <, "variable-name-2", ...>)
label="string"
lifetime=64-bit-integer
maxMemSize=64-bit-integer
memoryFormat="DVR" | "INHERIT" | "STANDARD"
name="table-name"
promote=TRUE | FALSE
replace=TRUE | FALSE
replication=integer
tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"
threadBlockSize=64-bit-integer
timeStamp="string"
where=list("string-1" <, "string-2", ...>)
)
),
outputTables=list(
groupByVarsRaw=TRUE | FALSE,
includeAll=TRUE | FALSE,
names=list("string-1" <, "string-2", ...>) | list(key-1=list(casouttable-1) <, key-2=list(casouttable-2), ...>),
repeated=TRUE | FALSE,
replace=TRUE | FALSE
),
prefix="string",
required parameter rank=integer,
regularization=list(
alpha=double,
beta=double,
lcurve=TRUE | FALSE,
required parameter name="L1" | "L2"
),
seed=integer,
required parameter table=list(
caslib="string",
computedOnDemand=TRUE | FALSE,
computedVars=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
computedVarsProgram="string",
dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>),
groupBy=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
groupByMode="NOSORT" | "REDISTRIBUTE",
importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters),
required parameter name="table-name",
orderBy=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
singlePass=TRUE | FALSE,
vars=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>),
where="where-expression",
whereTable=list(
casLib="string"
dataSourceOptions=list(adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters)
importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters)
required parameter name="table-name"
vars=list( list(
format="string",
formattedLength=integer,
label="string",
required parameter name="variable-name",
nfd=integer,
nfl=integer
) <, list(...)>)
where="where-expression"
)
)
)
In this syntax, the text "required parameter" before a name indicates a required parameter.

Summary: Input and Output Tables

If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.

Parameters for Reading Input Tables

Parameter

Subparameter

Description

required parameter table

specifies the settings for an input table.

Parameters for Creating Output Tables

Parameter

Subparameter

Description

 impute

specifies the settings for low-rank matrix completion.

 output

required parameter casOut

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

 outputH

required parameter casOut

specifies the output table to be created to contain the factor matrix H.

 outputTables

names

lists the names of results tables to save as CAS tables on the server.

Parameter Descriptions

attributes=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored.

For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases attribute, attr

display=list(displayTables)

specifies a list of results tables to send to the client for display.

For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters).

groupByLimit=64-bit-integer

suppresses analysis if the number of BY groups exceeds the specified value.

Minimum value 1

impute="NONE" | list(outputX)

specifies the settings for low-rank matrix completion.

NONE suppresses creation of the output table that contains imputation results.

The outputX value can be one or more of the following:

copyVars=list("variable-name-1" <, "variable-name-2", ...>)

copies one or more variables from the input table to the output table.

Alias copyVar
imputedRowsOnly=TRUE | FALSE

when set to True, keeps only the rows that contain the imputed values in the output table.

Default FALSE
* output=list(casouttable)

specifies the output table to be created to contain the input data matrix, with missing values replaced by the imputed values.

For more information about specifying the output parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

Alias outputX
predOnly=TRUE | FALSE

when set to True, sets the observed values to missing values in the output table.

Default FALSE

inputs=list( list(casinvardesc-1) <, list(casinvardesc-2), ...>)

specifies the numeric variables to be analyzed. If you omit this parameter, all numeric variables that are not specified in other parameters are analyzed.

For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters).

Aliases input, vars, var

iterationDetail=TRUE | FALSE

when set to True, generates the "Iteration Details" table, which displays the matrix factorization accuracy for each iteration.

Alias iterDetail
Default FALSE

* method=list(nmf_method)

specifies the settings for the matrix factorization method.

Long form method=list(name="APG" | "RANDOM")
Shortcut form method="APG" | "RANDOM"

The nmf_method value can be one or more of the following:

delta=double

specifies the coefficient that is used to control the extrapolation weight.

Default 0.9999
Range (0, 1)
maxIter=integer

specifies the maximum number of iterations to perform.

Default 500
Range 1–MACINT
* name="APG" | "RANDOM"

specifies the name of the matrix factorization method to use.

APG

uses alternating proximal gradient.

RANDOM

uses random projections.

oversampling=integer

specifies the size of oversampling. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Alias oversamp
Default 10
Minimum value 0
subIter=integer

specifies the number of subspace iterations. The parameter is used only when random projections is chosen as the matrix factorization method (that is, method='RANDOM').

Default 4
Minimum value 0
tolerance=double

specifies the tolerance at which the iteration stops.

Alias tol
Default 1E-07
Range 0–1
updates=integer

specifies the number of updates to the W and H matrices at each iteration. If you specify the "impute" parameter, the default value is 1.

Default 10
Range 1–MACINT

missing="MEAN" | "NONE"

specifies how to handle observations that have missing values.

Default NONE

MEAN

imputes missing values to the mean of the corresponding variable.

NONE

excludes observations that have missing values from the analysis.

noScale=TRUE | FALSE

when set to True, suppresses scaling of the numeric variables to be analyzed to between 0 and 1.

Default FALSE

output=list(nmf_outputW)

specifies the output table to be created to contain observationwise statistics. If you do not specify any statistics, then only the factor matrix W is included.

Alias outputW

The nmf_outputW value can be one or more of the following:

* casOut=list(casouttable)

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

component="string"

specifies the source values for each column in the factor matrix W. If the value is an empty string, the string that is specified in the prefix parameter is used to name the output variables.

Alias comp
copyVars=list("variable-name-1" <, "variable-name-2", ...>)

copies one or more variables from the input table to the output table.

Alias copyVar

outputH=list(nmf_outputH)

specifies the output table to be created to contain the factor matrix H.

* casOut=list(casouttable)

specifies the output table.

For more information about specifying the casOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).

outputTables=list(outputTables)

lists the names of results tables to save as CAS tables on the server.

For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters).

Alias displayOut

prefix="string"

specifies a prefix for naming the columns in the factor matrix W.

Default "Comp"

* rank=integer

specifies the target rank of the low-dimensional factor matrices to be computed.

Alias r
Range 1–MACINT

regularization=list(nmf_reg)

specifies the settings for regularization.

Aliases reg, penalty
Long form regularization=list(name="L1" | "L2")
Shortcut form regularization="L1" | "L2"

The nmf_reg value can be one or more of the following:

alpha=double

specifies the regularization weight of the factor matrix W.

Default 1
Minimum value 0
beta=double

specifies the regularization weight of the factor matrix H.

Default 1
Minimum value 0
lcurve=TRUE | FALSE

when set to True, uses the L-curve approach to perform L2-norm regularization.

Default FALSE
* name="L1" | "L2"

specifies the name of the regularization method to use.

L1

uses L1-norm regularization.

L2

uses L2-norm regularization.

seed=integer

specifies the seed value for pseudorandom number generation.

Default 0

stopMeasure="OBJFUNC" | "PROJGRAD"

specifies the stopping criterion.

Aliases stop, stopCriterion
Default PROJGRAD

OBJFUNC

uses the objective function.

PROJGRAD

uses the projected gradient.

* table=list(castable)

specifies the settings for an input table.

For more information about specifying the table parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).

Last updated: March 05, 2026