Provides actions for mining textual data.
Computes the SVD factorization and generates topics. Some parameters require a SAS Visual Text Analytics license or a SAS Visual Data Mining and Machine Learning license.
If a row includes a subparameter, you can specify the name, caslib, and so on in the subparameter. Otherwise, you can specify the name, caslib, and so on in the parameter.
| Parameter | Subparameter | Description |
|---|---|---|
| config | — | specifies the name of the input CAS table that contains parsing configuration information. |
| parent (required) | — | specifies the input CAS table that contains the term-by-document matrix in transaction form. The table must have at least three variables: one containing the document ID, a second containing the term ID, and a third containing the value in the cell corresponding to that particular term and document. |
| terms | — | specifies the name of the input table that contains information about the terms in the document collection. The table is used to determine which terms to use in the topic calculation. |

| Parameter | Subparameter | Description |
|---|---|---|
| docPro | — | specifies the name of the table to contain the SVD projections of the documents. |
| s | — | specifies the S matrix, which is a diagonal matrix that is output in compressed form, with two variables and k rows. The variable _ID_ indicates the row and column of the entry and the variable S contains the singular values. |
| scoreConfig | — | specifies the output scoring config file. |
| termTopics | — | specifies the name of the output CAS table to contain the term-by-topic sparse matrix information. |
| topics | — | specifies the output CAS table to contain the topics that are discovered. |
| u | — | specifies the U matrix, which contains the left singular vectors. The matrix U has one row per term and k+1 columns. |
| v | — | specifies the transpose of the matrix containing the right singular vectors. The matrix V has one row per document and k+1 columns. |
| wordPro | — | specifies the table to contain the projections of the terms. If k dimensions of the SVD are found and the input data set contains n terms, this table will have n rows and k+1 columns. |

specifies the name of the input CAS table that contains parsing configuration information.
For more information about specifying the config parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).
| Alias | parseConfig |
|---|---|
specifies the variable that contains the (possibly weighted) term count. The values in this variable must be numeric. There can be no missing values in this variable.
| Default | "_COUNT_" |
|---|---|
specifies the variable that contains the document ID. The type of this variable can be either numeric or a string. There can be no missing values in this variable.
| Default | "_DOCUMENT_" |
|---|---|
specifies the name of the table to contain the SVD projections of the documents.
For more information about specifying the docPro parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
specifies how many standard deviations above the mean to set the document cutoff. This parameter requires a SAS Visual Text Analytics license.
| Default | 1 |
|---|---|
| Range | 0–10 |
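To illustrate a cutoff expressed as a number of standard deviations above the mean, the following Python sketch computes such a threshold for a list of weights. The function name `std_cutoff`, the sample weights, and the use of the population standard deviation are assumptions made for illustration. The action's internal calculation is not described here and might differ.

```python
from statistics import mean, pstdev


def std_cutoff(weights, num_std=1.0):
    """Return mean(weights) + num_std * population standard deviation.

    num_std mirrors the documented range of 0 to 10 for the cutoff parameter.
    """
    if not 0 <= num_std <= 10:
        raise ValueError("num_std must be between 0 and 10")
    return mean(weights) + num_std * pstdev(weights)


# Hypothetical document weights for a single topic.
weights = [0.1, 0.2, 0.3, 0.9]
threshold = std_cutoff(weights, num_std=1.0)

# Under this assumption, only weights at or above the threshold count as members.
members = [w for w in weights if w >= threshold]
```

A larger `num_std` raises the threshold, so fewer documents would be assigned to each topic.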
specifies whether the exact document projection values should be output. This parameter requires a SAS Visual Text Analytics license.
| Default | TRUE |
|---|
| Alias | exactWeights |
|---|---|
| Default | FALSE |
specifies the number of dimensions to be extracted (also the number of derived topics). If the input data is too small for the requested number of dimensions, this value is adjusted to complete the calculation.
| Alias | numTopics |
|---|---|
| Range | 1–1000 |
specifies whether to use the legacy variable names on tables. This parameter requires a SAS Visual Text Analytics license or a SAS Visual Data Mining and Machine Learning license.
| Default | FALSE |
|---|---|
specifies the maximum number of dimensions to be extracted. The maxK option can be used in conjunction with the resolution option to dynamically select the recommended number of dimensions. If you want a specific number of dimensions, either use maxK and set the resolution to high, or use the k parameter.
| Default | 10 |
|---|---|
| Range | 1–1000 |
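The two ways of fixing the number of dimensions described above can be sketched as plain parameter dictionaries. The helper `svd_dimension_params` is hypothetical and only assembles and range-checks the values. How the dictionary is passed to the action depends on the client that you use, and that step is not shown.

```python
def svd_dimension_params(k=None, max_k=None, resolution=None):
    """Build a dict holding the k, maxK, and resolution settings.

    The range check (1 to 1000) follows the ranges documented for k and maxK.
    """
    params = {}
    for name, value in (("k", k), ("maxK", max_k)):
        if value is None:
            continue
        if not 1 <= value <= 1000:
            raise ValueError(f"{name} must be between 1 and 1000")
        params[name] = value
    if resolution is not None:
        params["resolution"] = resolution
    return params


# Option 1: request a specific number of dimensions directly.
fixed = svd_dimension_params(k=25)

# Option 2: cap the number of dimensions and set the resolution to high.
capped = svd_dimension_params(max_k=25, resolution="HIGH")
```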
specifies whether to normalize the document projections, term projections, or both. The normalization converts the representation from depending on angles between vectors to depending on Euclidean distances between vectors.
| Default | ALL |
|---|---|
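The link between angles and Euclidean distances can be checked numerically. The sketch below assumes that normalization means scaling each projection vector to unit length, which is not stated explicitly here. Under that assumption, the squared Euclidean distance between two normalized vectors equals 2 minus twice the cosine of the angle between the original vectors. The vectors `a` and `b` are made-up values.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]

# After normalization, distance depends only on the angle between a and b.
dist_sq = euclidean(normalize(a), normalize(b)) ** 2
from_angle = 2 - 2 * cosine(a, b)
```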
specifies the number of threads to be used per node. If not set, or if a value of 0 is specified, all available threads will be used.
| Default | 8 |
|---|---|
| Range | 0–64 |
specifies the number of terms to use in the descriptive label for each topic.
| Default | 5 |
|---|---|
| Range | 1–500 |
specifies the input CAS table that contains the term-by-document matrix in transaction form. The table must have at least three variables: one containing the document ID, a second containing the term ID, and a third containing the value in the cell corresponding to that particular term and document.
For more information about specifying the parent parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).
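As a rough illustration of transaction form, the following Python sketch turns a small document collection into one row per (document, term) pair. The column names reuse the defaults listed on this page ("_DOCUMENT_", "_TERMNUM_", "_COUNT_"). The whitespace tokenizer, the helper name `to_transaction_form`, and the sample documents are assumptions for illustration only and do not reflect how the parsing actions build this table.

```python
from collections import Counter


def to_transaction_form(docs):
    """Return (rows, term_ids) for a dict that maps document ID to text.

    Each row holds a document ID, an integer term ID (1 or greater),
    and the count of that term in that document.
    """
    term_ids = {}
    rows = []
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        for term, count in counts.items():
            # Assign the next unused integer ID the first time a term is seen.
            term_num = term_ids.setdefault(term, len(term_ids) + 1)
            rows.append(
                {"_DOCUMENT_": doc_id, "_TERMNUM_": term_num, "_COUNT_": count}
            )
    return rows, term_ids


docs = {1: "cats chase mice", 2: "mice fear cats cats"}
rows, term_ids = to_transaction_form(docs)
```

Only nonzero cells of the term-by-document matrix appear as rows, which is why this layout suits sparse data.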
specifies the desired resolution level for the recommended number of dimensions to be extracted by the SVD.
| Default | HIGH |
|---|---|
specifies the type of rotation used to maximize the explanatory power of each topic. A VARIMAX rotation produces uncorrelated topics and a PROMAX rotation produces correlated topics.
| Default | VARIMAX |
|---|---|
specifies the row-pivot weight for document normalization of the parent table before the SVD. A negative value turns off the row-pivot process. When topics are requested, a value of 1 is used for this parameter by default. This parameter requires a SAS Visual Text Analytics license.
| Default | -1 |
|---|---|
| Range | -1–1 |
specifies the S matrix, which is a diagonal matrix that is output in compressed form, with two variables and k rows. The variable _ID_ indicates the row and column of the entry and the variable S contains the singular values.
For more information about specifying the s parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
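The compressed form of the S matrix can be expanded back into a dense diagonal matrix, as in the sketch below. It assumes that the _ID_ values run from 1 to k, which is an assumption about the output rather than something stated here. The helper name `expand_diagonal` and the sample rows are made up.

```python
def expand_diagonal(rows):
    """Expand rows with _ID_ and S values into a dense k-by-k diagonal matrix."""
    k = len(rows)
    dense = [[0.0] * k for _ in range(k)]
    for row in rows:
        # _ID_ gives both the row and the column of the entry (assumed 1-based).
        i = int(row["_ID_"]) - 1
        dense[i][i] = row["S"]
    return dense


# Hypothetical compressed output with k = 3 singular values.
compressed = [
    {"_ID_": 1, "S": 5.0},
    {"_ID_": 2, "S": 2.5},
    {"_ID_": 3, "S": 0.5},
]
dense_s = expand_diagonal(compressed)
```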
specifies the output scoring config file.
For more information about specifying the scoreConfig parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
specifies the variable that contains the term ID. The contents of this variable must be an integer greater than or equal to 1. There can be no missing values in this variable.
| Default | "_TERMNUM_" |
|---|---|
specifies the name of the input table that contains information about the terms in the document collection. The table is used to determine which terms to use in the topic calculation.
For more information about specifying the terms parameter, see the common castable (Form 1) parameter (Appendix A: Common Parameters).
specifies how many standard deviations above the mean to set the term cutoff. This parameter requires a SAS Visual Text Analytics license.
| Default | 1 |
|---|---|
| Range | 0–10 |
specifies the name of the output CAS table to contain the term-by-topic sparse matrix information.
For more information about specifying the termTopics parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
specifies the stopping threshold for the iterative factorization algorithm. If 0 is specified, the default value is used.
| Default | 1E-06 |
|---|---|
| Range | 0–1 |
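The role of a stopping threshold in an iterative factorization can be shown with a small stand-in. The sketch below uses plain power iteration to estimate the largest singular value of a matrix and stops when successive estimates differ by no more than `tol`. Power iteration is chosen here only for illustration. The algorithm that the action actually uses is not named on this page, and the function name and sample matrix are hypothetical.

```python
import math

DEFAULT_TOL = 1e-6


def top_singular_value(matrix, tol=DEFAULT_TOL, max_iter=1000):
    """Estimate the largest singular value of a matrix by power iteration."""
    if tol == 0:
        # Mirror the documented behavior: 0 means "use the default value".
        tol = DEFAULT_TOL
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # unit-length starting vector
    sigma = 0.0
    for _ in range(max_iter):
        # u = A v, then w = A^T u, so w = (A^T A) v.
        u = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
        # For unit v, the length of (A^T A) v approaches sigma squared.
        new_sigma = math.sqrt(norm_w)
        if abs(new_sigma - sigma) <= tol:
            return new_sigma
        sigma = new_sigma
    return sigma


estimate = top_singular_value([[3.0, 0.0], [0.0, 1.0]])
```

A smaller threshold generally costs more iterations in exchange for a more precise result.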
specifies whether to include topic membership decisions and document cutoffs in the output tables. This parameter requires a SAS Visual Text Analytics license or a SAS Visual Data Mining and Machine Learning license.
| Default | FALSE |
|---|---|
specifies the output CAS table to contain the topics that are discovered.
For more information about specifying the topics parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
specifies the U matrix, which contains the left singular vectors. The matrix U has one row per term and k+1 columns.
For more information about specifying the u parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
specifies the transpose of the matrix containing the right singular vectors. The matrix V has one row per document and k+1 columns.
For more information about specifying the v parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).
specifies the table to contain the projections of the terms. If k dimensions of the SVD are found and the input data set contains n terms, this table will have n rows and k+1 columns.
For more information about specifying the wordPro parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters).