HPSEVERITY Procedure

OUTPUT Statement

OUTPUT <OUT=SAS-data-set> output-options;

The OUTPUT statement specifies the data set to which PROC HPSEVERITY writes the estimates of scoring functions and quantiles. To specify the name of the output data set, use the following option:

OUT=SAS-data-set

specifies the name of the output data set. If you do not specify this option, then PROC HPSEVERITY names the output data set by using the DATAn convention.

To control the contents of the OUT= data set, specify the following output-options:

COPYVARS=variable-list

specifies the names of the variables that you want to copy from the input DATA= data set to the OUT= data set. If you want to specify more than one name, then separate them by spaces.

If you specify the BY statement, then the BY variables are not automatically copied to the OUT= data set, so you must specify the BY variables in the COPYVARS= option.

FUNCTIONS=(function<(arg)><=variable> <function<(arg)><=variable>> …)

specifies the scoring functions that you want to estimate. For each scoring function that you want to estimate, specify the suffix of the scoring function as the function. For each function that you specify in this option and for each distribution D that you specify in the DIST statement, the FCMP function D_function must be defined in the search path that you specify by using the CMPLIB= system option.

If you want to evaluate the scoring function at a specific value of the response variable, then specify a number arg, which is enclosed in parentheses immediately after the function. If you do not specify arg or if you specify a missing value as arg, then for each observation in the DATA= data set, PROC HPSEVERITY computes the value v by using the following table and evaluates the scoring function at v, where y, r, and l denote the values of the response variable, right-censoring limit, and left-censoring limit, respectively:

Right-Censored   Left-Censored   v
No               No              y
No               Yes             l
Yes              No              r
Yes              Yes             (l + r)/2
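
The selection rule in the preceding table can be sketched in a few lines of Python. This is only an illustration of the table, not SAS code and not a description of how PROC HPSEVERITY is implemented. The function name evaluation_point and the two Boolean flags are hypothetical; how an observation is actually marked as censored is determined by the procedure, not by this sketch.

```python
def evaluation_point(y, r=None, l=None, right_censored=False, left_censored=False):
    """Return the value v at which a scoring function is evaluated.

    y is the response value, r the right-censoring limit, and l the
    left-censoring limit. The Boolean flags are a hypothetical encoding
    of the censoring status of one observation.
    """
    if right_censored and left_censored:
        return (l + r) / 2   # both limits apply: use their midpoint
    if right_censored:
        return r             # only right-censored: use the right limit
    if left_censored:
        return l             # only left-censored: use the left limit
    return y                 # uncensored: use the observed response


# Walk through the four rows of the table with made-up values.
for rc, lc in [(False, False), (False, True), (True, False), (True, True)]:
    v = evaluation_point(y=500.0, r=1000.0, l=200.0,
                         right_censored=rc, left_censored=lc)
    print(f"right-censored={rc!s:<5} left-censored={lc!s:<5} v={v}")
```
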

You can specify the suffix of the variable that contains the estimate of the scoring function by specifying a valid SAS name as a variable. If you do not specify a variable, then PROC HPSEVERITY uses function as the suffix of the variable name.

To illustrate the FUNCTIONS= option with an example, assume that you specify the following DIST and OUTPUT statements:

dist exp logn;
output out=score functions=(cdf pdf(1000)=f1000 mean);

Assume that both the exponential (EXP) and lognormal (LOGN) distributions converge. If $\hat{\theta}$ is the final estimate of the scale parameter of the exponential distribution, then PROC HPSEVERITY creates the following three scoring function variables for the exponential (EXP) distribution in the Work.Score data set:

EXP_CDF

contains the CDF estimate $F_{\text{exp}}(v, \hat{\theta})$, where $F_{\text{exp}}$ denotes the CDF of the exponential distribution and $v$ is the value that is determined by the preceding table.

EXP_F1000

contains the PDF estimate $f_{\text{exp}}(1000, \hat{\theta})$, where $f_{\text{exp}}$ denotes the PDF of the exponential distribution.

EXP_MEAN

contains the mean of the exponential distribution for the scale parameter $\hat{\theta}$.

Similarly, if $\hat{\mu}$ and $\hat{\sigma}$ are the final estimates of the log-scale and shape parameters of the lognormal distribution, respectively, then PROC HPSEVERITY creates the following three scoring function variables for the lognormal (LOGN) distribution in the Work.Score data set:

LOGN_CDF

contains the CDF estimate $F_{\text{logn}}(v, \hat{\mu}, \hat{\sigma})$, where $F_{\text{logn}}$ denotes the CDF of the lognormal distribution and $v$ is the value that is determined by the preceding table.

LOGN_F1000

contains the probability density function (PDF) estimate $f_{\text{logn}}(1000, \hat{\mu}, \hat{\sigma})$, where $f_{\text{logn}}$ denotes the PDF of the lognormal distribution.

LOGN_MEAN

contains the mean of the lognormal distribution for the parameters $\hat{\mu}$ and $\hat{\sigma}$.
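
To make the six variables concrete, the following Python sketch computes the same quantities for one observation. It is an illustration only, not SAS code. It assumes the standard parameterizations (exponential with scale parameter theta; lognormal with log-scale mu and shape sigma), which are assumed here to match the EXP and LOGN distributions that the procedure uses. The helper score_row and the parameter values are hypothetical and are not output from any real fit.

```python
import math


def exp_cdf(x, theta):
    return 1.0 - math.exp(-x / theta)


def exp_pdf(x, theta):
    return math.exp(-x / theta) / theta


def exp_mean(theta):
    return theta


def logn_cdf(x, mu, sigma):
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logn_pdf(x, mu, sigma):
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def logn_mean(mu, sigma):
    return math.exp(mu + 0.5 * sigma ** 2)


def score_row(v, theta, mu, sigma):
    """Build one row of scoring variables for functions=(cdf pdf(1000)=f1000 mean)."""
    return {
        "EXP_CDF": exp_cdf(v, theta),             # evaluated at v from the table
        "EXP_F1000": exp_pdf(1000.0, theta),      # evaluated at the fixed arg 1000
        "EXP_MEAN": exp_mean(theta),              # does not depend on the response
        "LOGN_CDF": logn_cdf(v, mu, sigma),
        "LOGN_F1000": logn_pdf(1000.0, mu, sigma),
        "LOGN_MEAN": logn_mean(mu, sigma),
    }


# Hypothetical estimates and evaluation point, chosen only for illustration.
row = score_row(v=750.0, theta=800.0, mu=6.0, sigma=1.2)
for name, value in row.items():
    print(f"{name:<11}{value:.6g}")
```

In this sketch only the two CDF entries depend on v, which mirrors the fact that the F1000 and MEAN variables are evaluated at a fixed argument or at no argument at all.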

If you specify the SCALEMODEL statement, then the value of the scale parameter of a distribution depends on the values of the regression parameters, so it might differ from one observation to another. In this example, the values of $\hat{\theta}$ and $\hat{\mu}$ might vary by observation, which might cause the values of the EXP_F1000, EXP_MEAN, LOGN_F1000, and LOGN_MEAN variables to vary by observation. The values of the EXP_CDF and LOGN_CDF variables might vary not only because of the varying values of $v$ but also because of the varying values of $\hat{\theta}$ and $\hat{\mu}$.

If you do not specify the SCALEMODEL statement, then the values of scoring functions for which you specify a nonmissing argument arg and scoring functions that do not depend on the response variable value do not vary by observation. In this example, the values of the EXP_F1000, EXP_MEAN, LOGN_F1000, and LOGN_MEAN variables do not vary by observation.

If a distribution does not converge, then the scoring function variables for that distribution contain missing values in all observations.

For more information about scoring functions, see the section Scoring Functions.

QUANTILES=quantile-options

specifies the quantiles that you want to estimate. To use this option, for each distribution that you specify in the DIST statement, the FCMP function D_QUANTILE must be defined in the search path that you specify by using the CMPLIB= system option.

You can specify the following quantile-options:

CDF=CDF-values
POINTS=CDF-values

specifies the CDF values at which you want to estimate the quantiles. CDF-values can be one or more numbers, separated by spaces. Each number must be in the interval (0,1).

NAMES=variable-names

specifies the suffixes of the names of the variables for each of the quantile estimates. If you specify $n$ ($n \ge 0$) names in the variable-names option and $k$ values in the CDF= option, and if $n < k$, then PROC HPSEVERITY uses the $n$ names to name the variables that correspond to the first $n$ CDF values. For each of the remaining $k - n$ CDF values, $p_i$ ($n < i \le k$), PROC HPSEVERITY creates a variable name Pt, where t is the text representation of $100 p_i$ that is formed by retaining at most NDECIMAL= digits after the decimal point and replacing the decimal point with an underscore ('_').

NDECIMAL=number

specifies the number of digits to keep after the decimal point when PROC HPSEVERITY creates the name of the quantile estimate variable. If you do not specify this option, then the default value is 3.
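
The naming rule that the NAMES= and NDECIMAL= options describe can be sketched in Python as follows. This is an illustration, not SAS code. The function quantile_variable_names is hypothetical, and several details are assumptions: the distribution name is prepended as D_suffix and the result is shown in uppercase, the text of 100p is rounded (rather than truncated) to NDECIMAL digits with trailing zeros dropped, and supplying more names than CDF values is treated as an error because that case is not described here.

```python
def quantile_variable_names(dist, cdf_values, names=(), ndecimal=3):
    """Return one variable name per CDF value, following the documented rule."""
    if len(names) > len(cdf_values):
        # Assumption: this case is outside the rule being sketched.
        raise ValueError("more names than CDF values")
    result = []
    for i, p in enumerate(cdf_values):
        if i < len(names):
            suffix = names[i]                      # first n values use the given names
        else:
            text = f"{100 * p:.{ndecimal}f}"       # keep at most ndecimal digits
            if "." in text:
                text = text.rstrip("0").rstrip(".")  # drop trailing zeros and a bare point
            suffix = "P" + text.replace(".", "_")  # decimal point becomes an underscore
        result.append(f"{dist}_{suffix}".upper())
    return result


# Hypothetical request: one explicit name, two generated names.
print(quantile_variable_names("logn", [0.5, 0.95, 0.9995], ["median"]))
# prints ['LOGN_MEDIAN', 'LOGN_P95', 'LOGN_P99_95']
```
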

For example, assume that you specify the following DIST and OUTPUT statements:

dist burr;
output out=score quantiles=(cdf=0.9 0.975 0.995 names=ninety var);

PROC HPSEVERITY creates three quantile estimate variables, BURR_NINETY, BURR_VAR, and BURR_P99_5, in the Work.Score data set for the Burr distribution. These variables contain the estimates of $Q_{\text{Burr}}(p, \hat{\theta}, \hat{\alpha}, \hat{\gamma})$ for $p = 0.9$, $0.975$, and $0.995$, respectively, where $Q_{\text{Burr}}$ denotes the quantile function and $\hat{\theta}$, $\hat{\alpha}$, and $\hat{\gamma}$ denote the parameter estimates of the Burr distribution.
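
As a rough numerical illustration of these three variables, the following Python sketch evaluates a Burr quantile function at the three CDF values. It is not SAS code. It assumes the Burr CDF has the form F(x) = 1 - (1 + (x/theta)^gamma)^(-alpha) with scale theta and shape parameters alpha and gamma; if the procedure's BURR distribution is parameterized differently, the formula changes accordingly. The parameter values are hypothetical and are not estimates from any real data.

```python
def burr_cdf(x, theta, alpha, gamma):
    # Assumed CDF form: 1 - (1 + (x/theta)^gamma)^(-alpha)
    return 1.0 - (1.0 + (x / theta) ** gamma) ** (-alpha)


def burr_quantile(p, theta, alpha, gamma):
    """Invert the assumed Burr CDF for a probability p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in the open interval (0, 1)")
    return theta * ((1.0 - p) ** (-1.0 / alpha) - 1.0) ** (1.0 / gamma)


# Hypothetical parameter estimates, for illustration only.
theta_hat, alpha_hat, gamma_hat = 1000.0, 2.0, 1.5

row = {
    name: burr_quantile(p, theta_hat, alpha_hat, gamma_hat)
    for name, p in [("BURR_NINETY", 0.9), ("BURR_VAR", 0.975), ("BURR_P99_5", 0.995)]
}
for name, value in row.items():
    print(f"{name:<12}{value:.2f}")
```

Because the three values depend only on the parameter estimates and on p, they are the same for every observation unless the estimates themselves vary by observation.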

If you specify the SCALEMODEL statement, then the quantile estimate might vary by observation, because the scale parameter of a distribution depends on the values of the regression parameters.

If you do not specify the SCALEMODEL statement, then the quantile estimates do not vary by observation. In that case, if the FUNCTIONS= option also does not include any scoring functions whose estimates vary by observation, then the OUT= data set contains only one observation per BY group.

If a distribution does not converge, then the quantile estimate variables for that distribution contain missing values for all observations.

For more information about the variables and observations in the OUT= data set, see the section OUT= Data Set.

Last updated: June 19, 2025