HPCDM Procedure

PROC HPCDM Statement

  • PROC HPCDM options;

The PROC HPCDM statement invokes the procedure. You can specify the following options, which are listed in alphabetical order.

ADJUSTEDSEVERITY=symbol-name
ADJSEV=symbol-name

names the symbol that represents the adjusted severity value in the SAS programming statements that you specify. The symbol-name is a SAS name that conforms to the naming conventions of a SAS variable. For more information, see the section Programming Statements.

COUNTSTORE=SAS-item-store

names the item store that contains all the information about the frequency (count) model. The COUNTREG procedure generates this item store when you use the STORE statement.

The exogenous variables in the frequency model, if any, are deduced from this item store. The DATA= data set must contain all those variables.

If you specify a BY statement in the PROC COUNTREG step that creates the COUNTSTORE= item store, then you must specify an identical BY statement in the PROC HPCDM step.

You must specify this option if you do not specify the EXTERNALCOUNTS statement. This option is ignored if you specify the EXTERNALCOUNTS statement, because PROC HPCDM does not need to simulate frequency counts internally when you specify externally simulated counts.

DATA=SAS-data-set

names the input data set that contains the values of regression variables in frequency or severity models and severity adjustment variables that you use in the programming statements.

The DATA= data set specifies information about the scenario for which you want to estimate the aggregate loss distribution. The interpretation of the contents of the data set depends on whether you specify the EXTERNALCOUNTS statement. For more information, see the section Specifying Scenario Data in the DATA= Data Set.

MAXCOUNTDRAW=number
MAXCOUNT=number

specifies an upper limit on the number of loss events (count) that is used for simulating one aggregate loss sample point. If the number is equal to $N_{\max}$, then any count that is greater than $N_{\max}$ is assumed to be equal to $N_{\max}$, and only $N_{\max}$ severity draws are made to compute one point in the aggregate loss sample.

If you specify this option and also specify the COUNTSTORE= item store, then the limit is applied to each count that PROC HPCDM randomly draws from the count distribution in the COUNTSTORE= item store. Any count draw that is larger than the number is replaced by the number.

If you specify this option and also specify the EXTERNALCOUNTS statement, then the limit is applied to each observation in the DATA= data set, and any value of the COUNT= variable that is larger than the number is replaced by the number.

If you do not specify this option, then a default value of 1,000 is used.

If you specify a number that is significantly larger than 1,000, then PROC HPCDM might take a very long time to complete the simulation, especially when some counts are close to the limit.
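The capping rule described for MAXCOUNTDRAW= can be illustrated with a minimal Python sketch. It is not PROC HPCDM's implementation; the function name `simulate_aggregate_loss`, the `draw_severity` callback, and the lognormal severity parameters are all hypothetical and chosen only for illustration.

```python
import random

def simulate_aggregate_loss(count, draw_severity, max_count=1000):
    """Return one aggregate-loss sample point, capping the count at max_count."""
    # Any count larger than the limit is replaced by the limit.
    effective_count = min(count, max_count)
    # Only effective_count severity draws are made for this sample point.
    return sum(draw_severity() for _ in range(effective_count))

rng = random.Random(12345)

def draw():
    # Hypothetical lognormal severity; the parameters are illustrative only.
    return rng.lognormvariate(1.0, 0.5)

# A count of 2,500 is treated as 1,000, so only 1,000 severities are drawn.
point = simulate_aggregate_loss(count=2500, draw_severity=draw, max_count=1000)
```

Because the work per sample point grows with the effective count, the sketch also shows why a very large limit can make the simulation slow when counts are near the limit.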

NOPRINT

turns off all displayed and graphical output. If you specify this option, then PROC HPCDM ignores any value that you specify for the PRINT= or PLOTS= option.

NPERTURBEDSAMPLES=number
NPERTURB=number

requests that parameter perturbation analysis be conducted. The model parameters are perturbed the specified number of times and a separate full sample is simulated for each set of perturbed parameter values. The summary statistics and percentiles are computed for each such perturbed sample, and their values are aggregated across the samples to compute the mean and standard deviation of each summary statistic and percentile.

The parameter perturbation procedure makes random draws of parameter values from a multivariate normal distribution if the covariance estimates of the parameters are available. For the multivariate normal distribution of severity model parameters, PROC HPCDM attempts to read the covariance estimates from the SEVERITYEST= data set or the SEVERITYSTORE= item store. For the multivariate normal distribution of count model parameters, PROC HPCDM attempts to read the covariance estimates from the COUNTSTORE= store. If covariance estimates are not available or valid, then for each parameter, a random draw is made from the univariate normal distribution that has mean and standard deviation equal to the point estimate and the standard error, respectively, of that parameter. If neither covariance nor standard error estimates are available, then perturbation analysis is not conducted.

If you specify the PRINT=ALL or PRINT=PERTURBSUMMARY option, then the summary of perturbation analysis is printed for the core summary statistics and the percentiles of the aggregate loss distribution. If you specify the OUTSUM statement, then the requested summary statistics are written to the OUTSUM= data set for each perturbed sample. You can also optionally request that each perturbed sample be written in its entirety to the OUT= data set by specifying the PERTURBOUT option in the OUTPUT statement.

For more information on the parameter perturbation analysis, see the section Parameter Perturbation Analysis.
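The fallback order described above (multivariate normal when covariance estimates are usable, otherwise independent univariate normals from the standard errors, otherwise no perturbation) can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's actual algorithm: the function names are hypothetical, and treating a covariance matrix as invalid when its Cholesky factorization fails is an assumption of this sketch.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(cov)
    factor = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(factor[i][k] * factor[j][k] for k in range(j))
            if i == j:
                factor[i][j] = math.sqrt(cov[i][i] - s)  # ValueError if negative
            else:
                factor[i][j] = (cov[i][j] - s) / factor[j][j]
    return factor

def perturb_parameters(estimates, rng, cov=None, std_errors=None):
    """Return one perturbed parameter vector, or None if no perturbation is possible."""
    n = len(estimates)
    if cov is not None:
        try:
            factor = cholesky(cov)
        except (ValueError, ZeroDivisionError):
            factor = None  # assumption: a failed factorization means "not valid"
        if factor is not None:
            # Multivariate normal draw: estimates + factor * z, with z standard normal.
            z = [rng.gauss(0.0, 1.0) for _ in range(n)]
            return [estimates[i] + sum(factor[i][k] * z[k] for k in range(i + 1))
                    for i in range(n)]
    if std_errors is not None:
        # Independent univariate normal draw for each parameter.
        return [rng.gauss(m, s) for m, s in zip(estimates, std_errors)]
    return None  # neither covariance nor standard errors: no perturbation

rng = random.Random(2025)
draw = perturb_parameters([1.5, 0.4], rng, cov=[[0.04, 0.01], [0.01, 0.09]])
```

Calling `perturb_parameters` once per perturbed sample and then simulating a full sample from each returned vector mirrors the outer loop that NPERTURBEDSAMPLES= controls.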

NREPLICATES=number
NREP=number

specifies a number that controls the size of the compound distribution sample that PROC HPCDM simulates. The number is interpreted differently based on whether you specify the EXTERNALCOUNTS statement.

If you do not specify the EXTERNALCOUNTS statement, then the sample size is equal to the number that you specify for this option. If you do not specify this option, then a default value of 100,000 is used.

If you specify the EXTERNALCOUNTS statement, then the number of replicates that you specify in the DATA= data set is multiplied by the number that you specify for this option to get the total size of the compound distribution sample. If you do not specify this option, then a default value of 1 is used.
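The two interpretations of NREPLICATES= can be summarized in a short Python sketch. The function and argument names are hypothetical; `data_replicates` stands for the number of replicates supplied in the DATA= data set when externally simulated counts are used.

```python
def compound_sample_size(nreplicates=None, external_counts=False, data_replicates=None):
    """Return the total compound distribution sample size implied by the options."""
    if not external_counts:
        # Without EXTERNALCOUNTS, the option is the sample size (default 100,000).
        return 100_000 if nreplicates is None else nreplicates
    # With EXTERNALCOUNTS, the option is a multiplier (default 1) that is
    # applied to the number of replicates in the DATA= data set.
    multiplier = 1 if nreplicates is None else nreplicates
    return data_replicates * multiplier

default_size = compound_sample_size()
external_size = compound_sample_size(nreplicates=3, external_counts=True,
                                     data_replicates=10_000)
```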

PCTLDEF=percentile-method

specifies the method to compute the percentiles of the compound distribution. The percentile-method can be 1, 2, 3, 4, or 5. The default method is 5. For more information, see the description of the PCTLDEF= option in the UNIVARIATE procedure in the Base SAS Procedures Guide: Statistical Procedures.

PERTURBMETHOD=perturbation-method

specifies a method to perturb the parameters of the severity and count models. This option has no effect if you do not specify the NPERTURBEDSAMPLES= option. You can specify one of the following perturbation-methods:

ASYNC | 0

causes each thread of computation to use its own set of perturbed model parameters. In particular, each thread uses its own pseudorandom number generator (PRNG) to perturb the model parameters, which is the same PRNG that the thread uses for making random draws from the severity or count distributions. Because each thread’s PRNG starts with a different seed and because random draws that pertain to perturbation are interleaved with random draws that are made from the severity or count distributions, each thread effectively uses a different set of perturbed model parameters even though it is simulating a subset of the same, global perturbed sample.

This method is computationally slightly more efficient because it does not need to synchronize the set of perturbed parameters among threads. However, each perturbed sample that it produces is conceptually a collection of smaller, distinct perturbed samples that belong to different compound distribution models.

SYNC | 1

specifies that all threads of computation use the same (synchronized) set of perturbed model parameters. When you specify this option, PROC HPCDM in concept uses a single, dedicated PRNG to perturb the model parameters and shares those parameters with all threads.

This method ensures that all observations of a particular perturbed sample belong to the same compound distribution model, because each thread uses the same set of perturbed model parameters.

It is recommended that you specify the SYNC method. The default is PERTURBMETHOD=ASYNC, which ensures that the current release of PROC HPCDM produces, by default, the same perturbation results as releases prior to SAS/ETS 15.1.
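The difference between the two methods can be sketched in Python by simulating "threads" as separate PRNG instances. This is a conceptual illustration only: the function name, the single-parameter model, and the per-thread seeding scheme (`seed + t + 1`) are assumptions of the sketch, not details of PROC HPCDM.

```python
import random

def perturbed_value_per_thread(estimate, std_error, n_threads, method, seed):
    """Return the perturbed parameter value that each simulated thread would use."""
    if method == "SYNC":
        # One dedicated PRNG perturbs the parameter; every thread shares the result.
        shared = random.Random(seed).gauss(estimate, std_error)
        return [shared] * n_threads
    if method == "ASYNC":
        # Each thread perturbs with its own, differently seeded PRNG.
        return [random.Random(seed + t + 1).gauss(estimate, std_error)
                for t in range(n_threads)]
    raise ValueError("method must be 'SYNC' or 'ASYNC'")

sync_values = perturbed_value_per_thread(2.0, 0.3, n_threads=4, method="SYNC", seed=7)
async_values = perturbed_value_per_thread(2.0, 0.3, n_threads=4, method="ASYNC", seed=7)
```

In the SYNC case every entry of the list is identical, so all observations of the perturbed sample come from one compound distribution model. In the ASYNC case the entries differ, so the sample is a mixture of draws from several slightly different models.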

PLOTS <(global-plot-options)> =plot-request-option
PLOTS <(global-plot-options)> =(plot-request-option …plot-request-option)

specifies the desired graphical output.

By default, the HPCDM procedure produces no graphical output.

You can specify the following global-plot-option:

ONLY

turns off the default graphical output and prepares only the requested plots.

If you specify more than one plot-request-option, then separate them with spaces and enclose them in parentheses. The following plot-request-options are available:

ALL

displays all the graphical output.

CONDITIONALDENSITY (conditional-density-plot-options)
CONDPDF (conditional-density-plot-options)

prepares a group of plots of the conditional density function estimates. The group contains at most three plots, each conditional on the value of the aggregate loss being in one of the three regions that are defined by the quantiles that you specify in the following conditional-density-plot-options:

LEFTQ=number

specifies the quantile in the range (0,1) that marks the end of the left-tail region. If you specify a value of $l$ for number, then the left-tail region is defined as the set of values that are less than or equal to $q_l$, where $q_l$ is the $l$th quantile. For the left-tail region, nonparametric estimates of the conditional probability density function $f_S^l(s) = \Pr[S = s \mid S \le q_l]$ are plotted. The value of $q_l$ is estimated by the $100\,l$th percentile of the simulated compound distribution sample.

If you do not specify this option or you specify a missing value for this option, then the left-tail region is not plotted.

RIGHTQ=number

specifies the quantile in the range (0,1) that marks the beginning of the right-tail region. If you specify a value of $r$ for number, then the right-tail region is defined as the set of values that are greater than $q_r$, where $q_r$ is the $r$th quantile. For the right-tail region, nonparametric estimates of the conditional probability density function $f_S^r(s) = \Pr[S = s \mid S > q_r]$ are plotted. The value of $q_r$ is estimated by the $100\,r$th percentile of the simulated compound distribution sample.

If you do not specify this option or you specify a missing value for this option, then the right-tail region is not plotted.

You must specify a nonmissing value for at least one of the preceding two options. For the region between the LEFTQ= and RIGHTQ= quantiles, which is referred to as the central or body region, nonparametric estimates of the conditional probability density function $f_S^c(s) = \Pr[S = s \mid q_l < S \le q_r]$ are plotted. If you do not specify a LEFTQ= value, then $q_l$ is assumed to be 0. If you do not specify a RIGHTQ= value, then $q_r$ is assumed to be $\infty$.
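The way the LEFTQ= and RIGHTQ= quantiles partition a simulated sample into left-tail, body, and right-tail regions can be sketched in Python. The function names are hypothetical, and the simple empirical quantile used here is an assumption of the sketch; it is not one of the PCTLDEF= methods, and no density estimation is performed.

```python
import math

def empirical_quantile(sorted_sample, p):
    """Smallest sample value whose empirical CDF is at least p (illustrative rule)."""
    index = max(math.ceil(len(sorted_sample) * p) - 1, 0)
    return sorted_sample[index]

def split_into_regions(sample, leftq=None, rightq=None):
    """Split a sample into the regions on which conditional densities are estimated."""
    if leftq is None and rightq is None:
        raise ValueError("specify at least one of leftq and rightq")
    ordered = sorted(sample)
    # A missing LEFTQ= means q_l = 0; a missing RIGHTQ= means q_r = infinity.
    q_l = empirical_quantile(ordered, leftq) if leftq is not None else 0.0
    q_r = empirical_quantile(ordered, rightq) if rightq is not None else math.inf
    regions = {"body": [s for s in ordered if q_l < s <= q_r]}
    if leftq is not None:
        regions["left"] = [s for s in ordered if s <= q_l]   # S <= q_l
    if rightq is not None:
        regions["right"] = [s for s in ordered if s > q_r]   # S > q_r
    return regions

regions = split_into_regions(list(range(1, 11)), leftq=0.2, rightq=0.8)
```

A tail region appears in the result only when its quantile is specified, which matches the rule that an unspecified tail is not plotted.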

DENSITY

prepares a plot of the nonparametric estimates of the probability density function (in particular, histogram and kernel density estimates) of the compound distribution.

EDF <(edf-plot-option)>

prepares a plot of the nonparametric estimates of the cumulative distribution function of the compound distribution.

You can request that the confidence interval be plotted by specifying the following edf-plot-option:

ALPHA=number

specifies the confidence level in the (0,1) range that is used for computing the confidence intervals for the EDF estimates. If you specify a value of $\alpha$ for number, then the upper and lower confidence limits for the confidence level of $100(1-\alpha)$ are plotted.

NONE

displays none of the graphical output. If you specify this option, then it overrides all other plot request options. The default graphical output is also suppressed.

Note that if the simulated sample size is large, then it can take a significant amount of time and memory to prepare the plots.

PRINT <(global-display-option)> =display-option
PRINT <(global-display-option)> =(display-option …display-option)

specifies the desired displayed output. If you specify more than one display-option, then separate them with spaces and enclose them in parentheses.

You can specify the following global-display-option:

ONLY

turns off the default displayed output and displays only the requested output.

You can specify the following display-options:

ALL

displays all the output.

NONE

displays none of the output. If you specify this option, then it overrides all other display options. The default displayed output is also suppressed.

PERCENTILES

displays the percentiles of the compound distribution sample. This includes all the predefined percentiles, percentiles that you request in the OUTSUM statement, and percentiles that you specify for preparing conditional density plots.

PERTURBSUMMARY

displays the mean and standard deviation of the summary statistics and percentiles that are taken across all the samples produced by perturbing the model parameters. This option is valid only if you specify the NPERTURBEDSAMPLES= option in the PROC HPCDM statement.

SUMMARYSTATISTICS | SUMSTAT

displays the summary statistics of the compound distribution sample.

If you do not specify the PRINT= option or the ONLY global-display-option, then the default displayed output is equivalent to specifying PRINT=(SUMMARYSTATISTICS).

SEED=number

specifies the integer to use as the seed in generating the pseudorandom numbers that are used for simulating severity and frequency values.

If you do not specify the seed or if number is negative or 0, then the time of day from the computer’s clock is used as the seed.

SEVERITYEST=SAS-data-set

names the input data set that contains the parameter estimates for the severity model. The format of this data set must be the same as the OUTEST= data set that is produced by the SEVERITY procedure.

The names of the regression variables in the scale regression model, if any, are deduced from this data set. In particular, PROC HPCDM assumes that all the variables in the SEVERITYEST= data set that do not appear in the following list are scale regression variables:

  • BY variables

  • _MODEL_, _TYPE_, _NAME_, and _STATUS_ variables

  • variables that represent distribution parameters

The DATA= data set must contain all the regressors in the scale regression model.

To ensure that PROC HPCDM correctly matches the values of regressors and the values of regression parameter estimates, you might need to rename the regressors in the DATA= data set so that their names match the names of the regressors that you specify in the SCALEMODEL statement of the PROC SEVERITY step that fits the severity model.

If you specify a BY statement in the PROC SEVERITY step that creates the SEVERITYEST= data set, then you must specify an identical BY statement in the PROC HPCDM step. Otherwise, PROC HPCDM detects the BY variables as regression variables in the scale regression model, which might produce unexpected results.
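The exclusion rule for deducing scale regression variables can be sketched in Python. The function name and the example column names (`region`, `carAge`, `income`, `Mu`, `Sigma`) are hypothetical; the sketch compares names without regard to case, as SAS variable names are case-insensitive.

```python
RESERVED = {"_MODEL_", "_TYPE_", "_NAME_", "_STATUS_"}

def deduce_scale_regressors(columns, by_vars=(), dist_params=()):
    """Return the columns that would be treated as scale regression variables."""
    excluded = {name.upper() for name in RESERVED | set(by_vars) | set(dist_params)}
    # Every remaining column is assumed to be a scale regression variable.
    return [c for c in columns if c.upper() not in excluded]

columns = ["region", "_MODEL_", "_TYPE_", "_NAME_", "_STATUS_",
           "Mu", "Sigma", "carAge", "income"]

with_by = deduce_scale_regressors(columns, by_vars=["region"],
                                  dist_params=["Mu", "Sigma"])
without_by = deduce_scale_regressors(columns, dist_params=["Mu", "Sigma"])
```

Comparing `with_by` and `without_by` shows the failure mode described above: when the BY statement is omitted, the BY variable `region` is left in the list and is mistaken for a regressor.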

SEVERITYSTORE=SAS-item-store
SEVSTORE=SAS-item-store

specifies the item store that contains the context and estimates of the severity model. A PROC SEVERITY step with the OUTSTORE= option creates this item store.

If your severity model contains classification or interaction effects, then you need to use this option instead of the SEVERITYEST= option to specify the severity model. If you specify this option, you cannot specify the SEVERITYEST= option.

If you specify a BY statement in the PROC SEVERITY step that creates the SEVERITYSTORE= item store, then you must specify an identical BY statement in the PROC HPCDM step.

VARDEF=divisor

specifies the divisor to use in the calculation of variance, standard deviation, kurtosis, and skewness of the compound distribution sample. If the sample size is N, then you can specify one of the following values for the divisor:

DF

sets the divisor for variance to $N - 1$. This is the default. This also changes the definitions of skewness and kurtosis.

N

sets the divisor to N.

For more information, see the section Descriptive Statistics.
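The effect of the divisor on the variance can be shown with a minimal Python sketch. The function name is hypothetical, and only the variance is covered; the corresponding skewness and kurtosis definitions are not reproduced here.

```python
def sample_variance(sample, vardef="DF"):
    """Variance of a sample with divisor N - 1 (VARDEF=DF) or N (VARDEF=N)."""
    n = len(sample)
    mean = sum(sample) / n
    sum_sq = sum((x - mean) ** 2 for x in sample)
    key = vardef.upper()
    if key == "DF":
        divisor = n - 1   # degrees of freedom, the default
    elif key == "N":
        divisor = n       # number of observations
    else:
        raise ValueError("vardef must be 'DF' or 'N'")
    return sum_sq / divisor

losses = [1.0, 2.0, 3.0, 4.0]
var_df = sample_variance(losses, "DF")
var_n = sample_variance(losses, "N")
```

For the large samples that PROC HPCDM typically simulates, the two divisors give nearly identical results; the difference matters mainly for small samples.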

Last updated: June 19, 2025