SEVERITY Procedure

Statistics of Fit

PROC SEVERITY computes and reports various statistics of fit to indicate how well the estimated model fits the data. The statistics belong to two categories: likelihood-based statistics and EDF-based statistics. Neg2LogLike, AIC, AICC, and BIC are likelihood-based statistics, and KS, AD, and CvM are EDF-based statistics. The following subsections provide definitions of each.

Likelihood-Based Statistics of Fit

Let $y_i$, $i = 1, \dots, N$, denote the response variable values. Let L be the likelihood as defined in the section Likelihood Function. Let p denote the number of model parameters that are estimated. Note that $p = p_d + (k - k_r)$, where $p_d$ is the number of distribution parameters, k is the number of all regression parameters, and $k_r$ is the number of regression parameters that are found to be linearly dependent (redundant) on other regression parameters. Given this notation, the likelihood-based statistics are defined as follows:

Neg2LogLike

The log likelihood is reported as

\[ \text{Neg2LogLike} = -2 \log(L) \]

The multiplying factor $-2$ makes it easy to compare Neg2LogLike to the other likelihood-based statistics. A model that has a smaller value of Neg2LogLike is deemed better.

AIC

Akaike’s information criterion (AIC) is defined as

\[ \text{AIC} = -2 \log(L) + 2p \]

A model that has a smaller AIC value is deemed better.

AICC

The corrected Akaike’s information criterion (AICC) is defined as

\[ \text{AICC} = -2 \log(L) + \frac{2Np}{N - p - 1} \]

A model that has a smaller AICC value is deemed better. AICC corrects the finite-sample bias that AIC has when N is small compared to p. AICC is related to AIC as

\[ \text{AICC} = \text{AIC} + \frac{2p(p + 1)}{N - p - 1} \]

As N becomes large compared to p, AICC converges to AIC. AICC is usually recommended over AIC as a model selection criterion.
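
Substituting the definition of AIC into this relation gives a single closed form. The following is only a short algebraic check that uses the symbols already defined in this section:

```latex
\begin{aligned}
\text{AICC} &= -2\log(L) + 2p + \frac{2p(p+1)}{N-p-1} \\
            &= -2\log(L) + \frac{2p(N-p-1) + 2p(p+1)}{N-p-1} \\
            &= -2\log(L) + \frac{2Np}{N-p-1}
\end{aligned}
```

For fixed p, the correction term $2p(p+1)/(N-p-1)$ tends to 0 as N grows, which is why AICC converges to AIC.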

BIC

The Schwarz Bayesian information criterion (BIC) is defined as

\[ \text{BIC} = -2 \log(L) + p \log(N) \]

A model that has a smaller BIC value is deemed better.
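
The four likelihood-based statistics can be sketched in a few lines of Python. This is an illustration only, not PROC SEVERITY code: the function name `likelihood_fit_stats` and the sample values (log likelihood of -120.5, N = 50, p = 2) are hypothetical.

```python
import math


def likelihood_fit_stats(log_lik, n_obs, n_params):
    """Return Neg2LogLike, AIC, AICC, and BIC from log(L), N, and p."""
    neg2ll = -2.0 * log_lik
    aic = neg2ll + 2.0 * n_params
    denom = n_obs - n_params - 1
    if denom <= 0:
        # The AICC correction term is undefined unless N > p + 1.
        raise ValueError("AICC requires N > p + 1")
    aicc = aic + 2.0 * n_params * (n_params + 1) / denom
    bic = neg2ll + n_params * math.log(n_obs)
    return {"Neg2LogLike": neg2ll, "AIC": aic, "AICC": aicc, "BIC": bic}


# Hypothetical fitted model: log(L) = -120.5, N = 50, p = 2.
stats = likelihood_fit_stats(log_lik=-120.5, n_obs=50, n_params=2)
```

For each statistic, the candidate model with the smaller value is preferred. Comparisons are meaningful only between models fitted to the same data.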

EDF-Based Statistics

This class of statistics is based on the difference between the estimate of the cumulative distribution function (CDF) and the estimate of the empirical distribution function (EDF). A model that has a smaller value of the chosen EDF-based statistic is deemed better.

Let $y_i$, $i = 1, \dots, N$, denote the sample of N values of the response variable. Let $w_i$ denote the normalized weight of the ith observation. If $w_i^o$ denotes the original, unnormalized weight of the ith observation, then $w_i = N w_i^o / \sum_{j=1}^{N} w_j^o$. Let $N_u$ denote the number of observations with unique (nonduplicate) values of the response variable. Let $W_i = \sum_{j=1}^{N} w_j I[y_j = y_i]$ denote the total weight of observations with a value $y_i$, where I is an indicator function. Let $r_i = \sum_{j=1}^{N} w_j I[y_j \le y_i]$ denote the total weight of observations with a value less than or equal to $y_i$. Let $W = \sum_{i=1}^{N} w_i$ denote the total weight of all observations. Use of normalized weights implies that $W = N$.
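
The weight bookkeeping described above can be sketched as follows. This is a minimal illustration and makes one assumption: that normalization rescales the original weights so that they sum to N. The function name `weight_summaries` and the sample data are hypothetical.

```python
from collections import defaultdict


def weight_summaries(y, raw_weights):
    """Return (unique values, per-value weights, cumulative weights, total weight)."""
    n = len(y)
    total_raw = sum(raw_weights)
    # Assumed normalization: rescale so that the weights sum to N.
    w = [n * wo / total_raw for wo in raw_weights]

    # Total normalized weight carried by each distinct response value.
    weight_at = defaultdict(float)
    for yi, wi in zip(y, w):
        weight_at[yi] += wi

    values = sorted(weight_at)
    per_value = [weight_at[v] for v in values]

    # Cumulative weight of observations less than or equal to each value.
    cumulative, running = [], 0.0
    for wt in per_value:
        running += wt
        cumulative.append(running)
    return values, per_value, cumulative, sum(w)


values, per_value, cumulative, total = weight_summaries(
    [3.0, 1.0, 3.0, 2.0], [2.0, 1.0, 2.0, 3.0]
)
```

In this example there are three unique values (1, 2, and 3). Their total weights are 0.5, 1.5, and 2.0, the cumulative weights are 0.5, 2.0, and 4.0, and the total weight equals the sample size of 4.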

Let $F_n(y_i)$ denote the EDF estimate that is computed by using the method that you specify in the EMPIRICALCDF= option. Let $Z_i = \hat{F}(y_i)$ denote the estimate of the CDF. Let $F_n(Z_i)$ denote the EDF estimate of the $Z_i$ values, computed by using the same method that is used to compute the EDF of the $y_i$ values. By the probability integral transformation, if $F(y)$ is the true distribution of the random variable Y, then the random variable $Z = F(Y)$ is uniformly distributed between 0 and 1 (D’Agostino and Stephens 1986, Ch. 4). Thus, comparing $F_n(y_i)$ with $\hat{F}(y_i)$ is equivalent to comparing $F_n(Z_i)$ with $\hat{F}(Z_i) = Z_i$ (uniform distribution).
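
The probability integral transformation can be illustrated with a small simulation. The sketch below assumes an exponential distribution with a hypothetical rate of 2. It draws a sample, applies the distribution's own CDF, and measures how far the transformed values are from a uniform distribution.

```python
import math
import random

random.seed(0)  # fixed seed so that the sketch is repeatable
rate = 2.0      # hypothetical exponential rate parameter
n = 10_000

# Draw Y from the exponential distribution, then apply its own CDF.
y = [random.expovariate(rate) for _ in range(n)]
z = sorted(1.0 - math.exp(-rate * yi) for yi in y)

# Largest gap between the EDF of Z (steps of 1/n) and the uniform CDF.
max_gap = max(max((i + 1) / n - zi, zi - i / n) for i, zi in enumerate(z))
mean_z = sum(z) / n
```

When the CDF that is applied matches the distribution that generated the data, `max_gap` is small and `mean_z` is close to 0.5. A poorly fitting CDF would push the transformed values away from uniformity, and the EDF-based statistics measure that departure.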

Note the following two points regarding which CDF estimates are used for computing the test statistics:

If you specify regression effects, then the CDF estimates that are used for computing the EDF test statistics are from a mixture distribution. For more information, see the section CDF and PDF Estimates with Regression Effects.
If the EDF estimates are conditional because of the truncation information, then each unconditional estimate is converted to a conditional estimate using the method described in the section Truncation and Conditional CDF Estimates.

In the following, it is assumed that $Z_i$ denotes an appropriate estimate of the CDF if you specify any truncation or regression effects. Given this, the EDF-based statistics of fit are defined as follows:

KS

The Kolmogorov-Smirnov (KS) statistic computes the largest vertical distance between the CDF and the EDF. It is formally defined as follows:

\[ \text{KS} = \sup_{y} \left| F_n(y) - F(y) \right| \]

If the STANDARD method is used to compute the EDF, then the following formula is used:

\[
\begin{aligned}
D^{+} &= \max_{i} \left( \frac{r_i}{W} - Z_i \right) \\
D^{-} &= \max_{i} \left( Z_i - \frac{r_{i-1}}{W} \right) \\
\text{KS} &= \sqrt{W} \, \max(D^{+}, D^{-}) + \frac{0.19}{\sqrt{W}}
\end{aligned}
\]

Note that $r_0$ is assumed to be 0.
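
The STANDARD-method formula can be sketched as follows. This is an illustration under stated assumptions, not PROC SEVERITY code: the hypothetical function `ks_standard` expects the CDF estimates for the unique response values in ascending order, together with the matching cumulative weights and the total weight.

```python
import math


def ks_standard(z, cum_weights, total_weight):
    """KS statistic from sorted CDF estimates and cumulative weights."""
    # D+ : the EDF step (after the jump) lies above the fitted CDF.
    d_plus = max(r / total_weight - zi for r, zi in zip(cum_weights, z))
    # D- : the fitted CDF lies above the EDF just before the jump.
    # The cumulative weight before the first value is taken to be 0.
    previous = [0.0] + list(cum_weights[:-1])
    d_minus = max(zi - r / total_weight for r, zi in zip(previous, z))
    root_w = math.sqrt(total_weight)
    return root_w * max(d_plus, d_minus) + 0.19 / root_w


# Four unweighted observations, so the cumulative weights are 1, 2, 3, 4.
ks = ks_standard([0.1, 0.4, 0.7, 0.9], [1.0, 2.0, 3.0, 4.0], 4.0)
```

In this example the largest upward gap is 0.15 and the largest downward gap is 0.2, so the statistic is about 2(0.2) + 0.19/2 = 0.495.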

If the method used to compute the EDF is any method other than the STANDARD method, then the following formula is used:

\[
\begin{aligned}
D^{+} &= \max_{i} \left( F_n(Z_i) - Z_i \right), \quad \text{if } F_n(Z_i) \ge Z_i \\
D^{-} &= \max_{i} \left( Z_i - F_n(Z_i) \right), \quad \text{if } F_n(Z_i) < Z_i \\
\text{KS} &= \sqrt{W} \, \max(D^{+}, D^{-}) + \frac{0.19}{\sqrt{W}}
\end{aligned}
\]

AD

The Anderson-Darling (AD) statistic is a quadratic EDF statistic that is proportional to the expected value of the weighted squared difference between the EDF and CDF. It is formally defined as follows:

\[ \text{AD} = N \int_{-\infty}^{\infty} \frac{(F_n(y) - F(y))^2}{F(y) (1 - F(y))} \, dF(y) \]

If the STANDARD method is used to compute the EDF, then PROC SEVERITY uses the following formula:

\[ \text{AD} = -W - \frac{1}{W} \sum_{i=1}^{N_u} W_i \left[ (2 r_i - 1) \log(Z_i) + (2W + 1 - 2 r_i) \log(1 - Z_i) \right] \]
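
A minimal sketch of this weighted formula follows. The function name `ad_standard` and the sample data are hypothetical. The inputs are assumed to be the CDF estimates for the unique response values in ascending order, the total weight at each value, the cumulative weights, and the total weight.

```python
import math


def ad_standard(z, value_weights, cum_weights, total_weight):
    """Anderson-Darling statistic for the STANDARD EDF method."""
    total = 0.0
    for zi, wi, ri in zip(z, value_weights, cum_weights):
        total += wi * (
            (2.0 * ri - 1.0) * math.log(zi)
            + (2.0 * total_weight + 1.0 - 2.0 * ri) * math.log(1.0 - zi)
        )
    return -total_weight - total / total_weight


# Four unweighted, distinct observations: each weight is 1.
z = [0.1, 0.4, 0.7, 0.9]
ad = ad_standard(z, [1.0] * 4, [1.0, 2.0, 3.0, 4.0], 4.0)
```

With unit weights and no ties, the cumulative weight of the ith value is i and the total weight is N, so the expression reduces to the familiar unweighted Anderson-Darling computing formula.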

If the method used to compute the EDF is any method other than the STANDARD method, then the statistic can be computed by using the following two pieces of information:

If the EDF estimates are computed using the KAPLANMEIER or MODIFIEDKM methods, then the EDF is a step function such that the estimate $F_n(y)$ is a constant equal to $F_n(y_{i-1})$ in the interval $[y_{i-1}, y_i)$. If the EDF estimates are computed using the TURNBULL method, then there are two types of intervals: one in which the EDF curve is constant and the other in which the EDF curve is theoretically undefined. For computational purposes, it is assumed that the EDF curve is linear for the latter type of interval. For each method, the EDF estimate $F_n(y)$ at y can be written as

\[ F_n(y) = F_n(y_{i-1}) + S_i (y - y_{i-1}), \quad y \in [y_{i-1}, y_i) \]

where $S_i$ is the slope of the line defined as

\[ S_i = \frac{F_n(y_i) - F_n(y_{i-1})}{y_i - y_{i-1}} \]

For the KAPLANMEIER or MODIFIEDKM method, $S_i = 0$ in each interval.
Using the probability integral transform $z = F(y)$, the formula simplifies to

\[ \text{AD} = N \int_{0}^{1} \frac{(F_n(z) - z)^2}{z (1 - z)} \, dz \]

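The computation formula can then be derived from the approximation,

\[
\begin{aligned}
\text{AD} &= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \frac{(F_n(z) - z)^2}{z (1 - z)} \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \frac{(F_n(Z_{i-1}) + S_i (z - Z_{i-1}) - z)^2}{z (1 - z)} \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \frac{(P_i - Q_i z)^2}{z (1 - z)} \, dz
\end{aligned}
\]

where $P_i = F_n(Z_{i-1}) - S_i Z_{i-1}$, $Q_i = 1 - S_i$, and K is the number of points at which the EDF estimates are computed. For the TURNBULL method, $K = 2k$ for some k.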

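Assuming $Z_0 = 0$, $Z_{K+1} = 1$, $F_n(0) = 0$, and $F_n(Z_K) = 1$ yields the computation formula,

\[ \text{AD} = -N \left( Z_1 + \log(1 - Z_1) + \log(Z_K) + (1 - Z_K) \right) + N \sum_{i=2}^{K} \left[ P_i^2 A_i - (Q_i - P_i)^2 B_i - Q_i^2 C_i \right] \]

where $A_i = \log(Z_i) - \log(Z_{i-1})$, $B_i = \log(1 - Z_i) - \log(1 - Z_{i-1})$, and $C_i = Z_i - Z_{i-1}$.

If EDF estimates are computed using the KAPLANMEIER or MODIFIEDKM method, then $P_i = F_n(Z_{i-1})$ and $Q_i = 1$, which simplifies the formula as

\[ \text{AD} = -N \left( 1 + \log(1 - Z_1) + \log(Z_K) \right) + N \sum_{i=2}^{K} \left[ F_n(Z_{i-1})^2 A_i - (1 - F_n(Z_{i-1}))^2 B_i \right] \]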
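
For a step-function EDF, the closed form can be checked numerically. The sketch below rests on several assumptions: the EDF is 0 below $Z_1$, it reaches 1 at $Z_K$, and the closed form is $-N(1 + \log(1 - Z_1) + \log(Z_K)) + N \sum_{i=2}^{K} [F_n(Z_{i-1})^2 A_i - (1 - F_n(Z_{i-1}))^2 B_i]$, where $A_i$ and $B_i$ are differences of $\log(Z)$ and $\log(1 - Z)$ over each interval. It compares that expression with midpoint-rule quadrature of the defining integral. The function names and data are hypothetical.

```python
import math


def ad_step_formula(z, fn, n):
    """Closed form for a step EDF; z = Z_1..Z_K, fn = EDF at those points."""
    total = 0.0
    for i in range(1, len(z)):
        f_prev = fn[i - 1]  # EDF value held on [z[i-1], z[i])
        a_i = math.log(z[i]) - math.log(z[i - 1])
        b_i = math.log(1.0 - z[i]) - math.log(1.0 - z[i - 1])
        total += f_prev ** 2 * a_i - (1.0 - f_prev) ** 2 * b_i
    edge = 1.0 + math.log(1.0 - z[0]) + math.log(z[-1])
    return -n * edge + n * total


def ad_by_quadrature(z, fn, n, panels=20000):
    """N times the integral of (F_n(t) - t)^2 / (t (1 - t)) over [0, 1]."""
    knots = [0.0] + list(z) + [1.0]
    levels = [0.0] + list(fn)  # the EDF is taken to be 0 below Z_1
    total = 0.0
    for a, b, f in zip(knots[:-1], knots[1:], levels):
        h = (b - a) / panels
        for j in range(panels):
            t = a + (j + 0.5) * h  # midpoints avoid the endpoints 0 and 1
            total += (f - t) ** 2 / (t * (1.0 - t)) * h
    return n * total


z_pts, edf = [0.2, 0.5, 0.8], [0.3, 0.6, 1.0]
closed = ad_step_formula(z_pts, edf, 10)
numeric = ad_by_quadrature(z_pts, edf, 10)
```

The two values agree to many decimal places for this example, which supports the partial-fraction integration behind the closed form.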

CvM

The Cramér–von Mises (CvM) statistic is a quadratic EDF statistic that is proportional to the expected value of the squared difference between the EDF and CDF. It is formally defined as follows:

\[ \text{CvM} = N \int_{-\infty}^{\infty} (F_n(y) - F(y))^2 \, dF(y) \]

If the STANDARD method is used to compute the EDF, then the following formula is used:

\[ \text{CvM} = \frac{1}{12W} + \sum_{i=1}^{N_u} W_i \left( Z_i - \frac{2 r_i - 1}{2W} \right)^2 \]
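
A minimal sketch of this formula follows, using hypothetical names and data. The inputs are assumed to be the CDF estimates for the unique response values in ascending order, the total weight at each value, the cumulative weights, and the total weight.

```python
def cvm_standard(z, value_weights, cum_weights, total_weight):
    """Cramer-von Mises statistic for the STANDARD EDF method."""
    total = 0.0
    for zi, wi, ri in zip(z, value_weights, cum_weights):
        # Midpoint of the EDF jump at this value, on the probability scale.
        mid = (2.0 * ri - 1.0) / (2.0 * total_weight)
        total += wi * (zi - mid) ** 2
    return 1.0 / (12.0 * total_weight) + total


# Four unweighted, distinct observations: each weight is 1.
cvm = cvm_standard([0.1, 0.4, 0.7, 0.9], [1.0] * 4, [1.0, 2.0, 3.0, 4.0], 4.0)
```

In this example the jump midpoints are 0.125, 0.375, 0.625, and 0.875. The squared deviations sum to 0.0075, and adding 1/48 gives a statistic of about 0.0283.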

If the method used to compute the EDF is any method other than the STANDARD method, then the statistic can be computed by using the following two pieces of information:

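As described previously for the AD statistic, the EDF estimates are assumed to be piecewise linear such that the estimate $F_n(y)$ at y is

\[ F_n(y) = F_n(y_{i-1}) + S_i (y - y_{i-1}), \quad y \in [y_{i-1}, y_i) \]

where $S_i$ is the slope of the line defined as

\[ S_i = \frac{F_n(y_i) - F_n(y_{i-1})}{y_i - y_{i-1}} \]

For the KAPLANMEIER or MODIFIEDKM method, $S_i = 0$ in each interval.
Using the probability integral transform $z = F(y)$, the formula simplifies to

\[ \text{CvM} = N \int_{0}^{1} (F_n(z) - z)^2 \, dz \]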

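The computation formula can then be derived from the following approximation,

\[
\begin{aligned}
\text{CvM} &= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} (F_n(z) - z)^2 \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} (F_n(Z_{i-1}) + S_i (z - Z_{i-1}) - z)^2 \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} (P_i - Q_i z)^2 \, dz
\end{aligned}
\]

where $P_i = F_n(Z_{i-1}) - S_i Z_{i-1}$, $Q_i = 1 - S_i$, and K is the number of points at which the EDF estimates are computed. For the TURNBULL method, $K = 2k$ for some k.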

Assuming $Z_0 = 0$, $Z_{K+1} = 1$, and $F_n(0) = 0$ yields the following computation formula,

\[ \text{CvM} = N \frac{Z_1^3}{3} + N \sum_{i=2}^{K+1} \left[ P_i^2 A_i - P_i Q_i B_i - \frac{Q_i^2}{3} C_i \right] \]

where $A_i = Z_i - Z_{i-1}$, $B_i = Z_i^2 - Z_{i-1}^2$, and $C_i = Z_{i-1}^3 - Z_i^3$.

If EDF estimates are computed using the KAPLANMEIER or MODIFIEDKM method, then $P_i = F_n(Z_{i-1})$ and $Q_i = 1$, which simplifies the formula as

\[ \text{CvM} = \frac{N}{3} + N \sum_{i=2}^{K+1} \left[ F_n(Z_{i-1})^2 (Z_i - Z_{i-1}) - F_n(Z_{i-1}) (Z_i^2 - Z_{i-1}^2) \right] \]

which is similar to the formula proposed by Koziol and Green (1976).
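
The simplified step-function formula can be checked against direct integration. The sketch below assumes that the EDF is 0 below $Z_1$ and that $Z_{K+1} = 1$. It evaluates the simplified sum and compares it with Simpson's rule applied to each interval, which is exact here because the integrand is quadratic on each interval. The function names and data are hypothetical.

```python
def cvm_step_formula(z, fn, n):
    """Simplified formula; z = Z_1..Z_K, fn = EDF values at those points."""
    knots = list(z) + [1.0]  # append the assumed upper end point of 1
    total = 0.0
    for i in range(1, len(knots)):
        f_prev = fn[i - 1]  # EDF value held on [knots[i-1], knots[i])
        total += (
            f_prev ** 2 * (knots[i] - knots[i - 1])
            - f_prev * (knots[i] ** 2 - knots[i - 1] ** 2)
        )
    return n / 3.0 + n * total


def cvm_by_simpson(z, fn, n):
    """N times the integral of (F_n(t) - t)^2 over [0, 1], interval by interval."""
    knots = [0.0] + list(z) + [1.0]
    levels = [0.0] + list(fn)  # the EDF is taken to be 0 below Z_1
    total = 0.0
    for a, b, f in zip(knots[:-1], knots[1:], levels):
        mid = (a + b) / 2.0
        # One Simpson panel integrates a quadratic exactly.
        total += (b - a) / 6.0 * ((f - a) ** 2 + 4.0 * (f - mid) ** 2 + (f - b) ** 2)
    return n * total


z_pts, edf = [0.2, 0.5, 0.8], [0.3, 0.6, 1.0]
closed = cvm_step_formula(z_pts, edf, 10)
direct = cvm_by_simpson(z_pts, edf, 10)
```

Both computations give about 0.1133 for this example, so the constant N/3 and the two interval sums in the simplified formula account for the whole integral.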

Last updated: June 19, 2025