HPSEVERITY Procedure

Example 21.4 Fitting a Scaled Tweedie Model with Regressors

The Tweedie distribution is often used in the insurance industry to explain the influence of regression effects on the distribution of losses. PROC HPSEVERITY provides a predefined scaled Tweedie distribution (STWEEDIE) that enables you to model the influence of regression effects on the scale parameter. The scale regression model has its own advantages, such as the ability to easily account for inflation effects. This example illustrates how that model can be used to evaluate the influence of regression effects on the mean of the Tweedie distribution, which is useful in problems such as rate-making and pure premium modeling.

Assume a Tweedie process whose mean $\mu$ is affected by $k$ regression effects $x_j$, $j = 1, \ldots, k$, as follows:

$$\mu = \mu_0 \exp\left( \sum_{j=1}^{k} \beta_j x_j \right)$$

where $\mu_0$ represents the base value of the mean (you can think of $\mu_0$ as $\exp(\beta_0)$, where $\beta_0$ is the intercept). This model for the mean is identical to the popular generalized linear model for the mean with a logarithmic link function.

More interestingly, it parallels the model used by PROC HPSEVERITY for the scale parameter $\theta$,

$$\theta = \theta_0 \exp\left( \sum_{j=1}^{k} \beta_j x_j \right)$$

where $\theta_0$ represents the base value of the scale parameter. As described in the section Tweedie Distributions, for the parameter range $p \in (1, 2)$, the mean of the Tweedie distribution is given by

$$\mu = \theta \lambda \frac{2 - p}{p - 1}$$

where $\lambda$ is the Poisson mean parameter of the scaled Tweedie distribution. This relationship enables you to use the scale regression model to infer the influence of regression effects on the mean of the distribution.
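The link between the scale model and the mean can be illustrated with a minimal Python sketch. The function names and all numeric values here are hypothetical and chosen only for illustration; they are not part of PROC HPSEVERITY. The sketch shows that, because the mean is proportional to the scale, the regressors multiply the mean by the same factor $\exp(\sum_j \beta_j x_j)$ that they apply to the scale.

```python
import math

def scale(theta0, betas, xs):
    """Scale regression model: theta = theta0 * exp(sum_j beta_j * x_j)."""
    return theta0 * math.exp(sum(b * x for b, x in zip(betas, xs)))

def tweedie_mean(theta, lam, p):
    """Mean of the scaled Tweedie distribution, valid for 1 < p < 2."""
    if not 1.0 < p < 2.0:
        raise ValueError("p must lie strictly between 1 and 2")
    return theta * lam * (2.0 - p) / (p - 1.0)

# Hypothetical parameter and regressor values (not taken from the example data).
theta0, lam, p = 1.0, 10.0, 1.5
betas = [0.25, -1.0, 3.0]
xs = [1.0, 0.5, 0.2]

mean_base = tweedie_mean(scale(theta0, [], []), lam, p)   # all regressors at zero
mean_x = tweedie_mean(scale(theta0, betas, xs), lam, p)   # regressors at xs

ratio = mean_x / mean_base
expected_ratio = math.exp(sum(b * x for b, x in zip(betas, xs)))
print(ratio, expected_ratio)
```

The two printed values agree up to floating-point rounding, which is the sense in which the coefficients $\beta_j$ are shared by the scale model and the mean model.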

Let the data set Work.Test_Sevtw contain a sample generated from a Tweedie distribution with dispersion parameter $\phi = 0.5$, index parameter $p = 1.75$, and the mean parameter that is affected by three regression variables x1, x2, and x3 as follows:

$$\mu = 5 \exp(0.25\, x_1 - x_2 + 3\, x_3)$$

Thus, the population values of regression parameters are $\mu_0 = 5$, $\beta_1 = 0.25$, $\beta_2 = -1$, and $\beta_3 = 3$. You can find the code used to generate the sample in the PROC HPSEVERITY sample program hsevex04.sas.
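For readers without access to the sample program, the following Python sketch shows one standard way to draw from a Tweedie distribution with $1 < p < 2$: as a Poisson number of gamma-distributed amounts. It is not the code in hsevex04.sas. The helper names, the seed, the sample size, and the choice to simulate only at $x_1 = x_2 = x_3 = 0$ are assumptions made for illustration, because the source does not state how the regressors were generated.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw from Poisson(lam) by Knuth's multiplication method (adequate for moderate lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def tweedie_draw(rng, mu, phi, p):
    """One Tweedie(mu, phi, p) draw, 1 < p < 2, as a compound Poisson-gamma sum."""
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))     # Poisson mean
    alpha = (2.0 - p) / (p - 1.0)                  # gamma shape of each amount
    theta = phi * (p - 1.0) * mu ** (p - 1.0)      # gamma scale of each amount
    n = poisson_draw(rng, lam)
    # A sum of n independent gamma(alpha, theta) amounts is gamma(n * alpha, theta).
    return rng.gammavariate(n * alpha, theta) if n > 0 else 0.0

def mean_given_x(x1, x2, x3):
    """Population mean model of this example."""
    return 5.0 * math.exp(0.25 * x1 - x2 + 3.0 * x3)

rng = random.Random(12345)            # arbitrary seed, for reproducibility only
mu = mean_given_x(0.0, 0.0, 0.0)      # base case: mu equals mu0 = 5
sample = [tweedie_draw(rng, mu, phi=0.5, p=1.75) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
print(mu, sample_mean)
```

With these definitions the product of the Poisson mean, the gamma shape, and the gamma scale equals $\mu$, so the sample mean should land close to 5.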

The following PROC HPSEVERITY step uses the sample in the Work.Test_Sevtw data set to estimate the parameters of the scale regression model for the predefined scaled Tweedie distribution (STWEEDIE) with the dual quasi-Newton (QUANEW) optimization technique:

/*--- Fit the scale parameter version of the Tweedie distribution ---*/
proc hpseverity data=test_sevtw outest=estw covout print=all;
   loss y;
   scalemodel x1-x3;

   dist stweedie;
   nloptions tech=quanew;
run;

The dual quasi-Newton technique is used because it requires only the first-order derivatives of the objective function, and it is difficult to compute reasonably accurate estimates of the second-order derivatives of the Tweedie distribution’s PDF with respect to the parameters.

Some of the key results prepared by PROC HPSEVERITY are shown in Output 21.4.1 and Output 21.4.2. The distribution information and the convergence results are shown in Output 21.4.1.

Output 21.4.1: Convergence Results for the STWEEDIE Model with Regressors

The HPSEVERITY Procedure
stweedie Distribution

Distribution Information
Name                       stweedie
Description                Tweedie Distribution with Scale Parameter
Distribution Parameters    3
Regression Parameters      3

Convergence Status
Convergence criterion (FCONV=2.220446E-16) satisfied.

Optimization Summary
Optimization Technique     Dual Quasi-Newton
Iterations                 41
Function Calls             196
Log Likelihood             -1044.3


The final parameter estimates of the STWEEDIE regression model are shown in Output 21.4.2. The estimate that is reported for the parameter Theta is the estimate of the base value $\theta_0$. The estimates of regression coefficients $\beta_1$, $\beta_2$, and $\beta_3$ are indicated by the rows of x1, x2, and x3, respectively.

Output 21.4.2: Parameter Estimates for the STWEEDIE Model with Regressors

Parameter Estimates
Parameter   DF   Estimate   Standard Error   t Value   Approx Pr > |t|
Theta        1    0.82500          0.25705      3.21            0.0015
Lambda       1   16.33948         12.22096      1.34            0.1823
P            1    1.75092          0.19347      9.05            <.0001
x1           1    0.27957          0.09874      2.83            0.0050
x2           1   -0.76688          0.10311     -7.44            <.0001
x3           1    3.03227          0.10139     29.91            <.0001


If your goal is to explain the influence of regression effects on the scale parameter, then the output displayed in Output 21.4.2 is sufficient. However, if you want to compute the influence of regression effects on the mean of the distribution, then you need to do some postprocessing. Using the relationship between $\mu$ and $\theta$, you can write $\mu$ in terms of the parameters of the STWEEDIE model as

$$\mu = \theta_0 \exp\left( \sum_{j=1}^{k} \beta_j x_j \right) \lambda \frac{2 - p}{p - 1}$$

This shows that the parameters $\beta_j$ are identical for the mean and the scale model, and the base value $\mu_0$ of the mean model is

$$\mu_0 = \theta_0 \lambda \frac{2 - p}{p - 1}$$
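As a quick check of this formula, the following Python sketch plugs the rounded estimates of Theta, Lambda, and P from Output 21.4.2 into the expression for the base value. The function name `base_mean` is hypothetical; only the three input numbers come from the output.

```python
def base_mean(theta0, lam, p):
    """Base value of the mean: mu0 = theta0 * lambda * (2 - p) / (p - 1)."""
    return theta0 * lam * (2.0 - p) / (p - 1.0)

# Rounded estimates of Theta, Lambda, and P as printed in Output 21.4.2.
mu0_hat = base_mean(theta0=0.82500, lam=16.33948, p=1.75092)
print(mu0_hat)
```

The result agrees with the value 4.47144 reported in Output 21.4.3 to about three decimal places; the small remaining gap is consistent with the inputs being rounded to five decimals.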

The estimate of $\mu_0$ and the standard error associated with it can be computed by using the property of functions of maximum likelihood estimators (MLE). If $g(\Omega)$ represents a totally differentiable function of parameters $\Omega$, then the MLE of $g$ has an asymptotic normal distribution with mean $g(\hat{\Omega})$ and covariance $C = (\partial \mathbf{g})' \, \Sigma \, (\partial \mathbf{g})$, where $\hat{\Omega}$ is the MLE of $\Omega$, $\Sigma$ is the estimate of the covariance matrix of $\Omega$, and $\partial \mathbf{g}$ is the gradient vector of $g$ with respect to $\Omega$ evaluated at $\hat{\Omega}$. For $\mu_0$, the function is $g(\Omega) = \theta_0 \lambda (2 - p) / (p - 1)$. The gradient vector is

$$
\begin{aligned}
\partial \mathbf{g} &= \left( \frac{\partial g}{\partial \theta_0} \quad \frac{\partial g}{\partial \lambda} \quad \frac{\partial g}{\partial p} \quad \frac{\partial g}{\partial \beta_1} \quad \ldots \quad \frac{\partial g}{\partial \beta_k} \right) \\
&= \left( \frac{\mu_0}{\theta_0} \quad \frac{\mu_0}{\lambda} \quad \frac{-\mu_0}{(p - 1)(2 - p)} \quad 0 \quad \ldots \quad 0 \right)
\end{aligned}
$$
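The delta-method computation can be sketched in Python as follows. This is not the DATA step from the sample program; the function names are hypothetical. The analytic gradient is compared with central finite differences as an independent check. Because the covariance estimates are not printed in this example, the covariance matrix used here is a clearly hypothetical placeholder: a diagonal matrix built from the standard errors in Output 21.4.2 with every covariance set to zero.

```python
import math

def base_mean(theta0, lam, p):
    """g(Omega) = theta0 * lambda * (2 - p) / (p - 1)."""
    return theta0 * lam * (2.0 - p) / (p - 1.0)

def gradient(theta0, lam, p, k):
    """Analytic gradient of g with respect to (theta0, lambda, p, beta_1, ..., beta_k)."""
    mu0 = base_mean(theta0, lam, p)
    return [mu0 / theta0, mu0 / lam, -mu0 / ((p - 1.0) * (2.0 - p))] + [0.0] * k

def delta_method_se(grad, cov):
    """Square root of grad' * cov * grad; cov is a square matrix given as nested lists."""
    n = len(grad)
    variance = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(variance)

# Rounded estimates from Output 21.4.2; k = 3 regressors.
theta0, lam, p, k = 0.82500, 16.33948, 1.75092, 3
grad = gradient(theta0, lam, p, k)

# Independent check of the first three entries by central finite differences.
h = 1e-6
numeric = [
    (base_mean(theta0 + h, lam, p) - base_mean(theta0 - h, lam, p)) / (2 * h),
    (base_mean(theta0, lam + h, p) - base_mean(theta0, lam - h, p)) / (2 * h),
    (base_mean(theta0, lam, p + h) - base_mean(theta0, lam, p - h)) / (2 * h),
]

# HYPOTHETICAL covariance matrix: squared standard errors on the diagonal,
# zero covariances. The real matrix comes from the procedure's covariance output.
std_errors = [0.25705, 12.22096, 0.19347, 0.09874, 0.10311, 0.10139]
size = len(std_errors)
cov_diag = [[std_errors[i] ** 2 if i == j else 0.0 for j in range(size)]
            for i in range(size)]
se_diag = delta_method_se(grad, cov_diag)
print(grad[:3], numeric, se_diag)
```

With the zero-covariance placeholder, the resulting standard error is far larger than the 0.42213 reported in Output 21.4.3. That gap suggests that the off-diagonal covariance terms among Theta, Lambda, and P contribute substantially, so the full covariance matrix is needed to reproduce the reported value.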

You can write a DATA step that implements these computations by using the parameter and covariance estimates prepared by the PROC HPSEVERITY step. The DATA step program is available in the sample program hsevex04.sas. The estimate of $\mu_0$ prepared by that program is shown in Output 21.4.3. This estimate and the estimates of $\beta_j$ shown in Output 21.4.2 are reasonably close (that is, within about two standard errors) to the parameters of the population from which the sample in the Work.Test_Sevtw data set was drawn.

Output 21.4.3: Estimate of the Base Value Mu0 of the Mean Parameter

Parameter   Estimate   Standard Error   t Value   Approx Pr > |t|
Mu0          4.47144          0.42213   10.5925                 0
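The closeness of the estimates to the population values can be quantified with a short Python sketch that expresses each deviation in units of the reported standard error. The estimates and standard errors are copied from Output 21.4.2 and Output 21.4.3, and the population values come from the mean model that generated the data; the variable names are hypothetical.

```python
# (name, estimate, standard error, population value)
rows = [
    ("Mu0", 4.47144, 0.42213, 5.0),
    ("x1", 0.27957, 0.09874, 0.25),
    ("x2", -0.76688, 0.10311, -1.0),
    ("x3", 3.03227, 0.10139, 3.0),
]

# Signed distance of each estimate from its population value, in standard errors.
deviations = {name: (est - true) / se for name, est, se, true in rows}

# The t value for Mu0 is the estimate divided by its standard error.
t_mu0 = 4.47144 / 0.42213

for name, z in deviations.items():
    print(f"{name}: {z:+.2f} standard errors from the population value")
print(f"t value for Mu0: {t_mu0:.4f}")
```

The coefficients of x1 and x3 fall well within one standard error, Mu0 within about one and a quarter, and the coefficient of x2 is the farthest at a little over two standard errors.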


Another outcome of using the scaled Tweedie distribution to model the influence of regression effects is that the regression effects also influence the variance $V$ of the Tweedie distribution. The variance is related to the mean as $V = \phi \mu^p$, where $\phi$ is the dispersion parameter. Using the relationship between the parameters of the TWEEDIE and STWEEDIE distributions as described in the section Tweedie Distributions, the regression model for the dispersion parameter is

$$
\begin{aligned}
\log(\phi) &= (2 - p) \log(\mu) - \log(\lambda (2 - p)) \\
&= \left( (2 - p) \log(\mu_0) - \log(\lambda (2 - p)) \right) + (2 - p) \sum_{j=1}^{k} \beta_j x_j
\end{aligned}
$$

Subsequently, the regression model for the variance is

$$
\begin{aligned}
\log(V) &= 2 \log(\mu) - \log(\lambda (2 - p)) \\
&= \left( 2 \log(\mu_0) - \log(\lambda (2 - p)) \right) + 2 \sum_{j=1}^{k} \beta_j x_j
\end{aligned}
$$
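The two regression models above can be evaluated with a small Python sketch. The parameter values are the estimates from Output 21.4.2 and Output 21.4.3; the function names and the regressor values in `xs` are hypothetical and serve only to illustrate the computation. The final comparison confirms numerically that the two formulas are consistent with $V = \phi \mu^p$.

```python
import math

def linear_predictor(betas, xs):
    """sum_j beta_j * x_j."""
    return sum(b * x for b, x in zip(betas, xs))

def log_dispersion(mu0, lam, p, betas, xs):
    """log(phi) = ((2 - p) log(mu0) - log(lambda (2 - p))) + (2 - p) sum_j beta_j x_j."""
    return ((2.0 - p) * math.log(mu0) - math.log(lam * (2.0 - p))
            + (2.0 - p) * linear_predictor(betas, xs))

def log_variance(mu0, lam, p, betas, xs):
    """log(V) = (2 log(mu0) - log(lambda (2 - p))) + 2 sum_j beta_j x_j."""
    return (2.0 * math.log(mu0) - math.log(lam * (2.0 - p))
            + 2.0 * linear_predictor(betas, xs))

# Estimates from Output 21.4.2 and Output 21.4.3.
mu0, lam, p = 4.47144, 16.33948, 1.75092
betas = [0.27957, -0.76688, 3.03227]
xs = [1.0, 1.0, 0.5]   # hypothetical regressor values

mu = mu0 * math.exp(linear_predictor(betas, xs))
phi = math.exp(log_dispersion(mu0, lam, p, betas, xs))
variance = math.exp(log_variance(mu0, lam, p, betas, xs))

# Consistency check: V should equal phi * mu**p.
print(variance, phi * mu ** p)
```

Both printed values agree up to rounding, which reflects the identity $\phi \mu^p = \mu^2 / (\lambda (2 - p))$ implied by the dispersion model.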

In summary, PROC HPSEVERITY enables you to estimate regression effects on various parameters and statistics of the Tweedie model.

Last updated: June 19, 2025