SSM Procedure

Example 33.18 Invariance of the Marginal Likelihood under Linear Rescaling of the Diffuse Effects

Consider the following alternative but equivalent specifications of a trend-plus-seasonal model (monthly seasonality):

$$
\begin{aligned}
y_t &= \mu_t + \psi_t + \epsilon_t && \text{(Spec 1)} \\
y_t &= \mu_t + m_{1,t} + \cdots + m_{11,t} + \epsilon_t && \text{(Spec 2)}
\end{aligned}
$$

Here the trend ($\mu_t$, a random walk with drift) and the irregular component ($\epsilon_t$, white noise) are the same in both specifications. However, the seasonal component is specified differently: in Spec1 the seasonality is modeled as a deterministic trigonometric seasonal component ($\psi_t$), whereas in Spec2 it is modeled by using the seasonal dummies ($m_{1,t}, \ldots, m_{11,t}$). Spec1 and Spec2 are statistically equivalent models from the perspective of the data generation process. This example uses these two specifications to demonstrate a useful invariance property of the marginal and profile likelihoods, which is described in the section Likelihood Computation and Model-Fitting Phase. The airline passenger series, given as Series G in Box and Jenkins (1976), is used to illustrate the computations. The following DATA step prepares the log-transformed passenger series and the seasonal dummies that are needed for this example:

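data seriesG;
   set sashelp.air;
   /* Log-transform the monthly passenger counts */
   logair = log(air);
   /* m1-m11 are 0/1 indicators for January through November */
   array m{11} m1-m11;
   do i=1 to 11;
      m[i] = (month(date)=i);
   end;
run;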
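As a quick sanity check on the DATA step above, the following sketch prints the first 12 observations so that the dummy coding can be inspected. Each of m1 through m11 is an indicator for the corresponding month, so December observations have all eleven dummies equal to zero and act as the reference month. This check is not part of the original example; the choice of PROC PRINT and of 12 observations is illustrative only.

```sas
/* Illustrative check, not part of the original example:        */
/* print the first year of data to inspect the seasonal dummies. */
proc print data=seriesG(obs=12) noobs;
   var date logair m1-m11;
run;
```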

The following statements fit the two models to the log-transformed passenger series. The first PROC SSM call fits Spec1, and the second call fits Spec2.

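proc ssm data=seriesG plots=none like=marginal;
   id date interval=month;
   /* Random walk with drift: local linear trend with the slope variance fixed at 0 */
   trend rwDrift(ll) slopevar=0;
   /* White noise irregular component */
   irregular wn;
   /* Deterministic trigonometric seasonal component with season length 12 */
   state trigState(1) type=season(length=12);
   comp season = trigState[1];
   model logair = rwDrift season wn;
run;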

proc ssm data=seriesG plots=none like=marginal;
   id date interval=month;
   trend rwDrift(ll) slopevar=0;
   irregular wn;
   /* Seasonality enters through the seasonal dummies m1-m11 */
   model logair = rwDrift m1-m11 wn;
run;

For these two models, the parameter estimates that are based on the diffuse likelihood (REML_D) and the marginal likelihood (REML_M) coincide because the extra term in the marginal likelihood turns out to be independent of these parameters. Nevertheless, it is useful to specify the LIKE=MARGINAL option in the PROC SSM statement so that both the likelihood computation summary and the information criteria tables display the likelihood values and the information criteria for all three likelihoods (diffuse, marginal, and profile) at the estimated parameters. The parameter estimates for Spec1 and Spec2 are displayed in Output 33.18.1 and Output 33.18.2, respectively. As expected, the parameter estimates for the two specifications are the same because they are statistically equivalent models. The other aspects of the fit (such as model-based forecasts), which are not shown, also agree.

Output 33.18.1: Parameter Estimates For Spec1

Model Parameter Estimates
Component	Type	Parameter	Estimate	Standard Error	t Value
rwDrift	LL Trend	Level Variance	0.000766	0.000219	3.49
wn	Irregular	Variance	0.000368	0.000141	2.60

Output 33.18.2: Parameter Estimates For Spec2

Model Parameter Estimates
Component	Type	Parameter	Estimate	Standard Error	t Value
rwDrift	LL Trend	Level Variance	0.000766	0.000219	3.49
wn	Irregular	Variance	0.000368	0.000141	2.60

The fit summary tables shown in Output 33.18.3 (for Spec1) and Output 33.18.4 (for Spec2) show that the marginal and profile likelihoods (the last two lines in each table) for the two specifications also agree. However, the diffuse likelihood values for the two specifications differ (215.45 for Spec1 and 226.90 for Spec2). This difference occurs because the diffuse likelihood is not invariant to the different (but equivalent) formulations of the seasonal effects. Consequently, the information criteria that are based on the marginal and profile likelihoods, which are shown in Output 33.18.5 (for Spec1) and Output 33.18.6 (for Spec2), correctly indicate that the two specifications cannot be distinguished on the basis of these criteria, whereas the information criteria that are based on the diffuse likelihood erroneously suggest that Spec1 is inferior to Spec2.

Output 33.18.3: Likelihood Computation Summary For Spec1

Likelihood Computation Summary
Statistic	Value
Nonmissing Response Values Used	144
Estimated Parameters	2
Initialized Diffuse State Elements	13
Normalized Residual Sum of Squares	131
Diffuse Log Likelihood	215.4522
Profile Log Likelihood	265.63882
Marginal Log Likelihood	248.01412

Output 33.18.4: Likelihood Computation Summary For Spec2

Likelihood Computation Summary
Statistic	Value
Nonmissing Response Values Used	144
Estimated Parameters	2
Initialized Diffuse State Elements	13
Normalized Residual Sum of Squares	131
Diffuse Log Likelihood	226.8959
Profile Log Likelihood	265.63882
Marginal Log Likelihood	248.01412
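
The pattern in the two summaries above can be motivated by a short sketch in generic regression notation. The symbols used here ($X$, $T$, $\Sigma_\theta$, $r_\theta$, $c$) are assumptions introduced for illustration and are not the notation of the PROC SSM documentation; the likelihoods that PROC SSM actually computes are defined in the section Likelihood Computation and Model-Fitting Phase. Suppose the diffuse effects enter the model through a design matrix $X$, and an equivalent specification uses $X^{*} = XT$ for a nonsingular matrix $T$. If the diffuse log likelihood has the usual restricted-likelihood form and the marginal log likelihood adds the term $\tfrac{1}{2}\log|X'X|$, then:

```latex
\begin{aligned}
\mathcal{L}_d(\theta; X) &= c - \tfrac{1}{2}\log|\Sigma_\theta|
   - \tfrac{1}{2}\log|X'\Sigma_\theta^{-1}X|
   - \tfrac{1}{2}\, r_\theta'\Sigma_\theta^{-1} r_\theta \\
\log|X^{*\prime}\Sigma_\theta^{-1}X^{*}| &= \log|X'\Sigma_\theta^{-1}X| + 2\log|\det T| \\
\mathcal{L}_d(\theta; X^{*}) &= \mathcal{L}_d(\theta; X) - \log|\det T| \\
\mathcal{L}_m(\theta; X) &= \mathcal{L}_d(\theta; X) + \tfrac{1}{2}\log|X'X|,
   \qquad \log|X^{*\prime}X^{*}| = \log|X'X| + 2\log|\det T| \\
\mathcal{L}_m(\theta; X^{*}) &= \mathcal{L}_m(\theta; X)
\end{aligned}
```

Here $r_\theta$ denotes the generalized least squares residual vector, which does not change because $X$ and $X^{*}$ span the same column space. Under these assumptions the diffuse log likelihood shifts by a constant that depends only on $T$, while the two $\log|\det T|$ contributions cancel in the marginal log likelihood. This is consistent with the tables: the diffuse values differ by $226.8959 - 215.4522 = 11.4437$, whereas the marginal value is $248.01412$ in both.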

Output 33.18.5: Information Criteria For Spec1

Information Criteria
Statistic	Diffuse Likelihood Based	Profile Likelihood Based	Marginal Likelihood Based
AIC (lower is better)	-426.9044	-501.2776	-492.0282
BIC (lower is better)	-421.1540	-456.7304	-486.2779
AICC (lower is better)	-426.8106	-497.5276	-491.9345
HQIC (lower is better)	-424.5678	-483.1762	-489.6916
CAIC (lower is better)	-419.1540	-441.7304	-484.2779

Output 33.18.6: Information Criteria For Spec2

Information Criteria
Statistic	Diffuse Likelihood Based	Profile Likelihood Based	Marginal Likelihood Based
AIC (lower is better)	-449.7918	-501.2776	-492.0282
BIC (lower is better)	-444.0414	-456.7304	-486.2779
AICC (lower is better)	-449.6981	-497.5276	-491.9345
HQIC (lower is better)	-447.4552	-483.1762	-489.6916
CAIC (lower is better)	-442.0414	-441.7304	-484.2779
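
As a check on the two tables above, the tabulated values are consistent with the standard definitions $\mathrm{AIC} = -2\mathcal{L} + 2k$ and $\mathrm{BIC} = -2\mathcal{L} + k\log n$, where $\mathcal{L}$ is the log likelihood at the estimated parameters. The parameter counts and sample sizes used below ($k = 2$ and $n = 144 - 13 = 131$ for the diffuse and marginal columns, $k = 2 + 13 = 15$ and $n = 144$ for the profile column) are inferred from the reported numbers rather than quoted from the PROC SSM documentation, so treat them as assumptions.

```latex
\begin{aligned}
\mathrm{AIC}_{d}^{\text{Spec1}} &= -2(215.4522) + 2(2) = -426.9044 \\
\mathrm{AIC}_{d}^{\text{Spec2}} &= -2(226.8959) + 2(2) = -449.7918 \\
\mathrm{AIC}_{m} &= -2(248.01412) + 2(2) \approx -492.0282 \\
\mathrm{AIC}_{p} &= -2(265.63882) + 2(15) \approx -501.2776 \\
\mathrm{BIC}_{d}^{\text{Spec1}} &= -2(215.4522) + 2\log(131) \approx -430.9044 + 9.7504 = -421.1540 \\
\mathrm{BIC}_{p} &= -2(265.63882) + 15\log(144) \approx -531.2776 + 74.5472 = -456.7304 \\
\mathrm{AIC}_{d}^{\text{Spec2}} - \mathrm{AIC}_{d}^{\text{Spec1}} &= -22.8874 = -2(226.8959 - 215.4522)
\end{aligned}
```

The last line shows that the entire gap between the diffuse-likelihood-based AIC values of the two specifications comes from the difference in their diffuse log likelihoods, because both specifications have the same number of estimated parameters.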

This example highlights the care that must be taken when selecting models on the basis of information criteria. It suggests that information criteria that are based on the marginal and profile likelihoods are preferable to information criteria that are based on the diffuse likelihood.

Last updated: June 19, 2025