The diagnostic step creates a subset of candidate models for use in the Model Selection Step. There are three main tasks to create candidate models:
The purpose of the diagnostic step is to create a subset of the list of potential candidate models based on characteristics of the time series data. The list of potential candidate models is first limited to the model families that you specify. The following table shows the model families that might be considered during the diagnostic process.
| Family | Description |
|---|---|
| ARIMAX | ARIMA model that includes predictors that use ARIMA-REG identification order |
| ESM | Seasonal and nonseasonal exponential smoothing models |
| IDM | Intermittent demand (IDM) model |
| REGARIMA | ARIMA model that includes predictors that use REG-ARIMA identification order |
| UCM | UCM model with predictors |
To create a subset of models, the time series data for each series is tested for certain characteristics that determine whether a type of model or model feature is appropriate for the data. For example, time series with seasonal trends (deterministic or stochastic) should be forecast with models that have a seasonal component. Thus, a seasonality test is performed, and based on the results, the appropriate models are included in the list of candidate models. Likewise, inappropriate models are excluded from the list of candidate models.
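As a rough illustration of how a test result can narrow the candidate list, the following Python sketch flags a series as seasonal when its sample autocorrelation at the seasonal lag exceeds a cutoff, and then keeps only the matching models. The cutoff of 0.5, the function names, and the candidate list are assumptions made for this example. They are not the test or the model list that the product uses, and a strongly trending series would need differencing before a check this simple is meaningful.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0 or lag >= len(series):
        return 0.0  # constant or too-short series: no measurable correlation
    num = sum((series[t] - mu) * (series[t - lag] - mu)
              for t in range(lag, len(series)))
    return num / denom


def is_seasonal(series, period, threshold=0.5):
    """Hypothetical seasonality test: compare the autocorrelation at the
    seasonal lag with an assumed cutoff."""
    return autocorrelation(series, period) > threshold


# Hypothetical candidate list: (model name, has a seasonal component)
CANDIDATES = [
    ("simple ESM", False),
    ("seasonal ESM", True),
    ("nonseasonal ARIMA", False),
    ("seasonal ARIMA", True),
]


def select_candidates(series, period):
    """Keep only the models whose seasonal feature matches the test result."""
    seasonal = is_seasonal(series, period)
    return [name for name, has_seasonal in CANDIDATES if has_seasonal == seasonal]
```

For a series that repeats every four periods, `select_candidates(series, 4)` would keep only the two seasonal entries, while a constant series would keep only the nonseasonal ones.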
The importance of the diagnostic step should not be underestimated. Applying a seasonal model to a nonseasonal time series, particularly one with a short history, can lead to over-parameterization or false seasonality. Applying a linear model to a nonlinear time series can lead to underestimation of the growth (or decline). Applying a non-intermittent model to an intermittent series will result in predictions biased toward zero.
If it is known that a time series has a particular characteristic, then the diagnostics should be overridden and the appropriate model should be used. For example, if the time series is known to be seasonal, the diagnostics should be overridden to always choose a seasonal model.
The following table shows the tests that are performed during the diagnostic process to identify model candidates.
| Test | Description |
|---|---|
| INTERMITTENCY | Test for an intermittent demand (IDM) model |
| MEAN | Test for a mean (constant) component to be included in the model |
| SEASONALITY | Test for a seasonal component to be included in the model |
| TRANSFORM | Test for a functional transformation to apply to the independent variable |
| TREND | Test for a trend component to be included in the model |
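To make one of these tests concrete, the sketch below shows a common heuristic for intermittency: measure the number of periods between consecutive nonzero demands and call the series intermittent when the average interval exceeds a threshold. The threshold of 2, the function names, and the handling of series with fewer than two demands are assumptions for this example, not a description of the test the software performs.

```python
def demand_intervals(series):
    """Number of periods between consecutive nonzero observations."""
    positions = [i for i, value in enumerate(series) if value != 0]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def is_intermittent(series, threshold=2.0):
    """Hypothetical intermittency test: the average demand interval
    must exceed an assumed threshold."""
    intervals = demand_intervals(series)
    if not intervals:
        # Fewer than two demands: no interval can be measured.
        # This sketch returns False rather than guessing.
        return False
    return sum(intervals) / len(intervals) > threshold
```

A series such as `[0, 0, 3, 0, 0, 0, 5, 0, 0, 2]` has demand intervals of 4 and 3 periods, so this check would route it toward an intermittent demand model.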
Several causal factors might influence the dependent time series. The diagnostics for the multivariate time series determine which of the causal factors significantly influence the dependent time series. These diagnostics include cross-correlation analysis and transfer function analysis.
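A simplified view of the cross-correlation part of this analysis is sketched below in Python. It computes the sample correlation between a lagged input series and the dependent series, then reports the lags whose correlation exceeds the approximate 95% bound of 2/sqrt(n). The function names are hypothetical, and the sketch assumes both series have the same length and have already been made stationary. A fuller analysis would typically prewhiten the series first, and transfer function identification is not shown.

```python
from math import sqrt
from statistics import mean


def cross_correlation(x, y, lag):
    """Sample correlation between x[t - lag] and y[t], for lag >= 0."""
    mx, my = mean(x), mean(y)
    sx = sqrt(sum((v - mx) ** 2 for v in x))
    sy = sqrt(sum((v - my) ** 2 for v in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no correlation information
    cov = sum((x[t - lag] - mx) * (y[t] - my) for t in range(lag, len(y)))
    return cov / (sx * sy)


def significant_lags(x, y, max_lag):
    """Lags 0..max_lag whose cross-correlation exceeds 2/sqrt(n) in magnitude."""
    bound = 2 / sqrt(len(y))
    return [lag for lag in range(max_lag + 1)
            if abs(cross_correlation(x, y, lag)) > bound]
```

An input whose effect reaches the dependent series two periods later would be expected to show a significant correlation at lag 2, which suggests including that input with a two-period delay.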
Once again, if it is known that a particular causal factor can influence the dependent time series, then the diagnostics should be overridden and the appropriate model should be used.
The following table shows the types of causal factors that might be used in the candidate models.
| Causal Factor Type | Description |
|---|---|
| INPUTS | Input variables that you supply |
| EVENTS | Event dummy variables created using an event definition |
| OUTLIERS | Outliers that you specify or that are identified by outlier detection |
| SEASONAL DUMMIES | Seasonal component specified using dummy variables |