Detailed Description of Demand Classifications

The following topics describe how demand classification values are determined for each time series (an aggregation of transactional data into specified time intervals, sorted according to unique combinations of the default attributes, or BY variables). Some of these patterns are determined by the seasonality value (a regular change in time series data values that occurs at the same point in each time cycle) associated with each time series, which is described here: Seasonality.

Categorical Attributes

These attributes generate simple categorical values, such as Y (yes) and N (no).

Retired

This attribute classifies the time series as either no longer active (Y) or still active (N). To determine if a time series is retired, the following calculations are performed. Missing values are not treated as zero unless Missing interpretation is set to 0 for the dependent variable.

The number of trimmed observations is determined by removing the missing observations.
The retired threshold is defined as max(1, floor(seasonality/26)).
The active demand period is determined by removing all leading and trailing zero values from the time series.
The nonzero_demand{} array stores all of the nonzero observations within the active demand period.
The demand_interval{} array stores all demand intervals in active demand periods.
The demand_cycle_length{} array stores the length of each set of consecutive nonzero values within the active demand period.
The gap_interval_length{} array stores the length of each set of consecutive zero values within the active demand period.

The time series is retired if both of these conditions are true:

(number of trimmed observations + length of trailing zeros) > (seasonality + retired threshold)
Any one of the following conditions is true:
- Length of trailing zeros > (seasonality + retired threshold)
- Length of trailing zeros > MAX(gap_interval_length{}) + retired threshold
- If the DEMAND_SPAN attribute is YEAR_ROUND, then the length of trailing zeros > MAX(demand_interval{}) + retired threshold
- If the DEMAND_SPAN attribute is INSEASON, then MIN(demand_cycle_length{}) + length of trailing zeros > (seasonality + retired threshold)
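
The retired rule above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the product's implementation: the names retired_threshold, run_lengths, and is_retired are hypothetical, missing values are assumed to have been removed already, the demand_interval{} values are passed in rather than derived, and a condition whose array is empty is simply skipped.

```python
from itertools import groupby
from math import floor


def retired_threshold(seasonality):
    # max(1, floor(seasonality/26)), as defined in the text
    return max(1, floor(seasonality / 26))


def run_lengths(values):
    """Return (demand_cycle_length, gap_interval_length, trailing_zeros).

    Assumes missing values were removed beforehand. An all-zero series
    is not handled specially in this sketch.
    """
    n = len(values)
    start = 0
    while start < n and values[start] == 0:
        start += 1          # skip leading zeros
    end = n
    while end > start and values[end - 1] == 0:
        end -= 1            # skip trailing zeros
    trailing_zeros = n - end
    active = values[start:end]  # active demand period
    cycles, gaps = [], []
    for is_nonzero, run in groupby(active, key=lambda v: v != 0):
        (cycles if is_nonzero else gaps).append(len(list(run)))
    return cycles, gaps, trailing_zeros


def is_retired(n_trimmed, trailing_zeros, seasonality, gaps,
               demand_span=None, demand_intervals=(), cycles=()):
    thr = retired_threshold(seasonality)
    limit = seasonality + thr
    # First condition: the series must be long enough overall.
    if not (n_trimmed + trailing_zeros > limit):
        return False
    # Second condition: any one of the listed checks is true.
    checks = [trailing_zeros > limit]
    if gaps:  # assumption: skipped when there are no gaps
        checks.append(trailing_zeros > max(gaps) + thr)
    if demand_span == "YEAR_ROUND" and demand_intervals:
        checks.append(trailing_zeros > max(demand_intervals) + thr)
    if demand_span == "INSEASON" and cycles:
        checks.append(min(cycles) + trailing_zeros > limit)
    return any(checks)
```

For example, with a seasonality of 12 the retired threshold is 1, so a series with 20 trimmed observations, 4 trailing zeros, and a longest gap of 2 is retired, because 24 > 13 and 4 > 2 + 1.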

Short

This attribute classifies the time series as either having a short time span (Y) or not short (N). The following SAS function defines the short threshold.

ceil(seasonality/4)

If seasonality/4 is not an integer, the ceil() function rounds it up to the next larger integer.

If the total number of observations is less than or equal to the short threshold, the series is classified as a short time series.
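
As a quick check of the arithmetic, the short rule can be sketched in Python. The function names short_threshold and is_short are hypothetical and are used here only for illustration.

```python
from math import ceil


def short_threshold(seasonality):
    # ceil(seasonality/4), as given in the text
    return ceil(seasonality / 4)


def is_short(n_obs, seasonality):
    # Short when the observation count does not exceed the threshold
    return n_obs <= short_threshold(seasonality)
```

With a seasonality of 12 the threshold is 3, so a series with 3 observations is short and a series with 4 is not. With a seasonality of 7, 7/4 = 1.75 rounds up to a threshold of 2.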

Seasonal

This attribute classifies time series as seasonal (Y), not seasonal (N), or not determined (ND) when seasonality cannot be determined. The seasonality threshold is seasonality + 9.

If the time series length is less than or equal to the seasonality threshold, then seasonality is ND.

The remaining time series are fit with two AR(1) models, one with a seasonal dummy and one without. Seasonality is N if one of the following is true:

The AR(1) model without the seasonal dummy has a smaller AIC than the model with the seasonal dummy.
The AR(1) model without the seasonal dummy has an F-test statistic of 0.05 or better.

The remaining time series are classified as seasonal (Y).
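
The decision order above can be sketched in Python. This sketch does not fit the AR(1) models. It assumes the two AIC values have already been computed elsewhere, and it treats the F-test criterion as an opaque boolean supplied by the caller. The function name classify_seasonal and its parameters are hypothetical.

```python
def classify_seasonal(n_obs, seasonality,
                      aic_with_dummy=None, aic_without_dummy=None,
                      f_test_condition=False):
    # Too short to decide: length <= seasonality + 9
    if n_obs <= seasonality + 9:
        return "ND"
    # Not seasonal if the model without the dummy has the smaller AIC,
    # or if the F-test condition from the text holds (passed in as a flag).
    if aic_without_dummy < aic_with_dummy or f_test_condition:
        return "N"
    return "Y"
```

With a seasonality of 12 the seasonality threshold is 21, so a series of 21 observations is ND regardless of the model fits.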

Intermittent

This attribute classifies time series as either intermittent (Y) or not intermittent (N).

The threshold for intermittency is 2. Intermittency is determined by computing the median of the length of contiguous constant periods (demand intervals). It applies the INTERMITTENCYTEST method. If the test result is less than the threshold, the time series is intermittent.

Seasonal Intermittent

This attribute classifies time series as either seasonal intermittent (Y), not seasonal intermittent (N), or cannot be determined (ND). The value is determined by the flow shown in Demand Class and Demand Span Flows. For the time series that are classified as YEAR_ROUND, the INTERMITTENCYTEST method is applied.

Then the time series are accumulated one level up from the time interval level. For example, if the time interval is "MONTH", then the accumulated series interval is "QTR". Next, the SEASONTEST method is applied.

Volume

This attribute is used to classify the time series in terms of volume: LOW, MEDIUM, or HIGH.

If the maximum value of the accumulated series is less than or equal to 5 and the maximum value of the accumulated index series is less than 0, then the series volume is LOW.

The remaining time series are ranked from highest to lowest using the mean. The top ten percent of these time series are designated as HIGH and the remaining time series are MEDIUM.
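
A minimal Python sketch of this two-step rule follows. It assumes the three summary values per series are already available, and it rounds the top ten percent down to a whole number of series. The text does not specify the rounding, so that choice is an assumption, as is the function name classify_volume.

```python
def classify_volume(stats):
    """stats maps a series name to
    (max_accumulated, max_accumulated_index, mean)."""
    labels = {}
    remaining = []
    for name, (max_acc, max_idx, mean) in stats.items():
        if max_acc <= 5 and max_idx < 0:
            labels[name] = "LOW"
        else:
            remaining.append((mean, name))
    # Rank the remaining series from highest to lowest mean.
    remaining.sort(reverse=True)
    # Assumption: a fractional count of top series is rounded down.
    n_high = len(remaining) // 10
    for rank, (_, name) in enumerate(remaining):
        labels[name] = "HIGH" if rank < n_high else "MEDIUM"
    return labels
```

With ten series that are not LOW, only the one with the largest mean is designated HIGH under this rounding choice.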

Volatility

This attribute classifies the amount of volatility in the time series as either HIGH or LOW.

Volatility is measured starting from these two quantities:

the mean absolute error (MAE) from the exponential smoothing model (ESM)
the irregular component from the seasonal decomposition

The volatility measure is computed by taking the average or median of both measurements. The time series are ranked by volatility measure. The top ten percent are classified as HIGH and the remainder are classified as LOW.

Measured Attributes

These attributes are measurements taken from each time series.

Trailing Zero Length

The number of trailing zeros at the end of each time series.

Maximum Cycle Length

The maximum value of the demand cycle lengths detected from the time series.

Demand Interval Measure

The median value of the demand intervals. This value is used for detecting whether the series is intermittent.

Volume Measure

The measurement of volume for the dependent variable of each time series.

Volatility Measure

The volatility measurement value of the time series. Volatility is determined by first evaluating the error from the exponential smoothing model (ESM) and the irregular component from the seasonal decomposition. The errors from the two models are measured using MAE. The volatility measure is computed by taking the average or median of measurements from both methods.

Status

A value that indicates whether analysis for each time series was successful.

0 - Analysis was successful
3000 - Accumulation failed
4000 - Missing value interpretation failed
6000 - Series is all missing
9000 - Descriptive statistics could not be computed

Combination Attributes

These attributes use combinations of other attributes to obtain the values.

Volume Volatility Class

Not all time series are equally predictable. In a large-scale forecasting project, some time series are difficult to forecast automatically. (A forecast is a numerical prediction of a future value for a specified time period for each unique combination of BY variable values.) The Volume Volatility Class attribute is designed to identify the time series that are high in volume and volatility. Time series with these characteristics are especially time-consuming for forecasters to evaluate.

This attribute classifies the combination of volume and volatility in the time series. The attribute first identifies the time series that have been classified as SHORT, RETIRED, and INTERMITTENT and removes them from consideration.

Volume Volatility Class Flow

The remaining time series use the Volume classification. Time series with a volume designation of MEDIUM and LOW are classified as LOW. The remaining time series are already classified as HIGH.
The time series classified as LOW and HIGH use the Volatility classification to determine the final combination of values.
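
The flow above can be sketched in Python. The source does not list the names of the final class values, so this sketch returns a (volume group, volatility) pair and returns None for excluded series. Both conventions, and the function name volume_volatility_class, are assumptions made for illustration.

```python
def volume_volatility_class(volume, volatility,
                            short="N", retired="N", intermittent="N"):
    # Short, retired, and intermittent series are removed from consideration.
    if "Y" in (short, retired, intermittent):
        return None
    # MEDIUM and LOW volume are both treated as LOW; HIGH stays HIGH.
    volume_group = "HIGH" if volume == "HIGH" else "LOW"
    # The Volatility classification (HIGH or LOW) completes the combination.
    return (volume_group, volatility)
```

For example, a series with MEDIUM volume and HIGH volatility yields ("LOW", "HIGH"), and any series flagged as short yields None.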

Demand Span

This attribute is used to classify the time series based on demand cycles. Possible values are cycles that occur throughout the year (YEAR_ROUND), only during certain seasons of the year (INSEASON), or not determined (ND). The flow for the Demand Span attribute is included in Demand Class and Demand Span Flows.

If Short is Y or if Volume is LOW for any time series, the time series is ND.

For the remainder of the time series, the demand span threshold is determined by the following SAS function:

Ceil(3*seasonality/4)

The time series are analyzed to identify zero demands below the demand span threshold. Next, the time series are analyzed after leading and trailing zeros are removed, to determine whether there is a demand gap. Demand gaps are consecutive zero demand periods that are longer than the demand span threshold. Demand gaps identify demand cycles. Based on the length of the demand cycles, the time series data is classified into long time span series or short time span series.

When a series has at least one demand cycle, the time series is INSEASON if the following are true:

The maximum demand cycle length is less than or equal to the demand span threshold.
The number of trimmed observations is less than or equal to the demand span threshold.

The following two cases are YEAR_ROUND series:

a time series that has at least one demand cycle and is not INSEASON
a time series that has zero demand cycles and a number of trimmed observations that is greater than the demand span threshold

Any remaining time series are classified as ND.
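
The classification rules above can be sketched in Python. This sketch assumes the demand cycle lengths and the number of trimmed observations have already been computed, and it does not perform the demand gap detection itself. The names demand_span_threshold and classify_demand_span are hypothetical.

```python
from math import ceil


def demand_span_threshold(seasonality):
    # ceil(3*seasonality/4), as given in the text
    return ceil(3 * seasonality / 4)


def classify_demand_span(short, volume, seasonality,
                         cycle_lengths, n_trimmed):
    # Short or low-volume series are not determined.
    if short == "Y" or volume == "LOW":
        return "ND"
    thr = demand_span_threshold(seasonality)
    if cycle_lengths:  # at least one demand cycle
        if max(cycle_lengths) <= thr and n_trimmed <= thr:
            return "INSEASON"
        return "YEAR_ROUND"
    # No demand cycles: year-round only if the series is long enough.
    if n_trimmed > thr:
        return "YEAR_ROUND"
    return "ND"
```

With a seasonality of 12 the demand span threshold is 9. A series with demand cycles of lengths 4 and 5 and 9 trimmed observations is INSEASON, while a series with no demand cycles and 30 trimmed observations is YEAR_ROUND.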

Demand Class

This attribute is used to determine how to segment the time series for the Demand Classification pipeline. It uses the Retired, Demand Span, Intermittent, Seasonal, and Seasonal Intermittent attributes to define the segments. Demand Class and Demand Span Flows shows the flow for how the attributes are derived. This figure also includes the flows for the Demand Span attribute to show how the SHORT, LOW_VOLUME, and OTHER values are calculated.

Demand Class and Demand Span Flows

The pipeline segments that are defined for each Demand Class attribute are described below. You can make changes to each pipeline and to the modeling nodes in each pipeline.

SHORT: Time series with a short record of historical data. This could be a new series with only a few observations. The Naive (Moving Average) Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
LOW_VOLUME: Time series with low volumes. The Naive Forecasting pipeline is selected for this segment. Seasonal random walk is already selected as the naive model type.
INSEASON_INTERMITTENT: Short time span series with intermittent patterns. The Regression Forecasting pipeline is selected for this segment.
INSEASON_NON_INTERMITTENT: Short time span series without intermittent patterns. The Regression Forecasting pipeline is selected for this segment.
YEAR_ROUND_INTERMITTENT: Long time span series with intermittent patterns. The Auto-forecasting model (Intermittent) pipeline is selected for this segment. Only the IDM model is selected for inclusion.
YEAR_ROUND_SEASONAL: Long time span series with seasonal patterns. The Seasonal Forecasting pipeline is selected for this segment.
YEAR_ROUND_NON_SEASONAL: Long time span series without seasonal patterns. The Non-seasonal Forecasting pipeline is selected for this segment.
YEAR_ROUND_SEASONAL_INTERMITTENT: Long time span series with seasonal and intermittent patterns. The Temporal Aggregation Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
YEAR_ROUND_OTHER: Long time span series with no patterns that can be classified. The Naive (Moving Average) Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
OTHER: Time series that do not span long time periods and cannot be classified. The Naive (Moving Average) Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
RETIRED: Time series that are retired or are no longer active. The Retired Series model is selected for this segment.
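
The segment list above amounts to a lookup from Demand Class value to default pipeline. A minimal Python sketch of that lookup is shown here for reference. The dictionary name DEFAULT_PIPELINE and the helper default_pipeline are hypothetical. The pipeline names are copied from the list above.

```python
# Default pipeline (or model) per Demand Class value, as listed in the text.
DEFAULT_PIPELINE = {
    "SHORT": "Naive (Moving Average) Forecasting",
    "LOW_VOLUME": "Naive Forecasting",
    "INSEASON_INTERMITTENT": "Regression Forecasting",
    "INSEASON_NON_INTERMITTENT": "Regression Forecasting",
    "YEAR_ROUND_INTERMITTENT": "Auto-forecasting model (Intermittent)",
    "YEAR_ROUND_SEASONAL": "Seasonal Forecasting",
    "YEAR_ROUND_NON_SEASONAL": "Non-seasonal Forecasting",
    "YEAR_ROUND_SEASONAL_INTERMITTENT": "Temporal Aggregation Forecasting",
    "YEAR_ROUND_OTHER": "Naive (Moving Average) Forecasting",
    "OTHER": "Naive (Moving Average) Forecasting",
    "RETIRED": "Retired Series model",
}


def default_pipeline(demand_class):
    # Raises KeyError for a value that is not one of the listed segments.
    return DEFAULT_PIPELINE[demand_class]
```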

After time series are moved into their corresponding demand classification segments, each segment is run using the modeling nodes appropriate to its demand classification. See Customizing Each Segment for instructions to change or edit modeling nodes for any segment.