The following topics describe how demand classification values are determined for each time series (an aggregation of transactional data into specified time intervals, sorted according to unique combinations of the default attributes, or BY variables). Some of these patterns are determined by the seasonality value associated with each time series. Seasonality is a regular change in time series data values that occurs at the same point in each time cycle, and is described here: Seasonality.
These attributes generate simple categorical values, such as Y (yes) and N (no).
This attribute classifies the time series as either no longer active (Y) or still active (N). To determine whether a time series is retired, the following calculations are performed. Missing values are not treated as zero unless Missing interpretation is set to 0 for the dependent variable.
The retired threshold is max(1, floor(seasonality/26)).
nonzero_demand{} array
stores all of the nonzero observations within the active demand period.
demand_interval{} array
stores all demand intervals in active demand periods.
demand_cycle_length{} array
stores the length of each set of consecutive nonzero values within the active demand period.
gap_interval_length{} array
stores the length of each set of consecutive zero values within the active demand period.
The time series is retired if both of these conditions are true:
MAX(gap_interval_length{}) + retired threshold
MAX(demand_interval{}) + retired threshold
MIN(demand_cycle_length{}) + length of trailing zeros > (seasonality + retired threshold)
This attribute classifies the time series as either having a short time span (Y) or not short (N). The following SAS function defines the short threshold.
ceil(seasonality/4)
If the argument is not an integer, the ceil() function rounds it up to the next larger integer.
If the total number of observations is less than or equal to the short threshold, the series is classified as a short time series.
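The threshold arithmetic above can be sketched in Python for illustration. This is a minimal sketch, not the product's implementation: the function names are hypothetical, and it assumes that the expression max(1, floor(seasonality/26)) given earlier is the retired threshold.

```python
import math


def retired_threshold(seasonality: int) -> int:
    # Assumption: max(1, floor(seasonality/26)) is the retired threshold.
    return max(1, math.floor(seasonality / 26))


def short_threshold(seasonality: int) -> int:
    # ceil(seasonality/4), as stated for the short threshold.
    return math.ceil(seasonality / 4)


def is_short(n_obs: int, seasonality: int) -> str:
    # A series is short (Y) when its number of observations is less than
    # or equal to the short threshold; otherwise it is not short (N).
    return "Y" if n_obs <= short_threshold(seasonality) else "N"
```

For example, with monthly data and a seasonality of 12, the short threshold is 3, so a series with three or fewer observations would be classified as short under this sketch.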
This attribute classifies time series as seasonal (Y), not seasonal (N), or not determined (ND). The seasonality threshold is seasonality + 9.
The remaining time series are fit with two AR(1) models, one with a seasonal dummy and one without. Seasonality is N if one of the following is true:
The remaining time series are classified as seasonal (Y).
This attribute classifies time series as either intermittent (Y) or not intermittent (N).
The threshold for intermittency is 2. Intermittency is determined by applying the INTERMITTENCYTEST method, which computes the median of the lengths of contiguous constant periods (demand intervals). If the test result is less than the threshold, the time series is intermittent.
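The median computation can be illustrated with a short Python sketch. It reads "contiguous constant periods" literally as runs of identical consecutive values, which is an assumption. The INTERMITTENCYTEST method itself may define demand intervals differently, and the function names here are hypothetical.

```python
from itertools import groupby
from statistics import median


def constant_run_lengths(values):
    # Length of each run of identical consecutive values
    # (assumed reading of "contiguous constant periods").
    return [len(list(group)) for _, group in groupby(values)]


def median_demand_interval(values):
    # Median of the run lengths; None for an empty series.
    runs = constant_run_lengths(values)
    return median(runs) if runs else None
```

The returned median would then be compared with the intermittency threshold of 2 as described above.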
This attribute classifies time series as seasonal intermittent (Y), not seasonal intermittent (N), or not determined (ND). The value is determined by the flow shown in Demand Class and Demand Span Flows. For the time series that are classified as YEAR_ROUND, the INTERMITTENCYTEST method is applied.
Then the time series are accumulated one level up from the time interval level. For example, if the time interval is "MONTH", then the accumulated series interval is "QTR". Next, the SEASONTEST method is applied.
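As an illustration of accumulating one level up, the following Python sketch rolls monthly values into quarterly values. It assumes that accumulation means summing and that the series starts on a quarter boundary. Neither assumption is stated in the text, and the function name is hypothetical.

```python
def accumulate(values, factor=3):
    # Sum each block of `factor` consecutive periods (3 months per quarter).
    # A trailing partial block is summed as-is.
    return [sum(values[i:i + factor]) for i in range(0, len(values), factor)]
```

For instance, seven monthly values produce two full quarters and one partial quarter.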
This attribute is used to classify the time series in terms of volume: LOW, MEDIUM, or HIGH.
If the maximum value of the accumulated series is less than or equal to 5 and the maximum value of the accumulated index series is less than 0, then the series volume is LOW.
The remaining time series are ranked from highest to lowest using the mean. The top ten percent of these time series are designated as HIGH and the remaining time series are MEDIUM.
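The ranking step can be sketched in Python as follows. This is an illustrative sketch only: it takes a hypothetical mapping of series names to mean values for the series that remain after the LOW series are set aside, and it assumes that the top ten percent is rounded up to a whole number of series.

```python
import math


def classify_volume(means, top_percent=10):
    # Rank the series from highest to lowest mean.
    ranked = sorted(means, key=means.get, reverse=True)
    # Assumption: the size of the top group is rounded up.
    n_high = math.ceil(len(ranked) * top_percent / 100)
    high = set(ranked[:n_high])
    return {name: ("HIGH" if name in high else "MEDIUM") for name in means}
```

With ten remaining series, this sketch designates the single series with the largest mean as HIGH and the other nine as MEDIUM.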
This attribute classifies the amount of volatility in the time series as either HIGH or LOW.
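Volatility measurement starts with two measures: the error from the exponential smoothing model (ESM) and the irregular component from the seasonal decomposition, each measured using MAE.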
The volatility measure is computed by taking the average or median of both measurements. The time series are ranked by volatility measure. The top ten percent are classified as HIGH and the remainder are classified as LOW.
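A minimal Python sketch of the volatility measure follows. It assumes that the two measurements are the mean absolute error (MAE) of the ESM errors and the MAE of the irregular component from the seasonal decomposition, as described for the volatility measurement attribute in this topic. The function names and inputs are hypothetical, and fitting the models is outside the scope of the sketch.

```python
from statistics import mean


def mae(errors):
    # Mean absolute error of a sequence of residuals.
    return mean([abs(e) for e in errors])


def volatility_measure(esm_errors, irregular):
    # Average of the two MAE measurements.
    return mean([mae(esm_errors), mae(irregular)])
```

With exactly two measurements, the average and the median are the same value, so the sketch computes only the average. The resulting measures could then be ranked, with the top ten percent classified as HIGH.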
These attributes are measurements taken from each time series.
The number of trailing zeros at the end of each time series.
The maximum value of the demand cycle lengths detected from the time series.
The median value of the demand intervals. This value is used for detecting whether the series is intermittent.
The measurement of volume for the dependent variable of each time series.
The volatility measurement value of the time series. Volatility is determined by first evaluating the error from the exponential smoothing model (ESM) and the irregular component from the seasonal decomposition. The errors from the two models are measured using MAE. The volatility measure is computed by taking the average or median of measurements from both methods.
A value that indicates whether analysis for each time series was successful.
These attributes use combinations of other attributes to obtain the values.
Not all time series are equally predictable. In a large-scale forecasting project, some time series are difficult to forecast automatically. (A forecast is a numerical prediction of a future value for a specified time period for each unique combination of BY variable values.) The Volume Volatility Class attribute is designed to identify the time series that are high in volume and volatility. Time series with these characteristics are especially time-consuming for forecasters to evaluate.
This attribute classifies the combination of volume and volatility in the time series. The attribute first identifies the time series that have been classified as SHORT, RETIRED, and INTERMITTENT and removes them from consideration.
This attribute is used to classify the time series based on demand cycles. Possible values are cycles that occur throughout the year (YEAR_ROUND), only during certain seasons of the year (INSEASON), or not determined (ND). The flow for the Demand Span attribute is included in Demand Class and Demand Span Flows.
If Short is Y or if Volume is LOW for any time series, the time series is ND.
For the remainder of the time series, the demand span threshold is determined by the following SAS function:
ceil(3*seasonality/4)
The time series are analyzed to identify zero demands below the demand span threshold. Next, the time series are analyzed after leading and trailing zeros are removed, to determine whether there is a demand gap. Demand gaps are consecutive zero demand periods that are longer than the demand span threshold. Demand gaps identify demand cycles. Based on the length of the demand cycles, the time series data is classified into long time span series or short time span series.
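The demand gap detection described above can be sketched in Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, missing values are not handled, and a gap is taken to be a run of consecutive zeros strictly longer than the demand span threshold after leading and trailing zeros are removed.

```python
import math
from itertools import groupby


def demand_span_threshold(seasonality):
    # ceil(3*seasonality/4), as stated for the demand span threshold.
    return math.ceil(3 * seasonality / 4)


def trim_zeros(values):
    # Remove leading and trailing zeros.
    start, end = 0, len(values)
    while start < end and values[start] == 0:
        start += 1
    while end > start and values[end - 1] == 0:
        end -= 1
    return values[start:end]


def demand_gaps(values, seasonality):
    # Lengths of zero runs that are longer than the threshold.
    threshold = demand_span_threshold(seasonality)
    gaps = []
    for is_zero, group in groupby(trim_zeros(values), key=lambda v: v == 0):
        length = len(list(group))
        if is_zero and length > threshold:
            gaps.append(length)
    return gaps
```

For example, with a seasonality of 4 the threshold is 3, so a run of four interior zeros counts as a demand gap while a single interior zero does not.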
When a series has at least one demand cycle, the time series is INSEASON if the following are true:
The following two cases are YEAR_ROUND series:
Any remaining time series are classified as ND.
This attribute is used to determine how to segment the time series for the Demand Classification pipeline. It uses the Retired, Demand Span, Intermittent, Seasonal, and Seasonal Intermittent attributes to define the segments. Demand Class and Demand Span Flows shows the flow for how the attributes are derived. This figure also includes the flows for the Demand Span attribute to show how the SHORT, LOW_VOLUME, and OTHER values are calculated.
The pipeline segments that are defined for each Demand Class attribute are described below. You can make changes to each pipeline and to the modeling nodes in each pipeline.
Time series with a short record of historical data. This could be a new series with only a few observations. The Naive (Moving Average) Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
Time series with low volumes. The Naive Forecasting pipeline is selected for this segment. Seasonal random walk is already selected as the naive model type.
Short time span series with intermittent patterns. The Regression Forecasting pipeline is selected for this segment.
Short time span series without intermittent patterns. The Regression Forecasting pipeline is selected for this segment.
Long time span series with intermittent patterns. The Auto-forecasting model (Intermittent) pipeline is selected for this segment. Only the IDM model is selected for inclusion.
Long time span series with seasonal patterns. The Seasonal Forecasting pipeline is selected for this segment.
Long time span series without seasonal patterns. The Non-seasonal Forecasting pipeline is selected for this segment.
Long time span series with seasonal and intermittent patterns. The Temporal Aggregation Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
Long time span series with no patterns that can be classified. The Naive (Moving Average) Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
Time series that do not span long time periods and cannot be classified. The Naive (Moving Average) Forecasting pipeline is selected for this segment. Moving average is already selected as the naive model type.
Time series that are retired or are no longer active. The Retired Series model is selected for this segment.
After time series are moved into their corresponding demand classification segments, each segment is run using the modeling nodes appropriate to its demand classification. See Customizing Each Segment for instructions to change or edit modeling nodes for any segment.