ESM Procedure

ID Statement

ID variable INTERVAL=interval <options>;

The ID statement names a numeric variable that identifies observations in the input and output data sets. The ID variable’s values are assumed to be SAS date or datetime values. In addition, the ID statement specifies the (desired) frequency associated with the time series. The ID statement options also specify how the observations are accumulated and how the time ID values are aligned to form the time series to be forecast. The information specified affects all variables specified in subsequent FORECAST statements. If the ID statement is specified, the INTERVAL= option must be specified. If an ID statement is not specified, the observation number, with respect to the BY group, is used as the time ID. You can specify the following options.
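As a minimal sketch of the statement in context, the following step forecasts the monthly airline passenger series in SASHELP.AIR, whose DATE variable holds SAS date values. The output data set name WORK.AIR_FCST, the 12-period lead, and the choice of MODEL=WINTERS are illustrative assumptions, not recommendations.

```sas
/* Illustrative only: WORK.AIR_FCST and MODEL=WINTERS are arbitrary choices. */
proc esm data=sashelp.air out=work.air_fcst lead=12;
   id date interval=month;        /* DATE is the time ID; one value per month */
   forecast air / model=winters;  /* the ID settings apply to this variable   */
run;
```

If the ID statement were omitted, the observation number within each BY group would serve as the time ID instead.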

ACCUMULATE=option

specifies how the data set observations are accumulated within each time period. The frequency (width of each time interval) is specified by the INTERVAL= option. The ID variable contains the time ID values. Each time ID variable value corresponds to a specific time period. The accumulated values form the time series, which is used in subsequent model fitting and forecasting.

This option is particularly useful when there are gaps in the input data or when multiple input observations coincide with a particular time period (for example, transactional data). The EXPAND procedure (see Chapter 15, EXPAND Procedure) offers additional frequency conversions and transformations that can also be useful in creating a time series.

The following options determine how the observations are accumulated within each time period based on the ID variable and the frequency specified by the INTERVAL= option:

NONE: No accumulation occurs; the ID variable values must be equally spaced with respect to the frequency.
TOTAL: accumulates observations based on the total sum of their values.
AVERAGE | AVG: accumulates observations based on the average of their values.
MINIMUM | MIN: accumulates observations based on the minimum of their values.
MEDIAN | MED: accumulates observations based on the median of their values.
MAXIMUM | MAX: accumulates observations based on the maximum of their values.
N: accumulates observations based on the number of nonmissing observations.
NMISS: accumulates observations based on the number of missing observations.
NOBS: accumulates observations based on the number of observations.
FIRST: accumulates observations based on the first of their values.
LAST: accumulates observations based on the last of their values.
STDDEV | STD: accumulates observations based on the standard deviation of their values.
CSS: accumulates observations based on the corrected sum of squares of their values.
USS: accumulates observations based on the uncorrected sum of squares of their values.

By default, ACCUMULATE=NONE.

If the ACCUMULATE= option is specified, the SETMISSING= option is useful for specifying how accumulated missing values are treated. If missing values should be interpreted as zero, then SETMISSING=0 should be used. For more information about accumulation, see the section Accumulation.
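The following sketch shows one way accumulation might be applied to transactional data. The data set WORK.TRANSACTIONS, its variables SALEDATE and AMOUNT, and the generated values are hypothetical and exist only to make the example self-contained.

```sas
/* Hypothetical transactional data: one record every 9 days,        */
/* so each calendar month contains several observations.            */
data work.transactions;
   do saledate = '01jan2023'd to '31dec2024'd by 9;
      amount = 100 + mod(saledate, 7) * 10;   /* made-up amounts */
      output;
   end;
   format saledate date9.;
run;

/* Sum the transactions within each month before fitting the model. */
/* SETMISSING=0 treats a month with no transactions as zero sales.  */
proc esm data=work.transactions out=work.monthly_fcst lead=6;
   id saledate interval=month accumulate=total setmissing=0;
   forecast amount / model=simple;
run;
```

ACCUMULATE=TOTAL suits quantities such as sales that add up over a period; for a level-type measurement, AVERAGE or LAST might be a better fit.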

ALIGN=option

controls the alignment of SAS dates used to identify output observations. The ALIGN= option accepts the following values: BEGINNING | BEG | B, MIDDLE | MID | M, and ENDING | END | E. BEGINNING is the default.
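For illustration, the step below requests end-of-period alignment for the SASHELP.AIR series. With ALIGN=ENDING, the time ID values written to the output data set should fall at the end of each month rather than at the beginning. The output data set name is an arbitrary choice.

```sas
/* Illustrative only: stamp each output observation with the period end. */
proc esm data=sashelp.air out=work.air_end lead=12;
   id date interval=month align=ending;
   forecast air / model=winters;
run;
```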

END=date | datetime

specifies a SAS date or datetime literal value that represents the end of the data. If the last time ID variable value is less than the END= value, the series is extended with missing values. If the last time ID variable value is greater than the END= value, the series is truncated. For example, END='1jan2008'D specifies that data for time periods after the first of January 2008 not be used. The option END="&sysdate"D uses the automatic macro variable SYSDATE to extend or truncate the series to the current date. This option and the START= option can be used to ensure that data associated with each BY group contain the same number of observations.

FORMAT=format

specifies the SAS format for the time ID values. If the FORMAT= option is not specified, the default format is implied from the INTERVAL= option.

INTERVAL=interval

specifies the frequency of the input time series or of the time series to be accumulated from the input data. For example, if the input data set consists of quarterly observations, then INTERVAL=QTR should be used. If the SEASONALITY= option is not specified, the length of the seasonal cycle is implied by the INTERVAL= option. For example, INTERVAL=QTR implies a seasonal cycle of length 4. If the ACCUMULATE= option is also specified, the INTERVAL= option determines the time periods for the accumulation of observations.

The basic intervals are YEAR, SEMIYEAR, QTR, MONTH, SEMIMONTH, TENDAY, WEEK, WEEKDAY, DAY, HOUR, MINUTE, SECOND. For more information about the intervals that can be specified, see Chapter 4, Date Intervals, Formats, and Functions.
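As a sketch of how INTERVAL= and ACCUMULATE= work together, the step below accumulates the monthly SASHELP.AIR series into quarterly totals before forecasting. Because SEASONALITY= is not specified, INTERVAL=QTR implies a seasonal cycle of length 4. The output data set name and the MODEL=ADDWINTERS choice are illustrative assumptions.

```sas
/* Monthly input accumulated to quarterly totals; 8 quarters are forecast. */
proc esm data=sashelp.air out=work.air_qtr lead=8;
   id date interval=qtr accumulate=total;
   forecast air / model=addwinters;   /* seasonal model, cycle length 4 */
run;
```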

NOTSORTED

specifies that the time ID values are not in sorted order. The ESM procedure sorts the data with respect to the time ID prior to analysis.
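A brief sketch of NOTSORTED follows. The data set WORK.AIR_SHUFFLED is hypothetical; it is created here by sorting SASHELP.AIR on the AIR variable, which leaves the DATE values out of chronological order.

```sas
/* Scramble the time order by sorting on the series values. */
proc sort data=sashelp.air out=work.air_shuffled;
   by air;
run;

/* NOTSORTED tells PROC ESM to order the data by DATE before analysis. */
proc esm data=work.air_shuffled out=work.air_fcst lead=12;
   id date interval=month notsorted;
   forecast air / model=winters;
run;
```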

SETMISSING=option | number

specifies how missing values (either input or accumulated) are assigned in the accumulated time series. If a number is specified, missing values are set to that number. If a missing value in the input data set indicates an unknown value, the SETMISSING= option should not be used. If a missing value indicates no value, SETMISSING=0 should be used. You typically use SETMISSING=0 for transactional data, because no recorded data usually implies no activity. The following options can also be used to determine how missing values are assigned:

MISSING: sets missing values to missing. The missing observations are replaced with predicted values that are computed from the exponential smoothing model.
AVERAGE | AVG: sets missing values to the accumulated average value.
MINIMUM | MIN: sets missing values to the accumulated minimum value.
MEDIAN | MED: sets missing values to the accumulated median value.
MAXIMUM | MAX: sets missing values to the accumulated maximum value.
FIRST: sets missing values to the accumulated first nonmissing value.
LAST: sets missing values to the accumulated last nonmissing value.
PREVIOUS | PREV: sets missing values to the previous accumulated nonmissing value. Missing values at the beginning of the accumulated series remain missing.
NEXT: sets missing values to the next accumulated nonmissing value. Missing values at the end of the accumulated series remain missing.

By default, SETMISSING=MISSING.
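The sketch below illustrates SETMISSING=PREVIOUS on a series with gaps. The data set WORK.STOCK and the variable name LEVEL are hypothetical; the gaps are created by deleting the March and April records from SASHELP.AIR. ACCUMULATE=LAST is used because the default ACCUMULATE=NONE expects equally spaced time ID values.

```sas
/* Hypothetical level-type series with two missing months per year. */
data work.stock;
   set sashelp.air(rename=(air=level));
   if month(date) in (3, 4) then delete;
run;

/* Months with no input observation accumulate to missing;          */
/* SETMISSING=PREVIOUS carries the prior nonmissing value forward.  */
proc esm data=work.stock out=work.stock_fcst lead=6;
   id date interval=month accumulate=last setmissing=previous;
   forecast level / model=simple;
run;
```

Carrying the previous value forward is only one possible treatment; whether it is appropriate depends on what a missing value means in the data.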

START=date | datetime

specifies a SAS date or datetime literal value that represents the beginning of the data. If the first time ID variable value is greater than the START= value, the series is prefixed with missing values. If the first time ID variable value is less than the START= value, the series is truncated. This option and the END= option can be used to ensure that data associated with each BY group contain the same number of observations.
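For illustration, the following step uses START= together with END= to restrict the SASHELP.AIR series, which runs from January 1949 through December 1960, to the years 1950 through 1959. The window and the output data set name are arbitrary choices.

```sas
/* Observations before START= and after END= are truncated. */
proc esm data=sashelp.air out=work.air_window lead=12;
   id date interval=month start='01jan1950'd end='01dec1959'd;
   forecast air / model=winters;
run;
```

With a BY statement, the same pair of options would pad or truncate each BY group to a common span.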

ZEROMISS=NONE | LEFT | RIGHT | BOTH

specifies how beginning and ending zero values (either input or accumulated) are interpreted in the accumulated time series. You can specify the following values:

NONE: Beginning and ending zeros are unchanged.
LEFT: Beginning zeros are set to missing.
RIGHT: Ending zeros are set to missing.
BOTH: Both beginning and ending zeros are set to missing.

By default, ZEROMISS=NONE.

If the accumulated series is all missing or zero, the series is not changed.
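The sketch below shows a case where ZEROMISS=LEFT might be useful: a series whose leading zeros reflect a period before the item existed rather than genuine zero activity. The data set WORK.LAUNCH is hypothetical and is built by zeroing out the first two years of SASHELP.AIR.

```sas
/* Hypothetical series with artificial leading zeros before a "launch". */
data work.launch;
   set sashelp.air;
   if date < '01jan1951'd then air = 0;
run;

/* ZEROMISS=LEFT sets the beginning zeros to missing. */
proc esm data=work.launch out=work.launch_fcst lead=12;
   id date interval=month zeromiss=left;
   forecast air / model=winters;
run;
```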

Last updated: June 19, 2025