ARIMA Procedure

General Notation for ARIMA Models

The order of an ARIMA (autoregressive integrated moving-average) model is usually denoted by the notation ARIMA(p,d,q), where

p

is the order of the autoregressive part

d

is the order of the differencing

q

is the order of the moving-average process

If no differencing is done ($d = 0$), the models are usually referred to as ARMA(p,q) models. The final model in the preceding example is an ARIMA(1,1,1) model, since the IDENTIFY statement specified $d = 1$ and the final ESTIMATE statement specified $p = 1$ and $q = 1$.

Notation for Pure ARIMA Models

Mathematically the pure ARIMA model is written as

$$W_t = \mu + \frac{\theta(B)}{\phi(B)} a_t$$

where

t

indexes time

$W_t$

is the response series $Y_t$ or a difference of the response series

$\mu$

is the mean term

B

is the backshift operator; that is, $B X_t = X_{t-1}$

$\phi(B)$

is the autoregressive operator, represented as a polynomial in the backshift operator: $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$

$\theta(B)$

is the moving-average operator, represented as a polynomial in the backshift operator: $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$

$a_t$

is the independent disturbance, also called the random error

The series $W_t$ is computed by the IDENTIFY statement and is the series processed by the ESTIMATE statement. Thus, $W_t$ is either the response series $Y_t$ or a difference of $Y_t$ specified by the differencing operators in the IDENTIFY statement.

For simple (nonseasonal) differencing, $W_t = (1 - B)^d Y_t$. For seasonal differencing, $W_t = (1 - B)^d (1 - B^s)^D Y_t$, where $d$ is the degree of nonseasonal differencing, $D$ is the degree of seasonal differencing, and $s$ is the length of the seasonal cycle.
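The differencing operators can be illustrated outside of the procedure with a minimal Python sketch. The function names `difference` and `arima_difference` are hypothetical and are not part of PROC ARIMA; the sketch only shows how (1 - B)^d (1 - B^s)^D acts on a short series, with each pass dropping the leading observations that have no lagged counterpart.

```python
def difference(y, lag=1, times=1):
    """Apply (1 - B^lag) to the list y, `times` times in succession."""
    w = list(y)
    for _ in range(times):
        # (1 - B^lag) w_t = w_t - w_{t-lag}; the first `lag` values are lost.
        w = [w[t] - w[t - lag] for t in range(lag, len(w))]
    return w


def arima_difference(y, d=0, D=0, s=1):
    """Compute W_t = (1 - B)^d (1 - B^s)^D Y_t for a list y."""
    return difference(difference(y, lag=1, times=d), lag=s, times=D)


y = [t * t for t in range(8)]          # 0, 1, 4, 9, 16, 25, 36, 49
print(arima_difference(y, d=1))        # [1, 3, 5, 7, 9, 11, 13]
print(arima_difference(y, d=2))        # [2, 2, 2, 2, 2, 2]

quarterly = [1, 2, 3, 4, 2, 3, 4, 5]   # made-up series with period s = 4
print(arima_difference(quarterly, D=1, s=4))   # [1, 1, 1, 1]
```

Because the operators commute, applying the seasonal difference before or after the simple difference gives the same series.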

For example, the mathematical form of the ARIMA(1,1,1) model estimated in the preceding example is

$$(1 - B) Y_t = \mu + \frac{(1 - \theta_1 B)}{(1 - \phi_1 B)} a_t$$
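As a worked step, the ratio of polynomials in the ARIMA(1,1,1) form can be cleared to give a difference equation. The derivation below is a sketch that writes $W_t = Y_t - Y_{t-1}$, multiplies both sides by $(1 - \phi_1 B)$, and uses the fact that the backshift operator leaves the constant $\mu$ unchanged:

```latex
\begin{aligned}
(1 - \phi_1 B)(W_t - \mu) &= (1 - \theta_1 B)\, a_t \\
W_t - \phi_1 W_{t-1} - (1 - \phi_1)\mu &= a_t - \theta_1 a_{t-1} \\
W_t &= (1 - \phi_1)\mu + \phi_1 W_{t-1} + a_t - \theta_1 a_{t-1}
\end{aligned}
```

In this form the current difference $W_t$ depends on its own previous value, the current shock, and the previous shock.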

Model Constant Term

The ARIMA model can also be written as

$$\phi(B)(W_t - \mu) = \theta(B) a_t$$

or

$$\phi(B) W_t = \mathrm{const} + \theta(B) a_t$$

where

$$\mathrm{const} = \phi(B)\mu = \mu - \phi_1 \mu - \phi_2 \mu - \cdots - \phi_p \mu$$

Thus, when an autoregressive operator and a mean term are both included in the model, the constant term for the model can be represented as $\phi(B)\mu$. This value is printed with the label "Constant Estimate" in the ESTIMATE statement output.
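The relationship between the mean term and the constant term can be checked numerically. The following Python sketch uses a hypothetical helper, `constant_term`, and made-up parameter values; it simply evaluates const = mu - phi_1 mu - ... - phi_p mu and is not a reproduction of how the procedure computes its output.

```python
def constant_term(mu, phi):
    """Return const = mu * (1 - phi_1 - ... - phi_p) for AR coefficients phi."""
    return mu * (1.0 - sum(phi))


# Made-up values for illustration only.
print(constant_term(mu=10.0, phi=[0.5]))         # AR(1): 5.0
print(constant_term(mu=10.0, phi=[0.5, 0.25]))   # AR(2): 2.5
print(constant_term(mu=10.0, phi=[]))            # no AR part: 10.0
```

With no autoregressive part the constant equals the mean; otherwise the two differ by the factor 1 - phi_1 - ... - phi_p.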

Notation for Transfer Function Models

The general ARIMA model with input series, also called the ARIMAX model, is written as

$$W_t = \mu + \sum_i \frac{\omega_i(B)}{\delta_i(B)} B^{k_i} X_{i,t} + \frac{\theta(B)}{\phi(B)} a_t$$

where

$X_{i,t}$

is the $i$th input time series or a difference of the $i$th input series at time $t$

$k_i$

is the pure time delay for the effect of the $i$th input series

$\omega_i(B)$

is the numerator polynomial of the transfer function for the $i$th input series

$\delta_i(B)$

is the denominator polynomial of the transfer function for the $i$th input series

The model can also be written more compactly as

$$W_t = \mu + \sum_i \Psi_i(B) X_{i,t} + n_t$$

where

$\Psi_i(B)$

is the transfer function for the $i$th input series modeled as a ratio of the $\omega$ and $\delta$ polynomials: $\Psi_i(B) = (\omega_i(B) / \delta_i(B)) B^{k_i}$

$n_t$

is the noise series: $n_t = (\theta(B) / \phi(B)) a_t$

This model expresses the response series as a combination of past values of the random shocks and past values of other input series. The response series is also called the dependent series or output series. An input time series is also referred to as an independent series or a predictor series. The terms response variable, dependent variable, independent variable, and predictor variable are also often used.
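To make the action of a single transfer function concrete, the following Python sketch applies (omega(B) / delta(B)) B^k to one input series. The helper names are hypothetical, the polynomials are passed as plain coefficient lists in powers of B (lowest power first, so no sign convention for the individual parameters is assumed), pre-sample values are treated as zero, and the mean and noise terms of the full model are left out.

```python
def apply_polynomial(coeffs, x):
    """Apply c_0 + c_1 B + ... + c_m B^m to x, treating pre-sample values as 0."""
    return [
        sum(c * x[t - j] for j, c in enumerate(coeffs) if t - j >= 0)
        for t in range(len(x))
    ]


def transfer_function(x, omega, delta, k):
    """Return u_t = (omega(B) / delta(B)) B^k x_t, with delta[0] == 1."""
    n = len(x)
    # Pure delay: B^k x_t = x_{t-k}, padded with zeros at the start.
    delayed = ([0.0] * k + list(x))[:n]
    # Numerator: v_t = omega(B) x_{t-k}.
    v = apply_polynomial(omega, delayed)
    # Denominator: delta(B) u_t = v_t, solved recursively for u_t.
    u = []
    for t in range(n):
        past = sum(delta[j] * u[t - j] for j in range(1, len(delta)) if t - j >= 0)
        u.append(v[t] - past)
    return u


# A unit pulse passed through 2 / (1 - 0.5 B) with a delay of one period.
pulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(transfer_function(pulse, omega=[2.0], delta=[1.0, -0.5], k=1))
# [0.0, 2.0, 1.0, 0.5, 0.25, 0.125]
```

The pulse has no effect until the delay has elapsed, then enters at the numerator weight and decays geometrically at the rate set by the denominator polynomial.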

Notation for Factored Models

ARIMA models are sometimes expressed in a factored form. This means that the $\phi$, $\theta$, $\omega$, or $\delta$ polynomials are expressed as products of simpler polynomials. For example, you could express the pure ARIMA model as

$$W_t = \mu + \frac{\theta_1(B)\theta_2(B)}{\phi_1(B)\phi_2(B)} a_t$$

where $\phi_1(B)\phi_2(B) = \phi(B)$ and $\theta_1(B)\theta_2(B) = \theta(B)$.

When an ARIMA model is expressed in factored form, the order of the model is usually expressed by using a factored notation also. The order of an ARIMA model expressed as the product of two factors is denoted as ARIMA(p,d,q)$\times$(P,D,Q).

Notation for Seasonal Models

ARIMA models for time series with regular seasonal fluctuations often use differencing operators and autoregressive and moving-average parameters at lags that are multiples of the length of the seasonal cycle. When all the terms in an ARIMA model factor refer to lags that are a multiple of a constant $s$, the constant is factored out and suffixed to the ARIMA(p,d,q) notation.

Thus, the general notation for the order of a seasonal ARIMA model with both seasonal and nonseasonal factors is ARIMA(p,d,q)$\times$(P,D,Q)$_s$. The term (p,d,q) gives the order of the nonseasonal part of the ARIMA model; the term (P,D,Q)$_s$ gives the order of the seasonal part. The value of $s$ is the number of observations in a seasonal cycle: 12 for monthly series, 4 for quarterly series, 7 for daily series with day-of-week effects, and so forth.

For example, the notation ARIMA(0,1,2)$\times$(0,1,1)$_{12}$ describes a seasonal ARIMA model for monthly data with the following mathematical form:

$$(1 - B)(1 - B^{12}) Y_t = \mu + (1 - \theta_{1,1} B - \theta_{1,2} B^2)(1 - \theta_{2,1} B^{12}) a_t$$
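Multiplying out the two moving-average factors shows which lags of the shock series enter the model. The Python sketch below uses a hypothetical helper, `poly_multiply`, and made-up parameter values; polynomials in B are stored as coefficient lists with the lowest power first.

```python
def poly_multiply(a, b):
    """Multiply two polynomials in B given as coefficient lists, lowest power first."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


# Made-up parameter values for illustration only.
theta_11, theta_12, theta_21 = 0.4, 0.2, 0.5

nonseasonal = [1.0, -theta_11, -theta_12]        # 1 - theta_11 B - theta_12 B^2
seasonal = [1.0] + [0.0] * 11 + [-theta_21]      # 1 - theta_21 B^12

product = poly_multiply(nonseasonal, seasonal)
nonzero = {lag: coef for lag, coef in enumerate(product) if coef != 0.0}
print(sorted(nonzero))   # [0, 1, 2, 12, 13, 14]
```

The expanded polynomial has terms at lags 1, 2, 12, 13, and 14. The coefficients at lags 13 and 14 are the products theta_11 theta_21 and theta_12 theta_21, so the factored model has three free moving-average parameters rather than five.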
Last updated: June 19, 2025