SAS Macros and Functions

LOGTEST Macro

The %LOGTEST macro tests whether a logarithmic transformation is appropriate for modeling and forecasting a time series. The logarithmic transformation is often used for time series that show exponential growth or variability proportional to the level of the series.

The %LOGTEST macro fits an autoregressive model to a series and fits the same model to the log of the series. Both models are estimated by the maximum-likelihood method, and the maximum log-likelihood values for both autoregressive models are computed. These log-likelihood values are then expressed in terms of the original data and compared.

You can control the order of the autoregressive models. You can also difference the series and the log-transformed series before the autoregressive model is fit.

You can print the log-likelihood values and related statistics (AIC, SBC, and MSE) for the autoregressive models for the series and the log-transformed series. You can also output these statistics to a SAS data set.

Syntax

The %LOGTEST macro has the following form:

  %LOGTEST( SAS-data-set, variable <, options> );

The first argument, SAS-data-set, specifies the name of the SAS data set that contains the time series variable to be analyzed. The second argument, variable, specifies the time series variable name to be analyzed.
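As a minimal sketch of the required arguments only, the following call analyzes the AIR variable in the SASHELP.AIR sample data set. The choice of data set is an assumption made for illustration; any data set that contains a numeric time series variable could be substituted.

```sas
/* Minimal call: data set name first, then the series variable. */
/* All options are left at their defaults.                      */
%logtest(sashelp.air, air);
```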

The first two arguments are required. The following options can be used with the %LOGTEST macro. Options must follow the required arguments and are separated by commas.

AR=n

specifies the order of the autoregressive model fit to the series and the log-transformed series. The default is AR=5.

CONST=value

specifies a constant to be added to the series before transformation. Use the CONST= option when some values of the series are 0 or negative. Every value of the series must be greater than the negative of the CONST= value, so that the series plus the constant is strictly positive and its logarithm is defined. The default is CONST=0.

DIF=( differencing-list )

specifies the degrees of differencing applied to the original and log-transformed series before fitting the autoregressive model. The differencing-list is a list of positive integers separated by commas and enclosed in parentheses. For example, DIF=(1,12) specifies that both the original series and the log-transformed series be differenced once at lag 1 and once at lag 12. For more information, see the section IDENTIFY Statement in Chapter 7, ARIMA Procedure.

OUT=SAS-data-set

writes the results to an output data set. The output data set includes a variable TRANS that identifies the transformation (LOG or NONE), the log-likelihood value (LOGLIK), the residual mean squared error (RMSE), Akaike’s information criterion (AIC), and Schwarz’s Bayesian criterion (SBC) for the log-transformed and untransformed cases.

PRINT=YES | NO

specifies whether the results are printed. The default is PRINT=NO. The printed output shows the log-likelihood value, the residual mean squared error, Akaike’s information criterion (AIC), and Schwarz’s Bayesian criterion (SBC) for the log-transformed and untransformed cases.
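The options above can be combined in a single call. The following sketch is illustrative only: the output data set name WORK.LOGTEST_STATS, the WORK.SHIPMENTS data set, and its made-up values are assumptions introduced for the example, not part of the macro.

```sas
/* Example 1: seasonal monthly series from the SASHELP library.     */
/* Fit AR(5) models after differencing at lags 1 and 12, print the  */
/* statistics, and save them to a (hypothetical) output data set.   */
%logtest(sashelp.air, air,
         ar=5,
         dif=(1,12),
         out=work.logtest_stats,
         print=yes);

/* Inspect the saved statistics for the LOG and NONE cases. */
proc print data=work.logtest_stats;
run;

/* Example 2: a series that contains a zero value.                  */
/* The values below are invented purely to illustrate CONST=.       */
data work.shipments;
   input units @@;
   datalines;
0 3 5 4 8 12 9 15 21 18 27 35 30 44 52 49 60 75 70 90
;
run;

/* CONST=1 shifts the series so that every value is positive        */
/* before the logarithm is taken; AR=2 suits the short series.      */
%logtest(work.shipments, units, ar=2, const=1, print=yes);
```

In the second example, CONST=1 is the smallest integer shift that satisfies the requirement that every value exceed the negative of the constant.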

Results

The result of the test is returned in the macro variable &LOGTEST. The value of the &LOGTEST variable is ‘LOG’ if the model fit to the log-transformed data has a larger log likelihood than the model fit to the untransformed series. The value of the &LOGTEST variable is ‘NONE’ if the model fit to the untransformed data has a larger log likelihood. The variable &LOGTEST is set to ‘ERROR’ if the %LOGTEST macro is unable to compute the test due to errors.

Results are printed when the PRINT=YES option is specified. Results are stored in a SAS data set when the OUT= option is specified.
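One way to act on the returned value is to branch on &LOGTEST inside a wrapper macro. The sketch below assumes that &LOGTEST is visible in the calling scope after %LOGTEST finishes and that its value is the bare text LOG, NONE, or ERROR; the wrapper name REPORT_TRANSFORM and its parameters are hypothetical.

```sas
/* Hypothetical wrapper: run the test and report the outcome in the log. */
%macro report_transform(ds, var);
   %logtest(&ds, &var, dif=(1,12));

   %if %upcase(&logtest) = LOG %then
      %put NOTE: Log transformation preferred for &var..;
   %else %if %upcase(&logtest) = NONE %then
      %put NOTE: No transformation preferred for &var..;
   %else
      %put WARNING: The test could not be computed (LOGTEST=&logtest).;
%mend report_transform;

%report_transform(sashelp.air, air);
```

Handling the ERROR case explicitly keeps a failed test from being silently treated as a recommendation.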

Details

Assume that a time series \(X_t\) is a stationary pth-order autoregressive process with normally distributed white noise innovations. That is,

\[
(1 - \Theta(B))(X_t - \mu_{\mathbf{x}}) = \epsilon_t
\]

where \(\mu_{\mathbf{x}}\) is the mean of \(X_t\).

The log-likelihood function of \(X_t\) is

\[
\begin{aligned}
l_1(\cdot) = {} & -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln\left(|\Sigma_{\mathbf{xx}}|\right) - \frac{n}{2}\ln\left(\sigma_{\mathbf{e}}^{2}\right) \\
& - \frac{1}{2\sigma_{\mathbf{e}}^{2}} (\mathbf{X} - \mathbf{1}\mu_{\mathbf{x}})' \, \Sigma_{\mathbf{xx}}^{-1} (\mathbf{X} - \mathbf{1}\mu_{\mathbf{x}})
\end{aligned}
\]

where \(n\) is the number of observations, \(\mathbf{1}\) is the n-dimensional column vector of 1s, \(\sigma_{\mathbf{e}}^{2}\) is the variance of the white noise, \(\mathbf{X} = (X_1, \ldots, X_n)'\), and \(\Sigma_{\mathbf{xx}}\) is the covariance matrix of \(\mathbf{X}\).

On the other hand, if the log-transformed time series \(Y_t = \ln(X_t + c)\) is a stationary pth-order autoregressive process, the log-likelihood function of \(X_t\) is

\[
\begin{aligned}
l_0(\cdot) = {} & -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln\left(|\Sigma_{\mathbf{yy}}|\right) - \frac{n}{2}\ln\left(\sigma_{\mathbf{e}}^{2}\right) \\
& - \frac{1}{2\sigma_{\mathbf{e}}^{2}} (\mathbf{Y} - \mathbf{1}\mu_{\mathbf{y}})' \, \Sigma_{\mathbf{yy}}^{-1} (\mathbf{Y} - \mathbf{1}\mu_{\mathbf{y}}) - \sum_{t=1}^{n} \ln(X_t + c)
\end{aligned}
\]

where \(\mu_{\mathbf{y}}\) is the mean of \(Y_t\), \(\mathbf{Y} = (Y_1, \ldots, Y_n)'\), and \(\Sigma_{\mathbf{yy}}\) is the covariance matrix of \(\mathbf{Y}\).

The %LOGTEST macro compares the maximum values of \(l_1(\cdot)\) and \(l_0(\cdot)\) and determines which is larger.
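The two likelihoods are comparable only because both are expressed as likelihoods of the original data. The final term of the log-scale likelihood, the sum of \(\ln(X_t + c)\), is what puts it on that common scale. As a short sketch of where that term comes from, using the standard change-of-variables rule for densities with \(y_t = \ln(x_t + c)\) (the density symbols \(f_{\mathbf{X}}\) and \(f_{\mathbf{Y}}\) are notation introduced here for illustration):

\[
f_{\mathbf{X}}(\mathbf{x}) = f_{\mathbf{Y}}(\mathbf{y}) \prod_{t=1}^{n} \left| \frac{\partial y_t}{\partial x_t} \right| = f_{\mathbf{Y}}(\mathbf{y}) \prod_{t=1}^{n} \frac{1}{x_t + c}
\]

Taking logarithms gives

\[
\ln f_{\mathbf{X}}(\mathbf{x}) = \ln f_{\mathbf{Y}}(\mathbf{y}) - \sum_{t=1}^{n} \ln(x_t + c)
\]

so the Gaussian log likelihood of the log-transformed series is reduced by the sum of the logged (shifted) observations before it is compared with the likelihood of the untransformed series.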

The %LOGTEST macro also computes Akaike’s information criterion (AIC), Schwarz’s Bayesian criterion (SBC), and the residual mean squared error based on the maximum likelihood estimator for the autoregressive model. For the mean squared error, retransformation of forecasts is based on Pankratz (1983, pp. 256–258).

After differencing as specified by the DIF= option, the process is assumed to be a stationary autoregressive process. You might want to check for stationarity of the series using the %DFTEST macro. If the process is not stationary, differencing with the DIF= option is recommended. For a process with moving average terms, a large value for the AR= option might be appropriate.
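A stationarity check followed by the transformation test might look like the sketch below. It assumes that %DFTEST accepts the same data set, variable, and DIF= arguments as %LOGTEST and returns its p-value in the macro variable &DFPVALUE; confirm both against the %DFTEST documentation before relying on them. The larger AR= value in the second call is an arbitrary illustration of allowing for moving average terms.

```sas
/* Step 1: Dickey-Fuller test on the series after differencing      */
/* at lags 1 and 12. A small p-value suggests that the differenced  */
/* series is stationary.                                            */
%dftest(sashelp.air, air, dif=(1,12));
%put NOTE: Dickey-Fuller p-value = &dfpvalue;

/* Step 2: run the log test with the same differencing and a        */
/* higher autoregressive order to approximate moving average terms. */
%logtest(sashelp.air, air, dif=(1,12), ar=10, print=yes);
%put NOTE: Recommended transformation = &logtest;
```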

Last updated: June 19, 2025