TIMESERIES Procedure

Singular Spectrum Analysis

Given a time series, $y_t$, for $t = 1, \ldots, T$, and a window length, $2 \le L < T/2$, singular spectrum analysis (Golyandina, Nekrutkin, and Zhigljavsky 2001) decomposes the time series into spectral groupings using the following steps:

Embedding Step

Using the time series, form a $K \times L$ trajectory matrix, $\mathbf{X}$, with elements

$$\mathbf{X} = \{ x_{k,l} \}_{k=1,\, l=1}^{K,\, L}$$

such that $x_{k,l} = y_{k+l-1}$ for $k = 1, \ldots, K$ and $l = 1, \ldots, L$, where $K = T - L + 1$. By definition, $L \le K < T$, because $2 \le L < T/2$.
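The embedding can be illustrated with a few lines of plain Python. This is only a sketch: the function name `trajectory_matrix` is hypothetical, and it assumes the indexing $x_{k,l} = y_{k+l-1}$, which is the convention that the diagonal averaging formula in the Averaging Step relies on.

```python
def trajectory_matrix(y, L):
    """Return the K x L trajectory matrix of the series y as a list of rows.

    In 1-based notation, entry (k, l) is y[k + l - 1]; with 0-based Python
    indices that becomes y[k + l].
    """
    T = len(y)
    if not (2 <= L < T / 2):
        raise ValueError("window length must satisfy 2 <= L < T/2")
    K = T - L + 1
    return [[y[k + l] for l in range(L)] for k in range(K)]


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]   # T = 7
X = trajectory_matrix(y, L=3)             # K = 7 - 3 + 1 = 5 rows
for row in X:
    print(row)
```

Each row is a lagged window of the series, so every anti-diagonal of the matrix holds copies of a single observation.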

Decomposition Step

Apply singular value decomposition to the trajectory matrix, $\mathbf{X}$:

$$\mathbf{X} = \mathbf{U} \mathbf{Q} \mathbf{V}^{\top}$$

where $\mathbf{U}$ represents the $K \times K$ matrix that contains the left-hand-side (LHS) eigenvectors, $\mathbf{Q}$ represents the diagonal $K \times L$ matrix that contains the singular values, and $\mathbf{V}$ represents the $L \times L$ matrix that contains the right-hand-side (RHS) eigenvectors.

Therefore,

$$\mathbf{X} = \sum_{l=1}^{L} \mathbf{X}^{(l)} = \sum_{l=1}^{L} \mathbf{u}_l \, q_l \, \mathbf{v}_l^{\top}$$

where $\mathbf{X}^{(l)}$ represents the $K \times L$ principal component matrix, $\mathbf{u}_l$ represents the $K \times 1$ left-hand-side (LHS) eigenvector, $q_l$ represents the singular value, and $\mathbf{v}_l$ represents the $L \times 1$ right-hand-side (RHS) eigenvector associated with the $l$th window index.
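The rank-one expansion can be checked numerically. The sketch below is dependency-free and uses a one-sided Jacobi iteration, which is a stand-in chosen for illustration and not necessarily the algorithm the procedure uses; in practice a library routine such as `numpy.linalg.svd` would be the usual choice. The name `jacobi_svd` and the sample series are hypothetical. The function returns only the $L$ triples $(q_l, \mathbf{u}_l, \mathbf{v}_l)$ that appear in the sum, not the full $K \times K$ matrix $\mathbf{U}$.

```python
import math


def jacobi_svd(X, sweeps=60, tol=1e-12):
    """One-sided Jacobi SVD of a K x L matrix given as a list of rows.

    Returns (q, u, v) triples sorted by decreasing singular value q, where
    u has length K and v has length L.
    """
    K, L = len(X), len(X[0])
    A = [list(map(float, row)) for row in X]   # columns converge to q_l * u_l
    V = [[float(i == j) for j in range(L)] for i in range(L)]
    for _ in range(sweeps):
        rotated = False
        for i in range(L - 1):
            for j in range(i + 1, L):
                alpha = sum(A[k][i] ** 2 for k in range(K))
                beta = sum(A[k][j] ** 2 for k in range(K))
                gamma = sum(A[k][i] * A[k][j] for k in range(K))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue                   # columns i, j already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.hypot(1.0, zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for mat in (A, V):             # apply the same rotation to A and V
                    for row in mat:
                        row[i], row[j] = (c * row[i] - s * row[j],
                                          s * row[i] + c * row[j])
        if not rotated:
            break
    triples = []
    for l in range(L):
        q = math.sqrt(sum(A[k][l] ** 2 for k in range(K)))
        u = [A[k][l] / q if q > 0 else 0.0 for k in range(K)]
        v = [V[i][l] for i in range(L)]
        triples.append((q, u, v))
    triples.sort(key=lambda item: -item[0])
    return triples


y = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0]        # illustrative series, T = 7
L = 3
K = len(y) - L + 1
X = [[y[k + l] for l in range(L)] for k in range(K)]
triples = jacobi_svd(X)

# Rebuild X as the sum of the rank-one matrices u_l * q_l * v_l^T.
recon = [[sum(q * u[k] * v[l] for q, u, v in triples) for l in range(L)]
         for k in range(K)]
max_err = max(abs(recon[k][l] - X[k][l]) for k in range(K) for l in range(L))
print("largest reconstruction error:", max_err)
```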

Grouping Step

For each group index, $m = 1, \ldots, M$, define a group of window indices $I_m \subset \{1, \ldots, L\}$. Let

$$\mathbf{X}_{I_m} = \sum_{l \in I_m} \mathbf{X}^{(l)} = \sum_{l \in I_m} \mathbf{u}_l \, q_l \, \mathbf{v}_l^{\top}$$

represent the grouped trajectory matrix for group $I_m$. If the groupings represent a spectral partition,

$$\bigcup_{m=1}^{M} I_m = \{1, \ldots, L\} \quad \text{and} \quad I_m \cap I_n = \emptyset \quad \text{for } m \ne n$$

then according to the singular value decomposition theory,

$$\mathbf{X} = \sum_{m=1}^{M} \mathbf{X}_{I_m}$$
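The spectral partition condition is easy to test for a proposed set of groups. The following helper is a hypothetical illustration (the name `is_spectral_partition` is not part of the procedure); it checks that the groups are pairwise disjoint and that together they cover every window index from 1 to $L$.

```python
def is_spectral_partition(groups, L):
    """Return True if the groups are pairwise disjoint and cover {1, ..., L}."""
    seen = set()
    for group in groups:
        members = set(group)
        if members & seen:          # overlaps an earlier group
            return False
        seen |= members
    return seen == set(range(1, L + 1))


print(is_spectral_partition([[1], [2, 3], [4, 5, 6]], L=6))   # disjoint and complete
print(is_spectral_partition([[1, 2], [2, 3]], L=3))           # index 2 appears twice
print(is_spectral_partition([[1], [3]], L=3))                 # index 2 is missing
```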

Averaging Step

For each group index, $m = 1, \ldots, M$, compute the diagonal average of $\mathbf{X}_{I_m}$,

$$\tilde{x}_t^{(m)} = \frac{1}{n_t} \sum_{l=s_t}^{e_t} x_{t-l+1,\, l}^{(m)}$$

where

$$\begin{array}{llll}
s_t = 1, & e_t = t, & n_t = t & \text{for } 1 \le t < L \\
s_t = 1, & e_t = L, & n_t = L & \text{for } L \le t \le T - L + 1 \\
s_t = t - T + L, & e_t = L, & n_t = T - t + 1 & \text{for } T - L + 1 < t \le T
\end{array}$$

If the groupings represent a spectral partition, then by definition

$$y_t = \sum_{m=1}^{M} \tilde{x}_t^{(m)}$$

Hence, singular spectrum analysis additively decomposes the original time series, $y_t$, into $M$ component series $\tilde{x}_t^{(m)}$ for $m = 1, \ldots, M$.
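The three index ranges of the diagonal average translate directly into code. The sketch below uses a hypothetical function name, `diagonal_average`, and assumes a $K \times L$ input with $K \ge L$. As a sanity check, it averages an unmodified trajectory matrix (the trivial partition with $M = 1$), which should return the original series.

```python
def diagonal_average(Xm):
    """Average a K x L matrix (list of rows) along its anti-diagonals.

    Returns a series of length T = K + L - 1. The loop variables t and l are
    1-based to mirror the formula, so element (t - l + 1, l) is Xm[t - l][l - 1].
    """
    K, L = len(Xm), len(Xm[0])
    if K < L:
        raise ValueError("expected K >= L")
    T = K + L - 1
    series = []
    for t in range(1, T + 1):
        if t < L:
            s, e, n = 1, t, t
        elif t <= T - L + 1:
            s, e, n = 1, L, L
        else:
            s, e, n = t - T + L, L, T - t + 1
        total = sum(Xm[t - l][l - 1] for l in range(s, e + 1))
        series.append(total / n)
    return series


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
L = 3
K = len(y) - L + 1
X = [[y[k + l] for l in range(L)] for k in range(K)]   # trajectory matrix
recovered = diagonal_average(X)
print(recovered)
```

Because diagonal averaging is linear, applying it to each grouped matrix of a spectral partition and summing the results gives the same series.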

Computing W-Correlations

An important step in SSA is specifying the groups, $I_m \subset \{1, \ldots, L\}$ for $m = 1, \ldots, M$. To automate the SSA grouping step, the weighted correlations (w-correlations) are computed:

$$\rho_{i,j}^{(w)} = \frac{\left( \tilde{x}_t^{(i)}, \tilde{x}_t^{(j)} \right)_w}{\left\| \tilde{x}_t^{(i)} \right\|_w \left\| \tilde{x}_t^{(j)} \right\|_w}$$

where

$$\left( \tilde{x}_t^{(i)}, \tilde{x}_t^{(j)} \right)_w = \sum_{t=1}^{T} w_t \, \tilde{x}_t^{(i)} \, \tilde{x}_t^{(j)}, \qquad \left\| \tilde{x}_t^{(i)} \right\|_w = \sqrt{\left( \tilde{x}_t^{(i)}, \tilde{x}_t^{(i)} \right)_w}$$

and $w_t = \min(t, L, T - t)$.
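A minimal sketch of the w-correlation follows, with hypothetical function names. It takes the weights exactly as written in the text, $w_t = \min(t, L, T - t)$, and normalizes by the square roots of the weighted inner products of each series with itself. Note that this weight formula gives $w_T = 0$; a variant with $T - t + 1$ as the last argument would match the counts $n_t$ from the Averaging Step, but the sketch follows the text.

```python
import math


def w_weights(T, L):
    """Weights w_t = min(t, L, T - t) for t = 1, ..., T, as stated in the text."""
    return [min(t, L, T - t) for t in range(1, T + 1)]


def w_inner(a, b, w):
    """Weighted inner product of two equal-length series."""
    return sum(wt * at * bt for wt, at, bt in zip(w, a, b))


def w_correlation(a, b, L):
    """Weighted correlation between two reconstructed component series."""
    w = w_weights(len(a), L)
    return w_inner(a, b, w) / math.sqrt(w_inner(a, a, w) * w_inner(b, b, w))


trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
wiggle = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
print(w_weights(7, 3))
print(w_correlation(trend, trend, L=3))     # a series is perfectly w-correlated with itself
print(w_correlation(trend, wiggle, L=3))
```

Values near zero suggest that two components are well separated; values near one in absolute value suggest that they belong in the same group.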

Specifying the Window Length

You can explicitly specify the maximum window length, $2 \le L \le 1000$, by using the LENGTH= option, or you can implicitly specify the window length by using the INTERVAL= option in the ID statement or the SEASONALITY= option in the PROC TIMESERIES statement. Either way, the window length is reduced based on the accumulated time series length, $T$, to enforce the requirement that $2 \le L \le T/2$.
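One way to picture the reduction is as a simple clamp. The text does not say exactly how the procedure reduces the window length, so the rule below (take the largest integer that does not exceed the requested length, the limit of 1000, or $T/2$) is an assumption, as are the function name `effective_window_length` and the error raised for very short series.

```python
def effective_window_length(requested_L, T, max_L=1000):
    """Clamp a requested window length so that 2 <= L <= T/2 (assumed rule)."""
    L = min(requested_L, max_L, T // 2)
    if L < 2:
        # Hypothetical handling: the text does not describe this case.
        raise ValueError("series too short: 2 <= L <= T/2 cannot be satisfied")
    return L


print(effective_window_length(12, T=144))   # long series: requested length is kept
print(effective_window_length(12, T=20))    # short series: reduced to T // 2
```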

Specifying the Groups

The GROUPS=(numlist)…(numlist) option explicitly specifies the composition and number of groups, $I_m \subset \{1, \ldots, L\}$. Alternatively, you can use the THRESHOLDPCT= option in the SSA statement to specify the grouping implicitly. The THRESHOLDPCT= option is useful for removing noise or less dominant patterns from the accumulated time series.

Let $0 < \alpha < 1$ be the cumulative percentage singular value that is specified in the THRESHOLDPCT= option. Then the last group, $I_M = \{ l_\alpha, \ldots, L \}$, is determined by the smallest value of $l_\alpha$ such that

$$\left( \sum_{l=1}^{l_\alpha - 1} q_l \Big/ \sum_{l=1}^{L} q_l \right) \ge \alpha, \qquad 1 < l_\alpha \le L$$

Using this rule, the last group, $I_M$, describes the least dominant patterns in the time series, and the size of the last group is at least one and is less than the window length, $L \ge 2$.
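The threshold rule amounts to a running sum over the singular values. The sketch below uses a hypothetical function name and made-up singular values. It returns the smallest $l_\alpha$ that satisfies the inequality, and returns `None` when no index in $2, \ldots, L$ does, because the text does not describe that case.

```python
def last_group_start(q, alpha):
    """Return the smallest l_alpha in {2, ..., L} with
    (q_1 + ... + q_{l_alpha - 1}) / (q_1 + ... + q_L) >= alpha, else None.

    q holds the singular values in decreasing order (q[0] is q_1).
    """
    L = len(q)
    total = sum(q)
    cumulative = 0.0
    for l_alpha in range(2, L + 1):
        cumulative += q[l_alpha - 2]      # adds q_{l_alpha - 1} in 1-based terms
        if cumulative / total >= alpha:
            return l_alpha
    return None


q = [5.0, 3.0, 1.0, 1.0]                  # illustrative singular values, L = 4
start = last_group_start(q, alpha=0.75)
last_group = list(range(start, len(q) + 1))
print(start, last_group)
```

With these values the first two singular values account for 80% of the total, so the last group collects window indices 3 and 4.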

The magnitudes of the principal components that are plotted using the PLOT=SSA option and selected by the THRESHOLDPCT= option are based on the singular values that appear on the diagonal of $\mathbf{Q}$. Alternatively, each principal component's contribution to variation in the series can be quantified by using the squares of the singular values. The relative contributions of the principal components to variation in the series are included in the printed tabular output that is produced by the PRINT=SSA option.

Automatic Grouping

Besides specifying the groups explicitly, you can use the GROUPS=AUTO(number) option to perform automatic grouping. Automatic grouping performs the following steps:

  1. Initially assume the maximal number of groups: $M = L$.

  2. Diagonally average the groups as described previously: $\tilde{x}_t^{(m)}$ for $m = 1, \ldots, L$.

  3. Compute the weighted correlations (w-correlations) between groups: $\rho_{i,j}^{(w)}$.

  4. Choose the groups based on the w-correlations for which the absolute values are close to one. More formally, choose $I_m \subset \{1, \ldots, L\}$ such that $\left| \rho_{i,j}^{(w)} \right| \approx 1$ whenever $i, j \in I_m$.
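Step 4 can be sketched as a greedy pass over a precomputed w-correlation matrix. The text does not define how close to one the absolute values must be, how candidate groups are merged, or what the number in AUTO(number) controls, so the cutoff of 0.9, the greedy rule, and the function name `group_by_w_correlation` are all assumptions made for illustration. An index joins an existing group only if it is strongly correlated with every member, which keeps the condition true for all pairs within a group.

```python
def group_by_w_correlation(rho, threshold=0.9):
    """Greedily group window indices whose pairwise |w-correlation| >= threshold.

    rho is an L x L symmetric matrix (list of rows). Returns groups of
    1-based window indices. The threshold is a hypothetical cutoff.
    """
    groups = []
    for i in range(len(rho)):
        for group in groups:
            if all(abs(rho[i][j]) >= threshold for j in group):
                group.append(i)
                break
        else:
            groups.append([i])            # no compatible group: start a new one
    return [[i + 1 for i in group] for group in groups]


# Illustrative matrix: components 2 and 3 are strongly w-correlated.
rho = [
    [1.00, 0.05, 0.02, 0.00],
    [0.05, 1.00, 0.97, 0.01],
    [0.02, 0.97, 1.00, 0.03],
    [0.00, 0.01, 0.03, 1.00],
]
groups = group_by_w_correlation(rho)
print(groups)
```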

Last updated: June 19, 2025