COPULA Procedure

Dependence Measures

There are three basic types of measures: linear correlation, rank correlation, and tail dependence. Linear correlation is given by

\[ \rho \equiv \mathrm{corr}(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)}\,\sqrt{\mathrm{var}(Y)}} \]

The linear correlation coefficient carries very limited information about the joint properties of the variables. A well-known property is that independence implies zero correlation, but zero correlation does not imply independence. In addition, there exist distinct bivariate distributions that have the same marginal distributions and the same correlation coefficient. These results suggest that caution must be used when interpreting the linear correlation.
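The point about zero correlation and independence can be checked with a small exact calculation. The following is a minimal sketch, not part of PROC COPULA: the three-point distribution and the helper name `expect` are illustrative assumptions. X takes the values -1, 0, and 1 with equal probability, and Y is set to the square of X, so Y is completely determined by X.

```python
from fractions import Fraction

# Illustrative assumption: X is uniform on {-1, 0, 1} and Y = X**2.
support = [(-1, 1), (0, 0), (1, 1)]   # (x, y) pairs, each with probability 1/3
p = Fraction(1, 3)

def expect(f):
    """Exact expectation of f(X, Y) under the three-point distribution."""
    return sum(p * f(x, y) for x, y in support)

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = p          # P(X=0, Y=0): only the point (0, 0) qualifies
p_product = p * p    # P(X=0) * P(Y=0)

print("cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint, "but P(X=0) * P(Y=0) =", p_product)
```

The covariance is exactly zero, so the linear correlation is zero, yet the joint probability does not factor into the product of the marginal probabilities.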

Another statistical measure of dependence is called rank correlation, which is nonparametric. Kendall’s tau, for example, is the covariance between the sign statistics $\mathrm{sign}(X_1 - \tilde{X}_1)$ and $\mathrm{sign}(X_2 - \tilde{X}_2)$, where $(\tilde{X}_1, \tilde{X}_2)$ is an independent copy of $(X_1, X_2)$:

\[ \rho_\tau \equiv E\left[ \mathrm{sign}\big( (X_1 - \tilde{X}_1)(X_2 - \tilde{X}_2) \big) \right] \]

The sign function (sometimes written as sgn) is defined by

\[ \mathrm{sign}(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases} \]
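A sample analogue of Kendall’s tau replaces the expectation with an average of the sign of the product of differences over all pairs of observations. The sketch below is a plain pair-counting estimator with no correction for ties. The function names and the data are illustrative assumptions, and this is not a description of how PROC COPULA computes the statistic.

```python
from itertools import combinations

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau(xs, ys):
    """Average of sign((x_i - x_j) * (y_i - y_j)) over all pairs i < j."""
    pairs = list(combinations(range(len(xs)), 2))
    total = sum(sign((xs[i] - xs[j]) * (ys[i] - ys[j])) for i, j in pairs)
    return total / len(pairs)

# Illustrative data: 8 of the 10 pairs are concordant and 2 are discordant.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
tau = kendall_tau(xs, ys)
print(tau)
```

Each concordant pair contributes +1 and each discordant pair contributes -1, so the estimate for these data is (8 - 2) / 10.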

Spearman’s rho is the correlation between the transformed random variables:

\[ \rho_S(X_1, X_2) \equiv \rho\big( F_1(X_1), F_2(X_2) \big) \]

The variables are transformed by their distribution functions so that the transformed variables are uniformly distributed on $[0, 1]$. The rank correlations depend only on the copula of the random variables and are indifferent to the marginal distributions. Like linear correlation, the rank correlations have their limitations. In particular, there are different copulas that result in the same rank correlation.
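The indifference to the marginal distributions can be illustrated on a sample, where ranks play the role of the distribution functions. The sketch below computes Spearman’s rho as the linear correlation of the ranks, then applies strictly increasing transformations to each variable. The helper names, the data, and the choice of transformations are illustrative assumptions, and the `ranks` helper assumes that there are no ties.

```python
import math

def ranks(values):
    """Rank (1..n) of each value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def pearson(a, b):
    """Sample linear correlation coefficient."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def spearman_rho(xs, ys):
    """Linear correlation of the ranks of xs and ys."""
    return pearson(ranks(xs), ranks(ys))

xs = [0.1, 0.5, 0.9, 1.7, 2.4, 3.0]
ys = [0.3, 0.2, 1.1, 0.8, 2.0, 2.6]

# Strictly increasing transformations change the margins but not the ranks.
xs_t = [math.exp(x) for x in xs]
ys_t = [y ** 3 for y in ys]

rho_raw = spearman_rho(xs, ys)
rho_transformed = spearman_rho(xs_t, ys_t)
linear_raw = pearson(xs, ys)
linear_transformed = pearson(xs_t, ys_t)
print(rho_raw, rho_transformed)
print(linear_raw, linear_transformed)
```

The two rank correlations agree exactly because the ranks are unchanged, whereas the linear correlation changes when the margins change.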

A third measure focuses on only part of the joint properties between the variables. Tail dependence measures the dependence when both variables are at extreme values. Formally, the tail dependence coefficients can be defined as limits of the conditional probabilities of quantile exceedances. There are two types of tail dependence:

  • The upper tail dependence, denoted $\lambda_u$, is

    \[ \lambda_u(X_1, X_2) \equiv \lim_{q \to 1^-} P\big( X_2 > F_2^{-1}(q) \mid X_1 > F_1^{-1}(q) \big) \]

    when the limit exists, with $\lambda_u \in [0, 1]$. Here $F_j^{-1}$ is the quantile function (that is, the inverse of the CDF).

  • The lower tail dependence is defined symmetrically.
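For a finite sample, the conditional exceedance probabilities in these definitions can be approximated at a fixed quantile level q instead of in the limit. The following is a rough sketch under that assumption. The function names, the empirical quantile convention, and the test data are all illustrative, and the result at any single q is only a crude indication of the limiting value. The lower tail version conditions on both variables falling at or below their q quantiles, with q near 0.

```python
import math

def empirical_quantile(values, q):
    """Smallest sample value v such that at least a fraction q of the data is <= v."""
    ordered = sorted(values)
    k = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[k]

def upper_tail_estimate(xs, ys, q):
    """Empirical P(Y > y_q | X > x_q) at a fixed level q close to 1."""
    tx, ty = empirical_quantile(xs, q), empirical_quantile(ys, q)
    exceed = [y for x, y in zip(xs, ys) if x > tx]
    if not exceed:
        return float("nan")  # no observations beyond the threshold
    return sum(y > ty for y in exceed) / len(exceed)

def lower_tail_estimate(xs, ys, q):
    """Empirical P(Y <= y_q | X <= x_q) at a fixed level q close to 0."""
    tx, ty = empirical_quantile(xs, q), empirical_quantile(ys, q)
    below = [y for x, y in zip(xs, ys) if x <= tx]
    if not below:
        return float("nan")
    return sum(y <= ty for y in below) / len(below)

# Two extreme cases: perfectly increasing and perfectly decreasing dependence.
xs = list(range(1000))
same = list(xs)
reverse = xs[::-1]
print(upper_tail_estimate(xs, same, 0.95), upper_tail_estimate(xs, reverse, 0.95))
print(lower_tail_estimate(xs, same, 0.05), lower_tail_estimate(xs, reverse, 0.05))
```

In the perfectly increasing case every exceedance of one variable coincides with an exceedance of the other, so both estimates equal 1. In the perfectly decreasing case none do, so both estimates equal 0.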

Tail dependence is hard to detect by looking at a scatter plot of realizations of two random variables. One graphical way to detect tail dependence between two variables is to create the chi plot of those two variables. The chi plot, as defined in Fisher and Switzer (2001), has characteristic patterns that depend on the dependence structure between the variables. The chi plot for the random variables $X$ and $Y$ is a scatter plot of the pairs $(\lambda_i, \chi_i)$ for each data point $(x_i, y_i)$. The value $\lambda_i$ is a measure of the distance of the data point $(x_i, y_i)$ from the center of the data as measured by the median values of $(x_i, y_i)$, and $\chi_i$ is a correlation coefficient between dichotomized values of $X$ and $Y$. A positive $\lambda_i$ means that $x_i$ and $y_i$ are either both large with respect to their median values or both small. A negative $\lambda_i$ means that $x_i$ or $y_i$ is large with respect to its median, whereas the other value is small. Signs of tail dependence manifest as clusters of points that are significantly far from the $\chi$ axis around $\lambda$ values of $\pm 1$. If $X$ and $Y$ are uncorrelated, the $\chi$ values cluster around the $\lambda$ axis.
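The pairs that make up a chi plot can be computed directly from the data. The sketch below assumes the commonly cited form of the Fisher and Switzer statistics, in which each point is compared with the other n - 1 points through empirical distribution functions. The function name and the data are illustrative, points where $\chi_i$ is undefined are simply skipped, and the output of PROC COPULA might differ in details such as trimming or control limits.

```python
import math

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def chi_plot_points(xs, ys):
    """Return (lambda_i, chi_i) pairs, one per usable data point."""
    n = len(xs)
    points = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Empirical marginal and joint distribution values at point i.
        f = sum(xs[j] <= xs[i] for j in others) / (n - 1)
        g = sum(ys[j] <= ys[i] for j in others) / (n - 1)
        h = sum(xs[j] <= xs[i] and ys[j] <= ys[i] for j in others) / (n - 1)
        denom = f * (1 - f) * g * (1 - g)
        if denom <= 0:
            continue  # chi_i is undefined at the extremes of either margin
        chi = (h - f * g) / math.sqrt(denom)
        # Signed distance from the bivariate median, scaled to [-1, 1].
        s = sign((f - 0.5) * (g - 0.5))
        lam = 4 * s * max((f - 0.5) ** 2, (g - 0.5) ** 2)
        points.append((lam, chi))
    return points

# Illustrative data: perfectly increasing dependence.
xs = list(range(11))
ys = list(range(11))
points = chi_plot_points(xs, ys)
for lam, chi in points:
    print(round(lam, 3), round(chi, 3))
```

For perfectly increasing data every usable point has a $\chi_i$ of 1 (up to rounding), which is the opposite extreme from the clustering around the $\lambda$ axis described above.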

Last updated: June 19, 2025