SPATIALREG Procedure

Specifying the Spatial Weights Matrix

The spatial weights matrix $\mathbf{W}$ plays a vital role in spatial econometric modeling. If you fit a purely linear model without SLX effects, you do not need a $\mathbf{W}$ matrix. For other types of models in PROC SPATIALREG, you need to provide a spatial weights matrix to fit the model. Although the creation of the $\mathbf{W}$ matrix is often problem-specific, there are some general guidelines to consider. Two common ways to create the $\mathbf{W}$ matrix are $k$-order binary contiguity matrices and $k$-nearest neighbor matrices (Elhorst 2013).

k-Order Binary Contiguity Matrices

You start with the spatial contiguity matrix $\mathbf{C}$. In the case of first-order neighbors ($k = 1$), a value of 1 for the $(i, j)$th entry in $\mathbf{C}$ indicates that the two units $i$ and $j$ are neighbors of each other, and 0 indicates otherwise. The neighbor relationship is often defined based on sharing a common boundary. To generalize this, a $k$-order neighbor ($k \ge 2$) of a unit $i$ is any unit that is a neighbor of a $(k-1)$-order neighbor of unit $i$. In this sense, two units $i$ and $j$ that are not first-order neighbors can still be second-order neighbors if unit $j$ is a neighbor of a first-order neighbor of unit $i$.
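The recursive definition can be illustrated with a minimal Python sketch. It is not part of PROC SPATIALREG; the function name `next_order` and the four-unit neighbor lists are hypothetical and match the example in this section. The sketch also assumes that a unit already reached at a lower order, or the unit itself, is not counted again at a higher order.

```python
# First-order neighbor sets for four hypothetical units (illustrative data).
first_order = {
    "L1": {"L2", "L4"},
    "L2": {"L1"},
    "L3": {"L4"},
    "L4": {"L1", "L3"},
}


def next_order(first_order, prev_order, seen):
    """Return the k-order neighbors, given the (k-1)-order neighbors.

    Assumption: the unit itself and any unit already reached at a lower
    order (tracked in `seen`) are excluded from the k-order set.
    """
    result = {}
    for unit, prev in prev_order.items():
        candidates = set()
        for p in prev:
            # Every neighbor of a (k-1)-order neighbor is a candidate.
            candidates |= first_order[p]
        result[unit] = candidates - seen[unit]
    return result


# Units seen so far: each unit itself plus its first-order neighbors.
seen = {unit: {unit} | nbrs for unit, nbrs in first_order.items()}
second_order = next_order(first_order, first_order, seen)
```

In this layout, L3 is a second-order neighbor of L1 because L3 is a neighbor of L4, which is a first-order neighbor of L1.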

As an example, a first-order binary contiguity matrix might look like the following:

\[
\mathbf{C} =
\begin{pmatrix}
\text{SID} & \text{L1} & \text{L2} & \text{L3} & \text{L4} \\
\text{L1} & 0 & 1 & 0 & 1 \\
\text{L2} & 1 & 0 & 0 & 0 \\
\text{L3} & 0 & 0 & 0 & 1 \\
\text{L4} & 1 & 0 & 1 & 0
\end{pmatrix}
\]

The diagonal elements of $\mathbf{C}$ are zeros because, in general, a unit is not considered to be a neighbor of itself. Moreover, the two units L2 and L4 are neighbors of L1; L2 has L1 as its only neighbor; L3 has L4 as its only neighbor; and L4 has L1 and L3 as its neighbors. You can create the spatial weights matrix $\mathbf{W}$ by row-standardizing the contiguity matrix $\mathbf{C}$. To do so, you divide the entries in each row of $\mathbf{C}$ by the sum of that row. The spatial weights matrix $\mathbf{W}$, which is the row-standardized version of $\mathbf{C}$, is as follows:

\[
\mathbf{W} =
\begin{pmatrix}
\text{SID} & \text{L1} & \text{L2} & \text{L3} & \text{L4} \\
\text{L1} & 0 & \tfrac{1}{2} & 0 & \tfrac{1}{2} \\
\text{L2} & 1 & 0 & 0 & 0 \\
\text{L3} & 0 & 0 & 0 & 1 \\
\text{L4} & \tfrac{1}{2} & 0 & \tfrac{1}{2} & 0
\end{pmatrix}
\]
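The row standardization described above can be reproduced with a short Python sketch. This illustrates the arithmetic only and is not how PROC SPATIALREG is implemented; the helper name `row_standardize` is hypothetical, and leaving an all-zero row unchanged for a unit without neighbors is an assumption of the sketch.

```python
from fractions import Fraction


def row_standardize(C):
    """Divide each entry of C by the sum of its row."""
    W = []
    for row in C:
        total = sum(row)
        if total == 0:
            # Assumption: a unit with no neighbors keeps an all-zero row.
            W.append([Fraction(0)] * len(row))
        else:
            W.append([Fraction(value, total) for value in row])
    return W


# First-order contiguity matrix for units L1 through L4 (labels omitted).
C = [
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
]
W = row_standardize(C)
```

Exact fractions are used so that the entries match the one-half values in the matrix; every row of `W` that has at least one neighbor sums to 1.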

k-Nearest Neighbor Matrices

You can create a spatial contiguity matrix based on a distance metric. Let $d_{ij}$ denote the distance between the two units $i$ and $j$, which might be the Euclidean distance between centroids of the two spatial units. Let $(\mathrm{lon}_i, \mathrm{lat}_i)$ and $(\mathrm{lon}_j, \mathrm{lat}_j)$ be the centroids of units $i$ and $j$, where $1 \le i, j \le n$, and lon and lat denote the longitude and latitude, respectively. Under the Euclidean distance metric, the distance $d_{ij}$ between units $i$ and $j$ is

\[
d_{ij} = \sqrt{(\mathrm{lat}_i - \mathrm{lat}_j)^2 + (\mathrm{lon}_i - \mathrm{lon}_j)^2}
\]

After computing the distance between unit $i$ and the other units under a certain metric, you sort $d_{ij}$ in ascending order; for example, $d_{ij_1} \le d_{ij_2} \le \cdots \le d_{ij_k} \le \cdots \le d_{ij_{n-1}}$. For a given $k$, let $N_k(i) = \{j_1, j_2, \ldots, j_k\}$ be the set that contains the indices of the $k$ nearest neighbors of unit $i$; then the $(i, j)$th entries of the contiguity matrix $\mathbf{C}$ are defined as

\[
C_{ij} =
\begin{cases}
1 & \text{if } j \in N_k(i) \\
0 & \text{otherwise}
\end{cases}
\]

The $(i, j)$th entry of the corresponding row-standardized matrix $\mathbf{W}$ is $W_{ij} = C_{ij} \left\{ \sum_{j \in N_k(i)} C_{ij} \right\}^{-1}$.
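The construction can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's own code: the function name `knn_contiguity` and the centroid coordinates are hypothetical, distances are Euclidean as in the formula above, and ties in distance are broken by unit index.

```python
import math


def knn_contiguity(coords, k):
    """Build a 0/1 contiguity matrix from (lon, lat) centroids.

    C[i][j] is 1 when unit j is among the k nearest neighbors of unit i.
    """
    n = len(coords)
    C = [[0] * n for _ in range(n)]
    for i, (lon_i, lat_i) in enumerate(coords):
        others = [j for j in range(n) if j != i]
        # Sort the other units by Euclidean distance to unit i.
        # Assumption: ties are broken by the smaller index.
        others.sort(
            key=lambda j: (
                math.hypot(lon_i - coords[j][0], lat_i - coords[j][1]),
                j,
            )
        )
        for j in others[:k]:
            C[i][j] = 1
    return C


# Hypothetical centroids (lon, lat) for four units placed on a line.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
C = knn_contiguity(coords, k=2)

# Row-standardize: each row has exactly k ones, so each weight is 1/k.
W = [[value / sum(row) for value in row] for row in C]
```

Because every unit has exactly $k$ nearest neighbors, every nonzero weight in this sketch equals $1/k$.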

Unlike the $k$-order binary contiguity matrix, which is often symmetric by construction, $k$-nearest neighbor matrices can be asymmetric. To obtain a symmetric $k$-nearest neighbor matrix, you can define the $(i, j)$th entries of the contiguity matrix $\mathbf{C}$ as follows:

\[
C_{ij} =
\begin{cases}
1 & \text{if } j \in N_k(i) \text{ or } i \in N_k(j) \\
0 & \text{otherwise}
\end{cases}
\]
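A minimal Python sketch of this symmetrization follows. The helper name `symmetrize` and the input matrix are hypothetical; the matrix is a 1-nearest-neighbor example in which unit 2 points to unit 1 but not the reverse.

```python
def symmetrize(C):
    """Set entry (i, j) to 1 when j is a neighbor of i or i is a neighbor of j."""
    n = len(C)
    return [[1 if C[i][j] or C[j][i] else 0 for j in range(n)] for i in range(n)]


# Hypothetical asymmetric 1-nearest-neighbor matrix for four units.
C = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
C_sym = symmetrize(C)
```

After symmetrization, a unit can have more than $k$ neighbors, so the rows of the row-standardized matrix no longer share a common weight of $1/k$.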

In addition to the Euclidean distance measure, you can use other distance metrics as appropriate. A variant of $k$-nearest neighbor matrices, $\mathbf{C}^{*}$, that is used in some empirical studies defines its $(i, j)$th entries as

\[
C_{ij}^{*} =
\begin{cases}
1 & \text{if } d_{ij} \le d_{\text{cutoff}} \\
0 & \text{otherwise}
\end{cases}
\]

where $d_{\text{cutoff}}$ is a prespecified threshold distance.
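The threshold variant can be sketched in Python in the same way. The function name `cutoff_contiguity`, the centroids, and the cutoff value are hypothetical. The sketch uses Euclidean distance and assumes that the diagonal stays at zero, in line with the earlier convention that a unit is not its own neighbor.

```python
import math


def cutoff_contiguity(coords, d_cutoff):
    """Build a 0/1 matrix whose (i, j) entry is 1 when d_ij <= d_cutoff."""
    n = len(coords)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Assumption: a unit is not treated as its own neighbor.
                continue
            d_ij = math.hypot(
                coords[i][0] - coords[j][0], coords[i][1] - coords[j][1]
            )
            if d_ij <= d_cutoff:
                C[i][j] = 1
    return C


# Hypothetical centroids (lon, lat) and threshold distance.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
C_star = cutoff_contiguity(coords, d_cutoff=2.5)
```

With this threshold the last unit has no neighbors, so its row sums to zero. A unit like this needs separate handling before row standardization, because dividing by the row sum is undefined.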

In addition to the two constructions of spatial weights matrices that are presented earlier, see Elhorst (2013) and the references therein for more information about other ways to create a spatial weights matrix. In practice, you can define a neighbor relation that is specific to your problem. For example, you can define two spatial units that are far apart to be neighbors because they share some attributes (such as population sizes larger than 500,000).

The data set that you specify in the WMAT= option is row-standardized by default to create a spatial weights matrix. This means that if you specify WMAT=C, PROC SPATIALREG row-standardizes the spatial contiguity matrix to create a spatial weights matrix. If you want to suppress row standardization, you must specify the NONORMALIZE option.

Last updated: June 19, 2025