The spatial weights matrix $\mathbf{W}$ plays a vital role in spatial econometric modeling. If you fit a purely linear model without SLX effects, you do not need a $\mathbf{W}$ matrix. For other types of models in PROC SPATIALREG, you need to provide a spatial weights matrix to fit the model. Although the creation of the $\mathbf{W}$ matrix is often problem-specific, there are some general guidelines to consider. Two common ways to create the $\mathbf{W}$ matrix are k-order binary contiguity matrices and k-nearest neighbor matrices (Elhorst 2013).
You start with the spatial contiguity matrix $\mathbf{C}$. In the case of first-order neighbors ($k = 1$), a value of 1 for the $(i,j)$th entry in $\mathbf{C}$ indicates that the two units i and j are neighbors of each other, and 0 indicates otherwise. The neighbor relationship is often defined based on sharing a common boundary. To generalize this, a k-order neighbor ($k > 1$) of a unit i is any unit that is a neighbor of a $(k-1)$-order neighbor of unit i. In this sense, two units i and j that are not first-order neighbors can still be second-order neighbors if unit j is a neighbor of a first-order neighbor of unit i.
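The k-order neighbor definition can be illustrated with a short Python sketch. It is not part of PROC SPATIALREG. The function name `k_order_contiguity` and the four-unit matrix are hypothetical, and the sketch assumes one common convention: unit j counts as a k-order neighbor of unit i only when the shortest chain of first-order neighbor links from i to j has exactly k links.

```python
def k_order_contiguity(c, k):
    """Return a binary matrix that flags exactly-k-order neighbors.

    Assumption of this sketch: unit j is a k-order neighbor of unit i when
    the shortest chain of first-order links from i to j has length k.
    """
    n = len(c)
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        visited = {i}   # units already reached at a lower order
        frontier = {i}  # units reached at the current order
        for _ in range(k):
            # Move one link outward, skipping units reached earlier.
            frontier = {j for u in frontier for j in range(n)
                        if c[u][j] == 1 and j not in visited}
            visited |= frontier
        for j in frontier:
            result[i][j] = 1
    return result


# Hypothetical first-order contiguity matrix for four units (indices 0 to 3).
c1 = [
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
]
c2 = k_order_contiguity(c1, 2)
for row in c2:
    print(row)
```

In this example, units 0 and 2 are not first-order neighbors, but both border unit 3, so they are flagged as second-order neighbors.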
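As an example, a first-order binary contiguity matrix $\mathbf{C}$ for four units L1, L2, L3, and L4 might look like the following:
\[
\mathbf{C} =
\begin{array}{c|cccc}
 & \text{L1} & \text{L2} & \text{L3} & \text{L4} \\ \hline
\text{L1} & 0 & 1 & 0 & 1 \\
\text{L2} & 1 & 0 & 0 & 0 \\
\text{L3} & 0 & 0 & 0 & 1 \\
\text{L4} & 1 & 0 & 1 & 0
\end{array}
\]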
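The diagonal elements of $\mathbf{C}$ are zeros because, in general, a unit is not considered to be a neighbor of itself. Moreover, the two units L2 and L4 are neighbors of L1; L2 has L1 as its only neighbor; L3 has L4 as its only neighbor; and L4 has L1 and L3 as its neighbors. You can create the spatial weights matrix $\mathbf{W}$ by row-standardizing the contiguity matrix $\mathbf{C}$. To do so, you divide the entries in each row of $\mathbf{C}$ by the sum of that row. The spatial weights matrix $\mathbf{W}$, which is the row-standardized version of $\mathbf{C}$, is as follows:
\[
\mathbf{W} =
\begin{array}{c|cccc}
 & \text{L1} & \text{L2} & \text{L3} & \text{L4} \\ \hline
\text{L1} & 0 & 0.5 & 0 & 0.5 \\
\text{L2} & 1 & 0 & 0 & 0 \\
\text{L3} & 0 & 0 & 0 & 1 \\
\text{L4} & 0.5 & 0 & 0.5 & 0
\end{array}
\]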
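The row standardization described above can be checked with a minimal Python sketch applied to the four-unit example (L1 through L4). The helper name `row_standardize` is hypothetical. Leaving a row of zeros unchanged is an assumption of this sketch, not documented behavior of PROC SPATIALREG.

```python
def row_standardize(c):
    """Divide each row by its row sum.

    Assumption of this sketch: a row that sums to zero (a unit with no
    neighbors) is returned as a row of zeros.
    """
    w = []
    for row in c:
        total = sum(row)
        w.append([x / total if total else 0.0 for x in row])
    return w


# Contiguity matrix for units L1, L2, L3, L4 as described in the text.
c = [
    [0, 1, 0, 1],  # L1 borders L2 and L4
    [1, 0, 0, 0],  # L2 borders L1
    [0, 0, 0, 1],  # L3 borders L4
    [1, 0, 1, 0],  # L4 borders L1 and L3
]
w = row_standardize(c)
for row in w:
    print(row)
```

Each nonzero row of the result sums to 1, so the spatial lag of a variable is a weighted average over the neighbors of each unit.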
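You can also create a spatial contiguity matrix based on a distance metric. Let $d_{ij}$ denote the distance between the two units i and j, which might be the Euclidean distance between the centroids of the two spatial units. Let $c_i$ and $c_j$ be the centroids of units i and j, where $c_i = (\mathrm{lon}_i, \mathrm{lat}_i)$ and $c_j = (\mathrm{lon}_j, \mathrm{lat}_j)$, and lon and lat denote the longitude and latitude, respectively. Under the Euclidean distance metric, the distance $d_{ij}$ between units i and j is
\[
d_{ij} = \sqrt{(\mathrm{lon}_i - \mathrm{lon}_j)^2 + (\mathrm{lat}_i - \mathrm{lat}_j)^2}
\]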
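After computing the distance between unit i and the other units under a certain metric, you sort the distances $d_{ij}$ in ascending order; for example, $d_{ij_{(1)}} \le d_{ij_{(2)}} \le \cdots \le d_{ij_{(n-1)}}$, where n is the number of spatial units. For a given k, let $N_k(i) = \{j_{(1)}, j_{(2)}, \ldots, j_{(k)}\}$ be the set that contains the indices of the k-nearest neighbors of unit i; then the $(i,j)$th entries of the contiguity matrix $\mathbf{C}$ are defined as
\[
C_{ij} =
\begin{cases}
1 & \text{if } j \in N_k(i) \\
0 & \text{otherwise}
\end{cases}
\]
The $(i,j)$th entry of the corresponding row-standardized matrix $\mathbf{W}$ is $W_{ij} = C_{ij}/k$, because each row of $\mathbf{C}$ contains exactly k ones.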
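The k-nearest neighbor construction can be sketched in Python as follows. The function name `knn_contiguity` and the centroid coordinates are hypothetical. The sketch uses the Euclidean distance between centroids and breaks ties in distance by the lower unit index, which is an assumption that the text does not specify.

```python
import math


def knn_contiguity(centroids, k):
    """Build a binary k-nearest neighbor contiguity matrix.

    centroids: list of (lon, lat) pairs, one per spatial unit.
    Assumption of this sketch: ties in distance go to the lower unit index.
    """
    n = len(centroids)
    c = [[0] * n for _ in range(n)]
    for i, (lon_i, lat_i) in enumerate(centroids):
        # Euclidean distance from unit i to every other unit j.
        dist = [(math.hypot(lon_i - lon_j, lat_i - lat_j), j)
                for j, (lon_j, lat_j) in enumerate(centroids) if j != i]
        dist.sort()  # ascending by distance, then by index
        for _, j in dist[:k]:
            c[i][j] = 1
    return c


# Hypothetical centroids for four units that lie on a line.
centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
k = 2
c = knn_contiguity(centroids, k)

# Every row has exactly k ones, so row standardization is division by k.
w = [[c_ij / k for c_ij in row] for row in c]
for row in c:
    print(row)
```

In this example the outlying unit 3 selects unit 1 as one of its two nearest neighbors, but unit 1 does not select unit 3, so the resulting matrix is not symmetric.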
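Unlike the k-order binary contiguity matrix, which is often symmetric by construction, k-nearest neighbor matrices can be asymmetric. To obtain a symmetric k-nearest neighbor matrix, you can define the $(i,j)$th entries of the contiguity matrix $\mathbf{C}$ as follows:
\[
C_{ij} =
\begin{cases}
1 & \text{if unit } j \text{ is among the k-nearest neighbors of unit } i \text{, or unit } i \text{ is among the k-nearest neighbors of unit } j \\
0 & \text{otherwise}
\end{cases}
\]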
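One way to symmetrize a k-nearest neighbor contiguity matrix is sketched below in Python. The helper name `symmetrize` is hypothetical, and the sketch assumes the rule that two units are neighbors when either one selects the other as a k-nearest neighbor. Other rules, such as requiring both units to select each other, are possible.

```python
def symmetrize(c):
    """Flag units i and j as neighbors when either one selects the other.

    Assumption of this sketch: the "either one" rule, which is the
    elementwise maximum of the matrix and its transpose.
    """
    n = len(c)
    return [[max(c[i][j], c[j][i]) for j in range(n)] for i in range(n)]


# Hypothetical asymmetric 2-nearest neighbor matrix for four units:
# unit 3 selects units 1 and 2, but neither of them selects unit 3.
c = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
c_sym = symmetrize(c)
for row in c_sym:
    print(row)
```

After symmetrization a row can contain more than k ones (here units 1 and 2 each have three neighbors), so row standardization divides by the actual row sum rather than by k.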
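In addition to the Euclidean distance measure, you can use other distance metrics as appropriate. A variant of k-nearest neighbor matrices that is used in some empirical studies defines its $(i,j)$th entries in terms of the distance $d_{ij}$ between units i and j as
\[
C_{ij} =
\begin{cases}
1 & \text{if } i \ne j \text{ and } d_{ij} \le d_0 \\
0 & \text{otherwise}
\end{cases}
\]
where $d_0$ is a prespecified threshold distance.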
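A distance-threshold rule of this kind can be sketched in Python as follows. The function name `threshold_contiguity`, the centroid coordinates, and the threshold value are hypothetical. The sketch assumes that two distinct units are neighbors when the Euclidean distance between their centroids does not exceed the threshold.

```python
import math


def threshold_contiguity(centroids, threshold):
    """Flag distinct units whose centroid distance is within the threshold.

    Assumption of this sketch: Euclidean distance, with the boundary case
    (distance equal to the threshold) counted as a neighbor.
    """
    n = len(centroids)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(centroids[i], centroids[j]) <= threshold:
                c[i][j] = 1
    return c


# Hypothetical centroids for four units that lie on a line.
centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
c = threshold_contiguity(centroids, threshold=2.0)
for row in c:
    print(row)
```

A threshold rule is symmetric by construction, but it can leave a unit with no neighbors at all (unit 3 in this example), in which case that row sums to zero and cannot be row-standardized by division.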
The two constructions presented above are not the only ways to create a spatial weights matrix; see Elhorst (2013) and the references therein for more information about other approaches. In practice, you can define a neighbor relation that is problem-specific. For example, you can define two spatial units that are far apart to be neighbors because they share some attribute (such as a population larger than 500,000).
The data set that you specify in the WMAT= option is row-standardized by default to create a spatial weights matrix. This means that if you specify WMAT=C, PROC SPATIALREG row-standardizes the spatial contiguity matrix to create a spatial weights matrix. If you want to suppress row standardization, you must specify the NONORMALIZE option.