This function implements the eigenvector-based semiparametric spatial filtering approach in a linear regression framework using ordinary least squares (OLS). Eigenvectors are selected either by an unsupervised stepwise regression procedure or by a penalized regression approach. The stepwise procedure supports selection criteria based on the minimization of residual autocorrelation, maximization of model fit, significance of residual autocorrelation, and the statistical significance of eigenvectors. Alternatively, all eigenvectors in the candidate set can be included without any selection.

lmFilter(
  y,
  x = NULL,
  W,
  objfn = "MI",
  MX = NULL,
  sig = 0.05,
  bonferroni = TRUE,
  positive = TRUE,
  ideal.setsize = FALSE,
  conditional.se = FALSE,
  alpha = 0.25,
  tol = 0.1,
  boot.MI = NULL,
  na.rm = TRUE
)

# S3 method for class 'spfilter'
summary(object, EV = FALSE, ...)

Arguments

y

response variable

x

vector/ matrix of regressors (default = NULL)

W

spatial connectivity matrix

objfn

the objective function to be used for eigenvector selection. For stepwise selection, possible criteria are: the maximization of the adjusted R-squared ('R2'), minimization of different information criteria ('AIC', 'AICc', or 'BIC'), minimization of residual autocorrelation ('MI'), significance level of candidate eigenvectors ('p'), or significance of residual spatial autocorrelation ('pMI'). The lasso-based selection ('MI-lasso') implements the procedure suggested by Barde et al. (2025). Alternatively, all eigenvectors in the candidate set ('all') can be included, which does not require any selection procedure.

MX

covariates used to construct the projection matrix (default = NULL) - see Details

sig

significance level to be used for eigenvector selection if objfn = 'p' or objfn = 'pMI'

bonferroni

Bonferroni adjustment for the significance level (TRUE/ FALSE) if objfn = 'p'. Set to FALSE if objfn = 'pMI' - see Details

positive

restrict search to eigenvectors associated with positive levels of spatial autocorrelation (TRUE/ FALSE)

ideal.setsize

if positive = TRUE, uses the formula proposed by Chun et al. (2016) to determine the ideal size of the candidate set (TRUE/ FALSE)

conditional.se

report standard errors of the regression coefficients associated with the covariates conditional on the selected eigenvectors using a partial regression framework (TRUE/ FALSE). Recommended if objfn = 'MI-lasso' (Barde et al. (2025)) - see Details

alpha

a value in (0,1] indicating the range of candidate eigenvectors according to their associated level of spatial autocorrelation, see e.g., Griffith (2003)

tol

if objfn = 'MI', determines the amount of remaining residual autocorrelation at which the eigenvector selection terminates

boot.MI

number of iterations used to estimate the variance of Moran's I. If boot.MI = NULL (default), analytical results will be used

na.rm

listwise deletion of observations with missing values (TRUE/ FALSE)

object

an object of class spfilter

EV

display summary statistics for selected eigenvectors (TRUE/ FALSE)

...

additional arguments

Value

An object of class spfilter containing the following information:

estimates

summary statistics of the parameter estimates

varcovar

estimated variance-covariance matrix

EV

a matrix containing the summary statistics of selected eigenvectors

selvecs

vector/ matrix of selected eigenvectors

evMI

Moran coefficient of eigenvectors

moran

residual autocorrelation in the initial and the filtered model

fit

adjusted R-squared of the initial and the filtered model

ICs

information criteria (AIC, AICc, and BIC) of the initial and the filtered model

residuals

initial and filtered model residuals

other

a list providing supplementary information:

ncandidates

number of candidate eigenvectors considered

nev

number of selected eigenvectors

sel_id

ID of selected eigenvectors

sf

vector representing the spatial filter

sfMI

Moran coefficient of the spatial filter

model

type of the fitted regression model

dependence

filtered for positive or negative spatial dependence

objfn

selection criterion specified in the objective function of the stepwise regression procedure

bonferroni

TRUE/ FALSE: Bonferroni-adjusted significance level (if objfn = 'p')

conditional.se

TRUE/ FALSE: conditional standard errors used

siglevel

if objfn = 'p' or objfn = 'pMI': actual (unadjusted/ adjusted) significance level

Details

If W is not symmetric, it gets symmetrized by 1/2 * (W + W') before the decomposition.
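
The symmetrization step can be illustrated in base R. This is a minimal sketch with a hypothetical 4 x 4 connectivity matrix; it mirrors the formula above and is not taken from the function's source code.

```r
# Hypothetical asymmetric connectivity matrix (row-standardized path graph)
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)

isSymmetric(W)        # FALSE: W[1, 2] = 1 but W[2, 1] = 0.5

# Symmetrize by averaging W with its transpose: 1/2 * (W + W')
W_sym <- 0.5 * (W + t(W))

isSymmetric(W_sym)    # TRUE
```

A symmetric matrix guarantees real eigenvalues and mutually orthogonal eigenvectors, which is what the decomposition relies on.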

If covariates are supplied to MX, the function uses these regressors to construct the following projection matrix:

M = I - X (X'X)^-1 X'

Eigenvectors from MWM using this specification of M are not only mutually uncorrelated but also orthogonal to the regressors specified in MX. Alternatively, if MX = NULL, the projection matrix becomes M = I - 11'/n, where 1 is a vector of ones and n represents the number of observations. Tiefelsdorf and Griffith (2007) show how the choice of the appropriate M depends on the underlying spatial process. For inference on regression coefficients when the data-generating process (DGP) is unknown, Moran eigenvectors should be derived independently of the regressors (MX = NULL). Projecting W onto the space orthogonal to X may alter the estimand of the regression coefficients and is therefore not recommended for general inference.
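
The two projection matrices can be compared in a small base R sketch. The connectivity matrix and regressors below are hypothetical, and the code only illustrates the algebra described above; it is not the function's internal implementation.

```r
set.seed(1)
n <- 6

# Hypothetical symmetric connectivity matrix: a path graph on n units
W <- matrix(0, n, n)
W[cbind(1:(n - 1), 2:n)] <- 1
W <- W + t(W)

# Hypothetical regressors (intercept plus one covariate)
X <- cbind(1, rnorm(n))

I_n <- diag(n)
M_center <- I_n - matrix(1, n, n) / n                # MX = NULL: M = I - 11'/n
M_X <- I_n - X %*% solve(crossprod(X)) %*% t(X)      # MX supplied: M = I - X (X'X)^-1 X'

# Eigendecomposition of MWM under both specifications
eig_center <- eigen(M_center %*% W %*% M_center, symmetric = TRUE)
eig_X <- eigen(M_X %*% W %*% M_X, symmetric = TRUE)

# Eigenvectors with nonzero eigenvalues are orthogonal to the columns of X
keep <- abs(eig_X$values) > 1e-8
max(abs(crossprod(X, eig_X$vectors[, keep])))        # numerically zero
```

Under M = I - 11'/n, the eigenvectors with nonzero eigenvalues are only guaranteed to be orthogonal to the constant, so they may still correlate with the covariates.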

The Bonferroni correction is only possible if eigenvector selection is based on the significance level of the eigenvectors (objfn = 'p'). It is set to FALSE if eigenvectors are added to the model until the residuals exhibit no significant level of spatial autocorrelation (objfn = 'pMI').
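
As a rough illustration of the adjustment, the sketch below assumes the standard Bonferroni rule of dividing sig by the number of candidate eigenvectors; the exact divisor used internally is an assumption here. The p-values are copied from the Examples output purely for illustration (that model was fitted with objfn = 'MI', not objfn = 'p').

```r
sig <- 0.05
ncandidates <- 31                 # number of candidates in the Examples output

# Assumed adjustment: standard Bonferroni, one test per candidate eigenvector
sig_adj <- sig / ncandidates

# Illustrative p-values taken from the Examples output
pvals <- c(ev_13 = 1.439657e-09, ev_10 = 2.400091e-04,
           ev_2 = 1.339581e-03, ev_4 = 3.413928e-02)

# Which eigenvectors would pass the unadjusted and the adjusted threshold?
data.frame(p = pvals,
           unadjusted = pvals < sig,
           adjusted = pvals < sig_adj)
```

The adjusted threshold is stricter, so fewer eigenvectors pass it than under the unadjusted level.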

For inference on regression coefficients, Barde et al. (2025) compute standard errors using a partial regression framework (see also Chernozhukov et al. (2015)). Both the outcome and the covariates are residualized with respect to the selected eigenvectors, and the variance-covariance matrix is calculated from these residualized variables. This approach treats the selected eigenvectors as fixed regressors and provides valid post-selection inference when eigenvector selection is stable. When objfn = 'MI-lasso', conditional standard errors are therefore generally recommended for inference on regression coefficients.
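
The partial regression idea can be sketched in base R. The data and the stand-in "eigenvectors" below are simulated and hypothetical. The degrees-of-freedom correction is whatever lm() applies to the residualized regression (n - 1 here), which is an artifact of this sketch and not necessarily the correction used by lmFilter.

```r
set.seed(42)
n <- 50
x <- rnorm(n)

# Hypothetical stand-in for the selected eigenvectors: 3 orthonormal columns
E <- qr.Q(qr(matrix(rnorm(n * 3), n, 3)))
y <- 1 + 2 * x + drop(E %*% c(3, -2, 1)) + rnorm(n)

# Residualize a variable with respect to the intercept and the eigenvectors
Z <- cbind(1, E)
resid_on_Z <- function(v) drop(v - Z %*% solve(crossprod(Z), crossprod(Z, v)))
y_t <- resid_on_Z(y)
x_t <- resid_on_Z(x)

# Regression on the residualized variables (no intercept needed)
fit_partial <- lm(y_t ~ x_t - 1)
fit_full <- lm(y ~ x + E)

# Same point estimate as the full regression (Frisch-Waugh-Lovell theorem)
c(partial = unname(coef(fit_partial)), full = unname(coef(fit_full)["x"]))

# Standard errors differ only through the degrees-of-freedom correction
se_partial <- sqrt(diag(vcov(fit_partial)))[["x_t"]]
se_full <- sqrt(diag(vcov(fit_full)))[["x"]]
c(partial = se_partial, full = se_full)
```

Because the eigenvectors are treated as fixed and partialled out, they do not consume degrees of freedom in the residualized regression, which is why the two standard errors differ slightly in this sketch.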

References

Tiefelsdorf, Michael and Daniel A. Griffith (2007): Semiparametric filtering of spatial autocorrelation: the eigenvector approach. Environment and Planning A: Economy and Space, 39 (5): pp. 1193 - 1221.

Griffith, Daniel A. (2003): Spatial Autocorrelation and Spatial Filtering: Gaining Understanding Through Theory and Scientific Visualization. Berlin/ Heidelberg, Springer.

Chun, Yongwan, Daniel A. Griffith, Monghyeon Lee and Parmanand Sinha (2016): Eigenvector selection with stepwise regression techniques to construct eigenvector spatial filters. Journal of Geographical Systems, 18: pp. 67 - 85.

Le Gallo, Julie and Antonio Páez (2013): Using synthetic variables in instrumental variable estimation of spatial series models. Environment and Planning A: Economy and Space, 45 (9): pp. 2227 - 2242.

Tiefelsdorf, Michael and Barry Boots (1995): The Exact Distribution of Moran's I. Environment and Planning A: Economy and Space, 27 (6): pp. 985 - 999.

Barde, Sylvain, Rowan Cherodian and Guy Tchuente (2025): Moran’s I lasso for models with spatially correlated data. The Econometrics Journal, 28 (3): pp. 423 - 441.

Chernozhukov, Victor, Christian Hansen and Martin Spindler (2015): Post-selection and post-regularization inference in linear models with many controls and instruments. American Economic Review, 105 (5): pp. 486 - 490.

Examples

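library(spfilteR)

# load the example data: data frame 'fakedataset' and connectivity matrix 'W'
data(fakedata)
y <- fakedataset$x1
X <- cbind(fakedataset$x2, fakedataset$x3)

# stepwise selection minimizing residual autocorrelation (objfn = 'MI')
res <- lmFilter(y = y, x = X, W = W, objfn = 'MI', positive = FALSE)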
print(res)
#> 9 out of 31 candidate eigenvectors selected
summary(res, EV = TRUE)
#> 
#> 	- Spatial Filtering with Eigenvectors (Linear Model)  -
#> 
#> Coefficients (OLS):
#>               Estimate         SE      p-value    
#> (Intercept) 9.07249572 0.66591427 2.154944e-23 ***
#> beta_1      1.01189295 0.07963279 1.285157e-21 ***
#> beta_2      0.01280317 0.05316175 8.102438e-01    
#> 
#> Adjusted R-squared:
#>   Initial  Filtered 
#> 0.4620807 0.7360684 
#> 
#> Information Criteria:
#>               AIC     AICc      BIC
#> Initial  427.4333 427.6833 435.2488
#> Filtered 364.4941 368.0803 395.7561
#> 
#> Filtered for positive spatial autocorrelation
#> 9 out of 31 candidate eigenvectors selected
#> Objective Function: "MI"
#> 
#> Summary of selected eigenvectors:
#>        Estimate       SE      p-value  partialR2      VIF        MI    
#> ev_13 -9.625366 1.423628 1.439657e-09 0.23201418 1.010147 0.6302019 ***
#> ev_10 -5.566222 1.453475 2.400091e-04 0.08005190 1.052256 0.7303271 ***
#> ev_2   4.753911 1.434804 1.339581e-03 0.06167582 1.025083 1.0004147  **
#> ev_4  -3.143717 1.460866 3.413928e-02 0.02963330 1.059905 0.9257835   *
#> ev_9   3.513101 1.442592 1.689824e-02 0.03101221 1.035583 0.7638378   *
#> ev_5  -2.943102 1.439806 4.393322e-02 0.02023369 1.031162 0.8968632   *
#> ev_21  4.074010 1.423203 5.251176e-03 0.04070449 1.009606 0.4539879  **
#> ev_19  3.780013 1.418009 9.140094e-03 0.03715301 1.003177 0.4615722  **
#> ev_26 -3.545690 1.418580 1.429342e-02 0.03243658 1.004043 0.3113456   *
#> 
#> Moran's I ( Residuals):
#>             Observed    Expected   Variance          z     p-value    
#> Initial   0.35062455 -0.01222386 0.01219947 3.28514585 0.000509648 ***
#> Filtered -0.05674773 -0.08163256 0.07622586 0.09013296 0.464090779    

E <- res$selvecs
(ols <- coef(lm(y ~ X + E)))
#> (Intercept)          X1          X2    Eevec_13    Eevec_10     Eevec_2 
#>  9.07249572  1.01189295  0.01280317 -9.62536590 -5.56622177  4.75391095 
#>     Eevec_4     Eevec_9     Eevec_5    Eevec_21    Eevec_19    Eevec_26 
#> -3.14371729  3.51310119 -2.94310163  4.07400963  3.78001325 -3.54569037 
coef(res)
#> (Intercept)      beta_1      beta_2 
#>  9.07249572  1.01189295  0.01280317