This function implements the eigenvector-based semiparametric spatial filtering approach in a generalized linear regression framework using maximum likelihood estimation (MLE). Eigenvectors are selected by an unsupervised stepwise regression technique. Supported selection criteria are the minimization of residual autocorrelation, maximization of model fit, significance of residual autocorrelation, and the statistical significance of eigenvectors. Alternatively, all eigenvectors in the candidate set can be included.

glmFilter(
  y,
  x = NULL,
  W,
  objfn = "AIC",
  MX = NULL,
  model,
  optim.method = "BFGS",
  sig = 0.05,
  bonferroni = TRUE,
  positive = TRUE,
  ideal.setsize = FALSE,
  min.reduction = 0.05,
  boot.MI = 100,
  resid.type = "pearson",
  alpha = 0.25,
  tol = 0.1,
  na.rm = TRUE
)

Arguments

y

response variable

x

vector/ matrix of regressors (default = NULL)

W

spatial connectivity matrix

objfn

the objective function to be used for eigenvector selection. Possible criteria are: the maximization of model fit ('AIC', 'AICc', or 'BIC'), minimization of residual autocorrelation ('MI'), significance level of candidate eigenvectors ('p'), significance of residual spatial autocorrelation ('pMI'), or all eigenvectors in the candidate set ('all')

MX

covariates used to construct the projection matrix (default = NULL) - see Details

model

a character string indicating the type of model to be estimated. Currently, 'probit', 'logit', 'poisson', and 'nb' (for negative binomial model) are valid inputs

optim.method

a character specifying the optimization method used by the optim function

sig

significance level to be used for eigenvector selection if objfn = 'p' or objfn = 'pMI'

bonferroni

Bonferroni adjustment for the significance level (TRUE/ FALSE) if objfn = 'p'. Set to FALSE if objfn = 'pMI' - see Details

positive

restrict search to eigenvectors associated with positive levels of spatial autocorrelation (TRUE/ FALSE)

ideal.setsize

if positive = TRUE, uses the formula proposed by Chun et al. (2016) to determine the ideal size of the candidate set (TRUE/ FALSE)

min.reduction

only used if objfn is 'AIC', 'AICc', or 'BIC': a value in the interval [0,1) that determines the minimum reduction in the selected information criterion (relative to its current value) that a candidate eigenvector needs to achieve in order to be selected

boot.MI

number of iterations used to estimate the variance of Moran's I (default is 100). Alternatively, if boot.MI = NULL, analytical results will be used

resid.type

character string specifying the residual type to be used. Options are 'raw', 'pearson', and 'deviance'

alpha

a value in (0,1] indicating the range of candidate eigenvectors according to their associated level of spatial autocorrelation, see e.g., Griffith (2003)

tol

if objfn = 'MI', determines the amount of remaining residual autocorrelation at which the eigenvector selection terminates

na.rm

remove observations with missing values (TRUE/ FALSE)

Value

An object of class spfilter containing the following information:

estimates

summary statistics of the parameter estimates

varcovar

estimated variance-covariance matrix

EV

a matrix containing the summary statistics of selected eigenvectors

selvecs

vector/ matrix of selected eigenvectors

evMI

Moran coefficient of eigenvectors

moran

residual autocorrelation in the initial and the filtered model

fit

summary of model fit (log-likelihood and information criteria) of the initial and the filtered model

residuals

initial and filtered model residuals

other

a list providing supplementary information:

ncandidates

number of candidate eigenvectors considered

nev

number of selected eigenvectors

condnum

condition number to assess the degree of multicollinearity among the eigenvectors induced by the link function, see e.g., Griffith/ Amrhein (1997)

sel_id

ID of selected eigenvectors

sf

vector representing the spatial filter

sfMI

Moran coefficient of the spatial filter

model

type of the regression model

dependence

filtered for positive or negative spatial dependence

objfn

selection criterion specified in the objective function of the stepwise regression procedure

bonferroni

TRUE/ FALSE: Bonferroni-adjusted significance level (if objfn='p')

siglevel

if objfn = 'p' or objfn = 'pMI': actual (unadjusted/ adjusted) significance level

resid.type

residual type ('raw', 'deviance', or 'pearson')

pseudoR2

McFadden's (adjusted) pseudo R-squared (filtered vs. unfiltered model) based on the models' likelihood functions
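
As an illustration of the pseudoR2 element, the textbook form of McFadden's pseudo R-squared can be sketched as follows. The helper name `mcfadden` and its `k` argument are hypothetical, and whether the function uses exactly this form (in particular for the adjusted variant) is an assumption. The log-likelihood values are taken from the logit example in the Examples section.

```r
# Hypothetical helper: textbook McFadden pseudo R-squared.
# logL_model: log-likelihood of the filtered model
# logL_ref:   log-likelihood of the unfiltered (initial) model
# k:          optional penalty (number of added parameters) for the adjusted variant
mcfadden <- function(logL_model, logL_ref, k = 0) {
  1 - (logL_model - k) / logL_ref
}

# Log-likelihoods reported for the logit example (filtered vs. initial model)
r2 <- mcfadden(-54.11907, -66.40641)
# Adjusted variant, assuming the penalty equals the 6 selected eigenvectors
r2_adj <- mcfadden(-54.11907, -66.40641, k = 6)
print(c(unadjusted = r2, adjusted = r2_adj))
```

The adjusted value is smaller than the unadjusted one because every added eigenvector is penalized.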

Details

If W is not symmetric, it is symmetrized as 1/2 * (W + W') before the eigendecomposition.
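
The symmetrization can be reproduced in base R. This is a minimal sketch with a small made-up connectivity matrix; it is not the function's internal code.

```r
# Made-up asymmetric connectivity matrix (directed 3-cycle)
W <- matrix(c(0, 1, 0,
              0, 0, 1,
              1, 0, 0), nrow = 3, byrow = TRUE)

# Symmetrize: 1/2 * (W + W')
W_sym <- 0.5 * (W + t(W))

print(isSymmetric(W))      # FALSE for this input
print(isSymmetric(W_sym))  # TRUE after symmetrization
```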

If covariates are supplied to MX, the function uses these regressors to construct the following projection matrix:

M = I - X (X'X)^-1X'

Eigenvectors from MWM using this specification of M are not only mutually uncorrelated but also orthogonal to the regressors specified in MX. Alternatively, if MX = NULL, the projection matrix becomes M = I - 11'/n, where 1 is a vector of ones and n represents the number of observations. Tiefelsdorf and Griffith (2007) show how the choice of the appropriate M depends on the underlying process that generates the spatial dependence.
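
The two projection matrices and the orthogonality property can be checked numerically. The sketch below uses base R only, with a made-up chain-shaped W and one simulated covariate; the object names are illustrative and do not correspond to the function's internals.

```r
set.seed(1)
n <- 6

# Made-up symmetric connectivity matrix: neighbors along a chain
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}

# Covariates for the projection: intercept plus one simulated regressor
X <- cbind(1, rnorm(n))
M_X <- diag(n) - X %*% solve(crossprod(X)) %*% t(X)   # M = I - X (X'X)^-1 X'
M_1 <- diag(n) - matrix(1, n, n) / n                  # M = I - 11'/n

# With only a column of ones as X, the general formula reduces to I - 11'/n
ones <- matrix(1, n, 1)
same <- all.equal(diag(n) - ones %*% solve(crossprod(ones)) %*% t(ones), M_1)

# Eigenvectors of MWM; keep those with non-zero eigenvalues
dec <- eigen(M_X %*% W %*% M_X, symmetric = TRUE)
keep <- abs(dec$values) > 1e-8
E <- dec$vectors[, keep, drop = FALSE]

# Largest deviation from mutual orthogonality and from orthogonality to X
dev_mutual <- max(abs(crossprod(E) - diag(ncol(E))))
dev_X <- max(abs(crossprod(X, E)))
print(c(mutual = dev_mutual, to_X = dev_X))
```

Both deviations should be zero up to floating-point error. Eigenvectors with zero eigenvalues are excluded from the check because they can lie in the column space of X.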

The Bonferroni correction is only possible if eigenvector selection is based on the significance level of the eigenvectors (objfn = 'p'). It is set to FALSE if eigenvectors are added to the model until the residuals exhibit no significant level of spatial autocorrelation (objfn = 'pMI').
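
The adjusted significance level reported in the probit example (0.00161 with 31 candidate eigenvectors) is consistent with dividing sig by the number of candidates. A minimal sketch, assuming this is the adjustment applied:

```r
sig <- 0.05          # nominal significance level
ncandidates <- 31    # number of candidate eigenvectors in the example

# Bonferroni adjustment: divide the nominal level by the number of tests
sig_adj <- sig / ncandidates
print(round(sig_adj, 5))
```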

For the negative binomial model, deviance residuals are currently not computed. The function sets resid.type = 'pearson' and prints a message to the console.

Note

If the condition number (condnum) suggests high levels of multicollinearity, eigenvectors can be sequentially removed from selvecs and the model can be re-estimated using the glm function in order to identify and manually remove the problematic eigenvectors. Moreover, if other models that are currently not implemented here need to be estimated (e.g., quasi-binomial models), users can extract eigenvectors using the function getEVs and perform a supervised eigenvector search using the glm function.
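
The manual re-estimation described above might look like the following sketch. It is self-contained: the matrix `selvecs` is a simulated stand-in for the selvecs element of a fitted object, and the response is simulated as well, so the numbers carry no substantive meaning.

```r
set.seed(42)
n <- 50

# Simulated stand-in for `selvecs`: three orthonormal columns
selvecs <- qr.Q(qr(matrix(rnorm(n * 3), n, 3)))
colnames(selvecs) <- paste0("ev_", 1:3)

# Simulated count response that depends on the first eigenvector only
y <- rpois(n, lambda = exp(1 + 2 * selvecs[, 1]))

# Model with all selected eigenvectors
full <- glm(y ~ selvecs, family = poisson(link = "log"))

# Remove one eigenvector at a time and re-estimate
aic_drop <- sapply(seq_len(ncol(selvecs)), function(j) {
  AIC(glm(y ~ selvecs[, -j, drop = FALSE], family = poisson(link = "log")))
})
names(aic_drop) <- colnames(selvecs)

print(c(full = AIC(full), aic_drop))
```

Comparing the refitted models (here by AIC) indicates which eigenvectors can be dropped at little cost in fit; the same loop can be used to track any other diagnostic of interest.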

In contrast to eigenvector-based spatial filtering in linear regression models, Chun (2014) notes that only a limited number of studies address the problem of measuring spatial autocorrelation in generalized linear model residuals. Consequently, eigenvector selection may be based on an objective function that maximizes model fit rather than one that minimizes residual spatial autocorrelation. One example is the corrected Akaike information criterion ('AICc'), which adds a small-sample penalty to counteract the tendency to choose overparameterized models.
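
The small-sample penalty can be written as AICc = AIC + 2k(k + 1)/(n - k - 1), where k is the number of estimated parameters and n the number of observations. The sketch below applies this standard formula to the AIC values of the logit example in the Examples section; the helper name `aicc` is hypothetical, n = 100 is inferred from the expected value of Moran's I in the output (-1/(n - 1) = -0.0101), and k counts the intercept plus the selected eigenvectors.

```r
# Hypothetical helper: corrected AIC from AIC, parameter count k, sample size n
aicc <- function(aic, k, n) {
  aic + (2 * k * (k + 1)) / (n - k - 1)
}

n <- 100
aicc_initial  <- aicc(134.8128, k = 1, n = n)  # intercept only
aicc_filtered <- aicc(122.2381, k = 7, n = n)  # intercept + 6 eigenvectors
print(round(c(initial = aicc_initial, filtered = aicc_filtered), 4))
```

The results agree with the AICc column of the logit example (134.8536 and 123.4555), and the penalty grows with the number of selected eigenvectors.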

References

Chun, Yongwan (2014): Analyzing Space-Time Crime Incidents Using Eigenvector Spatial Filtering: An Application to Vehicle Burglary. Geographical Analysis 46 (2): pp. 165 - 184.

Chun, Yongwan, Daniel A. Griffith, Monghyeon Lee, and Parmanand Sinha (2016): Eigenvector selection with stepwise regression techniques to construct eigenvector spatial filters. Journal of Geographical Systems, 18 (1): pp. 67 - 85.

Tiefelsdorf, Michael and Daniel A. Griffith (2007): Semiparametric filtering of spatial autocorrelation: the eigenvector approach. Environment and Planning A: Economy and Space, 39 (5): pp. 1193 - 1221.

Griffith, Daniel A. (2003): Spatial Autocorrelation and Spatial Filtering: Gaining Understanding Through Theory and Scientific Visualization. Berlin/ Heidelberg, Springer.

Griffith, Daniel A. and Carl G. Amrhein (1997): Multivariate Statistical Analysis for Geographers. Englewood Cliffs, Prentice Hall.

Examples

data(fakedata)

# poisson model
y_pois <- fakedataset$count
poisson <- glmFilter(y = y_pois, x = NULL, W = W, objfn = "MI", positive = FALSE,
model = "poisson", boot.MI = 100)
#> Warning: Note: The default value of `resid.type` will change from 'pearson' to 'deviance' in a future release
print(poisson)
#> 2 out of 31 candidate eigenvectors selected
summary(poisson, EV = FALSE)
#> 
#> 	- Spatial Filtering with Eigenvectors (Poisson Model)  -
#> 
#> Coefficients (MLE):
#>             Estimate         SE      p-value    
#> (Intercept) 2.467731 0.03000572 1.602753e-91 ***
#> 
#> Model Fit:
#>          logL      AIC      AICc     BIC     
#> Initial  -1688.207 3378.415 3378.456 3381.02 
#> Filtered -1608.803 3223.606 3223.856 3231.421
#> 
#> Filtered for positive spatial autocorrelation
#> 2 out of 31 candidate eigenvectors selected
#> Condition Number (Multicollinearity): 1
#> Objective Function: "MI"
#> 
#> Moran's I (Pearson Residuals):
#>            Observed    Expected    Variance        z    p-value  
#> Initial  0.11448695 -0.01010101 0.003502022 2.105313 0.05940594 .
#> Filtered 0.09342985 -0.02103275 0.007189273 1.349961 0.17821782  

# probit model - summarize EVs
y_prob <- fakedataset$indicator
probit <- glmFilter(y = y_prob, x = NULL, W = W, objfn = "p", positive = FALSE,
model = "probit", boot.MI = 100)
#> Warning: Note: The default value of `resid.type` will change from 'pearson' to 'deviance' in a future release
print(probit)
#> 0 out of 31 candidate eigenvectors selected
summary(probit, EV = TRUE)
#> 
#> 	- Spatial Filtering with Eigenvectors (Probit Model)  -
#> 
#> Coefficients (MLE):
#>              Estimate        SE    p-value  
#> (Intercept) 0.3054679 0.1274797 0.01844387 *
#> 
#> Model Fit:
#>          logL      AIC      AICc     BIC    
#> Initial  -66.40641 134.8128 134.8536 137.418
#> Filtered -66.40641 134.8128 134.8536 137.418
#> 
#> Filtered for positive spatial autocorrelation
#> 0 out of 31 candidate eigenvectors selected
#> Objective Function: "p" (significance level = 0.05)
#> Bonferroni correction: TRUE (adjusted significance level = 0.00161)
#> 
#> No eigenvectors selected
#> 
#> Moran's I (Pearson Residuals):
#>              Observed    Expected    Variance         z   p-value  
#> Initial  -0.001603494 -0.01010101 0.005087936 0.1191300 0.5346535  
#> Filtered -0.001603494 -0.01010101 0.004862694 0.1218579 0.5247525  

# logit model - AIC objective function
y_logit <- fakedataset$indicator
logit <- glmFilter(y = y_logit, x = NULL, W = W, objfn = "AIC", positive = FALSE,
model = "logit", min.reduction = .01)
#> Warning: Note: The default value of `resid.type` will change from 'pearson' to 'deviance' in a future release
print(logit)
#> 6 out of 31 candidate eigenvectors selected
summary(logit, EV = FALSE)
#> 
#> 	- Spatial Filtering with Eigenvectors (Logit Model)  -
#> 
#> Coefficients (MLE):
#>              Estimate        SE    p-value  
#> (Intercept) 0.5992487 0.2389685 0.01388762 *
#> 
#> Model Fit:
#>          logL      AIC      AICc     BIC     
#> Initial  -66.40641 134.8128 134.8536 137.418 
#> Filtered -54.11907 122.2381 123.4555 140.4743
#> 
#> Filtered for positive spatial autocorrelation
#> 6 out of 31 candidate eigenvectors selected
#> Condition Number (Multicollinearity): 1
#> Objective Function: "AIC"
#> 
#> Moran's I (Pearson Residuals):
#>              Observed    Expected    Variance          z   p-value  
#> Initial  -0.001603471 -0.01010101 0.005997056  0.1097297 0.4356436  
#> Filtered -0.186524838 -0.04954621 0.005985983 -1.7704557 1.0000000