Spatial panel regression in R: non conformable spatial weights?
I am trying to run a spatial panel regression in R with the splm package. So I have polygons with summarized data over time and I want to see how the dependent variable is affected by the other variables that also change over time.
I have 546 regions with a number of variables, but to test how it works I took a subset of my data for 3 polygons, including the shapefile for calculating the weights, and the data.
https://drive.google.com/file/d/0B4SK0f2zZUKxZ0dDU2lnclB2M3c/view?usp=sharing
#load data
file <- "sector_panel_data_test.csv"
sector_data <- read.table(file, sep=",", header=TRUE, quote="")
sector_data[is.na(sector_data)] <- 0
names(sector_data)
#load shape
library(rgdal)  # readOGR
library(spdep)  # poly2nb, dnearneigh, nb2mat, mat2listw
library(splm)   # spml
sectors <- readOGR(dsn=".", layer="sectors_test_sample_year1")
nb <- poly2nb(sectors)
#distance based neighbors
coords <- coordinates(sectors)
nb.d125<- dnearneigh(coords,0,125000,row.names=sectors$Code)
#create weights matrix
mat.d125 <-nb2mat(nb.d125,glist=NULL,style="W",zero.policy=TRUE)
#and then a weights list object
listd125 = mat2listw(mat.d125, style="W")
#design model and run, just picked one variable here
fm <- prop_fdeg ~ mean_pop
randommodel <- spml(fm, data=sector_data, index=NULL, listw=listd125, model="random", lag=FALSE)
I get the following error:
Error in spreml(formula = formula, data = data, index = index, w = listw2mat(listw), : Non conformable spatial weights
Does anyone know what this means? I have searched everywhere, and only found people with the same problem looking for a solution.
Solution 1:[1]
I've also just received the same error. After going step-by-step through the source code with my data (see below), it appears that missingness in the panel data leads to listwise deletion of some rows. Those deleted rows will, in turn, result in the panel data and the listw object having different numbers of observations. To fix the issue, you'll need to either (1) impute the missing data, (2) delete the dropped rows from your listw object or (3) drop variables from your model that have missingness. In my case, imputing all missing data seemed to stop the error. You'll also need to be attentive to keeping the panel data balanced, as splm will break with most unbalanced panel data, too (the make.pbalanced command in plm does not seem to be much help with addressing imbalance in the data because it will add rows with NAs that splm will reject).
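As a rough illustration of the listwise-deletion problem, here is a minimal base-R sketch on made-up data. The data frame `panel` and its columns `region`, `year`, `y` and `x` are assumptions for the example, not the asker's data. It shows how a unit that is missing in every period disappears from the sample, so the number of units no longer matches a weights object built for all of them.

```r
# Toy long-format panel: 3 regions x 2 years (all names are illustrative).
panel <- data.frame(
  region = rep(c("A", "B", "C"), times = 2),
  year   = rep(c(2001, 2002), each = 3),
  y      = c(1.0, 2.0, 3.0, 1.5, 2.5, 3.5),
  x      = c(0.1, NA, 0.3, 0.2, NA, 0.4),  # region B never has a usable x
  stringsAsFactors = FALSE
)

# Listwise deletion: drop every row that has an NA in any column.
complete <- na.omit(panel)

# How many rows were lost, and which regions vanished entirely?
rows_dropped <- nrow(panel) - nrow(complete)
lost_units   <- setdiff(unique(panel$region), unique(complete$region))

rows_dropped
lost_units
```

If `lost_units` is non-empty, a weights object built from all regions will have more rows than there are units left in the data.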
A handful of ways to check for missingness, impute missing data and/or see how your data works in the source code:
Compare dim(your_data) and dim(na.omit(your_data)).
Visualize missingness in your panel data with naniar (also, see the new panelView package):
install.packages("naniar")
# visualise missing data
library(ggplot2)
library(naniar)
gg_miss_var(your_data)
Run plm (not splm) on your data and check the dimensions of the data in the output (as compared with the original data):
p_out <- plm(formula = your_formula, data = your_data, model = "within")
summary(p_out)
dim(model.matrix(p_out))
A straightforward way to impute missing data is with the simputation package. For more see https://cran.r-project.org/web/packages/simputation/vignettes/intro.html
The Amelia package offers better options for multiple imputation with time series, cross sectional data: https://gking.harvard.edu/amelia
Run the underlying code for spreml (available at R-Forge) directly. The code below suggests that the error is generated by the last line. Just above that line, we can see how n is defined, which suggests an avenue for debugging: run the underlying code step by step to see where dim(w) differs from n. Here w is the weights matrix that spml passes on, i.e. w <- listw2mat(your_listw_object).
## data management through plm functions
pmod <- plm(formula, data, index=index, model="pooling")
X <- model.matrix(pmod)
y <- pmodel.response(pmod)
#names(index) <- row.names(data)
#ind <- index[which(names(index) %in% row.names(X))]
#tind <- tindex[which(names(index) %in% row.names(X))]
ind <- attr(pmod$model, "index")[, 1]
tind <- attr(pmod$model, "index")[, 2]
oo <- order(tind, ind)
X <- X[oo, , drop=FALSE]
y <- y[oo]
ind <- ind[oo]
tind <- tind[oo]
n <- length(unique(ind))
k <- dim(X)[[2]]
t <- max(tapply(X[, 1], ind, length))
nT <- length(ind)
## check compatibility of weights matrix
if (dim(w)[[1]] != n) stop("Non conformable spatial weights")
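If you would rather not step through spreml itself, the same bookkeeping can be approximated in base R. The helper below, `check_panel_vs_weights`, is a hypothetical function written for this answer, not part of splm or plm. It assumes the data frame holds only the model variables plus the id and time columns, and it uses `na.omit` as a stand-in for whatever rows the model fit would drop.

```r
# Hypothetical helper: mimics the n / t / nT quantities from the excerpt above.
check_panel_vs_weights <- function(data, id, time, w) {
  complete <- na.omit(data)          # stand-in for listwise deletion
  ind   <- complete[[id]]
  tind  <- complete[[time]]
  n     <- length(unique(ind))       # number of cross-sectional units
  t_max <- max(table(ind))           # longest series for any unit
  nT    <- length(ind)               # total number of rows kept
  list(
    n = n, t = t_max, nT = nT,
    periods     = length(unique(tind)),
    conformable = nrow(w) == n,      # same test as the stop() condition
    balanced    = nT == n * t_max    # TRUE only if every unit has t_max rows
  )
}

# Toy data: 3 regions x 2 years, no missing values (names are illustrative).
panel <- data.frame(
  region = rep(c("A", "B", "C"), times = 2),
  year   = rep(c(2001, 2002), each = 3),
  y      = c(1.0, 2.0, 3.0, 1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

w3 <- matrix(0, 3, 3)   # weights matrix for 3 units
w4 <- matrix(0, 4, 4)   # weights matrix with one unit too many

ok  <- check_panel_vs_weights(panel, "region", "year", w3)
bad <- check_panel_vs_weights(panel, "region", "year", w4)

ok$conformable
bad$conformable
```

With real data, `w` would be the matrix you get from listw2mat() on your listw object.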
Solution 2:[2]
This might not be relevant to your problem, but hopefully can help others searching for this error.
Data have to be in a specific format: the first two columns must contain index and time, in this order, and the remaining columns hold the other variables. Switching time and index will cause Non conformable spatial weights because dim(w) != n, where n will be the number of unique elements of time.
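A small base-R sketch of the column-order issue, on made-up data (the names `panel`, `region` and `year` are assumptions for the example). It counts the distinct values in the first column before and after moving the unit id and time columns to the front.

```r
# Toy panel with the time column first, i.e. in the "wrong" order.
panel <- data.frame(
  year   = rep(c(2001, 2002), each = 3),
  region = rep(c("A", "B", "C"), times = 2),
  y      = c(1.0, 2.0, 3.0, 1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# If the first column is taken as the unit id, n is the number of years.
n_wrong <- length(unique(panel[[1]]))

# Move the unit id and time columns to the front, keeping the rest as they are.
front <- c("region", "year")
panel_ordered <- panel[, c(front, setdiff(names(panel), front))]

# Now the first column identifies the units, so n is the number of regions.
n_right <- length(unique(panel_ordered[[1]]))

n_wrong
n_right
```

The `index` argument that appears in the question's spml call may offer another way to name the id and time columns, but I have not verified that here.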
Solution 3:[3]
I had the same problem myself.
It turned out that my spatial weight matrix contained one extra country.
For example, if your dataset contains 33 countries but your matrix has 34, simply remove the extra country from the matrix.
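One possible way to do this in base R, assuming the weights matrix carries unit names as row and column names (the matrix `W` and the vector `ids_in_data` below are made up for illustration). After dropping a unit, the rows are rescaled so that each sums to 1 again, which is what style "W" weights look like.

```r
# Units present in the panel data (illustrative).
ids_in_data <- c("A", "B", "C")

# Toy 4 x 4 binary neighbour matrix that includes one extra unit, "D".
units <- c("A", "B", "C", "D")
W <- matrix(1, nrow = 4, ncol = 4, dimnames = list(units, units))
diag(W) <- 0

# Which units are in the matrix but not in the data?
extra <- setdiff(rownames(W), ids_in_data)

# Keep only the shared units, in the order they appear in the data.
keep  <- intersect(ids_in_data, rownames(W))
W_sub <- W[keep, keep, drop = FALSE]

# Re-standardise rows so each sums to 1 (rows with no neighbours stay at 0).
rs    <- rowSums(W_sub)
W_sub <- W_sub / ifelse(rs > 0, rs, 1)

extra
dim(W_sub)
```

The trimmed matrix could then be turned back into a listw object with mat2listw(), as in the question.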
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | jyr |
Solution 3 | Heman Hemanovich |