'In a point pattern with marks representing area height and width: how to delete points with "nested" areas?
I have point patterns comprising of several 100k marked points representing cell centres of cells of two types (big and small cells). The point coordinates are derived from the bounding boxes of a neural network based object detection on tile images of a tissue section scan. I reconstructed the whole point patterns using R package spatstat by superimposing individual overlapping tile point patterns. Then I deleted redundant points derived from the tiling process. Until mow I did it by empirically measuring a minimum distance between the big, the small cell centers, and the "cross-distance". Then I deleted redundant points with the closepairs
or crosspairs
command. Here is an example (two classes are the big brown cells and the small brown cells, redundant points are red):
However I'm now working with another tissue type where one cell type can be very big and rounded, but also small and rounded or long elliptical. So I can't use the uniform close-/ or crosspairs distance any more. I can attach marks with the cell width and cell height (i.e. bounding box dimensions) to each point. But I can't figure out how to effectively use these dimensions to delete all points which boxes intersect too much or lay inside each other. I imagined sub-setting several size-classes based on the long side of the boxes and deleting overlapping points starting with the biggest size class. Here is a hand-drawn fictitious example with light blue dots= true points (cell centres), dark blue dots= redundant points, red, yellow, and green boxes = three size classes, arrows = supposed radii of the closepair-distance for the respective size class. Black arrow= true point which is deleted because of the elliptical shape of the cell. In this examples, the thresholds for the three classes are 14µm and 25µm.
Dose someone have a better solution or should this task be better done outside of spatstat?
Update: here is a working example based on the image above:
library(spatstat)
# cell centre coordinates
centroid.x<-c(4.5,5.5,8.6,7.25,8.3,13.5,13.5,16.7,17.5,19,20.5,21.1,27.5,32.5,30.9,29.5,32.5,29.5,32,35,34,24.5,23.5,21.5,23.5,22.1,17,13.5,11.5)*2.5
centroid.y<-c(3.5,9.5,9.5,19.9,22.5,22.25,27.5,24.5,20.9,26.9,21.5,18,21.5,24,22,20,16,13,12.5,9.3,7.5,9.5,12,11.5,6.5,3.9,5.5,13.5,11.9)*2.5
# marks, only one cell type in this example
bb.width<-c(7.1,7.6,7.6,9,7.9,4.2,6,4.2,5.5,4.2,5.3,4.2,5.8,5.8,5,3,8.5,4,3.9,3.5,4,13,8,15.5,6.6,8.8,4,7.6,7.6)*2.5 # bounding box width
bb.height<-c(7.7,7,7,6,7,3.8,7,4.2,4.2,4.2,5.3,4.2,9,7.5,6,3,5.2,4.5,3.9,3.5,4,9,8,10.5,3.1,4.9,4.6,6,6)*2.5 # bounding box height
marks.df<-as.data.frame(cbind("width"=as.integer(bb.width),"height"=as.integer(bb.height)))
#cell.type<-rep("posit",length(centroid.x))
#marks.df<-cbind(marks.df,"celltype"=cell.type)
#marks.df$celltype<-factor(marks.df$celltype,levels=c("posit","negat"))
# window
W<-owin(xrange=c(0,92.5), yrange=c(0,75))
# point pattern
examppp<-ppp(x = centroid.x, y = centroid.y, marks = marks.df, window=W)
examppp<-spatstat.geom::affine(examppp, mat=diag(c(1,-1))) # flip the y-axis!
plot(examppp)
# create three size classes based on the longest bounding box side
longside<-c()
M<-marks(examppp)
for(i in 1:nrow(M)){
if(M$width[i]>=M$height[i]){longside[i]<-M$width[i]}else{longside[i]<-M$height[i]}
}
sizes<-cut(longside, breaks = c(min(longside)-1,14,25,max(longside)+1),labels=c("small","medium","big"))
#marks(examppp)<-cbind(M,sizes)
# remove close- or crosspairs
u<-split(examppp,sizes)
# only for big and medium cells: the biggest bounding box (i.e. completely recognized cell) should be preserved!
# big cells
close_list = closepairs(u$big, rmax=12,twice = FALSE)
close_indexi <- close_list$i
close_indexj <- close_list$j
if(is.empty(close_indexi)==F){
close_index<-rep(NA,length(close_indexj))
for(ii in 1:length(close_indexi)){ # save the index of the smallest cell of each pair
if(sum(marks(u$big[close_indexi[ii]]))>=sum(marks(u$big[close_indexj[ii]]))){close_index[ii]<-close_indexj[ii]}else{close_index[ii]<-close_indexi[ii]}
}
u$big<-u$big[-close_index]}
# big vs medium cells
close_list = crosspairs(u$big, u$medium, 12)
close_index <- close_list$j
if(is.empty(close_index)==F){
u$medium<-u$medium[-close_index]}
# big vs small cells
close_list = crosspairs(u$big, u$small, 12)
close_index <- close_list$j
if(is.empty(close_index)==F){
u$small<-u$small[-close_index]}
# medium cells
close_list = closepairs(u$medium, rmax=8,twice = FALSE)
close_indexi <- close_list$i
close_indexj <- close_list$j
if(is.empty(close_indexi)==F){
close_index<-rep(NA,length(close_indexj))
for(ii in 1:length(close_indexi)){ # save the index of the smallest cell of each pair
if(sum(marks(u$medium[close_indexi[ii]]))>=sum(marks(u$medium[close_indexj[ii]]))){close_index[ii]<-close_indexj[ii]}else{close_index[ii]<-close_indexi[ii]}
}
u$medium<-u$medium[-close_index]}
# medium vs small cells
close_list = crosspairs(u$medium, u$small, 8)
close_index <- close_list$j
if(is.empty(close_index)==F){
u$small<-u$small[-close_index]}
# small vs small cells
close_list = closepairs(u$small, rmax=5,twice = FALSE)
close_index <- close_list$j
if(is.empty(close_index)==F){
u$small<-u$small[-close_index]}
examppp1<-superimpose(u)
plot(examppp1)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|