Remove underscore and number at the end of string
I am working with a dataset that has a column containing underscores. The values follow a pattern, but the pattern varies from row to row, as shown below:
ID Col1
1029 ap_analog
2334 critical_1_mm_1
2334 transpose_2_mm_2
9877 public_1_yes_0_no_1
9877 public_1_yes_0_no_2
1333 Lateral_mm
1333 Lateral_mm_1
1333 Lateral_mm_2
1333 Lateral_mm_3
1333 ap_mm_axial
1333 ap_mm_axial_1
1333 ap_mm_axial_2
1333 ap_mm_axial_3
9876 central_star_six_mm
9876 central_star_six_mm_1
9876 central_star_six_mm_2
9876 central_star_six_mm_3
I would just like to separate the trailing number from the string, giving a final dataset like this:
ID Col1 Index
1029 ap_analog 0
2334 critical_1_mm 1
2334 transpose_2_mm 2
9877 public_1_yes_0_no 1
9877 public_1_yes_0_no 2
1333 Lateral_mm 0
1333 Lateral_mm 1
1333 Lateral_mm 2
1333 Lateral_mm 3
1333 ap_mm_axial 0
1333 ap_mm_axial 1
1333 ap_mm_axial 2
1333 ap_mm_axial 3
9876 central_star_six_mm 0
9876 central_star_six_mm 1
9876 central_star_six_mm 2
9876 central_star_six_mm 3
Right now I am doing this very inefficiently, with something like this:
df1$index <- df1$Col1
for (i in 1:3) {
  df1$index <- regmatches(df1$index, gregexpr("(?<=_).*", df1$index, perl = TRUE))
}
df1$index[ which(df1$index == "character(0)")] <- 0
I would appreciate any suggestions to improve on this.
Solution 1:[1]
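One way using dplyr and stringr:
We can extract the Index value, which is the number at the end of Col1, and replace the NA values (rows with no trailing number) with 0. We can then remove the trailing underscore and number from Col1. Note that str_extract returns a character vector, so Index is a character column here; wrap it in as.integer() if a numeric column is needed.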
library(dplyr)
library(stringr)
df %>%
  mutate(Index = str_extract(Col1, '\\d+$'),
         Index = replace(Index, is.na(Index), 0),
         Col1 = sub('_\\d+$', '', Col1))
# ID Col1 Index
#1 1029 ap_analog 0
#2 2334 critical_1_mm 1
#3 2334 transpose_2_mm 2
#4 9877 public_1_yes_0_no 1
#5 9877 public_1_yes_0_no 2
#6 1333 Lateral_mm 0
#7 1333 Lateral_mm 1
#8 1333 Lateral_mm 2
#9 1333 Lateral_mm 3
#10 1333 ap_mm_axial 0
#11 1333 ap_mm_axial 1
#12 1333 ap_mm_axial 2
#13 1333 ap_mm_axial 3
#14 9876 central_star_six_mm 0
#15 9876 central_star_six_mm 1
#16 9876 central_star_six_mm 2
#17 9876 central_star_six_mm 3
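For anyone who prefers to avoid extra packages, the same idea can be sketched in base R. This is only an illustration under stated assumptions: the small `df1` built here is a hypothetical sample that mirrors a few rows of the question's data, and `has_suffix` is a helper name introduced for the example. Unlike the dplyr version above, it stores Index as an integer.

```r
# Hypothetical sample data mirroring a few rows from the question
df1 <- data.frame(
  ID   = c(1029, 2334, 9877, 1333, 1333),
  Col1 = c("ap_analog", "critical_1_mm_1", "public_1_yes_0_no_2",
           "Lateral_mm", "Lateral_mm_3"),
  stringsAsFactors = FALSE
)

# TRUE where Col1 ends in an underscore followed by one or more digits
has_suffix <- grepl("_[0-9]+$", df1$Col1)

# Default every row to 0, then fill in the trailing number where one exists
df1$Index <- 0L
df1$Index[has_suffix] <- as.integer(
  sub(".*_([0-9]+)$", "\\1", df1$Col1[has_suffix])
)

# Strip the trailing "_<number>" from Col1 (only after Index has been extracted)
df1$Col1 <- sub("_[0-9]+$", "", df1$Col1)

print(df1)
```

The pattern is anchored with `$`, so only the final `_<number>` is affected and the digits inside names such as `critical_1_mm` or `public_1_yes_0_no` are left alone.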
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
