Find exact match with grep
I am trying to use grep on a fairly large data frame of survey comments
to identify the comments that contain any of a list of keywords:
index <- c(1:3)
comments <- c("My boss was good", "My bosses were great", "Goodness gracious I love my boss")
reviews <- data.frame(index, comments)
list = "good|leave|absense|physical|medical|sick"
What I want to do is create a new column in this data frame that contains only the comments with an exact match of one of the strings in 'list'. So from the 'comments' field I want [1], since it contains "good", which is in 'list', but not [3], because "Goodness" is not an exact match for "good".
This is what I've come up with so far:
new_column <- reviews[grep(list,reviews$comments, fixed=TRUE), ]
When I run this, however, I get a new column with 0 observations, despite the fact that there are comments containing words that I included in 'list'.
I know there have been similar posts in the past, but I still have not been able to find a good solution.
Solution 1:[1]
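Drop fixed=TRUE, which makes grep treat the pattern as a literal string rather than a regular expression, and use an alternation with word boundaries: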
regex <- "\\b(?:good|leave|absense|physical|medical|sick)\\b"
df <- reviews[grepl(regex, reviews$comments), ]
df
  index         comments
1     1 My boss was good
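To see why the original call returned nothing and what the word boundaries change, here is a minimal sketch on the question's sample comments. The variable names `literal_hits`, `substring_hit`, and `whole_word_hits` are made up for illustration and do not appear in the original post.

```r
comments <- c("My boss was good",
              "My bosses were great",
              "Goodness gracious I love my boss")

# fixed = TRUE searches for the literal text "good|leave",
# which none of the comments contain, so no indices come back
literal_hits <- grep("good|leave", comments, fixed = TRUE)
print(literal_hits)

# Without word boundaries, "good" also matches inside a longer word
substring_hit <- grepl("good", "goodness gracious")
print(substring_hit)

# With \\b on both sides, only the standalone word matches
whole_word_hits <- grepl("\\bgood\\b", c("My boss was good", "goodness gracious"))
print(whole_word_hits)
```

Note that the match is case-sensitive by default, so "Goodness" in the third comment would not match a lowercase "good" even without the boundaries.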
Data:
index <- c(1:3)
comments <- c("My boss was good", "My bosses were great", "Goodness gracious I love my boss")
reviews <- data.frame(index,comments)
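If the keywords live in a character vector rather than a hand-written string, the same pattern can be assembled with paste. The following is a sketch that goes beyond the original answer: the names `keywords`, `pattern`, `has_keyword`, and `matched` are assumptions chosen for illustration, and `ignore.case = TRUE` and `perl = TRUE` are optional additions rather than part of the answer above.

```r
# Sample data from the question
reviews <- data.frame(
  index = 1:3,
  comments = c("My boss was good",
               "My bosses were great",
               "Goodness gracious I love my boss"),
  stringsAsFactors = FALSE
)

# Hypothetical keyword vector (spelling kept as in the question)
keywords <- c("good", "leave", "absense", "physical", "medical", "sick")

# Join the keywords into one alternation wrapped in word boundaries
pattern <- paste0("\\b(?:", paste(keywords, collapse = "|"), ")\\b")

# Flag rows containing any keyword as a whole word, ignoring case;
# perl = TRUE asks for the pattern to be interpreted by PCRE
reviews$has_keyword <- grepl(pattern, reviews$comments,
                             ignore.case = TRUE, perl = TRUE)

# Keep only the flagged rows
matched <- reviews[reviews$has_keyword, ]
print(matched)
```

Keeping the logical `has_keyword` column on the full data frame may be closer to the "new column" the question asks for than a subset alone. If any keyword contains regex metacharacters such as `.` or `(`, it would need escaping before being pasted into the pattern.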
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Tim Biegeleisen |