Using grep to match variables in one column to a string of text in another column [duplicate]
I need to match a string in the first variable with a string in the second variable and then return true or false in the third column.
Here is my data:
```r
regex <- c("cat", "dog", "mouse")
text <- c("asdf.cat/asdf", "asdf=asdf", "asdf=mouse asdf")
df <- data.frame(regex, text)
```
And I need output like this:
regex | text | result |
---|---|---|
cat | asdf.cat/asdf | 1 |
dog | asdf=asdf | 0 |
mouse | asdf=mouse asdf | 1 |
I have tried using `grepl`, but I can't figure out how to use it on a data frame.
```r
df$result <- as.integer(grepl("cat", df$text))
```
This only gives the right answer for the first row, because the pattern "cat" is hard-coded.
I have also tried the following code, which filters the data down to the matching rows, but I want to keep all rows and just return true or false:
```r
df %>%
  filter(unlist(Map(function(x, y) grepl(x, y), regex, text)))
```
As you can see, it is complicated by the text strings containing various special characters.
I feel like this should be easy, but I can't wrap my head around it!
Solution 1:[1]
Instead of `grepl`, use `str_detect`, which is vectorised over both the `pattern` and the `string`:
```r
library(stringr)
library(dplyr)

df %>%
  # unary + converts the logical result to 1/0
  mutate(result = +(str_detect(text, regex)))
```
Output:
```
  regex            text result
1   cat   asdf.cat/asdf      1
2   dog       asdf=asdf      0
3 mouse asdf=mouse asdf      1
```
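For anyone who would rather stay in base R, the same row-by-row match can be sketched with `mapply()`. This is not part of the original answer, just an alternative under the assumption that both columns are character vectors. `grepl()` only uses the first element of a pattern vector, so each pattern is paired with its own text explicitly:

```r
# Same example data as in the question.
df <- data.frame(
  regex = c("cat", "dog", "mouse"),
  text  = c("asdf.cat/asdf", "asdf=asdf", "asdf=mouse asdf"),
  stringsAsFactors = FALSE
)

# Call grepl(pattern, x) once per row, then convert TRUE/FALSE to 1/0.
df$result <- as.integer(mapply(grepl, df$regex, df$text, USE.NAMES = FALSE))

df$result
```

This is essentially the `Map()` attempt from the question, assigned to a new column instead of being passed to `filter()`.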
Data:
```r
df <- structure(list(regex = c("cat", "dog", "mouse"), text = c("asdf.cat/asdf",
"asdf=asdf", "asdf=mouse asdf")), class = "data.frame", row.names = c(NA,
-3L))
```
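On the worry about special characters: characters in the text column are harmless, but regex metacharacters in the pattern column (such as `.` or `+`) change what is matched. If the patterns are meant as literal text, one option is `fixed = TRUE`. The sketch below uses base R and hypothetical data (`df2` and its values are made up for illustration, not taken from the question):

```r
# Hypothetical data: both patterns contain regex metacharacters.
df2 <- data.frame(
  regex = c("a.b", "x+y"),
  text  = c("a-b", "x+y=z"),
  stringsAsFactors = FALSE
)

# Regex matching: "." matches any character, "+" means "one or more".
df2$regex_match <- as.integer(
  mapply(grepl, df2$regex, df2$text, USE.NAMES = FALSE)
)

# Literal matching: fixed = TRUE treats each pattern as plain text.
df2$literal_match <- as.integer(
  mapply(grepl, df2$regex, df2$text,
         MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
)

df2
```

Here the two columns disagree on both rows, which shows why it is worth deciding up front whether the first column holds regular expressions or plain strings. stringr provides the `fixed()` modifier for the same purpose.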
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | akrun |