R replace string in df with partial match in a list
I have a data frame (df) in R and I want to create a new column (city1_n) that contains the matching entry from the list key whenever there is a partial match between City1 and an element of key.
Below is a small example that should help to visualize my problem.
> dput(df)
structure(list(Country = c("USA", "France", "Italy", "Spain", 
"Mexico"), City1 = c("Los angeles", "Paris", "Rome", "Madrid", 
"Cancun"), City2 = c("New York", "Lyon", "Pisa", "Barcelona", 
"San Cristobal de las Casas")), class = "data.frame", row.names = c(NA, 
-5L))
> dput(key)
list("Los angeles California", "Paris Île-de-France", "Rome Lazio", 
    "Madrid Comunidad de Madrid ", "Cancun Quintana Roo")
Result:
I am looking to solve this in R or Unix.
Solution 1:[1]
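Use fuzzyjoin::fuzzy_left_join, with stringr::str_detect as the matching function (here key is a character vector, see the data below):
fuzzyjoin::fuzzy_left_join(df, data.frame(key), by = c("City1" = "key"), match_fun = \(x, y) stringr::str_detect(y, x))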
  Country       City1                      City2                         key
1     USA Los angeles                   New York      Los angeles California
2  France       Paris                       Lyon         Paris Île-de-France
3   Italy        Rome                       Pisa                  Rome Lazio
4   Spain      Madrid                  Barcelona Madrid Comunidad de Madrid 
5  Mexico      Cancun San Cristobal de las Casas         Cancun Quintana Roo
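If adding a dependency is not an option, roughly the same result can be sketched in base R. This is not part of the original answer, only a minimal illustration under a few assumptions: the helper name match_key is hypothetical, the match is literal (fixed = TRUE) rather than a regular expression, only the first matching key entry is kept, and the list from the question is flattened with unlist() first.

```r
# Example data from the question (key is a list, as in the original dput output)
df <- data.frame(
  Country = c("USA", "France", "Italy", "Spain", "Mexico"),
  City1   = c("Los angeles", "Paris", "Rome", "Madrid", "Cancun"),
  City2   = c("New York", "Lyon", "Pisa", "Barcelona",
              "San Cristobal de las Casas")
)
key <- list("Los angeles California", "Paris Île-de-France", "Rome Lazio",
            "Madrid Comunidad de Madrid ", "Cancun Quintana Roo")

# Flatten the list into a character vector so grepl() can scan it
key_vec <- unlist(key)

# Hypothetical helper: first key entry that contains `city` literally, else NA
match_key <- function(city, keys) {
  hit <- keys[grepl(city, keys, fixed = TRUE)]
  if (length(hit) > 0) hit[[1]] else NA_character_
}

# Build the new column, one lookup per row of City1
df$city1_n <- vapply(df$City1, match_key, character(1),
                     keys = key_vec, USE.NAMES = FALSE)
```

Unlike the fuzzy join, this keeps exactly one row per row of df even when a city matches several key entries, which may or may not be what is wanted.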
data
df <- structure(list(Country = c("USA", "France", "Italy", "Spain", "Mexico"),
                     City1 = c("Los angeles", "Paris", "Rome", "Madrid", "Cancun"),
                     City2 = c("New York", "Lyon", "Pisa", "Barcelona",
                               "San Cristobal de las Casas")),
                class = "data.frame", row.names = c(NA, -5L))
key <- c("Los angeles California", "Paris Île-de-France", "Rome Lazio",
         "Madrid Comunidad de Madrid ", "Cancun Quintana Roo")
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | Maël | 


