HTML tag remove from
I have written code to download files from FTP sites. Here is the start of my code:
library(RCurl)  # provides getURL()
filenames = getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
#filenames <- strsplit(filenames,"<!></><>")
filenames <- gsub("<(.|\n)*?>","", filenames)
#gsub("<.*?>", filenames)
filenames = unlist(filenames)
filenames
and I am getting output like this:
"\n\n \n Index of /data\n \n \nIndex of /data\nNameLast modifiedSize\nParent Directory - \n2D/24-Feb-2021 15:57 - \n3DRefl/13-Oct-2020 16:30 - \n3DRhoHV/13-Oct-2020 16:30 - \n3DZdr/13-Oct-2020 16:30 - \nProbSevere/13-Oct-2020 16:30 - \nRIDGEII/11-Feb-2021 19:02 - \nheartbeat-50m11-Mar-2022 10:07
48M\nheartbeat-500m11-Mar-2022 10:07 477M\n\n\n\n"
Can anyone please tell me how to remove the tags? I used the method below:
filenames <- gsub("<(.|\n)*?>","", filenames)
Solution 1:[1]
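The gsub() call has already stripped the tags; what remains in the output are newline characters, which str_remove_all() from stringr can remove: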
library(stringr)
str_remove_all(filenames, '\n')
[1] " Index of /data Index of /dataNameLast modifiedSizeParent Directory - 2D/24-Feb-2021 15:57 - 3DRefl/13-Oct-2020 16:30 - 3DRhoHV/13-Oct-2020 16:30 - 3DZdr/13-Oct-2020 16:30 - ProbSevere/13-Oct-2020 16:30 - RIDGEII/11-Feb-2021 19:02 - heartbeat-50m11-Mar-2022 10:0748Mheartbeat-500m11-Mar-2022 10:07 477M"
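Removing every newline fuses the entries into one long string, so if the goal is one element per directory entry, splitting on the newlines may be more useful. A minimal base R sketch, assuming the text looks like the tag-stripped output in the question; the `listing` string below is a shortened, hypothetical sample, not real server output:

```r
# Hypothetical sample shaped like the tag-stripped listing in the question.
listing <- "\n\n \n Index of /data\n \n \nParent Directory - \n2D/24-Feb-2021 15:57 - \n3DRefl/13-Oct-2020 16:30 - \n"

# Split on literal newlines so each entry becomes its own element.
entries <- unlist(strsplit(listing, "\n", fixed = TRUE))

# Trim surrounding whitespace and drop lines that are now empty.
entries <- trimws(entries)
entries <- entries[nzchar(entries)]

entries
```

Each element still has the name and timestamp run together (for example "2D/24-Feb-2021 15:57 -"), because stripping the tags also removed the separation between the table cells.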
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Nad Pat |