Traverse a link tree for Shiny dashboard
I am building an interactive Shiny dashboard with typical "drill-down" capability. There are hundreds of thousands of links in the database. I want to take the home page, such as `stackoverflow.com/`, find all subfolders within it, and rank them by page views. Using `shinydashboard` and D3, I'll plot that as a bar plot and allow the user to drill down from there. Once they've clicked on another page, the same process occurs: pulling all subfolders, ranking by page views, plotting, and so on, thus allowing traversal across the link tree.
I have tried using `str_view()` with a couple of different regex options, `/.*?/` and `/[a-z]+/`, both of which give me the first subfolder. But this would cause massive duplicates, because it doesn't account for the rest of the folders in the link path. So `stackoverflow.com/questions/` would be pulled along with `stackoverflow.com/questions/ask`, because the regex would identify both as containing `/questions/`, when I only need the result to be `/questions/` and `/tags/`.
I have also tried `strsplit()` to identify all rows where, for example, column 3 of the dataframe is blank, and then running a SQL query for each of those links against the database to rank them. But since the website I am working with is so large, this becomes cumbersome quickly. It also slows down the reaction time of the dashboard because of the queries.
Has anyone implemented something like this? If so, any pointers would be greatly appreciated.
If you want to test what I am discussing, here's the reprex:
library(stringr)  # provides str_view() and re-exports %>%
test <- c("www.stackoverflow.com/questions/","www.stackoverflow.com/questions/ask/","www.stackoverflow.com/tags/")
test %>% str_view("/.*?/")
test %>% str_view("/[a-z]+/")
test %>% strsplit("/")
Solution 1:[1]
Is this what you're looking for?
test <- c("www.stackoverflow.com/questions/","www.stackoverflow.com/questions/ask/","www.stackoverflow.com/tags/")
library(stringr)
unique(str_split(test, "\\/", simplify=TRUE)[,2])
# [1] "questions" "tags"
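The one-liner above only covers the first level below the host. For the drill-down described in the question, one possible extension is sketched below in base R: given the currently selected page, it returns that page's immediate subfolders ranked by page views, all computed in memory from a single pull of the data rather than with one query per link. The helper name `child_folders`, the `views` vector, and the view counts are made up for illustration, and adding a nested page's views to its enclosing folder is an assumption about how the ranking should work.

```r
# Hypothetical helper (not from the original answer): immediate child folders
# of `parent`, ranked by summed page views. `urls` and `views` are parallel
# vectors, e.g. two columns pulled from the database once.
child_folders <- function(urls, views, parent) {
  # Normalise so the parent and every URL end in exactly one "/"
  parent <- sub("/*$", "/", parent)
  urls   <- sub("/*$", "/", urls)

  # Keep only pages strictly below the parent (the parent itself is excluded)
  below <- startsWith(urls, parent) & nchar(urls) > nchar(parent)
  if (!any(below)) {
    return(data.frame(folder = character(0), views = numeric(0)))
  }

  # Drop the parent prefix; the first remaining segment is the child folder
  rest  <- substring(urls[below], nchar(parent) + 1)
  child <- sub("/.*$", "", rest)

  # Assumption: views of nested pages count toward their enclosing child folder
  totals <- vapply(split(views[below], child), sum, numeric(1))
  totals <- totals[order(-totals, names(totals))]

  data.frame(folder = paste0(parent, names(totals), "/"),
             views  = as.vector(totals),
             stringsAsFactors = FALSE)
}

urls  <- c("www.stackoverflow.com/questions/",
           "www.stackoverflow.com/questions/ask/",
           "www.stackoverflow.com/tags/")
views <- c(100, 40, 75)  # made-up page views for illustration

# Top level: questions (100 + 40 = 140 views), then tags (75 views)
child_folders(urls, views, "www.stackoverflow.com/")

# After the user clicks "questions": only questions/ask/ (40 views)
child_folders(urls, views, "www.stackoverflow.com/questions/")
```

In a Shiny app, the clicked bar's `folder` value could be passed back in as the next `parent`, so each drill-down step is a filter and an aggregation on data already in memory.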
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | DaveArmstrong |