'dply: order columns alphabetically in R
If I have a large DF (hundreds and hundreds) columns with different col_names randomly distributed alphabetically:
df.x <- data.frame(2:11, 1:10, rnorm(10))
colnames(df.x) <- c("ID", "string", "delta")
How would I order all of the data (vertically) by col_name alphabetically?
Essentially, I have hundreds of CSV(sep="|") text files that I need to read their columns into a single df, order those columns alphabetically and then use some other dplyf functions to get a final result. I have all of this figured out except how to order the columns alphabetically. I do not want to sort the columns (up and down) by alphabet, rather, the actual vertical orientation of the col_names and their corresponding data. Analogous to cutting and pasting entire columns of data in Excel.
For example I reviewed this approach but this is the "sort" the rows alphabetically bit, which is not what I'm looking to do.
How to sort a dataframe by column(s)?
Thanks!
Solution 1:[1]
An alternative way to do this in dplyr is:
iris %>%
select(sort(current_vars()))
current_vars()
returns column names such that they're sortable, and select()
will take the vector of column names.
Solution 2:[2]
If a specific column (or columns) has to be the first one (or last), but the rest is ordered, you can:
mtcars %>% tibble %>%
select("hp", sort(colnames(.)))
Solution 3:[3]
For a partial sorting using the recent tidyverse (more specific, the tidyselect
package), use peek_vars()
:
library(dplyr)
starwars
#> # A tibble: 87 x 14
#> name height mass hair_color skin_color eye_color birth_year
#> <chr> <int> <dbl> <chr> <chr> <chr> <dbl>
#> 1 Luke Skywalker 172 77 blond fair blue 19.0
#> 2 C-3PO 167 75 NA gold yellow 112.0
#> 3 R2-D2 96 32 NA white, blue red 33.0
#> 4 Darth Vader 202 136 none white yellow 41.9
#> 5 Leia Organa 150 49 brown light brown 19.0
#> 6 Owen Lars 178 120 brown, grey light blue 52.0
#> 7 Beru Whitesun lars 165 75 brown light blue 47.0
#> 8 R5-D4 97 32 NA white, red red NA
#> 9 Biggs Darklighter 183 84 black light brown 24.0
#> 10 Obi-Wan Kenobi 182 77 auburn, white fair blue-gray 57.0
#> # ... with 77 more rows, and 7 more variables: sex <chr>, gender <chr>,
#> # homeworld <chr>, species <chr>, films <list>, vehicles <list>, starships <list>
starwars %>% select(name, mass, sort(tidyselect::peek_vars()))
#> # A tibble: 87 x 14
#> name mass birth_year eye_color films gender hair_color height
#> <chr> <dbl> <dbl> <chr> <lis> <chr> <chr> <int>
#> 1 Luke Skywalker 77 19.0 blue <chr> masculine blond 172
#> 2 C-3PO 75 112.0 yellow <chr> masculine NA 167
#> 3 R2-D2 32 33.0 red <chr> masculine NA 96
#> 4 Darth Vader 136 41.9 yellow <chr> masculine none 202
#> 5 Leia Organa 49 19.0 brown <chr> feminine brown 150
#> 6 Owen Lars 120 52.0 blue <chr> masculine brown, grey 178
#> 7 Beru Whitesun lars 75 47.0 blue <chr> feminine brown 165
#> 8 R5-D4 32 NA red <chr> masculine NA 97
#> 9 Biggs Darklighter 84 24.0 brown <chr> masculine black 183
#> 10 Obi-Wan Kenobi 77 57.0 blue-gray <chr> masculine auburn, white 182
#> # ... with 77 more rows, and 6 more variables: homeworld <chr>, sex <chr>,
#> # skin_color <chr>, species <chr>, starships <list>, vehicles <list>
Solution 4:[4]
Why not just:
sort(colnames(df.x))
[1] "delta" "ID" "string"
Solution 5:[5]
We can define :
abc <- function(..., desc = FALSE) {
data <- tidyselect::peek_data()
named_selection <- tidyselect::eval_select(rlang::expr(c(...)), data)
named_selection[order(names(named_selection), named_selection, decreasing = desc)]
}
Unlike other solutions you can call it around any tidy selection, but in our case it can be only:
df %>% select(abc(everything()))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Steph Locke |
Solution 2 | HBat |
Solution 3 | MS Berends |
Solution 4 | Frank B. |
Solution 5 | moodymudskipper |