Category "tidyverse"

How to split up a dataframe with one column into a dataframe with different columns?

I have asked a similar question before and tried to use the answers (which were very good) on my project, but I failed. I have the following dataframe: library(

can you use split_cols_by and also get a total column?

I'm making a table like this: basic_table() %>% split_cols_by("ARM") %>% analyze(vars = c("AGE", "BMRKR1"), afun = function(x) { in_rows( "M

case_when fails when condition checks for rows that don't exist

Consider this data: df <- data.frame(group = c(1, 2, 2, 2), start = c(2, 7, 7, 7), stop = c(8, 7, 8, 9),

Create and fill new columns based on range information from two other columns

I have the following data: df <- data.frame(group = c(1, 1, 1, 2, 2, 2), start = c(2, 2, 2, 7, 7, 7), stop = c(4, 7, 8,

How do I replicate these plots using ggplot?

plots This is what I have tried so far. The box plot is kind of close, but the other plot is way off. ggplot(data_anova, aes(x = delay, y = soa, color = age)) +

Annotate ggplot2 across multiple facets

I have recently started using the facet_nested function from the ggh4x package and I really like the look of the nested axis. I would like to annotate the plot

Combine list of dataframes into one dataframe and summarize in one step

I want to combine/reduce a list of dataframes into one dataframe, but I also want to summarize the data in one step. The output is from a simulation; therefore,

Fetching values from one column based on other column keys in long-formatted dataset

I have a long format dataset of 100,000+ individuals, capturing clinic visits at 5 different time points (not chronological). I've included an example dataset b

Extract div class text and sub tables in rvest

I am trying to recreate a table from this website under "Battle Pass Rewards." The final result is a data.frame with each of the following areas as different co

Simultaneously remove the first and last rows of a data frame until reaching a row that does not have an NA

I have a dataframe that contains NA values, and I want to remove some rows that have an NA (i.e., not complete cases). However, I only want to remove rows at th

Create new column based on presence/absence of string in other column by group

I have this dataset about vessel locations, where the same "id" can correspond to two levels. Corresponds to a defined category, such as "fishing" and may also

Issues with accent when using the "separate" function from tidyverse

I am using the separate function from tidyverse to split the first column of this tibble: # A tibble: 6,951 x 9 Row.names Number_of_ana

select(column, (contains()) not showing results

chase_2021 = chase[c(143:1020),] paychecks = chase_2021 %>% select(Posting.Date, Amount, Description, starts_with('CVS'), ends_with('PPD ID: 995338

Change column name with tq_get() in Tidyquant

I'm using the tq_get() function in Tidyquant to retrieve economic data from FRED: library(tidyquant) library(tidyverse) consumer_price_index <- 'CPIAUCSL'

How to filter out rows that do not fit specified condition in R [duplicate]

I have this data frame: df <- data.frame (ID = c(1:20), Ethnicity = c(rep(c("White", "Asian", "Black", "Hispanic", "Othe

Count number of rows that fulfill multiple conditions in R

I have a dataframe: df <- data.frame (ID = c(1:20), Ethnicity = c(rep(c("White", "Asian", "Black", "Hispanic", "Other"), times=20/5)),

Mutate variable conditional on first unique occurrence of another variable

I want to create a variable which identifies the first occurrence of a variable in a column but I cannot seem to get the code to work. The new variable should on

How to insert rows at specific indices of a dataframe containing the sum of a few rows above, using only an R dplyr pipe

For the dataframe below, df <- data.frame(id = c(rep(101, 4), rep(202, 3)), status = c("a","b","c","d", "a", "b", "c"), wt = c(10

Counting observations by 30-day window

As I explained in previous posts, I'm trying to count observations over 30-day windows, grouping by id. The data: df<-structure(list(id=c(1,1,1,2),date=c("200

How can I filter rows out if their start date is within 90 days from today and place them out until the 1st of the following month in R?

I am having difficulty finding the words to describe what I am searching for but will try. I would like to solve the following using R or Python (but preferably