Category "dplyr"

Overwrite variables if condition is met, else keep existing values R

I have a data frame df<-data.frame(Name=c('H001', 'H002', 'H003', 'H004', 'H005', 'H006', 'H007', 'H008', 'H009', 'H010'),

How to rank a variable in a column based on a conditional, when there are NAs in the column

I have a longitudinal data set with two people in which the rows of data are numbered as 'episodes', and some episodes have a test 'result'. The goal of the bel

How to get the frequency (count) of Variable C when Variables A and B are mentioned together?

I have the following dplyr code: df3 <- Table3%>% group_by(Q6,Q9,Q11) %>% summarise(count = n()) %>% mutate(per = paste0(round(100 *count/sum(

Dplyr Lags on Summarised Grouped Data

Using dplyr, I'm looking to summarise a new column of data as a lagged version of an existing column of grouped data. Reprex: dateidx <- as.Date(c("2019-

Is there a way to vectorize seq() and grep() to use in conjunction with dplyr?

Apologies if this is obvious, I don't have much experience with R. I have a function contains_leap_year(date1, date2) that I want to pass in as a condition to d

Create several new variables using a vector of names and a vector for computation within dplyr::mutate

I'd like to create several new columns. They should take their names from one vector and they should be computed by taking one column in the data and dividing i

Merging three dfs of different row lengths

I need to merge three separate DFs ("factors_sed", "resp", and "npoc_sed") based on the shared column "Samples". Each DF contains a different number of rows (s

Change values in multiple columns if condition is met (R) [duplicate]

I have a dataframe: n <- 50 df <- data.frame(id = seq (1:n), age = sample(c(20:90), n, rep = TRUE), s

Error when updating a dataframe with new column inside a for loop using Dplyr

I have the following R dataframe df: library(tidyquant) start_date <- as.Date('2022-01-01') end_date <- as.Date('2022-03-31') assets_list <- c('DGS30

Calculate changes in totals of subgroups in R

I have the following dataframe: # A tibble: 8 x 5 Year Group Unit Profit Sales <dbl> <chr> <chr> <dbl> <dbl> 1 2021 One

Why is my dplyr code to create multiple variables using mutate and zoo incredibly slow?

I am using dplyr to create multiple variables in my data frame using mutate. At the same time, I am using zoo to calculate a rolling average. As an example, I h

Dplyr summarize "sum" function works correctly only for subset not the larger dataset in R

I have a dataset where I sampled abundance of 4 species across 12 months, 6 sites (5 replicates within a site). I am trying to calculate various summary stats (

Copy column from one data.frame to another based on index

The problem is similar to what was posted in "Combine dataframe based on index R". I am trying to copy one column from df2 (huge df) to df1 (small df) but based on ind

Lag and lead a variable in a dataframe by 1 month and 6 business days for panel data

I have a large panel data set and I would like to lag and lead a variable by 1 month and 6 business days. I know, for instance, from dplyr there is the lag or

How to filter very small values in R?

I have a large dataset in which one column is p-values that range from 0.9 to being extremely small like 5e-79. In R I can sort the data in descending order and

Use dplyr::select's where with base R grepl and anonymous function

There is a very similar question here: How to select columns based on grep in dplyr::tibble. However, I think that the select_if was superseded with select(where

Why doesn't R dplyr arrange sort properly using a vector element within a for loop

I'm having trouble getting R's dplyr::arrange() to sort properly when used in a for loop. I found many posts discussing this issue (like ex.1 with the .by_grou

Joining two datasets by (non-uniform) names

I need to join two datasets and the only identifier in both are the company names. For example: db1 <- tibble( Company = c('Bombardier Inc.','Honeywell Dev

Paste together results within case_when (if-else) statements

I want to paste together results within the same case_when statement (i.e., if multiple statements are true for a given row). I know that I could do something l

dplyr get linear regression coefficients

I'm wondering if there is a better way to get linear regression coefficients as columns in dplyr. Here is some sample data. mydata <- data.frame( S