'Data Frame- Add number of occurrences with a condition in R

I'm having a bit of a struggle trying to figure out how to do the following. I want to map how many days of high sales I have previously a change of price. For example, I have a price change on day 10 and the high sales indicator will tell me any sale greater than or equal to 10. Need my algorithm to count the number of consecutive high sales.

In this case it should return 5 (day 5 to 9)

For example purposes, the dataframe is called df. Code:

#trying to create a while loop that will check if lag(high_sales) is 1, if yes it will count until
#there's a lag(high_sales) ==0

#loop is just my dummy variable that will take me out of the while loop
count_sales<-0
loop<-0
df<- df %>% mutate(consec_high_days= ifelse(price_change > 0, while(loop==0){
                                       if(lag(High_sales_ind)==1){
                                           count_sales<-count_sales +1}
                                       else{loop<-0}
                                       count_sales},0))


day price price_change sales High_sales_ind
1 5 0 12 1
2 5 0 6 0
3 5 0 5 0
4 5 0 4 0
5 5 0 10 1
6 5 0 10 1
7 5 0 10 1
8 5 0 12 1
9 5 0 14 1
10 7 2 3 0
11 7 0 2 0

This is my error message:

Warning: Problem with mutate() column consec_high_days. i consec_high_days = ifelse(...). i the condition has length > 1 and only the first element will be used

Warning: Problem with mutate() column consec_high_days. i consec_high_days = ifelse(...). i 'x' is NULL so the result will be NULL

Error: Problem with mutate() column consec_high_days. i consec_high_days = ifelse(...). x replacement has length zero

Any help would be greatly appreciated.



Solution 1:[1]

This is a very inelegant brute-force answer, though hopefully someone better than me can provide a more elegant answer - but to get the desired dataset, you can try:

df <- read.table(text = "day    price   price_change    sales   High_sales_ind
1 5 0   12  1
2   5   0   6   0
3   5   0   5   0
4   5   0   4   0
5   5   0   10  1
6   5   0   10  1
7   5   0   10  1
8   5   0   12  1
9   5   0   14  1
10  7   2   3   0
11  7   0   2   0", header = TRUE)

# assign consecutive instances of value
df$seq <- sequence(rle(as.character(df$sales >= 10))$lengths)

# Find how many instance of consecutive days occurred before price change
df <- df %>% mutate(lseq = lag(seq))

# define rows you want to keep and when to end
keepz <- df[df$price_change != 0, "lseq"]
end <- as.numeric(rownames(df[df$price_change != 0,]))-1

df_want <- df[keepz:end,-c(6:7)]

Output:

#     day price price_change sales High_sales_ind
# 5   5     5            0    10              1
# 6   6     5            0    10              1
# 7   7     5            0    10              1
# 8   8     5            0    12              1
# 9   9     5            0    14              1

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1