'Data Frame- Add number of occurrences with a condition in R
I'm having a bit of a struggle trying to figure out how to do the following. I want to map how many days of high sales I have previously a change of price. For example, I have a price change on day 10 and the high sales indicator will tell me any sale greater than or equal to 10. Need my algorithm to count the number of consecutive high sales.
In this case it should return 5 (day 5 to 9)
For example purposes, the dataframe is called df
. Code:
#trying to create a while loop that will check if lag(high_sales) is 1, if yes it will count until
#there's a lag(high_sales) ==0
#loop is just my dummy variable that will take me out of the while loop
count_sales<-0
loop<-0
df<- df %>% mutate(consec_high_days= ifelse(price_change > 0, while(loop==0){
if(lag(High_sales_ind)==1){
count_sales<-count_sales +1}
else{loop<-0}
count_sales},0))
day | price | price_change | sales | High_sales_ind |
---|---|---|---|---|
1 | 5 | 0 | 12 | 1 |
2 | 5 | 0 | 6 | 0 |
3 | 5 | 0 | 5 | 0 |
4 | 5 | 0 | 4 | 0 |
5 | 5 | 0 | 10 | 1 |
6 | 5 | 0 | 10 | 1 |
7 | 5 | 0 | 10 | 1 |
8 | 5 | 0 | 12 | 1 |
9 | 5 | 0 | 14 | 1 |
10 | 7 | 2 | 3 | 0 |
11 | 7 | 0 | 2 | 0 |
This is my error message:
Warning: Problem with
mutate()
columnconsec_high_days
. iconsec_high_days = ifelse(...)
. i the condition has length > 1 and only the first element will be used
Warning: Problem with
mutate()
columnconsec_high_days
. iconsec_high_days = ifelse(...)
. i 'x' is NULL so the result will be NULL
Error: Problem with
mutate()
columnconsec_high_days
. iconsec_high_days = ifelse(...)
. x replacement has length zero
Any help would be greatly appreciated.
Solution 1:[1]
This is a very inelegant brute-force answer, though hopefully someone better than me can provide a more elegant answer - but to get the desired dataset, you can try:
df <- read.table(text = "day price price_change sales High_sales_ind
1 5 0 12 1
2 5 0 6 0
3 5 0 5 0
4 5 0 4 0
5 5 0 10 1
6 5 0 10 1
7 5 0 10 1
8 5 0 12 1
9 5 0 14 1
10 7 2 3 0
11 7 0 2 0", header = TRUE)
# assign consecutive instances of value
df$seq <- sequence(rle(as.character(df$sales >= 10))$lengths)
# Find how many instance of consecutive days occurred before price change
df <- df %>% mutate(lseq = lag(seq))
# define rows you want to keep and when to end
keepz <- df[df$price_change != 0, "lseq"]
end <- as.numeric(rownames(df[df$price_change != 0,]))-1
df_want <- df[keepz:end,-c(6:7)]
Output:
# day price price_change sales High_sales_ind
# 5 5 5 0 10 1
# 6 6 5 0 10 1
# 7 7 5 0 10 1
# 8 8 5 0 12 1
# 9 9 5 0 14 1
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |