'Create loyalty variable in R by subtracting 2 date variables. Problem with getting the total amount of years. Football data
Im currently working for a Dutch football club and run into some problems with getting the length of relationship of certain fans. In the image below a short overview of a fan and his season tickets bought over time. The membershipJoinDate indicates when the fan joined the club, however when calculating the loyalty with the difference between the join and end date with the following code:
dfmembership_season1$Loyalty_days <- difftime(dfmembership_season1$MembershipEndDate ,dfmembership_season1$MembershipJoinDate , units = c("days"))
dfmembership_season1$Loyalty_years <- interval(dfmembership_season1$MembershipJoinDate, dfmembership_season1$MembershipEndDate) %>%
as.period() %>%
year()
I create the loyalty of the fan based on the products he bought. However I would like to calculate the overall loyalty in this variable, starting from the first date he came member of the club. I tried the following:
dfmembership1$loyalty_year <- dfmembership1 %>%
group_by(GroupID) %>%
mutate(loyalty_year = MembershipEndDate - MembershipJoinDate, default = first(MembershipJoinDate), order_by = GroupID)
But I get the error:
Error in mutate()
:
! Problem while computing loyalty_year = MembershipEndDate - MembershipJoinDate
.
ℹ The error occurred in group 1: GroupID = 302339.
Caused by error in MembershipEndDate - MembershipJoinDate
:
! non-numeric argument to binary operator
Run rlang::last_error()
to see where the error occurred.
Is there someone who has the solution?
For dput() data to reconstruct dataset: structure(list(MembershipType = c("Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart" ), TransactionPrice = c(0, 211, 206, 0, 0, 0, 209, 0, 0, 219, 206, 211, 214, 209), MembershipJoinDate = structure(c(17348, 16983, 16252, 18078, 18078, 17713, 16252, 17713, 17348, 18078, 16252, 17348, 17713, 16252), class = "Date"), MembershipEndDate = structure(c(18443, 17347, 17347, 18808, 18808, 18443, 17347, 18443, 18443, 18808, 17347, 18443, 18443, 17347), class = "Date"), GroupID = c(100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L), Loyalty_days = structure(c(1095, 364, 1095, 730, 730, 730, 1095, 730, 1095, 730, 1095, 1095, 730, 1095), class = "difftime", units = "days"), Loyalty_years = c(3, 1, 3, 2, 2, 2, 3, 2, 3, 2, 3, 3, 2, 3)), row.names = c(25162L, 25163L, 25164L, 25165L, 25166L, 25167L, 25168L, 25169L, 25170L, 25171L, 25172L, 25173L, 25174L, 25733L), class = "data.frame")
Solution 1:[1]
If I understand you correctly, you want to calculate the time interval from the user's first MembershipJoinDate
until the current row's MembershipEndDate
. Here's a way to do that.
library(dplyr)
library(lubridate)
dfmembership_season1 %>%
group_by(GroupID) %>%
mutate(Loyalty_days = MembershipEndDate - min(MembershipJoinDate),
Loyalty_years = time_length(Loyalty_days, unit = 'years')) %>%
ungroup()
#> # A tibble: 14 × 7
#> MembershipType TransactionPrice MembershipJoinDate MembershipEndDate GroupID Loyalty_days Loyalty_years
#> <chr> <dbl> <date> <date> <int> <drtn> <dbl>
#> 1 Seizoenkaart 206 2014-07-01 2017-06-30 100041615 1095 days 3.00
#> 2 Seizoenkaart 209 2014-07-01 2017-06-30 100041615 1095 days 3.00
#> 3 Seizoenkaart 206 2014-07-01 2017-06-30 100041615 1095 days 3.00
#> 4 Seizoenkaart 209 2014-07-01 2017-06-30 100041615 1095 days 3.00
#> 5 Seizoenkaart 211 2016-07-01 2017-06-30 100041615 1095 days 3.00
#> 6 Seizoenkaart 0 2017-07-01 2020-06-30 100041615 2191 days 6.00
#> 7 Seizoenkaart 0 2017-07-01 2020-06-30 100041615 2191 days 6.00
#> 8 Seizoenkaart 211 2017-07-01 2020-06-30 100041615 2191 days 6.00
#> 9 Seizoenkaart 0 2018-07-01 2020-06-30 100041615 2191 days 6.00
#> 10 Seizoenkaart 0 2018-07-01 2020-06-30 100041615 2191 days 6.00
#> 11 Seizoenkaart 214 2018-07-01 2020-06-30 100041615 2191 days 6.00
#> 12 Seizoenkaart 0 2019-07-01 2021-06-30 100041615 2556 days 7.00
#> 13 Seizoenkaart 0 2019-07-01 2021-06-30 100041615 2556 days 7.00
#> 14 Seizoenkaart 219 2019-07-01 2021-06-30 100041615 2556 days 7.00
Created on 2022-05-13 by the reprex package (v2.0.1)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Aron |