'Create loyalty variable in R by subtracting 2 date variables. Problem with getting the total amount of years. Football data

Im currently working for a Dutch football club and run into some problems with getting the length of relationship of certain fans. In the image below a short overview of a fan and his season tickets bought over time. The membershipJoinDate indicates when the fan joined the club, however when calculating the loyalty with the difference between the join and end date with the following code:

dfmembership_season1$Loyalty_days <- difftime(dfmembership_season1$MembershipEndDate ,dfmembership_season1$MembershipJoinDate , units = c("days"))
dfmembership_season1$Loyalty_years <- interval(dfmembership_season1$MembershipJoinDate, dfmembership_season1$MembershipEndDate) %>%
  as.period() %>%
  year()

I create the loyalty of the fan based on the products he bought. However I would like to calculate the overall loyalty in this variable, starting from the first date he came member of the club. I tried the following:

dfmembership1$loyalty_year <- dfmembership1 %>%
  group_by(GroupID) %>%
  mutate(loyalty_year = MembershipEndDate - MembershipJoinDate, default = first(MembershipJoinDate), order_by = GroupID)

But I get the error: Error in mutate(): ! Problem while computing loyalty_year = MembershipEndDate - MembershipJoinDate. ℹ The error occurred in group 1: GroupID = 302339. Caused by error in MembershipEndDate - MembershipJoinDate: ! non-numeric argument to binary operator Run rlang::last_error() to see where the error occurred.

Is there someone who has the solution?

For dput() data to reconstruct dataset: structure(list(MembershipType = c("Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart", "Seizoenkaart" ), TransactionPrice = c(0, 211, 206, 0, 0, 0, 209, 0, 0, 219, 206, 211, 214, 209), MembershipJoinDate = structure(c(17348, 16983, 16252, 18078, 18078, 17713, 16252, 17713, 17348, 18078, 16252, 17348, 17713, 16252), class = "Date"), MembershipEndDate = structure(c(18443, 17347, 17347, 18808, 18808, 18443, 17347, 18443, 18443, 18808, 17347, 18443, 18443, 17347), class = "Date"), GroupID = c(100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L, 100041615L), Loyalty_days = structure(c(1095, 364, 1095, 730, 730, 730, 1095, 730, 1095, 730, 1095, 1095, 730, 1095), class = "difftime", units = "days"), Loyalty_years = c(3, 1, 3, 2, 2, 2, 3, 2, 3, 2, 3, 3, 2, 3)), row.names = c(25162L, 25163L, 25164L, 25165L, 25166L, 25167L, 25168L, 25169L, 25170L, 25171L, 25172L, 25173L, 25174L, 25733L), class = "data.frame")

Example of data



Solution 1:[1]

If I understand you correctly, you want to calculate the time interval from the user's first MembershipJoinDate until the current row's MembershipEndDate. Here's a way to do that.

library(dplyr)
library(lubridate)

dfmembership_season1 %>%
  group_by(GroupID) %>%
  mutate(Loyalty_days = MembershipEndDate - min(MembershipJoinDate),
         Loyalty_years = time_length(Loyalty_days, unit = 'years')) %>%
  ungroup()

#> # A tibble: 14 × 7
#>    MembershipType TransactionPrice MembershipJoinDate MembershipEndDate   GroupID Loyalty_days Loyalty_years
#>    <chr>                     <dbl> <date>             <date>                <int> <drtn>               <dbl>
#>  1 Seizoenkaart                206 2014-07-01         2017-06-30        100041615 1095 days             3.00
#>  2 Seizoenkaart                209 2014-07-01         2017-06-30        100041615 1095 days             3.00
#>  3 Seizoenkaart                206 2014-07-01         2017-06-30        100041615 1095 days             3.00
#>  4 Seizoenkaart                209 2014-07-01         2017-06-30        100041615 1095 days             3.00
#>  5 Seizoenkaart                211 2016-07-01         2017-06-30        100041615 1095 days             3.00
#>  6 Seizoenkaart                  0 2017-07-01         2020-06-30        100041615 2191 days             6.00
#>  7 Seizoenkaart                  0 2017-07-01         2020-06-30        100041615 2191 days             6.00
#>  8 Seizoenkaart                211 2017-07-01         2020-06-30        100041615 2191 days             6.00
#>  9 Seizoenkaart                  0 2018-07-01         2020-06-30        100041615 2191 days             6.00
#> 10 Seizoenkaart                  0 2018-07-01         2020-06-30        100041615 2191 days             6.00
#> 11 Seizoenkaart                214 2018-07-01         2020-06-30        100041615 2191 days             6.00
#> 12 Seizoenkaart                  0 2019-07-01         2021-06-30        100041615 2556 days             7.00
#> 13 Seizoenkaart                  0 2019-07-01         2021-06-30        100041615 2556 days             7.00
#> 14 Seizoenkaart                219 2019-07-01         2021-06-30        100041615 2556 days             7.00

Created on 2022-05-13 by the reprex package (v2.0.1)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Aron