'Classify if a customer converted on initial email or reminder email using SQL
I have been stuck to get around this in Oracle SQL. I have a table that looks like following.
cust_id | event | event_date | camp_id | email_ind(calculated column) |
---|---|---|---|---|
1 | OPENED | 06/09/2021 | 80 | |
1 | SENT | 06/09/2021 | 80 | initial |
1 | SENT | 14/09/2021 | 80 | reminder |
2 | CLICK | 15/04/2021 | 80 | |
2 | OPENED | 15/04/2021 | 80 | |
2 | SENT | 15/04/2021 | 80 | initial |
2 | CONVERTED | 15/04/2021 | 80 | |
2 | WEBSITE_VISIT | 15/04/2021 | 80 | |
3 | OPENED | 01/06/2021 | 80 | |
3 | SENT | 01/06/2021 | 80 | initial |
3 | OPENED | 09/06/2021 | 80 | |
3 | SENT | 09/06/2021 | 80 | reminder |
3 | CLICK | 10/06/2021 | 80 | |
3 | OPENED | 10/06/2021 | 80 | |
3 | SENT | 10/06/2021 | 80 | |
3 | CONVERTED | 10/06/2021 | 80 | |
3 | WEBSITE_VISIT | 10/06/2021 | 80 | |
4 | SENT | 06/09/2021 | 80 | initial |
4 | SENT | 14/09/2021 | 80 | reminder |
5 | OPENED | 20/09/2021 | 80 | |
5 | SENT | 20/09/2021 | 80 | initial |
5 | SENT | 28/09/2021 | 80 | reminder |
5 | CLICK | 03/10/2021 | 80 | |
5 | OPENED | 03/10/2021 | 80 | |
5 | SENT | 03/10/2021 | 80 | not a reinder email |
5 | CONVERTED | 03/10/2021 | 80 | |
5 | WEBSITE_VISIT | 03/10/2021 | 80 | |
5 | OPENED | 05/11/2021 | 80 | |
6 | OPENED | 01-Jun-21 | 80 | |
6 | SENT | 01-Jun-21 | 80 | initial |
6 | OPENED | 09-Jun-21 | 80 | |
6 | SENT | 09-Jun-21 | 80 | reminder |
7 | OPENED | 26-Jul-21 | 80 | |
7 | SENT | 26-Jul-21 | 80 | initial |
7 | CLICK | 03-Aug-21 | 80 | |
7 | OPENED | 03-Aug-21 | 80 | |
7 | SENT | 03-Aug-21 | 80 | not a reminder email |
7 | CONVERTED | 03-Aug-21 | 80 | |
7 | WEBSITE_VISIT | 03-Aug-21 | 80 | |
7 | CLICK | 04-Aug-21 | 80 | |
7 | OPENED | 04-Aug-21 | 80 | |
7 | SENT | 04-Aug-21 | 80 | |
8 | OPENED | 12-Jul-21 | 80 | |
8 | SENT | 12-Jul-21 | 80 | initial |
8 | OPENED | 20-Jul-21 | 80 | |
8 | SENT | 20-Jul-21 | 80 | reminder |
9 | SENT | 29-Apr-21 | 80 | initial |
9 | SENT | 07-May-21 | 80 | reminder |
cust_id is customer_id ,event is the type of event (SENT - email was sent, OPENED - email was opened, CLICK- email was clicked, CONVERTED - customer accepted the offer) ,event_date is the timestamp of the event ,camp_id is the campaign_id.
Please note email_ind is not in the dataset. I basically want to know if the customer CONVERTED on the initial email or the reminder email that was SENT or there was no CONVERSION at all?
I want to create a new column called email_ind that tells if it is a initial email or a reminder email. Suppose, for cust_id=1, the initial email was sent on 06-Sept-21 and reminder email was sent 14-Sept-21 For cust_id=2, there was only initial email on 15-Apr-21 and then the cust_id CONVERTED on the same day. That means the customer accepted the offer on the initial email.
Also, when there is event='CONVERTED' there is automatically an event='SENT' is generated which should not be considered as a reminder email. For ex for cusT_id='3', the event 'CONVERTED' occured on 10-JUN-21 and automatically an event='SENT' is generated for 10-JUN-21 which should not be considered as reminder email.
Also, once the customer is CONVERTED, I do not want to consider the events that follow after conversion. For ex cust_id=7, it got converted on 03-Aug-21, so the rest of the vents on 04-Aug-21 can be ignored There are multiple camp_ids.
I tried self join and use windows function like over() partition by() but it doesn't seem to work. I must be missing something. It would be great if anyone could help here?
Solution 1:[1]
From Oracle 12, you can use MATCH_RECOGNIZE
to do row-by-row processing:
SELECT cust_id,
event,
event_date,
camp_id,
CASE email_ind
WHEN 'CONVERTED' THEN 'CONVERTED'
WHEN 'INITIAL' THEN 'INITIAL'
WHEN 'REMINDER' THEN 'REMINDER'
END AS email_ind,
CASE
WHEN email_ind = 'CONVERTED'
AND has_reminder = 1
THEN 'REMINDER'
WHEN email_ind = 'CONVERTED'
THEN 'INITIAL'
END AS conversion_type
FROM (SELECT t.*, ROWNUM AS rn FROM table_name t)
MATCH_RECOGNIZE (
PARTITION BY cust_id
ORDER BY rn
MEASURES
CLASSIFIER() AS email_ind,
COUNT(reminder.cust_id) AS has_reminder
ALL ROWS PER MATCH
PATTERN ( ("INITIAL" not_sent*)? (reminder not_sent*)? not_reminder converted | other )
DEFINE
"INITIAL" AS event = 'SENT',
reminder AS event = 'SENT',
not_reminder AS event = 'SENT',
not_sent AS event <> 'SENT',
converted AS event = 'CONVERTED'
)
db<>fiddle here
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |