'Get a rolling order count into session data

I have the following table

enter image description here

One client has two purchases in one session.

My goal is to assign a order counter to each row of the table.

To reach this goal I am using the lag function to call the last order_id and the last order_timestamp:

SELECT 
lag(event_timestamp) over (partition by session_id order by
ecom_data.order_id) as prev_order_timestamp,  
lag(ecom_data.order_id)
over (partition by session_id order by event_timestamp) as
prev_order_number 
From table

My desired output is this:

enter image description here

Problem, I do not get the previous order time. Instead I get the event_timestamp from the previous event.

My second challenge is that I do not know how to assign a order_count. My desired output is this:

enter image description here

Ideally, this order count should be rolling as in the real data set I dont know how many orders in total each session had. There can be 0 - infinite orders.

Can you help ?

Thank you!



Solution 1:[1]


### create sample table (helps to introduce these in your questions)
WITH
  base AS (
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:17:41") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'ts' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:17:42") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:27:14") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'atc' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:27:15") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'p' AS event_name,
    123 AS order_id,
    DATETIME("2022-05-12 10:30:47") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:30:50") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:01") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'atc' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:20") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'ts' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:22") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'rv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:32") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:35") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:32:49") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'p' AS event_name,
    456 AS order_id,
    DATETIME("2022-05-12 10:33:35") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:33:48") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'tv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:33:50") AS event_timestamp
  UNION ALL
  SELECT
    'B' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:50") AS event_timestame
  UNION ALL
  SELECT
    'B' AS client_id,
    1 AS session_id,
    'p' AS event_name,
    123 AS order_id,
    DATETIME("2022-05-12 10:33:50") AS event_timestame
  UNION ALL
  SELECT
    'C' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 11:13:50") AS event_timestame
  UNION ALL
  SELECT
    'C' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 11:33:50") AS event_timestame),

  prev_order1 AS (
  SELECT
    *,
    LAG(order_id) OVER (PARTITION BY client_id ORDER BY event_timestamp) AS prev_order_number1
  FROM
    base),

  ### filling in order number using your requested output
  prev_order2 AS (
  SELECT
    *,
    MAX(prev_order_number1) OVER(partition by client_id ORDER BY event_timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS prev_order_number2
  FROM
    prev_order1
  ORDER BY
    event_timestamp )

### inserting order_counter logic
SELECT
  *,
  DENSE_RANK() OVER(partition by client_id ORDER BY prev_order_number2) - 1 AS order_counter
FROM
  prev_order2

Think about edge cases and perhaps if you want to partition by other dimensions such as client_id vs total table (as you have it now). I included client_id = B as an example.

enter image description here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1