'SQL: Pull rows based on sequence of values

I need to pull rows of data based on the existence of certain values that exist in a specific sequence.

Here's an example of the data:

Header	EventId	EventDate
67891882	382	2022-01-21 09:29:50.000
67891882	81	2022-01-21 09:03:23.000
67891882	273	2022-01-21 09:03:51.000
67891882	77	2022-01-21 09:05:58.000
67891882	2	2022-01-21 09:29:48.000

The results I need are to capture the Header and the EventDate for EventId=81. Further criteria include:

EventID 81 is the "start" and EventID 77 is the "end"
Any number of other events can exist between these two with the exception of (60, 72, 73, 74, 75, 76, 83, 85, 86, 87, 103, 154, 166, 197, 199)

So in the example above, Eventid 81 with EventDate 2022-01-21 09:03:23.000 would qualify as a row I want to pull as 273 is not in the exception list.

ATTEMPT: I have tried the following query

SELECT *
FROM #Table
WHERE EventDate BETWEEN (SELECT EventDate
                         FROM #Table
                         WHERE EventId = 81)
                    AND (SELECT eventdate
                         FROM #Table
                         WHERE EventId = 77)
    AND EventId NOT IN (60, 72, 73, 74, 75, 76, 83, 85, 86, 87, 103, 154, 166, 197, 199)
ORDER BY 3

But I was immediately confronted with the fact that my sub-queries return more than one result, so this won't work (I was using this to test a singular Header # example, which worked fine). So now I'm not quite sure how to proceed. I'd hate to think that I'd be forced to use a CURSOR, mostly because my source data is comprised of 266 million rows.

I had also previously tried using the LAG() function to find my "starting point", but that possibility seemed to dissipate once the request started becoming more and more complex (with the addition of the exclusion list as well as the fact that there could be 1 or 40 rows in between the 81 and 77).

How should I proceed with this? Here's some example data to play with. The Header can be thought of as a parent key, associated with any number of EventID (representing a specific action) and the EventDate with when this occurred:

create table #data (header int, eventid int, eventdate datetime)

insert into #data 
values
('62252595',    '22',   '5/23/2021  12:34:02 PM'),
('62252595',    '81',   '5/23/2021  12:34:03 PM'),
('62252595',    '29',   '5/23/2021  12:34:12 PM'),
('62252595',    '40',   '5/23/2021  12:34:27 PM'),
('62252595',    '22',   '5/23/2021  12:35:02 PM'),
('62252595',    '22',   '5/23/2021  12:36:12 PM'),
('62252595',    '37',   '5/23/2021  12:36:36 PM'),
('62252595',    '77',   '5/23/2021  12:37:04 PM'),
('62252595',    '6',    '5/23/2021  12:37:52 PM'),
('63252595',    '39',   '5/23/2021  12:38:01 PM'),
('63252595',    '81',   '5/23/2021  12:38:04 PM'),
('63252595',    '37',   '5/23/2021  12:38:06 PM'),
('63252595',    '21',   '5/23/2021  12:38:09 PM'),
('63252595',    '75',   '5/23/2021  12:38:10 PM'),
('63252595',    '77',   '5/23/2021  12:38:12 PM'),
('64252595',    '29',   '5/23/2021  12:38:15 PM'),
('64252595',    '26',   '5/23/2021  12:38:18 PM'),
('64252595',    '81',   '5/23/2021  12:38:20 PM'),
('64252595',    '40',   '5/23/2021  12:38:21 PM'),
('64252595',    '81',   '5/23/2021  12:38:24 PM'),
('64252595',    '83',   '5/23/2021  12:39:06 PM'),
('64252595',    '77',   '5/23/2021  12:39:07 PM'),
('65252595',    '41',   '5/23/2021  12:39:12 PM'),
('65252595',    '81',   '5/23/2021  12:39:16 PM'),
('65252595',    '37',   '5/23/2021  12:39:20 PM'),
('65252595',    '18',   '5/23/2021  12:39:56 PM'),
('65252595',    '18',   '5/23/2021  12:40:03 PM'),
('65252595',    '77',   '5/23/2021  12:40:15 PM'),
('65252595',    '36',   '5/23/2021  12:40:46 PM'),
('65252595',    '77',   '5/23/2021  12:40:53 PM')

EXPECTED RESULTS: From this #Data table, the results I would expect to see would be:

Header	EventId	EventDate
62252595	81	5/23/2021 12:34:03 PM
65252595	81	5/23/2021 12:39:16 PM

Header #'s 63252595 and 64252595 would not qualify because between the first instance of 81 and the first instance of 77 (partition by Header order by EventDate), there exists a 75 at 5/23/2021 12:38:10 PM and an 83 at 5/23/2021 12:39:06 PM respectively (both of which in exclusion list). I hope this clears up some confusion.

EDIT: After some thinking, I wonder if it would be possible to simplify this using a CASE expression. Using the example data from the #Data table above, I wrote this query:

select *
from (
    select * from (
        select *, id=case when EventId = 81 then 1 
                            when EventId = 77 then 2
                            when EventId in (60, 72, 73, 74, 75, 76, 83, 85, 86, 87, 103, 154, 166, 197, 199) then 5 else 0 end
        from #data) a
    where id <> 0)b
    order by 3

What this does is filters out all of the 'allowable' events and makes it so that I can filter to only see the unencumbered events where id=1 and then follows with a 2. What I'm not sure of as of yet is how to get it to show me only entries of id=1 with a following 2.

sql sql-server

Solution 1:^[1]

It would be handy to see the actual expected results for the sample data so I don't actually know if this is correct, it looks like you just need to calculate date ranges per each header:

with h as (
  select *, 
    Min(case when eventid=81 then eventdate end) over(partition by header) Sdate, 
    Max(case when eventid=77 then eventdate end) over(partition by header) Edate
  from #data
)
select header, eventId, EventDate
from h
where eventdate between SDate and EDate
and EventId not in(60, 72, 73, 74, 75, 76, 83, 85, 86, 87, 103, 154, 166, 197, 199)
order by eventdate;

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Stu

'SQL: Pull rows based on sequence of values

Solution 1:[1]

Sources

Related Questions

Solution 1:^[1]