'MySQL Sort One Column Select 2nd Column Values Based on some Criteria

I have a table very similar to that below. I need to sort the end_time into ascending order and then select the start_time that are say within plus or minus 2 seconds of the selected end_time in the remaining rows. In essence what I am trying to do is to create an end to end chain of those end_times that match starting times end to end.

So after the first match I would get the main_id of the start_time row that matched and repeat for its end_time. Using this you can see that from the table below the the following links or aggregates can be found.

1->2
3->4
5->6->7->8
10->11

Now I can readily do this programmatically in say PHP by reading in the rows, but would like to know if there is a more efficient way of doing this in SQL where some of the tables could be large.

+-------------------+------------+
|main_id|start_time |   end_time |
|-------------------+------------+
| 1      1467695616   1467695676 |
| 2      1467695677   1467695683 |
| 3      1467782122   1467782182 |
| 4      1467782181   1467782238 |
| 5      1472329347   1472329374 |
| 6      1472329375   1472329553 |
| 7      1472329554   1472329733 |
| 8      1472329734   1472329764 |
| 9      1472329949   1472330078 |
|10      1472330275   1472330453 |
|11      1472330454   1472330479 |
+----+--------------+------------+


Solution 1:[1]

In the first example we assign a group to each line. If the difference between the previous end_time and the current lines start_time is within +/- 2 seconds we keep the same group number, otherwise we increment the group number. This should work in all versions of MySQL.
In the second example we use windows functions lag() over (order by ~), sum(~) over(order by ~) and a CTE so it will not work in versions of MySQL lower then 8.

create table t (
main_id int,
start_time int,
end_time int);
insert into t values
(1,1467695616,1467695676),
(2,1467695677,1467695683),
(3,1467782122,1467782182),
(4,1467782181,1467782238),
(5,1472329347,1472329374),
(6,1472329375,1472329553),
(7,1472329554,1472329733),
(8,1472329734,1472329764),
(9,1472329949,1472330078),
(10,1472330275,1472330453),
(11,1472330454,1472330479);
set @latest  = 0;
set @group = 0;
select
@group  := case when abs(start_time - @latest) <= 2
                   then @group
                   else @group + 1 
                   end  "group",
main_id, 
start_time,
@latest := end_time "end_time"
from t 
order by end_time,start_time
> group | main_id | start_time |   end_time
> ----- | --------| -----------| -----------|
>     1 |       1 | 1467695616 | 1467695676
>     1 |       2 | 1467695677 | 1467695683
>     2 |       3 | 1467782122 | 1467782182
>     2 |       4 | 1467782181 | 1467782238
>     3 |       5 | 1472329347 | 1472329374
>     3 |       6 | 1472329375 | 1472329553
>     3 |       7 | 1472329554 | 1472329733
>     3 |       8 | 1472329734 | 1472329764
>     4 |       9 | 1472329949 | 1472330078
>     5 |      10 | 1472330275 | 1472330453
>     5 |      11 | 1472330454 | 1472330479
> 
with CTE as
(select
case when abs(start_time - lag(end_time) over(order by end_time, start_time)) > 2 then 1 else 0 end  diff_previous,
main_id, 
start_time,
end_time
from t 
order by end_time,start_time
)
select
1 + sum(diff_previous) over(order by end_time) as "group",
main_id, 
start_time,
end_time
from CTE
order by end_time, start_time
group | main_id | start_time |   end_time
----: | ------: | ---------: | ---------:
    1 |       1 | 1467695616 | 1467695676
    1 |       2 | 1467695677 | 1467695683
    2 |       3 | 1467782122 | 1467782182
    2 |       4 | 1467782181 | 1467782238
    3 |       5 | 1472329347 | 1472329374
    3 |       6 | 1472329375 | 1472329553
    3 |       7 | 1472329554 | 1472329733
    3 |       8 | 1472329734 | 1472329764
    4 |       9 | 1472329949 | 1472330078
    5 |      10 | 1472330275 | 1472330453
    5 |      11 | 1472330454 | 1472330479

db<>fiddle here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Alexander