'MySQL Sort One Column Select 2nd Column Values Based on some Criteria
I have a table very similar to that below. I need to sort the end_time
into ascending order and then select the start_time
that are say within plus or minus 2 seconds of the selected end_time
in the remaining rows. In essence what I am trying to do is to create an end to end chain of those end_times that match starting times end to end.
So after the first match I would get the main_id
of the start_time
row that matched and repeat for its end_time
. Using this you can see that from the table below the the following links or aggregates can be found.
1->2
3->4
5->6->7->8
10->11
Now I can readily do this programmatically in say PHP by reading in the rows, but would like to know if there is a more efficient way of doing this in SQL where some of the tables could be large.
+-------------------+------------+
|main_id|start_time | end_time |
|-------------------+------------+
| 1 1467695616 1467695676 |
| 2 1467695677 1467695683 |
| 3 1467782122 1467782182 |
| 4 1467782181 1467782238 |
| 5 1472329347 1472329374 |
| 6 1472329375 1472329553 |
| 7 1472329554 1472329733 |
| 8 1472329734 1472329764 |
| 9 1472329949 1472330078 |
|10 1472330275 1472330453 |
|11 1472330454 1472330479 |
+----+--------------+------------+
Solution 1:[1]
In the first example we assign a group to each line. If the difference between the previous end_time and the current lines start_time is within +/- 2 seconds we keep the same group number, otherwise we increment the group number. This should work in all versions of MySQL.
In the second example we use windows functions lag() over (order by ~)
, sum(~) over(order by ~)
and a CTE so it will not work in versions of MySQL lower then 8.
create table t ( main_id int, start_time int, end_time int); insert into t values (1,1467695616,1467695676), (2,1467695677,1467695683), (3,1467782122,1467782182), (4,1467782181,1467782238), (5,1472329347,1472329374), (6,1472329375,1472329553), (7,1472329554,1472329733), (8,1472329734,1472329764), (9,1472329949,1472330078), (10,1472330275,1472330453), (11,1472330454,1472330479);
set @latest = 0; set @group = 0; select @group := case when abs(start_time - @latest) <= 2 then @group else @group + 1 end "group", main_id, start_time, @latest := end_time "end_time" from t order by end_time,start_time
> group | main_id | start_time | end_time > ----- | --------| -----------| -----------| > 1 | 1 | 1467695616 | 1467695676 > 1 | 2 | 1467695677 | 1467695683 > 2 | 3 | 1467782122 | 1467782182 > 2 | 4 | 1467782181 | 1467782238 > 3 | 5 | 1472329347 | 1472329374 > 3 | 6 | 1472329375 | 1472329553 > 3 | 7 | 1472329554 | 1472329733 > 3 | 8 | 1472329734 | 1472329764 > 4 | 9 | 1472329949 | 1472330078 > 5 | 10 | 1472330275 | 1472330453 > 5 | 11 | 1472330454 | 1472330479 >
with CTE as (select case when abs(start_time - lag(end_time) over(order by end_time, start_time)) > 2 then 1 else 0 end diff_previous, main_id, start_time, end_time from t order by end_time,start_time ) select 1 + sum(diff_previous) over(order by end_time) as "group", main_id, start_time, end_time from CTE order by end_time, start_time
group | main_id | start_time | end_time ----: | ------: | ---------: | ---------: 1 | 1 | 1467695616 | 1467695676 1 | 2 | 1467695677 | 1467695683 2 | 3 | 1467782122 | 1467782182 2 | 4 | 1467782181 | 1467782238 3 | 5 | 1472329347 | 1472329374 3 | 6 | 1472329375 | 1472329553 3 | 7 | 1472329554 | 1472329733 3 | 8 | 1472329734 | 1472329764 4 | 9 | 1472329949 | 1472330078 5 | 10 | 1472330275 | 1472330453 5 | 11 | 1472330454 | 1472330479
db<>fiddle here
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Alexander |