'SQL rank counting occurences

In my table I have data from employee contracts like this: enter image description here

And I want to get job position in time periods, something like this

login ValidFrom ValidTo JobPosition
bcde 2019-07-01 2019-09-30 Project Manager
bcde 2020-01-09 2020-06-16 Head of Center of Excellence
bcde 2020-06-17 2021-07-31 Team Leader
bcde 2021-08-01 2099-12-31 Head of Center of Excellence

So I write query:

select DimEmployeeId, JobPosition, login, min(ValidFrom), max(ValidTo)
from employeeContracts
group by DimEmployeeId, JobPosition, login

But It doesn't work in this case (if someone has the same JobPosition), so I decided to use dense_rank, like this:

select login, ValidFrom, ValidTo, JobPosition
,dense_rank() OVER (Partition BY JobPosition ORDER BY login, ValidFrom, ValidTo, JobPosition) as no
from employeeContracts

and after that

select DimEmployeeId, JobPosition, login, min(ValidFrom), max(ValidTo)
from employeeContracts
group by DimEmployeeId, JobPosition, no, login

but problem is dense_rank doesn't work as I need ;) I want to get something like this:

login ValidFrom ValidTo JobPosition no
bcde 2019-07-01 2019-09-30 Project Manager 1
bcde 2020-01-09 2020-06-16 Head of Center of Excellence 2
bcde 2020-06-17 2020-07-31 Team Leader 3
bcde 2020-08-01 2021-03-31 Team Leader 3
bcde 2021-04-01 2021-06-30 Team Leader 3
bcde 2021-07-01 2021-07-31 Team Leader 3
bcde 2021-08-01 2021-12-31 Head of Center of Excellence 4
bcde 2022-01-01 2022-05-09 Head of Center of Excellence 4
bcde 2022-02-01 2022-05-09 Head of Center of Excellence 4
bcde 2022-05-09 2099-12-31 Head of Center of Excellence 4

and after that use query to get final result:

select DimEmployeeId, JobPosition, login, min(ValidFrom), max(ValidTo)
from employeeContracts
group by DimEmployeeId, JobPosition, no, login

Example Data Scheme

CREATE TABLE employeeContracts (
  login text,
  ValidFrom datetime,
  ValidTo datetime,
  JobPosition text
);

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2019-07-01', '2019-09-30', 'Project Manager');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2020-01-09', '2020-06-16', 'Head of Center of Excellence');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2020-06-17', '2020-07-31', 'Team Leader');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2020-08-01', '2021-03-31', '   Team Leader');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2021-04-01', '2021-06-30', '   Team Leader');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2021-07-01', '2021-07-31', '   Team Leader');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2021-08-01', '2021-12-31', '   Head of Center of Excellence');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2022-01-01', ' 2022-05-09', 'Head of Center of Excellence');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2022-02-01', '2022-05-09', 'Head of Center of Excellence');

INSERT INTO employeeContracts (login, ValidFrom, ValidTo, JobPosition) 
VALUES ('bcde', '2022-05-09', '2099-12-31', 'Head of Center of Excellence');

This can be tested Here



Solution 1:[1]

This sounds like a "Gaps and Islands" problem. One way of approaching it is to identify and assign rankings to each "island" (or date ranges for each JobPosition).

login ValidFrom ValidTo JobPosition PrevPosition JobPositionGroup
bcde 2019-07-01 00:00:00 2019-09-30 00:00:00 Project Manager null 1
bcde 2020-01-09 00:00:00 2020-06-16 00:00:00 Head of Center of Excellence Project Manager 2
bcde 2020-06-17 00:00:00 2020-07-31 00:00:00 Team Leader Head of Center of Excellence 3
bcde 2020-08-01 00:00:00 2021-03-31 00:00:00 Team Leader Team Leader 3
bcde 2021-04-01 00:00:00 2021-06-30 00:00:00 Team Leader Team Leader 3
bcde 2021-07-01 00:00:00 2021-07-31 00:00:00 Team Leader Team Leader 3
bcde 2021-08-01 00:00:00 2021-12-31 00:00:00 Head of Center of Excellence Team Leader 4
bcde 2022-01-01 00:00:00 2022-05-09 00:00:00 Head of Center of Excellence Head of Center of Excellence 4
bcde 2022-02-01 00:00:00 2022-05-09 00:00:00 Head of Center of Excellence Head of Center of Excellence 4
bcde 2022-05-09 00:00:00 2099-12-31 00:00:00 Head of Center of Excellence Head of Center of Excellence 4

Then apply your min/max logic, grouping by the ranking number:

 SELECT grp.JobPositionGroup
        , grp.JobPosition
        , grp.Login
        , MIN(grp.ValidFrom) AS ValidFrom
        , MAX(grp.ValidTo) AS ValidTo
 FROM (
         SELECT  cron.*
                 , SUM( IF( JobPosition = PrevPosition, 0, 1) ) OVER(
                            ORDER BY ValidFrom, ValidTo
                 )  AS JobPositionGroup
         FROM   (
                    SELECT ec.*
                          , LAG(JobPosition, 1) OVER (
                               ORDER BY ValidFrom, ValidTo
                          ) AS PrevPosition
                    FROM   employeeContracts ec
                 )
                 cron
         ) grp
 GROUP BY grp.JobPositionGroup
          , grp.JobPosition
          , grp.Login

Results:

JobPositionGroup JobPosition Login ValidFrom ValidTo
1 Project Manager bcde 2019-07-01 00:00:00 2019-09-30 00:00:00
2 Head of Center of Excellence bcde 2020-01-09 00:00:00 2020-06-16 00:00:00
3 Team Leader bcde 2020-06-17 00:00:00 2021-07-31 00:00:00
4 Head of Center of Excellence bcde 2021-08-01 00:00:00 2099-12-31 00:00:00

db<>fiddle here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 SOS