'Deterministic sort order for window functions
I've a status
table and I want to fetch the latest details.
Slno | ID | Status | date
1 | 1 | Pass | 15-06-2015 11:11:00 - this is inserted first
2 | 1 | Fail | 15-06-2015 11:11:00 - this is inserted second
3 | 2 | Fail | 15-06-2015 12:11:11 - this is inserted first
4 | 2 | Pass | 15-06-2015 12:11:11 - this is inserted second
I use a window function with partition by ID order by date desc
to fetch the first value.
Excepted Output :
2 | 1 | Fail | 15-06-2015 11:11:00 - this is inserted second
4 | 2 | Pass | 15-06-2015 12:11:11 - this is inserted second
Actual Output :
1 | 1 | Pass | 15-06-2015 11:11:00 - this is inserted first
3 | 2 | Fail | 15-06-2015 12:11:11 - this is inserted first
According to [http://docs.aws.amazon.com/redshift/latest/dg/r_Examples_order_by_WF.html], adding a second ORDER BY
column to the window function may solve the problem. But I don't have any other column to differentiate the rows!
Is there another approach to solve the issue?
EDIT: I've added slno
here for clarity. I don't have slno
as such in the table!
My SQL:
with range as (
select id from status where date between 01-06-2015 and 30-06-2015
), latest as (
select status, id, row_number() OVER (PARTITION BY id ORDER BY date DESC) row_num
)
select * from latest where row_num = 1
Solution 1:[1]
If you don't have slno
in your table, then you don't have any reliable information which row was inserted first. There is no natural order in a table, the physical order of rows can change any time (with any update, or with VACUUM
, etc.)
You could use an unreliable trick: order by the internal ctid
.
select *
from (
select id, status
, row_number() OVER (PARTITION BY id
ORDER BY date, ctid) AS row_num
from status -- that's your table name??
where date >= '2015-06-01' -- assuming column is actually a date
and date < '2015-07-01'
) sub
where row_num = 1;
- In absence of any other information which row came first (which is a design error to begin with, fix it!), you might try to save what you can using the internal tuple ID
ctid
- In-order sequence generation
Rows will be in physical order when inserted initially, but that can change any time with any write operation to the table or VACUUM
or other events.
This is a measure of last resort and it will break.
Your presented query was invalid on several counts: missing column name in 1st CTE, missing table name in 2nd CTE, ...
You don't need a CTE for this.
Simpler with DISTINCT ON
(considerations for ctid
apply the same):
SELECT DISTINCT ON (id)
id, status
FROM status
WHERE date >= '2015-06-01'
AND date < '2015-07-01'
ORDER BY id, date, ctid;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |