Using 'on conflict' with a unique constraint on a table partitioned by date
Given the following table:
CREATE TABLE event_partitioned (
customer_id varchar(50) NOT NULL,
user_id varchar(50) NOT NULL,
event_id varchar(50) NOT NULL,
comment varchar(50) NOT NULL,
event_timestamp timestamp with time zone DEFAULT NOW()
)
PARTITION BY RANGE (event_timestamp);
And partitioning by calendar week [one example]:
CREATE TABLE event_partitioned_2020_51 PARTITION OF event_partitioned
FOR VALUES FROM ('2020-12-14') TO ('2020-12-21'); -- upper bound is exclusive, so this covers Dec 14 through Dec 20
And the unique constraint [event_timestamp is required because it is the partition key]:
ALTER TABLE event_partitioned
ADD UNIQUE (customer_id, user_id, event_id, event_timestamp);
I would like to update the row if the combination of customer_id, user_id, event_id already exists, and insert otherwise:
INSERT INTO event_partitioned (customer_id, user_id, event_id, comment)
VALUES ('9', '99', '999', 'I got inserted')
ON CONFLICT (customer_id, user_id, event_id, event_timestamp) DO UPDATE
SET comment = 'I got updated';
But I cannot add a unique constraint on customer_id, user_id, event_id alone; it has to include event_timestamp as well, because that is the partition key.
So this will insert duplicates of customer_id, user_id, event_id. The same happens if I pass now() as a fourth value, unless now() exactly matches what is already stored in event_timestamp.
Is there a way that ON CONFLICT could be less 'granular' here and update if now() falls in the week of the partition, rather than precisely on '2020-12-14 09:13:04.543256' for example?
Basically I am trying to avoid duplication of customer_id, user_id, event_id, at least within a week, but still benefit from partitioning by week (so that data retrieval can be narrowed to a date range and not scan the entire partitioned table).
Solution 1:[1]
I don't think you can do this with on conflict in a partitioned table. You can, however, express the logic with CTEs:
with
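data as ( -- data (comment is NOT NULL in the table, so the insert has to supply one)
select '9' as customer_id, '99' as user_id, '999' as event_id, 'I got inserted' as comment
),
ins as ( -- insert if not exists
insert into event_partitioned (customer_id, user_id, event_id, comment)
select * from data d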
where not exists (
select 1
from event_partitioned ep
where
ep.customer_id = d.customer_id
and ep.user_id = d.user_id
and ep.event_id = d.event_id
)
returning *
)
update event_partitioned ep -- update if insert did not happen
set comment = 'I got updated'
from data d
where
ep.customer_id = d.customer_id
and ep.user_id = d.user_id
and ep.event_id = d.event_id
and not exists (select 1 from ins)
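Since the question only asks for uniqueness within a week, the same pattern can be narrowed to the current week. What follows is a sketch under assumptions, not a tested solution: it assumes weeks start on Monday (which is what date_trunc('week', ...) returns), that a partition covering now() already exists, and that the session time zone matches the one used when the partition bounds were defined. The comment value in the insert is a placeholder, needed only because the column is declared NOT NULL. Putting the range condition directly on event_timestamp should give the planner a chance to prune to the matching partition instead of scanning all of them.

```sql
-- Sketch only: insert-else-update, restricted to the current calendar week.
with
data as (
    select '9' as customer_id, '99' as user_id, '999' as event_id,
           'first insert' as comment  -- placeholder value; comment is NOT NULL
),
ins as ( -- insert only if no matching row exists in the current week
    insert into event_partitioned (customer_id, user_id, event_id, comment)
    select d.customer_id, d.user_id, d.event_id, d.comment
    from data d
    where not exists (
        select 1
        from event_partitioned ep
        where ep.customer_id = d.customer_id
          and ep.user_id = d.user_id
          and ep.event_id = d.event_id
          and ep.event_timestamp >= date_trunc('week', now())
          and ep.event_timestamp <  date_trunc('week', now()) + interval '1 week'
    )
    returning 1
)
update event_partitioned ep -- update only if the insert did not happen
set comment = 'I got updated'
from data d
where ep.customer_id = d.customer_id
  and ep.user_id = d.user_id
  and ep.event_id = d.event_id
  and ep.event_timestamp >= date_trunc('week', now())
  and ep.event_timestamp <  date_trunc('week', now()) + interval '1 week'
  and not exists (select 1 from ins);
```

Keep in mind that no unique constraint backs this up, so two sessions running the statement at the same moment can both pass the not exists check and both insert.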
Solution 2:[2]
@GMB's answer is great and works well. Since enforcing a unique constraint on the parent of a table partitioned by time range is usually not that useful, why not just place a unique constraint/index on the partition itself?
In your case, event_partitioned_2020_51 can have a unique constraint:
ALTER TABLE event_partitioned_2020_51
ADD UNIQUE (customer_id, user_id, event_id, event_timestamp);
Subsequent queries can then just use
INSERT INTO event_partitioned_2020_51 ... ON CONFLICT (customer_id, user_id, event_id, event_timestamp) ...
as long as this is the intended partition, which is usually the case.
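A variation on this idea, again only a sketch: as far as I know, the rule that a unique constraint must include the partition key applies to the partitioned parent table, not to a leaf partition. That means the partition can carry a constraint on customer_id, user_id, event_id alone, which is the "unique within a week" behaviour the question asks for. The constraint name, the comment value and the timestamp below are made up for illustration. The timestamp is given explicitly because a row inserted directly into a partition must still fall inside that partition's range.

```sql
-- Sketch only: enforce uniqueness of the three business columns within one week
-- by constraining the leaf partition, which does not need the partition key.
ALTER TABLE event_partitioned_2020_51
    ADD CONSTRAINT event_partitioned_2020_51_event_key  -- hypothetical name
    UNIQUE (customer_id, user_id, event_id);

-- Upsert directly into the partition. The parent has no unique index on just
-- these three columns, so this ON CONFLICT target only works at this level.
INSERT INTO event_partitioned_2020_51
    (customer_id, user_id, event_id, comment, event_timestamp)
VALUES
    ('9', '99', '999', 'first insert', '2020-12-15 09:13:04+00')  -- illustrative values
ON CONFLICT (customer_id, user_id, event_id) DO UPDATE
SET comment = 'I got updated';
```

The trade-off is that the application has to pick the right partition name for each insert, and every new weekly partition needs the same constraint added when it is created.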
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | GMB |
Solution 2 | 1qnew |