'Azure Synapse IDENTITY COLUMN; Wrong Id values based on seed
The goal is to create a table, insert some (3) dummy rows for technical reasons, then for any valid data, start using Id
s above 100.
Script for creating (drop-create) the table:
IF OBJECT_ID(N'dbo.IdentityInsertTest') IS NOT NULL
DROP TABLE dbo.IdentityInsertTest;
GO
CREATE TABLE dbo.IdentityInsertTest (
Id BIGINT NOT NULL IDENTITY (100, 1) -- SEED is 100, not 1!
,Col1 INT
,Col2 INT
)
WITH (
HEAP
,DISTRIBUTION = HASH(Col1)
)
;
Insert scripts:
-- 1) 1 row insert; this one is only "needed" to make sure valid data starts with >100 Id; see explanation later
INSERT INTO dbo.IdentityInsertTest(Col1, Col2)
SELECT -1 AS Col1
,-2 AS Col2
WHERE 1 = 1
;
-- 2) Insert technical/dummy rows
SET IDENTITY_INSERT dbo.IdentityInsertTest ON;
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(1, 1, 1);
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(2, 2, 2);
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(3, 3, 3);
SET IDENTITY_INSERT dbo.IdentityInsertTest OFF;
-- 3) Inserting "valid" rows
WITH
t0(i) AS (SELECT 0 UNION ALL SELECT 0), -- 2
t1(i) AS (SELECT 0 FROM t0 a, t0 b), -- 4
t2(i) AS (SELECT 0 FROM t1 a, t1 b), -- 16
t3(i) AS (SELECT 0 FROM t2 a, t2 b), -- 256
t4(i) AS (SELECT 0 FROM t3 a, t3 b), -- 65 536
t5(i) AS (SELECT 0 FROM t4 a, t4 b), -- 4 294 967 296
tn(i) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM t5)
INSERT INTO dbo.IdentityInsertTest(Col1, Col2)
SELECT tn.i % 180 AS Col1
,tn.i AS Col2
FROM tn
WHERE i BETWEEN 1 AND 15
;
Normally, only 2) and 3) would be needed, but it seems that if I first insert rows manually then load valid data, the engine will give Id
s to those rows <100, see image below.
(1) Shows what happens when I run only 2) and 3), in this order, not running 1), no extra row (-1, -2). First I insert the 3 technical rows I need by swithcing on and off the IDENTITY_INSERT
, then load the valid data. Valid data starts with Id = 4
, but the definition of the IDENTITY column
is IDENTIY(100, 1)
.
(2) Shows that I do add an extra row (-1, -2) to the table first, then do the manual insert then loading the valid data. Valid data does get correct Id values (> 100
).
Based on further observation, it seems to me that if I insert into an empty table manual rows first (IDENTITY_INSERT ... ON
), then load additional row with IDENTITY_INSERT ... OFF
, the system won't take the SEED
value into account, so in order to make this to work, I need to:
1 - Insert a dummy, not needed row into the empty table **first** (which I'll delete later)
2 - Insert the technical manual rows I need
3 - Delete the extra row I added before (1st step)
4 - Now it's safe to add more rows to the table (with `IDENTITY_INSERT ... OFF`).
I tried to google about this unexpected behavior, but couldn't find anything. Is this a bug?
Solution 1:[1]
This is one of the limitations of Azure Synapse. You can use the following code to generate identity column explicitly:
Declare @IdentityColumnName BIGINT
set @IdentityColumnName = (SELECT case when MAX(IdentityColumnName) is null then 0 else MAX(IdentityColumnName)+1 end FROM dw.TableName)
DBCC CHECKIDENT('dw.TableName', reseed, @IdentityColumnName)
Solution 2:[2]
Unless I'm not understanding correctly, I think you should be able to run your identity inserts, then reseed the table to 100:
IF OBJECT_ID(N'dbo.IdentityInsertTest') IS NOT NULL
DROP TABLE dbo.IdentityInsertTest;
GO
CREATE TABLE dbo.IdentityInsertTest (
Id BIGINT NOT NULL IDENTITY (100, 1) -- SEED is 100, not 1!
,Col1 INT
,Col2 INT
)
WITH (
HEAP
,DISTRIBUTION = HASH(Col1)
)
;
SET IDENTITY_INSERT dbo.IdentityInsertTest ON;
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(1, 1, 1);
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(2, 2, 2);
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(3, 3, 3);
SET IDENTITY_INSERT dbo.IdentityInsertTest OFF;
DBCC CHECKIDENT('dbo.IdentityInsertTest', reseed, 100) ;
WITH
t0(i) AS (SELECT 0 UNION ALL SELECT 0), -- 2
t1(i) AS (SELECT 0 FROM t0 a, t0 b), -- 4
t2(i) AS (SELECT 0 FROM t1 a, t1 b), -- 16
t3(i) AS (SELECT 0 FROM t2 a, t2 b), -- 256
t4(i) AS (SELECT 0 FROM t3 a, t3 b), -- 65 536
t5(i) AS (SELECT 0 FROM t4 a, t4 b), -- 4 294 967 296
tn(i) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM t5)
INSERT INTO dbo.IdentityInsertTest(Col1, Col2)
SELECT tn.i % 180 AS Col1
,tn.i AS Col2
FROM tn
WHERE i BETWEEN 1 AND 15
;
SELECT * FROM dbo.IdentityInsertTest
ORDER BY id
Interestingly, the reseed doesn't work properly if the identity_insert is still on. So this doesn't work:
SET IDENTITY_INSERT dbo.IdentityInsertTest ON;
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(1, 1, 1);
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(2, 2, 2);
INSERT INTO dbo.IdentityInsertTest(Id, Col1, Col2) VALUES(3, 3, 3);
DBCC CHECKIDENT('dbo.IdentityInsertTest', reseed, 100) ;
SET IDENTITY_INSERT dbo.IdentityInsertTest OFF;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | AnnuKumari-MSFT |
Solution 2 |