'How Can I Retrieve CLOBs for Unique Records in Group By
I have a table with one CLOB field (s.File) and I need to create a join on this table to get related data. However, in Oracle, CLOBs are not allowed in group by queries. What is the alternative? I have the following query for which I need to retrieve and serialize the CLOB, but I need the related information as well. See below:
select s.FileName, s.File, c.Name as CompanyName
from Source s inner join Company c on s.id = c.id
group by s.FileName, s.File, c.Name
I tried rewriting the query using row_number and partition by, but couldn't get that solution to work either. Any ideas would be greatly appreciated. The expected results would look something like
- List item
:
FileName File CompanyName
interestincome.xlsx CLOB Woodson Company, Inc.
projectdetails.pdf CLOB Johnson Products, LLC
expenses2021.xlsx CLOB Marshall Consultants, LLP
Solution 1:[1]
If you want to get the unique CompanyName
/Filename
combinations and get a single FILE
(rather than including the files in the GROUP BY
clause) then:
SELECT filename,
(SELECT "FILE" FROM Source f WHERE f.ROWID = min_rowid) AS "FILE",
companyname
FROM (
select s.FileName,
c.Name as CompanyName,
MIN(s.ROWID) AS min_rowid
from Source s
inner join Company c
on s.id = c.id
group by
s.FileName, c.Name
)
If you want to check that the files are identical then you can use:
SELECT filename,
(SELECT "FILE" FROM Source f WHERE f.ROWID = min_rowid) AS "FILE",
companyname
FROM (
select s.FileName,
c.Name as CompanyName,
MIN(s.ROWID) AS min_rowid
from Source s
inner join Company c
on s.id = c.id
group by
s.FileName,
c.Name,
DBMS_CRYPTO.HASH(s."FILE", DBMS_CRYPTO.HASHSH256)
)
If you want to also consider the unlikely event of hash collisions then you can also compare the start of the files (since any hash collisions are likely to be between files that are not alike):
SELECT filename,
(SELECT "FILE" FROM Source f WHERE f.ROWID = min_rowid) AS "FILE",
companyname
FROM (
select s.FileName,
c.Name as CompanyName,
MIN(s.ROWID) AS min_rowid
from Source s
inner join Company c
on s.id = c.id
group by
s.FileName,
c.Name,
DBMS_LOB.SUBSTR(s."FILE", 1, 4000),
DBMS_CRYPTO.HASH(s."FILE", DBMS_CRYPTO.HASHSH256)
)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | MT0 |