'Function Based Index not improving query performance
I have created view and in this view i have added the below case statement which i need and for which i already create exactly the function based index. The view has 1900000 records. When i tried to execute the view it takes hours to run and the performance of this view is very low. I dont understand how can i improve the performance.
CREATE OR REPLACE VIEW
TST_AGG
(
ROOT) AS
Select
CASE
WHEN regexp_like(ticker, '\s.*\s')
THEN SUBSTR(ticker, 1, instr(ticker, ' ')-1)
WHEN regexp_like(ticker, '\s')
THEN
CASE
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')-1), '(P|C)$')
AND LENGTH(SUBSTR(ticker, 1, instr(ticker, ' ')-1)) >= 4
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-3)
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')-1), '\w\d\d\w\d$')
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-5)
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')), '\w\d\w\d$')
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-4)
ELSE SUBSTR(ticker, 1, instr(ticker, ' ')-1)
END
WHEN regexp_like(ticker, '(P|C)$')
AND LENGTH(ticker) >= 4
THEN SUBSTR(ticker, 1, LENGTH(ticker)-3)
WHEN regexp_like(ticker, '\w\d\d\w\d$')
THEN SUBSTR(ticker, 1, LENGTH(ticker)-5)
WHEN regexp_like(ticker, '\w\d\w\d$')
THEN SUBSTR(ticker, 1, LENGTH(ticker)-4)
ELSE ticker
END ) AS ROOT
FROM TTT_IMP
Below is the functional based index i have created:
CREATE INDEX "IDX_ROOT" ON "TTT_IMP" (CASE WHEN REGEXP_LIKE ("TICKER",'\s.*\s') THEN SUBSTR("TICKER",1,INSTR("TICKER",' ')-1) WHEN REGEXP_LIKE ("TICKER",'\s') THEN CASE WHEN ( REGEXP_LIKE (SUBSTR("TICKER",1,INSTR("TICKER",' ')-1),'(P|C)$') AND LENGTH(SUBSTR("TICKER",1,INSTR("TICKER",' ')-1))>=4) THEN SUBSTR(SUBSTR("TICKER",1,INSTR("TICKER",' ')-1),1,LENGTH(SUBSTR("TICKER",1,INSTR("TICKER",' ')-1))-3) WHEN REGEXP_LIKE (SUBSTR("TICKER",1,INSTR("TICKER",' ')-1),'\w\d\d\w\d$') THEN SUBSTR(SUBSTR("TICKER",1,INSTR("TICKER",' ')-1),1,LENGTH(SUBSTR("TICKER",1,INSTR("TICKER",' ')-1))-5) WHEN REGEXP_LIKE (SUBSTR("TICKER",1,INSTR("TICKER",' ')),'\w\d\w\d$') THEN SUBSTR(SUBSTR("TICKER",1,INSTR("TICKER",' ')-1),1,LENGTH(SUBSTR("TICKER",1,INSTR("TICKER",' ')-1))-4) ELSE SUBSTR("TICKER",1,INSTR("TICKER",' ')-1) END WHEN ( REGEXP_LIKE ("TICKER",'(P|C)$') AND LENGTH("TICKER")>=4) THEN SUBSTR("TICKER",1,LENGTH("TICKER")-3) WHEN REGEXP_LIKE ("TICKER",'\w\d\d\w\d$') THEN SUBSTR("TICKER",1,LENGTH("TICKER")-5) WHEN REGEXP_LIKE ("TICKER",'\w\d\w\d$') THEN SUBSTR("TICKER",1,LENGTH("TICKER")-4) ELSE "TICKER" END );
Solution 1:[1]
I would suggest to review your data model, the regex is really ugly. Store relevant information directly in column instead of somewhere hidden in a ticket
string.
Anyway, I would propose to create a virtual column instead of view. Then you can create an index on this virtual column and it should also be used. Would be similar to this:
ALTER TABLE TTT_IMP ADD (ROOT VARCHAR2(20) GENERATED ALWAYS AS (
CAST(
CASE
WHEN regexp_like(ticker, '\s.*\s')
THEN SUBSTR(ticker, 1, instr(ticker, ' ')-1)
WHEN regexp_like(ticker, '\s')
THEN
CASE
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')-1), '(P|C)$')
AND LENGTH(SUBSTR(ticker, 1, instr(ticker, ' ')-1)) >= 4
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-3)
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')-1), '\w\d\d\w\d$')
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-5)
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')), '\w\d\w\d$')
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-4)
ELSE SUBSTR(ticker, 1, instr(ticker, ' ')-1)
END
WHEN regexp_like(ticker, '(P|C)$')
AND LENGTH(ticker) >= 4
THEN SUBSTR(ticker, 1, LENGTH(ticker)-3)
WHEN regexp_like(ticker, '\w\d\d\w\d$')
THEN SUBSTR(ticker, 1, LENGTH(ticker)-5)
WHEN regexp_like(ticker, '\w\d\w\d$')
THEN SUBSTR(ticker, 1, LENGTH(ticker)-4)
ELSE ticker
END
AS VARCHAR2(20))
) VIRTUAL);
Solution 2:[2]
The index can be used in two scenarios.
1) To reduce the records selected, ie
SELECT ... FROM TST_AGG where ROOT = ...
2) To avoid querying a table, and to avoid expensive calcuations
SELECT ROOT FROM TST_AGG
I am assuming the latter here. The optimizer can only use an index (in place of a table) if it knows that the index entries are one for one with the table. Because it does not know if that expression may return a NULL (which are NOT stored in the index) it cannot make that direct swap unless you inform the optimizer.
Hence
SELECT ROOT FROM TST_AGG
will not have the chance to use the index, but
SELECT ROOT FROM TST_AGG WHERE ROOT IS NOT NULL
should be able to.
One other thing to be careful of, is that we might alter the expression syntax that we used to store the index. So check out USER_IND_EXPRESSIONS, and perhaps put that expression back into the definition of the view.
Solution 3:[3]
To add to @ConnorMcDonnald’s great answer, here’s a blurb from the docs:
12.4.2 Disadvantages of Function-Based Indexes
If the index expression is a function invocation, then the function return type cannot be constrained.
Because you cannot constrain the function return type with NOT NULL, you must ensure that the query that uses the index cannot fetch NULL values. Otherwise, the database performs a full table scan.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Wernfried Domscheit |
Solution 2 | Connor McDonald |
Solution 3 | User1974 |