Hi, I have a query only join two tables, one table has about 10 million rows, the other has 4 thousand rows, the query took about 200 seconds to return 700 thousand rows
This is the query:
Code:
SELECT count(1)
FROM table1 INNER JOIN table2 ON ( table1.col1 = table2.cola
AND
table1.col2 = table2.colb)
WHERE table1.col3<=SYSDATE
AND (
(
table1.col4='a'
AND
table1.col5<100
)
OR
(
table1.col4='b'
AND
(
table1.col5=10
OR
table1.col5=20
)
)
)
AND table1.col6<>'abc'
AND table1.col6<>'def'
AND table1.col6<>'ghi'
AND table2.colc='xy'
AND table2.cold='z'
This is the output of ultxpls.sql
Code:
SELECT STATEMENT, GOAL = FIRST_ROWS Cost=87796 Cardinality=1 Bytes=39
SORT AGGREGATE Cardinality=1 Bytes=39
HASH JOIN Cost=87796 Cardinality=34711 Bytes=1353729
TABLE ACCESS FULL Object owner=joe Object name=table2 Cost=25 Cardinality=91 Bytes=1274
TABLE ACCESS FULL Object owner=joe Object name=table1 Cost=87756 Cardinality=1461194 Bytes=36529850
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("table1"."col1"="table2"."colb")
3 - filter("table2"."colc"='xy' AND "table2"."cold"='z'
AND "table2"."colb"='001')
4 - filter("table1"."col3"<=SYSDATE@! AND
("table1"."col4"='a' AND "table1"."col5"<100 OR
"table1"."col4"='b' AND ("table1"."col5"=10 OR
"table1"."col5"=20)) AND "table1"."col6"<>'abc' AND
"table1"."col6"<>'def' AND "table1"."col6"<>'ghi'
AND "table1"."WHSE"='001')
I want to know why it is doing full table scan, I have index on:
Because the optimizer thinks "FULL TABLE SCAN" cost is cheaper than using indexes.
See below:
TABLE ACCESS FULL Object owner=joe Object name=table1 Cost=87756 Cardinality=1461194 Bytes=36529850
First run a query on the big table with out joining the small table but with all other predicates on table1.
And see the statistics matches with your output. If they are not, then create new indexes on high cardinality columns. And check one more time.
how many rows do are returned if execute
SELECT count(*)
FROM table1
WHERE table1.col3<=SYSDATE
AND (
(
table1.col4='a'
AND
table1.col5<100
)
OR
(
table1.col4='b'
AND
(
table1.col5=10
OR
table1.col5=20
)
)
)
AND table1.col6<>'abc'
AND table1.col6<>'def'
AND table1.col6<>'ghi'
How many rows are retuned if you execute
SELECT count(1)
FROM table2 WHERE table2.colc='xy'
AND table2.cold='z'
How long does the execution of these 2 queries take?
select count(*) from table1
--9746801
select count(*) from table2
--3845
SELECT count(*)
FROM table1
WHERE table1.col3<=SYSDATE
AND (
(
table1.col4='a'
AND
table1.col5<100
)
OR
(
table1.col4='b'
AND
(
table1.col5=10
OR
table1.col5=20
)
)
)
AND table1.col6<>'abc'
AND table1.col6<>'def'
AND table1.col6<>'ghi'
--857747 in 138.389 seconds
SELECT count(1)
FROM table2 WHERE table2.colc='xy'
AND table2.cold='z'
--77 in 0.441 seconds
Since there is a index on table1 (col1, col2), and
Code:
select count(1) from table1;
--9746801
select COUNT(distinct col1)
from table1
--390
select COUNT(distinct col2)
from table1
--1
select COUNT(distinct col3)
from table1
--857
select COUNT(1)
from table1
where col3<=sysdate;
--971876
Do you know it will help if I modify the current index to
table1 (col3, col1, col2) or add a new index just on col3?
With your conditions set against TABLE1 you will get about 9% (857747/9746801) of te rows returned. This means that an index would probably not be of any help.
Bookmarks