-
Optimizing for unpredictable selectivity
I have a query which uses a name column for search criteria.
It performs nicely if the name is obscure. If the name is common, selectivity suffers.
What is the solution for this problem?
-
You'll need to provide more details. I find it hard to believe that a basic query against a table containing names suffers significantly based on the name. How many rows are in the table? How many rows are in the worst-case result?
I'm more inclined to think that you are joining this table with something else, and the 'something else' scales poorly based on the number of names matched. You'll have to post the query and the explain plan output for help with that.
-
You are quite right, it is a question of scale and, yes, I am joining this table with several other tables.
The core driving table (that has the columns for which I have criteria) is a very large table. If I use 'SMITH' in my query, it takes f.o.r.e.v.e.r (100+ seconds), if I use 'FUZ', it returns in 3 seconds.
I apologize, but I can't post the query. Don't you think the bottom line is the issue of selectivity?
Would Oracle Text help here?
-
I'm still inclined to say 'no' -- how does a query against the name table only behave, without the join?
-
 Originally Posted by KenEwald
You are quite right, it is a question of scale and, yes, I am joining this table with several other tables.
The core driving table (that has the columns for which I have criteria) is a very large table. If I use 'SMITH' in my query, it takes f.o.r.e.v.e.r (100+ seconds), if I use 'FUZ', it returns in 3 seconds.
I apologize, but I can't post the query. Don't you think the bottom line is the issue of selectivity?
Would Oracle Text help here?
Does the explain plan change between the SMITH and the FUZ examples? Feel free to post them
-
-
Interesting read. It sounds like a historgram on the name column is indicated here, IF 'SMITH' takes up more than 1.3% of the table. I'd still like to see the distribution of values for that table.
-
 Originally Posted by KenEwald
You are quite right, it is a question of scale and, yes, I am joining this table with several other tables.
The core driving table (that has the columns for which I have criteria) is a very large table. If I use 'SMITH' in my query, it takes f.o.r.e.v.e.r (100+ seconds), if I use 'FUZ', it returns in 3 seconds.
I apologize, but I can't post the query. Don't you think the bottom line is the issue of selectivity?
Would Oracle Text help here?
The bottom line is cardinality that oracle expects to reurn not the slectivity.
Do you use bind variable for the name column?
If yes, peeking the bind variable may also affect the performance.
For example, first time the variable gets 'FUZ" value, and the sql is parsed and optimized the execution (say NL join), the next time if it gets "SMITH' value, then Oracle will still do NL join, may be in this case FULL TABLESCAN and HASH JOIN may outperform NL join.
Atleast you can post the plan after the changing the name of the tables, if you want to hide the SQL.
Tamil
-
I would recommand you, in this case, not to use bind variables for the name and force Oracle to use dynamic sampling.
It will probably slowdown your "fast" queries a bit but because they already take 3 seconds I don't think that you will notice it to much.
-
 Originally Posted by KenEwald
You are quite right, it is a question of scale and, yes, I am joining this table with several other tables.
The core driving table (that has the columns for which I have criteria) is a very large table. If I use 'SMITH' in my query, it takes f.o.r.e.v.e.r (100+ seconds), if I use 'FUZ', it returns in 3 seconds.
I apologize, but I can't post the query. Don't you think the bottom line is the issue of selectivity?
Would Oracle Text help here?
Are these "fuzzy" matches, or are you searching for the entire name with an equality operator? If fuzzy, are you using any functions such as soundex to run with this? I have found that (using soundex) searching for "SMIT%" always takes considerably longer than searching for "FUZ%" (or whatever), even with histograms.
Oracle Text may or may not help if you are fuzzy matching. It depends of how often DML is performed on columns that have Text Indexes (different to "normal indexes"), and whether the overhead of maintaining the indexes is worth your while (EXEC CTX_DDL.SYNC_INDEX('idx')). To be honest, i'm not sure if using it for a column such as surname is the right way to go anyway (more designed for columns with 100s or 1000s of character values).
Assistance is Futile...
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|