Table Partition Design

Printable View

12-10-2010, 03:58 PM
Test2day

Table Partition Design

We have two groups of users Group A and Group B.

Group A of users queries data by insert_date = sysdate. Table A partitioned by INSERT_DATE.
Group B has queries that use REPORT_DATE = sysdate -1. TABLE B partitioned by REPORT_DATE it contains all the columns of table A except that is partitioned by REPORT_DATE. Basically we have same data stored twice once partitioned by insert_date and report_date. Our users have hundreds of queries that run against either table A or Table B.

Note that the tables are partitioned by DAY each day gets 1 million records.
Data in each table goes back to 4 years to 2006. so those tables contain huge amounts of data.

Group A goes after TABLE A

INSERT_DATE date
DWBUID number
CURRENT_BALANCE number
FLAG_ACTIVE char(1)
....More Columns up to (150 columns).......................
combination of INSERT_DATE and DWBUID makes the data unique

Group B uses TABLE B

REPORT_DATE date
DWBUID number
CURRENT_BALANCE number
FLAG_ACTIVE char(1)
.....More Columns up to (150 columns).......
combination of REPORT_DATE and DWBUID makes the data unique

We would like to be able to build ONE partitioned table which allows users to write queries to retrieve data by either insert_date or report_date while getting same type of performance as they would have gotten if they query either Table A or Table B.

We were thinking that table C should be the new table that contains both Insert_date and Report_date along with DWBUID.
The question is what kind of partitioning should we use to achieve our goal where users do not get performance degradation.

both users groups should be able to use TABLE C

INSERT_DATE date
REPORT_DATE date
DWBUID number
CURRENT_BALANCE number
FLAG_ACTIVE char(1)
....More Columns up to (150 columns).......................
combination of INSERT_DATE REPORT_DATE and DWBUID makes the data unique

Not sure we can use a composite partition by (INSERT_DATE, REPORT_DATE)
and hash subpartition by DWBUID.

Any suggestions will be appreciated.
12-11-2010, 09:17 AM
PAVB

Build just one partitioned table.

If business specs are asking to either archive or purge data, a good idea would be to have a partitioning strategy that helps during such archive and/or purge process.

If business specs are asking to keep data forever then partitioning strategy should help your most critical queries.

Revise indexing strategy - queries shouldn't rely only in partitioning strategy but in indexing strategy.

Why are you mentioning hash partitioning? are yo seeing contention during insert?
12-13-2010, 10:16 AM
Test2day

We tried creating a table partitioned by composite range
PARTITION BY RANGE(insert_date, report_date, dwbuid)

The thinking was that by having both dates part of range partition will allow users to have queries that hit either insert_date or report_date to get same performance.
it turns out that queries that reference insert_date are much faster than the ones
that hit report_date.

Queries that hit report_date do a full index scan. (slow) 2 min
queries that hit insert_date hit the correct partition ranges. (fast) 10 sec
Need a way to have both dates have the same weight regardless of the order of which they are enumerated in the partitioning statement.
12-13-2010, 11:12 AM
PAVB

Quote:

Originally Posted by Test2day

We tried creating a table partitioned by composite range
PARTITION BY RANGE(insert_date, report_date, dwbuid)

The thinking was that by having both dates part of range partition will allow users to have queries that hit either insert_date or report_date to get same performance.
it turns out that queries that reference insert_date are much faster than the ones that hit report_date.

Queries that hit report_date do a full index scan. (slow) 2 min
queries that hit insert_date hit the correct partition ranges. (fast) 10 sec
Need a way to have both dates have the same weight regardless of the order of which they are enumerated in the partitioning statement.

Partitioning strategy is wrong.
Partition by range on (insert_date) then create indexes on insert_date, report_date and dwbuid.
Is this a FACT table?
do you have a DIMension describing dates? In the affirmative case, is your DIM_DATE table pointing to a surrogate date on FACT?
do you have a DIMension describing dwbuid?
If you got four"YES" that means star-schema is properly designed then consider defining insert_date, report_date and dwbuid as bitmap indexes; user queries should mention DIM tables.