Custom ETL performance issue
We have serious performance issue with a Telco DW application. I have reached my wits end with my analysis; hoping to get some bright ideas from this forum -
1. ETL process starts with a nightly call record file that coule range from 20-25Million records
2. SQL Load direct path into a staging table (I would have prefered an external table, but this is what I have). No issues with the load as such
3. No ETL/ELT tool in use. A .Net application processes the staging tables in multiple streams, does some 'minor' transformations, and inserts into a set of DW tables
4. The DW tables, 7 of them, is estimated to have 10+ Billion records
5. As part of 3, the nightly load is broken down in parallel streams and loaded, pretty sure its using bulk collections, into the DW tables. No PL/SQL procedural objects; all processing happens in the .Net code (3rd party application that is database agnostic). The number of parallel streams is currently scoped for around 10 in parallel. All inserts are bulk inserts and no Insert into table select from...
6. The DW tables are heavily partitioned
* Interval partition by month on the date
* subpartition by hash on the cust_no, 1000 subpartitions with tablespace template used in a round-robin format
* Global partitioned hash indexes, some to enforce uniqueness
* Local indexes
* Global indexes are on ID columns that use a sequence (.Net sequence generator)
7. Single-instance application at the moment, Production will be 4-8 node RAC running version 11gR2.
8. 32Gig RAM, 16g SGA, 4 core/8CPU server with 4 hard-disks in a 0+1 config. No Solid State disks at play. This is a development/test server.
9. The obvious thought to load the staging table and exchange partition has been thought over. May be the only solution here.
10. Since the insert frequency is very high, ASSM is not being used. The tables and indexes have freelists set to 16 and initrans to 10
Observations:
1. For one of the loads, with half the load, the AWR report shows
Elapsed: 120.53 (mins)
DB Time: 1,323.98 (mins)
Top 5 Timed Foreground Events
Event Waits Time(s) Avg wait (ms) % DB time Wait Class
db file sequential read 1,434,939 46,381 32 58.39 User I/O
log file sync 601,242 10,632 18 13.38 Commit
DB CPU 9,833 12.38
enq: TX - index contention 307,691 8,162 27 10.27 Concurrency
read by other session 98,243 3,593 37 4.52 User I/O
2. I am seeing lots of enqueue row lock/index contention, buffer busy waits. I guess this is because of the parallel jobs inserting into the same table and the B-tree index leaf split operation. This is confirmed by the 90/10 index leaf block splits in the sysstat
3. dba_hist_system_event for the duration of the run
"EVENT_NAME" "WAIT_CLASS" "SUM(TOTAL_WAITS)" "SUM(TIME_WAITED_MICRO)" "SUM(H.TOTAL_WAITS_FG)" "SUM(H.TIME_WAITED_MICRO_FG)"
"enq: TX - contention" "Other" "58" "33240" "58" "33240"
"enq: TX - index contention" "Concurrency" "982170" "24765321306" "982170" "24765321306"
"enq: TX - row lock contention" "Application" "112561" "43338527296" "112561" "43338527296"
"enq: TX - allocate ITL entry" "Configuration" "4053" "88731305" "4053" "88731305"
4. dba_hist_enqueue_stat for the duration of the run
"EQ_TYPE" "REQ_REASON" "SUM(TOTAL_REQ#)" "SUM(TOTAL_WAIT#)" "SUM(CUM_WAIT_TIME)"
"TM" "contention" "74309120" "34" "6221500"
"TX" "contention" "13070383" "58" "20"
"TX" "index contention" "1109465" "982249" "24772400"
"TX" "allocate ITL entry" "10898" "4053" "88860"
"TX" "row lock contention" "112688" "112564" "45070690"
5. Top SQL with top events are all the insert statement with db file sequential read, enq: TX - index contention and read by other session
Questions: My concern is the parallel threads and the contention caused by the b-tree leaf block splitting. Reverse Key indexes might not help as the 90/10 splits might change to 50/50 block splits.
any ideas how I nail this one?