-
Need help on Performance Tuning
Hi,
I have a really bad performance on the database as more users got into the system (> 50 users) or after the bulk load or bulk import. I'm not a pure DBA, I'd need help on tuning the server (at least with buffer cache, shared pool size, and pga).
Below is my AWR Report
Thanks in Advance
Buffer Cache: 1,768M Std Block Size: 8K
Shared Pool Size: 352M Log Buffer: 5,120K
Load Profile
Per Second Per Transaction
Redo size: 1,131.67 3,025.06
Logical reads: 169.48 453.03
Block changes: 6.54 17.49
Physical reads: 0.18 0.49
Physical writes: 0.39 1.04
User calls: 9.65 25.79
Parses: 3.22 8.61
Hard parses: 0.17 0.45
Sorts: 1.01 2.71
Logons: 0.05 0.13
Executes: 6.09 16.28
Transactions: 0.37
% Blocks changed per Read: 3.86 Recursive Call %: 60.19
Rollback per transaction %: 18.47 Rows per Sort: 40.95
Instance Efficiency Percentages (Target 100%)
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 100.07 In-memory Sort %: 100.00
Library Hit %: 97.80 Soft Parse %: 94.82
Execute to Parse %: 47.09 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 95.10 % Non-Parse CPU: 91.53
Shared Pool Statistics
Begin End
Memory Usage %: 75.62 79.10
% SQL with executions>1: 95.39 93.05
% Memory for SQL w/exec>1: 93.56 91.78
Top 5 Timed Events
Event Waits Time(s) Percent Total DB Time Wait Class
class slave wait 122 624 3,490.44 Other
CPU time 11 64.05
control file parallel write 1,180 3 19.57 System I/O
control file sequential read 544 2 13.80 System I/O
log file parallel write 1,358 2 10.52 System I/O
SGA Memory Summary
SGA regions Size in Bytes
Database Buffers 1,853,882,368
Fixed Size 792,964
Redo Buffers 5,242,880
Variable Size 556,000,892
init.ora Parameters
Parameter Name Begin value End value (if different)
__db_cache_size 1853882368
__java_pool_size 16777216
__large_pool_size 58720256
__shared_pool_size 369098752
_shared_pool_reserved_pct 6
background_dump_dest C:\ORACLE\PRODUCT\10.1.0\ADMIN\V5
compatible 10.1.0.2.0
control_files C:\ORACLE\PRODUCT\10.1.0\V5\V5\CO
core_dump_dest C:\ORACLE\PRODUCT\10.1.0\ADMIN\V5
db_block_size 8192
db_cache_size 16777216
db_domain
db_file_multiblock_read_count 16
db_name V5
db_recovery_file_dest C:\oracle\product\10.1.0\flash_re
db_recovery_file_dest_size 21474836480
dispatchers (protocol=TCP)
fast_start_mttr_target 26
java_pool_size 16777216
job_queue_processes 20
large_pool_size 58720256
log_buffer 5242880
open_cursors 2000
pga_aggregate_target 896147200
processes 900
remote_login_passwordfile SHARED
resource_limit FALSE
session_cached_cursors 100
sessions 995
sga_max_size 2415919104
sga_target 2306867200
shared_pool_size 369098752
shared_servers 25
sort_area_size 65536
transactions 900
undo_management AUTO
undo_tablespace UNDOTBS1
user_dump_dest C:\ORACLE\PRODUCT\10.1.0\ADMIN\V5
-
Take a look at Jeff Hunter's blog (Marist89), http://marist89.blogspot.com/2005/07/trace-it_04.html. He has a post about doing SQL tracing. Then start doing SQL tracing and see what your database is really doing. Also look at this page, http://www.psoug.org/reference/trace_tkprof.html. I would use this command to turn on tracing though, "ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';"
Start by clearing out your $ORACLE_ADMIN/udump directory, turn on tracing during the busy part of the day, let it run for an hour, then turn it off and run tkprof on each of the trace files according to Jeff's post. Then look through each of the trace files.
Finally find the poorest performing SQL, tune, then start the process all over again. Continue until you have fixed the worst performing SQL. This might require adding indexes, changing SQL, changing table layouts and foreign keys or some mix of the above.
-
Please run query below, then post results in between code tags.
HTML Code:
set pages 999
set lines 90
column c1 heading 'Event|Name' format a30
column c2 heading 'Total|Waits' format 999,999,999,999
column c3 heading 'Seconds|Waiting' format 999,999,999
column c4 heading 'Total|Timeouts' format 999,999,999
column c5 heading 'Average|Wait|(in secs)' format 99.999
ttitle 'System-wide Wait Analysis|for current wait events'
select
event c1,
total_waits c2,
time_waited / 100 c3,
total_timeouts c4,
average_wait /100 c5
from
sys.v_$system_event
where
event not in (
'dispatcher timer',
'lock element cleanup',
'Null event',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'pmon timer',
'rdbms ipc message',
'slave wait',
'smon timer',
'SQL*Net break/reset to client',
'SQL*Net message from client',
'SQL*Net message to client',
'SQL*Net more data to client',
'virtual circuit status',
'WMON goes to sleep'
)
AND
event not like 'DFS%'
and
event not like '%done%'
and
event not like '%Idle%'
AND
event not like 'KXFX%'
order by
c3 desc
;
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
-
 Originally Posted by PAVB
Please run query below, then post results in between code tags.
So I ran the query against one of my databases that has had performance issues, and it looks like a statspack report. I was wondering what specific things do you suggest people look at in order to diagnose performance problems? Do the following two stats tell me that I am doing too many full tablescans?
Code:
db file scattered read 33,168,651 478,165 0 .014
db file sequential read 69,640,302 407,351 0 .006
Do you spool to a file and grep for the word contention or the word wait? If I see that the longest average wait is less than 4 seconds, what does that really mean? I thought tuning by ratios was for the most part debunked.
-
Hi Pablo,
Here is the output
Code:
Thu Oct 08 page 1
System-wide Wait Analysis
for current wait events
Average
Event Total Seconds Total Wait
Name Waits Waiting Timeouts (in secs)
------------------------------ ---------------- ------------ ------------ ---------
jobq slave wait 81,293 249,772 79,125 3.070
Queue Monitor Slave Wait 3,719 221,041 3,719 59.440
queue messages 26,345 130,674 25,503 4.960
Queue Monitor Wait 4,193 128,206 4,190 30.580
wakeup time manager 2,102 114,648 866 54.540
class slave wait 4,100 20,930 4,100 5.100
Backup: sbtwrite2 72,844 1,628 0 .020
Backup: sbtbackup 12 307 0 25.540
db file scattered read 15,144 138 0 .010
Queue Monitor Task Wait 4,023 136 5 .030
PX Deq: Execution Msg 4,525 119 20 .030
control file parallel write 43,410 114 0 0.000
control file sequential read 34,008 109 0 0.000
db file sequential read 14,901 85 0 .010
log file parallel write 57,659 73 0 0.000
Backup: sbtclose2 12 66 0 5.500
log file sync 34,626 51 0 0.000
db file parallel write 46,918 38 0 0.000
PX Deq Credit: send blkd 1,000 25 3 .020
log file sequential read 450 13 0 .030
PX Deq: Execute Reply 3,584 13 0 0.000
direct path read 3,164 8 0 0.000
switch logfile command 5 6 0 1.230
Backup: sbtinfo2 12 4 0 .370
enq: CF - contention 19 4 0 .230
control file heartbeat 1 4 1 4.100
PX Deq: Table Q Normal 965 3 0 0.000
rdbms ipc reply 504 3 1 .010
Backup: sbtremove2 6 2 0 .360
PX Deq: Signal ACK 342 2 27 .010
PX Deq Credit: need buffer 187 2 0 .010
read by other session 279 2 0 .010
ksfd: async disk IO 15,352 1 0 0.000
enq: US - contention 22 1 0 .040
control file single write 278 1 0 0.000
PX Deq: Parse Reply 244 1 0 0.000
enq: TC - contention 3 1 0 .200
latch: session allocation 352 1 0 0.000
library cache pin 25 0 0 .010
Backup: sbtinit 8 0 0 .030
latch free 74,475 0 0 0.000
process startup 60 0 0 0.000
log file switch completion 11 0 0 .020
latch: library cache 102 0 0 0.000
db file parallel read 10 0 0 .020
PX Deq: Table Q Sample 118 0 0 0.000
recovery area: computing obsol 41 0 0 0.000
ete files
library cache load lock 12 0 0 .010
log buffer space 3 0 0 .030
row cache lock 21 0 0 0.000
SQL*Net more data from client 10,741 0 0 0.000
enq: RO - fast object reuse 96 0 0 0.000
PX Deq: Table Q Get Keys 99 0 0 0.000
LGWR wait for redo copy 2,203 0 0 0.000
latch: cache buffers chains 11 0 10 0.000
log file single write 36 0 0 0.000
PX Deq: Join ACK 192 0 0 0.000
db file single write 14 0 0 0.000
direct path read temp 47 0 0 0.000
PX Deq: Msg Fragment 87 0 0 0.000
latch: library cache lock 2 0 0 .010
latch: shared pool 206 0 0 0.000
Backup: sbtend 8 0 0 0.000
direct path write 91 0 0 0.000
recovery area: computing dropp 4 0 0 0.000
ed files
recovery area: computing appli 4 0 0 0.000
ed logs
PX Deq: Table Q qref 129 0 0 0.000
recovery area: computing backe 33 0 0 0.000
d up files
buffer busy waits 15 0 0 0.000
Backup: sbtinit2 8 0 0 0.000
latch: cache buffers lru chain 1 0 0 0.000
instance state change 1 0 0 0.000
enq: PS - contention 21 0 0 0.000
latch: In memory undo latch 1 0 0 0.000
undo segment extension 18 0 17 0.000
direct path write temp 606 0 0 0.000
latch: redo allocation 1 0 0 0.000
reliable message 1 0 0 0.000
78 rows selected.
-
 Originally Posted by gandolf989
I thought tuning by ratios was for the most part debunked. 
I'm not looking at ratios but at wait events.
Query shows which wait events are affecting the most your system so you can drill down on them.
As per posted "db file scatered/sequential read" I don't know the answer to your question... and I don't know it for two reasons. 1) I really don't care about ratios but individual wait events and, 2) correct ratio would depend on specific application. May be that ratio is fantastic, who knows?
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
-
 Originally Posted by PAVB
I'm not looking at ratios but at wait events.
Query shows which wait events are affecting the most your system so you can drill down on them.
As per posted "db file scatered/sequential read" I don't know the answer to your question... and I don't know it for two reasons. 1) I really don't care about ratios but individual wait events and, 2) correct ratio would depend on specific application. May be that ratio is fantastic, who knows? 
So given the waits that we are seeing, does it seem like the waits are caused by replication to or from another database? And perhaps there is some bottleneck in the replication process?
-
 Originally Posted by socalguy09
Here is the output
quick question... was this system startup three or four days ago?
Even so... system doesn't appear to be under any kind of stress.
Are you sure you are facing a general performance issue or is it just a couple of easily identifiable queries?
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
-
Hi Pablo,
Yes, the system was just rebooted due to slowness and database hanging.
I would say the bad performance is the result of both general performance issue
and bad SQL queries/code. That's what I'm trying to find the root cause. Sorry, I'm not a dba.
Thanks in advance.
-
Please describe the host system e.g. processors and memory, please also let us know if there is anything else other than this Oracle instance running on the particular host.
Please describe database in terms of size - total space allocation, size of the five larger tables.
You mentioned performance gets hurt "after the bulk load or bulk import". In regards to this item... 1) how many tables are getting loaded?... 2) how many rows are involved in each load?... 3) What are the size of load target tables in number of rows?
Is there any process gathering performance statistics?
You also mention database was shutdown/startup because of performance/hanging issues... did shutdown/startup the instance solved those issues?
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|