he has 5 groups, where did you see 3 groups...?
OK, imagine your web app users are washing their hands in the bathroom sink before they prepare your dinner, since you sold the utility room sink and kitchen sink on eBay. You route the washing machine drain into the bathroom sink and run a load. How will this affect your dinner time? Will measuring the amount of water flowing into the sewer help? Would routing the washing into its own drain help your users, so they don't have to wait to use the bathroom sink until the washing machine is done with it?
Can you give me your reasoning for this? What happens?
Why would dedicated connections be better (Because it would prevent....?)
Switching from MTS to dedicated would be quite easy (a change to tnsnames on the client) - but what impact would that have on the system / response times / connection times / SGA usage?
What I suspect is happening (and why you should look at those V$ tables I posted on cdos) is that your web users' response times are going way up while the bulkier operations hog the shared servers. This is probably being worsened by the redo issues. But if that is the problem, then all your testing of disk response and wait-time stats and whatall is kind of pointless. Those things should be done after correcting the configuration to separate out OLTP and batch connections. You may then find the operations choke your redo log buffer or thrash your RBS tablespace or whatever. But if the phone isn't ringing off the hook because the users don't notice, maybe they won't even let you tune, and they'll call you a Higglytown Hero. Or it really could be what everyone else says and I'm just talking out of my butt.
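The V$ tables meant here are presumably the shared-server ones; a quick look might be something like this (a sketch - check the column names against your version's reference before relying on it):

```sql
-- Average wait (centiseconds) per request in the shared-server common queue:
-- if this climbs while the batch jobs run, your web users are queueing.
SELECT DECODE(totalq, 0, 0, wait / totalq) AS avg_wait_cs
  FROM v$queue
 WHERE type = 'COMMON';

-- How busy the shared servers themselves are (busy/idle in centiseconds).
SELECT name, status, requests, busy, idle
  FROM v$shared_server;
```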
Note that you can use both MTS and dedicated servers, that is just controlled by the client connection. You can either put it in the connect string (if I'm not completely confused, the syntax is in the docs somewhere), or just have another TNSNAMES entry with a different name and a (SERVER=DEDICATED) line, and connect to that for the big jobs. Be real careful about backing up the tnsnames.ora before making such a cut-and-paste mod, some versions of Oracle are just psychotic about stray spaces.
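For example, a second tnsnames.ora entry for the big jobs could look like this (HOST, PORT, and SERVICE_NAME here are placeholders for your own values - and as said above, back the file up first):

```
# Dedicated-server alias for batch jobs; point the heavy sessions at BIGJOBS.
BIGJOBS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (SERVER = DEDICATED)
    )
  )
```

Everything else keeps connecting through the original (MTS) alias, so nothing changes for the web users.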
As to switching everybody away from MTS, I couldn't say whether that would benefit your configuration. However, my experience with similar configurations has often been: throw out MTS and everybody's happy. YMMV
DBA? Larry sez we donneeddoseanymore
There are a couple of issues:
1. Log Writer bottleneck & Init.ora file
2. MTS connection
== from your statspack report (wait events)
                                                 Total Wait  wait  Waits
Event                           Waits  Timeouts   Time (cs)  (ms)   /txn
----------------------------- --------- -------- ---------- ----- ------
SQL*Net message from client   3,085,839        0  5,278,952    17   54.7
virtual circuit status            1,361       60    243,750  1791    0.0
SQL*Net message to client     3,085,835        0        548     0   54.7
SQL*Net more data from client    11,044        0        107     0    0.2
I have gone through the statspack report.
Here are my observations:
log_checkpoint_interval is set to 10000 OS blocks. On most Unix systems the OS block is 512 bytes, so after LGWR has written 5,120,000 bytes to the redo log file, the log file gets synchronized. That's why you see "log file sync" waits in the statspack report. This value, 10000, is too low. I usually set it to a very high value (9M or 10M) so that the sync occurs only at the log switch. Also, reduce the log_buffer value to less than 1 MB, unless your system is a pure DW.
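In init.ora terms, that suggestion would look something like this (the values are the ones proposed above, not universal recommendations - test on your own workload):

```
# log_checkpoint_interval is in OS blocks: 18432 * 512 bytes = 9 MB,
# large enough that checkpoints are effectively driven by log switches.
log_checkpoint_interval = 18432

# keep the log buffer under 1 MB unless the system is a pure DW
log_buffer = 1048576
```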
If you use Java, set autocommit to false. Committing after every statement is also one of the reasons for high log file sync waits.
Since the box has huge physical memory (48 GB), I would not recommend using MTS. MTS is basically designed for memory-constrained systems. You can very well go with dedicated server connections.
MTS is suitable for pure OLTP transactions, where a transaction is expected to finish its work in less than 45 seconds (Tom explained it in his first book, page 87), with low data transfer between client and server.
From your statspack report, the waits on SQL*Net message from and to client are very high. This is mainly because of poor MTS design. Today, big shops rarely use MTS. To improve the speed of data transfer between server and clients, also increase SDU and TDU.
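SDU/TDU are negotiated per connect descriptor; a tnsnames.ora sketch follows (the sizes and names are only examples, and the server side - listener.ora - must agree for the larger SDU to actually take effect):

```
BIGDATA =
  (DESCRIPTION =
    (SDU = 8192)
    (TDU = 8192)
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )
```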
Other than these 2, I don't find any serious bottleneck from the statspack report.
What is the CPU_COUNT value?
Last edited by tamilselvan; 06-09-2005 at 08:10 AM.
Originally posted by blguy
New mount points added, and we moved the REDOs from the R5 array to the new drives. But no change in performance from the dedicated drives (same wait times, etc).
It turns out that Sun does 36 GB 15K RPM drives for the 3510, instead of the 146 GB 10K RPM drives. The drive buffer caches are different too (it seems Sun does small fast drives specifically).
I just looked more closely at the log writer's write performance:
Statistic                                 Total   per Second    per Trans
--------------------------------- ------------- ------------ ------------
redo blocks written                     314,161        510.0          5.6
redo buffer allocation retries                5          0.0          0.0
redo entries                            222,696        361.5          3.9
redo log space requests                       5          0.0          0.0
redo log space wait time                     75          0.1          0.0
redo ordering marks                           0          0.0          0.0
redo size                           145,666,592    236,471.7      2,580.2
redo synch time                         225,625        366.3          4.0
redo synch writes                        56,541         91.8          1.0
redo wastage                          9,989,864     16,217.3        177.0
redo write time                         115,578        187.6          2.1
redo writer latching time                     2          0.0          0.0
redo writes                              39,158         63.6          0.7
user commits                             56,456         91.7          1.0
236 kb/sec redo size / 63.6 redo writes/sec = ~3.7 kb/write (7 or 8 blocks of 512 bytes each)
510 redo blocks written/sec * 512 bytes = 255 kb/sec (about the same as the 236 kb/sec redo size), OK
(Just to check that the arithmetic works... :-)
91.7 user commits/sec / 63.6 redo writes/sec = 1.44 commits/write
(You are "batch committing" - one write serves more than one commit; that is good)
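The back-of-the-envelope numbers above can be reproduced from the statspack per-second figures (a quick sketch; the constants are copied from the report above):

```python
# Per-second figures from the statspack "redo" statistics above.
redo_bytes_per_sec = 236_471.7    # "redo size"
redo_writes_per_sec = 63.6        # "redo writes"
redo_blocks_per_sec = 510.0       # "redo blocks written"
commits_per_sec = 91.7            # "user commits"
REDO_BLOCK = 512                  # redo block size in bytes (typical Unix port)

kb_per_write = redo_bytes_per_sec / redo_writes_per_sec / 1024
blocks_per_write = redo_blocks_per_sec / redo_writes_per_sec
kb_per_sec_from_blocks = redo_blocks_per_sec * REDO_BLOCK / 1024
commits_per_write = commits_per_sec / redo_writes_per_sec

print(f"{kb_per_write:.2f} KB per LGWR write")       # ~3.6 KB/write
print(f"{blocks_per_write:.1f} redo blocks/write")   # 7-8 blocks of 512 bytes
print(f"{kb_per_sec_from_blocks:.0f} KB/s written")  # ~255 KB/s, matches redo size
print(f"{commits_per_write:.2f} commits per write")  # >1 means group (batch) commit
```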
                                              Total Wait   wait  Waits
Event                         Waits  Timeouts  Time (cs)   (ms)   /txn
------------------------- ---------- -------- ---------- ------ ------
log file sync                 56,757        0    225,427     40    1.0
db file sequential read    1,729,916        0     93,853      1   30.6
(1) Your "db file sequential read" waits are served from a cache - either the Unix filesystem cache or the storage array's cache (a real physical read takes 6 ms - 12 ms).
(2) The 40 ms for "log file sync" is slow - as mentioned in many places.
On an application with a high commit rate (800 commits/second) we used the old Sun A1000 arrays for data and redo: 2 arrays, with mirroring across the arrays.
iostat showed less than 1 ms per write;
Oracle statspack of course showed more (but below 4 ms) - the Oracle "log file sync" in my understanding includes both write calls, one per redo member.
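One way to see how much of the sync wait is the actual disk write is to compare the two events instance-wide (a sketch; v$system_event times are in centiseconds, hence the * 10 to get ms):

```sql
-- 'log file parallel write' is LGWR's own I/O; 'log file sync' is what the
-- committing sessions see on top of it (posting, scheduling, etc.).
SELECT event,
       total_waits,
       ROUND(time_waited * 10 / total_waits, 1) AS avg_ms
  FROM v$system_event
 WHERE event IN ('log file sync', 'log file parallel write');
```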
Although the A1000s are not very expensive, they provided a 128 kb battery-backed write-back cache. And once I noticed that "log file sync" had increased (to more than 10 ms) and system performance had decreased - that was because the old battery was no longer working, and the storage array had automatically disabled the write-back cache.
Conclusion: Even a VERY SMALL write-back cache (128 kb - yes, kb, not MB) SIGNIFICANTLY increases logwriter performance when you have small writes.
I guess the Sun 3510 replaced the old A1000 arrays.
(The log file sync on that Sun A1000 were faster than on our billing system using an EMC Symmetrix with 32 GB write back cache....)
Now I am really curious about your log file sync on these new arrays, and am waiting for a new statspack report (at least after you get your new controller).
But this will help only for "normal" operation with many small commits.
Your huge CTAS in logging mode will definitely have a severe impact!
(How big is that table and how long does that take?)
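If the recovery implications are acceptable, the CTAS can skip most of that redo generation (a sketch; the table and query are placeholders - and take a backup afterwards, since NOLOGGING data cannot be recovered from the archived logs):

```sql
-- NOLOGGING CTAS: generates minimal redo instead of logging every row.
CREATE TABLE big_copy NOLOGGING
AS SELECT * FROM big_table;
```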