-
Currently we have 2 x 140GB drives reserved for REDO, directly attached.
I'm seriously considering moving to raw devices for the redo logs. We have 6 independent drives (each 140GB) on a fibre backplane - direct attached, not SAN. I don't think the array controller is a bottleneck - it would not even break into a sweat with 6 drives.
Q. How would I configure them? My thoughts are:
Three sets of mirrored drives. Each mirror contains a log member.
i.e.
Drive 1 - REDO1A
Drive 2 - REDO1B
Drive 3 - REDO2A
Drive 4 - REDO2B
Drive 5 - REDO3A
Drive 6 - REDO3B
and then drives 1 & 2 would also contain REDO4A/B, drives 3 & 4 REDO5A/B, and drives 5 & 6 REDO6A/B, giving 6 groups of two members each - each member 256MB in size to keep the 15-minute switch rate.
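For what it's worth, that layout would come out something like this in SQL*Plus (a sketch only - the raw device paths here are invented placeholders; substitute whatever your engineers carve out, and remember raw partitions must be at least slightly larger than the 256MB members):
Code:
-- Groups 1-3: one member on each drive of a pair (paths are hypothetical)
ALTER DATABASE ADD LOGFILE GROUP 1
  ('/dev/rdsk/drive1_redo1a', '/dev/rdsk/drive2_redo1b') SIZE 256M;
ALTER DATABASE ADD LOGFILE GROUP 2
  ('/dev/rdsk/drive3_redo2a', '/dev/rdsk/drive4_redo2b') SIZE 256M;
ALTER DATABASE ADD LOGFILE GROUP 3
  ('/dev/rdsk/drive5_redo3a', '/dev/rdsk/drive6_redo3b') SIZE 256M;
-- Groups 4-6 cycle back onto the same drive pairs
ALTER DATABASE ADD LOGFILE GROUP 4
  ('/dev/rdsk/drive1_redo4a', '/dev/rdsk/drive2_redo4b') SIZE 256M;
You would drop the old groups one at a time once each goes INACTIVE.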
Archiving goes to a 470GB 9*72GB RAID5 array (and soon to a Standby server over the WAN)
Afraid my SUN hardware knowledge is a bit basic - I have experienced engineers to do the physical stuff - but they don't know Oracle very well!
So - do we think this REDO configuration will allow us to do DML during the working day without killing the OLTP users?
__________________
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Guy Wicks
Database Manager
The British Library
Last edited by hrishy; 06-06-2005 at 06:17 AM.
-
Hi
I think it's okay to have 6 groups with two members each, and your redo log layout also seems fine to me.
Now, whether this will solve your problem can only be answered by actually doing the task. :-)
You could also clone your prod DB to a test instance and see whether you can really reproduce the problem there.
regards
Hrishy
-
Originally posted by blguy
Q. How would I configure them? My thoughts are:
Three sets of mirrored drives. Each mirror contains a log member.
i.e.
Drive 1 - REDO1A
Drive 2 - REDO1B
Drive 3 - REDO2A
Drive 4 - REDO2B
Drive 5 - REDO3A
Drive 6 - REDO3B
and then drives 1 & 2 would also contain REDO4A/B, drives 3 & 4 REDO5A/B, and drives 5 & 6 REDO6A/B, giving 6 groups of two members each - each member 256MB in size to keep the 15-minute switch rate.
I don't see the point of spreading across three sets of drives. LGWR will write to one pair (on separate controllers, right?). Then ARCH will read the logfile and send it to log_archive_dest_n and/or the standby while LGWR is working on another pair. Since the ARCH process is so quick (most likely), the time spent reading the log that was just switched is minimal. Eight disks would be ideal: you could set up two 0+1 filesystems and write one member to each filesystem (as long as you're using hardware RAID).
For what it's worth, I think you're masking your real problem with hardware.
Jeff Hunter
-
Jeff,
I think you're right - the fact the system runs OK during the day is just brute force - a massive overspec of the base system. Generally the system bumbles along with 4 to 6 CPUs active (per top) and the rest idle/sleeping. It may not be nice, but it works.
(This is a basically a government procurement - of course we got screwed eh?)
Will arrange for the redo drives to be reconfigured at the next scheduled down time (need to change control etc etc etc) and I'll see if it has any improvement on the flexibility of the system.
In answer to my original question - would you expect a CTAS to cripple this system? - I think the answer is a definite no, and I'm glad everyone else agrees. I thought I might be going mad!
Anyway - got to get onto setting up a SQLServer cluster with replication. Oh Joy!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Guy Wicks
Database Manager
The British Library
-
Have you done the stress test and the new statspack report? I wouldn't suggest you change anything without finding the root cause!
Your CTAS can hang your system because of redo log issues. I say that because you said that with NOLOGGING the system runs fine, without your users complaining.
-
I feel there is some problem even when operations like CTAS are NOT running. Response time is made up of roughly 15% service time and roughly 85% wait time. Almost every commit is waiting on "log file sync".
I strongly suggest you check everything about the disks hosting your online redo logs. Do you ever see at least one of the redo groups with status INACTIVE? How big are the online redo logs? How many archive logs are generated in a day, and during peak times?
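A couple of quick queries should answer those (a sketch - exact columns vary a little between Oracle versions):
Code:
-- status, member count and size of each online redo group
SELECT group#, members, bytes/1024/1024 AS mb, status
FROM   v$log;

-- log switches per hour over the last day
-- (v$log_history has one row per completed switch)
SELECT TO_CHAR(first_time, 'DD-MON HH24') AS hour, COUNT(*) AS switches
FROM   v$log_history
WHERE  first_time > SYSDATE - 1
GROUP  BY TO_CHAR(first_time, 'DD-MON HH24')
ORDER  BY 1;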
svk
-
svk
I've never seen INACTIVE redo groups.
We have 5 members per group, each member 256MB. Average log switch every 10 to 15 minutes during the day.
Here are the results of the CTAS test just before the users arrive (the reading rooms open at 9:30 BST) - this would have upset a few internet web users though!
This is the SQLPLUS script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Guy Wicks
Database Manager
The British Library
-
Stats pack report for processing during CTAS logging exercise
sp_8131_8133.txt
And a couple of Spotlight captures:
Notice the RED on log writer - the average log write time can go over 1,000 ms (1 sec) - so is the drive being swamped? At over 2,500 blocks per second (8K blocks) that is roughly 20MB/sec transferred - a heavy sustained synchronous write load for a single drive.
See the orange block grow as the number of active sessions climbs. This is how I know I'm affecting the online users.
Why would the CTAS saturate the REDO logs and prevent the OLTP sessions from processing? Can Oracle not prioritise the small OLTP sessions?
I feel the CTAS exacerbates an existing problem - the system is 'out of balance': very fast CPU/datafile but slow REDO. However, this is all proper SUN kit!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Guy Wicks
Database Manager
The British Library
-
Man, didn't you see what the users were waiting on in v$session_wait while you ran the CTAS?!
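Something like this, run while the CTAS is going, would show exactly what the OLTP sessions are stuck on (a sketch - v$session_wait is there in 8i/9i, though the SECONDS_IN_WAIT column may be missing on very old releases):
Code:
-- what every session is currently waiting on, worst first
SELECT sid, event, state, seconds_in_wait
FROM   v$session_wait
WHERE  event NOT LIKE 'SQL*Net%'
ORDER  BY seconds_in_wait DESC;
If the OLTP sessions are all piled up on "log file sync", that confirms the redo subsystem as the choke point.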
-
Also, the first snap was 10 minutes; this second one is 4 minutes!
However, if you look at the log file sync waits:
Code:
First SNAP without CTAS:
log file sync 56,757 225,427 63.60
Second SNAP with CTAS:
log file sync 7,697 686,518 93.81
The second SNAP has only around 13% of the first SNAP's waits (7,697 vs 56,757), yet its total wait time is roughly three times longer (686,518 cs vs 225,427 cs - an average of 892 ms per wait against 40 ms). That clearly shows you have serious I/O problems with your redo log subsystem.
Code:
SECOND SNAP
Event Waits Timeouts Time (cs) (ms) /txn
---------------------------- ------------ ---------- ----------- ------ ------
log file sync 7,697 3,637 686,518 892 1.9
db file sequential read 87,472 0 28,547 3 21.4
log file parallel write 349 0 9,263 265 0.1
db file parallel write 256 0 2,547 99 0.1
log file switch completion 26 3 1,179 453 0.0
log buffer space 108 0 1,043 97 0.0
buffer busy waits 264 2 898 34 0.1
control file parallel write 82 0 677 83 0.0
direct path write 161 0 372 23 0.0
FIRST SNAP
Event Waits Timeouts Time (cs) (ms) /txn
---------------------------- ------------ ---------- ----------- ------ ------
log file sync 56,757 0 225,427 40 1.0
db file sequential read 1,729,916 0 93,853 1 30.6
log file parallel write 39,154 0 29,790 8 0.7
db file parallel write 576 0 2,203 38 0.0
SQL*Net more data to client 540,735 0 2,161 0 9.6
control file parallel write 206 0 508 25 0.0
log file sequential read 4,194 0 163 0 0.1
log file switch completion 5 0 75 150 0.0