-
Hi all, I work for one of the Stock Exchanges, and at the market open time there are a lot of inserts into multiple tables and selects from multiple tables for a matter of 2-3 mins. At this time the idle time per CPU (on our 8 CPU UNIX box) is about 3-5% !!! Bad, Very BAD !!!
So, any suggestions as to what I should look at to tune the DB so that the CPU will be more efficiently utilized ?
BTW, we just upped our CPUs from 4 to 8.
Thanks in advance for the help.
-
What happens after the 3rd minute?
-
Are there a lot of log switches when this happens? If so, bigger redo logs would probably help with the inserts.
-
After the 3rd min, the CPU average utilization goes down to 40-60%;
and yes there are alot of REDO log switches: 1 250MB file every 5-7 mins.
-
another thing you can look into with the inserts is make sure that you really need all of your indexes. Indexes slow down inserts.
also, you might look into your app and see if it is possible to do array binding. what this means is that if your query looks like this :
insert into mytable(col1, col2, col3, col4)
values (:val1, :val3, :val4);
your 'val' variables contain ARRAYS instead of just single
values. If your arrays contain 100 values, then you execute
a query 1 time instead of 100 times.
If your inserts look more like this :
insert into mytable(col1, col2, col3, col4)
values ('x', 'y', 'z', 'zz');
with hard-coded values then you should definately switch to variables. You will notice a big improvement.
-
any other suggestions of decreasing the CPU utilization ?
-
Are you having any freelist contention? run this query just after or during your peak periods...
SELECT round((sum(W.COUNT) / SUM(S.VALUE)) * 100, 2) "Free List Wait Ratio"
FROM SYS.V_$WAITSTAT W, SYS.V_$SYSSTAT S
WHERE W.CLASS = 'free list'
and s.name in ('db block gets', 'consistent gets')
Oracle Recommends that this ratio be less than 1. If it is greater than 1, then you might want to recreate some of your most active tables with a greater value for FREELISTS.
that's all I can come up with....
-
Hrm you said that 2-3 minutes the cpu util is very high, but the huge logs are created every 5-7 minutes. By that I assume that all the inserts/selects continue longer than the 2-3 minutes, but only the preliminary 2-3 minutes are the problem, Is that a correct assumption?
What else is happening for those 2-3 minutes? 100 people opening up applications/sessions to start doing the inserts/selects? Other preliminary work?
-
I guess I need to dive a little deeper into the activity that happens that 2-3 mins. I think there are several selects that take place at that time to set up some daily tables, but no, no users login to this db, it is more for the trading system inserts.
Thank you all for your input.
-
Take a look at sar -q and vmstat during the high-cpu period.
If the run-queue doesn't go up dramatically, it may not be a problem.
Is it possible a lot connections are being made ? Creating a connection is very expensive on memory and CPU.
Do you have enough memory ?
using sar -u, when you have the low cpu-idle, what's high ?
user ? system ? waiting for i/o ?
If waiting for i/o is high, it's possible it's paging.
if user is high, I think it the creation off connections.
Regards
Gert
-
Look SQL
I have a similar problem as well. The rule what to do is simple:
1) tune app first if can do that. For example, can always create/drop indexes. 2) Else upgrade HW. 3) If you are lucky, something wrong is with DB config.
To tune application:
1. find out most active SQL by block visits:
select kind of buffer_gets/executions, ... from sql_area order by 1
a) If you have 2/3 minutes of activity, so can do the folowing do get deltas:
create table hist_sql as select sysdate cr_date, a.* from v$sqlarea a
then find deltas, ex., most often executed SQL:
select (a.executions-b.executions)/((sysdate-b.cr_date*24*60*60), b.* from hist_sql b, v$sqlarea a where a.HASH_VALUE=b.HASH_VALUE and a.ADDRESS=b.ADDRESS and a.sql_text=b.sql_text
order by 1 desc
b) the same for calculating statistics deltas from v$sysstat, v$sesstat, but this does not help much unless there are obvious anomalies in some sessions: then you can identify which sessions is using x time more of CPU then others and concentrate on that.
2. do a trace and analyse with TKPROF:
a) enable timing with alter system set timed_statistics = true
b) switch on trace for any session you need with:
sys.DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION(SID, SERIAL,TRUE/FALSE);
c) tkprof in.trc out.trc SORT=(EXEQRY, EXECU, FCHQRY, FCHCU) for visited blocks
tkprof in.trc out.trc SORT=(PRSCPU, EXECPU, FCHCPU) for CPU time, etc
3. PIN stored PL/SQL into shared pool, for systems which use a lot of dynamic SQL it is healthy - if have enough RAM.