anyone for help tuning an 8.1.6 ops system ?

**sgomber** · 02-16-2004, 05:38 AM

hi all,

we're running an 8.1.6 ops cluster on two nodes, each with 5 x 450 MHz CPUs and 6 GB RAM, plus fibrecat storage with 0.5 GB cache. there's also a BEA Tuxedo middle tier running process chains. user connections are made by the application itself, not by Tuxedo.

we're facing performance problems as the number of users grows. with 60-80 users the system performed very well, but at 90-100 users we see extreme load on the db server: 0% cpu idle, 82% user, 18% kernel.

is there any way to balance that load, or is the answer simply: 'hell no! - add cpu, cpu, cpu !!!' ?

here's the init.ora for node1:

parallel_server true
parallel_server_instances 2
disk_asynch_io true
db_block_size 8192
db_name db1
db_files 400
db_file_multiblock_read_count 8
gc_files_to_locks 1=1000,2-5=2000,6-15=25000,16=25000,
17-18=50000,19-28=0,29-31=12500,
32-36=0,37-38=20000,39-40=0,
41-42=20000,41-42=20000,43=0,
41-42=20000,43=0,44=40000
gc_releasable_locks 400000
db_block_buffers 154112
shared_pool_size 536870912
log_buffer 4194304
sort_area_size 4194304
log_checkpoint_interval 3600
processes 500
parallel_max_servers 12
audit_trail false
timed_statistics true
max_dump_file_size 10240
transactions 250
transactions_per_rollback_segment 5
global_names true
compatible 817
instance_number 1
thread 1
buffer_pool_keep 76800
create_bitmap_area_size 33554432
bitmap_merge_area_size 8388608
aq_tm_processes 1
hash_multiblock_io_count 0
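
A listing like this is easier to review once it is totalled up. The following is only a rough sketch in Python, not anything from the original post: it copies the values from the listing above, estimates the main SGA components, and flags file ranges that appear more than once in gc_files_to_locks. The helper names are made up for illustration, the parser assumes the simple comma-separated `files=locks` form exactly as pasted here, and treating buffer_pool_keep as a share of db_block_buffers is an assumption as well.

```python
from collections import Counter

# Values copied from the init.ora listing above.
DB_BLOCK_SIZE = 8192
DB_BLOCK_BUFFERS = 154112
BUFFER_POOL_KEEP = 76800
SHARED_POOL_SIZE = 536870912
LOG_BUFFER = 4194304

GC_FILES_TO_LOCKS = (
    "1=1000,2-5=2000,6-15=25000,16=25000,"
    "17-18=50000,19-28=0,29-31=12500,"
    "32-36=0,37-38=20000,39-40=0,"
    "41-42=20000,41-42=20000,43=0,"
    "41-42=20000,43=0,44=40000"
)

MIB = 1024 * 1024


def sga_estimate_mib():
    """Rough size of the main SGA components in MiB (hypothetical helper)."""
    sizes = {
        "buffer_cache": DB_BLOCK_BUFFERS * DB_BLOCK_SIZE // MIB,
        "shared_pool": SHARED_POOL_SIZE // MIB,
        "log_buffer": LOG_BUFFER // MIB,
    }
    sizes["total"] = sum(sizes.values())
    # Assumption: the keep pool is counted in buffers and taken out of
    # db_block_buffers, so it is reported separately and not added to the total.
    sizes["keep_pool"] = BUFFER_POOL_KEEP * DB_BLOCK_SIZE // MIB
    return sizes


def parse_entries(spec):
    """Split 'files=locks' entries as they appear in the pasted listing."""
    entries = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        files, locks = item.split("=")
        entries.append((files, int(locks)))
    return entries


def duplicated_file_specs(entries):
    """Return the file specs that occur more than once, sorted."""
    counts = Counter(files for files, _ in entries)
    return sorted(files for files, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(sga_estimate_mib())
    print(duplicated_file_specs(parse_entries(GC_FILES_TO_LOCKS)))
```

Under these assumptions the buffer cache comes out at roughly 1.2 GB and the shared pool at 512 MB on a 6 GB node, and the ranges 41-42 and 43 are listed more than once. Whether the repeats are a paste error or really in the parameter file is worth checking.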

thanks in advance to all of you

**stmontgo** · 02-16-2004, 11:36 AM

Can you give a more detailed explanation of the environment please? Is this an off-the-shelf app? Web servers? App servers?

**slimdave** · 02-16-2004, 11:41 AM

Are you sure that the app is using bind variables? Not using them would be a prime cause of non-scalability.
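
To illustrate the point about bind variables, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in (the application in this thread talks to Oracle, where bind placeholders look like `:id` rather than `?`), and the `users` table and function names are hypothetical. The idea it shows is general: with literals, every value produces a different SQL text that the database has to parse separately, while with a bind variable one statement text is reused for all values.

```python
import sqlite3

# One statement text, reused for every value (bind variable style).
BOUND_STATEMENT = "SELECT name FROM users WHERE id = ?"


def literal_statements(user_ids):
    """Build one SQL text per value: each text is a different statement."""
    return [f"SELECT name FROM users WHERE id = {int(uid)}" for uid in user_ids]


def fetch_names_bound(conn, user_ids):
    """Run the same statement text for every id, passing the id as a bind."""
    return [conn.execute(BOUND_STATEMENT, (uid,)).fetchone()[0] for uid in user_ids]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(i, f"user{i}") for i in range(1, 101)],
    )
    ids = range(1, 101)
    # 100 lookups with literals produce 100 distinct statement texts.
    distinct_literal_texts = len(set(literal_statements(ids)))
    # 100 lookups with a bind variable share a single statement text.
    names = fetch_names_bound(conn, ids)
    conn.close()
    return distinct_literal_texts, names


if __name__ == "__main__":
    distinct, names = demo()
    print(distinct, "distinct literal statements vs 1 bound statement")
    print(names[:3])
```

On Oracle, each distinct statement text needs its own parse and its own entry in the shared pool, so an application that builds SQL with literals tends to burn CPU on parsing as the user count grows.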

**pando** · 02-16-2004, 01:07 PM

Since it's OPS, have you checked whether you have lock contention (pinging, or whatever it's called)?

But for a 100-user application, isn't using OPS a bit too "heavy metal"?

**stmontgo** · 02-16-2004, 05:01 PM

> Originally posted by pando
>
> Since it's OPS, have you checked whether you have lock contention (pinging, or whatever it's called)?
>
> But for a 100-user application, isn't using OPS a bit too "heavy metal"?

False pinging. In OPS you have to write your apps explicitly to work with OPS (i.e. partitioning).

**sgomber** · 02-17-2004, 04:14 AM

Hey all,

@stmontgo:
it's an OLTP system based on the dbservers already mentioned (each with 6 UltraSPARC-II @ 400 MHz and 6 GB RAM). no webservers, just two dbservers for ops and 4 appservers running the tuxedo/application.

the 4 (Tuxedo) appservers each have 3 SPARC64GP CPUs @ 400 MHz and 3 GB RAM. appsrv1 is the 'master' and the others are 'slaves'.

database connections are made by appsrv1 and, because of the application program chain (approx. 500 processes), they are actually all dedicated to dbsrv1...

the question here may be: it looks like we don't use ops at all ???

right. because of the MCPD (max_commit_propagation_delay) of 700 we decided to run everything on dbsrv1, while dbsrv2 is mostly idle.
so right now there shouldn't be any false pinging at all, but we'd like to balance the processes across both dbservers.

since the response times of our application were acceptable there was no need to change our config, but the situation changed rapidly when we added more users.

i expect lots of false pinging once we spread the program chain over the two dbservers.

thanks again

**stmontgo** · 02-19-2004, 07:59 AM

Run statspack on your database for starters, and make sure you have run the script to create the gv$ views to get the OPS perspective. Also look at the system as a whole.

In PeopleSoft, for example, the web server maintains the state of the session. The connection is then passed to the app server, which grabs as much information as it can from the db and then does the processing on the app server. Essentially it grabs a large chunk of data from the db and then manipulates it before submitting it back to the database. This way the number of calls to the db is dramatically reduced, which reduces the overhead on the db. For example, referential integrity is maintained on the application servers and not in the database. What you may find is that if the app servers are not properly configured or are being underutilized, you could be transferring a lot of the work from the app servers to the db server instead of spreading out the workload.

Now, given that your db does not seem to scale well, I would give extra attention to the db. Make sure that what you think is happening is really happening. You said you have partitioned the app to only access one node of the cluster. Can you verify this? Can you ensure that no other processes are running on the *standby* node?

As a last resort (after looking at statspack), can you shut down the cluster and start it up as a single instance to see if it scales any better?

Regards,

Steve
