buffer cache

**pando** · 02-02-2001, 06:48 AM

I am asking this because the other day I was doing a massive select * from a huge table, and I noticed that after a certain number of rows had been displayed in sqlplus, the speed of returning rows slowed down considerably (I guess at that point the buffer was full and Oracle was reading from disk again to refill it?) and then sped up again (reading from the buffer now?) :D

Anyway, I thought I would share this "unofficial paper" from Oracle, which I received by e-mail from one of Oracle's analysts. As usual with these papers, Oracle will not be responsible for the consequences of applying anything in it.

Here is an "unofficial" copy of a note (104937.1); not sure if it will be
made external in the near future....

=====================================

Description
-----------

This note explains the new (Oracle 8i) algorithm for managing buffers in
the Oracle buffer cache.

Scope and application
---------------------

This note is for INTERNAL use only and is not for general release.

Replacement algorithm for buffer cache in 8.1.3
-----------------------------------------------

The new replacement algorithm for managing buffers in the cache is no
longer LRU based. Instead, it is a variation of the "touch count", or
"clock frequency" scheme. It is described below assuming that all
buffers are linked on a single list. This differs from the actual
implementation, which has several lists for keeping track of buffers
that are absolutely useless, those that need to be written, etc.

Basic description assuming all buffers on a single list
-------------------------------------------------------

Buffers have a "touch count" field that keeps track of the number of
touches (hits) a DBA (data block address) has encountered while it is in
the cache. Hits that are very "close" together (within
_db_aging_touch_time seconds) are counted as 1 hit.

You can see the current number of touches per buffer by dumping the
buffer header.

Alternatively, you can query x$bh, which has been updated to show the
buffer's touch count:

SQL> SELECT TCH FROM SYS.X$BH WHERE FILE#=4 AND DBABLK=2794;
TCH
-------------
100
1 row selected

What happens when a buffer is touched (hit) in cache?
-----------------------------------------------------

On a hit, the following steps occur:

- Assuming that _db_aging_touch_time seconds have passed since we last
  incremented the touch count, the touch count is increased by 1.

- The buffer is NOT moved from its current position in the list; it
  stays where it is.

- Incrementing the touch count is done without any latching activity.
  As a result, we may occasionally miss an increment to the touch
  count.

So basically, we increment touch count based on the time elapsed since
we last incremented it.
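
The hit handling above can be sketched roughly as follows. This is only
an illustration of the rule in this note, not Oracle's code: the Buffer
class, its last_touch field and the starting touch count of 0 are
assumptions made for the example, and the 3-second interval is the
default from the parameter summary.

```python
from dataclasses import dataclass

DB_AGING_TOUCH_TIME = 3  # seconds; default listed in the note's parameter summary


@dataclass
class Buffer:
    dba: int                           # data block address held by this buffer
    touch_count: int = 0               # assumed starting value (not stated in the note)
    last_touch: float = float("-inf")  # hypothetical field: time of the last counted touch


def touch(buf: Buffer, now: float) -> None:
    """Record a hit. The buffer's position in the list is deliberately not changed."""
    # Only count the hit if enough time has passed since the last counted one.
    if now - buf.last_touch >= DB_AGING_TOUCH_TIME:
        buf.touch_count += 1
        buf.last_touch = now


buf = Buffer(dba=2794)
for t in (0.0, 1.0, 2.5, 4.0, 10.0):   # hit times in seconds
    touch(buf, t)
print(buf.touch_count)  # hits at 1.0 and 2.5 fall inside the 3-second window
```

Five hits are made here, but only three are counted, because the two
hits that follow the first one within 3 seconds are collapsed into it.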

How is a victim selected for replacement?
-----------------------------------------

If we need to read a buffer into the cache, we must first identify a
"victim" to be replaced:

- Victims are selected by scanning the list from the tail of the list.

    IF (touch count of scanned buffer > _db_aging_hot_criteria) THEN
        Give buffer another chance (do not select as a victim)
        IF (_db_aging_stay_count >= _db_aging_hot_criteria) THEN
            Halve the buffer's touch count
        ELSE
            Set the buffer's touch count to _db_aging_stay_count
        END IF
    ELSE
        Select buffer as a victim
    END IF
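
As a rough, runnable illustration of the scan above (a sketch under
assumptions, not the real implementation): the list is modelled as a
Python list whose last element is the tail, halving is done with integer
division, and the function simply returns None if every buffer was
spared on the pass. None of those three details is stated in this note.
The defaults are the ones listed in the parameter summary.

```python
from dataclasses import dataclass
from typing import List, Optional

DB_AGING_HOT_CRITERIA = 2   # threshold, default from the note
DB_AGING_STAY_COUNT = 99    # "low value", default as listed in the note


@dataclass
class Buffer:
    dba: int
    touch_count: int = 0


def select_victim(lru: List[Buffer],
                  hot_criteria: int = DB_AGING_HOT_CRITERIA,
                  stay_count: int = DB_AGING_STAY_COUNT) -> Optional[Buffer]:
    """Scan from the tail (end of the Python list) towards the head."""
    for buf in reversed(lru):
        if buf.touch_count > hot_criteria:
            # Spared: the buffer gets another chance, but its count is reduced.
            if stay_count >= hot_criteria:
                buf.touch_count //= 2          # integer halving is an assumption
            else:
                buf.touch_count = stay_count
        else:
            return buf                         # first buffer at or below the threshold
    return None                                # every buffer was spared on this pass


lru = [Buffer(1, 0), Buffer(2, 1), Buffer(3, 8)]   # head ... tail
victim = select_victim(lru)
print(victim.dba, lru[2].touch_count)
```

In this run the tail buffer (touch count 8) is spared and its count is
halved to 4, and the next buffer up the list (touch count 1) becomes the
victim.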

Where is a new buffer placed in the list?
-----------------------------------------

Unlike LRU, where a buffer with a new DBA is always moved to the top of
the list (except for long table scans), with this scheme a buffer is
inserted in the "middle" of the cache. "Middle" is specified as a
percentage of the list, and is set for the 3 buffer pools using the
parameters:

_db_percent_hot_default
_db_percent_hot_keep
_db_percent_hot_recycle

To understand this, think of the list being divided up into a "hot"
portion (buffers above the "middle" point) and "cold" buffers (those
that are below the middle point). Here, "hot" and "cold" are very loose
terms since it is possible for buffers with a very high touch count to
trickle down to the cold region, if they are not touched since their
last series of touches.

Thus, a buffer read into the default cache will (by default) be
positioned in the middle of the cache (_db_percent_hot_default = 50 by
default). The reasoning for putting the buffer in the middle, as opposed
to the top, is to make this buffer earn its touch count by getting a few
hits before getting a chance to go to the top of the cache.
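
A small sketch of the mid-point insertion, assuming the list is a plain
Python list with the head (hot end) at index 0 and the tail (cold end)
at the last index. The function name and the way the percentage is
turned into an index are choices made for this example only.

```python
from typing import List

DB_PERCENT_HOT_DEFAULT = 50  # default from the note


def insert_at_midpoint(lru: List[int], dba: int,
                       percent_hot: int = DB_PERCENT_HOT_DEFAULT) -> int:
    """Insert dba so that percent_hot % of the existing buffers stay above it.

    Index 0 is the head (hot end); the last index is the tail (cold end).
    Returns the index at which the new buffer was placed.
    """
    pos = len(lru) * percent_hot // 100
    lru.insert(pos, dba)
    return pos


lru = [10, 11, 12, 13]          # DBAs, head ... tail
pos = insert_at_midpoint(lru, 99)
print(pos, lru)
```

With four buffers and the default of 50, the new buffer lands at index
2, leaving two buffers above it in the hot region and two below it in
the cold region.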

Actual Implementation Details
-----------------------------

The actual implementation is a variation of the algorithm to account
for maintaining buffers to be written in separate lists (regular
checkpoint writes versus ping writes versus writes due to reuse range
and reuse object calls).

Also, once buffers are written, they are usually "useless", in the sense
that they have aged out. Such buffers are kept aside on an auxiliary
(AUX) list rather than the main list, provided there are no waiters (or
foregrounds) waiting for this buffer. If there are foregrounds that want
to access these buffers, then they are moved to the "middle" portion of
the main list after re-setting their touch count.

The victim selection described earlier always prefers a buffer in the
auxiliary list over a buffer on the main list. If there are no buffers
on the auxiliary list, then it uses the algorithm described earlier for
the buffers on the main list.
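
The preference for the auxiliary list can be pictured like this. It is a
simplified sketch: buffers are plain dicts with hypothetical "dba" and
"tch" keys, the auxiliary list is consumed from its front (the note does
not say which end is used), and the main-list scan only implements the
"halve" branch, which is the one taken with the defaults listed in this
note.

```python
from collections import deque
from typing import Deque, List, Optional

HOT_CRITERIA = 2  # _db_aging_hot_criteria default from the note


def pick_victim(aux: Deque[dict], main: List[dict]) -> Optional[dict]:
    """Prefer any buffer on the auxiliary list; otherwise scan the main list."""
    if aux:
        return aux.popleft()               # which end is used is an assumption
    # Fall back to the tail-first scan of the main list.
    for i in range(len(main) - 1, -1, -1):
        buf = main[i]
        if buf["tch"] > HOT_CRITERIA:
            buf["tch"] //= 2               # spared, touch count halved
        else:
            return main.pop(i)
    return None                            # nothing suitable on this pass


aux = deque([{"dba": 7, "tch": 0}])
main = [{"dba": 1, "tch": 5}, {"dba": 2, "tch": 1}]   # head ... tail
first = pick_victim(aux, main)    # comes from the auxiliary list
second = pick_victim(aux, main)   # auxiliary list is empty, so main is scanned
print(first["dba"], second["dba"])
```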

Also, there is some special processing for CR buffers, which allows
these buffers to be "cooled", and put at the tail of the main list after
setting their temperature below the "threshold" (parameters
_db_aging_freeze_cr and _db_aging_cool_count).

Specific cases - full table (long) scans
----------------------------------------

From testing it appears that blocks read by a full scan of a long table
(see <Parameter:small_table_threshold>) will be placed at the end of the
list. Thus, these buffers will immediately become victims for
replacement. It was observed that during such a scan, every
db_file_multiblock_read_count's worth of blocks was placed at the end
of the list (LRU flag "moved_to_tail" bit was set) and was replaced by
the next batch of db_file_multiblock_read_count blocks.

This effectively mimics the pre-8i behaviour for full table (long) scans
and avoids the cache being flushed by a huge table scan.
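
To see why this protects the rest of the cache, here is a toy simulation
of the observed behaviour. It is an illustration only: the cache is a
Python list of block numbers with the tail at the end, `batch` stands in
for db_file_multiblock_read_count, and the assumption that the first
batch evicts the coldest resident buffers is made for the example.

```python
from typing import List


def long_scan(lru: List[int], scan_blocks: List[int], batch: int) -> None:
    """Simulate a long full scan: each batch overwrites the previous batch at the tail."""
    previous = 0  # number of tail slots currently holding the previous batch
    for start in range(0, len(scan_blocks), batch):
        chunk = scan_blocks[start:start + batch]
        if previous:
            del lru[-previous:]      # the previous batch is the first to be replaced
        else:
            del lru[-len(chunk):]    # first batch evicts the coldest resident buffers
        lru.extend(chunk)            # new blocks are placed at the tail
        previous = len(chunk)


cache = [1, 2, 3, 4, 5, 6, 7, 8]          # resident blocks, head ... tail
long_scan(cache, list(range(100, 112)), batch=4)
print(cache)
```

Twelve blocks pass through an eight-buffer cache, yet the four buffers
nearest the head are still resident afterwards; only the tail slots were
recycled by the scan.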

Specific cases - full table (short) scans and cached tables
-----------------------------------------------------------

Buffers read by a full scan of a short table, or belonging to a cached
table, are not moved to the tail of the list, and thus do not become
immediate candidates for replacement.

Summary of parameters
---------------------

Parameter name             Default   Description
-------------------------  --------  ----------------------------------
_db_aging_hot_criteria     2         Used to decide victim selection
(threshold)

_db_aging_stay_count       99        Touch count set to this if
(low value)                          low value < threshold during
                                     victim selection

_db_percent_hot_default    50 %      Divides default cache into hot
(middle point)                       and cold regions; specifies where
                                     in the cache a new buffer is to
                                     be placed

_db_percent_hot_keep       0 %       Same as above, but for keep pool

_db_percent_hot_recycle    0 %       Same as above, but for recycle
                                     pool

_db_aging_touch_time       3 secs    Touch count not incremented if
(small interval)                     the buffer is touched within
                                     _db_aging_touch_time seconds of
                                     the last touch

SO IT SEEMS LIKE THE LRU ALGORITHM IS NO LONGER USED!

Thanks for the input.
