buffer cache - Page 2
DBAsupport.com Forums - Powered by vBulletin
Page 2 of 2 FirstFirst 12
Results 11 to 19 of 19

Thread: buffer cache

  1. #11
    Join Date
    May 2000
    Assume that you are executing the SQL statement in SQLPLUS.
    1 Oracle reads first 10M rows and then places them into buffer cache. And sends the data to User Workstation through NET8 (TCP/IP)

    2 Now, Oracle reads 2nd 10M rows that must be placed in the buffer cache, but all the buffer blocks are filled with the 1st set of 10M rows. So it flushes (aged ) out the old buffer blocks and places the new data (2nd 10M rows) into the buffer cache. And sends the data to user's workstation.

    This process goes on until it completes the fetching the rows from disk and sends out to workstation.

    So it needs 10 reads logically to send 100M rows, but physically Oracle reads the data based on the I/O size, and DB_FILE_MULTIBLOCK_READ_COUNT. But remember, the max I/O size supported in one physical read cannot exceed 64K.

    [Edited by tamilselvan on 02-01-2001 at 12:05 PM]

  2. #12
    Join Date
    Jul 2000
    My guess would be that the amount returned to a user requesting 100M rows would be the
    db_block_size * db_multiblock_read_count / rowsize (in bytes).

    So by this token I believe that the db buffer cache would eventually fill up but would fill up at the rate of
    (db_buffer * db_block_size) - db_block_size * db_multiblock_read_count.

    Obviously the 'oldest' data would be aged out first and then get overwritten first (one user on DB only).

    So the answer in my opinion would be
    100,000,000 / (db_block_size * db_multiblock_read_count / rowsize).


    What does anyone else think?


  3. #13
    Join Date
    Jul 2000
    Hi Pando,

    Depending on the setting of the session parameter for the optimizer mode
    first_rows or all_rows.

    The user will get the first rows quick or has to wait for the query to complete.


  4. #14
    Join Date
    Jun 2000
    Madrid, Spain
    the optimizer is choose

  5. #15
    Join Date
    Dec 2000
    Ljubljana, Slovenia
    There are couple of things here that have to be clarified.

    1. This is not directly connected to your question, but the idea is present in some of the messages in this thread: "Full table scan (FTS) of big table would fill the buffer cache, thus flushing out blocks from other tables completely". This is not true, because during full table scan buffers are read into the "least recently used" (LRU) end of the buffer cache, meaning that they will be the first to be flushed out of the buffer cashe. Let's assume we have buffer cashe of 1000 blocks, 100 of which are free. When FTS begin, first 100 blocks are read into the empty buffers in the cache and they are all put in the LRU end of the free list. When next set of blocks (number is defined by MULTIPLE_BLOCK_READ_COUNT)) have to be read, Oracle must flush the same number of buffers from the cache and it chooses those from the LRO end of the list. Incidentaly, this are the same blocks that have where read previously in the same FTS operation. So in effect, FTS cycles through a limited number of bufers in the cache, thus preventin a single FTS to flood the whole buffer cache.

    2. The optimizer mode (RULE, CHOOSE, FIRST_ROWS, ALL_ROWS) has absolutely no effect wheter the rows will be returned to user as soon as each of them is found or when all the rows are processed. This is determined by the type of a particular operation - "row operations" will return row by row as soon as rows are found, while "statement operations" cuse resulting rows to be returned as a whole result set at the end of the operation, once all the rows are processed. Typical examples of "statement operations" are INTERSECT, MINUS, UNION, all kind of SORT operations, HASH JOIN, MERGE JOIN.... Typical examples of "row operations" are TABLE/INDEX SCANs, TABLE BY ROWID, FILTER, NESTED LOOP, AND-EQUAL, CONCATENATE.... So when FTS is executed (if it is a simpleFTS, without ORDER BY or being a part of HASH/MERGE JOIN), each row is returned to the user as soon as it is found, regardles of optimizer mode. But, of course, by choosing diferent optimizer mode you can dramatically influence on the decision whether FTS will be used or not.

    3. To go directly to the answer of your question:

    a) When there is no "statement operation" during the execution of the query (as FTS in your example, or INDEX_SCAN+TABLE_BY_ROWID or simmilar), as soon as single block or MULTIPLE_BLOCK_READ_COUNT set of blocks are read into the cache, they are examined for the rows that met all the conditions of the query. Those rows are returned immediately to the users. In the mean time, user proces reads another set of blocks into the cache, removing previously read blocks from the cache if necessarry. This goes on until all the table blocks are processed.
    b) When there is a "statement operation" involved in a querry (for example, if you use ORDER BY in your query), oracle must perform a sorting (if you check explain plans of queries using any kind of statement operations like MINUS, MERGE JOIN, SORT ORDER BY, SORT AGGREGATE, SORT UNIQUE etc, you'll find out there is allways a sorting operation involved). So in this case, when the block is read into the cache, the resulting rows are found and put into the sort area of the shared pool, where they are sorted. This goes on untill all blocks involved are processed. If during this process sort area is filled up with the resulted rows, they are written into the temporary tablespace segment. After all the blocks have been processed, the final sort of the temporary segment is done (if necessary) and the whole result set is returned to the user.

    So no matter how many blocks have to be read and how large the result set is, the resulting rows are returned to the user either immediately row-by-row, or at the end of the operation from the sorting area (assisted by temporary tablespace) as a complete resulting set.

    Jurij Modic
    ASCII a stupid question, get a stupid ANSI
    24 hours in a day .... 24 beer in a case .... coincidence?

  6. #16
    Join Date
    Oct 2000


    If you're using Oracle Client , then there is swapping of disk space - Oracle will write the fetched rows onto disk and when all rows are retrieved, the result is thrown on to the screen/user. I don't know how it is if you're directly on the server.
    This is my suggestion since I've seen a lot of swapping when dealing with big queries.
    Kindly let me know if I am any where near to the answer.


  7. #17
    Join Date
    Dec 2000
    Ljubljana, Slovenia
    Well, this has nothing to do with buffer cache as a part of a SGA.

    This "client swapping" depends on what kind of client application/tool you use. For example, if you use SQL*Plus you won't notice any swapping on client machine, because all the recived rows are immediately shown on the screen and screen scrolls automaticaly when new rows arive.

    But if you use some kind of tool that presents returned rows in some kind of "data grid" on the display, only first few rows are displayed at the screen, while others have to be "buffered" somewhere in the memory in case user chooses to scroll up/down the screen to see the rows that are currently not visible. If the amount of the returned rows is huge, client machine have to swap the memory to disk.
    Jurij Modic
    ASCII a stupid question, get a stupid ANSI
    24 hours in a day .... 24 beer in a case .... coincidence?

  8. #18
    Join Date
    Jun 2000
    Madrid, Spain
    I am asking this coz the other day I was doing a massive select * from a huge table and I notice that after seeing certain number of rows in sqlplus the speed of returning rows slows down considerably (guess in this part buffer was full and Oracle was reading from disk again to fill the buffer?) then speed up again (in buffer now?) :D

    Anyway I thought everyone would share this "unofficial paper" from Oracle, received from one of Oracle analyst through e-mail, as usual with these papers Oracle will not be responsible of consequences applying criteria from this paper

    Here is a "unofficial" copy of a note (104937.1)--not sure if it will be
    made external in the near future....



    This note explains the new (Oracle 8i) algorithm for managing buffers in

    the Oracle buffer cache.

    Scope and application

    This note is for INTERNAL use only and is not for general release.

    Replacement algorithm for buffer cache in 8.1.3

    The new replacement algorithm for managing buffers in the cache is no
    longer LRU based. Instead, it is a variation of the "touch count", or
    "clock frequency" scheme. It is described below assuming that all
    are linked on a single list (different from implementation, which has
    several lists, for keeping track of buffers that are absolutely useless,

    those that need to be written, etc).

    Basic description assuming all buffers on a single list

    Buffer's have a "touch count" field that keeps track of the number of
    touches (hits) a DBA has encountered while it is in the cache. Hits that

    are very "close" (within _db_aging_touch_time seconds) are counted as 1

    You can see the current number of touches per buffer by dumping the
    buffer header.

    You might alternatively query x$bh as it has been updated to show the
    touch count:

    1 row selected

    What happens when a buffer is touched (hit) in cache?

    On a hit, the following steps occur:

    - Assuming that _db_aging_touch_time seconds has passed since we last
    incremented the touch count, the touch count is increased by 1.

    - The buffer is NOT moved from its current position in the list. That is

    it stays where it is.

    - Incrementing the touch count is done without any latching activity.
    Thus, we may miss an increment to touch count occasionally as a

    So basically, we increment touch count based on the time elapsed since
    we last incremented it.

    How is a victim selected for replacement?

    If we need to read a buffer into the cache, we must first identify a
    "victim" to be replaced:

    - Victims are selected by scanning the list from the tail of the list.

    IF ( touch count of scanned buffer > _db_aging_hot_criteria ) THEN
    Give buffer another chance (do not select as a victim)
    IF (_db_aging_stay_count >= _db_aging_hot_criteria) THEN
    Halve the buffer's touch count
    Set the buffer's touch count to _db_aging_stay_count
    END IF
    Select buffer as a victim
    END IF

    Where is a new buffer placed in the list?

    Unlike LRU, where a buffer with a new DBA is always moved to the top of
    the list (except for long table scans), with this scheme a buffer is
    inserted in the "middle" of the cache. "Middle" is specified as a
    percentage of the list, and is set for the 3 buffer pools using the


    To understand this, think of the list being divided up into a "hot"
    portion (buffers above the "middle" point) and "cold" buffers (those
    that are below the middle point). Here, "hot" and "cold" are very loose
    terms since it is possible for buffers with a very high touch count to
    trickle down to the cold region, if they are not touched since their
    series of touches.

    Thus, a buffer read into the default cache will (by default) be
    in the middle of the cache (_db_percent_hot_default = 50 by default) .
    The reasoning for putting the buffer in the middle, as opposed to the
    top, is to make this buffer earn its touch count by getting a few hits
    before getting a chance to go to the top of the cache.

    Actual Implementation Details

    The actual implementation is a variation of the algorithm to account
    for maintaining buffers to be written in separate lists (regular
    writes versus ping writes versus writes due to reuse range and reuse

    Also, once buffers are written, they are usually "useless", in the sense

    that they have aged out. Such buffers are kept aside on a AUXiliary list

    rather than the main list, provided there are no waiters (or
    waiting for this buffer. If there are foregrounds that want to access
    these buffers, then they are moved to the "middle" portion of the main
    after re-setting their touch count.

    The victim selection described earlier always prefers a buffer in the
    auxiliary list over a buffer on the main list. If there are no buffers
    on the auxiliary list, then it uses the algorithm described earlier for
    the buffers on the main list.

    Also, there is some special processing for CR buffers, which allows
    buffers to be "cooled", and put at the tail of the main list after
    their temperature below the "threshold" (parameters _db_aging_freeze_cr
    and _db_aging_cool_count).

    Specific cases - full table (long) scans

    From testing it appears that blocks read by a full scan of a long table
    (see <Parameter:small_table_threshold>) will be placed at the end of the

    list. Thus, these buffers will imediately become victims for
    It was observed that during such a scan, every
    db_file_multiblock_read_count's worth of blocks were placed at the end
    of the list (LRU flag "moved_to_tail" bit was set) and were replaced by
    the next db_file_multiblock_read_count's batch of blocks.

    This effectively mimics the pre-8i behaviour for full table (long) scans

    and avoids the cache being flushed by a huge table scan.

    Specific cases - full table (short) scans and cached tables

    The buffers are not moved to the tail of the list, and thus do not
    immediate candidates for replacement.

    Summary of parameters

    Parameter name Default Description
    ------------------------------- ------- ----------------------------------
    _db_aging_hot_criteria 2 Used to decide victim selection

    _db_aging_stay_count 99 Touch count set to this if
    (low value) low value < threshold during
    victim selection

    _db_percent_hot_default 50 % Divides default cache into hot
    (middle point) and cold regions; specifies
    in the cache a new buffer is to
    be placed

    _db_percent_hot_keep 0 % Same as above, but for keep pool

    _db_percent_hot_recycle 0 % Same as above, but for recycle

    _db_aging_touch_time 3 secs Touch count not incremented if
    (small interval) the buffer is touched within
    _db_aging_touch_time seconds of
    the last touch


    thanx for the inputs

  9. #19
    Join Date
    Nov 2000
    "..was doing a massive select * from a huge table ... after seeing certain number of rows in sqlplus the speed of returning rows slows down considerably.... then speed up again"

    This kind of fluctuations is very common in real life in general(look into airplains's properler or car's tire :-)).

    Returning to your sample:

    It least likely can be caused by described algorithm: the cache strategy for a full table scan for one user has no effect. The work by Oracle on LRU or whatever lists - searching of victim, etc is negligeble - those are fast in-RAM operations.

    It can be because of interaction between disk HW, OS, Oracle and Client. Those are independant processes, running in parallel. Having in mind, that they process data in chunks (e.g Oracle read datafiles in buffers*DB_MULTIB...COUNT, piping data to client or sending via socket also use buffers, HW use buffers, possitions heads on disks onto right track), it could well happen that the performance graph is kind of oscillations.

    For example, on WinNT you can see very nice CPU ussage picture when oracle is busy with some homegenous processing, like import/delete/ect of huge table.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Click Here to Expand Forum to Full Width