RAW vs Cooked file system
DBAsupport.com Forums - Powered by vBulletin

Thread: RAW vs Cooked file system

  1. #1
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135

    RAW vs Cooked file system

    Austin, I don't see the thread you posted. Hence, I created one and this is the first thread I opened.

    ======== original Message from Austin ==================

    Mark, that link you sent was interesting.

    I believe I'm correct in saying that db_file_sequential_read signifies a wait for an IO request to complete. Am I correct, then, in saying that the increase in db_file_sequential_read is introduced because of the increased IO throughput you get from async IO? In other words, is it a symptom of an IO performance improvement and not a degradation?
    Thanks

    Austin

    Hello Mark

    Thanks again for your input.

    I do realise that db_file_sequential_read signifies a wait for a single block IO and when this event represents a large proportion of a query's resource profile the most appropriate action is to reduce the number of database calls (e.g. logical IOs). Only when the number of LIOs required has been truly reduced to a minimum should we consider making changes to the IO subsystem if a performance problem still exists.

    Where my confusion comes in is that the optimal configuration in the paper also exhibits the longest db_file_sequential_read durations. This is despite the SQL executed during the test being identical to the SQL executed when testing the non-optimal configurations. This has led me to think that the increased throughput afforded by async IO has led to busier disks and longer db_file_sequential_read durations. Do you think this assumption is reasonable?
    ======================

    Your assumption that db_file_sequential_read signifies a wait for a single-block IO request to complete is correct.
    However, what you said next, "increase in db_file_sequential_read is introduced because of the increased IO throughput you get from async IO", may not be true. db_file_sequential_read waits increase when a block is not available in the SGA, so the user process has to read the block from disk and put it in the SGA. When many user processes read blocks from disk and the system has few disks and few controllers, waits occur. The increase in disk IO throughput depends on the controller throughput, the disk throughput, the IO bus speed, etc.

    The last assumption, that "increased throughput afforded by async IO has led to busier disks and longer db_file_sequential_read durations", is also not correct. If the system is well configured (I leave the definition of well configured to you), you will see shorter db_file_sequential_read durations.

    I have run many Oracle benchmarks using RAW, LVM and JFS files, with 2 disk controllers as well as 4 disk controllers. In all cases RAW devices outperformed JFS file systems.

    Tamil

  2. #2
    Join Date
    Feb 2003
    Location
    Leeds, UK
    Posts
    367
    Tamil

    Many thanks for your response.

    "If the system is well configured (I leave the definition of well configured to you), you will see shorter db_file_sequential_read durations."
    This is what I had always thought too. However, I began to question my beliefs when the paper from HP stated that async IO with raw is always the optimal choice, yet in their own benchmarks db_file_sequential_read was at its greatest in that very configuration. It is this apparent contradiction that I am trying to understand.

    My reason for obsessing over this matter is that the db I've inherited (8.1.7.4 on HPUX 11i 64bit using raw logical volumes) is not utilising async IO (the driver isn't in the kernel). Now the paper would seem to imply that if I configure async IO I'll immediately benefit from improved IO performance. Given, however, that db_file_sequential_read represents the most significant response time component for any query that I've ever looked at on the system I'm worried that async IO will actually degrade performance.

    I fully appreciate that reducing the number of LIOs required by a query should also reduce the total amount of time spent waiting on db_file_sequential_read (take care of the logical IOs and the physical ones will take care of themselves, as Tom Kyte would say), but this isn't relevant when I'm considering simply enabling async IO for the database.

    I have to admit that I hadn't fully considered your point about db_file_sequential_read being related to buffer cache misses when I've been trying to understand all of this. Of course the buffer cache was the same size when testing each config (raw with async, raw without async, cooked with various vxfs mount options), but there could be some factor that led to fewer blocks being read from cache during the benchmarking of raw with async IO.

    It's a shame that the authors at HP didn't include their email addresses at the end of the paper, otherwise I'd get in touch and ask them what they thought had reduced cache hits. I do find it curious that they included the statistic when it implies a reduced response time though.

    Of course the best course of action would be to test the effect of async io in my environment. Unfortunately, however, I don't really have the ability to simulate a production load in our test environment.

  3. #3
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    ========
    My reason for obsessing over this matter is that the db I've inherited (8.1.7.4 on HPUX 11i 64bit using raw logical volumes) is not utilising async IO (the driver isn't in the kernel).
    Given, however, that db_file_sequential_read represents the most significant response time component for any query that I've ever looked at on the system I'm worried that async IO will actually degrade performance
    ===

    You only get the best from RAW when the POSIX AIO implementation is configured in HP-UX. However, since AIO is not configured on your system, you can run some tests by increasing the number of DBWR I/O slaves (DBWR_IO_SLAVES).

    ========
    .........but there could be some factor that led to fewer blocks being read from cache during the benchmarking of raw with async IO.
    =====
    The best achievable throughput is the lower of the disk controller throughput and the disk throughput. What is the maximum number of physical reads that the disk subsystem achieved?
    Today's disk subsystem technology has improved a lot. Even with raw devices, EMC disk arrays can be configured in such a way that any controller can access any device, and I have seen true load balancing nowadays.
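
The point that achievable throughput is bounded by the slower of the controllers and the disks can be sketched as follows. This is only a minimal illustration: the function name and every figure (IOs per second per controller and per disk) are hypothetical, and it assumes the load is spread evenly across all controllers and disks.

```python
def max_io_throughput(controllers, per_controller_iops, disks, per_disk_iops):
    """Upper bound on IOs/s: the smaller of the combined controller
    capacity and the combined disk capacity (even load assumed)."""
    controller_limit = controllers * per_controller_iops
    disk_limit = disks * per_disk_iops
    return min(controller_limit, disk_limit)


# Hypothetical figures, not taken from any benchmark.
print(max_io_throughput(2, 5000, 40, 150))   # disks are the limit here
print(max_io_throughput(4, 5000, 100, 150))  # disks are still the limit
```

With numbers like these, adding controllers alone would not raise the bound, because the disks saturate first.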

    Tamil

  4. #4
    Join Date
    Feb 2003
    Location
    Leeds, UK
    Posts
    367
    Hi Tamil

    "The best achievable throughput is the lower of the disk controller throughput and the disk throughput. What is the maximum number of physical reads that the disk subsystem achieved?"
    According to the paper they "used Oracle and HP performance tools to collect statistics during the entire run for each test case. (They) compared the different run results using the application throughput measured in number of transactions per minute, the log_file_parallel_write, db_file_sequential_read, IO throughput and CPU utilization". The results for raw with and without async were:

    Raw without async IO:

                                 Medium workload   Heavy workload
    TRANSACTION THROUGHPUT       23135             28168
    IO THROUGHPUT                3316              3898
    LOG_FILE_PARALLEL_WRITE      9 ms              9 ms
    DB_FILE_SEQUENTIAL_READ      9 ms              19 ms
    CPU UTILIZATION              80%               88%

    Raw with async IO:

                                 Medium workload   Heavy workload
    TRANSACTION THROUGHPUT       31961             35367
    IO THROUGHPUT                4877              5392
    LOG_FILE_PARALLEL_WRITE      4 ms              8 ms
    DB_FILE_SEQUENTIAL_READ      12 ms             26 ms
    CPU UTILIZATION              79%               91%

    So no actual stats for reads alone unfortunately

    [edited to tidy formatting]
    Last edited by hacketta1; 12-30-2004 at 06:40 AM.

  5. #5
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    Can you post the link?

    Tamil

  6. #6
    Join Date
    Feb 2003
    Location
    Leeds, UK
    Posts
    367
    Sorry, I didn't realise you hadn't seen the paper. I should have worked that out given the link I posted at the start of the thread wasn't working.

    http://www.oracle.com/technology/dep...e_HP_files.pdf

  7. #7
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    *********** from the white paper **********
    The Oracle Stripe And Mirror Everything (SAME) methodology was used for the database layout for all the tests.
    ***********

    The white paper did not talk about IO response time, which is comprised of waiting time and service time. To optimize I/O performance you must eliminate waiting time and minimize service time.

    First of all, SAME is not the best. SAME can eliminate waiting time but will not minimize service time. SAME has been growing in popularity not for its technical merits, which are few, but because it requires very little DBA skill.
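
The split of I/O response time into waiting time and service time can be illustrated with a simple single-server queueing model (M/M/1). The model is an assumption made here for illustration only: neither the white paper nor this thread uses it, and a real disk array behaves differently. The 5 ms service time and the request rates are hypothetical.

```python
def response_time_ms(service_ms, arrivals_per_sec):
    """Return (utilization, wait_ms, response_ms) for an M/M/1 queue.

    response = service / (1 - utilization); wait = response - service.
    """
    utilization = arrivals_per_sec * service_ms / 1000.0
    if utilization >= 1.0:
        raise ValueError("device is saturated; the queue grows without bound")
    response_ms = service_ms / (1.0 - utilization)
    return utilization, response_ms - service_ms, response_ms


# Hypothetical device with a fixed 5 ms service time under rising load.
for rate in (50, 100, 150):
    busy, wait, resp = response_time_ms(5.0, rate)
    print(f"{rate} IO/s: busy {busy:.0%}, wait {wait:.1f} ms, response {resp:.1f} ms")
```

In this model the service time stays fixed while the waiting time grows as the device gets busier, which is why the two components need to be looked at separately.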

    *********** from the white paper **********
    10 1Gb Fibre Channel Host Bus Adapters
    ***************
    My question is: could any disk be accessed by any controller (HBA)? From the white paper I don't see it.

    *****
    Raw Device based database with multiple DBWR

    Without AIO
                                 Medium Workload   Heavy Workload
    Transaction Throughput       23135             28168
    IO throughput                3316              3898
    Log_file_parallel_write      9 ms              9 ms
    DB_FILE_sequential_read      9 ms              19 ms
    CPU utilization              80 %              88 %

    With AIO
                                 Medium Workload   Heavy Workload
    Transaction Throughput       31961             35367
    IO throughput                4877              5392
    Log_file_parallel_write      4 ms              8 ms
    DB_FILE_sequential_read      12 ms             26 ms
    CPU utilization              79 %              91 %

    In the medium workload, the increase in transaction throughput with AIO (38 % = (31961 - 23135) / 23135 * 100) was achieved at the cost of about 3 ms more read wait time (12 ms - 9 ms, or 33 %).
    I have a feeling that the increase in db_file_sequential_read wait time was mainly due to the SAME methodology, because SAME would not minimize service time.
    An alternative to SAME is to configure multiple stripe sets of just a few disks each. If HP had done the same tests without SAME, the result would have been altogether different.
    Tamil
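
The percentage arithmetic above can be reproduced for both workloads with a short script. The figures are copied from the white paper tables quoted in this thread; the dictionary layout and the helper name `pct_change` are only illustrative.

```python
# (medium, heavy) values from the tables quoted in this thread.
without_aio = {
    "transaction throughput": (23135, 28168),
    "io throughput": (3316, 3898),
    "db_file_sequential_read (ms)": (9, 19),
}
with_aio = {
    "transaction throughput": (31961, 35367),
    "io throughput": (4877, 5392),
    "db_file_sequential_read (ms)": (12, 26),
}


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


for metric in without_aio:
    for i, workload in enumerate(("medium", "heavy")):
        change = pct_change(without_aio[metric][i], with_aio[metric][i])
        print(f"{metric:30s} {workload:6s} {change:+.1f} %")
```

The medium-workload rows give the roughly 38 % and 33 % quoted above; the heavy-workload rows use the same formula.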

  8. #8
    Join Date
    Feb 2003
    Location
    Leeds, UK
    Posts
    367
    Tamil

    Thanks for taking the time out to help me with this.

    "In the medium work load the increase in transaction throughput (38 % = (31961-23135)/23135*100) with AIO was achieved by spending more time, about 3 ms (12 ms - 9 ms) (or 33 %)."
    So in summary, the increase in transaction throughput is related to increased service times and therefore to longer db_file_sequential_read waits.
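
One way to see how throughput and per-IO latency can rise together is Little's law (throughput = IOs in flight / latency). The sketch below uses made-up numbers, not figures from the white paper, and it is not an explanation offered by anyone in this thread; the function name is hypothetical.

```python
def throughput_per_sec(outstanding_ios, latency_ms):
    """Little's law: throughput = concurrency / latency."""
    return outstanding_ios / (latency_ms / 1000.0)


# Hypothetical: fewer IOs in flight at lower latency versus
# more IOs in flight at higher latency.
fewer_in_flight = throughput_per_sec(outstanding_ios=30, latency_ms=9)
more_in_flight = throughput_per_sec(outstanding_ios=60, latency_ms=12)

print(f"30 in flight at  9 ms: {fewer_in_flight:.0f} IO/s")
print(f"60 in flight at 12 ms: {more_in_flight:.0f} IO/s")
```

If more requests are kept in flight, total throughput can go up even though each individual read takes longer, so a longer single-block read time does not by itself mean lower overall performance.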
