DBAsupport.com Forums - Powered by vBulletin
Page 1 of 2 12 LastLast
Results 1 to 10 of 15

Thread: analyzing the tables corrects the problem.

  1. #1
    Join Date
    Oct 2008
    Location
    Chandigarh, India
    Posts
    322

    analyzing the tables corrects the problem.

    I have seen a strange a thing happening with the application. The application is designed with certain database jobs that includes analyzing the tables and indexes on regular basis. I have observed that certain tables and indexes do not get analyzed daily.
    When this happens, the application suffers hangs and some queries just do not work and end up with logging off the application user. When I manually analyze the tables and indexes using "alter table/ index ... compute statistics", then some queries start working fine and some others not.
    The whole problem gets corrected when we restart both the database and application. Then the entire system works fine for few days before the problem starts recurring again.

    Solaris 9 with Oracle 10g Release 1.

    Any comments?
    lucky

  2. #2
    Join Date
    Mar 2007
    Location
    Ft. Lauderdale, FL
    Posts
    3,555
    I think there is more than one issue in described scenario - recycling database and application servers has nothing to do with performance statistics.

    In regards to performance statistics.

    I assume when you say "alter table compute statistics" you actually mean "analyze table compute statistics" - anyway that's deprecated, you are on Ora10g then use proper dbms_stats system package to gather performance stats.

    Having said that, contrary to popular belief usually you do not need to keep gathering statistics all the time. Only exeption would be tables/indexes with very high volatility.

    A good practice is to gather stats until you get optimal performance then stop gathering stats and stick with the ones that give you consistent good performance.
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

  3. #3
    Join Date
    Oct 2008
    Location
    Chandigarh, India
    Posts
    322
    Quote Originally Posted by PAVB View Post
    I think there is more than one issue in described scenario - recycling database and application servers has nothing to do with performance statistics.

    In regards to performance statistics.

    I assume when you say "alter table compute statistics" you actually mean "analyze table compute statistics" - anyway that's deprecated, you are on Ora10g then use proper dbms_stats system package to gather performance stats.

    Having said that, contrary to popular belief usually you do not need to keep gathering statistics all the time. Only exeption would be tables/indexes with very high volatility.

    A good practice is to gather stats until you get optimal performance then stop gathering stats and stick with the ones that give you consistent good performance.
    I appreciate your suggestions. But can you comment on why application and database restart solves it for few days? I am not well aware about Solaris. I think that I need to look into some OS problem caused by Oracle.
    lucky

  4. #4
    Join Date
    Sep 2002
    Location
    England
    Posts
    7,334
    you need to look at your stats gathering procedure and see what you are doing wrong

    you have too many moving parts, make it simpler and therefore easier to to track down

    the easiest thing you can do is trace the sesins to see what is going on

  5. #5
    Join Date
    Mar 2007
    Location
    Ft. Lauderdale, FL
    Posts
    3,555
    Quote Originally Posted by mahajanakhil198 View Post
    But can you comment on why application and database restart solves it for few days?
    I bet that by shutdown/startup affected instance - without recyling the actual host - problems "gets solved" for a few days - is that correct?

    If that's the case I'll suggest to run statspack or awr and look at what is changing over time... shared_pool is always a good initial suspect for this kind of behavior.
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

  6. #6
    Join Date
    Nov 2002
    Location
    Mooresville, NC
    Posts
    349
    Did you check top wait events. Is latch wait in top wait event? If so, after restating Oracle removing all fragmented memory and so query seems to work faster.

    BTW use only dbms_stats.gather_stats for gathering stats and table indexes do not analyze daily, instead it's a increment job, if some percentage of records got modified/added then only that table/index will be analyzed.

    Keep us posted with your findings.
    http://www.perf-engg.com
    A performance engineering forum

  7. #7
    Join Date
    Oct 2008
    Location
    Chandigarh, India
    Posts
    322
    Thanks for your feedback.

    I found a particular query taking so long. The query was found after analyzing the application logs that pointed to a query that takes so long to execute and eventually logs out the application user without actually executing the query.

    I executed the query on standalone SQLPLUS client using autotrace on. Please see the attached file named 'problematic_query.txt'.

    I have also taken the awr report for the affected time period. The same can be verified as the problematic query's SQL Module column in SQL ordered by elapsed time is 'SQL*Plus'.

    I have also attached the AWR report in the zipped format.
    Attached Files Attached Files
    lucky

  8. #8
    Join Date
    Oct 2008
    Location
    Chandigarh, India
    Posts
    322
    Quote Originally Posted by PAVB View Post
    I bet that by shutdown/startup affected instance - without recyling the actual host - problems "gets solved" for a few days - is that correct?

    If that's the case I'll suggest to run statspack or awr and look at what is changing over time... shared_pool is always a good initial suspect for this kind of behavior.
    Yes, you are correct. Restarting solves the problem. But shared pool is good enough in terms of hit ratio. SGA_TARGET is on.

    Please see my previous posting and have a look at AWR report. May be, you can bring me out of this situation. One more thing is that I cannot tune the SQL statement except rebuilding the indexes as the application development was stopped three years back and we are only supporting the application.

    Pablo, I would appreciate your suggestions on this.
    lucky

  9. #9
    Join Date
    Nov 2002
    Location
    Mooresville, NC
    Posts
    349
    First i would advise AWR report of max 30 minutes duration not a 3 hour.
    How many CPU cores u got there?

    Second, the top SQL you picked is correct, however if you see "Segments by Physical Reads" and "Segments by Logical Reads" the top contender name not at all in top SQL report. How did this hapen, i beleive that the SQL not finished by this 3 hour period and that's why that SQL not captured?

    Can you find top SQL may be from v$longops or some where where table_name will be MAM_ASSET_ATTR_STRING_VALUES and MAM_ASSET_ATTR_DOMAIN_VALUES, you see they are the top contenders?

    Last but not the least "Buffer Pool Advisory" says if Buffer cache can be increased from 3,840 M to 4,992 M then physical reads can be reduced from 17,217,394 to 477,260. Since your top wait event is DB file sequential reda this should benefit you.

    Can you also attach CPU usage info during AWR report gathered period, i suspect there would be high IO usage during that period.

    What is your disk storage, is it SAN?
    http://www.perf-engg.com
    A performance engineering forum

  10. #10
    Join Date
    Oct 2008
    Location
    Chandigarh, India
    Posts
    322
    1. This 3 hour duration of AWR is non peak hour duration. The query itself takes more than 30 minutes to execute. For 45 minute duration, I will have to take remote connectivity of client again. I will check if it is possible.

    2. No, I took AWR report after the query finished execution in SQLPLUS.

    3. The query I have mentioned in attached file 'problematic_query.txt' already refers to tables MAM_ASSET_ATTR_STRING_VALUES and MAM_ASSET_ATTR_DOMAIN_VALUES.

    4. Yes, I will see if it is possible to increase SGA_TARGET.

    5. CPU cores, CPU usage and SAN related is unknow as of now. I will try to capture it. But it won't be of much help as same query was running good few days back and it has been working fine for last 5 years. I know that the data growth patterns have to be taken into account but still it works fine after restart.

    6. I don't know much about latches. Did you find anything about latches from AWR that may alleviate the problem?
    lucky

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width