When a log switch occurs in Oracle 9i, DBWn writes the contents of the database buffer cache to the data files.
Is both uncommitted and committed data written to the data files at this time, or is only committed data written to the data files on checkpoints?
All dirty blocks are written to disk.
This includes blocks that have changes for uncommitted data.
Your dirty buffers contain both committed and uncommitted data, so DBWR writes both types of data (committed and uncommitted) to the data files.
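A quick way to see this (a sketch; `V$BH` needs appropriate privileges and the counts will vary per system) is that the buffer cache only tracks whether a buffer is dirty, not whether its changes are committed:

```sql
-- V$BH exposes a DIRTY flag per cached buffer; there is no
-- committed/uncommitted flag -- commit state lives in undo, not the cache.
SELECT dirty, COUNT(*)
FROM   v$bh
GROUP  BY dirty;
```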
Regards,
Seelan
When a log switch occurs, NOT ALL dirty buffers (committed and uncommitted) are written to data files by the DBWn processes.
When a log switch occurs, a SLOW checkpoint is triggered.
There is a vast difference between a SLOW checkpoint and a FAST checkpoint, which is triggered by "ALTER SYSTEM CHECKPOINT", "ALTER TABLESPACE ... BEGIN BACKUP", or "ALTER TABLESPACE ... OFFLINE".
SLOW CHECKPOINT:
If Oracle is doing a SLOW checkpoint, the DBWR process stops processing the checkpoint when one of the following two conditions occurs:
1. If the threshold size of the DB_CHECKPOINT_BATCH_SIZE (number of buffers) is reached
2. When over 1000 buffers are scanned and a dirty buffer can't be found to write to disk.
The idea is to put less stress on the CPU and I/O subsystem.
FAST CHECKPOINT:
When Oracle is doing a fast checkpoint, DBWR keeps scanning all the buffers in the cache and writes the dirty ones to disk. DBWR will not stop until it has scanned every buffer, which puts much more overhead on the CPU and I/O subsystem.
Before you start any backup, or abort the instance under extreme circumstances, you should always do a FAST CHECKPOINT.
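As a sketch, the commands that trigger a fast checkpoint look like this (the tablespace name `users` is only an example):

```sql
-- Explicit full checkpoint, e.g. before aborting the instance:
ALTER SYSTEM CHECKPOINT;

-- Implicit full checkpoint of a tablespace's datafiles at backup start:
ALTER TABLESPACE users BEGIN BACKUP;
-- ... copy the datafiles at the OS level ...
ALTER TABLESPACE users END BACKUP;
```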
A brief idea about checkpoints follows. See Note 147468.1 on Metalink for a detailed explanation.
A Checkpoint is a database event, which synchronizes the data blocks in memory
with the datafiles on disk. A checkpoint has two purposes:
(1) to establish data consistency, and
(2) to enable faster database recovery.
When a checkpoint fails, messages are written to the alert.log file and must be checked.
Here are some tips to tune the checkpoint process:
· The CKPT process can improve performance significantly and decrease the
amount of time users have to wait for a checkpoint operation to complete.
· If the value of LOG_CHECKPOINT_INTERVAL is larger than the size of the redo
log, then the checkpoint will only occur when Oracle performs a log switch
from one group to another, which is preferred. There has been a change in
this behaviour in Oracle 8i.
· The LOG_CHECKPOINTS_TO_ALERT when set to TRUE allows you to log checkpoint
start and stop times in the alert log. This is very helpful in determining
if checkpoints are occurring at the optimal frequency
· Ideally, checkpoints should occur only at log switches.
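A minimal sketch of the settings mentioned in these tips (the interval value is only illustrative; pick one larger than your redo log size in OS blocks):

```sql
-- Record checkpoint start/end times in the alert log:
ALTER SYSTEM SET log_checkpoints_to_alert = TRUE;

-- Push interval-driven checkpoints beyond the log size so that, in
-- effect, checkpoints happen only at log switches:
ALTER SYSTEM SET log_checkpoint_interval = 999999999;
```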
Hi Tamil,
This is the first time I've heard about SLOW CHECKPOINTING.
Can you give me links to documents explaining about slow checkpoint?
Thanks.
Who else can provide better info other than Rama Velpuri, Backup & Recovery Guru?
Tamil,
That was a very interesting explanation, but it seems I missed something.
1) On a log switch, a checkpoint is done in order to write all the dirty blocks covered by the just-filled redo log to disk, so that the redo log is no longer needed for instance recovery and can be reused. Am I correct here?
2) You say:
"When a log switch occurs, a SLOW checkpoint is triggered.
.........
If Oracle is doing a SLOW checkpoint, the DBWR process stops to process the checkpoint when one of the two following conditions occurs:
1. If the threshold size of the DB_CHECKPOINT_BATCH_SIZE (number of buffers) is reached
2. When over 1000 buffers are scanned and a dirty buffer can't be found to write to disk."
Under these circumstances, how do we guarantee that we no longer need the last redo log and that it can be reused (which is the purpose of the checkpoint at log switch)?
So yes, I would agree that not all the dirty blocks are written. I do not claim to know exactly how that happens, but I believe that some process (LGWR or CKPT) makes a list of all the dirty buffers covered by the redo vectors of the last log file and triggers DBWR to write those blocks.
Please let's discuss on that
Thanks
Boris
Yes, you are partially right. In general, after a checkpoint, the redo in the redo log files is no longer needed for crash/instance recovery. However, the redo logs may still be needed for instance recovery if a single transaction's redo size is greater than the redo log file size. Quote:
That was very interesting explanation but I seems missed something
1) On log switch, a checkpoint is done in order to write all the dirty blocks, covered by the just filled up redo log to the disk, so that the redo log is no more needed for instance recovery and can be reused. I am correct here right?
For example: you update millions of rows in a table, and then, before the commit/rollback, the system crashes, even though several log switches occurred during the update.
Another example is when you clone the database, the recovery process may ask you to enter redo log files which are older than the current redo log.
The only way I can see is to do a manual checkpoint and then switch log files. Quote:
2) You say:
"When a log switch occurs, a SLOW checkpoint is triggered.
.........
If Oracle is doing a SLOW checkpoint, the DBWR process stops to process the checkpoint when one of the two following conditions occurs:
1. If the threshold size of the DB_CHECKPOINT_BATCH_SIZE (number of buffers) is reached
2. When over 1000 buffers are scanned and a dirty buffer can't be found to write to disk."
Under these circumstences,how we guarantee that we do not need the last redo log anymore and it can be reused (which is the purpose of the checkpoint at log switch)?
You can experiment with a test case.Quote:
So yes, I would agree that not all the durty blocks are written. Also I do not claim that I know how exactly that happens, but I belive that some process ( LGWR or CKPT) makes list of all the dirty buffers, covered by redo vectors of the last log file, and triggers the DBWR to write these blocks.
Update millions of rows.
Do log switch.
Shutdown abort
Start up the instance. Note down the recovery time.
Do the same exercise with "alter system checkpoint" first, and you will see a vast difference in recovery time.
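A sketch of that experiment (`big_table` and its `padding` column are hypothetical; the timings are what you would measure, not guaranteed numbers):

```sql
-- Generate a lot of uncommitted redo:
UPDATE big_table SET padding = padding;   -- millions of rows, no COMMIT

-- Case A: log switch only (the checkpoint merely starts), then crash:
ALTER SYSTEM SWITCH LOGFILE;
SHUTDOWN ABORT
STARTUP      -- note the crash-recovery time in the alert log

-- Case B: same update, but a synchronous checkpoint first, then crash:
ALTER SYSTEM CHECKPOINT;   -- returns only once the checkpoint completes
SHUTDOWN ABORT
STARTUP      -- recovery should now be much faster
```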
Doing a checkpoint manually through "alter system checkpoint" and issuing a shutdown abort afterwards is basically the same as issuing shutdown immediate.
So there is no need to perform crash/instance recovery, because all datafiles were in sync, unless there were pending/ongoing processes executing during those commands that generated new redo blocks.
I disagree. Oracle will still have to perform crash recovery, because the database was not closed in a consistent state. The checkpoint means that only a fairly small amount of redo will have to be applied, but it still has to be recovered. Oracle background activities can generate redo, so redo is generated even if there are no users in the database. Quote:
Originally Posted by reydp
So, Tamil,
You are telling me that if I issue a VEEEERY long update, and suppose I have just two VEEERY SMALL redo log groups, I could hypothetically end up in a situation where the update fills up the first log, then fills up the second one (the first is checkpointed at that time), and then the first gets reused.
And the reused redo is still needed for instance recovery?
So if my instance now fails, I will lose data?
Sorry, but I do not believe that. The checkpoint has nothing to do with the size of the transactions, or with the transactions at all. Nothing is wrong with checkpointing the file and writing the dirty buffers of an uncommitted transaction to disk, so that the redo is no longer needed.
About the experiment, yes, the effect will be as you said, but the reason is quite different.
When you do 'alter system checkpoint', this is a SYNCHRONOUS command. That means the system starts checkpointing aggressively, and you do not get control back until the checkpoint finishes. So when you then issue shutdown abort, you really do not need an instance recovery.
'Alter system switch logfile', on the other hand, switches the log file and therefore only STARTS the checkpoint and returns control to you. So when you "shutdown abort", the checkpoint has not yet finished, and hence the redo log is still needed for instance recovery.
Here is the proof of what I say:
------------------------
1* select group#,status from v$log
GROUP# STATUS
---------- ----------------
1 CURRENT
2 ACTIVE
3 ACTIVE
SQL> alter system switch logfile
2 /
System altered.
SQL> select group#,status from v$log
2 /
GROUP# STATUS
---------- ----------------
1 ACTIVE
2 CURRENT
3 ACTIVE
SQL> alter system checkpoint;
System altered.
SQL> select group#,status from v$log
2 /
GROUP# STATUS
---------- ----------------
1 INACTIVE
2 CURRENT
3 INACTIVE
SQL>
----------------------------------------------------
As you can see, "alter system switch logfile" leaves the previously CURRENT logfile in ACTIVE status, which means it is currently being checkpointed.
"Alter system checkpoint", on the other hand, leaves all the non-current log files INACTIVE, which means the checkpoint has completed and the files are no longer needed for instance recovery.
Regards
Boris
Tamil, I think you are confusing transaction recovery with instance recovery. Quote:
Originally Posted by tamilselvan
Redo logs are not required for transaction recovery. At instance recovery, the undo tablespace is recovered like any other, and any active transactions are rolled back. This has nothing to do with checkpointing.
Online redo logs are the only thing needed to perform instance recovery, or what some are referring to as crash recovery.
Checkpoints may be either incremental or full. In either case, Oracle records the low cache RBA (redo block address) on disk in the control file. This is the position inside the redo log from which instance recovery must start. See x$kcccp.
An active log is needed for instance recovery. You cannot switch into an active log.
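Two hedged ways to look at this (x$ tables are visible only to SYS and their column layout varies by version, so treat the first query as exploratory):

```sql
-- As SYS: the raw checkpoint progress record that holds the low cache RBA
SELECT * FROM x$kcccp;

-- A supported view of the same idea: how much redo would have to be
-- applied if the instance crashed right now
SELECT actual_redo_blks, target_redo_blks, estimated_mttr
FROM   v$instance_recovery;
```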
When you do a FAST CHECKPOINT, all dirty buffers at or below the current SCN will be written to the data files, regardless of whether the data is committed or uncommitted. At that point, the only thing needed for instance recovery is the uncommitted data, which is actually transaction recovery (rolling back) to bring the database to a consistent state. The SCN information needed for this transaction recovery is also recorded in the redo log. Quote:
Originally Posted by jhmartin
That's why I mentioned in my previous post:
unless there were pending/ongoing process executed during those commands that will generate new redo blocks
This means pending(NOT commited/NOT rolled back) or ongoing process.
Or let me clear it PENDING transaction.
I hope this clears out.
If the redo size of a single transaction is greater than the size of one or a few redo logs, but not the total size of all redo log groups, then I believe ACTIVE (not just CURRENT) redo logs will be needed for transaction recovery during instance recovery. Quote:
Originally Posted by tamilselvan
If the redo size of a single transaction is greater than the total size of all the redo log groups, then most probably you will hit the "Checkpoint not complete" message. So I don't think several log switches could occur.
Guys,
Again, the checkpoint has nothing to do with the size of the transaction, and neither does the log switch. An "active" redo log means the redo log is currently being checkpointed and the checkpoint has not yet completed, which, yes, means it is still needed for instance recovery; however, this has no connection to the transaction size.
"If in case the redo size of a single transaction is greater than the total size of all the redo logs groups then most probable you will bump the "Checkpoint not complete" error."
That's simply not true. "Checkpoint not complete" is signalled when you are trying to reuse a log file whose checkpoint has not yet finished; you cannot reuse it, so the server process has to wait until the checkpoint completes and the file is ready for reuse. This generally appears when redo generation is faster than checkpointing.
I am really amazed that such a fundamental mechanism can cause such heated discussions.
Regards
Boris
Good discussion....
Isn't undo used for recovery as well? Redo is played back, and then undo is applied after that...Quote:
Originally Posted by ixion
correct Tomcat,
After the redo application, the instance is opened, and SMON takes care of rolling back all the transactions that were open at the time of failure.
That's called the rollback phase of instance recovery.
Bore,
From oracle manual:
V$LOG STATUS column = ACTIVE means:
Log is active but is not the current log. It is needed for crash recovery. It may be in use for block recovery. It may or may not be archived.
The whole point of discussion is: Does oracle need the redo log file for crash recovery "after log switch occurs" because checkpoint is done?
My answer is yes.
Well,
I would say the discussion is: does the instance need the active log for instance recovery because the checkpoint has not yet finished? My answer to that is yes as well.
But as I understood it, the discussion was about whether the checkpoint somehow depends on the size of the transaction, how exactly the checkpoint is performed, etc., hence my comments on that.
So... basically we agree upon the mechanism
Regards
Boris
Hi Boris,Quote:
Originally Posted by Bore
First, this is not a heated discussion; here it is normal and healthy to exchange ideas, sometimes with passion. But at the end of the day, everybody listens and expresses ideas with a positive outcome in mind, and that's the continued process of learning, regardless of the knowledge level each one has.
Sometimes one can make a mistake just to emphasize a point, and others may take the opportunity to attack that mistake. It happens here most of the time. If you had been a member of this forum 4 or 5 years ago, you would be amazed at how the brilliant people showcased their knowledge on different topics, and Tamil is just one of them.
Rey,
First, I absolutely agree that such discussions are healthy and lead to knowledge for all of us, which is actually the purpose of the forum.
2) My great respect to Tamil and his knowledge of parallelism, query performance, partitioning, and many other topics. If this post somehow gives the opposite impression, my apologies.
3) What I am surprised at is that I did not get any support for what I am saying from any of the really good professionals writing here (except ixion), nor did I get an argued post against what I say (in which case I would accept it, as I have done before when I have posted wrong answers).
4) I really regret that I did not know about this forum 5 years ago, because I enjoy my stay here and the discussions with you.
Regards
Boris
Boris,
I'm not saying that you are consistently right all along. For all I know Tamil might be explaining it correctly, it could be just your/mine/other interpretation that leads to conclusion.
---------------
What I am surorised of is, that I did not got any support in what I am saying form any of the really good professionals wriring here (except ixion) nor got I an argumented post against what I say ( in wihch case I would accept that as I have done before, when I have posted wrong answers)
------
I really regret that I did not knew abou this forum 5 years ago, because I enjoy my stay here and the discussions with you
---------------
You are on the right track to waking the interest of these guys: discuss topics that are really interesting. And that also goes for all the other members of this forum.
Be prepared first before raising questions; exhaust all effort before asking. Don't just rely on an instant answer, and please appreciate and recognize it when somebody helps you. It is really off-putting when somebody asks for the answer to a problem and can't even say "thank you" when the problem is resolved.
I mean, when these guys started to talk, I became just one of the audience.
I haven't seen this type of atmosphere here lately, but I am hoping that it will come back.
Bore and Reydp,
In my opinion, both of you are good.
I always welcome healthy criticism, comments, and views. I may also make mistakes, because the Oracle RDBMS becomes bigger and bigger with every release.
Sometimes we may have to unlearn what we learned earlier.
Thanks.
maybe an information overload dear :D
I read the whole thread and it is very interesting. I learned about slow checkpointing from this thread. Thank you all for exchanging useful information.
I have one question for Tamil. You mentioned the INIT parameter DB_CHECKPOINT_BATCH_SIZE. I tried to set this parameter, but it is not accepted.
Here is the oracle version i am using.
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
It's not a parameter in 9i; that's why.
Quote:
Originally Posted by tamilselvan
Does this mean there is no situation in which DBWn writes uncommitted data to disk? Because if, during a checkpoint, only buffers related to the redo log files are written to disk, and redo log files always hold committed data (LGWR writes data from the redo log buffer to the redo log file on commit), then there is no case of writing uncommitted data to disk. Or what?
Great discussion !!!
I don't think that is what he is saying at all;Quote:
Originally Posted by janki_mehta
DBWn will write to disk dirty blocks during a checkpoint. A dirty block is one that has been modified in memory and is not currently on disk. This includes uncommitted data.
Redo log files hold change vectors for the modified blocks. They do not particularly distinguish between committed and uncommitted data, only changes to blocks, and this includes not only your table data blocks but also undo blocks. (That's how recovery can rebuild the undo tablespace and perform transaction rollback.)
The commit command guarantees that what your transaction has posted to the log buffer is on disk. The responsibility for the write operation is LGWR's. The commit command will not return until LGWR has flushed the log buffer to disk.
LGWR at the time of your commit will also write any other changes that are in the log buffers. These changes can be from other users and may or may not be completed transactions (ie committed).
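A small sketch of watching this on an otherwise idle instance (the table name `t` is hypothetical; the statistic names are as they appear in V$SYSSTAT):

```sql
-- Snapshot before:
SELECT name, value FROM v$sysstat
WHERE  name IN ('redo synch writes', 'redo writes');

INSERT INTO t VALUES ('x');
COMMIT;   -- does not return until LGWR has flushed the log buffer

-- Snapshot after: 'redo synch writes' should have gone up, showing
-- that a session waited on LGWR at commit time.
SELECT name, value FROM v$sysstat
WHERE  name IN ('redo synch writes', 'redo writes');
```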
HTH
-------------------------------------------------Quote:
Originally Posted by janki_mehta
Does this mean that there is no situation
when DBWn writes uncommitted data to disk..
****wrong assumption
coz if during checkpoint, if only bufferes related
to redo log files are return to disk...
*****wrong assumption
then since redo log files always hold committed data
*****wrong assumption
(LGWR writes data from redo log buffer to redo log file on commit)
-- so there is no case of writing uncommitted data to disk or what?????????
*****wrong conclusion
-----------------------------------------------
First, nice explanation, ixion.
Janki_mehta,
Just to clarify your understanding about redo logs switch, checkpoints, DBWn, LGWR. I am open to any opinion, ideas.
I think Tamil also missed one point about the reason why a REDO LOG SWITCH occurs:
that is, when an ONLINE REDO LOG is full and the database has to advance to the next log sequence.
When this happens, a SLOW CHECKPOINT (others call it a SOFT CHECKPOINT) happens implicitly.
The other type of SLOW CHECKPOINT happens as Tamil explained earlier,
which I believe will only happen if you have not configured your db buffer cache properly,
or you have such a huge REDO LOG size that a SLOW CHECKPOINT must happen even before a REDO LOG becomes full.
So under normal operation, where you have a good db buffer cache configuration and an adequate REDO LOG size,
the SLOW CHECKPOINT that Tamil discussed is less likely to happen.
I'll show you why DBWn also writes uncommitted data to disk (the datafiles).
It also follows that during a checkpoint, not only are redo buffers written to the redo log files,
but dirty data blocks are also written to the datafiles.
I'll also show you why uncommitted data is written to the redo logs.
Example.
In a freshly opened DB with no other users, we are going to do a series of inserts into a single table,
leave it uncommitted, then issue a shutdown abort. Then let's see what happens when we start the database.
Walk yourself through the explanation via the item# notes.
item#1 - a newly opened db without transaction - ONLINE REDO LOG info
SQL> l
1 SELECT A.GROUP#, A.MEMBER, B.SEQUENCE#, B.BYTES, B.STATUS
2 FROM V$LOGFILE A, V$LOG B
3* WHERE A.GROUP# = B.GROUP#
SQL> /
GROUP# MEMBER SEQUENCE# BYTES STATUS
---------- ------------------------------------ ---------- ---------- ----------------
1 C:\ORACLE\ORADATA\HRISESM\REDO01.LOG 152 1048576 INACTIVE
2 C:\ORACLE\ORADATA\HRISESM\REDO02.LOG 153 1048576 INACTIVE
3 C:\ORACLE\ORADATA\HRISESM\REDO03.LOG 154 1048576 CURRENT
--- here some INSERTs happen against a table --- uncommitted -----
item#2 - the db with the (uncommitted) transaction before SHUTDOWN ABORT - ONLINE REDO LOG info
SQL> /
GROUP# MEMBER SEQUENCE# BYTES STATUS
---------- ------------------------------------ ---------- ---------- ----------------
1 C:\ORACLE\ORADATA\HRISESM\REDO01.LOG 155 1048576 INACTIVE
2 C:\ORACLE\ORADATA\HRISESM\REDO02.LOG 156 1048576 CURRENT
3 C:\ORACLE\ORADATA\HRISESM\REDO03.LOG 154 1048576 INACTIVE
item#3 - buffer busy waits - just to confirm that at least
- one wait event happened while the uncommitted transaction was being made.
SQL> select class, count
2 from v$waitstat
3 where count > 0
4 order by count desc;
CLASS COUNT
------------------ ----------
data block 6
----------shutdown abort was issued --------
----------the db was restarted---------------
item#4 - an alertSID.log shows that during instance recovery
- a transaction recovery(roll forward) occured before database was completely open.
- See Thread recovery: finish rolling forward thread 1
- Thread recovery: 67 data blocks read, 65 data blocks written, 1821 redo blocks read
- As you can see it reads both from the data blocks(datafile) and the redo blocks(redo log files) during instance recovery.
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
Mon Apr 02 12:38:02 2007
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 156 Reading mem 0
Mem# 0 errs 0: C:\ORACLE\ORADATA\HRISESM\REDO02.LOG
Mon Apr 02 12:38:02 2007
Thread recovery: finish rolling forward thread 1
Thread recovery: 67 data blocks read, 65 data blocks written, 1821 redo blocks read
Crash recovery completed successfully
item#5 - After the DB was successfully reopened, it shows that redo log Group 2, SEQUENCE# 156,
- was archived immediately instead of being put in ACTIVE status,
- because it would have been archived anyway if the DB had been shut down properly.
SQL> SELECT A.GROUP#, A.MEMBER, B.SEQUENCE#, B.BYTES, B.STATUS
2 FROM V$LOGFILE A, V$LOG B
3 WHERE A.GROUP# = B.GROUP#
4 /
GROUP# MEMBER SEQUENCE# BYTES STATUS
---------- ------------------------------------ ---------- ---------- ----------------
1 C:\ORACLE\ORADATA\HRISESM\REDO01.LOG 155 1048576 INACTIVE
2 C:\ORACLE\ORADATA\HRISESM\REDO02.LOG 156 1048576 INACTIVE
3 C:\ORACLE\ORADATA\HRISESM\REDO03.LOG 157 1048576 CURRENT
Tamil,
As you said, during a slow checkpoint DBWR may not write all the dirty buffers to disk before proceeding to complete the checkpoint. In that case, what will be the status of the associated logfile: ACTIVE or INACTIVE?
Hi Guys,
I have not entered the forum for several days and that thread became a hit meanwhile :-)
I think we should separate the 2 cases here.
1) The log-switch-time checkpoint (which is equivalent to the checkpoint caused by FAST_START_MTTR_TARGET, LOG_CHECKPOINT_INTERVAL, or LOG_CHECKPOINT_TIMEOUT). That one is called an incremental checkpoint and is mainly meant to limit the instance roll-forward phase and hence minimize instance recovery time. It is also what guarantees that the redo in an already-filled log is no longer needed, since the dirty blocks covered by that redo have been written to disk.
2) The case described here, called a "SLOW CHECKPOINT" (actually that's the first time I've heard the term, but I like it), occurs mainly because your db buffer cache has become too "dirty". While a block is dirty, it cannot be replaced by another block, since that would lose the modified image. When a server process looks for a "clean" buffer to acquire and replace with a block needed from disk, it skips dirty buffers. If a certain number of buffers get scanned and skipped because they are dirty, a threshold is reached and DBWR is signalled to start writing the dirty buffers, making them clean so that the server process can easily find a clean buffer to replace. Each time the server process skips a dirty buffer, that buffer is put on the so-called "checkpoint queue", and when DBWR is signalled, it writes the buffers on the checkpoint queue to disk. I do not know the finer details here, e.g. whether DBWR writes the entire queue or just part of it; I believe Tamil has explained some details about that.
There is one more event that causes DBWR to write. If I remember right, it is when a certain percentage of the DB buffer cache becomes dirty, but I am not sure about that or the details around it.
Case 2) does not have much to do with the redo. It is driven not by the redo stream but by the scanned dirty buffers. The redo vectors covering these blocks might not even be contiguous in the redo log file. I guess the slow checkpoint is somehow recorded in the controlfile or redo stream or something, but that's not the key idea, since we cannot say that the slow checkpoint has a specific position in the redo stream.
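A sketch of driving the incremental checkpoint in 9i (the 60-second target is only an example value):

```sql
-- Express the checkpoint target as desired crash-recovery seconds:
ALTER SYSTEM SET fast_start_mttr_target = 60;

-- Watch its effect: how aggressively the incremental checkpoint keeps
-- the amount of redo-to-apply bounded.
SELECT target_mttr, estimated_mttr, ckpt_block_writes
FROM   v$instance_recovery;
```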
Warm Regards
Boris
Hi Boris and All,
for your comment #1,
We can't just separate the (soft/incremental) CHECKPOINT issues related to a LOG SWITCH from the SLOW CHECKPOINT Tamil pointed out, because the title of the thread itself is "WHEN LOG SWITCH OCCUR...".
In fact, this was rather overlooked in the whole discussion. WHEN A LOG SWITCH OCCURS, a mandatory incremental (soft) CHECKPOINT is necessary. So the SLOW CHECKPOINT is different (given the premise that the two are different), since it is concerned only with dirty buffers, while the INCREMENTAL CHECKPOINT is concerned mostly with LOG SWITCHING, due to the CURRENT online redo log being full and needing to advance to the next available online REDO LOG with a new sequence#, or as influenced by the INIT parameter settings already mentioned here. Nevertheless, both have the same objective, and that is a partial/incomplete CHECKPOINT.
for your comment #2
It's nice to know about the SLOW CHECKPOINT; I may have heard of this before, perhaps under a different name.
And yet, I'd like to emphasize, as in my previous post, that if you have configured your db buffer cache properly and you have an adequately sized redo log, your system will most likely not encounter a SLOW CHECKPOINT.
And I'd like to correct, if you don't mind, your assumption that a redo log is not needed during instance recovery after being checkpointed and archived.
Example:
I will create an uncommitted transaction larger than the total size of the online redo logs, so that the redo logs involved in the transaction are both online and archived. And let's also find out whether there are checkpoints involved other than those that happen at log switch.
This example was based on Tamil's theory of a huge transaction against a smaller size of redo logs.
Again let’s assume that the db is newly started, and no other user to create a transaction but me.
Item #1 Just to show that no transaction happens yet in the system
select name, value from v$sysstat
where name like 'DBWR%' or name like 'redo%';
-------------------
NAME VALUE
---------------------------------------------------------------- ----------
DBWR checkpoint buffers written 0
DBWR transaction table writes 0
DBWR undo block writes 0
DBWR revisited being-written buffer 0
DBWR make free requests 0
DBWR free buffers found 0
DBWR lru scans 0
DBWR summed scan depth 0
DBWR buffers scanned 0
DBWR checkpoints 0
DBWR cross instance writes 0
redo synch writes 2
redo synch time 0
redo entries 59
redo size 12676
redo buffer allocation retries 0
redo wastage 5676
redo writer latching time 0
redo writes 21
redo blocks written 37
redo write time 0
redo log space requests 0
redo log space wait time 0
redo log switch interrupts 0
redo ordering marks 0
Item #2 Query the ONLINE REDO LOGs - note the STATUS and SEQUENCE# before the transaction was executed:
SELECT A.GROUP#, A.MEMBER, B.SEQUENCE#, B.BYTES, B.STATUS
FROM V$LOGFILE A, V$LOG B
WHERE A.GROUP# = B.GROUP#;
GROUP# MEMBER SEQUENCE# BYTES STATUS
---------- ------------------------------------ ---------- ---------- ----------------
1 C:\ORACLE\ORADATA\HRISESM\REDO01.LOG 188 1048576 INACTIVE
2 C:\ORACLE\ORADATA\HRISESM\REDO02.LOG 189 1048576 INACTIVE
3 C:\ORACLE\ORADATA\HRISESM\REDO03.LOG 190 1048576 CURRENT
Item #3 table structure used for testing data inserts.
SQL> DESC TEMPTAB
Name Null? Type
----------------------------------------- -------- ----------------------------
TESTFLD VARCHAR2(110)
Item #4 Create an insert script - run multiples of this PL/SQL block to generate a transaction bigger than the total size of the online redo logs; probably up to 36000 records.
-- run with SET SERVEROUTPUT ON and SPOOL to capture the generated statements
begin
  for mcounter in 1..6000 loop
    dbms_output.put_line('INSERT INTO TEMPTAB VALUES (''THIS IS SAMPLE DATA '||to_char(mcounter)||''');');
  end loop;
end;
/
Item #5 Execute the spool file created in item #4, without committing.
Item #6 Status of the ONLINE redo logs - note the new SEQUENCE# and STATUS.
- If you compare with the original status, the sequence# advanced by 12; since there are 3 ONLINE REDO LOGS, the remaining 9 should be archived.
GROUP# MEMBER SEQUENCE# BYTES STATUS
---------- ------------------------------------ ---------- ---------- -------------
1 C:\ORACLE\ORADATA\HRISESM\REDO01.LOG 200 1048576 INACTIVE
2 C:\ORACLE\ORADATA\HRISESM\REDO02.LOG 201 1048576 ACTIVE
3 C:\ORACLE\ORADATA\HRISESM\REDO03.LOG 202 1048576 CURRENT
Item #7 System statistics after the insert transaction was made (uncommitted).
- Please observe the new values of "DBWR checkpoints" and "redo log space requests".
- They equal the number of REDO LOG switches. So all of them are INCREMENTAL checkpoints, not SLOW CHECKPOINTs.
---------------------
NAME VALUE
---------------------------------------------------------------- ----------
DBWR checkpoint buffers written 814
DBWR transaction table writes 18
DBWR undo block writes 278
DBWR revisited being-written buffer 0
DBWR make free requests 0
DBWR free buffers found 0
DBWR lru scans 0
DBWR summed scan depth 0
DBWR buffers scanned 0
DBWR checkpoints 12
DBWR cross instance writes 0
redo synch writes 2
redo synch time 0
redo entries 36617
redo size 11759540
redo buffer allocation retries 12
redo wastage 348700
redo writer latching time 0
redo writes 1118
redo blocks written 24418
redo write time 0
redo log space requests 12
redo log space wait time 0
redo log switch interrupts 0
redo ordering marks 0
Item #8 Issue SHUTDOWN ABORT
Item #9 Issue STARTUP
Item #10 Let's look at the alertSID.log below. It shows that only 2 redo logs were used
during transaction recovery, and they are both ONLINE redo logs.
But would you believe that only 2 redo logs were used during instance recovery, knowing that 12 redo logs were generated by the INSERT DML?
The answer is NO: all 12 redo logs (ONLINE and ARCHIVED) were used, and that's the part the CONTROL FILE plays in this process.
-------------------------------------
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
Wed Apr 04 13:41:34 2007
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 201 Reading mem 0
Mem# 0 errs 0: C:\ORACLE\ORADATA\HRISESM\REDO02.LOG
Recovery of Online Redo Log: Thread 1 Group 3 Seq 202 Reading mem 0
Mem# 0 errs 0: C:\ORACLE\ORADATA\HRISESM\REDO03.LOG
Wed Apr 04 13:41:35 2007
Thread recovery: finish rolling forward thread 1
Thread recovery: 68 data blocks read, 66 data blocks written, 1845 redo blocks read
Crash recovery completed successfully
-------------------------------------
And Thank you Tamil for the compliment.:D
Rey, I thought this battle was over :-)
To summarize:
1) You agree that incremental checkpoint and slow checkpoint are quite different, and I did not see any serious difference from what I posted here.
2) You claim that if your DB buffer cache is well configured, you'll never face a slow checkpoint. Well, if I try I can probably find a case where that's not true, but generally I agree.
3) If I understand right, you claim that archived logs are needed during instance recovery when a transaction is longer than the total size of the redo logs. Sorry, but your example did not prove that, and it cannot prove it, because it is not the case. What if the database is in NOARCHIVELOG mode? Will you prohibit long transactions, or is instance recovery not applicable to NOARCHIVELOG mode, or will it fail in the case of a long transaction?
I am saying this maybe for the third time in this thread, but I do so because I am completely sure about it: the size of a transaction has nothing to do with checkpoints. What is kept and needed until the transaction ends is the undo. Redo has nothing to do with that. As ixion mentioned, the undo itself is covered by the redo mechanism.
Maybe it is a good idea to make an article about that
Regards
Boris
Boris,
come on man, I spent at least 2 hours on that last post, and you are not giving me at least some credit for it. :rolleyes:
for #1 somehow we agree.
for #2 I said, it is less likely to happen.
for #3 - well, to prove your point, why don't you try creating a transaction longer than the total size of the redo logs in a NOARCHIVELOG mode DATABASE, and let's see if your DB can still come back up when you leave the transaction uncommitted and abort the instance.
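The experiment proposed in #3 could be scripted roughly like this. The table name and iteration count are illustrative assumptions; size the loop so the redo generated exceeds the combined online log size:

```sql
-- 0. Confirm the database really is in NOARCHIVELOG mode
SELECT log_mode FROM v$database;

-- 1. Build a transaction larger than the combined online redo logs
--    (TEMTAB and the iteration count are illustrative)
BEGIN
  FOR i IN 1..200000 LOOP
    INSERT INTO temtab VALUES ('THIS IS SAMPLE DATA ' || TO_CHAR(i));
  END LOOP;
END;
/
-- 2. Do NOT commit. From another session, as SYSDBA:
--      SHUTDOWN ABORT
--      STARTUP
-- 3. If the instance opens cleanly, crash recovery needed only the
--    online redo logs -- no archived logs exist to fall back on.
```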
Your undo information is played back to reach a consistent state; how could you play it back if the redo information, which is the source of the UNDO playback, is incomplete?
Hi Ray,
1) I promise to try a transaction longer than all the redo logs as soon as I get back to the office (which means next week), but I would also suggest you try to crash a database in NOARCHIVELOG mode, by issuing a transaction as long as you wish and shutting the DB down with ABORT while the transaction is not committed.
2) Well, it is clear how. Once a block gets changed in memory, the redo vector describing the change goes to the redo buffer and is later written to the online redo logs by the log writer – that is clear and we agree about it. The data block is dirty, and its image in memory differs from the image on disk. Now, if the DB fails, we will lose the changed image from memory, but through instance recovery SMON is going to read the old block image from disk and reapply the change from the online redo log.
Quote:
Originally Posted by reydp
When an incremental checkpoint occurs, a list is created containing all the blocks covered by the redo vectors being checkpointed. Then the DB writer writes all of these block images from memory to disk. Now, if the database fails, you will not need those redo vectors: if someone needs such a block (say, to roll back a transaction, if it is a data block belonging to the UNDO tablespace), it can simply be read from disk, with no need to reapply changes. That's because the block was written to disk AFTER the change covered by the redo vector occurred.
That is how it works. Redo is not used to restore database consistency; UNDO does that. Redo just saves the changes. The roll-forward phase of instance recovery tries to bring the cache as close as it can to the state it was in before the crash. Suppose that before the crash the DB buffer cache contained some dirty blocks. Through the roll-forward phase, the old images of the same blocks are read, the same changes are reapplied, and the blocks become dirty in the buffer cache again, just as they were before the crash.
From that point on, the database continues as if the crash never happened, except that the sessions which had open transactions before the crash are no longer there to commit or roll them back. That is why SMON takes care of it: it runs the rollback phase, rolling back all changes belonging to uncommitted transactions opened by sessions from before the crash (which means restoring the old state of the data using UNDO and releasing the locks held by those transactions).
The database does not reach a consistent state at all during instance recovery, since at the moment SMON is rolling back, the database is already open and users are creating new transactions. That only happens after "SHUTDOWN IMMEDIATE". Again, the roll-forward phase does not try to make the DB consistent; it just tries to bring it as close to the moment before the crash as it can, and misses only those blocks which were dirty and whose covering redo had not yet been written to the redo log files – but since such changes can never be part of committed transactions, losing them is OK.
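The deferred rollback phase described above can actually be watched after the STARTUP. Once the database is open, SMON's rollback of dead transactions is visible in the views below; a sketch, assuming a 9i instance (V$FAST_START_TRANSACTIONS is populated in conjunction with FAST_START_PARALLEL_ROLLBACK):

```sql
-- Dead transactions still being rolled back by SMON after crash recovery
SELECT usn, state, undoblockstotal, undoblocksdone
  FROM v$fast_start_transactions;

-- Rollback/undo segment activity while the rollback phase proceeds
SELECT usn, writes, xacts FROM v$rollstat;
```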
I hope my English is not too bad and does not prevent me from saying what I want.
Regards
Boris
Dave, please help me here. What is DB_CHECKPOINT_BATCH_SIZE? I am not able to find any info about it. As per Tamil, the checkpoint stops when the DB_CHECKPOINT_BATCH_SIZE threshold is reached.
Where do we define this value??? Please help me to understand.
Quote:
Originally Posted by davey23uk
I know that because of his British accent it is sometimes difficult to catch what Davey says. Let me translate it for you... DB_CHECKPOINT_BATCH_SIZE is not a parameter. I don't know what Tamil was trying to convey.
If he was talking about setting a threshold for checkpoint frequency, the way to go is setting LOG_CHECKPOINT_INTERVAL to a non-zero value.
You can google all the info you might need about LOG_CHECKPOINT_INTERVAL
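For completeness, LOG_CHECKPOINT_INTERVAL is a dynamic parameter, so it can be checked and changed with ordinary SQL; a sketch (the value 10000 redo blocks is an illustrative choice, not a recommendation):

```sql
-- Check the current setting
SELECT name, value
  FROM v$parameter
 WHERE name = 'log_checkpoint_interval';

-- Keep the incremental checkpoint position within 10000 redo blocks
-- of the tail of the log (value is illustrative)
ALTER SYSTEM SET log_checkpoint_interval = 10000;
```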
boris,
boris, of course UNDO doesn't need undo data from the redo logs. The UNDO tablespace is not like the TEMP tablespace, which loses its TEMPorary data after the instance is aborted, so all the UNDO data to be played back is already in it. But what I meant by "UNDO INFORMATION" is its corresponding SCN, which is recorded in the REDO LOGS. Read my posts earlier in this thread.
My assumption is that, for example, when a long transaction starts executing, its corresponding SCN is recorded in the CURRENT REDO LOG (let's say SEQ 50), and by the time the transaction completes, before the commit, the REDO LOG has reached sequence 60. REDO LOG SEQ 50 would then have to be read during transaction recovery. But it seems that:
after a REDO LOG is archived, the SCN of the UNCOMMITTED transaction is carried forward to the new CURRENT online REDO LOG, so that during INSTANCE recovery the only REDO LOGS that are read are the CURRENT and ACTIVE (ONLINE) ones.
So I could be wrong about ARCHIVED LOGS being read during INSTANCE RECOVERY, but I still believe the ONLINE REDO LOGS are used.
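One way to check this for yourself: crash recovery starts at the checkpoint SCN recorded in the controlfile, and that SCN always falls within the range covered by the ACTIVE/CURRENT online logs – which is exactly why archived logs are never needed for instance recovery. A sketch:

```sql
-- SCN at which crash recovery would have to start
SELECT checkpoint_change# FROM v$database;

-- Lowest SCN covered by each online log group; recovery only needs the
-- groups whose range includes the checkpoint SCN and later (ACTIVE/CURRENT)
SELECT group#, sequence#, first_change#, status FROM v$log;
```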