when log switch occurs....

**simply_dba** · 04-03-2007, 08:27 AM

Tamil,
As you said that during a slow checkpoint dbw may not write all the dirty buffers to the disk and proceed to complete the checkpoint. In that case, what will be the status of the logfile associated? Active or Inactive?

**Bore** · 04-03-2007, 11:11 AM

Hi Guys,
I have not entered the forum for several days and that thread became a hit meanwhile :-)

I think we should separate the 2 cases here.
1) The log switch time checkpoint ( which is equal with the checkpoint caused by FAST_START_MTTR_TARGET, LOG_CHECKPOINT_INTERVAL or LOG_CHECKPOINT_TIMEOUT) That one is called incremental checkpoint and is mainly ment for limiting the instance roll forward phase time and hence minimizing the instance recovery time. It is also necessary to guarantee that the redo in an already filled log is no more needed, since the dirty blocks covered by this redo have been written to the disk.
2) The case described here, called "SLOW CHECKPOINT" (actually that's the first time I hear that term, but I like it) occures mainly because your db buffer cache have become too "dirty". When a block is dirty, it cannot be replaced by another block since that will cause lost of the modified immage. When a server process looks for a "clean" block to accure and replace its contents with a block needed from the disk, it cannot use dirty block and skips these blocks. If it happen that a specific number of blocks get scanned and skiped, because they are dirty, a treshold is reached and the DB Writer gets signalled to start writnig the dirty blocks, hence make them clean and let the server process to easily find a clean block to replace. Each time the server process skips a dirty block, this block is put to so called "checkpoint queue" and when the DB Writer is signalled, it writes out the blocks of the checkpoint queue to the disk. I do not know more details here, if DBWR writes the entire queue or just part of it. I belive Tamil have explained some details about that
There is one more event, causing the DB Writer to write. If I remember right it is if a specific percentage of the DB Buffer Cache become dirty but I am not sure about that and the details around it.
The case 2) has no much to deal with the redo. It is driven not by the redo stream, but by the scanned dirty blocks. The redo vectors, covering these blocks maight even not be consequent in the redo log file. I guess the slow checkpoint is somehow mentioned in the controlfile or redo stream or something, but that's not the key idea, since we cannot say that the slow checkpoint has specific position in the redo stream. Again, the redo vecrors could not be consequent at all

Warm Regards
Boris

**reydp** · 04-04-2007, 08:53 AM

Hi Boris and All,
for your comment #1,
we can't just separate the CHECKPOINT(soft/incremental) issues related LOG SWITCH and the SLOW CHECKPOINT as pointed by Tamil related to this topic, because the title of the thread itself is "WHEN LOG SWITCH OCCUR..."

In fact it was even overlooked in the whole process of discussion here. WHEN LOG SWITCH OCCUR a mandatory incremental or soft CHECKPOINT is necessary. So now, the SLOW CHECKPOINT is different(given a premise that this two are different) since its concern touches only for dirty buffers while the INCREMENTAL CHECKPOINT concern mostly for LOG SWITCHING due to a CURRENT online redo log being full and need to advance to the next available online REDO LOG with the new sequence # or influenced by the value settings in your INIT parameter which already mentioned here. Nevertheless, bought have the same objective, and that is to give partial/incomplete CHECKPOINT.

for your comment #2
Its nice to know about the SLOW CHECKPOINT, I may have hear this before but perhaps with different name.
And yet, I'd like to emphasize in my previous post that if you have configured your db buffer cache and you have an adequate size of redo log, your system will not most likely encounter SLOW CHECKPOINT.

And I'd like to correct if you don't mind, about what you assume about a redo log not needed during instance recovery after being checkpointed and archived.

Example:
I will create an uncommitted transaction that has a size larger than the whole total size of online redo logs, so that the redo logs involved with the transaction are both online and archived. And let’s find out also if there are other checkpoints involved other than that happen during log switch.

This example was based on Tamil's theory of a huge transaction against a smaller size of redo logs.

Again let’s assume that the db is newly started, and no other user to create a transaction but me.

Item #1 Just to show that no transaction happens yet in the system
select name, value from v$sysstat
where name like 'DBWR%' and name like 'redo%;
-------------------
NAME VALUE
---------------------------------------------------------------- ----------

DBWR checkpoint buffers written 0
DBWR transaction table writes 0
DBWR undo block writes 0
DBWR revisited being-written buffer 0
DBWR make free requests 0
DBWR free buffers found 0
DBWR lru scans 0
DBWR summed scan depth 0
DBWR buffers scanned 0
DBWR checkpoints 0
DBWR cross instance writes 0
redo synch writes 2
redo synch time 0
redo entries 59
redo size 12676
redo buffer allocation retries 0
redo wastage 5676
redo writer latching time 0
redo writes 21
redo blocks written 37
redo write time 0
redo log space requests 0
redo log space wait time 0
redo log switch interrupts 0
redo ordering marks 0

Item #2 Query the ONLINE REDO LOG- note the STATUS and SEQUENCE# as start before the transaction was executedSELECT A.GROUP#, A.MEMBER, B.SEQUENCE#, B.BYTES, B.STATUS
FROM V$LOGFILE A, V$LOG B
WHERE A.GROUP# = B.GROUP#;

GROUP# MEMBER SEQUENCE# BYTES STATUS
---------- ------------------------------------ ---------- ---------- ----------------
1 C:\ORACLE\ORADATA\HRISESM\REDO01.LOG 188 1048576 INACTIVE
2 C:\ORACLE\ORADATA\HRISESM\REDO02.LOG 189 1048576 INACTIVE
3 C:\ORACLE\ORADATA\HRISESM\REDO03.LOG 190 1048576 CURRENT

Item #3 table structure used for testing data inserts.
SQL> DESC TEMPTAB
Name Null? Type
----------------------------------------- -------- ----------------------------
TESTFLD VARCHAR2(110)

Item #4 create an insert script – a multiple of this pl/sql that can create a transaction size bigger than the whole size of online redo logs. Probably up to 36000 records

declare mcounter number :=0;
begin
for mcounter IN 1..6000 LOOP
DBMS_OUTPUT.PUT_LINE('INSERT INTO TEMTAB VALUES (''THIS IS SAMPLE DATA '||TO_CHAR(MCOUNTER)||''');');
END LOOP;
end;
/

Item #5 execute the spool file create from item #4 without commit

Item #6 Status of the ONLINE redo logs –note new SEQUENCE# and STATUS
- if you notice from the original status, there 12 sequence# advanced, since 3 are the ONLINE REDO LOGS, the remaining 9 should be archived.
GROUP# MEMBER SEQUENCE# BYTES STATUS
---------- ------------------------------------ ---------- ---------- -------------
1 C:\ORACLE\ORADATA\HRISESM\REDO01.LOG 200 1048576 INACTIVE
2 C:\ORACLE\ORADATA\HRISESM\REDO02.LOG 201 1048576 ACTIVE
3 C:\ORACLE\ORADATA\HRISESM\REDO03.LOG 202 1048576 CURRENT

Item #7 System statistics shows after the insert transaction was made-uncommitted.
- Please observer the new value of “DBWR checkpoints” and “redo log space requests”.
- It is the same value with the # REDO LOG switches. So all, are INCREMENTAL checkpoint and not SLOW CHECKPOINT.---------------------
NAME VALUE
---------------------------------------------------------------- ----------
DBWR checkpoint buffers written 814
DBWR transaction table writes 18
DBWR undo block writes 278
DBWR revisited being-written buffer 0
DBWR make free requests 0
DBWR free buffers found 0
DBWR lru scans 0
DBWR summed scan depth 0
DBWR buffers scanned 0
DBWR checkpoints 12
DBWR cross instance writes 0

redo synch writes 2
redo synch time 0
redo entries 36617
redo size 11759540
redo buffer allocation retries 12
redo wastage 348700
redo writer latching time 0
redo writes 1118
redo blocks written 24418
redo write time 0
redo log space requests 12
redo log space wait time 0
redo log switch interrupts 0
redo ordering marks 0

Item #8 Issue SHUTDOWN ABORT
Item #9 Issue STARTUP

Item #10 Let’s see below the alertSID.log. Here it shows, that only 2 redo logs was used
during transaction recovery, and they are both ONLINE redo logs.
But would you believe that there were only 2 redo logs used during the instance recovery knowing that there were 12 redo logs generated for the INSERT DML?
The answer is NO, all 12 redo logs (ONLINE/ARCHIVED) were used and that’s what CONTROL FILE plays in this process.
-------------------------------------
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
Wed Apr 04 13:41:34 2007
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 201 Reading mem 0
Mem# 0 errs 0: C:\ORACLE\ORADATA\HRISESM\REDO02.LOG
Recovery of Online Redo Log: Thread 1 Group 3 Seq 202 Reading mem 0
Mem# 0 errs 0: C:\ORACLE\ORADATA\HRISESM\REDO03.LOG
Wed Apr 04 13:41:35 2007
Thread recovery: finish rolling forward thread 1
Thread recovery: 68 data blocks read, 66 data blocks written, 1845 redo blocks read
Crash recovery completed successfully
-------------------------------------

And Thank you Tamil for the compliment.

**Bore** · 04-05-2007, 02:48 AM

Rey, I thought this battle is over :-)
To sunmmarize
1) You agree that incremental checkpoint and slow checkpoint are quite different and I did not see any serious difference with what I posted here
2) You claim that if you have DB Buffer cache well configured, you'll never face slow checkpoint. Well, if I try I probably can find a case when that's not true, but generally I agree
3) If I understand rigth, you claim that archived logs are needed during the instance recovery in case if a transaction is longer that the total size of the redo logs. Sorry but your example did not prooved that, and it cannot proove it because it is not the case. What if thet database is in noarchivelog mode? You will prohibit the long trtansactions or the instance recovery is not applicable to noarchivelog mode or it will fail in case of long transaction?
I am saing that maybe for thid time in that thread but I am doing so, because I am completely sure about that - The size of transaction have nothing to deal with the checkpoints. What is keeped and needed until the transaction ends is the unod. Redo has nothing to do with that. As ixion mentioned, the undo itself is covered by the redo mechanism.

Maybe it is a good idea to make an article about that

Regards
Boris

**reydp** · 04-05-2007, 05:56 AM

Boris,
come on men, I spent at least 2 hours for the last post, and you are not giving at least some credit for that.

for #1 somehow we agree.

for #2 I said, it is less likely to happen.

for #3 - well, to prove your point, why don't you try creating a transaction longer than the total size of the redo logs in an NOARCHIVELOG mode DATABASE, and let's see if your DB can still kick in when you leave the trans uncommitted and abort the instance.

Your Undo information is played back into the point of consistent state, how could you play it back if the redo information which is the source of the UNDO play back is incomplete?

**Bore** · 04-05-2007, 10:32 AM

Hi Ray,
1) I promis to try a transaction longer than all the redo logs, immediately when I get back to the office (what means next week), but I would suggest also you to try to crash a database in noarchivelog mode, by issuing a transaction as long as you wish, and shutting down abort the db while the transaction is not commited
2)

Originally Posted by reydp

Your Undo information is played back into the point of consistent state, how could you play it back if the redo information which is the source of the UNDO play back is incomplete?

Well, it is clear how. Once the block gets changed into the memory, the redo vector describing the change goes to the redo buffer and later is written to the on-line redo logs by the log writer - that's clear and we agree about. The data block is dirty and it's immage in the memory differs from the immage on the disk. Now, if the db fails, we will loose the changed immage from the memory, but trough the instance recovery the SMON is going to read the old block immage from the disk and reaply the change from the on-ilne redo log.
When an incremental checkpoin occures, a list is created,containing all the blocks, covered by the redo vectors being checkpointed.Then the db writer writes all these block immages from the memory to the disk. Now, if the database fails, you will not need these redo vectors, since if someone needs this block ( say to rollback a transaction if this is a data block belonging to UNDO tablespace) it can simply read it from the disk, no need to reapply changes.That's because the block have been written to the disk AFTER the change covered by the redo vector occured.
That is how it works. The redo is not used to resotre the database consistency, UNDO is doing that. Redo just saves the changes. The instance recovery roll forward phase tries to bring the cache as close as it can to the state of the cache before the crash. Suppose before the crash the db buffer cache contained some dirty blocks. Now, trough the roll forward phase, the old images of the same blocks will be read, the same changes will be reaplied and the blocks will get dirty into the buffer cache again, just as they were before the crash. From that point on, the database will continue as if the crash never happen, just the sessions that have opened transactions before the crash, are not here anymore to commit or rollback these tranactions.That's why the SMON will take care of that. It will run the rollback phase, rolling back all the changes, part of uncommited transactions, opened by sessions from before the crash (what means restoring the old state of the data using UNDO and releasing the locks, held by these transactions). The database will not get to a consistent state at all during the instance recovery, since at the moment when SMON is rolling back, the database is opened and the users creates new transactions. That happens only after "Shutdown immediate". Again, the roll forward phase does not tries to make the db consistent, it just tries to bring it as close to the moment before crash, as it can and misses only the blocks which have been dirty, and the redo covering the changes have still not been written to the redo log files, but since these blocks can never be a part of commited transactions, loosing these changes is OK.

I hope my english is not too bad and do not prevents me to say what I want

Regards
Boris

**pranavgovind** · 04-07-2007, 04:40 PM

Dave, Please help me here. What is DB_CHECKPOINT_BATCH_SIZE? I am not able to find any info about this. As per Tamil, checkpoint stops when threshold value is reached to DB_CHECKPOINT_BATCH_SIZE.

Where do we define this value??? Please help me to understand.

Originally Posted by davey23uk

its not a parameter, thats why

**PAVB** · 04-07-2007, 05:47 PM

I know that because of his British accent sometimes is difficult to catch what Davey says. Let me translate it for you... DB_CHECKPOINT_BATCH_SIZE is not a parameter. Don't know what Tamil was trying to convey.

IF he was talking about setting a threshold for checkpoint size the way to go is by setting LOG_CHECKPOINT_INTERVAL to a non-zero value.

You can google all info you might need about LOG_CHECKPOINT_INTERVAL

**reydp** · 04-08-2007, 05:13 AM

boris,

Of course UNDO didn't need undo data from redo logs. UNDO tbs is not like TEMP tbs that loss TEMPorary data after instance is aborted. So all UNDO data to be played back are all in it. But what I meant with "UNDO INFORMATION" is the corresponding SCN, is recorded in the REDO LOGS. Read my posts here in this thread.

My assumption is that, for example, a long transaction during its first execution, its corresponding SCN is recorded to the CURRENT REDO LOGS(let's say SEQ 50), and after the transaction completed before commit, REDO LOG finished with sequence 60. REDO LOG SEQ 50, have to be read during transaction recovery. But it seems like:
after a REDO LOG is archived the SCN number that is UNCOMMITTED will be passed to the new CURRENT online REDO LOG. So that, during INSTANCE recovery the only REDO LOGS that are read are CURRENT, ACTIVE(ONLINE).

So, I could be wrong with ARCHIVE LOG being read during INSTANCE RECOVERY, but I still believe that ONLINE REDO LOG is used.

Thread: when log switch occurs....

Thread Tools

Display

Posting Permissions