Not for the faint of heart...

Solaris 2.6
Oracle 8.1.6

As we all know, crash recovery has three phases: roll forward, open, and rollback. During the rollback phase, if a large transaction needs to be rolled back, smon takes care of it (possibly delegating the work to parallel slaves if fast_start_parallel_rollback is set).

Our problem is that we cannot allocate sort space until smon completes this rollback. Please do not respond with 'smon is cleaning up your temporary segment': we are using a locally managed temporary tablespace, so smon should not have any work to do to prepare for a sort. The query requiring the disk sort runs against an object that has no transaction rollback occurring on it.

The shadow process for the query that requires the disk sort is waiting on 'sort segment request', but all of the parameters for the wait are 0:

WAIT #3: nam='sort segment request' ela= 0 p1=0 p2=0 p3=0
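
In case anyone wants to check their own trace file for the same pattern, here is a minimal Python sketch that pulls the fields out of WAIT lines and counts, per event, the waits where p1, p2 and p3 are all zero. The regular expression is an assumption based only on the single line shown above, and parse_wait and count_zero_param_waits are hypothetical helper names, not part of any Oracle tool.

```python
import re
from collections import Counter

# Assumed layout of a WAIT line, inferred from the one example in this post.
WAIT_RE = re.compile(
    r"WAIT #(?P<cursor>\d+): nam='(?P<event>[^']*)'\s+"
    r"ela=\s*(?P<ela>\d+)\s+p1=(?P<p1>\d+)\s+p2=(?P<p2>\d+)\s+p3=(?P<p3>\d+)"
)


def parse_wait(line):
    """Return the fields of one WAIT line as a dict, or None if it does not match."""
    match = WAIT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    event = fields.pop("event")
    # Everything except the event name is numeric.
    parsed = {name: int(value) for name, value in fields.items()}
    parsed["event"] = event
    return parsed


def count_zero_param_waits(lines):
    """Count, per event name, the waits whose p1, p2 and p3 are all zero."""
    counts = Counter()
    for line in lines:
        wait = parse_wait(line)
        if wait and wait["p1"] == wait["p2"] == wait["p3"] == 0:
            counts[wait["event"]] += 1
    return counts


if __name__ == "__main__":
    sample = "WAIT #3: nam='sort segment request' ela= 0 p1=0 p2=0 p3=0"
    print(parse_wait(sample))
    print(count_zero_param_waits([sample]))
```

Feeding it the lines of the shadow process trace would show whether every 'sort segment request' wait carries zeroed parameters or only some of them.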

A truss of the shadow process shows it looping while it waits to acquire a semaphore:

semop(3866624, 0xDFFFB514, 1) Err#91 ERESTART

The trace of the smon process shows that it is looping on an update of undo$.


So the million-dollar question is this: what is smon doing that would prevent us from allocating sort space in a locally managed temporary tablespace?

All theories welcome....