-
analyze table
I have a table with 8 million rows.
The table has no indexes on it.
The table also has no statistics on it.
If the only SQL I run against it is
> select * from tablea
where columna is null
for update of columna
and
> update tablea
set columna = somevalue
What effect would analyzing the table have?
Would you expect it to make a performance difference, and if so, why?
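By "analyzing" I mean gathering optimizer statistics, i.e. something along these lines (tablea stands in for the real table name):

analyze table tablea compute statistics;

-- or, on later versions, the supported route:
begin
  dbms_stats.gather_table_stats(user, 'TABLEA');
end;
/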
-
When you run ...
select * from table
where columna is null
for update of columna
... how many of the 8 million rows generally match "columna is null"?
Apart from these two statements, do you ever select, insert, update or delete against this table at all?
-
This reads in every row from the table using a cursor loop.
There are other SQL statements in the job which validate the contents of the data, but those statements don't run against this table.
The script is basically:
declare
  cursor cur is
    select * from tablea
    where columna is null
    for update of columna;
  err number;
begin
  for rec in cur loop
    err := 0;                          -- reset the code for each row
    if rec.columnb > 1 then
      err := 1;
    end if;
    -- etc. (the other validation checks set err as appropriate)
    update tablea set columna = err
    where current of cur;
  end loop;
end;
-
So all of the rows in the table start off with columna null?
I may be looking at too simple a case, but why not ...
declare
  cursor cur is
    select * from tablea
    where columna is null
    and columnb > 1                    -- do the filtering in the cursor itself
    for update of columna;
  err number;
begin
  for rec in cur loop
    -- ....
    err := 1;
    -- etc.
    update tablea set columna = err
    where current of cur;
  end loop;
end;
-
The data comes from text files which are fed into a table in Oracle using SQL*Loader.
The last column on this table is an error code which on loading is null for every record.
I then run a validation script which checks each row to make sure it conforms to the specification.
This particular file has around 50 columns which have to be validated in one way or another - the example I gave is much simpler than the actual script.
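Roughly speaking, the control file looks something like this (the file name, delimiter and the data column names other than columnb are just placeholders) - because the error code column isn't in the field list, it is left null by the load:

load data
infile 'tablea.dat'
into table tablea
fields terminated by ','
trailing nullcols
(columnb, columnc, columnd)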
-
If it's true that you don't have to do anything else with this table (which honestly seems doubtful - what's the point of loading the data and identifying errors if you never look at them or use the data again?), then you don't need to analyze it.
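You can check that for yourself: with no indexes on the table, the driving query can only be a full scan, with or without statistics. Something along these lines (assuming a version where dbms_xplan is available) shows TABLE ACCESS FULL either way:

explain plan for
select * from tablea
where columna is null
for update of columna;

select * from table(dbms_xplan.display);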
-
There is a separate "insert" script which then puts the validated data into the actual tables required, but this question is purely about the validation part.
Thanks for your answer, it's what I thought as well.
-
Just had a thought - what if the script was stopped after updating 4 million rows, and an analyze was run? The error code would by then be populated for a lot of records, so wouldn't the initial select speed up when the script was restarted?
-
If you mean that the error code would be updated by the first run for half the rows, and those rows that had an error code could then be ignored when the script was rerun, then I don't think it would make a difference.
In the absence of indexes a full scan would have to be performed anyway. You could add an index to the error code column, but even if the table was analyzed the cost-based optimizer ought to ignore it.
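To illustrate - even if you did something like this (the index name is just an example):

create index tablea_errcode_idx on tablea (columna);

-- gather fresh statistics after creating the index
begin
  dbms_stats.gather_table_stats(user, 'TABLEA', cascade => true);
end;
/

... the "columna is null" predicate still couldn't use it, because rows where columna is null don't get an entry in a single-column b-tree index at all, so the query stays a full scan.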
Does each row get updated with a code, whether or not an error is found? If so, what proportion of the rows do not generally have an error?