I have a DB table which is being fed data from a java program almost every second. In 24 hours an estimated 10-20 million rows get added to this table. Every night at 2.20 AM I have a java batch job which triggers a DML for cleaning up the data from this table which isnt required the next day anymore as follows:-
DELETE FROM table1 where status = 'O' and creation_time < '20 Jan 2008'
Bascially I need to cleanup all records having status as 'O' and which got inserted prior to today's midnight.
What is happening now is that the DML seems to running for hours 9-11+ hours. As a result the jobs for inserting data into these tables, retrieving data from these tables, etc other threads are getting extremely affected.
I have 3 composite indexes on this table, so my guess is the time is getting eaten up for generating re-do logs, etc.
Please suggest on how I can optimize this deletion. Any optimization which helps in getting the cleanup done within 1-2 hours would be great.
Note that I cannot have the batch job inserting data into the table stopped while the cleanup happens.
As per the dba the time is getting consumed due to heavy redo logs generation. Is there any specific thing that I need to check? Parallel addition of new records and updates adding to the issue.
Your purge strategy is flawed, mass delete is going to be always a time consuming nightmare.
You have to partition your table in a way you can just truncate or drop the unwanted partition.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
declare
v_rows_processed number := 0;
begin
loop
delete test_delete
where status = 'O' and
ins_date < sysdate and
rownum <= 100;
v_rows_processed := sql%rowcount ;
if v_rows_processed = 0 then
exit;
end if;
commit;
end loop;
end;
/
If you have proper index on columns used in the predicate, then this delete process will work faster.
The batch size ( in my case 100) may vary according to your table size, resources available etc.
Once again, Poster has to delete anything in between 10 to 20 million rows a day... delete statement is not a pretty solution no matter how you wrap it.
Massive purging calls for a structure that allows massive purging in an easy, fast and scalable way.
The answer is partition your table and indexes in a way you have to just truncate a single partition a day. Can you find anything easier than that?
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
delete statement is not a pretty solution no matter how you wrap it.
Massive purging calls for a structure that allows massive purging in an easy, fast and scalable way.
The answer is partition your table and indexes in a way you have to just truncate a single partition a day.
It is more work than a quick fix to your DML, but it IS the right way to go. Tinkering with the DELETE statement is not going to solve your problems.
If you can't see your way to creating a partitioned table, then you will have to create separate tables, flip flopping between inserting into one then into another. The data you don't want you can remove through a TRUNCATE. It will work, but would be horrible.
Hi,
How many records have status other than 'O'? If the count is lesser percentage then how abt the following steps:
1. Move the records other than status 'O' to a temp table
2. Truncate main table
3. insert back non 'O' status from temp table to main table.
Bookmarks