I have a table with 42 columns and no constraints defined on any of them. Fresh records are inserted every hour using SQL*Loader; on average about 60 million records are loaded per day.
The problem is that quite a number of the records are duplicates, and these are inflating the total size of the database. If I use a trigger to check for duplicates during the insert phase, it slows down the load. Can somebody suggest a script to clean up the duplicate records (except the original record, of course) that can be run during lean hours?
I am running 8.1.6 on Unix.
Why don't you add a constraint to the table?
IT Manager For Development
Oracle Consultant - OCP
As a one-time activity, I suggest you export the table, recreate it with a unique key constraint, and import the data. After that you should not have problems with duplicates.
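A minimal sketch of that approach (the table and column names here are placeholders, not from the original post):

```sql
-- Recreate the table with a unique key so that re-imported rows,
-- and all future loads, are checked for duplicates.
CREATE TABLE my_table (
    record_id   NUMBER,
    record_date DATE,
    -- ... remaining columns ...
    CONSTRAINT uk_my_table UNIQUE (record_id, record_date)
);
```

With the unique constraint in place, a conventional-path import (or SQL*Loader load) rejects the duplicate rows instead of inserting them.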
Try this:

delete from table_name T1
where T1.rowid > (select min(T2.rowid)
                  from table_name T2
                  where T1.column_name = T2.column_name);
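Before deleting anything, it may be worth previewing how many duplicates exist; a sketch (column_name is a placeholder for whichever column, or columns, define a duplicate):

```sql
-- Count how many extra (duplicate) copies each key value has.
SELECT column_name, COUNT(*) - 1 AS extra_copies
FROM   table_name
GROUP  BY column_name
HAVING COUNT(*) > 1;
```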
It would have been better to have constraints, but no problem.
Please create an index on the columns you need to find the duplicates on; once you are done you can drop that index. This way you will save a huge amount of time.

create index idx_columnname on tablename (columnnames);

delete from tablename a
where rowid not in (select max(rowid) from tablename b
                    where a.duplicatedcolname = b.duplicatedcolname);

Add the same predicate for each of the other columns that must not contain duplicated values.

tablename: your table name
columnname: the column name that needs to be searched for duplicated values
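With roughly 60 million rows loaded per day, a single DELETE may strain the rollback segments on 8.1.6. One common workaround is to delete and commit in batches; a sketch, where the batch size of 10,000 and the table/column names are assumptions, not from the thread:

```sql
-- Delete duplicates in batches, committing between batches
-- to keep rollback segment usage bounded.
BEGIN
    LOOP
        DELETE FROM tablename a
        WHERE  a.rowid NOT IN (SELECT MAX(b.rowid)
                               FROM   tablename b
                               WHERE  a.duplicatedcolname = b.duplicatedcolname)
        AND    ROWNUM <= 10000;
        EXIT WHEN SQL%ROWCOUNT = 0;
        COMMIT;
    END LOOP;
    COMMIT;
END;
/
```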
Export/import is out of the question due to the size of the database.
I will follow what oracledba has suggested: create the index when necessary, delete the duplicates, and then drop the index.