We have a number of separate data feeds that all insert rows into a central table. It is quite possible (and legitimate) for these feeds to try to append rows that already exist in the table. If this happens we want to discard duplicates just keeping one copy of the row. We use SQL*Loader (sqlldr) to append rows into the table.
As a quick and convenient way to solve this we put a unique key constraint across all the columns in the table and then configured the CTL file to allow a huge number of errors/discards so that the constraint violations got ignored.
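To illustrate, our setup looks roughly like the sketch below (table, file, and column names here are placeholders, not our real ones; the key parts are APPEND mode and the large ERRORS limit so constraint violations don't abort the load):

```
OPTIONS (ERRORS=999999)
LOAD DATA
INFILE 'feed.dat'
APPEND
INTO TABLE central_table
FIELDS TERMINATED BY ','
(col1, col2, col3)
```

The unique constraint on the table is what actually rejects the duplicate rows; SQL*Loader just has to tolerate the resulting errors.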
This works great until you hit more than 10 or 20 duplicate rows! We're pushing in around 30K rows at a time and normal load times are about 7 or 8 seconds across a network, but when we load a batch full of duplicates (emulated by loading the same file twice) the load time jumps to about a minute and a half. Turning off error reporting (SILENT=(ERRORS)) shaves about 10-15% off the time, but it's still really slow. Is there anything I can do on the DBMS table to help speed this up? FYI - I'm not interested in knowing which rows failed.