-
removing duplicates and sql performance
Hi,
I have the following code to identify and remove duplicates. The problem is that I have around 27,000 records and the deduping using this code takes over an hour. Can someone suggest a faster code for this type of deduping. Thank you
----------------------------------------------------------
SELECT lname, fname, company, address_1, source, COUNT(*)
FROM leads
group by lname, fname, company, source, address_1
having count(*) > 1;
DELETE FROM leads_raw a
WHERE rowid < (
select max(rowid)
from leads_raw b
where b.fname = a.fname
and b.lname = a.lname
and b.company = a.company
and b.address_1 = a.address_1);
------------------------------------------------------------
Regards
Roman
-
See if this works for you ...
Code:
Delete from
leads_raw
where
rowid in
(
select ri
from
(
select
rowid ri,
min(rowid) over
(partition by
fname,
lname,
company,
address_1) min_ri
from
leads_raw
)
where
ri != min_ri
);
-
-
How faster?
Can you show us the time of your method and slimdave's?
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|