-
Transferring a huge amount of data
Hi folks,
I need to design a process that will transfer approximately 100,000,000 rows between two databases on two servers.
The data is transformed CDRs that should be transferred to a staging area in the target database. After the transfer they will be moved into the real tables with the EXCHANGE PARTITION command.
Both databases are version 10.2.
The source database is hosted on Red Hat Enterprise Linux AS release 3 (Taroon Update 8).
The target database is hosted on Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
My question is: what is the best way to transfer this huge amount of data between the servers?
The possibilities I thought of:
1. Data pump
2. Transportable tablespace
3. Create table/insert as select over DB link
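For option 3, a minimal sketch of a direct-path load over a database link; the link name `src_link`, the staging table `cdr_stg`, and the source table `cdr_source` are placeholders, not names from the thread:

```sql
-- Direct-path insert into an existing staging table over a DB link.
-- APPEND + NOLOGGING keep redo generation low, assuming the staging
-- data does not need to be recoverable.
ALTER TABLE cdr_stg NOLOGGING;

INSERT /*+ APPEND */ INTO cdr_stg
SELECT * FROM cdr_source@src_link;

COMMIT;
```

Whether this beats Data Pump depends mostly on the network between the two servers, as the replies below discuss.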
What do you think?
Any advice will be helpful.
Thanks in advance,
Nir
-
All options depend on network bandwidth.
If you want something elegant and semi-transparent, consider a DB link IF your nodes are LAN connected. A DB link over a WAN is a nightmare...
Options 1 and 2 can use offline compression to reduce transfer time.
Option 1 can benefit from parallel load/multiple DB writers and direct-path loading, which uses less disk (if you compress the receiving tables), thus lowering I/O when executing EXCHANGE PARTITION.
Option 2 seems 'almost' transparent, kind of plug and play... I've not tested it.
-
Thanks for your comments!
My favorite option is using the data pump utility.
To avoid transferring the dump file between the nodes I thought of running expdp and impdp from the target database host, as in the following example:
Code:
expdp user/pwd tables=T1,T2,T3,....,Tn NETWORK_LINK=source_database_link directory=CDRS_DIR dumpfile=CDRs.dmp logfile=expdpCDRs.log parallel=4
impdp user/pwd tables=T1,T2,T3,....,Tn directory=CDRS_DIR dumpfile=CDRs.dmp logfile=impdpCDRs.log parallel=4
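Two notes on that example, hedged since I can't test against your setup: with parallel=4, the dump file name would normally use the %U substitution variable so each worker gets its own file; and if the goal is to avoid the dump file entirely, impdp in network mode skips the file altogether and streams rows over the link:

```shell
# Network-mode import: no dump file is written at all; rows stream
# directly over the DB link. A directory is still needed for the log.
impdp user/pwd tables=T1,T2,T3 NETWORK_LINK=source_database_link \
      directory=CDRS_DIR logfile=impdpCDRs.log parallel=4
```

Network mode can be slower than a file-based export/import for large volumes, so it is worth benchmarking both.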
What do you think?
-
We used to do data transfer between servers using exp/imp for MIS purposes,
but it started to take too long as the data kept increasing,
so we did the following:
1) On the source DB, create a flat file with comma-separated data for the tables you want.
2) FTP the file to the target DB server.
3) Map an external table onto that flat file.
4) Pull data from that external table into the required table.
5) Or, if you have a partitioned table, pull the data into a temp table in step 4 and do an exchange partition.
You need a script for this. It's a bit of a head-cracking job, but once done it works smoothly...
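Steps 3 and 4 above might look roughly like this; the directory path, file name, and column list are illustrative only, since the actual CDR layout isn't given:

```sql
-- Assumes the flat file cdrs.csv was FTPed into /data/cdr_load.
CREATE DIRECTORY cdr_dir AS '/data/cdr_load';

CREATE TABLE cdr_ext (
  call_id    NUMBER,
  call_start DATE,
  duration   NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY cdr_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    (call_id, call_start DATE 'YYYY-MM-DD HH24:MI:SS', duration)
  )
  LOCATION ('cdrs.csv')
)
REJECT LIMIT UNLIMITED;

-- Step 4: direct-path load into the staging table.
INSERT /*+ APPEND */ INTO cdr_stage SELECT * FROM cdr_ext;
COMMIT;
```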
Rgds
Parag
-
What is the avg row length?
funky...
"I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."
"Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"
-
100M rows is not such an impressive amount of data; large but not huge.
If the servers are linked by a high-capacity network I'd go with the simplest option, which is a DB link.
If the servers are not linked by a high-capacity network I'd exp/compress - ftp - uncompress/imp. Be sure you do exp/compress and uncompress/imp concurrently, on-the-fly, not as sequential steps.
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
-
Originally Posted by PAVB
100M rows is not such an impressive amount of data, large but not huge.
Well, rather than going just by the numbers, one should look at the size of the table (avg row length * number of rows); after all, one is transferring data, and the size of the transfer is all that matters.
Let's say the avg row length is low (<100); then a DB link would be the answer irrespective of LAN settings (assuming at least 1 Mbps, and any LAN has more than that).
If the avg row length is high, I would consider transportable tablespaces or expdp.
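Assuming reasonably fresh optimizer statistics, the avg row length and a rough table size can be read straight from user_tables; the table names here are the placeholders from the thread:

```sql
-- Sizes are estimates; gather stats first (DBMS_STATS.GATHER_TABLE_STATS)
-- if they are stale, or num_rows/avg_row_len may be NULL or wrong.
SELECT table_name,
       num_rows,
       avg_row_len,
       ROUND(num_rows * avg_row_len / 1024 / 1024) AS approx_mb
FROM   user_tables
WHERE  table_name IN ('T1', 'T2', 'T3');
```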
rgds
abhay
-
Originally Posted by PAVB
... Be sure you do exp/compress and uncompress/imp concurrently, on-the-fly; not as sequential steps.
Can you explain why, PAVB? It's not something I'm familiar with.
-
Originally Posted by JMac
Can you explain why, PAVB? It's not something I'm familiar with.
Just to save 1) space, 2) I/O and 3) FTP bandwidth.
1) exp writes to a pipe, and compress writes an already-compressed file to the filesystem.
2) Because of #1, you do a lot less I/O.
3) Because you get a compressed dump file, you save network bandwidth during FTP.
Also, during uncompress/imp you will save a lot of I/O.
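The pipe trick described above is usually done with a named FIFO; a sketch of both sides, with file names and credentials as placeholders:

```shell
# Source side: exp and gzip run concurrently; only the compressed
# file ever touches disk.
mkfifo /tmp/exp_pipe
gzip -c < /tmp/exp_pipe > /backup/cdrs.dmp.gz &
exp user/pwd tables=T1,T2,T3 file=/tmp/exp_pipe
wait
rm /tmp/exp_pipe

# Target side (after FTP): the mirror image; the uncompressed dump
# never hits disk.
mkfifo /tmp/imp_pipe
gunzip -c /backup/cdrs.dmp.gz > /tmp/imp_pipe &
imp user/pwd tables=T1,T2,T3 file=/tmp/imp_pipe
wait
rm /tmp/imp_pipe
```

Note this applies to the classic exp/imp utilities; 10g Data Pump (expdp/impdp) writes its files server-side and cannot read from a pipe this way.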
-
Thanks, guys, for your advice and suggestions!
I don't know the avg row length yet. I assume some tables will be below 100 and others above.
I'll test soon both DB link and data pump options and I'll let you know.
Best regards,
Nir