Hello friends,
We're running 10gR2 (with the 10.2.0.3 patch set) on RHEL 4.5. We're testing out access to our external tables (using the external tables as staging tables to transform logs into a star schema - so each external table will have multiple select statements run on it - for a bunch of dimensions plus a fact load).

The files for the external tables are basic tab-delimited text files and they are sitting on a 5-disk RAID5 LUN on EMC Clarion (so access should be rather fast - and is via filesystem commands). The Oracle server is an 8-CPU machine with 16GB RAM and has no usage at this point (aside from my tests).

So, I set up the external table below. I then ran a simple select count(*) from z_ext_users with autotrace turned on. The plan shows me that it's doing parallel access to the file (I see the old "PX BLOCK ITERATOR" and other PX statements), but when I watch over v$session, I see AT MOST 2 processes running. I then tried with the parallel hint ("select /*+ parallel(a,8) */ ..."), but that didn't help. The query takes approx 50 secs for about 2million records in the file.

If I CTAS the external table to a new regular table and run the select count(*) on the regular table, the query takes about 2 seconds (and runs with about 10 threads by default - no hints - as I would expect).

So, what's the issue here? What constitutes more parallelism with external tables? Is there something in my setup that isn't helping the optimizer? I was expecting similar (maybe not exactly the same) query performance.

There was one statement in the docs (http://download-west.oracle.com/docs...t_concepts.htm) that confused me - maybe some one can translate the second bullet point for me (could it have something to do with us being UTF-8?):
The following file, record, and data characteristics make it impossible for a file to be processed in parallel:

* Sequential data sources (such as a tape drive or pipe)
* Data in any multibyte character set whose character boundaries cannot be determined starting at an arbitrary byte in the middle of a string

This restriction does not apply to any datafile with a fixed number of bytes per record.

* Records with the VAR format


Here is the create statement:
create table z_ext_users
(
user_id number
, version number
, urltracker varchar2(255)
, invitationcode varchar2(255)
, username varchar2(40)
)
organization external
(
type oracle_loader
default directory oracle_home
access parameters
(
records delimited by newline
fields terminated by '\t'
missing field values are null
)
location ('users.txt')
)
parallel
reject limit 10000
;