I have to ask the same question that I asked in my previous post : what are the columns within bcapture and hotlistmatch tables that are defined as not null and have unique values in them? Are there indexes on those columns and are those indexed columns being used in the query?
If there are not null columns with unique values that are indexed and they are currently being used in the query and if oracle's not using those indexes then :

What does the execution plan look like when you disable automatic pga management for your session?

SQL> alter session set workarea_size_policy=manual;
SQL> alter session set hash_area_size=1048576;
SQL> explain plan set statement_id = <.......>

As I stated earlier, I believe the cost for estimating a hash join is this:

(outer access cost * # of hash partitions) + inner access cost

and # of hash partitions = 0.8 x hash_area_size)
(db_block_size x _hash_multiblock_io_count)

...and so the amount of hash area memory allocated by oracle to a session can potentially influence the optimizer.

Just curious, how did you arrive at the number 486MB for the pga_aggregate_target?

While your queries are running on the database, you can run this query to see the amount of memory allocations as part of the automatic PGA management:

SELECT to_number(decode(SID, 65535, NULL, SID)) sid,
operation_type OPERATION,trunc(EXPECTED_SIZE/1024) ESIZE,
trunc(ACTUAL_MEM_USED/1024) MEM, trunc(MAX_MEM_USED/1024) "MAX MEM",

MEM => Memory currently being used
MAX MEM => Max allocation for the session
ESIZE => maximum amount of memory that the PGA memory manager expects to use for the operation
TSIZE => Amount that's been transferred to a temporary segment on disk. This can be very expensize for hash joins when partitions are spilled to disk.

As far as setting min shared server to 5 and max servers to 25, it depends. You could always check the v$ views related to shared servers to see if too many of them are being idle most of the time.

As you stated earlier, CPU being used for parsing is not a whole lot and so not sure if altering cursor_sharing will help you much. Keep in mind that cursor_sharing parameter does influence the optimizer in its decision making as evidenced though a 10053 trace.

Also, what is the value of db_file_multiblock_read_count parameter for your database?

Is parallel query configured for your database? Is it enabled through parallel_automatic_tuning and parallel_adaptive_multi_user therby allowing oracle to configure the parallel_max_servers ? If so, you could set the parallel degree for the table to be default. Processes take up memory and so yes, parallel processes will use memory.

It sounds like the queries being run against your database all need to be revisited in a test machine. Simulate production in terms of load and concurrency. Make sure proper indexes are in place along with a proper strategy for collecting stats. Also, play around in your test database with settings that influence the optimizer and arrive at optimal settings that will work for your environment. Treating symptoms and not curing the real problem can potentially cause more major issues.

Good luck.......