Parallel PL/SQL launcher

This is an experimental package for submitting PL/SQL calls in parallel using the Oracle SQL Parallel Query mechanism.

Many have complained that PL/SQL has no multi-threading support, as Java (for example) does. The conventional solution is to use DBMS_JOB or DBMS_SCHEDULER, and if necessary write your own wrapper to handle completion status reporting (Has it finished yet? Did it succeed?) - for an example of this approach, see job_ot.

However, SQL does have a parallel threading mechanism, in the form of Parallel Query (PQ). You simply execute your query using a PARALLEL() hint, or against tables with a parallel degree defined - particularly effective with partitioned tables, where partitions are divided among available PQ slave processes depending on resources available, and the results are combined at the end. You can think of this as a PQ controller dividing up the job, handing out tasks to PQ slaves, and collating the results. Meanwhile over in PL/SQL, you can define parallel-enabled pipelined functions that accept a ref cursor. The PQ controller decides how to divide up the ref cursor among available PQ slaves and passes them one function call each, thus invoking multiple instances of the function to handle the cursor.

Now, if you create a table "PQ_DRIVER" with 4 partitions and a parallel degree of 4, with one row in each partition, and you execute a query along the lines of select /*+ parallel(pq,4) */ thread_id from pq_driver pq; that should persuade the PQ controller to set four PQ slave processes to work on it (one per partition). And if you pass that query as a cursor parameter to a parallel-enabled pipelined function, then shouldn't that create a situation where each row is processed by a separate PQ slave process? So here is a way to use (alright, hack) the PQ engine so that it processes arbitrary PL/SQL procedure calls in parallel.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> set timing on
SQL> exec exec parallel_launch.test
sid=151 serial#=39222: Degree of parallelism: requested 4, actual 4: 20:04:16 - 20:04:18
sid=137 serial#=27074: Degree of parallelism: requested 4, actual 4: 20:04:16 - 20:04:18
sid=146 serial#=33830: Degree of parallelism: requested 4, actual 4: 20:04:16 - 20:04:18
sid=132 serial#=11582: Degree of parallelism: requested 4, actual 4: 20:04:16 - 20:04:18

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.55


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set timing on
SQL> exec parallel_launch.test
sid=123 serial#=21233: Degree of parallelism: requested 4, actual 2: 06:07:47 - 06:07:49
sid=123 serial#=21233: Degree of parallelism: requested 4, actual 2: 06:07:49 - 06:07:51
sid=127 serial#=21506: Degree of parallelism: requested 4, actual 2: 06:07:47 - 06:07:49
sid=127 serial#=21506: Degree of parallelism: requested 4, actual 2: 06:07:49 - 06:07:51

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.83

The degree of parallelism is lower in 11g than in 10.2 - possibly 11g is more intelligent and decides that four slaves would be excessive for a table with only four rows, especially in my single-processor RAM-starved desktop VM.

I also tested in 10.1.0.3.0:

sid=439 serial#=58467:   Degree of parallelism: requested 4, actual 4: 16:18:57 - 16:18:59
sid=439 serial#=58467: Degree of parallelism: requested 4, actual 4: 16:18:59 - 16:19:01
sid=439 serial#=58467: Degree of parallelism: requested 4, actual 4: 16:19:01 - 16:19:03
sid=439 serial#=58467: Degree of parallelism: requested 4, actual 4: 16:19:03 - 16:19:05

I got the four PQ slaves I asked for, but everything was executed by just one of them. What this was really telling me is that I didn't fully understand PQ.

I had another look at my initialisation parameters, and I noticed CPU_COUNT was set to 1 (which was true, there is only 1 cpu) and PARALLEL_THREADS_PER_CPU was 2 (the default setting). I found that setting PARALLEL_THREADS_PER_CPU = 4 (or CPU_COUNT = 2) enabled the full 4 parallel threads requested in parallel_launch.test(). Also, since gathering statistics on the PQ_DRIVER table had the effect of serialising queries against it, and since by default the background stats job will do this, I added a call to dbms_stats.lock_table_stats(user,'pq_driver'). I still don't know what statistics cause this to happen and why, and it's a little puzzling because I would have expected dynamic sampling (enabled by default since 10g) to give the same results.

(To investigate further, it might be worth looking at some of the trace events, such as 10975 trace execution of parallel propagation, 10383 auto parallelization testing event and 10384 parallel dataflow scheduler tracing.)

Also I noticed a potential bug in the pq_submit function. It queries v$px_session as a self-diagnostic check, but if for any reason no parallel slaves were used there will be no rows in it, leading to a no_data_found exception because I used a SELECT INTO construction. Since no_data_found exceptions are automatically handled by the SQL engine, my pq_submit might appear to return successfully after mysteriously doing nothing. That's now fixed.

This has not been stress tested and I make no promises that it will work. Let me know what you find.