Oracle Technical Track

Adding DML Parallelization to Oracle Jobs

3 min read

Oct 5, 2021

When investigating why an Oracle job is taking a long time to execute, normally the DBA will identify the DML instructions or select queries running longer and different approaches might be used to improve the performance, e.g., create new indexes, gather statistics, use of hints, rewrite the queries or other convenient solutions.

One more option could be modifying the Oracle job to run in parallel (at the session level). Below is a small test carried out to show how a job can use Parallel DML where just the job action needs to be modified. The test was performed in a virtual box running Linux Red Hat 7.7 and Oracle Database Enterprise Edition 12.2.0.1.

1. Create a test table using the dba_objects as data source:

show user
USER is "DBUSER1"
create table abc tablespace users as select * from dba_objects;

2. Verify the current ABC table size, just 11 MB after the first load:

select owner, segment_name, segment_type, tablespace_name, round(bytes/1024/1024) bytes_MB, 
extents, MAX_EXTENTS  from dba_segments
where 
owner = 'DBUSER1' and
segment_name in ('ABC'); 

OWNER                SEGMENT_NA SEGMENT_TY TABLESPACE_NAME        BYTES_MB    EXTENTS MAX_EXTENTS
-------------------- ---------- ---------- -------------------- ---------- ---------- -----------
DBUSER1              ABC        TABLE      USERS                        11         26  2147483645

3. Create a simple procedure to load data into a test table (insert as select instruction):

CREATE or replace PROCEDURE PLOADTEST
is
BEGIN
    insert into abc select * from abc;
    commit;
END;
/

4. Create a job to run the procedure PLOADTEST, at the beginning it will run the job in serial mode:

BEGIN
    DBMS_SCHEDULER.create_job (
        job_name        => 'JOB_TEST',
        job_type        => 'PLSQL_BLOCK',
        job_action      => 'BEGIN 
                            PLOADTEST;  
                            END;',
        start_date 	=> TO_TIMESTAMP_TZ('2021-09-25 01:00:00.000000000 AMERICA/NEW_YORK',
                           'YYYY-MM-DD HH24:MI:SS.FF TZR'),
        repeat_interval => 'FREQ=DAILY;BYHOUR=01;BYMINUTE=00;',
        end_date        => NULL,
        enabled         => TRUE,
        comments        => 'Job to test');
END;
/

5. Run the job JOB_TEST a couple of times to populate more data into ABC table:

exec dbms_scheduler.run_job('JOB_TEST',FALSE);

6. Verify the current ABC table size. In the example, its size is now ~1.3GB:

select owner, segment_name, segment_type, tablespace_name, round(bytes/1024/1024) bytes_MB, 
extents, MAX_EXTENTS  from dba_segments
where 
owner = 'DBUSER1' and
segment_name in ('ABC');  

OWNER                SEGMENT_NA SEGMENT_TY TABLESPACE_NAME        BYTES_MB    EXTENTS MAX_EXTENTS
-------------------- ---------- ---------- -------------------- ---------- ---------- -----------
DBUSER1              ABC        TABLE      USERS                      1344        204  2147483645

7. Check the execution time to complete the job. The latest entry shows three minutes to complete the procedure:

SELECT owner, job_name,status,run_duration FROM dba_scheduler_job_run_details 
WHERE owner = 'DBUSER1' and job_name = 'JOB_TEST'
ORDER BY job_name,actual_start_date;
  
OWNER      JOB_NAME  STATUS      RUN_DURATION
---------- --------- ----------  --------------------
DBUSER1    JOB_TEST  SUCCEEDED   +000 00:00:01
DBUSER1    JOB_TEST  SUCCEEDED   +000 00:00:01
DBUSER1    JOB_TEST  SUCCEEDED   +000 00:00:03
DBUSER1    JOB_TEST  SUCCEEDED   +000 00:00:06
DBUSER1    JOB_TEST  SUCCEEDED   +000 00:00:12
DBUSER1    JOB_TEST  SUCCEEDED   +000 00:00:43
DBUSER1    JOB_TEST  SUCCEEDED   +000 00:03:01

8. Change the Job Action to use parallel DML and four threads. It will depend on how many server configuration–related CPU cores are available and always avoiding extra stress on the system by preventing too many processes from running in parallel. The virtual machine where the test was performed has four CPU cores:

BEGIN
        DBMS_SCHEDULER.SET_ATTRIBUTE (
        name         =>  'DBUSER1.JOB_TEST',
        attribute    =>  'job_action',
        value        =>  'BEGIN
                           EXECUTE IMMEDIATE ''ALTER SESSION ENABLE PARALLEL DML'';
                           EXECUTE immediate ''ALTER SESSION FORCE PARALLEL QUERY parallel 4'';
                           PLOADTEST; 
                          END;');
END;
/

9. Run the job again and verify the parallelism usage:

exec dbms_scheduler.run_job('JOB_TEST',FALSE);

select * from v$pq_sysstat;

STATISTIC                   VALUE     CON_ID
---------------------- ---------- ----------
Servers Busy                    4          0

10. Check the execution time to complete the job using parallelism. In the example, it took 51 seconds to complete using parallel DML at the session level:

SELECT owner, job_name,status,run_duration FROM dba_scheduler_job_run_details 
WHERE owner = 'DBUSER1' and job_name = 'JOB_TEST'
ORDER BY job_name,actual_start_date;
 
OWNER      JOB_NAME   STATUS      RUN_DURATION
---------- ---------- ----------  --------------------
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:00:01
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:00:01
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:00:03
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:00:06
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:00:12
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:00:43
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:03:01
DBUSER1    JOB_TEST   SUCCEEDED   +000 00:00:51

11. Verify the current ABC table size. Here, it’s ~2.6GB:

select owner, segment_name, segment_type, tablespace_name, round(bytes/1024/1024) bytes_MB, 
extents, MAX_EXTENTS  from dba_segments
where 
owner = 'DBUSER1' and
segment_name in ('ABC');
  
OWNER                SEGMENT_NA SEGMENT_TY TABLESPACE_NAME        BYTES_MB    EXTENTS MAX_EXTENTS
-------------------- ---------- ---------- -------------------- ---------- ---------- -----------
DBUSER1              ABC        TABLE      USERS                      2635        338  2147483645

In summary, to load ~1.3GB, the job took 51 seconds when using parallelism, which was faster than its previous run to load ~650MB, where the execution time was three minutes.

Hopefully this will help with job performance improvements, where a quick modification can complete the process faster.

For more helpful tips from the experts at Pythian, subscribe to our weekly blog post emails at the top of this page or here.