Adding DML Parallelization to Oracle Jobs
When investigating why an Oracle job is taking a long time to execute, normally the DBA will identify the DML instructions or select queries running longer and different approaches might be used to improve the performance, e.g., create new indexes, gather statistics, use of hints, rewrite the queries or other convenient solutions.
One more option could be modifying the Oracle job to run in parallel (at the session level). Below is a small test carried out to show how a job can use Parallel DML where just the job action needs to be modified. The test was performed in a virtual box running Linux Red Hat 7.7 and Oracle Database Enterprise Edition 12.2.0.1.
1. Create a test table using the dba_objects as data source:
show user USER is "DBUSER1" create table abc tablespace users as select * from dba_objects;
2. Verify the current ABC table size, just 11 MB after the first load:
select owner, segment_name, segment_type, tablespace_name, round(bytes/1024/1024) bytes_MB, extents, MAX_EXTENTS from dba_segments where owner = 'DBUSER1' and segment_name in ('ABC'); OWNER SEGMENT_NA SEGMENT_TY TABLESPACE_NAME BYTES_MB EXTENTS MAX_EXTENTS -------------------- ---------- ---------- -------------------- ---------- ---------- ----------- DBUSER1 ABC TABLE USERS 11 26 2147483645
3. Create a simple procedure to load data into a test table (insert as select instruction):
CREATE or replace PROCEDURE PLOADTEST is BEGIN insert into abc select * from abc; commit; END; /
4. Create a job to run the procedure PLOADTEST, at the beginning it will run the job in serial mode:
BEGIN DBMS_SCHEDULER.create_job ( job_name => 'JOB_TEST', job_type => 'PLSQL_BLOCK', job_action => 'BEGIN PLOADTEST; END;', start_date => TO_TIMESTAMP_TZ('2021-09-25 01:00:00.000000000 AMERICA/NEW_YORK', 'YYYY-MM-DD HH24:MI:SS.FF TZR'), repeat_interval => 'FREQ=DAILY;BYHOUR=01;BYMINUTE=00;', end_date => NULL, enabled => TRUE, comments => 'Job to test'); END; /
5. Run the job JOB_TEST a couple of times to populate more data into ABC table:
exec dbms_scheduler.run_job('JOB_TEST',FALSE);
6. Verify the current ABC table size. In the example, its size is now ~1.3GB:
select owner, segment_name, segment_type, tablespace_name, round(bytes/1024/1024) bytes_MB, extents, MAX_EXTENTS from dba_segments where owner = 'DBUSER1' and segment_name in ('ABC'); OWNER SEGMENT_NA SEGMENT_TY TABLESPACE_NAME BYTES_MB EXTENTS MAX_EXTENTS -------------------- ---------- ---------- -------------------- ---------- ---------- ----------- DBUSER1 ABC TABLE USERS 1344 204 2147483645
7. Check the execution time to complete the job. The latest entry shows three minutes to complete the procedure:
SELECT owner, job_name,status,run_duration FROM dba_scheduler_job_run_details WHERE owner = 'DBUSER1' and job_name = 'JOB_TEST' ORDER BY job_name,actual_start_date; OWNER JOB_NAME STATUS RUN_DURATION ---------- --------- ---------- -------------------- DBUSER1 JOB_TEST SUCCEEDED +000 00:00:01 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:01 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:03 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:06 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:12 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:43 DBUSER1 JOB_TEST SUCCEEDED +000 00:03:01
8. Change the Job Action to use parallel DML and four threads. It will depend on how many server configuration–related CPU cores are available and always avoiding extra stress on the system by preventing too many processes from running in parallel. The virtual machine where the test was performed has four CPU cores:
BEGIN DBMS_SCHEDULER.SET_ATTRIBUTE ( name => 'DBUSER1.JOB_TEST', attribute => 'job_action', value => 'BEGIN EXECUTE IMMEDIATE ''ALTER SESSION ENABLE PARALLEL DML''; EXECUTE immediate ''ALTER SESSION FORCE PARALLEL QUERY parallel 4''; PLOADTEST; END;'); END; /
9. Run the job again and verify the parallelism usage:
exec dbms_scheduler.run_job('JOB_TEST',FALSE); select * from v$pq_sysstat; STATISTIC VALUE CON_ID ---------------------- ---------- ---------- Servers Busy 4 0
10. Check the execution time to complete the job using parallelism. In the example, it took 51 seconds to complete using parallel DML at the session level:
SELECT owner, job_name,status,run_duration FROM dba_scheduler_job_run_details WHERE owner = 'DBUSER1' and job_name = 'JOB_TEST' ORDER BY job_name,actual_start_date; OWNER JOB_NAME STATUS RUN_DURATION ---------- ---------- ---------- -------------------- DBUSER1 JOB_TEST SUCCEEDED +000 00:00:01 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:01 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:03 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:06 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:12 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:43 DBUSER1 JOB_TEST SUCCEEDED +000 00:03:01 DBUSER1 JOB_TEST SUCCEEDED +000 00:00:51
11. Verify the current ABC table size. Here, it’s ~2.6GB:
select owner, segment_name, segment_type, tablespace_name, round(bytes/1024/1024) bytes_MB, extents, MAX_EXTENTS from dba_segments where owner = 'DBUSER1' and segment_name in ('ABC'); OWNER SEGMENT_NA SEGMENT_TY TABLESPACE_NAME BYTES_MB EXTENTS MAX_EXTENTS -------------------- ---------- ---------- -------------------- ---------- ---------- ----------- DBUSER1 ABC TABLE USERS 2635 338 2147483645
In summary, to load ~1.3GB, the job took 51 seconds when using parallelism, which was faster than its previous run to load ~650MB, where the execution time was three minutes.
Hopefully this will help with job performance improvements, where a quick modification can complete the process faster.
For more helpful tips from the experts at Pythian, subscribe to our weekly blog post emails at the top of this page or here.
Share this
You May Also Like
These Related Stories
No Comments Yet
Let us know what you think