This article presents a number of solutions for making calls to PL/SQL functions from SQL more efficient.
Related articles.
Sometimes we require functions in the select list of a query. These may be built in functions, or functions we've coded ourselves. The default action is to call the function for each row returned by the query. If the function is deterministic, for the same input parameter signature it gives the same output, and the number of unique parameter signatures used is low compared to the number of rows processed, this can represent a lot of wasted effort.
The example below creates a test table with 10 rows, of 3 distinct values. The function contains a call to DBMS_LOCK.SLEEP
, to represent a workload that requires a considerable amount of internal processing, taking approximately 1 second per call. The function is then used in the select list of a query from the test table.
conn / as sysdba grant execute on dbms_lock to test; conn test/test drop table func_test; create table func_test ( id number ); insert into func_test select case when level = 10 then 3 when mod(level, 2) = 0 then 2 else 1 end from dual connect by level <= 10; commit; create or replace function slow_function (p_in in number) return number as begin dbms_lock.sleep(1); return p_in; end; / set timing on select slow_function(id) from func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:10.14 SQL>
The elapsed time clearly shows the function was called 10 times, once for each row, even though there were only 3 distinct input parameters.
The first thing you should do it attempt to remove any unnecessary function calls. It sounds obvious, but lazy programming and a lack of SQL knowledge may lead you into using PL/SQL where there is a SQL alternative. SQL keeps evolving and features such as analytic functions allow you to do incredible things without resorting to PL/SQL. Assuming this is not possible, the methods shown below should help reduce the overhead associated with the function calls.
Remember, the sweet spot for any of the caching solutions is where the number of unique parameter signatures are low compared to the total number of rows processed and any dependent data is not too volatile. If functions are dependent on tables and views containing volatile data, the contents of the cache will quickly become out of date.
Rewriting the function call into a scalar subquery allows Oracle to use scalar subquery caching to optimize the number of function calls needed to complete the query. Oracle builds an in-memory hash table to cache the results of scalar subqueries.
set timing on select (select slow_function(id) from dual) from func_test; (SELECTSLOW_FUNCTION(ID)FROMDUAL) --------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.03 SQL>
The elapsed time shows Oracle was able to cache the result of the subquery and reuse it for subsequent rows where necessary.
Caching is only available for the lifetime of the query. Subsequent queries have no access to the cached values of previous runs.
The size of the hash table is quite important, as it does limit the extent to which Oracle can cache scalar subqueries. In Oracle 10g and 11g, the hash table contains 255 buckets. If there are more than 255 distinct values, the 256th and onward values are not cached in the hash table. Likewise, if there are hash collisions, the subqueries resulting in the collisions are not placed in the hash table. Even in this situation you can still get performance improvements, because Oracle always keeps the latest subquery value, in addition to those placed in the hash table, so provided the data set is ordered appropriately, you can still get great performance when there are more than 255 distinct values, or hash collisions.
We will mention scalar subquery caching again later.
The DETERMINSTIC
hint has been available since Oracle 8i, where it was first introduced to mark functions as deterministic to allow them to be used in function-based indexes, but it is only in Oracle 10gR2 onward that it has some affect on how the function is cached in SQL.
In the following example, labelling the function as deterministic does improve performance, but the caching is limited to a single fetch, so it is affected by the array size. In this example, the array size is varied using the SQL*Plus "SET ARRAYSIZE n
" command.
create or replace function slow_function (p_in in number) return number deterministic as begin dbms_lock.sleep(1); return p_in; end; / set timing on set arraysize 15 select slow_function(id) from func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:04.04 SQL> set timing on set arraysize 2 select slow_function(id) from func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:10.01 SQL>
The difference in array size produced drastically different performance, showing that caching is only available for the lifetime of the fetch. Subsequent queries (or fetches) have no access to the cached values of previous runs.
Oracle 11g introduced two new caching mechanisms:
We can use the first mechanism to cache the results of our slow function, allowing us to remove the need to rerun it for the same parameter signature.
create or replace function slow_function (p_in in number) return number result_cache as begin dbms_lock.sleep(1); return p_in; end; / set timing on select slow_function(id) from func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.09 SQL>
The advantage of this method is the cached information can be reused by any session and dependencies are managed automatically. If we run the query again we get even better performance because we can used the cached values without calling the function at all.
set timing on select slow_function(id) from func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.02 SQL>
In versions prior to 11g, we can manually cache the values of function calls in a PL/SQL collection. The following code builds a caching layer in front of the calls to the slow function.
-- Recreate the original function to remove any amendments made in previous examples. create or replace function slow_function (p_in in number) return number as begin dbms_lock.sleep(1); return p_in; end; / create or replace package cached_lookup_api as function get_cached_value (p_id in number) return number; procedure clear_cache; end cached_lookup_api; / create or replace package body cached_lookup_api as type t_tab is table of number index by binary_integer; g_tab t_tab; g_last_use date := sysdate; g_max_cache_age number := 10/(24*60); -- 10 minutes -- ----------------------------------------------------------------- function get_cached_value (p_id in number) return number as l_value number; begin if (sysdate - g_last_use) > g_max_cache_age then -- Older than 10 minutes. Delete cache. g_last_use := sysdate; clear_cache; end if; begin l_value := g_tab(p_id); exception when no_data_found then -- call function and cache data. l_value := slow_function(p_id); g_tab(p_id) := l_value; end; return l_value; end get_cached_value; -- ----------------------------------------------------------------- -- ----------------------------------------------------------------- procedure clear_cache as begin g_tab.delete; end; -- ----------------------------------------------------------------- end cached_lookup_api; /
If we query the caching layer, it only calls the slow function if a suitable value isn't already cached.
set timing on select cached_lookup_api.get_cached_value(id) from func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.10 SQL>
Executing the query again in the same session allows it to reuse the cached values without calling the function at all.
set timing on select cached_lookup_api.get_cached_value(id) from func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.01 SQL>
There are a few gotchas associated with this method.
The manual caching approach can also be performed using a context, rather than a PL/SQL collection. The advantage of this is the context can be defined as globally accessible, making the data shareable between sessions. The following package is a rewrite of the previous example, modified to use a context. Notice the first line is the creation of the context itself.
-- Recreate the original function to remove any amendments made in previous examples. create or replace function slow_function (p_in in number) return number as begin dbms_lock.sleep(1); return p_in; end; / create or replace context cache_context using cached_lookup_api accessed globally; create or replace package cached_lookup_api as function get_cached_value (p_id in number) return number; procedure clear_cache; end cached_lookup_api; / create or replace package body cached_lookup_api as g_last_use date := sysdate; g_max_cache_age number := 10/(24*60); -- 10 minutes g_context_name varchar2(20) := 'cache_context'; -- ----------------------------------------------------------------- function get_cached_value (p_id in number) return number as l_value number; begin if (sysdate - g_last_use) > g_max_cache_age then -- Older than 10 minutes. Delete cache. g_last_use := sysdate; clear_cache; end if; l_value := sys_context(g_context_name, p_id); if l_value is null then l_value := slow_function(p_id); dbms_session.set_context(g_context_name, p_id, l_value); end if; return l_value; end get_cached_value; -- ----------------------------------------------------------------- -- ----------------------------------------------------------------- procedure clear_cache as begin dbms_session.clear_all_context(g_context_name); end; -- ----------------------------------------------------------------- end cached_lookup_api; /
Like the example in the previous section, if we query the caching layer, it only calls the slow function if a suitable value isn't already cached.
set timing on select cached_lookup_api.get_cached_value(id) from func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.07 SQL>
Executing the query again in the same session allows it to reuse the cached values without calling the function at all.
set timing on select cached_lookup_api.get_cached_value(id) from func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.01 SQL>
Creating the context as ACCESSED GLOBALLY
allows the cache to be shared between sessions, as shown in the example below, which starts by connecting to a new session.
-- Create new session. conn test/test set timing on select cached_lookup_api.get_cached_value(id) from func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.02 SQL>
Manually caching data in contexts shares most of the same gotchas as caching in PL/SQL collections.
The automatic SQL transpiler in Oracle 23c allows some functions to be converted into SQL expressions to reduce the overhead of function calls in SQL.
We have discussed a number of caching mechanisms in addition to scalar subquery caching, but do these alternative caching methods negate the need for scalar subquery caching? The answer to that is no, because scalar subquery caching is the only mechanism that effectively reduces the number of context switches between SQL and PL/SQL. To show this we will build a new test table with 100,000 rows of the same value.
drop table t2; create table t2 ( id number ); insert /*+ append */ into t2 select 1 from dual connect by level <= 100000; commit;
Recreate the slow function using the result cache, but without the sleep.
create or replace function slow_function (p_in in number) return number result_cache as begin --dbms_lock.sleep(1); return p_in; end; /
Compare the CPU usage of a regular query using the result cache, with that of a scalar subquery using the result cache.
set serveroutput on declare l_start number; begin l_start := dbms_utility.get_cpu_time; for cur_rec in (select slow_function(id) from t2) loop null; end loop; dbms_output.put_line('Regular Query (SELECT List): ' || (dbms_utility.get_cpu_time - l_start) || ' hsecs CPU Time'); l_start := dbms_utility.get_cpu_time; for cur_rec in (select (select slow_function(id) from dual) from t2) loop null; end loop; dbms_output.put_line('Scalar Subquery (SELECT List): ' || (dbms_utility.get_cpu_time - l_start) || ' hsecs CPU Time'); END; / Regular Query (SELECT List): 42 hsecs CPU Time Scalar Subquery (SELECT List): 6 hsecs CPU Time PL/SQL procedure successfully completed. SQL>
This difference in CPU usage is also visible when using scalar subqueries in the WHERE
clause.
set serveroutput on declare l_start number; begin l_start := dbms_utility.get_cpu_time; for cur_rec in (select 1 from t2 where id = slow_function(id)) loop null; end loop; dbms_output.put_line('Regular Query (WHERE): ' || (dbms_utility.get_cpu_time - l_start) || ' hsecs CPU Time'); l_start := dbms_utility.get_cpu_time; for cur_rec in (select 1 from t2 where id = (select slow_function(id) from dual)) loop null; end loop; dbms_output.put_line('Scalar Subquery (WHERE): ' || (dbms_utility.get_cpu_time - l_start) || ' hsecs CPU Time'); END; / Regular Query (WHERE): 49 hsecs CPU Time Scalar Subquery (WHERE): 13 hsecs CPU Time PL/SQL procedure successfully completed. SQL>
What causes this difference in CPU usage? With the exception of scalar subquery caching, all the other caching methods discussed here still require calls to the PL/SQL function, which result in context switches between SQL and PL/SQL. These context switches account for the extra CPU load.
So even when you are using alternative caching features to improve performance of function calls between multiple execution, or between sessions, you should still use scalar subquery caching to reduce context switching.
The caching methods discussed previously are also appropriate for the WHERE
clause, especially scalar subquery caching to reduce context switching.
Applying a function on a column in the WHERE
clause of a query can result in poor performance, because it prevents the optimizer from using regular indexes on that column. Assuming the query can't be rewritten to remove the need for the function call, one option is to use a function based index.
You should also consider virtual columns, introduced in Oracle 11g.
Oracle's read consistency model uses undo to maintain a consistent view of the data as it was at the point in time when a SQL statement was issued. This same read consistency model applies to SQL statements issued directly, or from within PL/SQL. This has an interesting effect on SQL statements containing functions, which internally contain SQL statements. Each SQL statement in the function is read consistent with the point in time it is issued, not with the point in time the function is called and not with the point in time the outer SQL statement was issued. In effect, calling functions in SQL statements, that internally issue SQL breaks the Oracle read consistency model.
Bryn Llewellyn came to one of my presentations on efficient function calls from SQL and we had a conversation about this issue. During the conversation I asked the question, "Do you care?", which might seem strange, but we have to keep this issue in context.
You have to make a judgement for yourself how concerned you are about this issue on a case-by-case basis. If you really want to be certain you have a read-consistent result, your best option is to use flashback query in the following manner.
exec dbms_flashback.enable_at_time(dbms_flashback.get_system_change_number); select slow_function(id) from func_test; exec dbms_flashback.disable;
Using this method, any SQL issued between the ENABLE_AT_TIME
and the DISABLE
calls is consistent with the same point in time.
In many cases, I think this is an academic exercise, not a real concern, but it is important you understand the issue or you may fall victim to it in the specific cases where it does matter.
For more information see:
Hope this helps. Regards Tim...
Back to normal view: https://oracle-base.com/articles/misc/efficient-function-calls-from-sql