What is the alternative to REGEXP REPLACE for creating a functional index?

After migrating to Oracle 18c Enterprise Edition, creating a functional index (FBI) fails.

This is what DDLs look like for creating an index:

CREATE INDEX my_index ON my_table(
    UPPER (REGEXP_REPLACE ("DEPT_NUM", '[^[:alnum:]]', NULL, 1, 0)));

I get the following error:

ORA-01743: only pure functions can be indexed
01743. 00000 -  "only pure functions can be indexed"
*Cause:    The indexed function uses SYSDATE or the user environment.
*Action:   PL/SQL functions must be pure (RNDS, RNPS, WNDS, WNPS).  SQL
           expressions must not use SYSDATE, USER, USERENV(), or anything
           else dependent on the session state.  NLS-dependent functions
           are OK.

Is this a known bug in 18c?
If such a feature is no longer supported for the FBI, what other way is there to provide such functionality for the index?

Author: 0xdb, 2020-02-10

1 answers

Prior to release 12.2.0.1, it was possible to create a functional index (FBI) with functions whose result was not deterministic.


For example, in 11.2.0.1 it looked like it worked:

create table depts (dept_num) as
    select cast (column_value as varchar2 (16)) dept_num
    from xmlTable ('"12345","ABC123", "XYZ#123"');
create index depts_idx on depts(
     upper (regexp_replace ((dept_num), '[^[:alnum:]]', null, 1, 0)));

Index DEPTS_IDX created.

The same thing in 12.2.0.1 will end with an error when creating an index:

ORA-01743: only pure functions can be indexed

As for the user functions, everything is clear here, and in 11.2.0.2, and in 12.2.0.1 it is explicitly stated that:

Any user-defined function referenced in column_expression must be declared as DETERMINISTIC.

In the case of the internal SQL function REGEXP_REPLACE, Oracle recognized that its use in the FBI is a bug and, starting with release 12.2.0.1, fixed it. Read more @ConnorMcDonald on Ask Tom:

Bug 20804063 ORA-1499 as REGEXP_REPLACE is allowed to be used in Function-based indexes (FBI)

And indeed, the result of this function depends on the session's TLS settings, i.e. it is nondeterministic. For example, the result of the same query with umlauts will be different:

alter session set nls_language = 'german';  --> null
alter session set nls_language = 'english'; --> ÄÖÜ

select (regexp_replace ('ABCÄÖÜ','[A-Z]*')) res from dual;

One possible solution is to limit department IDs to ASCII characters only. In this case, it is enough to wrap the logic in a custom function with a deterministic result. For example, like this:

create or replace function normalizeDeptNum (deptNum varchar2) return varchar deterministic is
begin
    if replace (asciistr (deptNum), asciistr (chr (92)), chr (92)) != deptNum then
        raise_application_error (
            -20000, 'The value in deptNum contains non-ascii char >'||deptNum||'<');
    end if;
    return upper (regexp_replace (deptNum, '[^[:alnum:]]', null, 1, 0));
end;
/
create index depts_idx on depts (normalizeDeptNum (dept_num));

Index DEPTS_IDX created.

insert into depts (dept_num) values ('AAA//123');

1 row inserted.

insert into depts (dept_num) values ('ÄÄÄ//123');

ORA-20000: The value in deptNum contains non-ascii char >ÄÄÄ//123<
 4
Author: 0xdb, 2020-02-11 14:03:11