
The following code shows my problem. I need to extract the JOBD value (which in reality is variable and matches \w{1,10}).

    with A (N, STR) as (
           values
          (1, 'SBMJOB MYJOB'),
          (2, 'SBMJOB JOB '),
          (3, 'SBMJOB JOB JOBD'),
          (4, 'SBMJOB JOB4 JOBD '),
          (5, 'SBMJOB JOB CMD('),
          (6, 'SBMJOB JOB JOBD CMD('),
          (7, 'SBMJOB JOB JOBD JOBQ('),
          (8, 'SBMJOB JOB CMD('),
          (9, 'SBMJOB JOB JOBQ(')
         ),
         R (REGEX) as (
           values
          'sbmjob(?: +((?!\w+ *\()\w+))?(?: +((?!\w+\()\w+))?(?: +(\w+)\()?'
         )
      select N,
             STR,
             REGEXP_EXTRACT(STR, REGEX, 1, 1, 'i', 1) JOB,
             REGEXP_EXTRACT(STR, REGEX, 1, 1, 'i', 2) JOBD,
             length(REGEXP_EXTRACT(STR, REGEX, 1, 1, 'i', 2)) LENGTH_JOBD,
             REGEXP_EXTRACT(STR, REGEX, 1, 1, 'i', 3) FIRST_KEYWORD
        from A
             cross join R
    ;   

I expect it to return this:

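| N | STR | JOB | JOBD | LENGTH_JOBD | FIRST_KEYWORD |
|---|-----|-----|------|-------------|---------------|
| 1 | SBMJOB MYJOB | MYJOB | null | null | null |
| 2 | SBMJOB JOB | JOB | null | null | null |
| 3 | SBMJOB JOB JOBD | JOB | JOBD | 4 | null |
| 4 | SBMJOB JOB4 JOBD | JOB4 | JOBD | 4 | null |
| 5 | SBMJOB JOB CMD( | JOB | null | null | CMD |
| 6 | SBMJOB JOB JOBD CMD( | JOB | JOBD | 4 | CMD |
| 7 | SBMJOB JOB JOBD JOBQ( | JOB | JOBD | 4 | JOBQ |
| 8 | SBMJOB JOB CMD( | JOB | null | null | CMD |
| 9 | SBMJOB JOB JOBQ( | JOB | null | null | JOBQ |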

And it does on DB2 LUW (see fiddle).

But it returns this on DB2 for IBM i:

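| N | STR | JOB | JOBD | LENGTH_JOBD | FIRST_KEYWORD |
|---|-----|-----|------|-------------|---------------|
| 1 | SBMJOB MYJOB | MYJOB | | 0 | |
| 2 | SBMJOB JOB | JOB | | 0 | |
| 3 | SBMJOB JOB JOBD | JOB | | 0 | |
| 4 | SBMJOB JOB4 JOBD | JOB4 | | 0 | |
| 5 | SBMJOB JOB CMD( | JOB | | 0 | CMD |
| 6 | SBMJOB JOB JOBD CMD( | JOB | | 0 | |
| 7 | SBMJOB JOB JOBD JOBQ( | JOB | | 0 | |
| 8 | SBMJOB JOB CMD( | JOB | | 0 | CMD |
| 9 | SBMJOB JOB JOBQ( | JOB | | 0 | JOBQ |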

I think I have to open a ticket, but do you have any ideas?

  • Different regex engines sometimes handle lookarounds differently, so I would verify that DB2 LUW and DB2 for IBM i use the same regex engine. Commented Sep 19 at 13:42
  • Unfortunately, I cannot find the information in the documentation. I assume that both target the behaviour of Oracle's Commented Sep 19 at 14:17
  • From the results, it just means the captured substrings are not returned. Commented Sep 19 at 16:25
  • Opening a ticket is not the right approach here. The issue is not a Db2 bug, but rather a limitation of the regular expression engine used on Db2 for i. Unlike Db2 LUW, Db2 for i does not support advanced constructs such as negative lookahead ((?! … )), which is why your expression works on LUW but fails on IBM i. The best solution is to rewrite the expression without lookaheads (e.g. by using explicit token boundaries or additional CASE logic) or to extract candidates with REGEXP_SUBSTR and filter them with standard SQL conditions. Commented Sep 20 at 7:27
  • @JonasMetzler Your comment is largely incorrect. DB2 for i uses International Components for Unicode (ICU) to provide REGEXP functionality, and my reading of the ICU documentation at unicode-org.github.io/icu/userguide/strings/… indicates that negative lookahead is indeed supported. I would open a ticket with IBM; this looks like a bug. Commented Oct 3 at 15:01
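
One way to follow up on the engine question raised in these comments is to run the unchanged pattern through a third backtracking engine that supports negative lookahead. The sketch below uses Python's `re` module, which is neither ICU nor whatever DB2 LUW uses, so it only shows what the pattern means under ordinary lookahead semantics; it says nothing about Db2 itself. The helper name `extract` and the `SAMPLES` list are illustrative, not part of the question.

```python
import re

# Pattern copied verbatim from the question; 'i' flag mapped to re.IGNORECASE.
PATTERN = re.compile(
    r"sbmjob(?: +((?!\w+ *\()\w+))?(?: +((?!\w+\()\w+))?(?: +(\w+)\()?",
    re.IGNORECASE,
)

# The nine STR values from the question, including trailing spaces.
SAMPLES = [
    "SBMJOB MYJOB",
    "SBMJOB JOB ",
    "SBMJOB JOB JOBD",
    "SBMJOB JOB4 JOBD ",
    "SBMJOB JOB CMD(",
    "SBMJOB JOB JOBD CMD(",
    "SBMJOB JOB JOBD JOBQ(",
    "SBMJOB JOB CMD(",
    "SBMJOB JOB JOBQ(",
]


def extract(text):
    """Return (JOB, JOBD, FIRST_KEYWORD); groups that did not take part are None."""
    match = PATTERN.search(text)
    if match is None:
        return (None, None, None)
    return match.groups()


if __name__ == "__main__":
    for n, text in enumerate(SAMPLES, start=1):
        print(n, repr(text), extract(text))
```

Under these semantics an optional group that does not participate in the match yields no value (`None` here, null in SQL) rather than an empty string, which is the difference visible between the two result tables in the question.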

1 Answer

-1

The regular expression engines of DB2 LUW and DB2 for i (IBM i / AS400) are not implemented in exactly the same way.

On LUW, REGEXP_EXTRACT supports lookahead/lookbehind constructs, such as (?! … ).

On IBM i, the regex engine is simpler — it does not support lookahead.

Please try this:

    -- A is the common table expression defined in the question
    select N, STR,
           REGEXP_SUBSTR(STR, 'SBMJOB +(\w+)', 1, 1, 'i', 1) as JOB,
           REGEXP_SUBSTR(STR, 'SBMJOB +\w+ +(\w+)', 1, 1, 'i', 1) as JOBD,
           length(REGEXP_SUBSTR(STR, 'SBMJOB +\w+ +(\w+)', 1, 1, 'i', 1)) as LENGTH_JOBD,
           REGEXP_SUBSTR(STR, '([A-Z]+)\(', 1, 1, 'i', 1) as FIRST_KEYWORD
      from A
    ;
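
Note that a JOBD pattern of the form `SBMJOB +\w+ +(\w+)` also captures a keyword such as `CMD` in `SBMJOB JOB CMD(`, because nothing stops the second token from being followed by `(`. A lookaround-free way to exclude that case is to require a space or the end of the string after the captured token. The sketch below checks that idea against the question's nine sample strings using Python's `re` module, not ICU, so it validates the pattern logic only; whether the same patterns behave identically in `REGEXP_SUBSTR` on Db2 for i is untested. It assumes a keyword is always immediately followed by `(` with no space in between, which is narrower than the question's first lookahead. The names `JOB_RE`, `JOBD_RE`, `KEYWORD_RE`, `first_group` and `parse` are illustrative.

```python
import re

# A token counts as JOB or JOBD only when followed by a space or end of string,
# so a keyword glued to "(" (for example "CMD(") is never captured as either.
JOB_RE = re.compile(r"^sbmjob +(\w+)(?: |$)", re.IGNORECASE)
JOBD_RE = re.compile(r"^sbmjob +\w+ +(\w+)(?: |$)", re.IGNORECASE)
# First word that is directly followed by an opening parenthesis.
KEYWORD_RE = re.compile(r"(\w+)\(", re.IGNORECASE)


def first_group(regex, text):
    """Return capture group 1 of the first match, or None when nothing matches."""
    match = regex.search(text)
    return match.group(1) if match else None


def parse(text):
    """Return (JOB, JOBD, FIRST_KEYWORD) for one SBMJOB command string."""
    return (
        first_group(JOB_RE, text),
        first_group(JOBD_RE, text),
        first_group(KEYWORD_RE, text),
    )


if __name__ == "__main__":
    for sample in ["SBMJOB JOB CMD(", "SBMJOB JOB JOBD CMD(", "SBMJOB JOB4 JOBD "]:
        print(repr(sample), parse(sample))
```

Splitting the work into three independent patterns, as the answer does, avoids relying on optional capture groups altogether, at the cost of scanning each string three times.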

1 Comment

The premise here is incorrect. OP should submit a ticket to IBM as DB2 for i uses International Components for Unicode to provide REGEXP support. See IBM documentation here: ibm.com/docs/en/i/…, and the ICU documentation here: unicode-org.github.io/icu/userguide/strings/… explicitly states support for negative lookahead.
