Recursive CTE how to concatenate the results onto a single line

Below are the articles in the CTE series (including this current one):

Definition of the context

Let’s suppose we have a table that reports the results of nightly interface processing, as in the following example:

INT	RESULT
0011
0012
0013
0014	W
0015
0016	W
0017
0018
0019
0020	E
0023	E
0024	W

The result of a processing can be without warnings (space), with warnings (“W”) or with errors (“E”).

Suppose we want to obtain a summary of the results on a single line, reporting the incorrect calculations at the beginning, then those with warnings and finally those without warnings, as in the following example:

ROW_NUM	INT_LIST
12	0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018 , 0019

Does all this require a program or can it be done elegantly with a sql command?

Query

Yes, with the following SQL command you get exactly the expected result:

with -- Interface result
     W1 (INT, RESULT) as (
        values ('0011', ' '), ('0012', ' '), ('0013', ' '), ('0014', 'W'),
               ('0015', ' '), ('0016', 'W'), ('0017', ' '), ('0018', ' '),
               ('0019', ' '), ('0020', 'E'), ('0023', 'E'), ('0024', 'W')
        ) -- select * from W1 order by 1;
     -- assign progressive numeration (first E errors, then W warnings,
     -- finally processing without reports
   , W2 (INTR, ROW_NUM, RESULT) as (
       select W1.INT concat result concat CAST(RPAD(' ', 399, ' ') AS CHAR(400)) INTR
            , row_number() over ( partition by 'COMMON_VALUE'
               order by case W1.result
                        when 'E' then 1
                        when 'W' then 2
                        when ' ' then 3
                        end ,
                       W1.INT) ROW_NUM
                      , W1.RESULT
          from W1 order by  1) -- select trim(INTR) INTR, ROW_NUM, RESULT from W2;
     -- recursive CTE (Common Table Expression)
   , W3 (INTR, ROW_NUM, INT_LIST)
     as (select BASE.INTR, BASE.ROW_NUM, trim(BASE.INTR)
           from W2 BASE where BASE.ROW_NUM = 1    union all
          select t1.INTR, t1.ROW_NUM, INT_LIST concat ', ' concat trim(t1.INTR) concat ' '
            from W3 t0, W2 t1 where t0.ROW_NUM + 1 = t1.ROW_NUM
         ) -- select trim(INTR) INTR, ROW_NUM, INT_LIST from W3;
select ROW_NUM, INT_LIST from W3 RQ
 where RQ.ROW_NUM = (select max(ROW_NUM) from W3);

This query which, starting from a table containing the results of interface processing (W1), produces as a result a single line containing the number of interfaces and the list of each individual interface, reporting those with E error first, then those with W warning and finally those without warnings.

Detailed Query Description

Let’s see in detail how the SQL instruction is composed

CTE W1 (Interface Result):

This CTE is a simulation of a table using the VALUES clause, which creates a temporary set of rows (INT, RESULT). Each row represents an interface identifier (INT) and its result (RESULT), where the result could be an error (‘E’), a warning (‘W’), or no issues (‘ ‘).

CTE W2 (Progressive Numeration):

This CTE adds a progressive number to each row from W1 using the ROW_NUMBER() window function. This numbering is partitioned by a constant value (which effectively means no partitioning since all rows will have the same ‘COMMON_VALUE’), and ordered first by the type of result (‘E’, then ‘W’, then ‘ ‘) and then by the interface identifier INT.

The SELECT statement concatenates the INT value, the RESULT, and a right-padded string of 399 spaces cast to CHAR(400) to ensure all INTR values have a uniform length.

This ordered numbering helps in later concatenation of the interface identifiers in the desired order.

CTE W3 (Recursive CTE for Concatenation):

This CTE is recursive and builds a single string (INT_LIST) containing all the interface identifiers, in the order specified by W2.

It starts with the base case where ROW_NUM is 1, selecting the first interface identifier.

The recursive part joins W3 to W2 on the condition that W3.ROW_NUM + 1 = W2.ROW_NUM, effectively adding the next interface identifier to the list (INT_LIST) for each subsequent row.

The recursive query continues until all rows are processed, creating a cumulative list of interface identifiers.

Final SELECT Statement:

The final SELECT statement retrieves the last row from W3, which contains the cumulative list of all interfaces ordered by the result type and INT. The ROW_NUM in this case represents the total count of interfaces.

It uses a subquery to find the max(ROW_NUM) to ensure only the final, complete list is selected.

In summary

In summary, the query is designed to create an ordered list of interface identifiers, prioritizing errors (‘E’), then warnings (‘W’), and lastly those with no issues (‘ ‘), and presents it as a single string alongside the total count of interfaces.

The use of the RPAD function and casting to CHAR(400) ensures that each identifier in the list is uniformly formatted, which is particularly important for the recursive concatenation process..

More in-depth look

To better understand how Recursive CTEs work, let’s try to take a deeper look at the individual results of intermediate CTEs.

CTE W2 (Progressive Numeration)

This CTE adds sequential numbering based on the result type (“E”, then “W”, then “”) and then by the interface identifier INT.

This ordered numbering helps in subsequent concatenation of interface identifiers in the desired order.

select trim(INTR) INTR, ROW_NUM, RESULT from W2;

INTR	ROW_NUM	RESULT
0020E	1	E
0023E	2	E
0014W	3	W
0016W	4	W
0024W	5	W
0011	6
0012	7
0013	8
0015	9
0017	10
0018	11
0019	12

CTE W3 (Recursive CTE for Concatenation)

This recursive CTE iteratively (recursively) creates a string (INT_LIST) containing all interface identifiers, in the order specified by W2.

The recursive query continues until all rows are processed, creating a cumulative list of interface identifiers.

select trim(INTR) INTR, ROW_NUM, INT_LIST from W3;

INTR	ROW_NUM	INT_LIST
0020E		0020E
0023E		0020E, 0023E
0014W		0020E, 0023E , 0014W
0016W		0020E, 0023E , 0014W , 0016W
0024W		0020E, 0023E , 0014W , 0016W , 0024W
0011		0020E, 0023E , 0014W , 0016W , 0024W , 0011
0012		0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012
0013		0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013
0015		0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015
0017		0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017
0018		0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018
0019		0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018 , 0019

As you can see, the result of the query contains all the rows of the various iterations and the one that interests us is the last one, i.e. that of the last recursion cycle, which is extracted from the next and final CTE.

Final SELECT Statement

This final CTE retrieves the last line from W3, which contains the cumulative list of all interfaces sorted by result type and INT. ROW_NUM in this case represents the total interface count

select ROW_NUM, INT_LIST
  from W3 RQ
 where RQ.ROW_NUM = (select max(ROW_NUM) from W3)

ROW_NUM	INT_LIST
12	0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018 , 0019

Conclusions

The SQL query shown uses Common Table Expression (CTE) to organize and process interface data, creating an ordered list of interface identifiers with a total count. By prioritizing errors first (‘E’), then warnings (‘W’), and finally unreported identifiers (‘ ‘), the end result is a single row containing the total number of interfaces and a linked list of each interface , facilitating the review and analysis of interface process results.

Please note, as always, that actual implementation and functionality may vary depending on the version of the iSeries operating system and DB2 database. It is always good practice to refer to the official IBM documentation for more precise and detailed information