Recursive CTE how to concatenate the results onto a single line
Below are the articles in the CTE series (including this current one):
Definition of the context
Let’s suppose we have a table that reports the results of nightly interface processing, as in the following example:
INT | RESULT |
0011 | |
0012 | |
0013 | |
0014 | W |
0015 | |
0016 | W |
0017 | |
0018 | |
0019 | |
0020 | E |
0023 | E |
0024 | W |
The result of a processing can be without warnings (space), with warnings (“W”) or with errors (“E”).
Suppose we want to obtain a summary of the results on a single line, reporting the incorrect calculations at the beginning, then those with warnings and finally those without warnings, as in the following example:
ROW_NUM | INT_LIST |
12 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018 , 0019 |
Does all this require a program or can it be done elegantly with a sql command?
Query
Yes, with the following SQL command you get exactly the expected result:
with -- Interface result W1 (INT, RESULT) as ( values ('0011', ' '), ('0012', ' '), ('0013', ' '), ('0014', 'W'), ('0015', ' '), ('0016', 'W'), ('0017', ' '), ('0018', ' '), ('0019', ' '), ('0020', 'E'), ('0023', 'E'), ('0024', 'W') ) -- select * from W1 order by 1; -- assign progressive numeration (first E errors, then W warnings, -- finally processing without reports , W2 (INTR, ROW_NUM, RESULT) as ( select W1.INT concat result concat CAST(RPAD(' ', 399, ' ') AS CHAR(400)) INTR , row_number() over ( partition by 'COMMON_VALUE' order by case W1.result when 'E' then 1 when 'W' then 2 when ' ' then 3 end , W1.INT) ROW_NUM , W1.RESULT from W1 order by 1) -- select trim(INTR) INTR, ROW_NUM, RESULT from W2; -- recursive CTE (Common Table Expression) , W3 (INTR, ROW_NUM, INT_LIST) as (select BASE.INTR, BASE.ROW_NUM, trim(BASE.INTR) from W2 BASE where BASE.ROW_NUM = 1 union all select t1.INTR, t1.ROW_NUM, INT_LIST concat ', ' concat trim(t1.INTR) concat ' ' from W3 t0, W2 t1 where t0.ROW_NUM + 1 = t1.ROW_NUM ) -- select trim(INTR) INTR, ROW_NUM, INT_LIST from W3; select ROW_NUM, INT_LIST from W3 RQ where RQ.ROW_NUM = (select max(ROW_NUM) from W3);
This query which, starting from a table containing the results of interface processing (W1), produces as a result a single line containing the number of interfaces and the list of each individual interface, reporting those with E error first, then those with W warning and finally those without warnings.
Detailed Query Description
Let’s see in detail how the SQL instruction is composed
- CTE W1 (Interface Result):
- This CTE is a simulation of a table using the VALUES clause, which creates a temporary set of rows (INT, RESULT). Each row represents an interface identifier (INT) and its result (RESULT), where the result could be an error (‘E’), a warning (‘W’), or no issues (‘ ‘).
- CTE W2 (Progressive Numeration):
- This CTE adds a progressive number to each row from W1 using the ROW_NUMBER() window function. This numbering is partitioned by a constant value (which effectively means no partitioning since all rows will have the same ‘COMMON_VALUE’), and ordered first by the type of result (‘E’, then ‘W’, then ‘ ‘) and then by the interface identifier INT.
- The SELECT statement concatenates the INT value, the RESULT, and a right-padded string of 399 spaces cast to CHAR(400) to ensure all INTR values have a uniform length.
- This ordered numbering helps in later concatenation of the interface identifiers in the desired order.
- CTE W3 (Recursive CTE for Concatenation):
- This CTE is recursive and builds a single string (INT_LIST) containing all the interface identifiers, in the order specified by W2.
- It starts with the base case where ROW_NUM is 1, selecting the first interface identifier.
- The recursive part joins W3 to W2 on the condition that W3.ROW_NUM + 1 = W2.ROW_NUM, effectively adding the next interface identifier to the list (INT_LIST) for each subsequent row.
- The recursive query continues until all rows are processed, creating a cumulative list of interface identifiers.
Final SELECT Statement:
- The final SELECT statement retrieves the last row from W3, which contains the cumulative list of all interfaces ordered by the result type and INT. The ROW_NUM in this case represents the total count of interfaces.
- It uses a subquery to find the max(ROW_NUM) to ensure only the final, complete list is selected.
In summary
In summary, the query is designed to create an ordered list of interface identifiers, prioritizing errors (‘E’), then warnings (‘W’), and lastly those with no issues (‘ ‘), and presents it as a single string alongside the total count of interfaces.
The use of the RPAD function and casting to CHAR(400) ensures that each identifier in the list is uniformly formatted, which is particularly important for the recursive concatenation process..
More in-depth look
To better understand how Recursive CTEs work, let’s try to take a deeper look at the individual results of intermediate CTEs.
CTE W2 (Progressive Numeration)
This CTE adds sequential numbering based on the result type (“E”, then “W”, then “”) and then by the interface identifier INT.
This ordered numbering helps in subsequent concatenation of interface identifiers in the desired order.
select trim(INTR) INTR, ROW_NUM, RESULT from W2;
INTR | ROW_NUM | RESULT |
0020E | 1 | E |
0023E | 2 | E |
0014W | 3 | W |
0016W | 4 | W |
0024W | 5 | W |
0011 | 6 | |
0012 | 7 | |
0013 | 8 | |
0015 | 9 | |
0017 | 10 | |
0018 | 11 | |
0019 | 12 |
CTE W3 (Recursive CTE for Concatenation)
This recursive CTE iteratively (recursively) creates a string (INT_LIST) containing all interface identifiers, in the order specified by W2.
The recursive query continues until all rows are processed, creating a cumulative list of interface identifiers.
select trim(INTR) INTR, ROW_NUM, INT_LIST from W3;
INTR | ROW_NUM | INT_LIST |
0020E | 0020E | |
0023E | 0020E, 0023E | |
0014W | 0020E, 0023E , 0014W | |
0016W | 0020E, 0023E , 0014W , 0016W | |
0024W | 0020E, 0023E , 0014W , 0016W , 0024W | |
0011 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 | |
0012 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 | |
0013 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 | |
0015 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 | |
0017 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 | |
0018 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018 | |
0019 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018 , 0019 |
As you can see, the result of the query contains all the rows of the various iterations and the one that interests us is the last one, i.e. that of the last recursion cycle, which is extracted from the next and final CTE.
Final SELECT Statement
This final CTE retrieves the last line from W3, which contains the cumulative list of all interfaces sorted by result type and INT. ROW_NUM in this case represents the total interface count
select ROW_NUM, INT_LIST from W3 RQ where RQ.ROW_NUM = (select max(ROW_NUM) from W3)
ROW_NUM | INT_LIST |
12 | 0020E, 0023E , 0014W , 0016W , 0024W , 0011 , 0012 , 0013 , 0015 , 0017 , 0018 , 0019 |
Conclusions
The SQL query shown uses Common Table Expression (CTE) to organize and process interface data, creating an ordered list of interface identifiers with a total count. By prioritizing errors first (‘E’), then warnings (‘W’), and finally unreported identifiers (‘ ‘), the end result is a single row containing the total number of interfaces and a linked list of each interface , facilitating the review and analysis of interface process results.
Please note, as always, that actual implementation and functionality may vary depending on the version of the iSeries operating system and DB2 database. It is always good practice to refer to the official IBM documentation for more precise and detailed information