iSeries Sql – CTE Common Table Expressions – definitions and usage
Common Table Expressions (CTE) in SQL on IBM iSeries (AS/400), as in other database management systems, are a powerful and flexible feature that allows you to create temporary, reusable queries within a larger SQL statement wide.
They are particularly useful for structuring and simplifying complex queries, allowing for greater readability and maintainability of SQL code.
Below are the articles in the CTE series (including this current one):
Technical Description of CTEs
- Definition: A CTE is essentially a temporary query that exists only for the duration of a specific SQL statement. It is defined at the beginning of a query and can then be used as a normal table within that same SQL statement.
- Syntax: A CTE is defined using the WITH clause, followed by the name of the CTE, an optional list of columns, and the query that populates the CTE. The general syntax is:
WITH NomeCTE (Colonna1, Colonna2, ...) AS ( SELECT ... FROM ... WHERE ... ) SELECT * FROM NomeCTE;
- Reuse: A CTE can be used multiple times within a single SQL statement. This makes it useful for breaking complex queries into smaller, more manageable pieces, to avoid code duplication, and to improve readability.
- Recursive Queries: CTEs provide the ability to write recursive queries, which are particularly useful for working with hierarchical data or performing iterative calculations. A recursive CTE self-references to create a series of repeating results until a termination condition is reached.
- Performance: While CTEs can improve the readability of code, their performance efficiency depends on the specific database implementation and the nature of the query. In some cases, they can be more efficient than equivalent subqueries, but not always.
- Compatibility: CTEs are supported by most modern database management systems, including IBM DB2 for iSeries, SQL Server, PostgreSQL, MySQL (recent versions), and others.
- Applications: CTEs are widely used to simplify complex queries, to execute recursive queries, to prepare data for aggregation, to organize multi-step queries, and to improve the overall structure of SQL queries.
Considerations for IBM iSeries
- On IBM iSeries, the availability and specific features of CTEs may depend on the version of DB2.
- It is important to check the specific DB2 on iSeries documentation to ensure that all CTE features are supported and to understand best practices for using them.
Let’s see some examples of use.
CTE Common Table Expression to assign numbering over “partition by”
Description Context and Objective
The query in question is designed to operate on a table that records transactions, identified by a transaction ID (TRANS_ID) and associated with a specific movement date (DATE_MOV).
While the original table may contain other relevant information, for the purpose of this query, the focus is solely on TRANS_ID and DATE_MOV.
The main objective is to generate a list in which each transaction is listed with a unique sequential number and its movement date.
This sequential number starts at 1 for each transaction date and increases by one for each subsequent transaction that occurs on the same date.
In other words, the query assigns a sequential number to the transactions, restarting the count from 1 for each new movement date, thus ensuring that each transaction receives a unique identifier within its specific day.
Query
WITH -- starting data W1 (TRANS_ID, DATE_MOV) AS ( VALUES ('T0011', '2024-01-10'), ('T0012', '2024-01-10'), ('T0013', '2024-01-10') , ('T0014', '2024-01-10'), ('T0015', '2024-01-10'), ('T0016', '2024-01-10') , ('T0017', '2024-01-10'), ('T0018', '2024-01-10'), ('T0019', '2024-01-10') , ('T0020', '2024-01-10'), ('T0023', '2024-01-10'), ('T0024', '2024-01-10') , ('T0025', '2024-01-10'), ('T0026', '2024-01-10'), ('T0127', '2024-01-10') , ('T0128', '2024-01-10'), ('T0129', '2024-01-10'), ('T0149', '2024-01-10') , ('T0150', '2024-01-11'), ('T0151', '2024-01-11'), ('T0152', '2024-01-11') , ('T0165', '2024-01-11'), ('T0166', '2024-01-11'), ('T0167', '2024-01-11') , ('T0168', '2024-01-11'), ('T0201', '2024-01-11')) -- assigns progressive numbering to the same DATA_MOV, ordering by TRANS_ID , W2 (TRANS_ID, ROW_NUM, DATE_MOV) as (select trim(W1.TRANS_ID) , row_number() over ( partition by DATE_MOV -- 'COMMON_VALUE' order by W1.TRANS_ID) ROW_NUM , W1.DATE_MOV from W1 order by W1.TRANS_ID) select * from W2;
Result
Original table (W1)
Result table (W2)
TRANS_ID DATE_MOV T0011 2024-01-10 T0012 2024-01-10 T0013 2024-01-10 T0014 2024-01-10 T0015 2024-01-10 T0016 2024-01-10 T0017 2024-01-10 T0018 2024-01-10 T0019 2024-01-10 T0020 2024-01-10 T0023 2024-01-10 T0024 2024-01-10 T0025 2024-01-10 T0026 2024-01-10 T0127 2024-01-10 T0128 2024-01-10 T0129 2024-01-10 T0149 2024-01-10 T0150 2024-01-11 T0151 2024-01-11 T0152 2024-01-11 T0165 2024-01-11 T0166 2024-01-11 T0167 2024-01-11 T0168 2024-01-11 T0201 2024-01-11
TRANS_ID ROW_NUM DATE_MOV T0011 1 2024-01-10 T0012 2 2024-01-10 T0013 3 2024-01-10 T0014 4 2024-01-10 T0015 5 2024-01-10 T0016 6 2024-01-10 T0017 7 2024-01-10 T0018 8 2024-01-10 T0019 9 2024-01-10 T0020 10 2024-01-10 T0023 11 2024-01-10 T0024 12 2024-01-10 T0025 13 2024-01-10 T0026 14 2024-01-10 T0127 15 2024-01-10 T0128 16 2024-01-10 T0129 17 2024-01-10 T0149 18 2024-01-10 T0150 1 2024-01-11 T0151 2 2024-01-11 T0152 3 2024-01-11 T0165 4 2024-01-11 T0166 5 2024-01-11 T0167 6 2024-01-11 T0168 7 2024-01-11 T0201 8 2024-01-11
Detailed description of the query
The SQL query uses Common Table Expressions to create a sample dataset and then assign a sequential number to each row within groups based on the date of movement (DATE_MOV), sorting by the transaction ID (TRANS_ID).
The query is divided into two main parts:
W1: Definition of Starting Data
The first CTE, W1, defines a starting dataset with two columns: TRANS_ID and DATE_MOV. Values are manually entered to simulate real data.
W2: Assignment of the Progressive Number
The second CTE, W2, is based on the data defined in W1. Use the ROW_NUMBER() function to assign a sequential number to each transaction (TRANS_ID) within each date (DATE_MOV). The ROW_NUMBER() function is partitioned by DATE_MOV to ensure that the numbering starts again at 1 for each new date. Additionally, the order is established by the TRANS_ID to determine the order of number assignment within each date group
Let’s analyze this part in more detail:
ROW_NUMBER() OVER (PARTITION BY DATE_MOV ORDER BY W1.TRANS_ID) AS ROW_NUM,
This expression is an example of a window function in SQL, specifically ROW_NUMBER(), that is used to assign a unique sequential number to each row within a query result set. Let’s examine each component of this expression:
- ROW_ ROW_NUMBER() is a window function that assigns a sequential number to each row in a result set. The number starts at 1 for the first row and increases by 1 for each subsequent row
- OVER (PARTITION BY DATE_MOV ORDER BY W1.TRANS_ID) The OVER clause specifies how the ROW_NUMBER() window function should be applied to the result set.
- PARTITION BY DATE_MOV: This divides the result set into partitions (or groups) based on the value of the DATE_MOV column. The ROW_NUMBER() function is then applied within each of these partitions separately. This means that the counting starts again from 1 for each new date (DATE_MOV). In other words, each date will have its own set of sequential numbers.
- ORDER BY W1.TRANS_ID: Within each partition (created based on DATE_MOV), the rows are sorted based on the value of TRANS_ID. The ROW_NUMBER() function assigns sequential numbers to these rows based on this order. So, for each date, the first transaction in the order will have the number 1, the second will have the number 2, and so on.
- AS ROW_NUM: Finally, AS ROW_NUM assigns a name to the column that contains the progressive number generated by the ROW_NUMBER() function. In this case, the column will be named ROW_NUM.
Please note that actual implementation and functionality may vary depending on the version of the iSeries operating system and DB2 database. It is always good practice to refer to the official IBM documentation for more precise and detailed information
CTE Common Table Expression to detect breaks in the number sequence
Description Context and Objective
The query in question is designed to operate on a table that records transactions, identified by a Movement Code (MovCode) and associated with a specific identifier (NumberID).
While the original table may contain other relevant information, for the purpose of this query, the focus is solely on MovCode and NumberID.
The main objective is to generate a list in which to identify and mark any breaks in the numerical sequence of identification numbers (NumberID) for each Movement Code (MovCode).
Query
WITH xx (MovCode, NumberID) AS (VALUES ('A1002','101') ,('A1002','102') ,('A1002','104') ,('A1002','105') ,('A1002','108') ,('A1002','109') ,('A1002','110') ,('B2002','201') ,('B2002','202') ,('B2002','203') ,('B2002','204') ,('B2002','210') ,('B2002','211') ,('B2002','214')) , CTE_CheckRegister (MovCode, NumberID, NextNumberID) AS ( SELECT MovCode, NumberID, Lead(NumberID, 1) OVER (PARTITION BY MovCode ORDER BY NumberID) NextNumberID FROM xx ) SELECT MovCode, CASE WHEN NextNumberID is NULL THEN NumberID WHEN (NumberID = NextNumberID - 1) THEN NumberID ELSE NumberID concat ' ***' END as NumberID FROM CTE_CheckRegister ORDER BY MovCode, NumberID;
Result
Original table ( XX )
Result table (CTE_CheckRegister)
MOVCODE NUMBERID A1002 101 A1002 102 A1002 104 A1002 105 A1002 108 A1002 109 A1002 110 B2002 201 B2002 202 B2002 203 B2002 204 B2002 210 B2002 211 B2002 214
MOVCODE NUMBERID A1002 101 A1002 102 *** A1002 104 A1002 105 *** A1002 108 A1002 109 A1002 110 B2002 201 B2002 202 B2002 203 B2002 204 *** B2002 210 B2002 211 *** B2002 214
Detailed description of the query
This SQL query uses CTE (Common Table Expressions) and the LEAD function to identify and mark any breaks in the numerical sequence of identification numbers (NumberID) for each Movement Code (MovCode).
Here is a detailed explanation of each part of the query:
- CTE Initial xx: This CTE defines a set of example values representing pairs of MovCode and NumberID. It is a temporary table used to demonstrate query logic.
- CTE CTE_CheckRegister:
- Use data from the CTE xx.
- Apply the LEAD function to get the next NumberID (NextNumberID) for each MovCode, sorting by NumberID.
- The LEAD function is a window function that provides access to a line a certain physical distance from the current line, in this case, the next line.
- Final Selection and CASE Logic:
- The final query selects MovCode and a modified version of NumberID.
- CASE logic is used to determine what to show for NumberID:
- If NextNumberID is NULL, this indicates that it is the latest identification number for that MovCode, so it simply shows NumberID.
- If NumberID is consecutive to NextNumberID (NumberID = NextNumberID – 1), show NumberID.
- Otherwise, if there is a break in the sequence (i.e., the NumberID is not followed by the consecutive number), appends ‘***’ to NumberID to signal the break.
- Sorting of Results:
- Results are sorted by MovCode and NumberID, ensuring that output is presented neatly and sequentially for each MovCode.
Please note that actual implementation and functionality may vary depending on the version of the iSeries operating system and DB2 database.
It is always good practice to refer to the official IBM documentation for more precise and detailed information