iSeries Sql – CTE Common Table Expressions – definitions and usage

Common Table Expressions (CTE) in SQL on IBM iSeries (AS/400), as in other database management systems, are a powerful and flexible feature that allows you to create temporary, reusable queries within a larger SQL statement wide.

They are particularly useful for structuring and simplifying complex queries, allowing for greater readability and maintainability of SQL code.

Below are the articles in the CTE series (including this current one):

Technical Description of CTEs

  1. Definition: A CTE is essentially a temporary query that exists only for the duration of a specific SQL statement. It is defined at the beginning of a query and can then be used as a normal table within that same SQL statement.
  2. Syntax: A CTE is defined using the WITH clause, followed by the name of the CTE, an optional list of columns, and the query that populates the CTE. The general syntax is:
    WITH NomeCTE (Colonna1, Colonna2, ...) AS ( SELECT ... FROM ... WHERE ... ) SELECT * FROM NomeCTE;
  3. Reuse: A CTE can be used multiple times within a single SQL statement. This makes it useful for breaking complex queries into smaller, more manageable pieces, to avoid code duplication, and to improve readability.
     
  4. Recursive Queries: CTEs provide the ability to write recursive queries, which are particularly useful for working with hierarchical data or performing iterative calculations. A recursive CTE self-references to create a series of repeating results until a termination condition is reached.
     
  5. Performance: While CTEs can improve the readability of code, their performance efficiency depends on the specific database implementation and the nature of the query. In some cases, they can be more efficient than equivalent subqueries, but not always.
     
  6. Compatibility: CTEs are supported by most modern database management systems, including IBM DB2 for iSeries, SQL Server, PostgreSQL, MySQL (recent versions), and others.
     
  7. Applications: CTEs are widely used to simplify complex queries, to execute recursive queries, to prepare data for aggregation, to organize multi-step queries, and to improve the overall structure of SQL queries.

Considerations for IBM iSeries

  • On IBM iSeries, the availability and specific features of CTEs may depend on the version of DB2.
     
  • It is important to check the specific DB2 on iSeries documentation to ensure that all CTE features are supported and to understand best practices for using them.

Let’s see some examples of use.

CTE Common Table Expression to assign numbering over “partition by”

Description Context and Objective

The query in question is designed to operate on a table that records transactions, identified by a transaction ID (TRANS_ID) and associated with a specific movement date (DATE_MOV).

While the original table may contain other relevant information, for the purpose of this query, the focus is solely on TRANS_ID and DATE_MOV.

The main objective is to generate a list in which each transaction is listed with a unique sequential number and its movement date.

This sequential number starts at 1 for each transaction date and increases by one for each subsequent transaction that occurs on the same date.

In other words, the query assigns a sequential number to the transactions, restarting the count from 1 for each new movement date, thus ensuring that each transaction receives a unique identifier within its specific day.

Query

WITH -- starting data
     W1 (TRANS_ID, DATE_MOV) AS (
         VALUES  ('T0011', '2024-01-10'), ('T0012', '2024-01-10'), ('T0013', '2024-01-10')
               , ('T0014', '2024-01-10'), ('T0015', '2024-01-10'), ('T0016', '2024-01-10')
               , ('T0017', '2024-01-10'), ('T0018', '2024-01-10'), ('T0019', '2024-01-10')
               , ('T0020', '2024-01-10'), ('T0023', '2024-01-10'), ('T0024', '2024-01-10')
               , ('T0025', '2024-01-10'), ('T0026', '2024-01-10'), ('T0127', '2024-01-10')
               , ('T0128', '2024-01-10'), ('T0129', '2024-01-10'), ('T0149', '2024-01-10')
               , ('T0150', '2024-01-11'), ('T0151', '2024-01-11'), ('T0152', '2024-01-11')
               , ('T0165', '2024-01-11'), ('T0166', '2024-01-11'), ('T0167', '2024-01-11')
               , ('T0168', '2024-01-11'), ('T0201', '2024-01-11'))
     --  assigns progressive numbering to the same DATA_MOV, ordering by TRANS_ID
   , W2 (TRANS_ID, ROW_NUM, DATE_MOV)
     as (select trim(W1.TRANS_ID)
              , row_number() over ( partition by DATE_MOV -- 'COMMON_VALUE'
                 order by W1.TRANS_ID) ROW_NUM
              , W1.DATE_MOV
          from W1 order by  W1.TRANS_ID)
select * from W2;

Result

Original table (W1)

Result table (W2)

TRANS_ID DATE_MOV
T0011    2024-01-10
T0012    2024-01-10
T0013    2024-01-10
T0014    2024-01-10
T0015    2024-01-10
T0016    2024-01-10
T0017    2024-01-10
T0018    2024-01-10
T0019    2024-01-10
T0020    2024-01-10
T0023    2024-01-10
T0024    2024-01-10
T0025    2024-01-10
T0026    2024-01-10
T0127    2024-01-10
T0128    2024-01-10
T0129    2024-01-10
T0149    2024-01-10
T0150    2024-01-11
T0151    2024-01-11
T0152    2024-01-11
T0165    2024-01-11
T0166    2024-01-11
T0167    2024-01-11
T0168    2024-01-11
T0201    2024-01-11
TRANS_ID ROW_NUM  DATE_MOV
T0011          1  2024-01-10
T0012          2  2024-01-10
T0013          3  2024-01-10
T0014          4  2024-01-10
T0015          5  2024-01-10
T0016          6  2024-01-10
T0017          7  2024-01-10
T0018          8  2024-01-10
T0019          9  2024-01-10
T0020         10  2024-01-10
T0023         11  2024-01-10
T0024         12  2024-01-10
T0025         13  2024-01-10
T0026         14  2024-01-10
T0127         15  2024-01-10
T0128         16  2024-01-10
T0129         17  2024-01-10
T0149         18  2024-01-10
T0150          1  2024-01-11
T0151          2  2024-01-11
T0152          3  2024-01-11
T0165          4  2024-01-11
T0166          5  2024-01-11
T0167          6  2024-01-11
T0168          7  2024-01-11
T0201          8  2024-01-11

Detailed description of the query

The SQL query uses Common Table Expressions to create a sample dataset and then assign a sequential number to each row within groups based on the date of movement (DATE_MOV), sorting by the transaction ID (TRANS_ID).

The query is divided into two main parts:

W1: Definition of Starting Data

The first CTE, W1, defines a starting dataset with two columns: TRANS_ID and DATE_MOV. Values ​​are manually entered to simulate real data.

W2: Assignment of the Progressive Number

The second CTE, W2, is based on the data defined in W1. Use the ROW_NUMBER() function to assign a sequential number to each transaction (TRANS_ID) within each date (DATE_MOV). The ROW_NUMBER() function is partitioned by DATE_MOV to ensure that the numbering starts again at 1 for each new date. Additionally, the order is established by the TRANS_ID to determine the order of number assignment within each date group

Let’s analyze this part in more detail:

ROW_NUMBER() OVER (PARTITION BY DATE_MOV ORDER BY W1.TRANS_ID) AS ROW_NUM,

This expression is an example of a window function in SQL, specifically ROW_NUMBER(), that is used to assign a unique sequential number to each row within a query result set. Let’s examine each component of this expression:

  • ROW_ ROW_NUMBER() is a window function that assigns a sequential number to each row in a result set. The number starts at 1 for the first row and increases by 1 for each subsequent row
  • OVER (PARTITION BY DATE_MOV ORDER BY W1.TRANS_ID) The OVER clause specifies how the ROW_NUMBER() window function should be applied to the result set.
  • PARTITION BY DATE_MOV: This divides the result set into partitions (or groups) based on the value of the DATE_MOV column. The ROW_NUMBER() function is then applied within each of these partitions separately. This means that the counting starts again from 1 for each new date (DATE_MOV). In other words, each date will have its own set of sequential numbers.
  • ORDER BY W1.TRANS_ID: Within each partition (created based on DATE_MOV), the rows are sorted based on the value of TRANS_ID. The ROW_NUMBER() function assigns sequential numbers to these rows based on this order. So, for each date, the first transaction in the order will have the number 1, the second will have the number 2, and so on.
  • AS ROW_NUM: Finally, AS ROW_NUM assigns a name to the column that contains the progressive number generated by the ROW_NUMBER() function. In this case, the column will be named ROW_NUM.

Please note that actual implementation and functionality may vary depending on the version of the iSeries operating system and DB2 database. It is always good practice to refer to the official IBM documentation for more precise and detailed information


CTE Common Table Expression to detect breaks in the number sequence

Description Context and Objective

The query in question is designed to operate on a table that records transactions, identified by a Movement Code (MovCode) and associated with a specific identifier (NumberID).

While the original table may contain other relevant information, for the purpose of this query, the focus is solely on MovCode and NumberID.

The main objective is to generate a list in which to identify and mark any breaks in the numerical sequence of identification numbers (NumberID) for each Movement Code (MovCode).

Query

WITH xx (MovCode, NumberID)
        AS (VALUES ('A1002','101')
                  ,('A1002','102')
                  ,('A1002','104')
                  ,('A1002','105')
                  ,('A1002','108')
                  ,('A1002','109')
                  ,('A1002','110')
                  ,('B2002','201')
                  ,('B2002','202')
                  ,('B2002','203')
                  ,('B2002','204')
                  ,('B2002','210')
                  ,('B2002','211')
                  ,('B2002','214'))
   , CTE_CheckRegister (MovCode, NumberID, NextNumberID)
        AS (
              SELECT MovCode,
                     NumberID,
                     Lead(NumberID, 1) OVER (PARTITION BY MovCode ORDER BY NumberID) NextNumberID
              FROM xx
           )
SELECT   MovCode,
         CASE
            WHEN NextNumberID is NULL THEN NumberID
            WHEN (NumberID = NextNumberID - 1) THEN NumberID
            ELSE NumberID concat ' ***'
         END as NumberID
FROM     CTE_CheckRegister
ORDER BY MovCode, NumberID;

Result

Original table ( XX )

Result table (CTE_CheckRegister)

MOVCODE	NUMBERID
A1002	101
A1002	102
A1002	104
A1002	105
A1002	108
A1002	109
A1002	110
B2002	201
B2002	202
B2002	203
B2002	204
B2002	210
B2002	211
B2002	214
MOVCODE	NUMBERID
A1002	101
A1002	102 ***
A1002	104
A1002	105 ***
A1002	108
A1002	109
A1002	110
B2002	201
B2002	202
B2002	203
B2002	204 ***
B2002	210
B2002	211 ***
B2002	214

Detailed description of the query

This SQL query uses CTE (Common Table Expressions) and the LEAD function to identify and mark any breaks in the numerical sequence of identification numbers (NumberID) for each Movement Code (MovCode).

Here is a detailed explanation of each part of the query:

  • CTE Initial xx: This CTE defines a set of example values ​​representing pairs of MovCode and NumberID. It is a temporary table used to demonstrate query logic.
  • CTE CTE_CheckRegister:
    • Use data from the CTE xx.
    • Apply the LEAD function to get the next NumberID (NextNumberID) for each MovCode, sorting by NumberID.
    • The LEAD function is a window function that provides access to a line a certain physical distance from the current line, in this case, the next line.
  • Final Selection and CASE Logic:
    • The final query selects MovCode and a modified version of NumberID.
    • CASE logic is used to determine what to show for NumberID:
      • If NextNumberID is NULL, this indicates that it is the latest identification number for that MovCode, so it simply shows NumberID.
      • If NumberID is consecutive to NextNumberID (NumberID = NextNumberID – 1), show NumberID.
      • Otherwise, if there is a break in the sequence (i.e., the NumberID is not followed by the consecutive number), appends ‘***’ to NumberID to signal the break.
  • Sorting of Results:
    • Results are sorted by MovCode and NumberID, ensuring that output is presented neatly and sequentially for each MovCode.

Please note that actual implementation and functionality may vary depending on the version of the iSeries operating system and DB2 database.

It is always good practice to refer to the official IBM documentation for more precise and detailed information

Similar Posts