RCTEs for Recursive Modify | SQLite SQL Tutorial Link Search Menu Expand Document

Recursive CTEs enable the implementation of the following task. Suppose we have a set of file system directory paths and a set of COPY operations defined as the original and new paths. We need an SQL query to perform this transformation. The RCTE feature is the only way to implement this transformation in SQL/SQLite because of potential successive modifications. The snippet below shows only the core code (the LOOP_COPY block) without further processing. Note that straightforward use of the replace routine would be incorrect because of possible matches in the middle of the path.

WITH RECURSIVE
    ops(opid, rootpath_old, rootpath_new) AS (
        VALUES
            (1, 'doc/',            'docABC'                 ),
            (2, 'docABC/thesis/',  'docABC/master'          ),
            (3, 'docABC/app/job/', 'docABC/app/academic_job'),
            (4, 'code/',           'prog'                   )
    ),
    folders(path_old) AS (
        VALUES
            ('doc/thesis/exp'),
            ('doc/thesis/theory'),
            ('doc/app/job/lor'),
            ('code/scripts/py'),
            ('code/scripts/bas')
    ),
    LOOP_COPY AS (
            -- Initial SELECT --
            SELECT 0 AS opid, path_old AS path_new
            FROM folders
        UNION ALL -- LOOP body starts here --
            -- Place input rows in the processing queue for the next loop
            SELECT ops.opid, path_new
            FROM LOOP_COPY AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
        UNION ALL
            -- Append the processing queue with new paths generated by the current operation        
            SELECT ops.opid,
                   rootpath_new || substr(path_new, length(rootpath_old)) AS path_new
            FROM LOOP_COPY AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
              AND BUFFER.path_new like rootpath_old || '%'            
    )
SELECT * FROM LOOP_COPY
WHERE opid = (SELECT max(ops.opid) FROM ops);

Note that this task is still not particularly well suited for SQL. The initial SELECT places all paths in the processing queue for the first loop cycle. Because the COPY operation does not delete any folders, the entire input must be passed along between loop cycles, meaning that the initial set and all previously created paths are duplicated during each loop cycle. For this reason, the entire row set corresponding to the processing of the last operation is the final result, hence the last line in the query.

Even though the RCTE loop body processes one row at a time, when the processing queue acts as FIFO (the default behavior), it might be helpful to treat the RCTE loop as if it processed the entire row set produced by the preceding cycle. When the processing queue acts as FIFO, this treatment is appropriate, as illustrated by the two tables above (compare the output of Loop Cycle #1 shown in the first table with the column Cycle #3 from the second table). The query below shows an equivalent implementation of the RCTE block (only valid for the given input), which unravels the above LOOP_COPY RCTE.

WITH RECURSIVE
    ops(opid, rootpath_old, rootpath_new) AS (
        VALUES
            (1, 'doc/',            'docABC'                 ),
            (2, 'docABC/thesis/',  'docABC/master'          ),
            (3, 'docABC/app/job/', 'docABC/app/academic_job'),
            (4, 'code/',           'prog'                   )
    ),
    folders(path_old) AS (
        VALUES
            ('doc/thesis/exp'),
            ('doc/thesis/theory'),
            ('doc/app/job/lor'),
            ('code/scripts/py'),
            ('code/scripts/bas')
    ),
    LOOP_COPY_INIT AS (
            SELECT 0 AS opid, path_old AS path_new
            FROM folders
    ),
    LOOP_COPY_STEP_1 AS (
            SELECT ops.opid, path_new
            FROM LOOP_COPY_INIT AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
        UNION ALL
            SELECT ops.opid,
                   rootpath_new || substr(path_new, length(rootpath_old)) AS path_new
            FROM LOOP_COPY_INIT AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
              AND BUFFER.path_new like rootpath_old || '%'            
    ),
    LOOP_COPY_STEP_2 AS (
            SELECT ops.opid, path_new
            FROM LOOP_COPY_STEP_1 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
        UNION ALL
            SELECT ops.opid,
                   rootpath_new || substr(path_new, length(rootpath_old)) AS path_new
            FROM LOOP_COPY_STEP_1 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
              AND BUFFER.path_new like rootpath_old || '%'            
    ),
    LOOP_COPY_STEP_3 AS (
            SELECT ops.opid, path_new
            FROM LOOP_COPY_STEP_2 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
        UNION ALL
            SELECT ops.opid,
                   rootpath_new || substr(path_new, length(rootpath_old)) AS path_new
            FROM LOOP_COPY_STEP_2 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
              AND BUFFER.path_new like rootpath_old || '%'            
    ),
    LOOP_COPY_STEP_4 AS (
            SELECT ops.opid, path_new
            FROM LOOP_COPY_STEP_3 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
        UNION ALL
            SELECT ops.opid,
                   rootpath_new || substr(path_new, length(rootpath_old)) AS path_new
            FROM LOOP_COPY_STEP_3 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
              AND BUFFER.path_new like rootpath_old || '%'            
    ),
    LOOP_COPY_STEP_5_STOP AS (
            SELECT ops.opid, path_new
            FROM LOOP_COPY_STEP_4 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
        UNION ALL
            SELECT ops.opid,
                   rootpath_new || substr(path_new, length(rootpath_old)) AS path_new
            FROM LOOP_COPY_STEP_4 AS BUFFER, ops
            WHERE ops.opid = BUFFER.opid + 1
              AND BUFFER.path_new like rootpath_old || '%'            
    ) 
-- SELECT * FROM LOOP_COPY_INIT;
-- SELECT * FROM LOOP_COPY_STEP_1;
-- SELECT * FROM LOOP_COPY_STEP_2;
-- SELECT * FROM LOOP_COPY_STEP_3;
-- SELECT * FROM LOOP_COPY_STEP_4;
-- SELECT * FROM LOOP_COPY_STEP_5_STOP;
SELECT * FROM LOOP_COPY_INIT;