Class MergeNode

  • All Implemented Interfaces:
    Visitable

    public final class MergeNode
    extends DMLModStatementNode

    A MergeNode represents a MERGE statement. The statement looks like this...

     MERGE INTO targetTable
     USING sourceTable
     ON searchCondition
     matchingClause1 ... matchingClauseN
     

    ...where each matching clause looks like this...

     WHEN MATCHED [ AND matchingRefinement ] THEN DELETE
     

    ...or

     WHEN MATCHED [ AND matchingRefinement ] THEN UPDATE SET col1 = expr1, ... colM = exprM
     

    ...or

     WHEN NOT MATCHED [ AND matchingRefinement ] THEN INSERT columnList VALUES valueList
     

    The Derby compiler essentially rewrites this statement into a driving left join followed by a series of DELETE/UPDATE/INSERT actions. The left join looks like this:

     SELECT selectList FROM sourceTable LEFT OUTER JOIN targetTable ON searchCondition
     

    The selectList of the driving left join consists of the following:

    • All of the columns mentioned in the searchCondition.
    • All of the columns mentioned in the matchingRefinement clauses.
    • All of the columns mentioned in the SET clauses and the INSERT columnLists and valueLists.
    • All additional columns needed for the triggers and foreign keys fired by the DeleteResultSets and UpdateResultSets constructed for the WHEN MATCHED clauses.
    • All additional columns needed to build index rows and evaluate generated columns needed by the UpdateResultSets constructed for the WHEN MATCHED...THEN UPDATE clauses.
    • A trailing targetTable.RowLocation column.

    The driving left join's selectList then looks like this...

     sc1, ..., scN, tc1, ..., tcM, targetTable.RowLocation
     

    Where sc1...scN are the columns we need from the source table (in alphabetical order) and tc1...tcM are the columns we need from the target table (in alphabetical order).

    The matchingRefinement expressions are bound and generated against the FromList of the driving left join. Dummy DeleteNode, UpdateNode, and InsertNode statements are independently constructed in order to bind and generate the DELETE/UPDATE/INSERT actions.

    At execution time, the targetTable.RowLocation column is used to determine whether a given driving row matches. The row matches iff targetTable.RowLocation is not null. The driving row is then assigned to the first DELETE/UPDATE/INSERT action to which it applies. The relevant columns from the driving row are extracted and buffered in a temporary table (the "then" rows) specific to that DELETE/UPDATE/INSERT action. After the driving left join has been processed, the DELETE/UPDATE/INSERT actions are run in order, each taking its corresponding temporary table as its source ResultSet.

    Name resolution was a particularly thorny problem. This is because name resolution behaves differently for SELECTs and UPDATEs. In particular, while processing UPDATEs, the compiler throws away name resolution information; this happens as a consequence of work done on DERBY-1043. In the end, I had to invent more name resolution machinery in order to compensate for the differences in the handling of SELECTs and UPDATEs. If we are to allow subqueries in matching refinement clauses and in the values expressions of INSERT and UPDATE actions, then we probably need to remove this special name resolution machinery. And that, in turn, probably means revisiting DERBY-1043.

    The special name resolution machinery involves marking source and target column references in order to make it clear which table they belong to. This is done in associateColumn(). The markers are consulted at code-generation time in order to resolve column references when we generate the expressions needed to populate the rows which go into the temporary tables. That resolution happens in MatchingClauseNode.getSelectListOffset().