This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Commutativity Operand Sorting Pattern Bug Fixes
Needs RevisionPublic

Authored by amandatang on Aug 9 2023, 10:52 AM.

Details

Summary

This patch addresses some bugs in the commutative operand sorting logic, adds some tests and performs some cleanup.

Bug 1:
The first bug occurs when comparing key sizes when at least one ancestorQueue is empty. If one ancestorQueue is not empty, its traversal isn't complete and the size comparison may be incorrect.

In the while loop of the comparison method, before comparing the entries at the current key index, if one of the ancestor queues is empty, then the sizes of the ancestor keys are compared.

Consider operandA and operandB with the following keys and ancestorQueues:

operand A key = {
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: BLOCK_ARGUMENT, opName: ""}, 
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand A ancestorQueue = empty

operand B key = {
    {type: NON_CONSTANT_OP, opName: "arith.divsi"}
}

operand B ancestorQueue = {block_argument, block_argument}

Operand A has a fully generated key and empty ancestor queue while Operand B is not fully generated. When they are compared, operand A should be smaller because at keyIndex 0, "arith.addi" is smaller than "arith.divsi". However, Operand A's ancestor queue is empty, so the following if statement is executed:

if (commOperandA->ancestorQueue.empty() || commOperandB->ancestorQueue.empty())
  return commOperandA->key.size() < commOperandB->key.size();

The comparison incorrectly returns false since operand A's key size is 3 and operand B's key size is 1.

To fix this, the entries at the current index must be checked and compared first if possible. The ancestor key sizes must only be compared if the current index exceeds one of the sizes and both ancestor keys are fully generated.

Bug 2:
The second bug lies in the visitedAncestors set skipping previously seen Operation pointers. The visitedAncestors DenseSet that keeps track of ancestors that were already visited attempts to keep track of and prevent checks on duplicate operands. The same operation may be used as multiple different operands and should not be treated as a duplicate.

Consider the following example:

%0 = arith.addi <block argument> <block argument>
%1 = arith.subi <block argument> <block argument>
%2 = arith.divsi <block argument> <block argument>
%3 = arith.muli %0, %0
%4 = arith.muli %1, %0
%5 = arith.muli %0, %2
operand A = arith.divsi %3, %4
operand B = arith.divsi %5, %4

The expected ancestor keys of operands A and B are:

operand A expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, // duplicate appearance of %0
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},  // duplicate appearance of %0
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand B expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},  // duplicate appearance of %0
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

The actual ancestor keys of operands A and B are:

operand A actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand B actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

When comparing the expected keys, operand A is smaller than operand B because at keyIndex 4, "arith.addi" is smaller than "arith.divsi". However, when comparing the actual keys, operand B ends up being smaller than operand A because at keyIndex 4, “arith.divsi” is smaller than “arith.subi”.

To fix this, visitedAncestors and its corresponding checks are removed completely.

Bug 3:
The third bug has to do with block arguments not being added to the ancestor queue and key.

The ancestorQueue is a queue of Operation pointers. Currently, ancestors are only added to the queue if they can be successfully cast to Operation pointers, which means Block Arguments are skipped. The key is formed by taking from the front of the ancestorQueue each time, so it also does not contain Block Arguments. This is fine for cases where Block Arguments are on the lowest level of the keys (at the end of the key). However, if Block Arguments appear at other levels, it can cause incorrect comparisons.

Consider the operandA and operandB in the following example:

%0 = arith.addi <block argument> <block argument>
%1 = arith.subi <block argument> <block argument>
operandA = arith.divsi <block argument>, %1
operandB = arith.divsi %0, %1
commutativeOp = commutative %operandA, %operandB

The expected ancestor keys of operandA and operandB are:

operandA expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: NON_CONSTANT_OP, opName: "arith.subi"},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operandB expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},
  {type: NON_CONSTANT_OP, opName: "arith.subi"},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

The actual ancestor keys of operandA and operandB are:

operandA actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}
}

operandB actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},
  {type: NON_CONSTANT_OP, opName: "arith.subi"}
}

When comparing the expected keys, operandA is smaller than operandB because at keyIndex 1, the block argument is smaller than "arith.addi". However, when comparing the actual keys, operand B ends up being smaller than operand A because at keyIndex 1, “arith.addi” is smaller than “arith.subi”.

To fix this, Block Arguments are added to the ancestorQueue in the form of nullptr. The nullcheck for the Operation pointer cast is removed. The AncestorKey constructor which takes an Operation pointer already treats nullptr as Block Arguments, so nothing needs to be changed there.

Bug 4:
The fourth bug is caused by an assertion failure on the comparison function that is passed into stable_sort. stable_sort expects a strictly less than comparison function, which means it should return false for equivalent values.

Note that equivalent operands have the same key size and key values at each index. When the algorithm reaches a keyIndex that exceeds the operands' key sizes, it checks commOperandA first and separately from commOperandB. If the current keyIndex exceeds commOperandA's key size and commOperandA's ancestorQueue is empty, then the comparison function returns true, implying commOperandA is smaller regardless of commOperandB's key.

To fix this, before returning true if commOperandA's ancestorQueue is empty, commOperandB's key size and ancestorQueue is checked for equivalence.

Improvement:
The OpTraitRewritePattern which is a wrapper for the RewritePattern class can be used to perform matchAndRewrite on Ops with a specified trait.

Cleanup and Tests:
Example 2 in the comments is corrected. The key associated with operand %2 is missing a block argument entry at the end. Additionally, the final sorted operand ordering should be %4, %2, %3, %1. Both examples in the comments are added as tests.

Diff Detail

Event Timeline

amandatang created this revision.Aug 9 2023, 10:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 9 2023, 10:52 AM
amandatang requested review of this revision.Aug 9 2023, 10:52 AM
amandatang edited the summary of this revision. (Show Details)Aug 9 2023, 11:00 AM
amandatang added reviewers: jpienaar, okwank.
amandatang added a reviewer: srishti-pm.
amandatang updated this revision to Diff 548754.Aug 9 2023, 1:09 PM
amandatang edited the summary of this revision. (Show Details)

run clang-format

Can you make the summary precise? It's a bit long-winded and the bugs aren't clear from the summary. If you can explain a bug in one line or with a small example, it will be great. You don't need to explain the older functionality in detail.

amandatang added a comment.EditedAug 9 2023, 6:44 PM

Can you make the summary precise? It's a bit long-winded and the bugs aren't clear from the summary. If you can explain a bug in one line or with a small example, it will be great. You don't need to explain the older functionality in detail.

Here are specific examples for each bug that do not behave as intended.

Bug 1:
The same commutative operand may be compared multiple times to reach a properly sorted state. Therefore, operand ancestor keys can be at different stages of the BFS traversal.
Consider operandA and operandB with the following keys and ancestorQueues:

operand A key = {
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: BLOCK_ARGUMENT, opName: ""}, 
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand A ancestorQueue = empty

operand B key = {
    {type: NON_CONSTANT_OP, opName: "arith.divsi"}
}

operand B ancestorQueue = {block_argument, block_argument}

Operand A has a fully generated key and empty ancestor queue while Operand B is not fully generated. When they are compared, operand A should be smaller because at keyIndex 0, "arith.addi" is smaller than "arith.divsi". However, Operand A's ancestor queue is empty, so the following if statement is executed:

if (commOperandA->ancestorQueue.empty() || commOperandB->ancestorQueue.empty())
  return commOperandA->key.size() < commOperandB->key.size();

The comparison incorrectly returns false since operand A's key size is 3 and operand B's key size is 1.

Bug 2:
The visitedAncestors DenseSet can exclude different operands with the same Operation *. Consider the following example:

%0 = arith.addi <block argument> <block argument>
%1 = arith.subi <block argument> <block argument>
%2 = arith.divsi <block argument> <block argument>
%3 = arith.muli %0, %0
%4 = arith.muli %1, %0
%5 = arith.muli %0, %2
operand A = arith.divsi %3, %4
operand B = arith.divsi %5, %4

The expected ancestor keys of operands A and B are:

operand A expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, // duplicate appearance of %0
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},  // duplicate appearance of %0
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand B expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},  // duplicate appearance of %0
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

The actual ancestor keys of operands A and B are:

operand A actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand B actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

When comparing the expected keys, operand A is smaller than operand B because at keyIndex 4, "arith.addi" is smaller than "arith.divsi". However, when comparing the actual keys, operand B ends up being smaller than operand A because at keyIndex 4, “arith.divsi” is smaller than “arith.subi”.

Bug 3:
Block Arguments should be added to the ancestorQueue as nullptr so they can be correctly added to the key. Consider the operandA and operandB in the following example:

%0 = arith.addi <block argument> <block argument>
%1 = arith.subi <block argument> <block argument>
operandA = arith.divsi <block argument>, %1
operandB = arith.divsi %0, %1
commutativeOp = commutative %operandA, %operandB

The expected ancestor keys of operandA and operandB are:

operandA expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: NON_CONSTANT_OP, opName: "arith.subi"},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operandB expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},
  {type: NON_CONSTANT_OP, opName: "arith.subi"},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

The actual ancestor keys of operandA and operandB are:

operandA actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}
}

operandB actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},
  {type: NON_CONSTANT_OP, opName: "arith.subi"}
}

When comparing the expected keys, operandA is smaller than operandB because at keyIndex 1, the block argument is smaller than "arith.addi". However, when comparing the actual keys, operand B ends up being smaller than operand A because at keyIndex 1, “arith.addi” is smaller than “arith.subi”.

Update comments

amandatang added a comment.EditedAug 14 2023, 3:48 PM

@srishti-pm Let me know if these two examples make sense

Can you make the summary precise? It's a bit long-winded and the bugs aren't clear from the summary. If you can explain a bug in one line or with a small example, it will be great. You don't need to explain the older functionality in detail.

Bug 1:
The same commutative operand may be compared multiple times to reach a properly sorted state. Therefore, operand ancestor keys can be at different stages of the BFS traversal.
Consider operandA and operandB with the following keys and ancestorQueues:

operand A key = {
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: BLOCK_ARGUMENT, opName: ""}, 
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand A ancestorQueue = empty

operand B key = {
    {type: NON_CONSTANT_OP, opName: "arith.divsi"}
}

operand B ancestorQueue = {block_argument, block_argument}

Operand A has a fully generated key and empty ancestor queue while Operand B is not fully generated. When they are compared, operand A should be smaller because at keyIndex 0, "arith.addi" is smaller than "arith.divsi". However, Operand A's ancestor queue is empty, so the following if statement is executed:

if (commOperandA->ancestorQueue.empty() || commOperandB->ancestorQueue.empty())
  return commOperandA->key.size() < commOperandB->key.size();

The comparison incorrectly returns false since operand A's key size is 3 and operand B's key size is 1.

Bug 2:
The visitedAncestors DenseSet can exclude different operands with the same Operation *. Consider the following example:

%0 = arith.addi <block argument> <block argument>
%1 = arith.subi <block argument> <block argument>
%2 = arith.divsi <block argument> <block argument>
%3 = arith.muli %0, %0
%4 = arith.muli %1, %0
%5 = arith.muli %0, %2
operand A = arith.divsi %3, %4
operand B = arith.divsi %5, %4

The expected ancestor keys of operands A and B are:

operand A expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, // duplicate appearance of %0
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},  // duplicate appearance of %0
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand B expected key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"},  // duplicate appearance of %0
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

The actual ancestor keys of operands A and B are:

operand A actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

operand B actual key = {
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.muli"}, 
  {type: NON_CONSTANT_OP, opName: "arith.addi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.divsi"}, 
  {type: NON_CONSTANT_OP, opName: "arith.subi"}, 
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""},
  {type: BLOCK_ARGUMENT, opName: ""}
}

When comparing the expected keys, operand A is smaller than operand B because at keyIndex 4, "arith.addi" is smaller than "arith.divsi". However, when comparing the actual keys, operand B ends up being smaller than operand A because at keyIndex 4, “arith.divsi” is smaller than “arith.subi”.

Use strict weak ordering in the operand comparison method

Additional tests

Rename test

amandatang edited the summary of this revision. (Show Details)Aug 21 2023, 11:02 AM
amandatang edited the summary of this revision. (Show Details)Aug 21 2023, 12:10 PM
amandatang retitled this revision from [mlir] Commutativity sorting util bug fix to [mlir] Commutativity Operand Sorting Pattern Bug Fixes.Aug 21 2023, 12:35 PM
srishti-pm requested changes to this revision.Aug 22 2023, 2:31 PM

Thank you for working on this. There are a few issues here. Listing them below:-

  1. There are 7 new tests added in this patch. 5 of them work correctly without any bug fixes from your patch. Thus, all 5 of them should be removed/modified because they do not reflect the changes done in this patch. You should only add those tests that reflect the functional changes done in this patch. Upon looking closer into them, it seemed that some of them were directly pointing to a specific bug fix (for example @check_commutative_non_lowest_level_block_argument) but those tests were already working correctly before the fix.
  1. There are 4 bug fixes being offered by this patch. Each of them warrants a separate commit of its own for better readability and ease of review and because they are all orthogonal to each other but fix the same utility. Please create one commit (and thus one patch) for each of the 4 fixes. The first fix can be done here and thus only 3 more commits are needed to be created for the 3 other fixes. Each commit will showcase test(s) that illustrate that specific bug fix. Further, since you are also doing cleanup, feel free to perform cleanup in any subset of these 4 commits.
  1. Regarding bug 1: No need to explain what an ancestor key entry is and how it is compared. This detail is a black box with respect to this bug and is thus an unnecessary detail to add here. Thus, remove these lines: Each ancestor key entry consists of a type (block_argument, non_constant_op and constant_op in increasing order) and an op name if it is an op (empty for block arguments). An ancestor key entry is smaller if its type is smaller or if it has a smaller op name lexicographically.
  1. Regarding bug 2: You mention: To fix this, visitedAncestors and its corresponding checks are removed completely. I think some explanation is needed on how the utility works with them removed and whether you have done something else to potentially do what they were doing, but now, correctly. A BFS requires some way to keep track of the visited nodes of the tree.
mlir/test/Transforms/test-commutativity-utils.mlir
3–16

Adding this test is unnecessary because it was working before any bug fixes.

17–38

Likewise for this test case as well.

154–173

Likewise.

175–200

Likewise.

280–305

Likewise.

This revision now requires changes to proceed.Aug 22 2023, 2:31 PM

@srishti-pm Let me know if these two examples make sense

They make sense, yes. You can add them in the revision and commit summaries.

amandatang edited the summary of this revision. (Show Details)Aug 22 2023, 2:45 PM
amandatang added a comment.EditedAug 22 2023, 4:11 PM

Thank you for working on this. There are a few issues here. Listing them below:-

  1. There are 7 new tests added in this patch. 5 of them work correctly without any bug fixes from your patch. Thus, all 5 of them should be removed/modified because they do not reflect the changes done in this patch. You should only add those tests that reflect the functional changes done in this patch. Upon looking closer into them, it seemed that some of them were directly pointing to a specific bug fix (for example @check_commutative_non_lowest_level_block_argument) but those tests were already working correctly before the fix.
  1. There are 4 bug fixes being offered by this patch. Each of them warrants a separate commit of its own for better readability and ease of review and because they are all orthogonal to each other but fix the same utility. Please create one commit (and thus one patch) for each of the 4 fixes. The first fix can be done here and thus only 3 more commits are needed to be created for the 3 other fixes. Each commit will showcase test(s) that illustrate that specific bug fix. Further, since you are also doing cleanup, feel free to perform cleanup in any subset of these 4 commits.
  1. Regarding bug 1: No need to explain what an ancestor key entry is and how it is compared. This detail is a black box with respect to this bug and is thus an unnecessary detail to add here. Thus, remove these lines: Each ancestor key entry consists of a type (block_argument, non_constant_op and constant_op in increasing order) and an op name if it is an op (empty for block arguments). An ancestor key entry is smaller if its type is smaller or if it has a smaller op name lexicographically.
  1. Regarding bug 2: You mention: To fix this, visitedAncestors and its corresponding checks are removed completely. I think some explanation is needed on how the utility works with them removed and whether you have done something else to potentially do what they were doing, but now, correctly. A BFS requires some way to keep track of the visited nodes of the tree.

Thank you for bringing up these points.

  1. example1_test, example2_test and check_commutative_small_similar_ancestor_tree are the only tests that worked prior to the fixes in this patch. Some of the other tests also required more than one of the bug fixes in this patch to work correctly. When coming up with test cases, I was also considering coverage and other edge cases that would be nice to have checks for in case more changes are made in the future. That's why some of them do already work correctly without changes. I can add these in another separate patch following the 4 for these bug fixes.
  1. I'll be separating these into their own patches.
  1. Done.
  1. The ancestorQueue and ancestor keys alone are sufficient in correctly performing the BFS traversal.

@srishti-pm In terms of separating the bug fixes into their own patches and having test cases that illustrate each bug fix, it's very difficult to write tests for each bug fix in isolation when all 4 are present at the same time in the original code. The examples I provided in the summary each assume everything else is correct. Some of the bug fixes, when moved to their own commit, also caused existing and previously correct test cases to fail.

Some of the bug fixes, when moved to their own commit, also caused existing and previously correct test cases to fail.

Why will this happen if your fix is correct? I'd suggest making sure your fix is correct and doesn't have a bug in itself.

Some of the bug fixes, when moved to their own commit, also caused existing and previously correct test cases to fail.

Why will this happen if your fix is correct? I'd suggest making sure your fix is correct and doesn't have a bug in itself.

Because fixing each bug individually means the other 3 bugs remain unaddressed in each patch. Those unaddressed bugs causes problems in the tests that are not due to the fix itself. Sorry the wording was a bit confusing in my last comment.

Some of the bug fixes, when moved to their own commit, also caused existing and previously correct test cases to fail.

Why will this happen if your fix is correct? I'd suggest making sure your fix is correct and doesn't have a bug in itself.

Because fixing each bug individually means the other 3 bugs remain unaddressed in each patch. Those unaddressed bugs causes problems in the tests that are not due to the fix itself. Sorry the wording was a bit confusing in my last comment.

You can have a specific ordering of commits in your local system that works for your tests. But, I'm not sure why this issue should happen for the existing tests, based on the bugs claimed and my understanding of the existing tests. It doesn't seem like any of the existing tests is working only because of a "stack of bugs".

I would suggest you try again to come up with specific tests for each bug. A unit test shouldn't be testing multiple orthogonal bug fixes together.

The issue here is that when all these fixes are combined in a single commit, it is impossible to actually review and verify their validity (I did try). It could be possible that an older bug is being unintentionally replaced by a new bug, or that the bug fix is an unintentional hack. A reviewer can't tell at this point. We want to avoid that scenario. That's why it is important to break a commit that is doing several different tasks into multiple smaller commits. Again, the size of a commit is not only defined by the number of code lines changed but also by the number of orthogonal tasks it is doing.

amandatang added a comment.EditedAug 23 2023, 2:23 PM

Bug 2 and bug 3 are related. When the visitedAncestor check in line 152 is removed, it causes block arguments to not be added to the ancestorQueue. Originally, block arguments are only added because nullptr is not in the visitedAncestor set, so either way, the logic is not sound despite being functional. There could also be additional block argument cast and check to make this more safe.

if (!operandDefOp || !visitedAncestors.contains(operandDefOp))

Bug 2 and bug 3 are related. When the visitedAncestor check in line 152 is removed, it causes block arguments to not be added to the ancestorQueue. Originally, block arguments are only added because nullptr is not in the visitedAncestor set, so either way, the logic is not sound despite being functional. There could also be additional block argument cast and check to make this more safe.

if (!operandDefOp || !visitedAncestors.contains(operandDefOp))

I see what issue you are facing.

You fixed bug 2 by doing this: visitedAncestors and its corresponding checks are removed completely. -> This is not "fixing bug 2" but rather "replacing bug 2 with a new bug". This is why you are facing the issue of interlinking between bug 2 and bug 3. You need to find a replacement for visitedAncestors, one that allows duplicated operands to be differentiated but also avoid visiting the same ancestor from the same operand position. I believe this can be done by taking into account, the operandNumber of the operand where the BFS is happening because that is unique, whether or not operands are duplicated. Once you do this, you will no longer have the weird issue of a relationship between fixing bug 2 and bug 3. In short, fix the fix of bug 2 to resolve the problem you are facing.

In the interest of time, I would still suggest you to have bug 1 and bug 4 fixes separated and ready. I will review them while you work on bug 2.

! In D157528#4608458, @amandatang wrote:
example1_test, example2_test and check_commutative_small_similar_ancestor_tree are the only tests that worked prior to the fixes in this patch.

The 5 (out of 7) new tests that are added in this patch that were already working without the fixes are example1_test, example2_test, check_commutative_non_lowest_level_block_argument, check_commutative_small_similar_ancestor_tree, and check_commutative_equal_ancestor_traversal_different_size. Every new test you add with a proposed fix should be failing before the fix. That is the only way it demonstrates the existence of a fix.

! In D157528#4608458, @amandatang wrote:
When coming up with test cases, I was also considering coverage and other edge cases that would be nice to have checks for in case more changes are made in the future. That's why some of them do already work correctly without changes. I can add these in another separate patch following the 4 for these bug fixes.

Yes, this is orthogonal work and should be done in a completely different patch, thanks. Although I don't believe example1_test and example2_test add any new test coverage. I haven't looked at the other 3 tests in detail. Will do that once your patches are broken into multiple patches, as discussed.

amandatang added a comment.EditedAug 23 2023, 8:27 PM

! In D157528#4608458, @amandatang wrote:
example1_test, example2_test and check_commutative_small_similar_ancestor_tree are the only tests that worked prior to the fixes in this patch.

The 5 (out of 7) new tests that are added in this patch that were already working without the fixes are example1_test, example2_test, check_commutative_non_lowest_level_block_argument, check_commutative_small_similar_ancestor_tree, and check_commutative_equal_ancestor_traversal_different_size. Every new test you add with a proposed fix should be failing before the fix. That is the only way it demonstrates the existence of a fix.

! In D157528#4608458, @amandatang wrote:
When coming up with test cases, I was also considering coverage and other edge cases that would be nice to have checks for in case more changes are made in the future. That's why some of them do already work correctly without changes. I can add these in another separate patch following the 4 for these bug fixes.

Yes, this is orthogonal work and should be done in a completely different patch, thanks. Although I don't believe example1_test and example2_test add any new test coverage. I haven't looked at the other 3 tests in detail. Will do that once your patches are broken into multiple patches, as discussed.

For Bug 4, the assertion error initially was only encountered on the Windows version of the build in this patch. My local Linux environment and the Linux build were both unable to catch this. Although there is no previously broken test case that gets resolved via this fix, it is a big problem that will cause failures at runtime. The assertion exists within stable_sort to prevent the endless swapping of equal operands if the comparison method returns true for those cases.

For Bug 4, the assertion error initially was only encountered on the Windows version of the build in this patch. My local Linux environment and the Linux build were both unable to catch this. Although there is no previously broken test case that gets resolved via this fix, it is a big problem that will cause failures at runtime. The assertion exists within stable_sort to prevent the endless swapping of equal operands if the comparison method returns true for those cases.

I understand that it was only encountered in Windows. It is good to work towards fixing such errors but don't combine all fixes together. That's my only ask.

Is this the only other new patch created: https://reviews.llvm.org/D158796 ? It seems to be moving around a lot of code from one file to the other. I'm confused as to whether or not such a movement is relevant to the fix claimed.