OperationFolder::tryToFold was running the pre-replacement action even
when there was no constant folding, i.e., when the operation was just
being updated in place but was not going to be replaced. This led to
nested ops being unnecessarily removed from the worklist and only being
processed in the next outer iteration of the greedy pattern rewriter,
which is also why this didn't affect the final output IR but only the
convergence rate. It also caused the users of an op's results to be
unnecessarily added to the worklist.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Hmmm, can we instead add a flag to the callback detailing if the op was just updated in-place? I'd say for canonicalization we still want to reconsider the original operands and the users of the operation even when it gets updated.
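
Roughly what I have in mind, as a self-contained toy rather than a patch. ToyOp, FoldKind, tryToFoldToy and ToyDriver are all made-up names for illustration, not the actual OperationFolder or rewriter API. The callback fires for both kinds of fold, and the caller decides what to skip for in-place updates:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an operation; this is NOT mlir::Operation.
struct ToyOp {
  std::string name;
  bool foldsToConstant; // folding would replace the op with a constant
  bool foldsInPlace;    // folding would only mutate the op in place
};

// Hypothetical flag passed to the callback.
enum class FoldKind { InPlaceUpdate, Replacement };

// Hypothetical callback shape: invoked whenever folding succeeds.
using PreFoldAction = std::function<void(ToyOp &, FoldKind)>;

// Returns true if the op was folded, either in place or by replacement.
bool tryToFoldToy(ToyOp &op, const PreFoldAction &action) {
  // An op cannot be both updated in place and replaced in this toy model.
  assert(!(op.foldsInPlace && op.foldsToConstant));
  if (op.foldsInPlace) {
    action(op, FoldKind::InPlaceUpdate);
    return true;
  }
  if (op.foldsToConstant) {
    action(op, FoldKind::Replacement);
    return true;
  }
  return false;
}

// Toy driver that records what it would do to its worklist.
struct ToyDriver {
  std::vector<std::string> revisited;     // operands/users get re-queued
  std::vector<std::string> nestedDropped; // nested ops leave the worklist

  bool fold(ToyOp &op) {
    return tryToFoldToy(op, [this](ToyOp &o, FoldKind kind) {
      // Reconsider operands and users for both kinds of fold.
      revisited.push_back(o.name);
      // Only drop nested ops when the op is really going away.
      if (kind == FoldKind::Replacement)
        nestedDropped.push_back(o.name);
    });
  }
};
```

With something of this shape, an in-place update still re-queues the operands and users, but nested ops stay on the worklist unless the op is actually replaced.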