This patch collects operations that have users in a for loop and uses them when loop invariant operations are detected and hoisted.
This choice of name definedOps appears to be confusing: there is no such thing as ops being defined. It would be good rename this and add a comment while on this (since you are adding to definedOps below) to make this readable.
Nit: op.getNumResults() > 0 if you prefer.
Just because the op has results doesn't mean those results have users. You can do a check in O(1) time to determine if the results have no uses (use_empty()). This will make the check more powerful and you'll still be able to hoist those that have results but are ultimately unused.
Thank you for bringing up this issue in LICM, it is an important one. I have a few comments on the fix:
- Inherent issue is definedOps ( going to be renamed to opsWithUsers) is being populated only under some conditions, and does not actually do the job of collecting all ops that define an SSA value in that block.
- Even after this fix, only AffineForOps will be added to the list, but there are other cases like AffineDma... ops, AffineIfOps that are actually NOT code invariant etc. which will still cause similar problems.
So, I suggest that you can implement a generic fix for this inherent issue, where any op that defines an SSA value can be added to the above list, and this is decoupled from the checks like 'isOpLoopInvariant`. This will make the check more robust, and relevant test cases can also be added.
@sumesh13 This new generic fix looks good, it handles more cases that were previously not handled. Thanks for this. I have added some comments related to code organization, please check them. Logic looks good to me.
It is better to keep this local like before, see comment below.
This is a per AffineForOp var, and can be kept local here like before.
Populating into definedOps (opsWithUsers) can happen here. No need to walk the affine for, because source of an operand for any op, can only come before it. So,
if (op->getNumResults() > 0 && !op->use_empty()) opsWithUsers.insert(op);
can be placed here.
@sumesh13 Thanks for addressing all the comments, the change looks good now. Can you please check these last two comments before I approve this ?
Don't need this now, right ? This line can be deleted, and comment be moved to the location where it is getting populated.
This section now does not have any change, so please make sure that there is no diff here.
Please change the commit title to a more relevant one, I am approving this.
@bondhugula Could you please take a final pass at this?
Actually, I take my above comment back. This function is being used in ifOp hoisting as well, and with the current implementation it will be needed. Refactoring that can be a separate change in itself.
@vinayaka-polymage Just noticed that you accepted a previous revision with comment on handling of ifops. Considering ifops, would you think it might be helpful to just collect all the ops like I was doing in a previous revision ?
Thanks ...that was one reason I was collecting ops separately. Would you think that might be preferable ?
Gentle reminder..also pls suggest if it would be better to collect all the opsWithUsers separately as i was doing in a previous revision. For nested ifs, looks like otherwise you may not be collecting all the opsWithUsers by just collecting at the top level and leaves.