This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Allow constrained FP intrinsics in mightUseCTR
ClosedPublic

Authored by qiucf on Jun 16 2020, 4:02 AM.

Details

Summary

We may meet Invalid CTR loop crash when there's constrained ops inside. This patch adds constrained FP intrinsics to the list so that CTR loop verification doesn't complain about it.

Diff Detail

Event Timeline

qiucf created this revision.Jun 16 2020, 4:02 AM
steven.zhang added inline comments.Jun 16 2020, 7:22 PM
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
413

So, how about the constrained fcmp/fcmps ?

qiucf updated this revision to Diff 272332.Jun 21 2020, 10:05 PM
qiucf marked an inline comment as done.

Add logic for constrained binops/conversion.

qiucf added inline comments.Jun 21 2020, 10:06 PM
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
413

It seems normal fcmp is also not handled here (when hard float is on).

qiucf updated this revision to Diff 282495.Aug 2 2020, 8:38 PM

Handle constrained 'fcmp' and 'fcmps'

steven.zhang accepted this revision.Aug 2 2020, 8:44 PM

LGTM. Thank you for doing this.

This revision is now accepted and ready to land.Aug 2 2020, 8:44 PM
steven.zhang requested changes to this revision.Aug 2 2020, 8:47 PM

I realize that, the test is missing. Please provide a test for a loop that has the constrain fp operations and verify if the hw loop is generated.

This revision now requires changes to proceed.Aug 2 2020, 8:47 PM
qiucf updated this revision to Diff 282507.Aug 2 2020, 11:41 PM

Add missing test back. Thanks for review

Add missing test back. Thanks for review

Can you add checker for the bdnz ?

Can you add checker for the bdnz ?

We don't think any loop containing calls is profitable to transform:

// We don't want to spill/restore the counter register, and so we don't
// want to use the counter register if the loop contains calls.

And it seems PPC won't set hardware loops info properly if it's not profitable. So when forcing it to use hardware loops, it would crash.

Can you add checker for the bdnz ?

We don't think any loop containing calls is profitable to transform:

I think the comments is out of date. This is the full context and we will check if we are using CTR for some calls as what you did now.

// We don't want to spill/restore the counter register, and so we don't
// want to use the counter register if the loop contains calls.
SmallPtrSet<const Value *, 4> Visited;
for (Loop::block_iterator I = L->block_begin(), IE = L->block_end();
     I != IE; ++I)
  if (mightUseCTR(*I, LibInfo, Visited))
    return false;
// We don't want to spill/restore the counter register, and so we don't
// want to use the counter register if the loop contains calls.

And it seems PPC won't set hardware loops info properly if it's not profitable. So when forcing it to use hardware loops, it would crash.

qiucf updated this revision to Diff 286962.Aug 21 2020, 12:56 AM

Add test

qiucf updated this revision to Diff 286978.Aug 21 2020, 2:36 AM

use script to refresh test

steven.zhang accepted this revision.Aug 21 2020, 3:04 AM

LGTM now.

This revision is now accepted and ready to land.Aug 21 2020, 3:04 AM
steven.zhang added inline comments.Aug 21 2020, 3:09 AM
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
410

This can be improved to map to the ISD Opcode and query the target hook to decide if it is call or instruction as what we did for sqrt etc. I think, we can post a NFC patch to handle all such intrinsic this way to sync it with lower.

This revision was automatically updated to reflect the committed changes.