We may meet Invalid CTR loop crash when there's constrained ops inside. This patch adds constrained FP intrinsics to the list so that CTR loop verification doesn't complain about it.
Details
- Reviewers
ZhangKang steven.zhang nemanjai jsji hfinkel - Group Reviewers
Restricted Project - Commits
- rGfed6107dcbfb: [PowerPC] Allow constrained FP intrinsics in mightUseCTR
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp | ||
---|---|---|
520 | So, how about the constrained fcmp/fcmps ? |
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp | ||
---|---|---|
520 | It seems normal fcmp is also not handled here (when hard float is on). |
I realize that, the test is missing. Please provide a test for a loop that has the constrain fp operations and verify if the hw loop is generated.
We don't think any loop containing calls is profitable to transform:
// We don't want to spill/restore the counter register, and so we don't // want to use the counter register if the loop contains calls.
And it seems PPC won't set hardware loops info properly if it's not profitable. So when forcing it to use hardware loops, it would crash.
I think the comments is out of date. This is the full context and we will check if we are using CTR for some calls as what you did now.
// We don't want to spill/restore the counter register, and so we don't // want to use the counter register if the loop contains calls. SmallPtrSet<const Value *, 4> Visited; for (Loop::block_iterator I = L->block_begin(), IE = L->block_end(); I != IE; ++I) if (mightUseCTR(*I, LibInfo, Visited)) return false;
// We don't want to spill/restore the counter register, and so we don't // want to use the counter register if the loop contains calls.And it seems PPC won't set hardware loops info properly if it's not profitable. So when forcing it to use hardware loops, it would crash.
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp | ||
---|---|---|
410 | This can be improved to map to the ISD Opcode and query the target hook to decide if it is call or instruction as what we did for sqrt etc. I think, we can post a NFC patch to handle all such intrinsic this way to sync it with lower. |
This can be improved to map to the ISD Opcode and query the target hook to decide if it is call or instruction as what we did for sqrt etc. I think, we can post a NFC patch to handle all such intrinsic this way to sync it with lower.