This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Don't replicate instructions in Ifcvt at minsize
ClosedPublic

Authored by dmgreen on Apr 1 2019, 12:14 PM.

Details

Summary

Ifcvt can replicate instructions as it converts them to be predicated. This stops that from happening at minsize.

Test is new, just showing the differences here for clarity.

Diff Detail

Repository
rL LLVM

Event Timeline

dmgreen created this revision.Apr 1 2019, 12:14 PM
efriedma added inline comments.Apr 1 2019, 3:52 PM
llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
1939 ↗(On Diff #193148)

Maybe explain why this increases codesize in a comment. I guess the issue here is that if a block has multiple predecessors, you essentially have to clone it, then if-convert the clone? (Theoretically, if we if-convert all the predecessors, it could actually save codesize, but I guess that's unlikely to come up in practice.)

llvm/test/CodeGen/Thumb2/ifcvt-minsize.ll
17 ↗(On Diff #193148)

The testcases don't really seem to demonstrate the point of this... yes, you're saving two bytes, but only because we can use cbz here.

48 ↗(On Diff #193148)

I guess this is sort of orthogonal to your patch, but why aren't we generating "pop {r7, pc}" here?

dmgreen updated this revision to Diff 194042.Apr 6 2019, 4:25 PM
dmgreen marked 3 inline comments as done.
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
1939 ↗(On Diff #193148)

I kind of forgot about A32 code. I've made this T32 only and updated the comment.

From what I've seen, we usually trade a branch for a IT block, so the cloning comes out as a negative.

llvm/test/CodeGen/Thumb2/ifcvt-minsize.ll
17 ↗(On Diff #193148)

Oh yeah. Block placement is duplicating certain instructions it shouldn't too. It gets a taildup limit of 1, which in these situations where you don't get a fallthrough makes things worse.

I tried just turning that off, (putting the limit down to 0 at minsize) but that makes things larger in places.

CBZ's are a part of what makes this better, but I've added a new test that compares against 1

48 ↗(On Diff #193148)

Yet another things that isn't tuned for minsize...

This revision is now accepted and ready to land.Apr 19 2019, 2:46 PM
This revision was automatically updated to reflect the committed changes.