Page MenuHomePhabricator

[X86] Avoid using high register trick for test instruction
ClosedPublic

Authored by deadalnix on Jan 29 2018, 8:27 AM.

Details

Summary

It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.

The main noteworthy regression I was able to observe was pattern of the type (setcc (trunc (and X, C)), 0) where C is such as it would benefit from the hi register trick. To prevent this, a new pattern is added to materialize such pattern using a 32 bits test. This has the added benefit of working with any constant that is materializable as a 32bits immediate, not just the ones that can leverage the high register trick, as demonstrated by the test case in test-shrink.ll using the constant 2049 .

Diff Detail

Repository
rL LLVM

Event Timeline

deadalnix created this revision.Jan 29 2018, 8:27 AM
craig.topper added inline comments.Jan 29 2018, 9:33 AM
lib/Target/X86/X86ISelDAGToDAG.cpp
3090 ↗(On Diff #131809)

I believe this was the only use of the TEST8ri_NOREX instruction. Can you remove it from the td file too?

niravd added inline comments.Jan 29 2018, 9:41 AM
lib/Target/X86/X86ISelDAGToDAG.cpp
3101 ↗(On Diff #131809)

We're now generating a testl but we've passed the logic to reduce to the smaller to testw. Can you fold that logic into this and add a test case (e.g. replicate testOperand32 with optsize set)?

Remove TEST8ri_NOREX
Merge codepath for testl and testw

deadalnix added inline comments.Jan 29 2018, 12:42 PM
test/CodeGen/X86/test-shrink.ll
530 ↗(On Diff #131854)

The test case for size optimization was added here in another commit, in case reviewers are wondering.

niravd accepted this revision.Jan 29 2018, 12:43 PM

You're missing the optsize test case but otherwise LGTM.

This revision is now accepted and ready to land.Jan 29 2018, 12:43 PM

Remove leftover DEBUG

This revision was automatically updated to reflect the committed changes.
reames added a subscriber: reames.Jan 29 2018, 4:43 PM

Is it worth considering doing this as a late peephole? (i.e. after register allocation if the register happens to be appropriate?)

Potentially, but the flag usage is tricky. We can't turn TEST32 into TEST8 if the sign flag is being used. We are guaranteed safe if bit 15 of the immediate is clear. But if bit 15 is set we'd need to check flag users.

craig.topper reopened this revision.Jan 30 2018, 10:28 AM

Reopening since it was reverted. I think this patch experienced some scope creep. As titled it should have only removed the TEST8ri_NOREX creation. Improvements to truncand32 from test-shrink.ll should have been separate based on the title. I don't think there was any need to merge TEST32 and TEST16 handling together especially since TEST8 is still separate. I suspect the bug that caused the revert was created during that merging.

This revision is now accepted and ready to land.Jan 30 2018, 10:28 AM
deadalnix updated this revision to Diff 132147.Jan 31 2018, 5:15 AM

Reduce the diff to simply avoid doing the high register trick.
It turns out that there is a bug in the other optimization made in the orginal diff, so it seems like a better idea to split that out in several smaller steps to ensure progress.

This revision was automatically updated to reflect the committed changes.