This is an archive of the discontinued LLVM Phabricator instance.

[x86] Teach the x86 backend that it can fold between TCRETURNm* and TCRETURNr* and fix latent bugs with register class updates.
ClosedPublic

Authored by chandlerc on Jul 23 2018, 8:35 PM.

Details

Summary

Enabling this fully exposes a latent bug in the instruction folding: we
never update the register constraints for the register operands when
fusing a load into another operation. The fused form could, in theory,
have different register constraints on its operands. And in fact,
TCRETURNm* needs its memory operands to use tailcall compatible
registers.

I've updated the folding code to re-constrain all the registers after
they are mapped onto their new instruction.

However, we still can't enable folding in the general case from
TCRETURNr* to TCRETURNm* because doing so may require more registers to
be available during the tail call. If the call itself uses all but one
register, and the folded load would require both a base and index
register, there will not be enough registers to allocate the tail call.

It would be better, IMO, to teach the register allocator to *unfold*
TCRETURNm* when it runs out of registers (or specifically check the
number of registers available during the TCRETURNr*) but I'm not going
to try and solve that for now. Instead, I've just blocked the forward
folding from r -> m, leaving LLVM free to unfold from m -> r as that
doesn't introduce new register pressure constraints.

The down side is that I don't have anything that will directly exercise
this. Instead, I will be immediately using this it my SLH patch. =/

Still worse, without allowing the TCRETURNr* -> TCRETURNm* fold, I don't
have any tests that demonstrate the failure to update the memory operand
register constraints. This patch still seems correct, but I'm nervous
about the degree of testing due to this.

Suggestions?

Diff Detail

Repository
rL LLVM

Event Timeline

chandlerc created this revision.Jul 23 2018, 8:35 PM
craig.topper added inline comments.Jul 23 2018, 8:41 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
4672 ↗(On Diff #156965)

Not at my computer but can this be NewMI.getDesc instead of TII.get?

chandlerc marked an inline comment as done.Jul 23 2018, 8:42 PM
chandlerc added inline comments.
llvm/lib/Target/X86/X86InstrInfo.cpp
4672 ↗(On Diff #156965)

Doh, yes of course. Will update patch when back at my computer as well.

chandlerc updated this revision to Diff 157008.Jul 24 2018, 4:54 AM
chandlerc marked an inline comment as done.

Use NewMI.getDesc() rather than looking it up in TII again. Thanks to Craig for the suggestion.

craig.topper added inline comments.Jul 24 2018, 10:12 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
4662 ↗(On Diff #157008)

This is cute and all, but why not a normal integer for loop?

chandlerc added inline comments.Jul 24 2018, 10:57 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
4662 ↗(On Diff #157008)

I find it shorter and more clear (you can look at the review for Sequence.h for other commentary there). If it bothers folks, I can remove it, I just routinely use it in new code.

This revision is now accepted and ready to land.Jul 24 2018, 11:43 AM
This revision was automatically updated to reflect the committed changes.