Page MenuHomePhabricator

[GlobalISel] Accept multiple vregs for lowerCall's arguments
ClosedPublic

Authored by rovka on Jun 19 2019, 6:59 AM.

Details

Summary

Change the interface of CallLowering::lowerCall to accept several
virtual registers for each argument, instead of just one. This is a
follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.

With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

NFCI for AMDGPU, Mips and X86.

Diff Detail

Repository
rL LLVM

Event Timeline

rovka created this revision.Jun 19 2019, 6:59 AM
rovka retitled this revision from [GlobalISel] Accept multiple vregs for lowerCall's result to [GlobalISel] Accept multiple vregs for lowerCall's arguments.
rovka edited the summary of this revision. (Show Details)
arsenm accepted this revision.Jun 19 2019, 4:16 PM
This revision is now accepted and ready to land.Jun 19 2019, 4:16 PM
rovka updated this revision to Diff 205761.Jun 20 2019, 2:37 AM

Thanks for the review! More context for posterity though :)

aemerson added inline comments.Jun 20 2019, 11:36 AM
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
1257 ↗(On Diff #205761)

Do we still need packRegs anywhere at all?

LGTM too if we remove the last packRegs use and delete the IRTranslator pack/unpackRegs() implementations.

rovka added a comment.Mon, Jun 24, 1:37 AM

LGTM too if we remove the last packRegs use and delete the IRTranslator pack/unpackRegs() implementations.

I was going to do that, but I'm not super familiar with the intrinsics code so I wanted to get feedback on the general idea first. I'll try to give it a shot this week. Thanks for all the reviews!

rovka added a comment.Tue, Jun 25, 3:41 AM

Hi Amara,

Is something like this OK for intrinsics: https://reviews.llvm.org/differential/diff/206394/ ? It has the disadvantage that if we have an intrinsic with 2 aggregate args, it will be hard to tell where one of them ends and the next one begins. OTOH, I'm having trouble finding intrinsics with even one aggregate parameter, and in fact just using getOrCreateVReg there doesn't cause check-all to crash (nor the test-suite, nor selfhost at -O0 on AArch64).

Alternatively, we could preserve the old interface here and try to move pack/unpackRegs to utils, where we can use them from both the IRTranslator and CallLowering.

Either way, it would be nice to be able to add a test-case before changing this interface, but like I said I don't really know my way around intrinsics that well. Any suggestions?

Hi Amara,

Is something like this OK for intrinsics: https://reviews.llvm.org/differential/diff/206394/ ? It has the disadvantage that if we have an intrinsic with 2 aggregate args, it will be hard to tell where one of them ends and the next one begins. OTOH, I'm having trouble finding intrinsics with even one aggregate parameter, and in fact just using getOrCreateVReg there doesn't cause check-all to crash (nor the test-suite, nor selfhost at -O0 on AArch64).

Alternatively, we could preserve the old interface here and try to move pack/unpackRegs to utils, where we can use them from both the IRTranslator and CallLowering.

Either way, it would be nice to be able to add a test-case before changing this interface, but like I said I don't really know my way around intrinsics that well. Any suggestions?

You're probably right, I can't think of an intrinsic for a test case. I think we should actually fall back if getOrCreateVRegs() ever returns multiple vregs there, we can cross that bridge when we come to it (if ever).

This revision was automatically updated to reflect the committed changes.