This is an archive of the discontinued LLVM Phabricator instance.

[X86] Make library calls sensitive to regparm module flag (Fixes PR3997).
ClosedPublic

Authored by niravd on Nov 23 2016, 8:18 AM.

Details

Summary

Case all constructions of library calls in ISEL to hook into target-specific attribute labeller. This allows libcalls to be sensitive to
the default regparm flag.

Diff Detail

Repository
rL LLVM

Event Timeline

niravd updated this revision to Diff 79087.Nov 23 2016, 8:18 AM
niravd retitled this revision from to [X86] Add explicit regparm flag for X86-32 calling convention..
niravd updated this object.
niravd added reviewers: rnk, mkuper.
niravd added a subscriber: llvm-commits.
niravd updated this revision to Diff 79088.Nov 23 2016, 8:19 AM

Add missing testcase

mkuper edited edge metadata.Nov 23 2016, 9:59 AM

IIRC, The MCU ABI has at least one other difference from -mregparm=3:

MCU will continue passing parameters in-register even if an earlier parameter ends up on the stack, as long as free registers are available. -mregparm will bail once a parameter doesn't fit in a register. So, for something like foo(int a, int b, long long c, int d), MCU will pass {a, b, c} inreg, and mregparm will only do it for {a,b}.

(Interestingly enough, it looks like GCC gets this wrong, and has the -mregparm behavior for -miamcu... :-\ )

include/llvm/CodeGen/CommandFlags.h
48 ↗(On Diff #79088)

I really don't think we should have this (except, possibly, for debugging, in which case it probably needs to be hidden).
As I mentioned on D27051, the information should be contained in the IR.

I meant "MCU will pass {a, b, d} inreg, and mregparm will only do it for {a,b}."

There's two cases:

  1. The calling convention for functions named in the IR. This can be set separately for each function by the frontend. Today, it's done by putting "inreg" attributes on appropriate parameters.
  2. The calling convention used for library calls generated by LLVM. Today, this is not handled.

The former can be specified by attribute((regparm(N))), and the -mregparm command-line flag specifies both 2, and the default for 1.

This patch adds support for 2, without changing the way 1 works. But, ISTM, both of these should be really handled similarly -- I don't much like that this patch is adding new calling convention code for 2, but leaving the handling for 1 in clang's decision about whether to use "inreg".

I think, instead, that 3 new calling conventions, "x86_regparm_{1,2,3}" should be defined -- variants of the C calling convention, with a different number of args passed in registers.

Then --

  1. Clang can emit the right cc for functions it defines/declares, rather than choosing where to place inreg.
  2. Which one of the calling convention to use for libcalls can be set by a subtarget feature (selected by clang based on -mregparm=N command-line option, examined by X86TargetLowering to determine what to pass to setLibcallCallingConv).
niravd updated this revision to Diff 79945.Dec 1 2016, 10:34 AM
niravd edited edge metadata.

Rework with new calling conventions. Unchange MCU ABI.

This new version leaves the old inreg computation in place but add library-call-only calling conventions to deal with libcalls (one for C and one for StdCall). Not ideal, but does work.

A quick search didn't turn up a definitive description of the MCU ABI to verify that gcc is wrong, so I've returned MCU back to its original behavior for now.

This is the most definitive description of the MCU ABI I'm aware of:
https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/iamcu-psABI-0.7.pdf

See the example in table 2.5.
To the best of my understanding, -mregparm=3 should pass "s" on the stack, not in register.

niravd updated this object.Dec 2 2016, 6:44 AM
rnk edited edge metadata.Dec 6 2016, 11:56 AM

You touched stdcall, and that reminded me of -mrtd, which has the same problem as -mregparm=: calls to library functions generated by LLVM aren't handled correctly.

Does anybody have any objections to doing this with module flags? They seem like the right kind of thing here for communicating global settings to the backend, and they have the nice property that mismatches get diagnosed during LTO. If you use function attributes as you've done in this patch, it's still possible to LTO a non-regparm TU with a regparm TU and get weird results.

niravd updated this revision to Diff 85926.Jan 26 2017, 9:04 AM

Switch from Subtarget features to module flag.

niravd edited the summary of this revision. (Show Details)Jan 30 2017, 7:06 AM
rnk added a comment.Jan 30 2017, 5:01 PM

This new version leaves the old inreg computation in place but add library-call-only calling conventions to deal with libcalls (one for C and one for StdCall). Not ideal, but does work.

Can you elaborate on why we need these new calling conventions? I thought that just the module flag would be enough to get things right.

niravd added a comment.EditedFeb 6 2017, 12:51 PM

Can you elaborate on why we need these new calling conventions? I thought that just the module flag would be enough to get things right.

The module flag makes it clear how to do the lowering for intrinsics. The calling convention is to mark which calls are to intrinsics. At call time we only have the type / arg attributes and the calling convention and given the C regparm attribute, we can't rely on the reg arg attributes being zero for just intrinsics. The obvious solutions for this are:

  1. Explicitly label all library calls as being of a special calling convention that always uses the module flag value - Uses two calling conventions because of StdCall.
  2. Calculate which parameters should be done in registers in intrinsic lowering time - This seems brittle and more awkward than it's worth.
  3. Mark each intrinsic function with a new "isIntrinsic" function-level attribute - Attributes seem more valuable a resource then calling conventions but this would work as well.

I'm disinclined to implement 2 but 3 might be reasonable.

rnk added a comment.Feb 13 2017, 5:03 PM

What's hard about approach number 2? There are only 47 or so places where we construct a CallLoweringInfo to feed to LowerCallTo. I bet you could add a method to CallLoweringInfo that handles all the common cases of making a call to an intrinsic, similar to how setCallee is supposed to work.

niravd updated this revision to Diff 89391.Feb 22 2017, 11:36 AM

Replace calling conventions with special casing of setcall in CallLoweringInfo.

niravd retitled this revision from [X86] Add explicit regparm flag for X86-32 calling convention. to [X86] Make library calls sensitive to regparm module flag (Fixes PR3997)..Feb 22 2017, 12:22 PM
niravd edited the summary of this revision. (Show Details)
rnk added a comment.Mar 7 2017, 4:10 PM

Minor comments, I like the approach, sorry for the delay

include/llvm/Target/TargetLowering.h
167 ↗(On Diff #89391)

I think we can probably leave the SDValue Node field null when doing fast isel and get away without any subclassing. It simplifies the code that needs to take the address of all the arglistentries as well.

2636–2638 ↗(On Diff #89391)

Avoiding the subclassing would save this work

lib/CodeGen/SelectionDAG/FastISel.cpp
879 ↗(On Diff #89391)

ditto

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2509 ↗(On Diff #89391)

Unrelated change?

niravd updated this revision to Diff 91892.Mar 15 2017, 9:31 AM
niravd edited the summary of this revision. (Show Details)

Fold all ArgListEntry types together and cleanup.

niravd marked an inline comment as done.Mar 15 2017, 9:36 AM
niravd added inline comments.
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2509 ↗(On Diff #89391)

Unrelated diff from rebasing.

rnk accepted this revision.Mar 17 2017, 8:37 AM

lgtm

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7658 ↗(On Diff #91892)

Please commit the s/is/Is/ change first as a separate change to reduce the diff size and simplify future source history archaeology.

This revision is now accepted and ready to land.Mar 17 2017, 8:37 AM
This revision was automatically updated to reflect the committed changes.
niravd marked an inline comment as done.