Case all constructions of library calls in ISEL to hook into target-specific attribute labeller. This allows libcalls to be sensitive to
the default regparm flag.
Details
Diff Detail
- Build Status
Buildable 1544 Build 1544: arc lint + arc unit
Event Timeline
IIRC, The MCU ABI has at least one other difference from -mregparm=3:
MCU will continue passing parameters in-register even if an earlier parameter ends up on the stack, as long as free registers are available. -mregparm will bail once a parameter doesn't fit in a register. So, for something like foo(int a, int b, long long c, int d), MCU will pass {a, b, c} inreg, and mregparm will only do it for {a,b}.
(Interestingly enough, it looks like GCC gets this wrong, and has the -mregparm behavior for -miamcu... :-\ )
include/llvm/CodeGen/CommandFlags.h | ||
---|---|---|
48 | I really don't think we should have this (except, possibly, for debugging, in which case it probably needs to be hidden). |
There's two cases:
- The calling convention for functions named in the IR. This can be set separately for each function by the frontend. Today, it's done by putting "inreg" attributes on appropriate parameters.
- The calling convention used for library calls generated by LLVM. Today, this is not handled.
The former can be specified by attribute((regparm(N))), and the -mregparm command-line flag specifies both 2, and the default for 1.
This patch adds support for 2, without changing the way 1 works. But, ISTM, both of these should be really handled similarly -- I don't much like that this patch is adding new calling convention code for 2, but leaving the handling for 1 in clang's decision about whether to use "inreg".
I think, instead, that 3 new calling conventions, "x86_regparm_{1,2,3}" should be defined -- variants of the C calling convention, with a different number of args passed in registers.
Then --
- Clang can emit the right cc for functions it defines/declares, rather than choosing where to place inreg.
- Which one of the calling convention to use for libcalls can be set by a subtarget feature (selected by clang based on -mregparm=N command-line option, examined by X86TargetLowering to determine what to pass to setLibcallCallingConv).
This new version leaves the old inreg computation in place but add library-call-only calling conventions to deal with libcalls (one for C and one for StdCall). Not ideal, but does work.
A quick search didn't turn up a definitive description of the MCU ABI to verify that gcc is wrong, so I've returned MCU back to its original behavior for now.
This is the most definitive description of the MCU ABI I'm aware of:
https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/iamcu-psABI-0.7.pdf
See the example in table 2.5.
To the best of my understanding, -mregparm=3 should pass "s" on the stack, not in register.
You touched stdcall, and that reminded me of -mrtd, which has the same problem as -mregparm=: calls to library functions generated by LLVM aren't handled correctly.
Does anybody have any objections to doing this with module flags? They seem like the right kind of thing here for communicating global settings to the backend, and they have the nice property that mismatches get diagnosed during LTO. If you use function attributes as you've done in this patch, it's still possible to LTO a non-regparm TU with a regparm TU and get weird results.
Can you elaborate on why we need these new calling conventions? I thought that just the module flag would be enough to get things right.
Can you elaborate on why we need these new calling conventions? I thought that just the module flag would be enough to get things right.
The module flag makes it clear how to do the lowering for intrinsics. The calling convention is to mark which calls are to intrinsics. At call time we only have the type / arg attributes and the calling convention and given the C regparm attribute, we can't rely on the reg arg attributes being zero for just intrinsics. The obvious solutions for this are:
- Explicitly label all library calls as being of a special calling convention that always uses the module flag value - Uses two calling conventions because of StdCall.
- Calculate which parameters should be done in registers in intrinsic lowering time - This seems brittle and more awkward than it's worth.
- Mark each intrinsic function with a new "isIntrinsic" function-level attribute - Attributes seem more valuable a resource then calling conventions but this would work as well.
I'm disinclined to implement 2 but 3 might be reasonable.
What's hard about approach number 2? There are only 47 or so places where we construct a CallLoweringInfo to feed to LowerCallTo. I bet you could add a method to CallLoweringInfo that handles all the common cases of making a call to an intrinsic, similar to how setCallee is supposed to work.
Minor comments, I like the approach, sorry for the delay
include/llvm/Target/TargetLowering.h | ||
---|---|---|
167 ↗ | (On Diff #89391) | I think we can probably leave the SDValue Node field null when doing fast isel and get away without any subclassing. It simplifies the code that needs to take the address of all the arglistentries as well. |
2636–2638 ↗ | (On Diff #89391) | Avoiding the subclassing would save this work |
lib/CodeGen/SelectionDAG/FastISel.cpp | ||
879 ↗ | (On Diff #89391) | ditto |
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
2509 ↗ | (On Diff #89391) | Unrelated change? |
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
2509 ↗ | (On Diff #89391) | Unrelated diff from rebasing. |
lgtm
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
7658 ↗ | (On Diff #91892) | Please commit the s/is/Is/ change first as a separate change to reduce the diff size and simplify future source history archaeology. |
I really don't think we should have this (except, possibly, for debugging, in which case it probably needs to be hidden).
As I mentioned on D27051, the information should be contained in the IR.