- User Since
- Jan 21 2020, 7:29 AM (34 w, 6 d)
Fri, Sep 18
Cleaning-up test and removing the getABIRegCopyCC function.
This seemed weird to me as well, especially as removing it had no impact on any of the regression tests.
Thu, Sep 17
Thu, Aug 27
Removing unrelated whitespace change.
Tue, Aug 25
Hi @hans, I'll have a look at it!
Jul 9 2020
Jul 7 2020
Jul 2 2020
Moving fix to SelectionDAG::getNode.
Jun 25 2020
The issue happens when creating the ISD::FP_TO_FP16 in case the input is a constant.
It tries to generate the new 16-bit constant, but the NOutVT type passed to DAG.getNode is 32 bits wide, which causes the assertion in SelectionDAG.cpp:1307 to fail:
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1307: llvm::SDValue llvm::SelectionDAG::getConstant(const llvm::ConstantInt &, const llvm::SDLoc &, llvm::EVT, bool, bool): Assertion `Elt->getBitWidth() == EltVT.getSizeInBits() && "APInt size does not match type size!"' failed.
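To make the mismatch concrete, here is a toy model of the check that fails. ToyConstant, ToyType and constantFitsType are hypothetical stand-ins invented for this sketch; they are not LLVM's APInt, EVT or SelectionDAG::getConstant, and this is not the fix posted in D82552.

```cpp
#include <cstdint>

// Hypothetical stand-in for a constant that carries its own bit width.
struct ToyConstant {
  unsigned BitWidth; // number of bits the constant was created with
  uint64_t Bits;     // raw bit pattern
};

// Hypothetical stand-in for a value type with a fixed size.
struct ToyType {
  unsigned SizeInBits;
};

// Same shape as the asserted condition: the constant's width must equal
// the size of the type it is being created for.
bool constantFitsType(const ToyConstant &C, const ToyType &T) {
  return C.BitWidth == T.SizeInBits;
}
```

In this model a 16-bit constant such as the bit pattern of half 0.0 passes the check for a 16-bit type and fails it for a 32-bit type, which is the situation described above.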
I've posted D82552, which fixes the assertion failure in type legalization's bitcast handling. Once it lands, it should be OK to include arm_aapcs_fvpcc and swiftcc in this patch.
Jun 24 2020
It seems most of the argument lowering conversions get simplified in the regular case, when not using fastcc.
When using it, though, type legalization ends up trying to handle an i16 = bitcast ConstantFP:f16<APFloat(0)> node, turning it into an i32 <- f32 operation and getting lost while trying to create the new constant.
Taking a further look at the differences in the DAG, the resulting type is actually the same.
Both result in an f32 value, but get to it in different ways as expected.
I believe simply limiting the calling conventions might be a bit tricky. At this point the method might get a calling convention id that does not reflect the one required for argument lowering (see ARMTargetLowering::getEffectiveCallingConv in ARMISelLowering.cpp).
One option to get around this would be to use that same method to determine the effective calling convention, but it might not be easy to check whether or not the function being handled is variadic.
Jun 18 2020
Jun 16 2020
Splitting NFC changes into a separate patch.
Jun 15 2020
The changes to the backend only handle the half (f16) type itself, not vectors that have it as their base type.
After taking a deeper look at ARMConstantIslandPass, I believe this patch won't actually cause any issues when interacting with it.
Jun 12 2020
From @SjoerdMeijer's comment and the links he pointed to, it seems to me that making f16 types legal for all ARM subtargets would be a major undertaking and far from trivial to implement. It's also not clear to me how significant the returns of this effort would be.
My feeling is that we could proceed with the current approach and discuss the possibility of making f16 legal in a separate follow-up effort, as mentioned by @dnsampaio.
Jun 11 2020
Addressing review comment.
Rebasing and simplifying function attributes on test.
Fixing failure on CodeGen/ARM/GlobalISel/arm-unsupported.ll and making clang-format happy.
Jun 10 2020
Clean-ups + fixing failure in CodeGen/ARM/half.ll test.
Jun 9 2020
Moving the clean-up of the Clang-side handling to a separate patch.
Splitting the patch into two parts: one for introducing the half-precision
handling into AArch32's backend and one for removing the existing coercion
of those arguments from Clang.
Jun 8 2020
Re-writing the handling of fp16 arguments, moving their lowering to be performed
in the backend.
Jun 4 2020
This is intentionally not addressing greedy regalloc, I guess.
Rebasing on top of rG66251f7e1de7.
Jun 3 2020
Jun 2 2020
Addressing review comments and extending tests.
Removing unnecessary include and fixing formatting.
May 27 2020
Hi @efriedma and @plotf,
May 26 2020
May 20 2020
May 11 2020
May 7 2020
Addressing review comment.
Addressing review comment.
May 5 2020
Apr 28 2020
Apr 23 2020
Apr 22 2020
Apr 21 2020
Removing unnecessary function.
Fixing missing clang-format messages.
Apr 20 2020
Apr 8 2020
Apr 6 2020
From the AAPCS64's Parameter Passing Rules section (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules), I believe the proposed handling is correct. The HFA-related rules described in this section are:
Stage B – Pre-padding and extension of arguments
[...]
B.2 If the argument type is an HFA or an HVA, then the argument is used unmodified.
[...]
Stage C – Assignment of arguments to registers and stack
[...]
C.2 If the argument is an HFA or an HVA and there are sufficient unallocated SIMD and Floating-point registers (NSRN + number of members <= 8), then the argument is allocated to SIMD and Floating-point Registers (with one register per member of the HFA or HVA). The NSRN is incremented by the number of registers used. The argument has now been allocated.
C.3 If the argument is an HFA or an HVA then the NSRN is set to 8 and the size of the argument is rounded up to the nearest multiple of 8 bytes.
C.4 If the argument is an HFA, an HVA, a Quad-precision Floating-point or Short Vector Type then the NSAA is rounded up to the larger of 8 or the Natural Alignment of the argument’s type.
[...]
As per rule C.4, the argument should be allocated at the stack address rounded up to the larger of 8 and its Natural Alignment, which is 32 according to the Composite Types rules in section 5.6 of that same document (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#composite-types):
5.6 Composite Types [...] - The natural alignment of a composite type is the maximum of each of the member alignments of the 'top-level' members of the composite type i.e. before any alignment adjustment of the entire composite is applied
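As a rough sketch of how I read rules C.2 to C.4 together with the natural alignment rule from 5.6, the following models the allocation of a single HFA/HVA argument. ArgState, HfaArg, Placement and allocateHfa are hypothetical names made up for this illustration and do not correspond to Clang or LLVM code. NSAA is modelled as a byte offset rather than an address, and the final step (placing the argument at the adjusted NSAA and advancing it by the rounded size) is my assumption about the rule that follows C.4, which is not quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the allocator state used by stage C.
struct ArgState {
  unsigned NSRN = 0; // Next SIMD and Floating-point Register Number
  uint64_t NSAA = 0; // Next Stacked Argument Address, as a byte offset
};

// Hypothetical description of an HFA or HVA argument.
struct HfaArg {
  unsigned NumMembers;                // number of members in the HFA/HVA
  uint64_t SizeInBytes;               // size of the whole composite
  std::vector<uint64_t> MemberAligns; // alignment of each top-level member
};

struct Placement {
  bool InRegisters;
  unsigned FirstReg;    // meaningful only when InRegisters is true
  uint64_t StackOffset; // meaningful only when InRegisters is false
};

// Round Value up to a multiple of Align (Align must be a power of two).
static uint64_t alignTo(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// 5.6: the natural alignment of a composite is the maximum of the
// alignments of its top-level members.
uint64_t naturalAlignment(const HfaArg &Arg) {
  assert(!Arg.MemberAligns.empty() && "composite needs at least one member");
  return *std::max_element(Arg.MemberAligns.begin(), Arg.MemberAligns.end());
}

Placement allocateHfa(ArgState &S, const HfaArg &Arg) {
  // C.2: enough free SIMD/FP registers, one register per member.
  if (S.NSRN + Arg.NumMembers <= 8) {
    Placement P{true, S.NSRN, 0};
    S.NSRN += Arg.NumMembers;
    return P;
  }
  // C.3: no further SIMD/FP registers are used; round the size up to 8 bytes.
  S.NSRN = 8;
  uint64_t Size = alignTo(Arg.SizeInBytes, 8);
  // C.4: round NSAA up to the larger of 8 and the natural alignment.
  S.NSAA = alignTo(S.NSAA, std::max<uint64_t>(8, naturalAlignment(Arg)));
  // Assumption (rule not quoted above): the argument goes at the adjusted
  // NSAA, which then advances by the rounded size.
  Placement P{false, 0, S.NSAA};
  S.NSAA += Size;
  return P;
}
```

With made-up numbers: starting from NSRN = 6 and NSAA = 8, a four-member HFA of 128 bytes whose members are 32-byte aligned does not fit in registers under C.2, so NSRN becomes 8 and the argument lands at stack offset 32 rather than 8.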
With regard to compatibility with other compilers, I'm not sure that following what seems to be non-compliant behavior would be the best way to proceed. @rnk and @ostannard, what would be your take on this?