Download Raw Diff

Details

Reviewers

pratlucas
rjmccall
efriedma
ostannard
SjoerdMeijer

Commits

rG7e169cec74b0: [NFC][test] Adding fastcc test case for promoted 16-bit integer bitcasts.

Summary

I had noticed an assert in the ARM backend on builds of the Swift stdlib. I've since reduced the offending IR and reported my findings to the folks on D75169. I suspect that the changes in D75169 were meant mainly for the AAPCS calling conventions and indeed they do work fine with them, but with fastcc and swiftcc I noticed there was a crash.

I also noticed that the line that checks the calling convention in D75169 is only checking to see that there is a calling convention, not specifically checking that it is a supported calling convention. So this patch does just that, and includes a regression test too.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

plotfi created this revision.Jun 24 2020, 2:12 AM

Herald added a project: Restricted Project. · View Herald TranscriptJun 24 2020, 2:12 AM

Herald added subscribers: llvm-commits, danielkiss, hiraditya, kristof.beyls. · View Herald Transcript

plotfi mentioned this in D75169: [ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend.Jun 24 2020, 2:14 AM

Hi @plotfi,

I believe simply limiting the calling conventions might be a bit tricky. At this point the method might get a calling convention id that does not reflect the one required for argument lowering (see ARMTargetLowering::getEffectiveCallingConv in ARMISelLowering.cpp).
One option to get around this would be to use that same method to determine the effective calling convention, but it might not be easy to check whether or not the function beeing handled is variadic.

Also, I did not expect to see a difference in the resulting parts' types between getCopyToParts and ARMTargetLowering::splitValueIntoRegisterParts, only in how the final type is achieved. I'm investigating this further to see what's missing.

Taking a further look to the differences in the DAG, the resulting type is actually the same.
Both result in an f32 value, but get to it in different ways as expected.

With ARMTargetLowering::splitValueIntoRegisterParts:

t22: f32 = bitcast t21
  t21: i32 = any_extend t20
    t20: i16 = bitcast t5
      t5: f16 = extract_vector_elt t3, Constant:i32<0>
        t3: v8f16,v8f16 = merge_values t2, t2
          t2: v8f16 = BUILD_VECTOR ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f
16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>

Without ARMTargetLowering::splitValueIntoRegisterParts:

t20: f32 = fp_extend t5
  t5: f16 = extract_vector_elt t3, Constant:i32<0>
    t3: v8f16,v8f16 = merge_values t2, t2
      t2: v8f16 = BUILD_VECTOR ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<A
PFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>

I'm taking a deeper look into the assertion failure stack to figure out what's causing it.

Harbormaster failed remote builds in B61515: Diff 272940!Jun 24 2020, 3:44 AM

It seems most of the argument lowering convertions get simplifyed on the regular case, when not using fastcc.
When using it, though, type legalization ends up trying to handle a i16 = bitcast ConstantFP:f16<APFloat(0)> node, turning it into an i32 <- f32 operation and getting lost while trying to create the new constant.

Avoiding the splitting should do the trick for fastcc, as it has no need to conform with AAPCS. I'm not sure about swiftcc, though, as it is expected to comply with AAPCS-VFP on ios platforms according to LLVM's language reference manual.

llvm/lib/Target/ARM/ARMISelLowering.cpp
4159 ↗	(On Diff #272940)	`CallingConv::ARM_AAPCS_VFP` and `CallingConv::C` would also need to be included here.

pratlucas added reviewers: efriedma, ostannard, SjoerdMeijer.Jun 24 2020, 9:38 AM

In D82443#2111882, @pratlucas wrote:

It seems most of the argument lowering convertions get simplifyed on the regular case, when not using fastcc.
When using it, though, type legalization ends up trying to handle a i16 = bitcast ConstantFP:f16<APFloat(0)> node, turning it into an i32 <- f32 operation and getting lost while trying to create the new constant.

Avoiding the splitting should do the trick for fastcc, as it has no need to conform with AAPCS. I'm not sure about swiftcc, though, as it is expected to comply with AAPCS-VFP on ios platforms according to LLVM's language reference manual.

llvm/lib/Target/ARM/ARMISelLowering.cpp
4159 ↗	(On Diff #272940)	I believe I tried ARM_AAPCS_VFP and it asserted. Lemme confirm.

In D82443#2111882, @pratlucas wrote:

It seems most of the argument lowering convertions get simplifyed on the regular case, when not using fastcc.
When using it, though, type legalization ends up trying to handle a i16 = bitcast ConstantFP:f16<APFloat(0)> node, turning it into an i32 <- f32 operation and getting lost while trying to create the new constant.

Avoiding the splitting should do the trick for fastcc, as it has no need to conform with AAPCS. I'm not sure about swiftcc, though, as it is expected to comply with AAPCS-VFP on ios platforms according to LLVM's language reference manual.

I gave the following a try, and it also asserts:

target datalayout = "e-m:o-p:32:32-Fi8-f64:32:64-v64:32:64-v128:32:128-a:0:32-n32-S32"
target triple = "thumbv7s-apple-ios7.0.0"

define arm_aapcs_vfpcc { <8 x half>, <8 x half> } @f() {
  ret { <8 x half>, <8 x half> } zeroinitializer
}

I've posted D82552 fixing the assertion failure on the type legalization's bitcast handling. Once it lands, it should be ok to include arm_aapcs_fvpcc and swiftcc on this patch.

Updated diff to only handle fastcc

@pratlucas I've updated this diff to only exclude fastcc. How does this look to you? Can I land this for now, as it does not interfere with your D82552 diff?

Harbormaster failed remote builds in B62227: Diff 274251!Jun 29 2020, 3:18 PM

@plotfi This should not interfere D82552.
I believe you can land this, LGTM.

This revision is now accepted and ready to land.Jul 2 2020, 8:36 AM