This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Fix Crash in 't'/'w' handling without fp16/bf16
ClosedPublic

Authored by lenary on Feb 10 2023, 1:43 AM.

Details

Summary

After https://reviews.llvm.org/rGff4027d152d0 and
https://reviews.llvm.org/rG7d15212b8c0c we saw crashes in SelectionDAG
when trying to use these constraints when you don't have the fp16 or
bf16 extensions.

However, it is still possible to move 16-bit floating point values into
the right place in S registers with a normal vmov, even if we don't
have fp16 instructions we can use within the inline assembly string.
This patch therefore fixes the crash.

I think the reason we weren't getting this crash before is because I
think the fp16 and bf16 types got an error diagnostic in the Clang
frontend when you didn't have the right architectural extensions to use
them. This restriction was recently relaxed.

The approach for bf16 needs a bit more explanation. Exactly how BF16 is
legalized was changed in rGb769eb02b526e3966847351e15d283514c2ec767 -
effectively, whether you have the right instructions to get a bf16 value
into/out of a S register with MoveTo/FromHPR depends on hasFullFP16, but
whether you use a HPR for a value of type MVT::bf16 depends on hasBF16.
This is why the tests are not changed by +bf16 vs -bf16, but I've
left both sets of RUN lines in case this changes in the future.

Test Changes:

  • Added more testing for testing inline asm (the core part)
  • fp16-promote.ll and pr47454.ll show improvements where unnecessary fp16-fp32 up/down-casts are no longer emitted. This results in fewer libcalls where those casts would be done with a libcall.
  • aes-erratum-fix.ll is fairly noisy, and I need to revisit this test so that the IR is more minimal than it is right now, because most of the changes in this commit do not relate to what AES is actually trying to verify.

Diff Detail

Event Timeline

lenary created this revision.Feb 10 2023, 1:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2023, 1:43 AM
lenary requested review of this revision.Feb 10 2023, 1:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2023, 1:43 AM
olista01 accepted this revision.Feb 10 2023, 7:21 AM

LGTM, with one comment in the test.

llvm/test/CodeGen/ARM/inlineasm-fp-half.ll
14–55

I think we should also have tests with FP16 but not BF16, because there are separate code paths for them.

This revision is now accepted and ready to land.Feb 10 2023, 7:21 AM
lenary added inline comments.Feb 20 2023, 9:30 AM
llvm/test/CodeGen/ARM/inlineasm-fp-half.ll
14–55

So, I looked at doing this, and there is no difference in codegen between +fullfp16,-bf16 and +fullfp16,+bf16. I believe this is because this codepath ends up relying on MoveToHPR and MoveFromHPR, which are only called often if the VT is MVT::f16 or MVT::bf16, but the relevant code path inside the function only checks for Subtarget->hasFullFP16(). I'm not convinced this is incorrect, but I'm not sure. I think I need to do more investigation before I canonically say either way.

This might mean that the register class allocation for bf16 becomes entirely based on whether we have fullfp16, I'm not sure.

lenary updated this revision to Diff 501843.Mar 2 2023, 6:04 AM
lenary retitled this revision from [ARM] Fix Crash in 't'/'w' handling without fp16/fp16 to [ARM] Fix Crash in 't'/'w' handling without fp16/bf16.
lenary edited the summary of this revision. (Show Details)
lenary marked an inline comment as done.Mar 2 2023, 6:05 AM
lenary added inline comments.
llvm/test/CodeGen/ARM/inlineasm-fp-half.ll
14–55

I've added these tests, and I think I got down to the difference in codegen, so I have also updated the description.

olista01 accepted this revision.Mar 2 2023, 6:19 AM

LGTM