This is an archive of the discontinued LLVM Phabricator instance.

[Aarch64] Customer lowering of COPYSIGN to SIMD should check for NEON availability
ClosedPublic

Authored by akshaykumars614 on Jan 4 2022, 12:31 AM.

Details

Summary

For the following test case, clang is crashing for ARM64 architecture
$ cat crash.c
double crash(double a, double b)
{
return __builtin_copysign(a, b);
}

$ clang -O2 -march=armv8-a+nosimd --target=arm64 -S crash.c -o /dev/null
fatal error: error in backend: Cannot select: 0x7fae361bb4e8: v2i64 = AArch64ISD::BIT 0x7fae361bb210, 0x7fae361bb278, 0x7fae361bb480
Fix: PR51806

Diff Detail

Event Timeline

akshaykumars614 requested review of this revision.Jan 4 2022, 12:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2022, 12:31 AM

Sounds OK. AArch64 without NEON is not very well supported - it's presumed to be available on all implementations. There may be other bugs too, but we can fix the places we know about.

Can you add a test?

akshaykumars614 edited the summary of this revision. (Show Details)

Updated the testcase, and formatted the file accordingly.

dmgreen added inline comments.Jan 5 2022, 3:22 AM
llvm/test/CodeGen/AArch64/fcopysign.ll
2–3

Can you make sure there is a "with neon" and a "without neon" run line. That way we test both cases.

The update script doesn't remove old CHECK lines that have been removed fro the RUN lines - those need to be removed manually (or keep the old run lines with the same check-prefix, they should then be updated).

16

This file only seems to contain f128 copysigns. Can we make sure there are f64/f32/f16 tests too?

@dmgreen , I have updated the run lines.
I am feeling difficult in finding example for adding f64/f32/f16 tests in the testcase. It will be helpful if you provide me with the example.

@dmgreen , I have updated the run lines.
I am feeling difficult in finding example for adding f64/f32/f16 tests in the testcase. It will be helpful if you provide me with the example.

They should look something like this: https://godbolt.org/z/qvGs1G53M. With other half/double versions too.

llvm/test/CodeGen/AArch64/fcopysign.ll
2–3

We likely don't need the duplicate first line. I would change these run lines to:

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -march=aarch64-none-eabi | FileCheck %s
; RUN: llc < %s -march=aarch64-none-eabi -mattr -neon | FileCheck -check-prefix=CHECK-NONEON %s

And remove the "target triple = .." line below.

I have updated the testcase and changed the run lines.

dmgreen added inline comments.Jan 11 2022, 12:38 AM
llvm/test/CodeGen/AArch64/fcopysign.ll
8–10

We'll need to change the types for each of these too:

declare double @llvm.copysign.f64(double %a, double %b)
declare float @llvm.copysign.f32(float %a, float %b)
declare half @llvm.copysign.f16(half %a, half %b)

Then change the types below too.

RKSimon resigned from this revision.Jan 11 2022, 1:57 AM

I have updated the types as suggested.

dmgreen added inline comments.Jan 16 2022, 8:13 AM
llvm/test/CodeGen/AArch64/fcopysign.ll
8

These intrinsics in llvm are overloaded - the types they operate on are specified in the name of the intrinsic, and the type they operator on should match the name of the intrinsic (where float=f32, half=f16, etc). There are some details in https://llvm.org/docs/LangRef.html#intrinsic-functions

So it should be float @llvm.copysign.f32(float %a, float %b), double @llvm.copysign.f64(double %a, double %b) and half @llvm.copysign.f16(half %a, half %b) but not any of the other combinations here.

updated the changes suggested

dmgreen accepted this revision.Jan 17 2022, 7:00 AM

Thanks. LGTM

This revision is now accepted and ready to land.Jan 17 2022, 7:00 AM
This revision was landed with ongoing or failed builds.Jan 17 2022, 10:55 AM
This revision was automatically updated to reflect the committed changes.