This is an archive of the discontinued LLVM Phabricator instance.

[ARM][AArch64] Dont use macros for half instrinsics in NeonEmitter
ClosedPublic

Authored by dmgreen on Aug 9 2022, 9:00 AM.

Details

Summary

We don't require arm_neon.h fp16 intrinsics to be treated as macros any more.

Diff Detail

Event Timeline

dmgreen created this revision.Aug 9 2022, 9:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 9 2022, 9:00 AM
dmgreen requested review of this revision.Aug 9 2022, 9:00 AM

It's been a few years since I looked at this, so am writing this from memory...

But I have never understood the use of -fallow-half-arguments-and-return, and made this remark several times I think on Sander's SVE patches. That option, I think, is from an era before (full) FP16, used for OpenCL types if I remember well, and storage-only FP16. For AArch64, it should not be necessary at all because everything is lovely in AArch64 world: no soft FP and fp16 is always a legal type. For AArch32 that's a whole different story. There we have the different ABIs, and we have a few hacks how values are returned/passed. But again, that depends on the ABI, not on this option. This is the reason why I have never understood the usage of it.

And this comment is true, so my question is why it is okay to remove it?

// It is not permitted to pass or return an __fp16 by value, so intrinsics
// taking a scalar float16_t must be implemented as macros.

So I don't understand what problem this patch is solving, and more general, I wanted to see the usage of -fallow-half-arguments-and-return disappear, not be introduced more. But that's with the knowledge I had at the time, perhaps things have changed.

Yeah I don't really understand it either. But this code seems to always add -fallow-half-arguments-and-returns from the driver:
https://github.com/llvm/llvm-project/blob/5331e1229aa6d0d33b5ec9fab7c05310187746d9/clang/lib/Driver/ToolChains/Clang.cpp#L1787
And it is present in the flags for both of these compiles:
https://godbolt.org/z/fG64nYrT8
So I believe for Arm and AArch64 it should always be present through clang. The "additions" in this patch are really just adding the option to -cc1 test lines where it should always have been present.

This patch is just a cleanup. It's almost an NFC, except that the arm_neon.h and arm_fp.h change slightly - some functions that were macros are now proper functions.

Matt added a subscriber: Matt.Aug 10 2022, 4:38 PM
dmgreen updated this revision to Diff 460177.Sep 14 2022, 12:09 PM
dmgreen edited the summary of this revision. (Show Details)

The option is now being removed in D133885.

dmgreen updated this revision to Diff 464665.Oct 3 2022, 6:40 AM

Rebase and ping. I think this is a pretty simple NFCish commit now.

SjoerdMeijer accepted this revision.Oct 3 2022, 6:50 AM

Agreed, LGTM.

This revision is now accepted and ready to land.Oct 3 2022, 6:50 AM
This revision was landed with ongoing or failed builds.Oct 3 2022, 7:27 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2022, 7:27 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript