This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Disable __muloti4 libcalls for AArch64
Needs RevisionPublic

Authored by sebastianpoeplau on Jan 16 2023, 3:02 AM.

Details

Summary

The library function only exists in compiler-rt, not in libgcc.

This is consistent with similar changes for x86-64 (D109385),
x86 (D108928), ARM (D108842), MIPS (D108844), PPC (D108936), and
RISC-V (D108939).

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 3:02 AM
sebastianpoeplau requested review of this revision.Jan 16 2023, 3:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 3:02 AM
sebastianpoeplau edited the summary of this revision. (Show Details)

Removed internal ticket identifiers from the commit message

The build failure on Debian is caused by a clangd unit test that seems unrelated to AArch64 codegen; I'm assuming for now that the failure isn't caused by this change.

sdesmalen accepted this revision.Jan 23 2023, 5:49 AM

From reading the thread on https://github.com/llvm/llvm-project/issues/16778 (which is closed, although the issue doesn't seem resolved), this is a stop-gap for toolchains that don't include an implementation for __muloti4. An implementation seems to exist in compiler-rt, but compiler-rt isn't linked with by default. It seems unfortunate that there is no way to still emit a call to __muloti4 when we do link with compiler-rt, but from reading that thread there is currently no way to tell LLVM that an implementation of __muloti4 is available.

So I'm happy to accept this change, but please wait for a few days to land it to see if @rengolin or @nickdesaulniers has any comments.

This revision is now accepted and ready to land.Jan 23 2023, 5:49 AM

From reading the thread on https://github.com/llvm/llvm-project/issues/16778 (which is closed, although the issue doesn't seem resolved), this is a stop-gap for toolchains that don't include an implementation for __muloti4. An implementation seems to exist in compiler-rt, but compiler-rt isn't linked with by default. It seems unfortunate that there is no way to still emit a call to __muloti4 when we do link with compiler-rt, but from reading that thread there is currently no way to tell LLVM that an implementation of __muloti4 is available.

That's my understanding as well.

So I'm happy to accept this change, but please wait for a few days to land it to see if @rengolin or @nickdesaulniers has any comments.

Will do. Thanks for your review!

From reading the thread on https://github.com/llvm/llvm-project/issues/16778 (which is closed, although the issue doesn't seem resolved), this is a stop-gap for toolchains that don't include an implementation for __muloti4. An implementation seems to exist in compiler-rt, but compiler-rt isn't linked with by default. It seems unfortunate that there is no way to still emit a call to __muloti4 when we do link with compiler-rt, but from reading that thread there is currently no way to tell LLVM that an implementation of __muloti4 is available.

That's my understanding as well.

Every few years I come to this issue via bugs or reviews. I am shocked that this is still an issue today...

This isn't by far a good solution, but it will work around the "original" problem, just not the "actual" problem.

I don't mind this going in as is, but I'll let @nickdesaulniers approve to be sure I'm not missing anything.

@echristo FYI

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
607

please, add a comment:

// TODO: Link compiler-rt's builtins by default in Clang once and for all
// See: https://github.com/llvm/llvm-project/issues/16778

The right move here is to check about muloti, not disable it unless the
aarch64 backend doesn't need it.

The right move here is to check about muloti, not disable it unless the
aarch64 backend doesn't need it.

You mean __has_builtin?

The problem is that LLVM itself lowers the call for generic code, because it thinks there is support (since the target reports there is), so users can't do much to avoid it.

This "fix" is just telling LLVM there isn't such call, and so it lowers to standard arithmetic. Checking clang is linking RT is not fool-proof either (multi-stage compilation).

My view is that it's wrong to say we have that builtin when it's possible not to have, so I'd err in the side of caution and only re-enable it when we make sure we know when it's safe to lower.

In the end, the generated code is probably a bit worse than the implemented one, but still works, so it won't break anything.

Added requested comment

sebastianpoeplau marked an inline comment as done.Jan 24 2023, 1:18 AM

I would hypothesize that the number of platforms where compiler-rt is used to link aarch64 code is greater than the number that don't and there's work to make linking with compiler-rt (https://github.com/llvm/llvm-project/tree/main/llvm-libgcc) in order to replace uses on other platforms. In addition, while my employment meant I couldn't add this to libgcc in the past there's no reason why it couldn't be added as well. Has anyone tried?

cjdb added a comment.Jan 25 2023, 11:54 AM

I would hypothesize that the number of platforms where compiler-rt is used to link aarch64 code is greater than the number that don't and there's work to make linking with compiler-rt (https://github.com/llvm/llvm-project/tree/main/llvm-libgcc) in order to replace uses on other platforms. In addition, while my employment meant I couldn't add this to libgcc in the past there's no reason why it couldn't be added as well. Has anyone tried?

+1 to trying to add these to libgcc instead of removing from compiler-rt. Both ChromeOS and FreeBSD use llvm-libgcc to make compiler-rt appear as if it is libgcc (but it's still compiler-rt), and this may have unintended consequences. I suspect that there are more systems that do this too.

+1 to trying to add these to libgcc instead of removing from compiler-rt.

To be clear, no one is suggesting to remove this from compiler-rt, just not enable by default for LLVM lowering on AArch64. Very different things.

There are some Linux distros and FreeBSD who do the same thing, and they'd also be affected. But since this is a build issue, we could work around it by some build flag or triple check that enables it.

there's no reason why it couldn't be added as well. Has anyone tried?

Honestly, I don't know, and am still a little surprised that it's still not there yet...

But the main problem back then, and is still relevant now, is that adding it to libgcc won't fix for the previous versions, which will be available for years on distros, builds, etc.

I think this is an orthogonal issue. Sure, it would be nice to have that in libgcc, and it would eventually fix the problem, but it wouldn't make a difference today.

I'm also surprised that this is still the only (?) case where we have that kind of a problem. I'd expect it to be more widespread...

rengolin requested changes to this revision.Jan 25 2023, 1:48 PM
rengolin added reviewers: bero, theraven, hans.

Just to make sure there's no panic, I'm blocking the revision until more people looked over. I'll remove once we have a number of approvals from people in different positions.

I'm also adding Bero (Mandriva's maintainer), David (our FreeBSD representative) and Hans (Chromium) to check over their side.

Feel free to add more people.

This revision now requires changes to proceed.Jan 25 2023, 1:48 PM

I think there just isn't any good way forward that imposes demands on the "system" libgcc/compiler-rt. Adding or modifying APIs to libgcc/compiler-rt simply isn't practical given that programs are linked using implementations distributed independent from the compiler. Even if we convince the gcc developers to add __muloti4, it'll be years before it actually ships in Linux distributions. There are bunch of APIs we'd want if it were straightforward to add new ones; not just mul-with-overflow, but also saturating float-to-int, clz with zero defined, memory functions optimized for small sizes, outlined prologue/epilogue sequences, existing functions with optimized ABI, math functions that don't mess with errno, etc.

Long-term, I think the only practical way forward is to teach LLVM to embed the definitions of helper functions such as "mul-with-overflow" as weak symbols into the files that use them, so we can just assume they're always available.

I don't have a strong opinion on whether the current patch is the right way forward short-term. Maybe restrict the change to glibc/musl targets, though, since we generally use compiler-rt on other aarch64 targets (and it's included __muloti4 for ages).

At least for ChromeOS, as long as this change is limited to clang not making calls to it and function is not removed from compiler-rt, it works for us . We ship a libgcc_s.so using compiler-rt sources so we need so compatibility there so that older binaries keep working.

This seems like a regression for FreeBSD. This function was added to compiler-rt before we officially supported AArch64 and so I'm pretty sure it's shipped in every AArch64 version of FreeBSD, even Tier 2 releases.

Ok, created a proposal to split builtins: https://discourse.llvm.org/t/proposal-split-built-ins-from-the-rest-of-compiler-rt/67978

If there's a better option, it hopefully will be reached there.

Wanted to say thank you for your work on that proposal - I think it sets
the right tone and consensus is looking nice :)

Thanks!

-eric