This is an archive of the discontinued LLVM Phabricator instance.

[RFC][IR] Correct lowering of `f128` intrinsics
Needs ReviewPublic

Authored by tmgross on Aug 14 2023, 12:08 AM.

Details

Summary

[IR] Correct lowering of f128 intrinsics

Change the lowering of 128-bit floating point math function intrinsics
to use binary128 functions (sqrtf128) rather than long double
functions (sqrtl).

Currently intrinsic calls such as @llvm.sqrt.f128 are lowered to
libc's long double functions. On platforms where long double is
not fp128, this results in incorrect math.

define fp128 @test_sqrt(fp128 %a) {
start:
  %0 = tail call fp128 @llvm.sqrt.f128(fp128 %a)
  ret fp128 %0
}

declare fp128 @llvm.sqrt.f128(fp128)

lowers to

test_sqrt:                              # @test_sqrt
        jmp     sqrtl@PLT                       # TAILCALL

On x86 this results in the binary128 argument being treated as 80-bit
extended precision.

This has no effect on clang, which lowers builtins to the libc calls
directly without going through LLVM intrinsics.

Fixes https://github.com/llvm/llvm-project/issues/44744

Diff Detail

Event Timeline

tmgross created this revision.Aug 14 2023, 12:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 14 2023, 12:08 AM
Herald added a subscriber: pengfei. · View Herald Transcript
tmgross retitled this revision from [IR] Add an xpassing test for `f128` intrinsic lowering to [RFC][IR] Correct lowering of `f128` intrinsics.Aug 14 2023, 12:40 AM
tmgross edited the summary of this revision. (Show Details)
tmgross added reviewers: RKSimon, pengfei, ldionne.
tmgross added subscribers: RKSimon, ldionne.
tmgross published this revision for review.Aug 14 2023, 12:43 AM
tmgross added a subscriber: llvm-commits.
Herald added a project: Restricted Project. · View Herald TranscriptAug 14 2023, 12:45 AM

I did not yet complete this change since I am unsure how to handle libc support. Glibc supports f128 math since 2.26 (2017, [1]), but I am not sure how to handle older versions or platforms where the math isn't yet supported. For targets where long double is __float128 this doesn't matter and the cosl-style calls already work properly (should probably lower to f128 instead to be less ambiguous), but for anything else the lowering is broken. What is the best way to handle this?

[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=NEWS;hb=81325b12b14c44887f1633a2c180a413afc2b504#l143

If there's literally no implementation of sinf128 available, I don't think we can do much better than just letting the linker produce a "sinf128 not found" error. I mean, a compiler error might be slightly better, but not by a lot, so I don't think it's worth the complexity to try to produce one. (And this way lets the user provide their own implementation if they want to.)

My primary concern here is the effect on non-x86 targets. For targets where "long double" is f128, lowering "llvm.sin.f128" to "sinl" is fine, and messing with the name could break previously working code. So maybe instead of messing with RuntimeLibcalls.def directly, we should make the backend fix the names (call setLibcallName() in X86ISelLowering.cpp).

If there's literally no implementation of sinf128 available, I don't think we can do much better than just letting the linker produce a "sinf128 not found" error. I mean, a compiler error might be slightly better, but not by a lot, so I don't think it's worth the complexity to try to produce one. (And this way lets the user provide their own implementation if they want to.)

That sounds good, still a lot better than runtime computation errors

My primary concern here is the effect on non-x86 targets. For targets where "long double" is f128, lowering "llvm.sin.f128" to "sinl" is fine, and messing with the name could break previously working code. So maybe instead of messing with RuntimeLibcalls.def directly, we should make the backend fix the names (call setLibcallName() in X86ISelLowering.cpp).

Thank you for the direction - is there a place to do this in a more platform-generic way based on the layout of long double? I think there quite a few targets where long double != f128 especially non-64-bit.

efriedma added a comment.EditedAug 14 2023, 3:28 PM

clang knows the layout of "long double", but we don't explicitly pass it down to LLVM. I don't think it's something we've ever needed to care about; e.g. compiler-rt routines use different names for f80 vs. f128.

Not sure exactly what the design of that would look like off the top of my head; maybe something involving a module flag.

clang knows the layout of "long double", but we don't explicitly pass it down to LLVM. I don't think it's something we've ever needed to care about; e.g. compiler-rt routines use different names for f80 vs. f128.

Not sure exactly what the design of that would look like off the top of my head; maybe something involving a module flag.

It seems like this logic needs to move from clang to LLVM since this affects non-C codegen, correct? This could probably travel with target information, I assume clang can pull this from llvm where needed.

I can probably figure out the work for this but just need a rough outline of what the changes will look like.

clang knows the layout of "long double", but we don't explicitly pass it down to LLVM. I don't think it's something we've ever needed to care about; e.g. compiler-rt routines use different names for f80 vs. f128.

Not sure exactly what the design of that would look like off the top of my head; maybe something involving a module flag.

It seems like this logic needs to move from clang to LLVM since this affects non-C codegen, correct? This could probably travel with target information, I assume clang can pull this from llvm where needed.

I can probably figure out the work for this but just need a rough outline of what the changes will look like.

You may override it in X86ISelLowering.cpp, see the line starting from Setup Windows compiler runtime calls.

There needs to be some communication between clang and the backend if we want to support -mlong-double-128. But probably we want some reasonable default, sure.

Making the backends in question override the libcall name probably works, but we probably want to share the code across backends? That said, the simplest way to accomplish it is to just stick a helper in TargetLowering, and make each backend call it as appropriate.

We could consider something more abstract, but I don't think it's worth it unless we find other places need the information.

Wouldn't -mlong-double-128 remain a frontend-only flag since it doesn't affect the libc calls, instead just tells clang whether to lower to fp128/x86_fp80/double?

In any case, it seem like long double logic is quite mixed into other C-specific logic, probably not worth extracting to LLVM. So it seems a module flag will be needed anyway and can provide this information.

Wouldn't -mlong-double-128 remain a frontend-only flag since it doesn't affect the libc calls, instead just tells clang whether to lower to fp128/x86_fp80/double?

-mlong-double-128 is an ABI flag; among other things, it specifies the format of the operand/result to sinl.

In any case, it seem like long double logic is quite mixed into other C-specific logic, probably not worth extracting to LLVM. So it seems a module flag will be needed anyway and can provide this information.

It's probably worth some effort to ensure that unmodified targets work correctly without the module flag (for the sake of people writing their own frontends).

ldionne resigned from this revision.Sep 8 2023, 5:59 AM

I am not qualified to review this change.