This is an archive of the discontinued LLVM Phabricator instance.

Add builtin for llvm set rounding
ClosedPublic

Authored by xiongji90 on Feb 21 2023, 12:44 AM.

Details

Summary

llvm.get.rounding and llvm.set.rounding have been added to llvm and for llvm.get.rounding, a corresponding builtin "__builtin_flt_rounds" has been added but corresponding builtin for llvm.set.rounding has not been added. This patch aims to add builtin for set rounding behavior.

Diff Detail

Event Timeline

xiongji90 created this revision.Feb 21 2023, 12:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 21 2023, 12:44 AM
xiongji90 requested review of this revision.Feb 21 2023, 12:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 21 2023, 12:44 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
andrew.w.kaylor accepted this revision.Feb 21 2023, 10:15 AM

This looks good to me. I'm adding @rjmccall as a reviewer to make sure someone with front end expertise sees this.

There seems to be a potential type mismatch between the intrinsic and the builtin in the case of __builtin_flt_rounds(). Based on D24461, I wonder if that's a target-specific difference. I'm not sure if a similar issue exists in this case since we aren't copying a gcc builtin.

This revision is now accepted and ready to land.Feb 21 2023, 10:15 AM

New builtins should be documented in the user manual.

There are standard pragmas for changing the rounding mode, right? What's the interaction between the ability to set this dynamically with a builtin call and those pragmas?

clang/test/CodeGen/builtins.c
283

Just for completeness, please test that this gets the result of the expression.

New builtins should be documented in the user manual.

There are standard pragmas for changing the rounding mode, right? What's the interaction between the ability to set this dynamically with a builtin call and those pragmas?

Hi, @rjmccall
Current status is LLVM has added llvm.get.rounding and llvm.set.rounding intrinsic and has also added "builtin_flt_rounds" for llvm.get.rounding but there is no corresponding for llvm.set.rounding intrinisc. According to original patch to add llvm.set.rounding: https://reviews.llvm.org/D74729 and LLVM document for this intrinsic: https://llvm.org/docs/LangRef.html#id1506
The introduction of llvm.set.rounding intrinsic can provide same functionality as C library function "fesetround" and avoid unnecessary dependency on C library. Currently, this intrinsic can't benefit C/C++ developers since we don't have a corresponding builtin and we even can't add a C/C++ test for this intrinsic.
There may be more use scenario other then normal C/C++ program, for example, in heterogeneous computing, some other device such as GPU may support rounding mode setting in device code, users can invoke
builtin_flt_rounds_set in their source code to use this functionality but they can't use C library functions since it unavailable in GPU or other device.

Thanks very much for the review!

Update test for llvm.set.rounding

New builtins should be documented in the user manual.

There are standard pragmas for changing the rounding mode, right? What's the interaction between the ability to set this dynamically with a builtin call and those pragmas?

Hi, @rjmccall
I didn't find appropriate place to document this new added builtin in https://github.com/llvm/llvm-project/blob/main/clang/docs/UsersManual.rst and didn't find a specific doc for builtins, so could you provide advice on where should document builtins newly added?
Thanks very much.

clang/test/CodeGen/builtins.c
283

Hi, @rjmccall
I updated the test to check "llvm.set.rounding" string with input, does it meet your requirement?
Thanks very much.

New builtins should be documented in the user manual.

There are standard pragmas for changing the rounding mode, right? What's the interaction between the ability to set this dynamically with a builtin call and those pragmas?

Hi, @rjmccall
Current status is LLVM has added llvm.get.rounding and llvm.set.rounding intrinsic and has also added "__builtin_flt_rounds" for llvm.get.rounding but there is no corresponding for llvm.set.rounding intrinisc. According to original patch to add llvm.set.rounding: https://reviews.llvm.org/D74729 and LLVM document for this intrinsic: https://llvm.org/docs/LangRef.html#id1506
The introduction of llvm.set.rounding intrinsic can provide same functionality as C library function "fesetround" and avoid unnecessary dependency on C library. Currently, this intrinsic can't benefit C/C++ developers since we don't have a corresponding builtin and we even can't add a C/C++ test for this intrinsic.

If this builtin is defined to have the same behavior as fesetround, the obvious name would be __builtin_fesetround. You seem to have patterned it off of __builtin_flt_rounds, but that is not an arbitrarily-chosen name, it's taken from the standard FLT_ROUNDS macro and is expected to work as an implementation of it.

We already understand how fesetround is supposed to interact with the standard pragmas about the FP environment, so that indirectly answers that part of my question. I assume you've modeled your intrinsic correctly in LLVM so that the optimizer understands it can't reorder the environment-sensitive math intrinsics around calls to the new intrinsic?

If this builtin is just an intrinsic implementation of fesetround, I don't think we need further documentation of it.

I assume the intrinsic is only implemented on certain targets? If so, we might need to restrict uses of the builtin as well (and make sure that __has_builtin only returns true on those targets).

clang/test/CodeGen/builtins.c
283

Yes, thank you.

xiongji90 updated this revision to Diff 501732.Mar 1 2023, 6:55 PM

Change the builtin name to __builtin_fesetround

New builtins should be documented in the user manual.

There are standard pragmas for changing the rounding mode, right? What's the interaction between the ability to set this dynamically with a builtin call and those pragmas?

Hi, @rjmccall
Current status is LLVM has added llvm.get.rounding and llvm.set.rounding intrinsic and has also added "__builtin_flt_rounds" for llvm.get.rounding but there is no corresponding for llvm.set.rounding intrinisc. According to original patch to add llvm.set.rounding: https://reviews.llvm.org/D74729 and LLVM document for this intrinsic: https://llvm.org/docs/LangRef.html#id1506
The introduction of llvm.set.rounding intrinsic can provide same functionality as C library function "fesetround" and avoid unnecessary dependency on C library. Currently, this intrinsic can't benefit C/C++ developers since we don't have a corresponding builtin and we even can't add a C/C++ test for this intrinsic.

If this builtin is defined to have the same behavior as fesetround, the obvious name would be __builtin_fesetround. You seem to have patterned it off of __builtin_flt_rounds, but that is not an arbitrarily-chosen name, it's taken from the standard FLT_ROUNDS macro and is expected to work as an implementation of it.

We already understand how fesetround is supposed to interact with the standard pragmas about the FP environment, so that indirectly answers that part of my question. I assume you've modeled your intrinsic correctly in LLVM so that the optimizer understands it can't reorder the environment-sensitive math intrinsics around calls to the new intrinsic?

If this builtin is just an intrinsic implementation of fesetround, I don't think we need further documentation of it.

I assume the intrinsic is only implemented on certain targets? If so, we might need to restrict uses of the builtin as well (and make sure that __has_builtin only returns true on those targets).

Hi, @rjmccall
"llvm.set.rounding" intrinsic will change default floating-point environment, so as same as explicit C library call to "fesetround", this intrinsic should work with "pragma std fenv_access on", then "strictfp" attr will be added and optimizer will honor it.
"llvm.set.rounding" shouldn't be a target-sepcific intrinsic and target backend should be responsible for supporting it. I didn't my local tests on x86/x86_64 hardware and I also searched tests related to "llvm.set.rounding", arm, aarch64, risc-v all include such tests, so this builtin should work for these targets too.
Thanks very much.

C standard function fesetround accepts rounding mode in the form of target-specific values like FE_TONEAREST. They usually represent corresponding bits in control register, so are different for different targets. llvm.set_rounding in contrast accepts rounding mode as target-independent value, the encoding is same as used in FLT_ROUNDS.

The difference between fesetround and llvm.set_rounding is same as for fegetround and FLT_ROUNDS. They differ in how rounding mode is specified, - as target-specific or as target-independent values respectively.

The intrinsic llvm.set_rounding was introduced to facilitate work on IR, because IR is (mainly) target-agnostic and defines like FE_TONEAREST are not available there. It would be nice to have a C builtin counterpart for FLT_ROUNDS, I don't know why the standard does not define it.

Implementation of fesetround with set_rounding would be very useful, but it requires translation of target-specific values for rounding mode into universal values and also checking availability of FPU, as fesetround returns value if rounding mode cannot be set. Both these checks are likely to require separate builtins.

C standard function fesetround accepts rounding mode in the form of target-specific values like FE_TONEAREST. They usually represent corresponding bits in control register, so are different for different targets. llvm.set_rounding in contrast accepts rounding mode as target-independent value, the encoding is same as used in FLT_ROUNDS.

The difference between fesetround and llvm.set_rounding is same as for fegetround and FLT_ROUNDS. They differ in how rounding mode is specified, - as target-specific or as target-independent values respectively.

The intrinsic llvm.set_rounding was introduced to facilitate work on IR, because IR is (mainly) target-agnostic and defines like FE_TONEAREST are not available there. It would be nice to have a C builtin counterpart for FLT_ROUNDS, I don't know why the standard does not define it.

Implementation of fesetround with set_rounding would be very useful, but it requires translation of target-specific values for rounding mode into universal values and also checking availability of FPU, as fesetround returns value if rounding mode cannot be set. Both these checks are likely to require separate builtins.

Hi, @sepavloff
This patch only aims to add a builtin for llvm.set.rounding, llvm.set.rounding can benefit C/C++ developers then. Since llvm.set.rounding is target-independent, __builtin_fesetround should be target-independent as well and the then arguments for this builtin should align with what documented here: https://llvm.org/docs/LangRef.html#id1502 for all targets. If we want to map fesetround to llvm.set.rounding in the future, we need to take care of 2 points:

  1. translate fesetround arugment to llvm.set.rounding arugment.
  2. properly handle the return value.

Did I correctly understand your points?
Thanks very much.

I see. If we're going to take the target-independent values specified by FLT_ROUNDS, then the original builtin name is more appropriate. Of course, this has the disadvantage of not allowing target-specific values that might exist beyond those specified in the standard; are we pretty certain that's not a problem in practice?

Working on x86, ARM, and AArch64 is great, but I'm a little worried about adding another builtin that works only on major targets and probably crashes on others. I suppose we've got some number of those already, though.

I see. If we're going to take the target-independent values specified by FLT_ROUNDS, then the original builtin name is more appropriate. Of course, this has the disadvantage of not allowing target-specific values that might exist beyond those specified in the standard; are we pretty certain that's not a problem in practice?

Working on x86, ARM, and AArch64 is great, but I'm a little worried about adding another builtin that works only on major targets and probably crashes on others. I suppose we've got some number of those already, though.

Hi, @rjmccall @sepavloff
Which name do you prefer for this builtin? builtin_flt_rounds_set or builtin_set_flt_rounds?
Does LLVM have mechanism to restrict builtin to work only on specific targets? If not, can we check add target check code to guard just like: https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGBuiltin.cpp#LL4685C8-L4685C8
And although the builtin should work automatically for arm and aarch64, I didn't have environment to verify, can I enable it only for x86 currently?
Thanks very much.

__builtin_set_flt_rounds seems better.

We should add the portability checking, yeah. Look at the checking we do for certain builtins with CheckBuiltinTargetInSupported in SemaChecking.cpp.

I see. If we're going to take the target-independent values specified by FLT_ROUNDS, then the original builtin name is more appropriate. Of course, this has the disadvantage of not allowing target-specific values that might exist beyond those specified in the standard; are we pretty certain that's not a problem in practice?

Working on x86, ARM, and AArch64 is great, but I'm a little worried about adding another builtin that works only on major targets and probably crashes on others. I suppose we've got some number of those already, though.

Using fesetround as compiler builtin requires implementation of constrained intrinsic. It is large work anyway. Adding support for non-standard rounding modes is small relative to it.

Okay, so if we're adding this builtin, and it's not just imitating a stdlib function, we should definitely document it as a language extension. There's a section in the manual about controlling FP modes which is probably an appropriate place for this.

I assume we need to document that the builtin has undefined behavior if it receives a value outside of the standard-defined set. (We can't guarantee that it's a no-op or else we'll never be able to extend the standard-defined set.)

xiongji90 updated this revision to Diff 502507.Mar 5 2023, 7:55 PM

Update the builtin name to __builtin_set_flt_rounds and restrict it to be used only on x86, x86_64, arm, aarch64 target.

Hi, @rjmccall and @sepavloff
I updated built name to "builtin_set_flt_rounds" as suggested and restrict the built to be used on x86, arm, aarch64 target. I did my local test on x86 and x86_64 and found arm and aarch64 code gen have already supported "llvm.set.rounding" :https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/ARM/ARMISelLowering.cpp#L6398 and https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L4528
So enabling the builtin on these targets should be OK. To document, is it ok to create another patch to add doc for both "
builtin_flt_rounds" and "__builtin_set_flt_rounds"?
Thanks very much.

rjmccall accepted this revision.Mar 6 2023, 1:01 AM

I have no objection to doing the documentation in a separate patch. LGTM.

Hi, @sepavloff
Do you have any concern about this patch?
Thanks very much.

sepavloff accepted this revision.Mar 6 2023, 8:28 PM

LGTM.

Thanks!

This revision was automatically updated to reflect the committed changes.