[FPEnv] Intrinsics for access to FP control modes
Needs Review · Public

Authored by sepavloff on Jun 25 2020, 1:30 AM.

Details

Summary

The change introduces the intrinsics 'get_fpmode', 'set_fpmode' and 'reset_fpmode'.
They manage all of the target's dynamic floating-point control modes collectively.
A control mode is part of the floating-point environment and, once set, affects
the subsequent behavior of floating-point operations. Examples of control
modes are rounding direction, precision, and treatment of denormals.
The intrinsics perform the same operations as the C library functions 'fegetmode'
and 'fesetmode'. By default they are lowered to calls to these library
functions.

Event Timeline

sepavloff created this revision. Jun 25 2020, 1:30 AM
Herald added a project: Restricted Project.
sepavloff updated this revision to Diff 275055. Jul 2 2020, 4:06 AM

Fixed legalization of SET_FPMODE

sepavloff updated this revision to Diff 275373. Jul 3 2020, 5:25 AM

Missed change

sepavloff updated this revision to Diff 281827. Jul 30 2020, 1:37 AM

Rebased patch

sepavloff updated this revision to Diff 285081. Aug 12 2020, 7:30 AM

Rebased patch

sepavloff updated this revision to Diff 289188. Sep 1 2020, 8:49 AM

Rebased patch

sepavloff updated this revision to Diff 289360. Sep 2 2020, 12:15 AM

Get rid of clang-tidy warnings

sepavloff updated this revision to Diff 327751. Mar 3 2021, 5:04 AM

Rebased and simplified a bit.

sepavloff edited the summary of this revision. Mar 3 2021, 5:05 AM
sepavloff edited the summary of this revision.

From the LangRef it isn't obvious whether the following transform is valid:

%z = fadd_strict %x, %y
call @llvm.set.fpmode.i16(i16 %fpenv)
  =>
call @llvm.set.fpmode.i16(i16 %fpenv)
%z = fadd_strict %x, %y
craig.topper added inline comments.
llvm/test/CodeGen/Generic/fpenv.ll:36

Is this missing the instructions that copy %fpenv into the stack temporary?

Is there any guarantee that femode_t will be the same layout for a given target in different C library implementations?

sepavloff updated this revision to Diff 328037. Mar 3 2021, 11:28 PM
sepavloff edited the summary of this revision.

Extended documentation, fixed chain treatment.

From the LangRef it isn't obvious whether the following transform is valid:

%z = fadd_strict %x, %y
call @llvm.set.fpmode.i16(i16 %fpenv)
  =>
call @llvm.set.fpmode.i16(i16 %fpenv)
%z = fadd_strict %x, %y

A short note on ordering has been added to the paragraph "Floating Point Environment Manipulation intrinsics".
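
To make the ordering concern concrete, here is a toy model, not LLVM semantics: one global "mode" and one operation whose result depends on it. Every name in it (toy_mode, toy_set_mode, toy_div, op_then_set, set_then_op) is hypothetical. It only shows that if an operation reads the mode that a call writes, swapping the two can change the result.

```c
/* Toy model only: a single global mode and one operation that reads it. */
enum toy_mode { TOY_ROUND_DOWN, TOY_ROUND_UP };

static enum toy_mode current_mode = TOY_ROUND_DOWN;

void toy_set_mode(enum toy_mode m) { current_mode = m; }

/* Division of non-negative a by positive b; the rounding of an inexact
   quotient follows the current mode. */
int toy_div(int a, int b) {
    int q = a / b;
    if (current_mode == TOY_ROUND_UP && a % b != 0)
        ++q;
    return q;
}

/* Original order: operate first, then change the mode. */
int op_then_set(int a, int b, enum toy_mode m) {
    int z = toy_div(a, b);
    toy_set_mode(m);
    return z;
}

/* Swapped order: change the mode first, then operate. */
int set_then_op(int a, int b, enum toy_mode m) {
    toy_set_mode(m);
    return toy_div(a, b);
}
```

Starting from TOY_ROUND_DOWN, op_then_set(7, 2, TOY_ROUND_UP) computes the quotient under the old mode while set_then_op(7, 2, TOY_ROUND_UP) computes it under the new one, so the two orders disagree whenever the quotient is inexact.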

Is there any guarantee that femode_t will be the same layout for a given target in different C library implementations?

Strictly speaking, there is no such guarantee. However, the obvious implementation of femode_t is the type used to store the content of the FP control register. Most of the 16 targets supported by glibc use unsigned int as femode_t. The exceptions are alpha, ia64, sparc (unsigned long) and powerpc (double). In these cases femode_t is identical to fenv_t.

llvm/test/CodeGen/Generic/fpenv.ll:36

Indeed, because an incorrect chain argument was supplied to the library function call, the store to the stack disappeared.

Thank you for the catch!

Isn't X86 using this struct which is 8 bytes?

typedef struct
  {
    unsigned short int __control_word;
    unsigned short int __glibc_reserved;
    unsigned int __mxcsr;
  }
femode_t;
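
A quick way to check the size claim is sketched below. The type x86_modes_t is a hypothetical mirror of the struct quoted above, renamed so it cannot clash with <fenv.h>, and the two conversion helpers are illustrative only. They assume the modes would travel as a single integer value, in which case an 8-byte block needs a 64-bit integer rather than a 16-bit or 32-bit one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the quoted glibc x86 layout. */
typedef struct {
    unsigned short int control_word;  /* x87 control word           */
    unsigned short int reserved;
    unsigned int mxcsr;               /* SSE control/status register */
} x86_modes_t;

/* These hold on ABIs with 16-bit short and 32-bit int, such as x86 Linux. */
_Static_assert(sizeof(x86_modes_t) == 8, "expected an 8-byte mode block");
_Static_assert(offsetof(x86_modes_t, mxcsr) == 4, "mxcsr expected at offset 4");

/* Copy the 8 bytes into a 64-bit integer, byte for byte. */
uint64_t modes_to_bits(x86_modes_t m) {
    uint64_t bits;
    memcpy(&bits, &m, sizeof bits);
    return bits;
}

/* Inverse of modes_to_bits. */
x86_modes_t modes_from_bits(uint64_t bits) {
    x86_modes_t m;
    memcpy(&m, &bits, sizeof m);
    return m;
}
```

The round trip through modes_to_bits and modes_from_bits preserves every field, which is all that an integer-typed carrier for the modes would need to guarantee.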

Sure. I forgot to mention x86.

There are cases where the size of fenv_t differs between libraries. ARM uses unsigned int in glibc but unsigned long in musl.

Is unsigned long 32 bits in this case?

Yes, ARM gcc 10.2 (Linux) gives 4 for sizeof(unsigned long).

qiucf added a subscriber: qiucf. Mar 15 2021, 1:27 AM
sepavloff updated this revision to Diff 331293. Mar 17 2021, 9:38 AM

Added helper functions

These are methods of IRBuilder: createGetFPMode, which gets the size of the FP modes from
DataLayout, createSetFPMode, and createResetFPMode.

sepavloff updated this revision to Diff 334095. Tue, Mar 30, 3:15 AM

Rebased patch

Any feedback?