This is an archive of the discontinued LLVM Phabricator instance.

IR: Add llvm.frexp intrinsic
ClosedPublic

Authored by arsenm on May 2 2023, 7:16 PM.

Details

Summary

Add an intrinsic which returns the two pieces as multiple return
values. Alternatively could introduce a pair of intrinsics to
separately return the fractional and exponent parts.

AMDGPU has native instructions to return the two halves, but could use
some generic legalization and optimization handling. For example, we
should be able to handle legalization of f16 on older targets, and for
bf16. Additionally antique targets need a hardware workaround which
would be better handled in the backend rather than in library code
where it is now.

Diff Detail

Event Timeline

arsenm created this revision.May 2 2023, 7:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 2 2023, 7:16 PM
arsenm requested review of this revision.May 2 2023, 7:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 2 2023, 7:16 PM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm updated this revision to Diff 518946.May 2 2023, 7:33 PM
foad added inline comments.May 3 2023, 1:50 AM
llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
1835

"G_FFREXP"

sepavloff added inline comments.May 3 2023, 2:22 AM
llvm/docs/LangRef.rst
14767

integral -> integer?

llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
1835

G_FLDEXP -> G_FFREXP?

llvm/include/llvm/IR/Intrinsics.td
1046

Can LLVMScalarOrSameVectorWidth be used to constrain return values?

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2092

Can it be just Entry.IsZExt = !Entry.IsSExt?

2510

The variable name is misleading. It is not a mask, it does not select bits. It is just a value.

2519

ExpMask could be a better name, there is no infinity bit.

arsenm updated this revision to Diff 523429.May 18 2023, 10:09 AM
arsenm marked 4 inline comments as done.

Address comments

llvm/docs/LangRef.rst
14767

No, every other doc phrases it as integral

llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
1835

I'm debating doing a cleanup and dropping all of the "F" prefixes from all of these. It's copying an annoyance from the DAG where the node names don't exactly match the IR

llvm/include/llvm/IR/Intrinsics.td
1046

I tried to use it a few times, but it seems to not work. I think it's having trouble with the type being embedded inside this argument list.

/Users/matt/src/llvm-project/llvm/include/llvm/IR/Intrinsics.td:1046:7: error: Initializer of 'TypeSig' in 'int_frexp' could not be fully resolved: anonymous_142.TypeSig
  def int_frexp : DefaultAttrsIntrinsic<[llvm_anyfloat_ty, LLVMScalarOrSameVectorWidth<0, llvm_anyint_ty>], [LLVMMatchType<0>]>;
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2092

This is just copy paste from the other case that does this. Seems like a broken API though, why split out the 2 cases if it must be one or the other?

sepavloff added inline comments.May 22 2023, 11:08 PM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2514

Thos variable is not a mask, it is just a value.

2549

Look like the name is wrong. It states about biggest normal, but actually the smallest is added.

2551

Is this correct?

The check is x + SNN <= SNN in unsigned wrapping math. It holds if x represents zero or denormal value, so essentially it checks for zero or denormal, no?

2564

This value is used only SELECT, which is then shifted right. Can it be dropped?

5238

Can we use 1 for the second operand? The value produced by FFREXP should not add extra bits to mantissa.

arsenm added inline comments.May 23 2023, 3:00 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2564

Not sure I follow, you would have to shift the other operand of the select too.

I got here by running the C implementation through the optimizer and translating from IR up to the DAG nodes. I'm trusting the IR's canonicalization decisions.

arsenm added inline comments.May 23 2023, 3:02 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
5238

Would that even compile on most targets? I view the second operand here as a historical artifact/mistake and don't really expect a value other than 0 to work to exactly correspond to fptrunc

arsenm added inline comments.May 23 2023, 3:03 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
5238

Oh, I thought this was a rounding mode control. not a rounding-known-not-to-happen bit. Not sure, it's probably OK

barannikov88 added inline comments.
llvm/docs/LangRef.rst
14784
14798

This should explain what is returned in either of the struct fields.

kpn added a comment.May 23 2023, 6:11 AM

Would a constrained version of this be needed?

Would a constrained version of this be needed?

Yes, I think so. Feels like it shouldn't be necessary but on my system at least the library function returns 0 with DAZ enabled so that implies a dependence on the FP mode

Would a constrained version of this be needed?

Yes, I think so. Feels like it shouldn't be necessary but on my system at least the library function returns 0 with DAZ enabled so that implies a dependence on the FP mode

Actually the system one works as you would expect with or without DAZ. The musl implementation (and the expansion here) usage of fmul is what breaks denormal handling

arsenm updated this revision to Diff 525222.May 24 2023, 9:27 AM
arsenm marked 5 inline comments as done.

There is no code for WidenVecRes_FREXP and WidenVecOp_FREXP?

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2564

The value ExpMaskScaled is used only in SELECT:

SDValue ScaledValue =
    DAG.getNode(ISD::SELECT, dl, AsIntVT, IsDenormal, ExpMaskScaled, Abs);

which is used only in the right shift:

SDValue ShiftedExp =
    DAG.getNode(ISD::SRL, dl, AsIntVT, ScaledValue, ExponentShiftAmt);

It shifts out the bits that were cleared by AND operation, so this operation looks useless and may be removed.

I got here by running the C implementation through the optimizer and translating from IR up to the DAG nodes.

Probably this implementation may be used as is if its correctness is validated in some other way, maybe by runtime tests. Cleanup and optimization may be made later on.

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
336

Is this code unreachable?

arsenm updated this revision to Diff 529051.Jun 6 2023, 2:45 PM
arsenm marked an inline comment as done.

There is no code for WidenVecRes_FREXP and WidenVecOp_FREXP?

I also don't know of a use case for them, so no way to test it

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2564

This passed when I forced passing this through opencl conformance on amdgpu. I don't really want to touch this anymore, especially since the combiner should take care of these types of folds

LGTM.

There is no code for WidenVecRes_FREXP and WidenVecOp_FREXP?

I also don't know of a use case for them, so no way to test it

In the test X86/llvm.frexp.ll there is a note ; FIXME: Widen vector result, probably this the use case. I think the support could be implemented in another patch, this one is already large.

foad added inline comments.Jun 8 2023, 1:13 AM
llvm/docs/LangRef.rst
14798–14799

"The second result is that power of two" is wrong.

frexp(2560) is { 0.625, 12 }

The argument (2560) was multiplied by some power of two (2^-12) to get 0.625, but the second result is not 2^-12 or -12, it's 12.

foad added inline comments.Jun 8 2023, 1:18 AM
llvm/include/llvm/CodeGen/ISDOpcodes.h
935

Typo "FFREXP", or just remove it?

arsenm updated this revision to Diff 532300.Jun 16 2023, 3:00 PM
arsenm marked 2 inline comments as done.

Address comments

sepavloff accepted this revision.Jun 28 2023, 6:37 AM

LGTM.

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
2332

The comment looks not suitable.

2336

You could use C++17 feature here:

auto [LoVT, HiVT] = DAG.GetSplitDestVTs(N->getValueType(0));
This revision is now accepted and ready to land.Jun 28 2023, 6:37 AM