This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering] expandFP_TO_UINT - avoid FPE due to out of range conversion (PR17686)
ClosedPublic

Authored by RKSimon on Oct 27 2018, 1:43 PM.

Details

Summary

PR17686 demonstrates that for some targets FP exceptions can fire in cases where the FP_TO_UINT is expanded using a FP_TO_SINT instruction.

The existing code converts both the inrange and outofrange cases using FP_TO_SINT and then selects the result, this patch changes this for 'strict' cases to pre-select the FP_TO_SINT input and the offset adjustment.

The X87 cases doesn't need the strict flag but generates much nicer code with it....

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Oct 27 2018, 1:43 PM
RKSimon updated this revision to Diff 171427.Oct 28 2018, 7:16 AM

Rebased on rL345473 which added vector legalization support

uweigand added a subscriber: kpn.Oct 29 2018, 7:42 AM

So I had thought that normal LLVM floating-point IR operations are defined to assume that FP exceptions are disabled, and therefore transformations do not need to preserve the FP exception status.

If you rely on FP exceptions, you should instead use the new family of "strict" floating-point intrinsics.

In particular, this very issue came up in the context of correctly lowering the STRICT_FP_TO_UINT operation, where @kpn suggested to lower this directly to an if-then-else sequence at the IR level, before even getting to the DAG. This hasn't gone in the tree yet, but the discussion is ongoing here: https://reviews.llvm.org/D43515

The proposed new pass is here: https://reviews.llvm.org/D43515?id=144714#change-iPGnktlSiYV3

Now, the solution suggested in this patch is also interesting since it can be done without introducing new basic blocks (and therefore can be done in SelectionDAG, without needed a new pass). However, in order to generate efficient code we probably still need to get to the same assembler code we'd have gotten from that new pass -- but now instead the burden is on the back-end to recover the if-then-else block from a series of select statements. Not sure what is better here.

In any case, the current code I see e.g. in the SystemZ tests is definitely worse than what we have today.

Would you like me to keep this patch around for STRICT_FP_TO_UINT expansion?

RKSimon updated this revision to Diff 172579.Nov 5 2018, 6:39 AM
RKSimon edited the summary of this revision. (Show Details)

Updated to only use the 'strict' FP_TO_UINT path when the TLI shouldUseStrictFP_TO_INT helper is true.

RKSimon updated this revision to Diff 172759.Nov 6 2018, 6:42 AM

Fix signed/unsigned mismatch

efriedma added inline comments.Nov 14 2018, 2:07 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
4160

Do we need to check that Cst is finite? For example, 0x80000000 doesn't fit into a half-precision float. I guess it doesn't matter for x86, but it might matter in the future.

4167

Maybe worth explaining why the extra rounding step here doesn't matter.

RKSimon updated this revision to Diff 174375.Nov 16 2018, 8:25 AM

Added early-out for the case where the integer dst type is larger than the float src type maximum value - we should be able to just use FP_TO_SINT. Thanks to @efriedma for noticing that - it affects a MIPS test.

@atanasyan - should the f16 test should be using a smaller type than the i32 so the full uitofp half path is tested?

kpn added a comment.Nov 16 2018, 9:40 AM

Now, the solution suggested in this patch is also interesting since it can be done without introducing new basic blocks (and therefore can be done in SelectionDAG, without needed a new pass). However, in order to generate efficient code we probably still need to get to the same assembler code we'd have gotten from that new pass -- but now instead the burden is on the back-end to recover the if-then-else block from a series of select statements. Not sure what is better here.

But does this actually solve the issue with FP traps, or does it just mask it? What's to stop llvm from reordering the selects and bringing back the trap?

In any case, the current code I see e.g. in the SystemZ tests is definitely worse than what we have today.

Does this patch prevent llvm from introducing traps? If no, what is the benefit from a change that doesn't eliminate the problem and produces worse code?

As I understand the approach, it can never introduce a new trap. Now of course, if the original source value is outside the range of the target integer type, some of the instructions introduced by this approach may trap, but in that case, it is expected for the fp_to_uint operation to trap somewhere.

If the original source value is in range however, then none of the instructions introduced here can ever trap:

	​    // Sel = Src < 0x8000000000000000
	​    // Val = select Sel, Src, Src - 0x8000000000000000
	​    // Ofs = select Sel, 0, 0x8000000000000000
	​    // Result = fp_to_sint(Val) ^ Ofs

The first is a compare that never traps. The select operations themselves also never trap. The first select in addition contains a subtraction, but if Src is in range for the fp_to_uint operation, that subtract cannot trap either. Finally, if Src is in range for fp_to_uint, then by construction the value Val will be in range for fp_to_sint, and therefore that operation cannot trap either.

Sorry @kpn I missed your post - thanks @uweigand for answering. I should add that I don't know of any combine that would move the select back again and reintroduce this.

@atanasyan The f16 conversion fix should probably be pulled out - do you want me to add 'in overflow range' and 'out of overflow range' fptoui tests to f16-llvm-ir.ll ?

@atanasyan The f16 conversion fix should probably be pulled out - do you want me to add 'in overflow range' and 'out of overflow range' fptoui tests to f16-llvm-ir.ll ?

Sorry @RKSimon, I missed your initial question. Yes, let's add such fptoui tests to f16-llvm-ir.ll.

kpn added a comment.Nov 19 2018, 7:57 AM

As I understand the approach, it can never introduce a new trap. Now of course, if the original source value is outside the range of the target integer type, some of the instructions introduced by this approach may trap, but in that case, it is expected for the fp_to_uint operation to trap somewhere.

If the original source value is in range however, then none of the instructions introduced here can ever trap:

	​    // Sel = Src < 0x8000000000000000
	​    // Val = select Sel, Src, Src - 0x8000000000000000
	​    // Ofs = select Sel, 0, 0x8000000000000000
	​    // Result = fp_to_sint(Val) ^ Ofs

The first is a compare that never traps. The select operations themselves also never trap. The first select in addition contains a subtraction, but if Src is in range for the fp_to_uint operation, that subtract cannot trap either. Finally, if Src is in range for fp_to_uint, then by construction the value Val will be in range for fp_to_sint, and therefore that operation cannot trap either.

OK, thanks for the clarification. That makes sense.

I like that this isn't a pass and therefore doesn't cost us any compilation time when it isn't needed.

efriedma added inline comments.Nov 29 2018, 4:30 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
4165

f32/f64 results should then be whole integers

I'm not sure what this means; are you saying the result of the subtraction a whole integer?

RKSimon updated this revision to Diff 176396.Dec 3 2018, 6:58 AM

Cleaned up comment

efriedma accepted this revision.Dec 3 2018, 5:22 PM

LGTM

This revision is now accepted and ready to land.Dec 3 2018, 5:22 PM
This revision was automatically updated to reflect the committed changes.