This is an archive of the discontinued LLVM Phabricator instance.

[FPEnv Core 14/14] Introduce F*_W_CHAIN instrs to prevent reordering
Needs ReviewPublic

Authored by sdmitrouk on Oct 26 2015, 6:56 AM.

Details

Summary

When floating-point environment is not examined, reordering instructions doesn't
harm, otherwise library call that checks floating-point environment state can be
moved before the actual computation, producing wrong results (not affected by
side-effects of operations).

Diff Detail

Repository
rL LLVM

Event Timeline

sdmitrouk updated this revision to Diff 38417.Oct 26 2015, 6:56 AM
sdmitrouk retitled this revision from to [FPEnv Core 14/14] Introduce F*_W_CHAIN instrs to prevent reordering.
sdmitrouk updated this object.
sdmitrouk added reviewers: hfinkel, mehdi_amini.
sdmitrouk set the repository for this revision to rL LLVM.
sdmitrouk added subscribers: llvm-commits, resistor, scanon.

Please drop the AMDGPU changes. I don't see a point in trying to support these at this time, and they can't be meaningfully tested now. While the hardware supports floating point exceptions, they require a lot of additional compiler work to implement beyond the core operations. A lot of SGPRs need to be reserved and initialized, the control registers set to enable them, somehow supporting trap handlers and probably other things.

Other than that, Legal is a bad default for these. These should use the new LibCall action by default set in TargetLoweringBase. There is also a hasFloatingPointExceptions() target hook already, perhaps something could error if these are used and not supported.

include/llvm/CodeGen/ISDOpcodes.h
243

Typo: inary

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
1422–1424

You can use a statically sized C array for this

4485

Using the integer size for address space 0 seems like a poor choice. Do targets actually care about this type? Why isn't it an i1 target constant?

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
202–206

A new utility function for isFPOpWithChain would be useful.

208–209

Why is this Op.getOperand(0) instead of just Op? I would also move the chain handling out of the loop and handle separately or have a separate w/chain and wo/chain loop

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
2182–2186

A Chain out arguments looks weird here. Why can't UnrollVectorOp's result have the chain result?

lib/CodeGen/SelectionDAG/TargetLowering.cpp
92

Ops.empty()

lib/Target/AMDGPU/SIISelLowering.cpp
1309–1312 ↗(On Diff #38417)

Why would only the last fmul need to be FMUL_W_CHAIN?

lib/Target/AMDGPU/SIInstructions.td
1450–1454 ↗(On Diff #38417)

The instruction asm string should not include the _wchain. A better name I think for this would be V_ADD_F32_FPE or something else that doesn't mention chains.

jholewinski added inline comments.Oct 26 2015, 8:54 AM
lib/Target/NVPTX/NVPTXInstrInfo.td
696

Does it matter if the target does not actually support FP exceptions? This is not currently modeled in PTX.

sdmitrouk added inline comments.Oct 27 2015, 10:52 AM
lib/Target/NVPTX/NVPTXInstrInfo.td
696

Does it matter if the target does not actually support FP exceptions? This is not currently modeled in PTX.

Not really, this only fixes ordering of FP instructions, everything else (i.e., changes in floating-point state registers) is left for future improvements. I just wanted to ensure that new instructions won't cause selection failures on supported targets, but if you think this is useless for PTX or other targets I can remove these changes.

Drive-by-comment...

Is there an advantage to duplicating FADD, FSUB, FMUL, FDIV, and FREM into a second W_CHAIN set, instead of having only a single set of ops which have the chain operand? It would naively seem better to have less duplication? (That is, if the compilation flags say to not care about fp ordering, the input chain would still be there, but would always be set to the entry node.)

Drive-by-comment...

Is there an advantage to duplicating FADD, FSUB, FMUL, FDIV, and FREM into a second W_CHAIN set, instead of having only a single set of ops which have the chain operand? It would naively seem better to have less duplication? (That is, if the compilation flags say to not care about fp ordering, the input chain would still be there, but would always be set to the entry node.)

This was the first implementation, but the issue was that in that case you can't (or I don't know how) use same instruction twice to create one node with side effects and the other one without (and model changes in related registers probably too). Hal actually said that one could add new instructions and for me it also appeared like a safer choice to do not break current floating-point math.

sdmitrouk updated this revision to Diff 42442.Dec 10 2015, 10:01 AM

Dropped AMDGPU. Addressed comments.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4485

This is a rewrite of code in the case directly above, I just made it aware of the chain.

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
208–209

Thanks, it should be Op.

I would also move the chain handling out of the loop and handle separately or have a separate w/chain and wo/chain loop

It's not the chain of the node we're processing, it's reference to the chain of an operand. Node shouldn't be legalized by reference to its chain, which will be replaced with new value when such node gets legalized through reverence to its value.

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
2182–2186

Why can't UnrollVectorOp's result have the chain result?

Because its result is processed is assumed to have type of vector element and it is processed that way.

mehdi_amini edited edge metadata.Jan 5 2016, 10:25 PM

Haven't looked at the target specific code, seems mostly mechanical changes.
Some comments on the generic part.

include/llvm/CodeGen/SelectionDAG.h
1228

Remove spurious space change.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
1338

I'm not sure I understand what's going on here, this need to be documented.

8701

It is not clear to me that this will preserve the FP env?
Also "UnsafeFPMath" seems fundamentally not compatible with FP env, so it smells.

lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
1064

spurious change

1070

spurious formatting change?

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
170

Not clear how it relates to the "FP with chain" stuff.

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
114

spurious formatting change

2058

In the rest of the patch, I think you grouped the *_W_CHAIN cases.

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
3928

I'm not convince you maintain correct FPenv semantics here.

3936

Another case of mixing UnsafeFPMath with FPenv, I'm not sure about where this is going. Shouldn't we just drop the chain and turn every XXX_W_CHAIN into XXX when UnsafeFPMath is enabled?

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2352

This has changed recently :)