This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31088)
ClosedPublic

Authored by RKSimon on Apr 22 2017, 12:30 PM.

Details

Summary

PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats.

This patch adds support for extension/truncation for both integer and float types.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Apr 22 2017, 12:30 PM
RKSimon retitled this revision from [SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31008 to [SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31008).Apr 23 2017, 10:38 AM
efriedma added inline comments.Apr 24 2017, 3:33 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
368

Is it actually possible for this to truncate/round?

561

Unless I'm misreading this, we just set Val to an EXTRACT_VECTOR_ELT of type PartVT on the previous line; does this conversion do anything?

test/CodeGen/X86/pr31088.ll
15

This __gnu_f2h_ieee + __gnu_h2f_ieee sequence looks strange...

22

Only tangentially related to your patch, but I'm not sure I understand the lowering here; addss is not a half-precision operation (and therefore won't produce correctly rounded results).

RKSimon added inline comments.Apr 25 2017, 6:40 AM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
561

Good point - will remove this and add an assert to check that the Parts[0] has a PartVT value type.

test/CodeGen/X86/pr31088.ll
15

From what I can tell, half args always come in as single precision floats and we don't guarantee that the floats are only set to half precision, so we have to truncate them - and then as we're performing arithmetic they must be extended back again to singles for the fadd.

Interestingly we don't bother truncating+extending the single results - we leave it with excess precision on return.

RKSimon updated this revision to Diff 96544.Apr 25 2017, 6:44 AM

Updated based on @efriedma comments

RKSimon updated this revision to Diff 96552.Apr 25 2017, 7:12 AM

rebased against trunk

efriedma resigned from this revision.Apr 27 2017, 3:01 PM

(I don't understand how this is supposed to work, so I don't feel comfortable approving it.)

spatel edited edge metadata.

The title/summary shows the wrong PR#; it should be PR31088:
https://bugs.llvm.org/show_bug.cgi?id=31088

The patch itself seems mechanical in that it's repeating the integer fixups, but I wonder if we can get someone with half-FP experience to take a look too in case this isn't behaving how we expect on all targets? The suspicious x86 codegen is a separate issue as noted earlier, but that's better than crashing? :)

test/CodeGen/X86/pr31088.ll
2–3

Does having both SSE and AVX RUNs add value? Seems like the output is identical apart from 'v' prefixes, so I'd kill one of them just as a matter of saving test time.

jlebar added a reviewer: tra.Apr 28 2017, 12:17 PM
jlebar removed a reviewer: jlebar.
jlebar added a subscriber: jlebar.
RKSimon retitled this revision from [SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31008) to [SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31088).Apr 28 2017, 12:58 PM
RKSimon edited the summary of this revision. (Show Details)
RKSimon added inline comments.Apr 28 2017, 1:00 PM
test/CodeGen/X86/pr31088.ll
2–3

OK - I'll keep SSE. I'll add i686 targets instead as they will be coming from the stack instead of registers so will show another behaviour

tra added inline comments.Apr 28 2017, 3:03 PM
lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
532

Nit. I'd use VT.isFloatingPoint() and use FP_EXTEND only on types we positively know to be FP and leave behavior for all other types as it is right now. Otherwise you're making implicit assumption that all types are either integer or FP which, generally speaking, is not true. There's void, for instance.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
367

Same as above.

test/CodeGen/X86/pr31088.ll
2–3

It may be interesting to run the tests for NVPTX target as it supports both native FP16 and promote-to-fp32-calcualate->demote back to fp16 codegen modes.

RKSimon updated this revision to Diff 97193.Apr 29 2017, 8:20 AM

Use VT.isFloatingPoint() to select float extension/truncation (and still use integer path by default).

Add NVPTX test as suggested.

RKSimon marked 4 inline comments as done.Apr 29 2017, 8:21 AM
tra edited edge metadata.May 1 2017, 10:43 AM

LGTM. The patch produces sensible code for NVPTX target with and without fp16 support in hardware.

This revision was automatically updated to reflect the committed changes.