This is an archive of the discontinued LLVM Phabricator instance.

Don't combine fp_round (fp_round x) if f80 to f16 is generated
ClosedPublic

Authored by pirama on Feb 12 2016, 3:52 PM.

Details

Summary

This patch skips DAG combine of fp_round (fp_round x) if it results in
an fp_round from f80 to f16.

fp_round from f80 to f16 always generates an expensive (and as yet,
unimplemented) libcall to __truncxfhf2. This prevents selection of
native f16 conversion instructions from f32 or f64. Moreover, the first
(value-preserving) fp_round from f80 to either f32 or f64 may become a
NOP in platforms like x86.

Diff Detail

Event Timeline

pirama updated this revision to Diff 47869.Feb 12 2016, 3:52 PM
pirama retitled this revision from to Don't combine fp_round (fp_round x) if f80 to f16 is generated.
pirama updated this object.
pirama added a reviewer: ab.
pirama added subscribers: llvm-commits, srhines.
ab added inline comments.Feb 12 2016, 3:58 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9030

How do we know that the round is value-preserving?

pirama added inline comments.Feb 12 2016, 4:03 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9030

The first round being value-preserving is a pre-requisite for the folding (the if statement below). So, this patch doesn't change behavior if the first fp_round is not value-preserving.

ab accepted this revision.Feb 12 2016, 4:11 PM
ab edited edge metadata.

LGTM

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9030

Ah, you're right; I misread that, sorry.

test/CodeGen/X86/half.ll
270

What do you think of explicitly testing f80->f16 instead?

This revision is now accepted and ready to land.Feb 12 2016, 4:11 PM

Thanks for the quick review :)

This revision was automatically updated to reflect the committed changes.
srhines added inline comments.Feb 12 2016, 5:18 PM
test/CodeGen/X86/half.ll
270

Pirama, I think you missed this question. :)

pirama added inline comments.Feb 12 2016, 6:35 PM
test/CodeGen/X86/half.ll
270

Aah, sorry, I missed this. I think you mean testing ftrunc from f80 -> say f32 followed by f32 -> f16. This doesn't trigger the folding because the rounds are not value preserving.