This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Add support for FP_ROUND in WidenVectorOperand.
ClosedPublic

Authored by foad on Oct 23 2019, 6:29 AM.

Details

Summary

This is used on AMDGPU for rounding from v3f64 (which is illegal) to
v3f32 (which is legal).

Event Timeline

foad created this revision.Oct 23 2019, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 23 2019, 6:29 AM

I think the code looks fine, but it doesn't look like some of the code is tested. And may not be testable?

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4301

I'm not sure how this if could be true given that the opposite was checked in outer if. But since that was already like that when you got here this looks fine.

llvm/test/CodeGen/AMDGPU/fptrunc.ll
34

Is this hitting the unrolling path rather than any of the code that creates a wider node?

foad marked 2 inline comments as done.Oct 24 2019, 1:31 AM
foad added inline comments.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4301

I hadn't noticed that. My guess is that the check for !N->isStrictFPOpcode() in the outer if was added along with the FIXME comment, so maybe it was supposed to be a temporary fix?

llvm/test/CodeGen/AMDGPU/fptrunc.ll
34

The test definitely fails without the fix above, but I haven't looked in detail at how it is handled.

arsenm added inline comments.Oct 27 2019, 8:36 PM
llvm/test/CodeGen/AMDGPU/fptrunc.ll
34

I would expect this to hit the unroll path. v3f64 isn't legal

This revision is now accepted and ready to land.Oct 27 2019, 8:51 PM
foad marked an inline comment as done.Oct 28 2019, 3:59 AM
foad added inline comments.
llvm/test/CodeGen/AMDGPU/fptrunc.ll
34

If I'm reading the legalize-types debug output correctly it is being widened to v4 and then split to v2:

Legalizing node: t33: v3f32 = fp_round t30, TargetConstant:i64<0>
Analyzing result type: v3f32
Legal result type
Analyzing operand: t30: v3f64 = extract_subvector t37, Constant:i32<0>
Widen node operand 0: t33: v3f32 = fp_round t30, TargetConstant:i64<0>

Legalizing node: t49: v4f32 = fp_round t37, TargetConstant:i64<0>
Analyzing result type: v4f32
Legal result type
Analyzing operand: t37: v4f64 = bitcast t36
Split node operand: t49: v4f32 = fp_round t37, TargetConstant:i64<0>

Is that a problem? Is my patch still OK?

craig.topper added inline comments.Oct 28 2019, 10:24 AM
llvm/test/CodeGen/AMDGPU/fptrunc.ll
34

Ok so v4f32 result type is legal so it went up to v4f32 <- v4f64, but then v4f64 isn't legal so it split back down? I think that should be fine. But maybe @arsenm should confirm.

This revision was automatically updated to reflect the committed changes.