This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix crash when dag-combining bitcast
ClosedPublic

Authored by ruiling on Aug 11 2020, 7:49 PM.

Details

Summary

From the code after the 'break', they are processing 64bit scalar and
vector bitcast. So I think the break-condition should be (cond1 || cond2)
This means we only execute following code if (64bit and dest-is-vector).

Also remove a previous fix which is not needed with this new fix.
(introduced in: 1349a04ef5f594dda705ec80474dda4837f26dba)

Diff Detail

Event Timeline

ruiling created this revision.Aug 11 2020, 7:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 11 2020, 7:49 PM
ruiling requested review of this revision.Aug 11 2020, 7:49 PM
arsenm added inline comments.Aug 12 2020, 5:34 AM
llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.ll
306

Does this need the constant canonicalize? I'd rather avoid relying on some specific constant folding behavior.

I'm also not sure how this produces a 64-bit value

ruiling added inline comments.Aug 12 2020, 6:28 AM
llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.ll
306

I am not sure why the constant canonicalize was generated right now. The IR in the test-case will become a bitcast from fp32 constant to <1 x i32> in SelectionDAG, I cannot directly make an IR to bitcast a constantfp to <1 x i32>, that will be directly optimized off as a i32 constant after optimization. The code here checking for break condition is wrong, a bitcast from constantfp to <1 x i32> should not execute this piece of code handling 64bit bitcast. It should break-away early. But the break-condition checking expression is false because the DestVT is vector. So a 32bit bitcast goes into if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Src)). That's why I modify the break-condition check from && to ||. I think a bitcast from fp32 constant to <2 x i16> or fp32 constant cast to <4 x i8> also need such fix.

arsenm accepted this revision.Aug 12 2020, 6:41 AM

LGTM. I'm slightly worried InstSimplify may get too smart and start folding out the intrinsic before it hits the DAG someday, but that probably won't happen

This revision is now accepted and ready to land.Aug 12 2020, 6:41 AM

I see. Is there other way to keep a bitcast from contantfp to i32 at LLVM IR level from being optimized off before the DAG?

I see. Is there other way to keep a bitcast from contantfp to i32 at LLVM IR level from being optimized off before the DAG?

I don't have any great ideas. It can be difficult to figure out how to produce operations that will produce the right type combination at the right time in the DAG

This revision was automatically updated to reflect the committed changes.