This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix cvt_f32_ubyte combine with shl
ClosedPublic

Authored by vangthao on Oct 28 2021, 9:56 AM.

Details

Summary

The shift node is still needed to check whether the shift is shr or shl, which decides whether the offset is incremented or decremented. Do not overwrite the node.
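
To illustrate why the shift direction matters, here is a minimal standalone sketch of the offset arithmetic. It is not the code from this patch: `ShiftKind`, `cvtF32UByte`, and `foldShiftIntoByteIndex` are hypothetical names for illustration, and the model assumes a 32-bit source value with a constant shift amount.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two shift opcodes; not an LLVM type.
enum class ShiftKind { Shl, Srl };

// Reference semantics assumed for cvt_f32_ubyteN: convert byte N of a
// 32-bit value to float.
float cvtF32UByte(uint32_t value, unsigned byteIndex) {
  assert(byteIndex < 4 && "a 32-bit value only has bytes 0..3");
  return static_cast<float>((value >> (8 * byteIndex)) & 0xFFu);
}

// For cvt_f32_ubyte<byteIndex>(shift(x, amount)), return the byte index M
// such that the whole expression equals cvt_f32_ubyte<M>(x), or -1 when
// the shift cannot be folded into the byte index.
int foldShiftIntoByteIndex(ShiftKind kind, unsigned byteIndex, unsigned amount) {
  if (byteIndex >= 4 || amount >= 32)
    return -1;

  int bitOffset = 8 * static_cast<int>(byteIndex);
  if (kind == ShiftKind::Shl)
    bitOffset -= static_cast<int>(amount); // shl moves bytes up, so read lower
  else
    bitOffset += static_cast<int>(amount); // srl moves bytes down, so read higher

  // Only whole-byte offsets inside the 32-bit value can be folded.
  if (bitOffset < 0 || bitOffset >= 32 || bitOffset % 8 != 0)
    return -1;
  return bitOffset / 8;
}
```

In this model, a left shift by 8 under `cvt_f32_ubyte1` folds to byte 0 of the unshifted value, while a right shift by 8 folds to byte 2. If the shift kind is no longer available when the offset is adjusted, the two cases cannot be told apart.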

Event Timeline

vangthao created this revision. Oct 28 2021, 9:56 AM
vangthao requested review of this revision. Oct 28 2021, 9:56 AM
Herald added a project: Restricted Project. Oct 28 2021, 9:56 AM
arsenm accepted this revision. Oct 28 2021, 11:06 AM
arsenm added inline comments.
llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte_vector.ll
4–6 (On Diff #383069)

You can just put this in the existing test

This revision is now accepted and ready to land. Oct 28 2021, 11:06 AM
vangthao added inline comments. Oct 28 2021, 12:22 PM
llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte_vector.ll
4–6 (On Diff #383069)

When I put it with the existing test, I am getting:

LLVM ERROR: Cannot select: t86: ch = store<(store (s8) into i32* undef + 3), trunc to i8> t97, t51, undef:i64, undef:i64

This is coming from the first check with -mcpu=tahiti.

arsenm added inline comments. Oct 28 2021, 12:55 PM
llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte_vector.ll
4–6 (On Diff #383069)

Just change the flat pointers to addrspace(1)

vangthao updated this revision to Diff 383181. Oct 28 2021, 3:35 PM

Moved the new test into the existing test file.

arsenm accepted this revision. Oct 28 2021, 4:40 PM
This revision was landed with ongoing or failed builds. Oct 28 2021, 10:07 PM
This revision was automatically updated to reflect the committed changes.
foad added inline comments. Oct 29 2021, 4:40 AM
llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
1

This comment is misleading now because you didn't generate all the GFX9 checks for your new RUN line, and you didn't generate checks for the new function cvt_f32_ubyte0_vector.

vangthao added inline comments. Oct 29 2021, 12:18 PM
llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
1

Removed comment in D112839