This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Custom lower INSERT_SUBVECTOR v3, v4, v5, v8
ClosedPublic

Authored by tpr on Jun 11 2019, 1:10 PM.

Details

Summary

Since the changes to introduce vec3 and vec5, INSERT_VECTOR for these
sizes has been marked "expand", which made LegalizeDAG lower it to loads
and stores via a stack slot. The code got optimized a bit later, but the
now-unused stack slot was never deleted.

This commit avoids that problem by custom lowering INSERT_SUBVECTOR into
an EXTRACT_VECTOR_ELT and INSERT_VECTOR_ELT for each element in the
subvector to insert.

Change-Id: I9e3c13e36f68cfa3431bb9814851cc1f673274e1

Diff Detail

Event Timeline

tpr created this revision.Jun 11 2019, 1:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 11 2019, 1:10 PM
arsenm added inline comments.Jun 11 2019, 1:15 PM
lib/Target/AMDGPU/SIISelLowering.cpp
348

Should this handle MVT::Other and get all the types? What happens with v6i32 or v4i16?

test/CodeGen/AMDGPU/insert-subvector-unused-scratch.ll
6

Should check more, even if generated

12

No case for 5 x?

tpr updated this revision to Diff 204248.Jun 12 2019, 3:29 AM

V2: Addressed review comments re test.

tpr marked 3 inline comments as done.Jun 12 2019, 3:31 AM
tpr added inline comments.
lib/Target/AMDGPU/SIISelLowering.cpp
348

I would prefer not to in this commit. Here I am just trying to undo the damage done by my vec3 and vec5 changes, which (a) added those setOperationActions with Expand, and (b) implemented some joining involving the new vec3/vec5 types in terms of INSERT_SUBVECTOR.

arsenm accepted this revision.Jun 18 2019, 4:35 PM

LGTMT

This revision is now accepted and ready to land.Jun 18 2019, 4:35 PM
This revision was automatically updated to reflect the committed changes.