This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Prepare for introduction of v3 and v5 MVTs
ClosedPublic

Authored by tpr on Mar 4 2019, 7:05 AM.

Details

Summary

AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This
commit does not add them, but makes preparatory changes:

  • Exclude non-legal non-power-of-2 vector types from ComputeRegisterProp mechanism in TargetLoweringBase::getTypeConversion.
  • Cope with SETCC and VSELECT for odd-width i1 vector when the other vectors are legal type.
  • Fixed an assumption of power-of-2 vector type in ARM.
  • Fixed assumptions of power-of-2 vector type in AMDGPU kernel arg handling.
  • Fixed AMDGPU cost analysis to behave the same.

Some of this patch is from Matt Arsenault, also of AMD.

Change-Id: Ib5f23377dbef511be3a936211a0b9f94e46331f8

Diff Detail

Event Timeline

tpr created this revision.Mar 4 2019, 7:05 AM
arsenm added a comment.Mar 4 2019, 7:33 AM

The ARM patch at least should be split into a separate patch

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1034 ↗(On Diff #189142)

Braces

1036 ↗(On Diff #189142)

Should also make sure to add some 5x vector argument tests to the kernarg tests since you are adding those

lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
343 ↗(On Diff #189142)

This could use a test in test/Analysis/CostModel/AMDGPU

lib/Target/ARM/ARMISelLowering.cpp
12153 ↗(On Diff #189142)

Combine the last 2 checks into NumLanes >= 3

12211 ↗(On Diff #189142)

Ditto

tpr marked 3 inline comments as done.Mar 4 2019, 11:59 AM
tpr added inline comments.
lib/Target/ARM/ARMISelLowering.cpp
12153 ↗(On Diff #189142)

That won't work -- it needs to be true for ==3 and >4, but not for ==4.

The general idea makes sense, I think.

lib/CodeGen/TargetLoweringBase.cpp
746

The reason we're checking for isSimple() here is precisely so that we can perform a lookup in the TransformToType array. If we need a couple extra lines of code to handle TypeWidenVector, we should just add them (and maybe refactor the code so it doesn't repeat itself so much).

tpr updated this revision to Diff 189217.Mar 4 2019, 3:17 PM
tpr marked an inline comment as done.

V2: Moved ARM and AMDGPU changes out to their own commits.

tpr marked 4 inline comments as done.Mar 4 2019, 3:24 PM
tpr added inline comments.
lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
343 ↗(On Diff #189142)

I have separated out the amdgpu changes into another commit, but:

In adding v5 cost model tests, I found that (a) this change here is unnecessary, and (b) I had a bug in one of my later changes (the one that makes v5 legal on amdgpu) that made the test fail. So I have added the tests.

tpr updated this revision to Diff 189577.Mar 6 2019, 1:33 PM
tpr marked an inline comment as done.

V3: Addressed review comment by widening illegal odd vectors in a
different way.

tpr marked an inline comment as done.Mar 6 2019, 1:35 PM
efriedma accepted this revision.Mar 6 2019, 3:49 PM

LGTM

This revision is now accepted and ready to land.Mar 6 2019, 3:49 PM
This revision was automatically updated to reflect the committed changes.