This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Make i16 a legal type for VI subtargets
ClosedPublic

Authored by tstellarAMD on Mar 10 2016, 9:23 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

wdng updated this revision to Diff 50288.Mar 10 2016, 9:23 AM
wdng retitled this revision from to AMDGPU i16 implementation .
wdng updated this object.
wdng added reviewers: arsenm, tstellarAMD, cfang.
wdng set the repository for this revision to rL LLVM.
wdng added a project: Restricted Project.
wdng added a subscriber: wdng.
arsenm edited edge metadata.Mar 12 2016, 5:55 PM

All of these tests should go in the existing test files and be run on the other subtargets as well.

I also expected way more load and store test additions. We need to have extload tests for i8->i16, i16->i32, i16->i64, some of which probably exist already.

This patch should limit itself to the basic set of required operations on i16: load and store, add, sub, bitshifts, and conversions. Optimizations like min/max should come later.

lib/Target/AMDGPU/VIInstructions.td
170

This pattern is necessary. I believe I had a test for this in my original patch

178–195

Dead code should be removed

199

Should follow camel case naming convention

201

No spaces around the :s

204–205

Should be indented like other Pats in the file

210

This is incorrect if this is a scalar zext, which currently doesn't happen because there are no scalar i16 instructions (although we may want pseudos for these). To be consistent, this should use S_MOV_B32 to materialize the 0

240–241

These should be using the signed min/max. I also think the min/max matching should be a separate patch

246

These should be done in a separate patch

249–267

Dead code should be removed. These instruction's dont exist. However, tests should be added to the rotl/rotr/bswap/ctlz/cttz to make sure these are properly expanded when i16 is added as legal since most operations by default are assumed to be legal if the type is

This is also missing the required register info changes and SIISelLowering changes

This is also missing the required register info changes and SIISelLowering changes

arsenm edited edge metadata.Mar 12 2016, 5:57 PM
arsenm added a subscriber: llvm-commits.
wdng updated this revision to Diff 50775.Mar 15 2016, 3:10 PM
  1. global-extload-i16.ll contains extload tests for i16->i32, i16->i64. Test for load/store i8->i16 is added into file global-extload-i8.ll
  2. Dead codes have been removed.
  3. Fixed indentation issues.
  4. Fixed scalar zext, used S_MOV_B32 to materialize the 0
  5. Added tests for shl, sra, sub
  6. Removed min/max, which will be committed into another separate patch.
wdng updated this revision to Diff 51139.Mar 20 2016, 8:53 PM
wdng updated this object.
lib/Target/AMDGPU/SIISelLowering.cpp
264–268

Are these really necessary? I thought making a type legal marked these operations legal by default.

270–276

Same with these too. Are they really necessary?

1743–1756

Why does i16 needs special handling here. These seems to be nearly identical to the block of code directly below.

2022–2031

We do we need to custom lower i16 stores? Can't we just mark then as promote?

arsenm added inline comments.Mar 23 2016, 9:44 AM
lib/Target/AMDGPU/SIISelLowering.cpp
270–276

min/max need to be explicitly made legal, but that should be a separate patch. setcc should be promote for now until those are added later

arsenm added inline comments.Mar 23 2016, 9:45 AM
lib/Target/AMDGPU/SIISelLowering.cpp
65

I did this already, so this should check Subtarget->has16BitInsts()

arsenm added inline comments.Mar 23 2016, 10:09 AM
lib/Target/AMDGPU/SIISelLowering.cpp
2022–2031

Load/store promote expects an equal size type for a bitcast promote. This is the same problem that i1 has, so it should follow that example

wdng updated this revision to Diff 51797.Mar 28 2016, 9:00 AM
wdng updated this object.

Please temporarily ignore code changes in function LowerFDV32(). I will finish the code development for fp32 div in next patch.

Can you send a new patch without the FP32 Div change, so I can commit it.

wdng updated this revision to Diff 52514.Apr 3 2016, 2:22 PM
wdng updated this object.
wdng edited edge metadata.

Revert DFIV32 code changes. This patch contains i16 implementation only based on Tom / Matt comments.

I tried to apply this patch, but I get a lot of failing lit tests in test/CodeGen/AMDGPU. Do all the tests in this directory pass for you?

lib/Target/AMDGPU/SIISelLowering.cpp
65

Do we still need this comment?

1740

Does this need to be removed?

1748

Coding style. Variable names should start with a captial.

tstellarAMD commandeered this revision.Oct 17 2016, 2:11 PM
tstellarAMD edited reviewers, added: wdng; removed: tstellarAMD.
tstellarAMD retitled this revision from AMDGPU i16 implementation to AMDGPU i16 implementation.

Rebase on top of master.

tstellarAMD retitled this revision from AMDGPU i16 implementation to AMDGPU/SI: Make i16 a legal type for VI subtargets.Oct 17 2016, 2:13 PM
tstellarAMD updated this object.

LGTM overall with minor fixes:

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
600 ↗(On Diff #74904)

Subtarget->has16BitInsts()

611 ↗(On Diff #74904)

Subtarget->has16BitInsts()

2369 ↗(On Diff #74904)

Space after //. Capitalize.

2370 ↗(On Diff #74904)

Subtarget->has16BitInsts()

lib/Target/AMDGPU/SIISelLowering.cpp
37

Alphabetize

82

I do not think we need this comment

83

Subtarget->has16BitInsts()

229

Subtarget->has16BitInsts()

273

Detabify

275

Detabify

277

Detabify

279

Detabify

280

Remove extra new line

arsenm added inline comments.Oct 26 2016, 9:44 AM
lib/Target/AMDGPU/BUFInstructions.td
962–963 ↗(On Diff #74904)

Originally I was avoiding these by promoting the load return type and truncating. Is this the same or is there an advantage from known bits types of things knowing it is really a 32-bit extload?

lib/Target/AMDGPU/SIISelLowering.cpp
1916

Why is this part of the patch? This looks unrelated

3428–3429

Define on same line

tstellarAMD marked 22 inline comments as done.

Address review comments.

lib/Target/AMDGPU/BUFInstructions.td
962–963 ↗(On Diff #74904)

The main problem is that extload must be legal, so it's hard to custom lower all extloads. I attempted to do this, and I think it would require a lot of work in the backend and also the legalizer to make this work. I'm not sure it's worth the effort at this point.

arsenm added inline comments.Oct 27 2016, 1:53 PM
lib/Target/AMDGPU/BUFInstructions.td
960 ↗(On Diff #76087)

This can be refined to a has 16-bit predicate

lib/Target/AMDGPU/VOP1Instructions.td
614–618 ↗(On Diff #76087)

Commented out code

lib/Target/AMDGPU/VOP2Instructions.td
348 ↗(On Diff #76087)

This should only need to be around the actual defs, not the multiclasses

355 ↗(On Diff #76087)

I don't think you need to repeat the type in the output here

lib/Target/AMDGPU/VOP3Instructions.td
232 ↗(On Diff #76087)

Line wrapping

tstellarAMD marked 4 inline comments as done.

Address review comments.

arsenm accepted this revision.Oct 28 2016, 11:26 AM
arsenm edited edge metadata.

LGTM

This revision is now accepted and ready to land.Oct 28 2016, 11:26 AM
Closed by commit rL285939: AMDGPU: Add VI i16 support (authored by tstellar). · Explain WhyNov 3 2016, 10:23 AM
This revision was automatically updated to reflect the committed changes.