Page MenuHomePhabricator

AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions for HSA
ClosedPublic

Authored by tstellarAMD on Dec 22 2015, 7:42 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

tstellarAMD retitled this revision from to AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions for HSA.
tstellarAMD updated this object.
tstellarAMD added reviewers: arsenm, cfang.
tstellarAMD added a subscriber: llvm-commits.
cfang added inline comments.Dec 23 2015, 12:33 PM
lib/Target/AMDGPU/AMDGPUInstructions.td
391 ↗(On Diff #43502)

I think this def should be untouched because it is used to FLAT for real flat address.

lib/Target/AMDGPU/CIInstructions.td
265 ↗(On Diff #43502)

Do you think we should use "mubuf_load" here? With FlatForGlobal ON, we actually disabled SelectMUBUF. As a result, we should select FLAT for mubuf here.

lib/Target/AMDGPU/SIInstrInfo.td
149 ↗(On Diff #43502)

This def here is a little confusion because it may have side effect for the general case where FlatForGlobal is OFF.

arsenm added inline comments.Dec 23 2015, 1:18 PM
lib/Target/AMDGPU/CIInstructions.td
259–262 ↗(On Diff #43502)

Do constant extloads need to be handled also?

test/CodeGen/AMDGPU/salu-to-valu.ll
24 ↗(On Diff #43502)

HSA missing 2nd load

lib/Target/AMDGPU/AMDGPUInstructions.td
391 ↗(On Diff #43502)

I didn't remove this, I just moved it into SIInstrInfo.td. It's a GCN specific definition, so I think that is a better place for it.

lib/Target/AMDGPU/CIInstructions.td
259–262 ↗(On Diff #43502)

Yes.

265 ↗(On Diff #43502)

flat_load matches an additional address space "FlatAddress".

lib/Target/AMDGPU/SIInstrInfo.td
149 ↗(On Diff #43502)

I'm not sure what you mean here.

cfang added inline comments.Dec 23 2015, 2:43 PM
lib/Target/AMDGPU/AMDGPUInstructions.td
391 ↗(On Diff #43502)

The relocation is fine. But I can not understand the redefinition of flat_load.

lib/Target/AMDGPU/CIInstructions.td
265 ↗(On Diff #43502)

Do you think FlatAddress should be matched here?
I think we already have FlatAddress matching above in the same file:
def : FLATLoad_Pattern <FLAT_LOAD_DWORD, i32, flat_load>;

lib/Target/AMDGPU/SIInstrInfo.td
149 ↗(On Diff #43502)

If isGlobalLoad is true, then flat_load is true. This statement is valid only when FlatForGlobal is set.

lib/Target/AMDGPU/CIInstructions.td
259–262 ↗(On Diff #43502)

Actually, constant extloads already work. I've added test cases for this.

Update test cases.

lib/Target/AMDGPU/AMDGPUInstructions.td
391 ↗(On Diff #43559)

The new definition will match loads in the global and constant address space. The previous definition only matched loads in the flat ("generic" in OpenCL) address space.

lib/Target/AMDGPU/CIInstructions.td
265 ↗(On Diff #43559)

Those patterns are redundant now. I will remove them.

lib/Target/AMDGPU/SIInstrInfo.td
149 ↗(On Diff #43559)

These functions only look at the the address space of the load instruction. FlatForGlobal has no effect on them.

lib/Target/AMDGPU/CIInstructions.td
265 ↗(On Diff #43559)

I looked at this more closely and they actually aren't redundant, because they match flat address space on Kaveri without HSA.

cfang added inline comments.Dec 23 2015, 3:48 PM
lib/Target/AMDGPU/CIInstructions.td
265 ↗(On Diff #43559)

what was redundant is here:
let Predicates = [useFlatForGlobal] in {
...
def : FlatLoadPat <FLAT_LOAD_DWORD, flat_load, i32>;
...
}
I think you should not use "flat_load" here, which including FlatAddress. Instead, you should use "mubuf_load"

Split flat pattern re-organization into a separate change.

arsenm accepted this revision.Dec 23 2015, 8:52 PM
arsenm edited edge metadata.

LGTM

lib/Target/AMDGPU/SIInstrInfo.td
147–148 ↗(On Diff #43573)

global should be tested first since it is more common

This revision is now accepted and ready to land.Dec 23 2015, 8:52 PM
This revision was automatically updated to reflect the committed changes.