This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/R600: Implement loads from constant AS
ClosedPublic

Authored by jvesely on May 1 2016, 9:57 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

jvesely updated this revision to Diff 55781.May 1 2016, 9:57 PM
jvesely retitled this revision from to AMDGPU/R600: Implement loads from constant AS.
jvesely updated this object.
tstellarAMD added inline comments.May 2 2016, 7:30 AM
lib/Target/AMDGPU/R600ISelLowering.cpp
1067 ↗(On Diff #55781)

Why can't you lower this directly to an immediate instead of using the CONST_DATA_PTR node?

jvesely updated this revision to Diff 55950.May 2 2016, 11:10 PM
jvesely edited edge metadata.
jvesely set the repository for this revision to rL LLVM.
jvesely added a subscriber: llvm-commits.

rename IMM_CONST_ADDRESS -> IMM_GLOBAL_ADDRESS
add tests for D19790

jvesely added inline comments.May 2 2016, 11:19 PM
lib/Target/AMDGPU/R600ISelLowering.cpp
1067 ↗(On Diff #55781)

I read somewhere that conversion from GA to TGA needed some kind of wrapper pseudo instruction and CONST_DATA_PTR filled the spot.

I tried using:
Pat<
(globaladdress:$a)
(MOV_IMM_CONST_ADDR tglobaladdress:$a)>;
to eliminate Lowering conversion entirely but that produced:
%vreg1<def> = MOV_IMM_CONST_ADDR %vreg1; R600_Reg32:%vreg1
I guess conversions in patterns don't work.

I also tried using:
Pat<
(tglobaladdress:$a)
(MOV_IMM_CONST_ADDR tglobaladdress:$a)>;
and lower to TGA, but that hit assertions in a lot of places that expect register operands and the easiest way to convert TGA to a register was a wrapper node.
%vreg2<def> = ADD_INT 0, 0, 1, 0, 0, 0, <ga:@float_gv>, 0, 0, 0, -1, %vreg0, 0, 0, 0, -1, 1, pred:%PRED_SEL_OFF, 0, 0; R600_TReg32_X:%vreg2 R600_Reg32:%vreg0

tstellarAMD added inline comments.May 3 2016, 8:25 AM
lib/Target/AMDGPU/R600ISelLowering.cpp
1070 ↗(On Diff #55950)

You are correct about the wrapper, but what I'm saying is why even emit a GlobalAddress at all. Why not do return DAG.getConstant(0, MVT::i32), like we do when lowering LDS global addresses.

jvesely added inline comments.May 3 2016, 9:36 AM
lib/Target/AMDGPU/R600ISelLowering.cpp
1070 ↗(On Diff #55950)

the value is not zero in this case.
RO data is appended at the end of code section so we need that offset at least.
even if we used separate rodata section for r600 the address would be 0 only for the first global variable, the other variables would still need section relative addresses (that's what fixup in D19791 does).

LDS path computes section offset in lowering, which is OK for uninitialized variables (or if we emit initialization code somewhere). I don't think it'd work with const data, unless we force rodata layout to use the lowered addresses.
I think we can move the LDS approach to use targetglobaladdress and fixups, it would fix the TODO about minimizing wasted space in LDS.

tstellarAMD accepted this revision.May 6 2016, 5:13 PM
tstellarAMD edited edge metadata.

LGTM.

This revision is now accepted and ready to land.May 6 2016, 5:13 PM
jvesely updated this revision to Diff 57235.May 13 2016, 1:12 PM
jvesely edited edge metadata.
jvesely removed rL LLVM as the repository for this revision.

rebase on top of changed D19789

This revision was automatically updated to reflect the committed changes.
llvm/trunk/lib/Target/AMDGPU/R600Instructions.td