This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg
ClosedPublic

Authored by nhaehnle on Dec 2 2016, 7:27 AM.

Details

Summary

Without the fix to isFrameOffsetLegal to consider the instruction's
immediate offset, the new test case hits the corresponding assertion in
resolveFrameIndex, because the LocalStackSlotAllocation pass re-uses a
different base register.

With only the fix to isFrameOffsetLegal, code quality reduces in a bunch of
places because frame base registers are added where they're not needed.
This is addressed by properly implementing needsFrameBaseReg, which also
helps to avoid unnecessary zero frame indices in a bunch of other places.

Fixes piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test

Event Timeline

nhaehnle updated this revision to Diff 80069.Dec 2 2016, 7:27 AM
nhaehnle retitled this revision from to AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg.
nhaehnle updated this object.
nhaehnle added reviewers: arsenm, tstellarAMD.
nhaehnle added a subscriber: llvm-commits.
arsenm added inline comments.Dec 2 2016, 9:54 AM
lib/Target/AMDGPU/SIRegisterInfo.cpp
220

I was working on a patch to change this to just always true (or everything except an add/or instruction) because we can't directly access memory from any instruction. I'm still sort of confused about the relationship between needsFrameBaseReg and isFrameOffsetLegal

221–233

This looks the same as the same as what isFrameOffsetLegal is doing. Can you just call that from here?

319

I think the isMUBUF check should early exit before all of these other parts

test/CodeGen/AMDGPU/local-stack-slot-offset.ll
16–17

Can you come up with a testcase which doesn't use an enormous illegal vector type? These end up having really bad compile time and I've been meaning to fix the ones we already have

nhaehnle added inline comments.Dec 2 2016, 10:29 AM
lib/Target/AMDGPU/SIRegisterInfo.cpp
220

ARM does something in isFrameOffsetLegal based on whether the register is SP, but indeed maybe needFrameBaseReg could just be replaced by isFrameOffsetLegal with BaseReg == 0.

319

That's better for consistency, yes. isFrameOffsetLegal currently happens to be called only when needsFrameBaseReg has previously returned true.

test/CodeGen/AMDGPU/local-stack-slot-offset.ll
16–17

Hmm, maybe something with -promote-alloca?

arsenm added inline comments.Dec 2 2016, 10:41 AM
test/CodeGen/AMDGPU/local-stack-slot-offset.ll
16–17

Yes, most of our private memory tests need to use that

nhaehnle updated this revision to Diff 80577.Dec 7 2016, 5:12 AM
nhaehnle marked 2 inline comments as done.
nhaehnle edited edge metadata.
  • Test case using alloca
  • Unify getting the MUBUF instruction offset
arsenm accepted this revision.Dec 7 2016, 9:18 AM
arsenm edited edge metadata.

LGTM

This revision is now accepted and ready to land.Dec 7 2016, 9:18 AM
This revision was automatically updated to reflect the committed changes.