This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix broken FrameIndex handling
ClosedPublic

Authored by arsenm on Sep 7 2016, 7:40 PM.

Details

Summary

We were trying to avoid using a FrameIndex operand in non-pointer
operands in a convoluted way, and would break because of
using TargetFrameIndex. The TargetFrameIndex should only be used
in the case where it makes sense to fold it as part of the addressing
mode, otherwise it requires materialization like a normal constant.
This wasn't working reliably and failed in the added testcase, hitting
the assert when processing the frame index.

The TargetFrameIndex was coming from trying to produce an AssertZext
limiting the maximum stack size. I'm not sure this was correct to begin
with, because it is apparently possible to have a single workitem
dispatch that requires all 4G of private memory.

This mostly improves code but a few of the cases in
amdgpu.private-memory.ll are worse.

Diff Detail

Event Timeline

arsenm updated this revision to Diff 70637.Sep 7 2016, 7:40 PM
arsenm retitled this revision from to AMDGPU: Fix broken FrameIndex handling.
arsenm updated this object.
arsenm added a reviewer: tstellarAMD.
arsenm added a subscriber: llvm-commits.
arsenm updated this revision to Diff 70643.Sep 7 2016, 8:20 PM

Fix missing piece of patch and test changes that belong with later commit

arsenm added a comment.Sep 7 2016, 8:30 PM

Fix missing piece of patch and test changes that belong with later commit

Actually I think omitting the FrameIndex into addressing mode DAG folding and letting LocalStackSlotAllocation figure it out instead is where all of the improvements came from, so the original patch is probably better

ping, changing to folding frame indexes into memory instructions needs to be a separate patch

nhaehnle accepted this revision.Sep 17 2016, 7:04 AM
nhaehnle added a reviewer: nhaehnle.

LGTM

This revision is now accepted and ready to land.Sep 17 2016, 7:04 AM
arsenm closed this revision.Sep 17 2016, 9:18 AM

r281824