This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Restrict soft clause bundling at half of the available regs
ClosedPublic

Authored by arsenm on Feb 10 2021, 4:44 PM.

Details

Summary

Fixes a testcase that was overcommitting large register tuples to a
bundle, which the register allocator could not possibly satisfy. This
was producing a bundle which used nearly all of the available SGPRs
with a series of 16-dword loads (not all of which are freely available
to use).

This is a quick hack for some deeper issues with how the clause
bundler tracks register pressure.

Overall the pressure tracking used here doesn't make sense and is too
imprecise for what it needs to avoid the allocator failing. The
pressure estimate does not account for the alignment requirements of
large SGPR tuples, so this was really underestimating the pressure
impact. This also ignores the impact of the extended live range of the
use registers after the bundle is introduced. Additionally, it didn't
account for some wide tuples not being available due to reserved
registers.

This regresses a few cases. These end up introducing more
spilling. This is also a function of the global pressure being used in
the decision to bundle, not the local pressure impact of the bundle
itself.

Diff Detail

Event Timeline

arsenm created this revision.Feb 10 2021, 4:44 PM
arsenm requested review of this revision.Feb 10 2021, 4:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2021, 4:44 PM
Herald added a subscriber: wdng. · View Herald Transcript

I agree RPT needs improvements around tuples, so the w/a seems reasonable for now. I am concerned about huge tests though.

llvm/test/CodeGen/AMDGPU/limit-soft-clause-reg-pressure.mir
2

Do you really need a test so huge? We will regenerate it very often. At the very least you could just check something small, you only need to make sure it compiles.

arsenm added inline comments.Feb 10 2021, 5:05 PM
llvm/test/CodeGen/AMDGPU/limit-soft-clause-reg-pressure.mir
2

I thought I did that but I guess I lost that revision

arsenm updated this revision to Diff 322975.Feb 11 2021, 6:00 AM

Trim test checks

rampitec added inline comments.Feb 11 2021, 9:22 AM
llvm/test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll
6

What about this one? Do you really need full checks to be generated?

arsenm updated this revision to Diff 323074.Feb 11 2021, 10:42 AM

Fewer checks

This revision is now accepted and ready to land.Feb 11 2021, 10:46 AM