This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix not counting shader input registers
Abandoned Public

Authored by arsenm on May 3 2016, 4:58 PM.

Details

Summary

The user-specified shader inputs were not counted towards
the total user SGPR count, so it was possible to request
more registers than are available, as well as to report
fewer registers than are actually used.

Event Timeline

arsenm updated this revision to Diff 56081. May 3 2016, 4:58 PM
arsenm retitled this revision to AMDGPU: Fix not counting shader input registers.
arsenm updated this object.
arsenm added a subscriber: llvm-commits.
arsenm updated this revision to Diff 56085. May 3 2016, 6:14 PM

Make sure total register usage is at least the number of inputs
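A minimal sketch of the idea in this update, not the code from the diff: the reported SGPR count is clamped so it never drops below the number of input SGPRs, and the result is checked against a limit. The names `SGPRUsage`, `reportedSGPRs`, `fitsLimit`, and the `MaxSGPRs` parameter are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of a shader's SGPR usage (illustration only).
struct SGPRUsage {
  unsigned NumInputSGPRs; // registers occupied by shader inputs
  unsigned NumBodySGPRs;  // registers the function body was found to use
};

// The reported total is at least the number of input registers, even if
// the body itself never touches some of them.
unsigned reportedSGPRs(const SGPRUsage &U) {
  return std::max(U.NumInputSGPRs, U.NumBodySGPRs);
}

// Returns true if the reported total fits within the given limit.
// MaxSGPRs is an assumed parameter; the real limit is target-dependent.
bool fitsLimit(const SGPRUsage &U, unsigned MaxSGPRs) {
  assert(MaxSGPRs > 0 && "limit must be positive");
  return reportedSGPRs(U) <= MaxSGPRs;
}
```

With this shape, a shader that declares 12 input SGPRs but only reads 4 of them would still report 12.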

nhaehnle edited edge metadata. May 5 2016, 10:37 AM

I'm not sure I understand the design of addArgUserReg. Is the CurReg parameter supposed to be correct or not? If yes, why is there a separate return value? If not, why is it there in the first place? It also seems like the alignment of SGPRs could just be done by a numerical rounding up instead of a loop.
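To illustrate the rounding suggestion, here is a hedged sketch comparing a loop-based alignment with a purely numerical one. Both helpers are hypothetical and operate on plain register indices, which is exactly the assumption about register numbering that the patch author may want to avoid. LLVM's `alignTo` helper in `llvm/Support/MathExtras.h` performs the same kind of rounding.

```cpp
#include <cassert>

// Hypothetical helper: round an SGPR index up to the next multiple of Align.
// Align is the tuple alignment in 32-bit registers (e.g. 2 for a 64-bit pair).
constexpr unsigned alignSGPRIndex(unsigned Index, unsigned Align) {
  return (Index + Align - 1) / Align * Align;
}

// Loop-based equivalent: step forward one register at a time until aligned.
unsigned alignSGPRIndexLoop(unsigned Index, unsigned Align) {
  assert(Align != 0 && "alignment must be non-zero");
  while (Index % Align != 0)
    ++Index;
  return Index;
}
```

For example, index 3 with an alignment of 2 rounds up to 4 in both versions, while an already aligned index is returned unchanged.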

arsenm added a comment. May 5 2016, 3:17 PM

> I'm not sure I understand the design of addArgUserReg. Is the CurReg parameter supposed to be correct or not? If yes, why is there a separate return value? If not, why is it there in the first place? It also seems like the alignment of SGPRs could just be done by a numerical rounding up instead of a loop.

It's supposed to ensure that the number of used registers is consistent with the registers reported by the generated calling convention (there was supposed to be an assert in here). I was trying to avoid assumptions based on the value of the register enums. It's kind of gross, but I'm not sure of a better way to express the connection between the two separate ways the calling convention registers are tracked.

mareko edited edge metadata. May 6 2016, 2:59 AM

s/UserSGPR/InputSGPR/
s/UserReg/InputReg/

The LLVM backend can't deduce the number of user SGPRs, because Mesa doesn't supply that information. LLVM only receives the list of all input SGPRs. Some of them come from USER DATA, others are preloaded by the hardware based on other states, etc.

arsenm added a comment. May 6 2016, 6:08 PM

I found a problem with the generated input registers. If you have a case like
(<3 x i32> inreg, i64 inreg, i32 inreg), it correctly selects s0, s1, s2 for the vector, s[4:5] for the i64, and the final i32 picks the s3 in the alignment gap between the vector and i64. I'm guessing that changing the order this way will break something.
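The assignment described above can be reproduced with a small first-fit model. This is a sketch under assumptions, not the generated calling-convention code: each argument takes the lowest free, suitably aligned register range, and the vector is treated as three separate one-register pieces. `ArgSlot` and `assignFirstFit` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one argument piece: how many consecutive
// 32-bit SGPRs it needs and the alignment of its first register.
struct ArgSlot {
  unsigned NumRegs;
  unsigned Align;
};

// First-fit model: each piece takes the lowest aligned range that is
// entirely free. Returns the first register index chosen for each piece.
std::vector<unsigned> assignFirstFit(const std::vector<ArgSlot> &Args) {
  std::vector<bool> Used;
  std::vector<unsigned> Starts;
  for (const ArgSlot &A : Args) {
    assert(A.NumRegs != 0 && A.Align != 0);
    unsigned Base = 0;
    for (;; Base += A.Align) {
      // Grow the occupancy map so the probed range is always in bounds.
      if (Used.size() < Base + A.NumRegs)
        Used.resize(Base + A.NumRegs, false);
      bool Free = true;
      for (unsigned I = 0; I < A.NumRegs; ++I)
        if (Used[Base + I])
          Free = false;
      if (Free)
        break;
    }
    for (unsigned I = 0; I < A.NumRegs; ++I)
      Used[Base + I] = true;
    Starts.push_back(Base);
  }
  return Starts;
}
```

Feeding it three one-register pieces, one aligned pair, and a final single register yields start indices 0, 1, 2, 4, 3: the last i32 lands in the gap at s3, so the registers no longer follow argument order.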

nhaehnle resigned from this revision. Feb 21 2018, 8:42 AM
arsenm abandoned this revision. Apr 5 2020, 7:39 AM