The user-specified shader inputs were not counted towards the total
user SGPR count, so it was possible to request more than the available
maximum, and to report fewer registers than were actually used.
Details
- Reviewers
• tstellarAMD mareko nhaehnle
Event Timeline
I'm not sure I understand the design of addArgUserReg. Is the CurReg parameter supposed to be correct or not? If yes, why is there a separate return value? If not, why is it there in the first place? It also seems like the alignment of SGPRs could just be done by rounding up numerically instead of with a loop.
It's supposed to ensure that the number of used registers stays consistent with the register reported by the generated calling convention (there was supposed to be an assert in here). I was trying to avoid assumptions based on the value of the register enums. It's kind of gross, but I'm not sure of a better way to express the connection between the two separate ways the calling convention registers are tracked.
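As an illustration of the round-up suggestion above: with a power-of-two alignment, the next SGPR index can be computed directly rather than by stepping one register at a time. The helper below is a hypothetical sketch, not code from the patch; LLVM's alignTo (llvm/Support/MathExtras.h) does essentially the same thing.

  // Hypothetical helper: round the next SGPR index up to a power-of-two
  // alignment instead of advancing one register at a time.
  unsigned alignNextSGPR(unsigned NextSGPR, unsigned Align) {
    return (NextSGPR + Align - 1) & ~(Align - 1);
  }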
s/UserSGPR/InputSGPR/
s/UserReg/InputReg/
The LLVM backend can't deduce the number of user SGPRs, because Mesa doesn't supply that information. LLVM only receives the list of all input SGPRs. Some of them come from USER DATA, others are preloaded by the hardware based on other states, etc.
I found a problem with the generated input registers. If you have a case like
(<3 x i32> inreg, i64 inreg, i32 inreg), it correctly selects s0, s1, s2 for the vector and s[4:5] for the i64, and the final i32 picks up s3 from the alignment gap between the vector and the i64. I'm guessing that changing the order this way will break something.
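For clarity, a standalone sketch (not the backend's calling-convention code) of first-free-register assignment with alignment, which reproduces the layout described above: the <3 x i32> takes s0-s2, the i64 is aligned up to s[4:5], and the trailing i32 back-fills s3 from the gap. All names here are illustrative.

  #include <array>
  #include <cstdio>

  int main() {
    std::array<bool, 16> Used{}; // which SGPRs have been handed out

    // Allocate NumRegs consecutive SGPRs starting at a multiple of Align;
    // returns the first register index, or ~0u if none fit.
    auto Alloc = [&Used](unsigned NumRegs, unsigned Align) {
      for (unsigned R = 0; R + NumRegs <= Used.size(); R += Align) {
        bool Free = true;
        for (unsigned I = 0; I != NumRegs; ++I)
          Free &= !Used[R + I];
        if (!Free)
          continue;
        for (unsigned I = 0; I != NumRegs; ++I)
          Used[R + I] = true;
        return R;
      }
      return ~0u;
    };

    unsigned Vec = Alloc(3, 1); // s0, s1, s2
    unsigned I64 = Alloc(2, 2); // s[4:5]; the even-aligned pair s[2:3] is blocked by s2
    unsigned I32 = Alloc(1, 1); // s3, back-filled from the alignment gap
    std::printf("vector at s%u, i64 at s[%u:%u], i32 at s%u\n",
                Vec, I64, I64 + 1, I32);
    return 0;
  }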