Makes the SRC_(SHARED|PRIVATE)_(BASE|LIMIT) registers 64 bit instead of 32.
They're still usable as 32 bit operands by using the _LO suffix.
Preparation for D137542
Differential D137767
[AMDGPU] Make aperture registers 64 bit Pierre-vh on Nov 10 2022, 12:17 AM. Authored by
Details
Makes the SRC_(SHARED|PRIVATE)_(BASE|LIMIT) registers 64 bit instead of 32. Preparation for D137542
Diff Detail
Event TimelineComment Actions Can't this register still be used as a 32 bit operand ? (even if it's bugged) Then we need it for proper assembly/disassembly I believe, no? Comment Actions OK, seems reasonable. I see we already have some asm/dis tests for 64-bit uses of these sources, e.g. llvm/test/MC/AMDGPU/literals.s tests s_and_b64 s[0:1], s[0:1], src_shared_base. How did that work? Should it work differently now? Comment Actions I do remember struggling to define a register without sub registers, and it eventually blowing up the number of defined register classes Comment Actions I'd rather not define a separate 32-bit version, other than _LO and _HI are probably necessary. Do we already have assembler tests for this? Comment Actions Comments I wasn't able to do this without creating the 16 bit subregisters. Comment Actions I think the end result is the same but internally it's different. Before, it looks like SRC_SHARED_BASE was used for both 64 bit and 32 bit operand. Maybe as a hack? decodeSpecialReg64 was returning those aperture registers even though they're 32 bits. Comment Actions Note: I can't use COPY in D137542 because otherwise it'll "simplify" it and we end up with instructions that use the _HI register variants - which shouldn't be there and are just here because TableGen needs them/I get hundreds of crashes without it. It seems to assume there'll be sub1 for every reg in the class. Comment Actions Ah the class restriction was it! I fixed it. I got confused and was using SREG_64 (includes aperture register) for the COPY dst register, but if I use SGPR (doesn't include it) then copy coalescing won't mess it up Comment Actions But naturally it should be SReg_64. It should be valid to copy to VCC, and used directly as a SSrc_b64/VSrc_b64 operand. You may want an SReg_64 variant that excludes these
Comment Actions In D137542 you said I shouldn't set a register class. If I add a register class, how will it help? Should I add the new class & keep using the constrain? Comment Actions The complaint was mostly about where you set the class on the virtual register, during legalize and not during the select of the copy. Here I mean you may want a different register class to represent the use restriction where you want everything else, but to exclude the aperture regs |
The whole point of reserveRegisterTuples is you don't need to handle each individual subregister