The base RA support infrastructure that only allow a specific register
class be allocated in RA pss. Since greedy RA, basic RA derived from
base RA, they all allow allocating specific register class. Fast RA
doesn't support allocating register for specific register class. This
patch is to enable ShouldAllocateClass in fast RA, so that it can
support allocating register for specific register class.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I thought I implemented this before. We do need to add some AMDGPU tests for this. @cdevadas recently noticed this was broken
It is great that you supported allocating specific register class before. Do you remember which lit test cast cover this feature? I'd would like to add similar test case for X86 target.
I think the idea of "split" special registers RA pass is good (especially for the registers who need to be config).
I notice all the "isVirtual<Register>" places need carefully to "exclude" the special registers.
Seems we may no need to do it if we can make sure that we handle these special registers before normal fast RA.
(because they have be allocated to physic registers)
Maybe there is some misunderstanding. This patch is to allow fast RA not only allocate all virtual register but also allocate specific register (e.g. tile register). Tile register is still allocated by fast RA, but it is allocated alone, not mixed with other register class. We can observe pass pipeline in O0-pipeline.ll.
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
1152 | I just take a look on this pass. This check seems a little duplicated with MO.isReg(). I think we just need to replace all the "MO.isReg()" with a function which do the check of ShouldAllocateRegister Some like: ShouldAllocateForMO(MO){ return MO.isReg() && ShouldAllocateRegister(MO.getReg()); } and keep your change in handleDebugValue (or unify the code with MO) |
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
1152 | Yes, this patch is to check the callback when allocate virtual registers, but I'm not sure it is a good idea to just add check the callback following MO.isReg(). First it may be physical register for a mahcine operand. It would crash if pass physical register to ShouldAllocateRegister(). Secondly unlikely RegAllocBase, the current infrastructure of RegAllocFast has much details in allocateInstruction(). |
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
1310 | It seems keeping !ShouldAllocateRegister(Reg) can catch the unexpected bug by line 1404 where tile register has been allocated to tile physical register but it is still virtual register. I can add an additional assert here. | |
1410 | It seems keeping !ShouldAllocateRegister(Reg) can catch the unexpected bug by line 1404 where tile register has been allocated to tile physical register but it is still virtual register. I can add an additional assert here. |
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
1169 | Are you handling here too ? |
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
1169 | For physical register it is not filtered. I think the pass won't do anything for physical registers given there is no virtual register for the corresponding register class. |
llvm/test/CodeGen/AMDGPU/collapse-endcf.ll | ||
---|---|---|
425 | Yes, this is the right thing. The buffer_load_dword (the VGPR spill) should appear at the block prolog as they are inserted during 2 RA pass. In the left-hand side they are incorrectly inserted during the 1 RA pass which is supposed to allocate only SGPRs. |
llvm/test/CodeGen/AMDGPU/indirect-addressing-term.ll | ||
---|---|---|
43 | That's right. VGPRs during the 2nd RA pass. |
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
1169 | LGTM, Just a flaw here: |
Looks like this breaks check-clang: http://45.33.8.238/linux/79438/step_7.txt
Please take a look and revert for now if it takes a while to fix.
On many of the official bots too, eg https://lab.llvm.org/buildbot/#/builders/109/builds/41178 (did that not generate an email?)
This also broke check-llvm on macOS: http://45.33.8.238/macm1/37938/step_11.txt (this both is an M1 mac, not sure if intel vs arm matter).
Please take a look at that too, and revert for now if it takes a while to fix.
(This failure was masked by the check-clang failure. But I locally bisected and verified that it's due to this change, and that locally reverting fixes the check-llvm failure.)
I don't have macOS environment, but from the log it seems some virtual register don't have a register class. Is there any full calltrace for the failure?
I also bisected an lldb failure down to this: https://lab.llvm.org/buildbot/#/builders/96/builds/25038/steps/6/logs/stdio
It has backtrace of a sort but it's probably not much help, I will get hold of the reproducer files.
Thank you! I guess it was caused by line 306, I'll try getRegBankOrNull() instead of getRegClass().
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
306 | I'll use getRegClassOrNull() to replace getRegClass(). |
I was able to duplicate this issue with --global-isel. I'll come up with a patch to fix it.
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
306 | No, getRegClass should be used here. Virtual registers must have a class at this point |
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
306 | In goable instruction selection is it possible there is no register class for a virtual register? |
llvm/lib/CodeGen/RegAllocFast.cpp | ||
---|---|---|
306 | Classes should have been set as part of the selection process |
Given that the bot's been red for quite a few hours by now and there's some disagreement about the right fix, I've reverted this in 851a5efe45a026047ba8c0262a892b9895e355bf for now.
clang-format not found in user’s local PATH; not linting file.