This is an archive of the discontinued LLVM Phabricator instance.

[WIP][Allocator] Fix the spill of status register nzcv
AbandonedPublic

Authored by Allen on Dec 27 2021, 5:43 AM.

Details

Diff Detail

Event Timeline

Allen created this revision.Dec 27 2021, 5:43 AM
Allen requested review of this revision.Dec 27 2021, 5:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 27 2021, 5:43 AM
Allen updated this revision to Diff 396803.Dec 31 2021, 7:57 AM
Allen added a reviewer: nikic.

Change the code format

Allen added a comment.Dec 31 2021, 8:00 AM

Fix the crash as the register nzcv is copied to gpr64 with different size.
In fact, the size of register nzcv is 64-bit, and it is legal.

This is missing description.
Test looks huge, and i don't see a FileCheck there.

Allen added a comment.EditedDec 31 2021, 8:25 AM

This is missing description.
Test looks huge, and i don't see a FileCheck there.

As the origin issue will bring in crash , so I only use this case to check it can pass the compile.
the issue related to register spill, I already try to simplify the test case , but it is hard to get a smart one, do you have some advice ?

the crash related to the following line #L4211-L4213

if (DstMO.getSubReg() == 0 && SrcMO.getSubReg() == 0) {
  assert(TRI.getRegSizeInBits(*getRegClass(DstReg)) ==
             TRI.getRegSizeInBits(*getRegClass(SrcReg)) &&
         "Mismatched register size in non subreg COPY");
nikic resigned from this revision.Jan 3 2022, 1:19 AM
nikic edited reviewers, added: dmgreen, t.p.northover; removed: nikic.

Sorry, not familiar with this code.

Thanks for the patch. As others have said, this test looks a lot larger than needed. I would suspect that the right combination of COPY's to and from nzcv, with something between to force the register to spill should be enough. And that way the test can hopefully be made more reliable - more likely to keep testing spilling nzcv as other code in the backend changes.

I don't think we really support spilling nzcv though. The usual way this is handled is to re-compute nzcv, not to spill and reload it. I would guess that something in DAG combine is re-using the output of CCMP, not creating a new one. Newer CPUs deal with moving to/from nzcv a lot better than they used to, but older cpus can be quite slow as far as I understand.

If I do generate output for the test case you have here it looks like this:

; CHECK-NEXT:   dead $xzr = SUBSXri [[LDRXui2]], 0, 0, implicit-def $nzcv                                                                        
; CHECK-NEXT:   CCMPWi %56.sub_32, 0, 4, 1, implicit-def $nzcv, implicit $nzcv                                                                   
; CHECK-NEXT:   STRXui $nzcv, %stack.1, 0 :: (store (s64) into %stack.1)
...
; CHECK-NEXT:   $nzcv = LDRXui %stack.1, 0 :: (load (s64) from %stack.1)

Those loads/stores are not valid. It shows the importance to have decent CHECK lines.

Allen added a comment.EditedJan 3 2022, 5:31 PM

Thanks for the patch. As others have said, this test looks a lot larger than needed. I would suspect that the right combination of COPY's to and from nzcv, with something between to force the register to spill should be enough. And that way the test can hopefully be made more reliable - more likely to keep testing spilling nzcv as other code in the backend changes.

I don't think we really support spilling nzcv though. The usual way this is handled is to re-compute nzcv, not to spill and reload it. I would guess that something in DAG combine is re-using the output of CCMP, not creating a new one. Newer CPUs deal with moving to/from nzcv a lot better than they used to, but older cpus can be quite slow as far as I understand.

If I do generate output for the test case you have here it looks like this:

; CHECK-NEXT:   dead $xzr = SUBSXri [[LDRXui2]], 0, 0, implicit-def $nzcv                                                                        
; CHECK-NEXT:   CCMPWi %56.sub_32, 0, 4, 1, implicit-def $nzcv, implicit $nzcv                                                                   
; CHECK-NEXT:   STRXui $nzcv, %stack.1, 0 :: (store (s64) into %stack.1)
...
; CHECK-NEXT:   $nzcv = LDRXui %stack.1, 0 :: (load (s64) from %stack.1)

Those loads/stores are not valid. It shows the importance to have decent CHECK lines.

Done, added checking

Allen updated this revision to Diff 398820.Jan 10 2022, 8:01 PM

Add CHECK according review

dmgreen added inline comments.Jan 11 2022, 12:38 AM
llvm/test/CodeGen/MIR/AArch64/greedy-spill-nzcv.mir
25

My point is that this isn't a valid instruction (I'm pretty sure, correct me if I'm wrong). There is no AArch64 instruction that loads nzcv directly from the stack.

I think we need to stop the code from getting to this point, by preventing the DAG Combine from making overlapping nzcv ranges, in this case by producing two CCMPWi's for each of the places that they get used.

Allen retitled this revision from [Huawei][Allocator] Fix the spill of status register nzcv to [WIP][Allocator] Fix the spill of status register nzcv.Mar 5 2022, 2:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2022, 2:29 AM
Allen abandoned this revision.Nov 1 2022, 4:22 AM

The issue disapprear base on commit 66b83088

Allen added a comment.Nov 1 2022, 5:17 AM

it seems fixed by commit 163c77b2e (https://reviews.llvm.org/D127294)

AArch64InstrInfo.cpp:4451: virtual llvm::MachineInstr* llvm::AArch64InstrInfo::foldMemoryOperandImpl(llvm::MachineFunction&, llvm::MachineInstr&, llvm::ArrayRef<unsigned int>, llvm::MachineBasicBlock::iterator, int, llvm::LiveIntervals*, llvm::VirtRegMap*) const: Assertion `TRI.getRegSizeInBits(*getRegClass(DstReg)) == TRI.getRegSizeInBits(*getRegClass(SrcReg)) && "Mismatched register size in non subreg COPY"' failed