This is an archive of the discontinued LLVM Phabricator instance.

RegisterClassInfo: Fix CSR cache invalidation
ClosedPublic

Authored by MatzeB on Aug 17 2022, 4:21 PM.

Details

Summary

RegisterClassInfo caches information like allocation orders and reuses it for multiple machine functions where possible. However the MCPhysReg *CalleeSavedRegs field used to test whether the set of callee saved registers changed did not work: After D28566 MachineRegisterInfo::getCalleeSavedRegs() can return dynamically computed CSR sets that are only valid while the MachineRegisterInfo object of the current function exists.

This changes the code to make a copy of the CSR list instead of keeping a possibly invalid pointer around.

Diff Detail

Event Timeline

MatzeB created this revision.Aug 17 2022, 4:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2022, 4:21 PM
MatzeB requested review of this revision.Aug 17 2022, 4:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2022, 4:21 PM

Adding PowerPC and XCore test authors. I update the tests to match the new output, I'm not sure if they still test what they are intended to test. I would recommend rewriting the tests into .mir tests that run a single pass and do not depend on a particular register allocation if necessary!

For context: I found this while tweaking regalloc heuristics. Surprisingly this bug seems to have nearly no effect on most targets in practice. I suspect this is because calling conventions and CSR lists don't typically change for most targets and even if they did this only affected the allocation orders and things like eviction could restore the desired CSR usage.

mtrofin accepted this revision.Aug 17 2022, 4:32 PM
mtrofin added inline comments.
llvm/lib/CodeGen/RegisterClassInfo.cpp
58

how about:

bool CSRChanged = true;

if (!Update) {
  CSRChanged = false;
  ...
}
This revision is now accepted and ready to land.Aug 17 2022, 4:32 PM

Thanks for the pointer to the XCore test. There are a couple of issues:

(1) An extra stack slot is being allocated, leaving a slot unused. A register is being saved in high memory (bottom of stack) instead of near SP, even though XCore useFPForScavengingIndex() returns false, to save near SP. A extra constant appears in the constant pool, holding this large offset.

(2) When I delete the first function (f), the second function (ScavengeSlots) generates the same code as before this patch. So there seems to be some kind of new dependency on previous state, previous functions.

So I will look into this a bit more. Do you know where state might be affecting code generation?

Do you know where state might be affecting code generation?

I fix one instance of this with this very diff! RegisterClassInfo::CalleeSavedRegs is maintained between functions and before this diff the code sometimes incorrectly decided to keep the RegisterClassInfo data around where it should have been invalidated and recomputed.

This revision was automatically updated to reflect the committed changes.

A problem was revealed on XCore because in scavenging.ll CSR is different in the second function than the first: a different hard-coded list of registers, with the different register being the last entry. In the loop comparing CSR with LastCSR, the condition “CSR[I] == 0 || I >= LastSize” combines two cases. In the first case, you do indeed want to set change = (I != LastSize). But in the second case, the lists have different length, and the change flag should be true, even if I == LastSize. I guess I can't add a code comment because this is committed but my fix is:

if (CSR[I] == 0) {
  CSRChanged = I != LastSize;
  break;
}
if (I >= LastSize) {
  CSRChanged = true;
  break;
}

The XCore test now generates the same code it did before. I've done no other testing.

@MatzeB Do you agree with the problem in the logic of this diff, that I reported above? The code:

if (CSR[I] == 0 || I >= LastSize) {
    CSRChanged = I != LastSize;
    break;
  }

is wrong in the case CSR[I] != 0 && I == LastSize, i.e. when an entry in the new list is beyond the length of the old list (all previous entries having matched). In that case, the lists are different, and code should set CSRChanged = true. But the committed code sets CSRChanged false.

This means that incorrect expected test results have now been committed in scavenging.ll.

@nigelp-xmos indeed I got that wrong here, sorry. Thanks for checking, putting up a fix now.