This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen][AArch64] Don't split jump table basic blocks
ClosedPublic

Authored by dhoekwater on Aug 4 2023, 11:45 AM.

Details

Summary

Jump tables on AArch64 are label-relative rather than table-relative, so
having jump table destinations that are in different sections causes
problems with relocation. Jump table lookups have a max range of 1MB, so
all destinations must be in the same section as the lookup code. Both of
these restrictions can be mitigated with some careful and complex logic,
but doing so doesn't gain a huge performance benefit.

Efficiently ensuring jump tables are correct and can be compressed on
AArch64 is a TODO item. In the meantime, don't split blocks that can
cause problems.

Diff Detail

Event Timeline

dhoekwater created this revision.Aug 4 2023, 11:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2023, 11:45 AM
dhoekwater requested review of this revision.Aug 4 2023, 11:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2023, 11:46 AM
dhoekwater updated this revision to Diff 548775.EditedAug 9 2023, 2:53 PM

Use an LLVM IR test instead of an MIR test, since those are easier to read. This removes the dependency on https://reviews.llvm.org/D157063.

dhoekwater updated this revision to Diff 548776.Aug 9 2023, 2:56 PM

Use the correct test this time.

dhoekwater edited the summary of this revision. (Show Details)Aug 9 2023, 3:01 PM

Thanks! Mostly LG, with one comment from a compile-time perspective.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

Mostly from a compile-time perspective: could this be precomputed per MachineFunction in the machine-function-splitter pass for AArch64?

Something like:

// Precompute 'JumpTableTargetsSet' once per machine function.
const MachineJumpTableInfo *MJTI = F->getJumpTableInfo();

SmallPtrSet<const MachineBasicBlock *, 16> JumpTableTargetsSet;

if (MJTI)
  for (const MachineJumpTableEntry &JT : MJTI->getJumpTables())
    for (MachineBasicBlock *MBB : JT.MBBs)
      JumpTableTargetsSet.insert(MBB);

As long as 'JumpTableTargetsSet' remains valid (i.e., not invalidated by any code transformation, like branch folding, etc.), each 'AArch64InstrInfo::isMBBSafeToSplitToCold' call could then use

auto isJumpTableTarget = [&](const MachineBasicBlock &MBB) {
  return JumpTableTargetsSet.count(&MBB);
};
llvm/test/CodeGen/AArch64/machine-function-splitter.ll
32–34 ↗(On Diff #548776)

For my understanding, is cold1 split out to nosplit_jumptable.cold before this patch?

63 ↗(On Diff #548776)

nit: if this is not used, maybe remove this?

dhoekwater marked 2 inline comments as done.

Move tests to Generic

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

Yeah, that would definitely be more efficient (O(n) worst case complexity in the number of blocks instead of O(n^2)).

However, I don't think this has a huge performance impact in real builds, and this logic only exists temporarily before I push changes that allow splitting jump table dispatch and targets. I'll collect a breakdown of how it impacts compile time.
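The trade-off being discussed can be sketched with plain STL containers (hypothetical stand-ins for the LLVM types, purely to illustrate the complexity argument):

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for MachineBasicBlock and the jump-table info,
// just to show the complexity argument: building a set of jump-table
// targets once per function is O(n) in the total number of entries, and
// each subsequent per-block query is O(1), instead of rescanning every
// jump table on every isMBBSafeToSplitToCold call.
using BlockId = int;
using JumpTable = std::vector<BlockId>;

std::unordered_set<BlockId>
precomputeTargets(const std::vector<JumpTable> &JTs) {
  std::unordered_set<BlockId> Targets;
  for (const JumpTable &JT : JTs)
    for (BlockId MBB : JT)
      Targets.insert(MBB); // one pass over all entries
  return Targets;
}

// O(1) per query once the set exists.
bool isJumpTableTarget(const std::unordered_set<BlockId> &Targets,
                       BlockId MBB) {
  return Targets.count(MBB) != 0;
}
```

The per-call scan in the patch as written is quadratic in the worst case but, as noted above, was measured to be cheap in practice for real builds.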

llvm/test/CodeGen/AArch64/machine-function-splitter.ll
32–34 ↗(On Diff #548776)

Yeah, cold1 and cold2 are split out to the cold section before this patch.

63 ↗(On Diff #548776)

Resolved in updated test.

snehasish added inline comments.Aug 24 2023, 9:05 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8427

IMO structuring the code with functors makes it hard to follow. Additionally, there are several cases where we could return false early instead, which would lower the compile time in practice.

I would prefer the code to be structured as follows:

bool AArch64InstrInfo::isMBBSafeToSplitToCold(
    const MachineBasicBlock &MBB) const {
  // Check if MBB is a jump table target.
  const MachineJumpTableInfo *MJTI = MBB.getParent()->getJumpTableInfo();
  bool isJumpTableTarget =
      MJTI != nullptr &&
      llvm::any_of(MJTI->getJumpTables(),
                   [&MBB](const MachineJumpTableEntry &JTE) {
                     return llvm::find(JTE.MBBs, &MBB) != JTE.MBBs.end();
                   });
  if (isJumpTableTarget)
    return false;

  // Check if MBB contains a jump table lookup.
  for (const MachineInstr &MI : MBB) {
    switch (MI.getOpcode()) {
    case TargetOpcode::G_BRJT:
    case AArch64::JumpTableDest32:
    case AArch64::JumpTableDest16:
    case AArch64::JumpTableDest8:
      return false;
    default:
      continue;
    }
  }
  // All checks passed.
  return true;
}

What do you think?

dhoekwater marked 2 inline comments as done.

Refactor isMBBSafeToSplitToCold to be more readable.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

I just timed the backend compile. Although I don't have per-pass numbers, enabling function splitting only increases the build time of a large binary by 0.15%.

@mingmingl @snehasish Thoughts on this?

8427

That sounds pretty reasonable to me; it's definitely an easier read.

dhoekwater added inline comments.Aug 24 2023, 4:17 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8427

I refactored the function per your feedback, but I reworked and kept one lambda. It follows the common pattern of defining a lambda and then using it in an STL algorithm on the next line as its only use, which I think is pretty readable.
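That pattern (a named lambda consumed once by an STL algorithm on the following line) looks roughly like this in generic C++ (illustrative only, not the actual patch; the names here are made up):

```cpp
#include <algorithm>
#include <vector>

// Illustrative only: a predicate lambda defined immediately before its
// single use. The name documents intent, and the explicit capture list
// ([&Needle]) makes the lambda's dependencies visible at a glance.
bool anyTableContains(const std::vector<std::vector<int>> &Tables,
                      int Needle) {
  auto ContainsNeedle = [&Needle](const std::vector<int> &T) {
    return std::find(T.begin(), T.end(), Needle) != T.end();
  };
  return std::any_of(Tables.begin(), Tables.end(), ContainsNeedle);
}
```

Limiting the capture list to exactly what the lambda reads, as suggested later in the thread, keeps accidental captures from creeping in.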

snehasish added inline comments.Aug 24 2023, 4:28 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

Seems fine in the interim while we wait for jump table specific fixes to land.

8427

Agree, looks good. Maybe limit the capture in the lambda to just &MBB to make it clearer?

llvm/test/CodeGen/Generic/machine-function-splitter.ll
521

Do we need another test case when the MBB contains a jump table lookup?

dhoekwater marked an inline comment as done.

Reduce the scope of the lambda capture and add a test to make sure jump table dispatch isn't split on AArch64

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8427

Done!

dhoekwater marked an inline comment as done.Aug 28 2023, 12:08 PM
dhoekwater added inline comments.
llvm/test/CodeGen/Generic/machine-function-splitter.ll
521

Good call; done.

dhoekwater marked an inline comment as done.Aug 28 2023, 12:08 PM

Rebase onto main to reduce diff

This revision is now accepted and ready to land.Aug 28 2023, 12:15 PM
aeubanks added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

0.15% is a fairly large regression. do you mean that enabling function splitting on its own regresses compile times compared to not enabling it, or that this patch regresses function splitting builds by 0.15%?

dhoekwater marked an inline comment as done.Aug 28 2023, 1:43 PM
dhoekwater added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

Enabling function splitting for an AArch64 target makes the build 0.15% slower than leaving it disabled. This patch has no impact on compile times of builds that don't use machine function splitting for AArch64.

I'll collect the impact of just this patch to see its individual impact.

aeubanks added inline comments.Aug 28 2023, 1:49 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

ah ok that sounds reasonable

snehasish added inline comments.Aug 28 2023, 1:49 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

I believe that builds which enable function splitting on AArch64 can break without this fix. So it seemed reasonable to me to submit a fix while we wait for a more optimized approach to land.

Also @aeubanks do we have some guidelines on compile time increases across configurations that we can refer to in the future? Thanks!

aeubanks added a subscriber: nikic.Aug 28 2023, 1:55 PM
aeubanks added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

+@nikic
I don't think we have anything official, but we do yell at people who noticeably regress compile times, especially if it shows up on https://llvm-compile-time-tracker.com/ as red. (although I believe that only tracks x86-64 for now)
One 0.15% regression might be fine, but they add up over time so we'd really like to not introduce noticeable regressions when possible. We probably should have a document somewhere that talks about acceptable compile time tradeoffs.

dhoekwater marked 3 inline comments as done.Aug 28 2023, 2:41 PM
dhoekwater added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8418–8425

Builds which enable function splitting on AArch64 without this fix can break unless they disable jump tables. For AArch64 MFS builds, compiling with this fix takes virtually the same time (technically 0.07% faster) as compiling with jump tables disabled.

This revision was automatically updated to reflect the committed changes.
dhoekwater marked an inline comment as done.