This is an archive of the discontinued LLVM Phabricator instance.

Improve and enable folding of conditional branches with tail calls.
ClosedPublic

Authored by goldstein.w.n on Jan 3 2023, 4:58 PM.

Details

Summary

Improve and enable folding of conditional branches with tail calls.

  1. Make it so that conditional tail calls can be emitted even when there are multiple predecessors.
  1. Don't guard the transformation behind -Os. The rationale for guarding it was static-prediction can be affected by whether the branch is forward of backward. This is no longer true for almost any X86 cpus (anything newer than SnB) so is no longer a meaningful concern.

Diff Detail

Event Timeline

goldstein.w.n created this revision.Jan 3 2023, 4:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2023, 4:58 PM
goldstein.w.n requested review of this revision.Jan 3 2023, 4:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2023, 4:58 PM

@hans Added you b.c you seem to the original author. I didn't quite understand the rationale behind leaving this off given that:

  1. replaceBranchWithTailCall already keep the condition constant L3020: if (CC != BranchCond[0].getImm())
  2. The entire transform is guarded behind MBB == TBB.

I'm not sure if we can arbitrarily replace jmp foo with jcc foo. At least jmp has a large range (64-bit) scope which supports code size > 2GB. Maybe we should limit it to kernel/small code model?

llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

We do we need to add it if it's unused? Do not put unrelated/useless code in the same patch. It obstructs people to understand the patch.

goldstein.w.n added a comment.EditedJan 4 2023, 12:04 AM

I'm not sure if we can arbitrarily replace jmp foo with jcc foo. At least jmp has a large range (64-bit) scope which supports code size > 2GB. Maybe we should limit it to kernel/small code model?

Can it? https://www.felixcloutier.com/x86/jmp I though rel32 was maximum imm encoding.

But in canMakeTailCallConditional we have:

if (TailCall.getOpcode() != X86::TCRETURNdi &&
    TailCall.getOpcode() != X86::TCRETURNdi64) {
  // Only direct calls can be done with a conditional branch.
  return false;
}

We could limit X86::TCRETURNdi64 with kernel/small-code model. Although this is done after
reg alloc and only within function so we could only do if function has less than 2^32 / 16 instructions
and be safe.

Edit:
The stuff about in function is wrong.

goldstein.w.n added inline comments.Jan 4 2023, 12:11 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

Its not needed. The reason I added it was b.c the rationale previously for keeping this optimization guarded behind -Os was that if it changed branch direction + reordered CFG it could be a de-optimization. That seems like a reasonable concern so thought it would be good to have in the generic API.

A more appropriate comment might have been: "At the moment never change branch direction".

Can drop if your prefer. Let me know for V2.

I'm not sure if we can arbitrarily replace jmp foo with jcc foo. At least jmp has a large range (64-bit) scope which supports code size > 2GB. Maybe we should limit it to kernel/small code model?

Can it? https://www.felixcloutier.com/x86/jmp I though rel32 was maximum imm encoding.

But in canMakeTailCallConditional we have:

if (TailCall.getOpcode() != X86::TCRETURNdi &&
    TailCall.getOpcode() != X86::TCRETURNdi64) {
  // Only direct calls can be done with a conditional branch.
  return false;
}

We could limit X86::TCRETURNdi64 with kernel/small-code model. Although this is done after
reg alloc and only within function so we could only do if function has less than 2^32 / 16 instructions
and be safe.

Edit:
The stuff about in function is wrong.

I didn't look into the details in canMakeTailCallConditional. Seems we have checked for several cases. So this is not a fresh question, or maybe even not a problem. At least, it's not specific to this patch.

llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

I'm still confused by the intention of this flag. If you believe it's a reasonable concern, why can we do it by default in this patch? The flag doesn't prevent from doing it.
OTOH, if we want make target specific decision, should we simply add the code

if (MayChangeDirection)
  return false;

to other targets (if they override it)? So we don't need to pass MayChangeDirection again to replaceBranchWithTailCall, right?

Regarding de-optimization, I had the same concern at the beginning. But test cases show that in most cases both branches are terminators, either jmp or ret. So the change of the layout should have negligible affect to performance?

goldstein.w.n added inline comments.Jan 4 2023, 8:59 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

I'm still confused by the intention of this flag. If you believe it's a reasonable concern, why can we do it by default in this patch? The flag doesn't prevent from doing it.

Oh man...

The comment is wrong. We assume MayChangeDirection is false (I.e we only find branch if CC matches, not also able to find matches inverse). Sorry for the confusion.

OTOH, if we want make target specific decision, should we simply add the code

if (MayChangeDirection)
  return false;

to other targets (if they override it)? So we don't need to pass MayChangeDirection again to replaceBranchWithTailCall, right?

So at the moment MayChangeDirection is unused in both X86InstrInfo::canMakeTailCallConditional and X86InstrInfo::replaceBranchWithTailCall (assumed false). X86InstrInfo::replaceBranchWithTailCall needs to find the branch we are replacing which may rely on MayChangeDirection. Also we have the sanity tests:

assert(canMakeTailCallConditional(BranchCond, TailCall, MayChangeDirection));

sitting at the top.

Regarding de-optimization, I had the same concern at the beginning. But test cases show that in most cases both branches are terminators, either jmp or ret. So the change of the layout should have negligible affect to performance?

This is because the entire case is guarded behind:

// Only eliminate if MBB == TBB (Taken Basic Block)
if (PredAnalyzable && !PredCond.empty() && PredTBB == MBB &&
    PredTBB != PredFBB) {

So we really are only ever doing this opt if it won't cause CFG reordering already. The MayChangeDirection AFAICT is redundant in the current implementation. Would you prefer me to just turn it into a comment at the call site?

Note we do see the final condition switch but from what I can tell that is b.c eliminating this edge in the CFG affects future calls to BranchFolder::OptimizeBlock. At the time this transformation takes place it only matches the same taken condition.

Fix wrong comment about MayChangeDirection

goldstein.w.n added inline comments.Jan 4 2023, 9:34 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

I'm still confused by the intention of this flag. If you believe it's a reasonable concern, why can we do it by default in this patch? The flag doesn't prevent from doing it.

Oh man...

The comment is wrong. We assume MayChangeDirection is false (I.e we only find branch if CC matches, not also able to find matches inverse). Sorry for the confusion.

I fixed the wrong comment in V2. Let me know your decision about dropping MayChangeDirection entirely for V3.

OTOH, if we want make target specific decision, should we simply add the code

if (MayChangeDirection)
  return false;

to other targets (if they override it)? So we don't need to pass MayChangeDirection again to replaceBranchWithTailCall, right?

So at the moment MayChangeDirection is unused in both X86InstrInfo::canMakeTailCallConditional and X86InstrInfo::replaceBranchWithTailCall (assumed false). X86InstrInfo::replaceBranchWithTailCall needs to find the branch we are replacing which may rely on MayChangeDirection. Also we have the sanity tests:

assert(canMakeTailCallConditional(BranchCond, TailCall, MayChangeDirection));

sitting at the top.

Regarding de-optimization, I had the same concern at the beginning. But test cases show that in most cases both branches are terminators, either jmp or ret. So the change of the layout should have negligible affect to performance?

This is because the entire case is guarded behind:

// Only eliminate if MBB == TBB (Taken Basic Block)
if (PredAnalyzable && !PredCond.empty() && PredTBB == MBB &&
    PredTBB != PredFBB) {

So we really are only ever doing this opt if it won't cause CFG reordering already. The MayChangeDirection AFAICT is redundant in the current implementation. Would you prefer me to just turn it into a comment at the call site?

Note we do see the final condition switch but from what I can tell that is b.c eliminating this edge in the CFG affects future calls to BranchFolder::OptimizeBlock. At the time this transformation takes place it only matches the same taken condition.

pengfei added inline comments.Jan 4 2023, 6:23 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

Thanks! The flag is clear to me now. But it's still not clear to me why it is conservative.
According to the old comments in BranchFolding.cpp, the true conservative action is to do replaceBranchWithTailCall only when OptForSize or MayChangeDirection is true. But this change always does replaceBranchWithTailCall now. So it looks to me like aggressive rather than conservative.

goldstein.w.n added inline comments.Jan 4 2023, 6:39 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

Thanks! The flag is clear to me now. But it's still not clear to me why it is conservative.
According to the old comments in BranchFolding.cpp, the true conservative action is to do replaceBranchWithTailCall only when OptForSize or MayChangeDirection is true. But this change always does replaceBranchWithTailCall now. So it looks to me like aggressive rather than conservative.

Thats not how I read it. How I read it was only do replaceBranchWithTailCall when OptForSize because we may change branch direction and mess up other CFG optimizations. AFAICT from the code, however, we never actually change branch direction (at the moment). I added the flag so that in the future if someone wants to make this more aggressive, the original intent, to not change branch direction unless OptForSize it would be clear.

pengfei added inline comments.Jan 4 2023, 10:30 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

Seems we have opposite understanding in the sentences. cc @hans who wrote it in D29856.
My understanding is the "branch direction" means the BB layout that the BPU may take it as taken or not taken. The layout is usually calculated according to the branch possibility. Changing the order may regress performance.

goldstein.w.n added inline comments.Jan 4 2023, 11:07 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

Seems we have opposite understanding in the sentences. cc @hans who wrote it in D29856.
My understanding is the "branch direction" means the BB layout that the BPU may take it as taken or not taken. The layout is usually calculated according to the branch possibility. Changing the order may regress performance.

That can't change either.

// Only eliminate if MBB == TBB (Taken Basic Block)
 if (PredAnalyzable && !PredCond.empty() && PredTBB == MBB &&
     PredTBB != PredFBB) {

I think it defacto ends up changing things later on b.c once an MBB has no preds it gets killed and that can re-order (more opportunistic optimization than forced changes) but we never explicitly swap fallthrough/taken. (Just as an reference, bolt now does this unconditionally in SimplifyConditionalTailCalls::fixTailCalls)

3018–3019 ↗(On Diff #486118)

Seems we have opposite understanding in the sentences. cc @hans who wrote it in D29856.
My understanding is the "branch direction" means the BB layout that the BPU may take it as taken or not taken. The layout is usually calculated according to the branch possibility. Changing the order may regress performance.

The comment at the end:

// If the predecessor is falling through to this block, we could reverse
  // the branch condition and fold the tail call into that. However, after
  // that we might have to re-arrange the CFG to fall through to the other
  // block and there is a high risk of regressing code size rather than
  // improving it.

I think is more in line with what you're saying.

hans added inline comments.Jan 9 2023, 7:08 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

I didn't look at the current patch so I don't have the full context, but when I wrote that comment, it was referring to branch direction as in whether it's branching to a higher or lower address. For example changing "Jcc foo; foo: jmp bar;" into "Jcc bar;" might be changing Jcc from a forwards branch (to foo) to a backwards branch to bar (if it's at a lower address).

Someone had expressed concerns about this, which is why we limited it to optsize functions.

goldstein.w.n added inline comments.Jan 12 2023, 11:32 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

I didn't look at the current patch so I don't have the full context, but when I wrote that comment, it was referring to branch direction as in whether it's branching to a higher or lower address. For example changing "Jcc foo; foo: jmp bar;" into "Jcc bar;" might be changing Jcc from a forwards branch (to foo) to a backwards branch to bar (if it's at a lower address).

Someone had expressed concerns about this, which is why we limited it to optsize functions.

What was the rationale? Static prediction? If so we can could probably limit this to haswell and never where static prediction was changed.

Make it so that conditional tail calls can be emitted even when there are multiple predecessors.

Do you notice a test case to reflect this target? Otherwise we may need to add one for it.

llvm/lib/CodeGen/BranchFolding.cpp
1513–1545

I think this flags turns to meaningless now. Either we leave code as was to match with SOM or we totally ignore it.

llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

Yeah, it looks like Assembly/Compiler Coding Rule 3 in SOM. It suggests to maintain for consistency although Core microarchitecture does not use the static prediction heuristic.

llvm/test/CodeGen/X86/tail-opts.ll
866–867

This looks like increasing size some time. Is this caused by the change of checking for all predecessors?

Remove MayChangeDirection flag

Make it so that conditional tail calls can be emitted even when there are multiple predecessors.

Do you notice a test case to reflect this target? Otherwise we may need to add one for it.

Yes the following tests rely on it:

LLVM :: CodeGen/X86/jump_sign.ll
LLVM :: CodeGen/X86/or-branch.ll
LLVM :: CodeGen/X86/segmented-stacks.ll
LLVM :: CodeGen/X86/tailcall-extract.ll
llvm/lib/CodeGen/BranchFolding.cpp
1513–1545

I think this flags turns to meaningless now. Either we leave code as was to match with SOM or we totally ignore it.

Dropped the flag, LMK if you want me to replace with a CPU newer than SnB check.

llvm/lib/Target/X86/X86InstrInfo.cpp
3018–3019 ↗(On Diff #486118)

Yeah, it looks like Assembly/Compiler Coding Rule 3 in SOM. It suggests to maintain for consistency although Core microarchitecture does not use the static prediction heuristic.

IMO the comment is quite outdated and just because it did exist it shouldn't get in the way of improvements for nearly ALL modern machines.

We could add a flag to may it only SnB and newer (I think starts at SnB but might be IVB).

llvm/test/CodeGen/X86/tail-opts.ll
866–867

This looks like increasing size some time. Is this caused by the change of checking for all predecessors?

Is this increasing size? Seems to be one less basic-block to me and -1 instruction. I think the numbering just gets weird and it jumps from bb.2 -> bb.5.

goldstein.w.n edited the summary of this revision. (Show Details)Jan 29 2023, 10:37 AM
pengfei added inline comments.Jan 29 2023, 6:04 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
3030–3032 ↗(On Diff #493109)

Revert the format change in this file.

llvm/test/CodeGen/X86/segmented-stacks.ll
1725

It doesn't look like multi predecessors.

1729

This looks like dead code?

llvm/test/CodeGen/X86/tail-opts.ll
866–867

jne rel8 vs. jne rel32. 3 bytes longer and no instruction reduced.

goldstein.w.n added inline comments.Jan 29 2023, 6:57 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
3030–3032 ↗(On Diff #493109)

Revert the format change in this file.

Will do for v4.

llvm/test/CodeGen/X86/segmented-stacks.ll
1725

It doesn't look like multi predecessors.

Hmm?

1729

This looks like dead code?

Appears to be. Not sure whats causing that. Is __morestack doing something tricky?

llvm/test/CodeGen/X86/tail-opts.ll
866–867

jne rel8 vs. jne rel32. 3 bytes longer and no instruction reduced.

Yeah, you're right, its because the tail call MBB is used by multiple branches, some of which can be dropped, but not all.

We could make it so that for -Os removing the tail calls is all or nothing (on the assumption that the MBB is more likely to be imm8 encodable than the jmp, although the alternative is also possible).

What do you think?

goldstein.w.n edited the summary of this revision. (Show Details)

Fix fmt issues.

goldstein.w.n marked an inline comment as done.Jan 29 2023, 7:31 PM
pengfei added inline comments.Jan 29 2023, 7:38 PM
llvm/test/CodeGen/X86/switch-bt.ll
82–95

Just noticed this. The manual written CHECK has gone, but no corresponding new check generated. Should we still use the old format with a bit twist?

goldstein.w.n added inline comments.Jan 29 2023, 7:46 PM
llvm/test/CodeGen/X86/switch-bt.ll
82–95

Just noticed this. The manual written CHECK has gone, but no corresponding new check generated. Should we still use the old format with a bit twist?

We can also just remove the:
-asm-verbose=false and then autogenerate. That Okay?

goldstein.w.n added inline comments.Jan 29 2023, 7:57 PM
llvm/test/CodeGen/X86/switch-bt.ll
82–95

Just noticed this. The manual written CHECK has gone, but no corresponding new check generated. Should we still use the old format with a bit twist?

Made D142860 to autogenerate tests for switch-bt.ll, then this one just updates.

pengfei accepted this revision.Jan 31 2023, 1:22 AM

LGTM, thanks!

This revision is now accepted and ready to land.Jan 31 2023, 1:22 AM
This revision was landed with ongoing or failed builds.Jan 31 2023, 11:26 PM
This revision was automatically updated to reflect the committed changes.
hctim added a subscriber: hctim.Feb 1 2023, 8:40 AM

Hey, looks like this probably broke the ASan buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/31141/steps/13/logs/stdio

I'm running a bisect now just to double check.

Instructions on how to reproduce the bots can be found here: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild. You're probably looking for buildbot_fast.sh, the same one that's referenced in the instructions.

Testing:  0.. 10.. 20.. 30.. 40.. 50..
FAIL: LLVM :: CodeGen/X86/tailcall-extract.ll (42751 of 72693)
******************** TEST 'LLVM :: CodeGen/X86/tailcall-extract.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -opaque-pointers=0 -mtriple=x86_64-linux < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/tailcall-extract.ll | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/tailcall-extract.ll
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/opt -opaque-pointers=0 -codegenprepare -S -mtriple=x86_64-linux < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/tailcall-extract.ll | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/tailcall-extract.ll --check-prefix OPT
--
Exit Code: 2
Command Output (stderr):
--
=================================================================
==1490386==ERROR: AddressSanitizer: container-overflow on address 0x602000006718 at pc 0x55bf6065a91e bp 0x7ffc1fdaeed0 sp 0x7ffc1fdaeec8
READ of size 8 at 0x602000006718 thread T0
    #0 0x55bf6065a91d in llvm::BranchFolder::OptimizeBlock(llvm::MachineBasicBlock*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/BranchFolding.cpp:1517:34
    #1 0x55bf60649db4 in llvm::BranchFolder::OptimizeBranches(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/BranchFolding.cpp:1207:19
    #2 0x55bf606465b7 in llvm::BranchFolder::OptimizeFunction(llvm::MachineFunction&, llvm::TargetInstrInfo const*, llvm::TargetRegisterInfo const*, llvm::MachineLoopInfo*, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/BranchFolding.cpp:212:34
    #3 0x55bf60660955 in (anonymous namespace)::BranchFolderPass::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/BranchFolding.cpp:137:17
    #4 0x55bf60b3be12 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:91:13
    #5 0x55bf61a80e73 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:27
    #6 0x55bf61a9aa70 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1476:16
    #7 0x55bf61a82ce2 in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:27
    #8 0x55bf61a82ce2 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:44
    #9 0x55bf5b7d7904 in compileModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:733:8
    #10 0x55bf5b7d7904 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:420:22
    #11 0x7f6384b17d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #12 0x7f6384b17e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #13 0x55bf5b704064 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc+0x751c064)
0x602000006718 is located 8 bytes inside of 16-byte region [0x602000006710,0x602000006720)
allocated by thread T0 here:
    #0 0x55bf5b7c3bcd in operator new(unsigned long) /b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:95:3
    #1 0x55bf5c95c847 in __libcpp_operator_new<unsigned long> /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan/include/c++/v1/new:266:10
    #2 0x55bf5c95c847 in __libcpp_allocate /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan/include/c++/v1/new:292:10
    #3 0x55bf5c95c847 in allocate /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan/include/c++/v1/__memory/allocator.h:115:38
    #4 0x55bf5c95c847 in __allocate_at_least<std::__1::allocator<llvm::MachineBasicBlock *> > /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan/include/c++/v1/__memory/allocate_at_least.h:55:19
    #5 0x55bf5c95c847 in __split_buffer /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan/include/c++/v1/__split_buffer:323:29
    #6 0x55bf5c95c847 in void std::__1::vector<llvm::MachineBasicBlock*, std::__1::allocator<llvm::MachineBasicBlock*>>::__push_back_slow_path<llvm::MachineBasicBlock* const&>(llvm::MachineBasicBlock* const&) /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan/include/c++/v1/vector:1538:49
    #7 0x55bf609e858b in push_back /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan/include/c++/v1/vector:1556:9
    #8 0x55bf609e858b in addPredecessor /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineBasicBlock.cpp:882:16
    #9 0x55bf609e858b in llvm::MachineBasicBlock::addSuccessor(llvm::MachineBasicBlock*, llvm::BranchProbability) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineBasicBlock.cpp:779:9
    #10 0x55bf62e4b401 in addSuccessorWithProb /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:2196:10
    #11 0x55bf62e4b401 in llvm::SelectionDAGBuilder::visitSwitchCase(llvm::SwitchCG::CaseBlock&, llvm::MachineBasicBlock*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:2576:3
    #12 0x55bf63063ffe in llvm::SelectionDAGISel::FinishBasicBlock() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1876:10
    #13 0x55bf630506c3 in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1644:5
    #14 0x55bf630464c4 in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:480:3
    #15 0x55bf5f5032f9 in (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:191:25
    #16 0x55bf60b3be12 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:91:13
    #17 0x55bf61a80e73 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:27
    #18 0x55bf61a9aa70 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1476:16
    #19 0x55bf61a82ce2 in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:27
    #20 0x55bf61a82ce2 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:44
    #21 0x55bf5b7d7904 in compileModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:733:8
    #22 0x55bf5b7d7904 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:420:22
    #23 0x7f6384b17d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_container_overflow=0.
If you suspect a false positive see also: https://github.com/google/sanitizers/wiki/AddressSanitizerContainerOverflow.
SUMMARY: AddressSanitizer: container-overflow /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/BranchFolding.cpp:1517:34 in llvm::BranchFolder::OptimizeBlock(llvm::MachineBasicBlock*)
Shadow bytes around the buggy address:
  0x0c047fff8c90: fa fa fd fd fa fa fd fd fa fa fd fd fa fa fd fd
  0x0c047fff8ca0: fa fa fd fa fa fa fd fd fa fa fd fa fa fa fd fd
  0x0c047fff8cb0: fa fa 00 fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x0c047fff8cc0: fa fa 04 fa fa fa 00 fc fa fa 00 fa fa fa fd fd
  0x0c047fff8cd0: fa fa fd fa fa fa fd fd fa fa fd fa fa fa fd fa
=>0x0c047fff8ce0: fa fa 00[fc]fa fa 00 fa fa fa 00 00 fa fa 00 fa
  0x0c047fff8cf0: fa fa fd fd fa fa fd fd fa fa fd fa fa fa fd fd
  0x0c047fff8d00: fa fa fd fa fa fa fd fa fa fa fd fd fa fa fd fd
  0x0c047fff8d10: fa fa fd fa fa fa fd fa fa fa fd fd fa fa 00 00
  0x0c047fff8d20: fa fa 00 00 fa fa fd fa fa fa fd fd fa fa fd fd
  0x0c047fff8d30: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1490386==ABORTING
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/tailcall-extract.ll
--
hctim added a comment.Feb 1 2023, 9:45 AM

Ah, looks like you beat me to it. Thanks for reverting early :).

Ah, looks like you beat me to it. Thanks for reverting early :).

Not me, thanks are to @goncharov for catching it almost immediately.

Thanks for the reproduction instructions :)

goldstein.w.n reopened this revision.Feb 1 2023, 4:58 PM
This revision is now accepted and ready to land.Feb 1 2023, 4:58 PM

Fix ASAN issues

pengfei accepted this revision.Feb 1 2023, 6:33 PM
pengfei added inline comments.
llvm/lib/CodeGen/BranchFolding.cpp
1546

return true;

Matt added a subscriber: Matt.Feb 1 2023, 8:28 PM
goldstein.w.n marked an inline comment as done.Feb 2 2023, 12:22 AM