This is an archive of the discontinued LLVM Phabricator instance.

[BasicBlockSections] avoid insertting redundant branch to fall through blocks
AbandonedPublic

Authored by rahmanl on Oct 19 2022, 2:19 AM.

Details

Summary

Originally authored by @sinan

blocks reached by an explicit unconditional branch from its layout predecessor are also recognized as a fallthrough block via MachineBasicBlock::getFallThrough. For such cases, we do not need an extra unconditional branch.

If test case basic-block-sections_2.ll is run with O0, then you can see the instruction sequence like:

__cxx_global_var_init:                  # @__cxx_global_var_init
	.cfi_startproc
# %bb.0:                                # %entry
	pushq	%rax
	.cfi_def_cfa_offset 16
	movl	$5, %edi
	callq	_Z3bari
	movb	%al, %cl
	xorl	%eax, %eax
                                        # kill: def $al killed $al killed $eax
	testb	$1, %cl
	movb	%al, 7(%rsp)                    # 1-byte Spill
	jne	__cxx_global_var_init.__part.1
	jmp	__cxx_global_var_init.__part.2
	jmp	__cxx_global_var_init.__part.1
.LBB_END0_0:
	.cfi_endproc
	.section	.text.startup,"ax",@progbits,unique,2
__cxx_global_var_init.__part.1:         # %cond.true
	.cfi_startproc
	.cfi_def_cfa %rsp, 16
	movl	$3, %edi
	callq	_Z3bari

after this patch, the redundant branch jmp __cxx_global_var_init.__part.1 could be removed.

__cxx_global_var_init:                  # @__cxx_global_var_init
	.cfi_startproc
# %bb.0:                                # %entry
	pushq	%rax
	.cfi_def_cfa_offset 16
	movl	$5, %edi
	callq	_Z3bari
	movb	%al, %cl
	xorl	%eax, %eax
                                        # kill: def $al killed $al killed $eax
	testb	$1, %cl
	movb	%al, 7(%rsp)                    # 1-byte Spill
	jne	__cxx_global_var_init.__part.1
	jmp	__cxx_global_var_init.__part.2
.LBB_END0_0:
	.cfi_endproc
	.section	.text.startup,"ax",@progbits,unique,2
__cxx_global_var_init.__part.1:         # %cond.true
	.cfi_startproc
	.cfi_def_cfa %rsp, 16
	movl	$3, %edi
	callq	_Z3bari

This issue happens much more frequently when I test bb section option with AArch64.

Diff Detail

Event Timeline

sinan created this revision.Oct 19 2022, 2:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 19 2022, 2:19 AM
sinan requested review of this revision.Oct 19 2022, 2:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 19 2022, 2:19 AM
sinan updated this revision to Diff 468887.Oct 19 2022, 5:56 AM

simplify code

Thanks for the fix. Does this every trigger with -O2?

llvm/lib/CodeGen/BasicBlockSections.cpp
133

I suggest we change the name to hasUncoditionalBranch and flip the return value and the condition below, to make it more readable.

134–141

Can we simply write:
return !MBB.terminators.empty() && MBB.terminators.back().isUncoditionalBranch()

168

Or simply change this to
MBB.terminators.empty() || !MBB.terminators.back().isUnconditionalBranch() and add a comment.

llvm/test/CodeGen/X86/basic-block-sections_2.ll
21–25

Can we use a different file for this test? The file header describes a different purpose for this test. Also __cxx_global_var_init is an automatically-generated function. It would be better if we come up a with a standalone test with a normal function -- if possible.

sinan added a comment.Oct 20 2022, 5:53 AM

Hi @rahmanl, I tested with -O2, and the redundant branch can be removed by machine code sinking pass.

sinan added inline comments.Oct 20 2022, 6:03 AM
llvm/lib/CodeGen/BasicBlockSections.cpp
134–141

I implemented it in this way since I have encountered code pieces like

0000000000400c98 <main.eh>:
  400c98:	d503201f 	nop
  400c9c:	aa0003f3 	mov	x19, x0
  400ca0:	7100043f 	cmp	w1, #0x1
  400ca4:	54ffff61 	b.ne	400c90 <main.__part.9>  // b.any
  400ca8:	17ffffef 	b	400c64 <main.__part.6>
  400cac:	aa0003f3 	mov	x19, x0
  400cb0:	97ffff50 	bl	4009f0 <__cxa_end_catch@plt>
  400cb4:	17fffff7 	b	400c90 <main.__part.9>
  400cb8:	94000001 	bl	400cbc <__clang_call_terminate>

But I just checked it and they seem to come from two eh-related basic blocks.

I will update it according to your suggestion.

sinan updated this revision to Diff 473587.Nov 7 2022, 12:29 AM

update according to comments

sinan added a comment.Nov 7 2022, 12:36 AM

Hi @rahmanl, Sorry for the delay. I did not find any countercase for just checking the last terminator, and I spent some time investigating a pattern:

jne  func.part2
jmp func.part1
nop
nop

and it turns out that the nops are inserted during emitting asm.

Thanks for the revision.

lld/test/ELF/lto/basic-block-sections-redundant-br.ll
1 ↗(On Diff #473587)

Please add a one line comment, explaining the purpose of this test.

20–24 ↗(On Diff #473587)

You don't need this since you have CHECKS for mapping.__part.1 and mapping.__part.2. I suggest we just check the jumps.

27 ↗(On Diff #473587)

CHECK-NOT: jmp is sufficient.

llvm/lib/CodeGen/BasicBlockSections.cpp
125–126

This comment seems to be explaining something about the context in which this function is called. I suggest we remove it from here.

129

Looking again, it seems that terminators is simply a range starting at the first terminator and ending at MachineBasicBlock::end().https://llvm.org/doxygen/classllvm_1_1MachineBasicBlock.html#a7f0521fa2de44271fd4b909ea7351ef3
So I think we can simply do:
!MBB.empty() && MBB.back().isUnconditionalBranch().

164–169

This comment is incorrect. If it has an unconditional branch to its fallthrough, then we *don't* need an additional jump.

sinan updated this revision to Diff 474418.Nov 9 2022, 7:32 PM

update the patch according to suggestions.

Has this been revised yet? I see you have updated the patch but my comments are still "Not Done".

llvm/lib/CodeGen/BasicBlockSections.cpp
164–169

Still pending.
You can just remove the third item (it's implicitly followed from "needs an explicit unconditional branch").

llvm/test/CodeGen/X86/basic-block-sections-redundant-br.ll
1 ↗(On Diff #474418)
sinan updated this revision to Diff 475990.Nov 16 2022, 7:42 PM

remove the incorrect description.

rahmanl commandeered this revision.Aug 21 2023, 12:56 PM
rahmanl edited reviewers, added: sinan; removed: rahmanl.
rahmanl updated this revision to Diff 552109.Aug 21 2023, 12:57 PM
rahmanl edited the summary of this revision. (Show Details)

Apply final changes.

rahmanl updated this revision to Diff 552111.Aug 21 2023, 12:58 PM

Rename the test.

rahmanl updated this revision to Diff 552113.Aug 21 2023, 12:59 PM

clang-format.

rahmanl updated this revision to Diff 552114.Aug 21 2023, 1:02 PM

fix comment.

rahmanl edited the summary of this revision. (Show Details)Aug 21 2023, 1:04 PM
sinan accepted this revision.Aug 31 2023, 8:15 AM

LGTM, thanks for continuing this work.

This revision is now accepted and ready to land.Aug 31 2023, 8:15 AM
rahmanl abandoned this revision.Sep 8 2023, 3:44 PM

I think D158674 handled this. So abandoning this revision.