Because the final code layout is not known until link time, the distance of a cross-section jump is not knowable at compile time. Unconditional branches are relaxed via thunk insertion by the linker, but conditional branches must be relaxed manually by the compiler. Because of this, we should assume that any cross-section conditional jump is out of range. This assumption is necessary for machine function splitting on AArch64.
Details
- Reviewers: arsenm, mingmingl
- Commits: rGd7bca8e4942b: [AArch64] Relax cross-section branches
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
- 276–278: I think it would be better to just fix the scavenger to tolerate empty blocks.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
- 600 (On Diff #525866): I'm surprised this would require a new hook. It does seem like it should take the two blocks or section IDs. Alternatively, we could just add blocks to isBranchOffsetInRange?
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
- 276–278: Ugh. Is this even true anymore with backwards scavenging?
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
- 276–278: Yeah, this is still true. The signature of the function is `Register scavengeRegisterBackwards(const TargetRegisterClass &RC, MachineBasicBlock::iterator To, bool RestoreAfter, int SPAdj, bool AllowSpill = true);`, and To must be an iterator to a valid instruction because scavengeRegisterBackwards uses To->getParent(). Fixing that seems reasonable since RegisterScavenger already has an MBB member, but I think that's best done in a separate patch.
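A minimal sketch of the constraint being discussed, based on the signature quoted above (the wrapper name is illustrative; a real caller would also need to set up the scavenger's block state):

```cpp
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/RegisterScavenging.h"
#include <iterator>

using namespace llvm;

static Register scavengeAtBlockEnd(RegScavenger &RS, MachineBasicBlock &MBB,
                                   const TargetRegisterClass &RC) {
  // `To` must point at a real instruction because scavengeRegisterBackwards
  // calls To->getParent(); an empty block has no valid position to hand it.
  if (MBB.empty())
    return Register();
  return RS.scavengeRegisterBackwards(RC, std::prev(MBB.end()),
                                      /*RestoreAfter=*/false, /*SPAdj=*/0);
}
```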
I think argumentless target hooks are generally bad; they should provide some additional context so that a target could consider different behavior depending on the section.
Use the code model instead of some arbitrary number for cross-section branches
The idea of a fixed cross-section branch distance isn't really
representative of the cross-section branch issue. The linker
can place two sections any distance away from each other
so long as both fit within the program address range specified
by the code model. Use the code model instead.
I know I'm replacing an argumentless target hook with another, but this does provide some additional context, which is the size restrictions that each code model assumes.
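A hedged sketch of that idea (not the committed code; the function name and the concrete size guarantees are assumptions for illustration):

```cpp
#include "llvm/Support/CodeGen.h"
#include "llvm/Support/ErrorHandling.h"
#include <cstdint>
#include <optional>

// Bound on how far apart the linker may place two sections, derived from the
// maximum program size each code model assumes (tiny ~1 MiB, small ~4 GiB);
// the large model promises nothing, so every cross-section branch must be
// assumed out of range.
static std::optional<uint64_t>
maxCrossSectionDistance(llvm::CodeModel::Model CM) {
  switch (CM) {
  case llvm::CodeModel::Tiny:
    return uint64_t(1) << 20;
  case llvm::CodeModel::Small:
  case llvm::CodeModel::Kernel:
  case llvm::CodeModel::Medium:
    return uint64_t(1) << 32;
  case llvm::CodeModel::Large:
    return std::nullopt;
  }
  llvm_unreachable("unknown code model");
}
```

A cross-section branch would then be relaxed whenever its encodable range is smaller than this bound, or when the bound is absent.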
Refactor code so it doesn't use virtual registers. Because we only care about the liveness at the end of the basic block, we don't need to use backwards scavenging. This structure also sets us up much more nicely for adding alternate implementations of buildIndirectBranch.
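One possible shape of such an implementation, sketched under the assumption that X16 is free to clobber at the end of the block (illustrative only; `buildIndirectBranchTo` is not necessarily the patch's actual helper):

```cpp
#include "AArch64InstrInfo.h"
#include "Utils/AArch64BaseInfo.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"

using namespace llvm;

static void buildIndirectBranchTo(const AArch64InstrInfo &TII,
                                  MachineBasicBlock &MBB,
                                  MachineBasicBlock &DestBB,
                                  const DebugLoc &DL) {
  const Register Scratch = AArch64::X16;
  // Materialize the destination address with an ADRP/ADD pair, which can reach
  // anywhere within the 4 GiB program image, then branch indirectly.
  BuildMI(MBB, MBB.end(), DL, TII.get(AArch64::ADRP), Scratch)
      .addMBB(&DestBB, AArch64II::MO_PAGE);
  BuildMI(MBB, MBB.end(), DL, TII.get(AArch64::ADDXri), Scratch)
      .addReg(Scratch)
      .addMBB(&DestBB, AArch64II::MO_PAGEOFF | AArch64II::MO_NC)
      .addImm(0);
  BuildMI(MBB, MBB.end(), DL, TII.get(AArch64::BR)).addReg(Scratch);
}
```

Since only the block-exit liveness matters, the scratch register can be chosen by inspecting live-outs directly rather than running backwards scavenging.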
> Unconditional branches are relaxed via thunk insertion by the linker, but conditional branches must be manually relaxed. Because of this, we should assume that any cross-sectional conditional jumps are out of range.
As-is, it looks like this patch is also relaxing cross-sectional unconditional jumps. Given we can trust the linker to relax these, and we expect the branches to be cold, is there a reason for the compiler to relax them? I guess the linker-generated stub could clobber x16/x17?
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
- 300: Do we know at this point that the function isn't using the red zone? (We usually use the emergency spill slot for this sort of situation.)
Yeah, the linker will clobber X16 to relax unconditional jumps. The ABI seems designed around the assumption that out-of-range unconditional jumps only exist at function call boundaries, so we have to work around that.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
- 300: Because BranchRelaxation runs after Frame Finalization, we can't add an emergency spill slot at this point. We could add a spill slot to every function just in case spilling is necessary, but I would expect that to degrade performance. Since the red zone on AArch64 isn't widely used, it should be alright if it isn't compatible with MFS/BBSections on AArch64. I'll add an explicit check here to assert that we're not doing anything we shouldn't.
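A minimal sketch of what that explicit check could look like, assuming the existing AArch64FrameLowering::canUseRedZone query (the helper is illustrative, not the committed code):

```cpp
#include "AArch64FrameLowering.h"
#include "AArch64Subtarget.h"
#include "llvm/CodeGen/MachineFunction.h"
#include <cassert>

using namespace llvm;

static void assertNoRedZoneBeforeSpill(const MachineFunction &MF) {
  const AArch64Subtarget &ST = MF.getSubtarget<AArch64Subtarget>();
  // Spilling the scratch register below SP during branch relaxation is only
  // safe when the function keeps no live data in the red zone.
  assert(!ST.getFrameLowering()->canUseRedZone(MF) &&
         "cannot spill below SP while the red zone may be in use");
}
```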
> Yeah, the linker will clobber X16 to relax unconditional jumps. The ABI seems designed around the assumption that out-of-range unconditional jumps only exist at function call boundaries, so we have to work around that.
Alternatively, we could try to make register allocation understand that branches clobber x16, I guess? I haven't thought through exactly how that would work. The advantages of letting the linker handle it: one, we can completely avoid the extra instructions in libraries that are small, and two, linker-generated thunks would move the thunking code outside the hot section. But I won't insist on it (we can always revise this part later).
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
- 300: Can we write some reasonable heuristic for whether we might need to spill? I mean, during frame finalization, we should at least be able to tell if there's any cold code in the function. If disabling red zone usage in the relevant functions is the simplest way forward, I guess I'm fine with that.
I have a follow-up patch to this one that defers relaxation to the linker when possible and picks the manual relaxation strategy that minimally impacts the hot section.
Making register allocation understand x16 clobbering would be a neat idea; I'd definitely like to explore it when fine-tuning MFS. Thanks for the review!
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
- 300: Since frame finalization happens before MachineFunctionSplitter/BBSections run, there isn't any way to tell which blocks may be hot or cold at this point. We could hypothetically say "officially, MFS/BBSections are unsupported with the red zone, but they do work 99% of the time" and minimize pushing the stack pointer, but that sounds hairy. AFAICT, disabling the red zone doesn't affect those that use MFS/BBSections. The AArch64 Procedure Call Standard specifies that not writing past the stack pointer is a universal constraint and doesn't mention the red zone at all (although Apple and Windows platforms still respect it).
Make the tests a bit less brittle. This way the tests will better accommodate future optimizations.
> I think it would be better to just fix the scavenger to tolerate empty blocks.