This is an archive of the discontinued LLVM Phabricator instance.

A light-weight solution to align branches within 32B boundary by prefix padding
AbandonedPublic

Authored by skan on Feb 27 2020, 8:50 AM.


Details

Reviewers

annita.zhang
craig.topper
MaskRay
reames
LuoYuanke
jyknight

Summary

If we want a branch or a fused pair not to cross or end against the boundary, we currently emit NOPs before it (D70157), and in most cases this recovers the performance lost to the microcode update. We also observed cases in which NOP padding does not mitigate the effect very well, but prefix padding does (D72225). As discussed in the prefix-padding review, D72225 adds prefixes aggressively; as a result, every single instruction ends up in its own fragment, which is a huge increase in memory usage. So we put forward a light-weight solution. In this solution, at most one instruction can be prefixed to align a branch, and if there is not enough room to add segment prefixes, NOPs are inserted instead. We measured the memory usage of the link step with LTO when building SPEC; it increased only slightly compared to NOP padding. We turned on the new prefix padding by default, and it passed our internal large test set and LLVM's test suite.
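For readers unfamiliar with the "cross or against the boundary" condition, here is a minimal standalone sketch of the check and of the padding size it implies. The helper names mirror those visible in the MCAssembler.cpp hunk of this diff, but the bodies are an assumption written with plain integers instead of llvm::Align, and `paddingSize` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Standalone sketch, not the code in MCAssembler.cpp. Boundary must be a
// power of two (for example 32).

// True if the byte range [Start, Start + Size) spans two boundary windows.
bool mayCrossBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  assert(Size > 0 && Boundary != 0 && (Boundary & (Boundary - 1)) == 0);
  uint64_t End = Start + Size;
  return (Start / Boundary) != ((End - 1) / Boundary);
}

// True if the range ends exactly on a boundary.
bool isAgainstBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  uint64_t End = Start + Size;
  return (End & (Boundary - 1)) == 0;
}

bool needPadding(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  return mayCrossBoundary(Start, Size, Boundary) ||
         isAgainstBoundary(Start, Size, Boundary);
}

// Hypothetical helper: bytes of padding that move Start up to the next
// boundary, or 0 when the range is already fine.
uint64_t paddingSize(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  if (!needPadding(Start, Size, Boundary))
    return 0;
  return (Boundary - (Start & (Boundary - 1))) & (Boundary - 1);
}
```

With a 32-byte boundary, a 4-byte branch at offset 30 crosses the boundary and needs 2 bytes of padding, while the same branch at offset 28 ends against the boundary and needs 4.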

D75203 seems to support general alignment padding. If that general alignment padding supports adding prefixes to instructions, this patch is not needed. For now, this revision is opened here to avoid duplicate work.

Diff Detail

Unit Tests: Failed

	Time	Test
	1,100 ms	MLIR.mlir-cpu-runner::Unknown Unit Message ("")
	1,100 ms	MLIR.mlir-cpu-runner::Unknown Unit Message ("")
	1,090 ms	MLIR.mlir-cpu-runner::Unknown Unit Message ("")

Event Timeline

skan created this revision. Feb 27 2020, 8:50 AM

Herald added a project: Restricted Project. Feb 27 2020, 8:50 AM

Herald added subscribers: llvm-commits, dexonsmith, hiraditya.

skan edited the summary of this revision. Feb 27 2020, 8:53 AM

Harbormaster failed remote builds in B47442: Diff 246957! Feb 27 2020, 9:31 AM

I think this is a step in the right direction. I expect we'll need to iterate on the design to allow further padding without extensive memory usage, but starting with a single padding instruction seems like a reasonable starting point. Being able to iterate in tree is obviously valuable, so I think cleaning this up and getting something in is a good idea.

I'd like to suggest a couple of design changes to simplify the code here though.

First, I think we should introduce a new MCFragment type for the padding opportunity. Having MCBranchAlign mean both "this is a place we need to enforce alignment" and "this is a place we can optionally add padding for a later alignment" is confusing. I'd suggest something along the lines of a MCPrefixPaddingFragment. I think splitting this will make the code a lot more readable.

Second, I think we should rebase this on D75203. The advantage of doing so is that we could completely remove the MCPrefixPaddingFragment from relaxation, and only adjust its size afterwards. Relaxation would be responsible for figuring out how much padding is needed and recording that in the MCBoundaryAlign, and then the post pass would divide the padding between nop, prefix, and other relaxable instructions.

Third, the complexity of the state machine for inserting fragments really bothers me. I don't have a concrete suggestion here, but I think we need to simplify this.

Hm, after reflecting a bit on points 1 and 2, I may have an alternate suggestion which eliminates (1) entirely. The basic idea is that we treat prefix padding as a thing done to relaxable instructions. A relaxable instruction has all the MCInst info to determine legal prefixes. If we added an interface to MCAsmBackend along the lines of "padInstructionEncoding(MCRelaxableFragment&)" which takes a fragment and changes its encoding to increase encoding size, this would nicely fit into the design of D75203, and would allow all the prefix logic to live inside a single function in the X86AsmBackend. Then we'd just have to decide which instructions to make relaxable instead of directly adding to the data fragment (i.e. the state machine).

On the surface, that seems like it would work well. I'm going to prototype that a bit on top of D75203 and see if it works as well as it seems.
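To make the suggestion concrete, here is a rough sketch of what a "pad the encoding, then fall back to NOPs" split could look like. Everything in it is hypothetical: `RelaxableFragmentSketch` stands in for MCRelaxableFragment, `PrefixRoom` is assumed to be computed elsewhere from the MCInst, and the signature of `padInstructionEncoding` is not taken from any patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a relaxable fragment: the encoded bytes of one
// instruction plus how many more prefix bytes it may legally take.
struct RelaxableFragmentSketch {
  std::vector<uint8_t> Encoding;
  unsigned PrefixRoom;
};

// Grow the encoding by up to Wanted bytes by prepending PrefixByte.
// Returns the number of bytes actually added.
unsigned padInstructionEncoding(RelaxableFragmentSketch &F, unsigned Wanted,
                                uint8_t PrefixByte) {
  unsigned N = std::min(Wanted, F.PrefixRoom);
  F.Encoding.insert(F.Encoding.begin(), N, PrefixByte);
  F.PrefixRoom -= N;
  return N;
}

struct PaddingSplit {
  unsigned PrefixBytes; // absorbed by the instruction encoding
  unsigned NopBytes;    // left over for a NOP sequence
};

// Post-pass step: use prefixes first, then cover the remainder with NOPs.
PaddingSplit dividePadding(RelaxableFragmentSketch &F, unsigned Needed,
                           uint8_t PrefixByte) {
  unsigned P = padInstructionEncoding(F, Needed, PrefixByte);
  return PaddingSplit{P, Needed - P};
}
```

For example, a 2-byte instruction with room for 3 prefixes that must absorb 5 bytes of padding would take 3 prefix bytes and leave 2 bytes to be emitted as NOPs.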

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
385	This looks like a potentially unrelated change. Can it be separated?
493	Please replace constants with X86::CS_PREFIX and friends.

reames mentioned this in D75203: [X86] Relax existing instructions to reduce the number of nops needed for alignment purposes. Feb 27 2020, 11:33 AM

My prototyping resulted in a POC patch (D75300) which nicely demonstrates that we can prefix pad MCRelaxableFragment instructions with zero additional memory overhead. Note that this approach *only* covers relaxable instructions, not those currently combined into DataFragments.

This patch goes several steps further in choosing instructions to insert prefixes before without making them relaxable. We could either a) just make more instructions relaxable, or b) support both mechanisms. This mechanism - particularly after splitting out a 'MCPrefixPaddingFragment' - will require less memory per potentially padded instruction than converting every instruction we might wish to pad into a RelaxableFragment of its own.

Another idea to explore would be trying to frame prefix padding insertion as something analogous to a fixup. Today, all of our fixups are fixed size (I think), but having a fixup which inserts a set of bytes might be reasonable. If we did that, we could use a single DataFragment for a block of instructions and still insert padding later if needed.
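As a rough illustration of the fixup-like idea, the sketch below keeps one byte buffer for a block of instructions and records padding requests on the side, applying them only at the end. All names are hypothetical, and the sketch ignores what a real implementation would have to handle, such as adjusting ordinary fixup offsets and symbol offsets after bytes are inserted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "inserting fixup": put Count copies of Byte at Offset, where
// Offset is the start of an instruction inside the data fragment.
struct PaddingFixupSketch {
  std::size_t Offset;
  unsigned Count;
  uint8_t Byte;
};

// Apply all padding requests to a copy of the fragment contents. Requests
// are processed from the highest offset down, so inserting bytes never
// shifts an offset that is still pending.
std::vector<uint8_t> applyPaddingFixups(std::vector<uint8_t> Data,
                                        std::vector<PaddingFixupSketch> Fixups) {
  std::sort(Fixups.begin(), Fixups.end(),
            [](const PaddingFixupSketch &A, const PaddingFixupSketch &B) {
              return A.Offset > B.Offset;
            });
  for (const PaddingFixupSketch &F : Fixups) {
    assert(F.Offset <= Data.size() && "padding offset out of range");
    Data.insert(Data.begin() + static_cast<std::ptrdiff_t>(F.Offset), F.Count,
                F.Byte);
  }
  return Data;
}
```

The point of the design is that the block stays in a single fragment, and the per-instruction cost is one small record only for instructions that might be padded.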

To be clear, I think it's fine to iterate on this design (for non-relaxable instructions) in tree. We could potentially start simple by making more instructions relaxable, and then implement the fixup-like scheme just mentioned or something like what this patch does.

In D75268#1896088, @reames wrote:

I think this is a step in the right direction. I expect we'll need to iterate on the design to allow further padding without extensive memory usage, but starting with a single padding instruction seems like a reasonable starting point. Being able to iterate in tree is obviously valuable, so I think cleaning this up and getting something in is a good idea.

I'd like to suggest a couple of design changes to simplify the code here though.

Thanks for your comments and suggestions! They are really useful!

First, I think we should introduce a new MCFragment type for the padding opportunity. Having MCBranchAlign mean both "this is a place we need to enforce alignment" and "this is a place we can optionally add padding for a later alignment" is confusing. I'd suggest something along the lines of a MCPrefixPaddingFragment. I think splitting this will make the code a lot more readable.

Yes, I also think using one type of fragment to do two kinds of things is confusing.

Second, I think we should rebase this on D75203. The advantage of doing so is that we could completely remove the MCPrefixPaddingFragment from relaxation, and only adjust its size afterwards. Relaxation would be responsible for figuring out how much padding is needed and recording that in the MCBoundaryAlign, and then the post pass would divide the padding between nop, prefix, and other relaxable instructions.

Agreed, doing this in the post pass would reduce the complexity of laying out the fragments. I think D75203 is a good start and seems reasonable. We don't need to stick to this patch, since it does not fit nicely into the design of D75203. We can enable prefix padding in another way on top of D75203. Let's focus on D75203 first.

Third, the complexity of the state machine for inserting fragments really bothers me. I don't have a concrete suggestion here, but I think we need to simplify this.

Hm, after reflecting a bit on points 1 and 2, I may have an alternate suggestion which eliminates (1) entirely. The basic idea is that we treat prefix padding as a thing done to relaxable instructions. A relaxable instruction has all the MCInst info to determine legal prefixes. If we added an interface to MCAsmBackend along the lines of "padInstructionEncoding(MCRelaxableFragment&)" which takes a fragment and changes its encoding to increase encoding size, this would nicely fit into the design of D75203, and would allow all the prefix logic to live inside a single function in the X86AsmBackend. Then we'd just have to decide which instructions to make relaxable instead of directly adding to the data fragment (i.e. the state machine).

Nice design! But I have one concern. D75203 replaces short jumps with long jumps to reduce the number of NOPs. Is this friendly to performance? I think for now we should add an option to turn it on, rather than enabling it by default, until we get some performance data.

On the surface, that seems like it would work well. I'm going to prototype that a bit on top of D75203 and see if it works as well as it seems.

skan marked 2 inline comments as done. Feb 27 2020, 8:06 PM

skan added inline comments.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
385	Okay
493	X86::CS_PREFIX is an enum value that represents the CS prefix as an instruction. The constant 0x2e is the encoding value for X86::CS; they are different.
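To illustrate the distinction being drawn here, the sketch below lists the raw encoding bytes of the x86 segment override prefixes. The byte values are the architectural encodings; the enumerator and function names are hypothetical and are not the identifiers used in X86BaseInfo.h or in this patch.

```cpp
#include <cstdint>

// Raw encoding bytes of the x86 segment override prefixes. These are byte
// values written into the instruction stream, not instruction opcodes in
// the sense of an enum such as X86::CS_PREFIX.
enum SegmentPrefixByte : uint8_t {
  ES_Byte = 0x26,
  CS_Byte = 0x2E,
  SS_Byte = 0x36,
  DS_Byte = 0x3E,
  FS_Byte = 0x64,
  GS_Byte = 0x65
};

// Hypothetical helper: true if B is one of the segment override bytes.
constexpr bool isSegmentPrefixByte(uint8_t B) {
  return B == ES_Byte || B == CS_Byte || B == SS_Byte || B == DS_Byte ||
         B == FS_Byte || B == GS_Byte;
}
```

A named constant for the byte value would address the readability concern in the review comment without conflating the byte with the instruction enum.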

skan marked an inline comment as done. Feb 28 2020, 5:34 AM

skan added inline comments.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
385	Separated this piece of code into D75346.

skan marked an inline comment as done. Feb 28 2020, 8:22 AM

skan added inline comments.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
446–447	Split this piece of code out into D75357.

Plan to rebase this on D75203

Reimplemented this as D76286, based on D75300.

Revision Contents

Path	Size
llvm/include/llvm/MC/MCFragment.h	51 lines
llvm/include/llvm/MC/MCObjectStreamer.h	5 lines
llvm/lib/MC/MCAssembler.cpp	58 lines
llvm/lib/MC/MCFragment.cpp	11 lines
llvm/lib/MC/MCObjectStreamer.cpp	18 lines
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp	364 lines
llvm/test/MC/X86/ (nine files, names missing)	5, 25, 112, 35, 35, 19, 30, 17, 33 lines

Diff 246957

llvm/include/llvm/MC/MCFragment.h

Show First 20 Lines • Show All 511 Lines • ▼ Show 20 Lines	public:

StringRef getFixedSizePortion() const { return FixedSizePortion; }		StringRef getFixedSizePortion() const { return FixedSizePortion; }

static bool classof(const MCFragment *F) {		static bool classof(const MCFragment *F) {
return F->getKind() == MCFragment::FT_CVDefRange;		return F->getKind() == MCFragment::FT_CVDefRange;
}		}
};		};

/// Represents required padding such that a particular other set of fragments		/// This is a placeholder fragment used to emit NOP or values to align a set of
/// does not cross a particular power-of-two boundary. The other fragments must		/// fragments within specific boundary. If we call the nearest backward
/// follow this one within the same section.		/// MCBoundaryAlignFragment of LastFragment as NBBF, then the set of fragments
		/// to be aligned is (NBBF, LastFragment]. The fragments to be aligned should be
		/// in the same section with this fragment, and each non-BF fragment on the path
		/// from this fragment to the fragments to be aligned must have a fixed size
		/// after finite times of relaxation.
class MCBoundaryAlignFragment : public MCFragment {		class MCBoundaryAlignFragment : public MCFragment {
		/// Flag to indicate that (optimal) NOPs should be emitted instead
		/// of using the provided value.
		bool EmitNops = false;
/// The alignment requirement of the branch to be aligned.		/// The alignment requirement of the branch to be aligned.
Align AlignBoundary;		Align AlignBoundary;
/// Flag to indicate whether the branch is fused. Use in determining the
/// region of fragments being aligned.
bool Fused : 1;
/// Flag to indicate whether NOPs should be emitted.
bool EmitNops : 1;
/// The size of the fragment. The size is lazily set during relaxation, and		/// The size of the fragment. The size is lazily set during relaxation, and
/// is not meaningful before that.		/// is not meaningful before that.
uint64_t Size = 0;		uint64_t Size = 0;
		/// Value to use for filling padding bytes if existing.
		Optional<uint8_t> Value;
		/// The maximum number of bytes to emit; if the Flag EmitNops is true,
		/// then this constraint is ignored.
		uint64_t MaxBytesToEmit = 0;
		/// The fragment to be aligned.
		const MCFragment *LastFragment = nullptr;

public:		public:
MCBoundaryAlignFragment(Align AlignBoundary, bool Fused = false,		MCBoundaryAlignFragment(MCSection *Sec = nullptr)
bool EmitNops = false, MCSection *Sec = nullptr)		: MCFragment(FT_BoundaryAlign, false, Sec) {}
: MCFragment(FT_BoundaryAlign, false, Sec), AlignBoundary(AlignBoundary),
Fused(Fused), EmitNops(EmitNops) {}

uint64_t getSize() const { return Size; }		uint64_t getSize() const { return Size; }
void setSize(uint64_t Value) { Size = Value; }		void setSize(uint64_t V) { Size = V; }

Align getAlignment() const { return AlignBoundary; }		Align getAlignment() const { return AlignBoundary; }
		void setAlignment(Align V) { AlignBoundary = V; }

bool isFused() const { return Fused; }		bool hasValue() const { return Value.hasValue(); }
void setFused(bool Value) { Fused = Value; }		uint8_t getValue() const { return Value.getValue(); }
		void setValue(uint8_t V) { Value = V; }

bool canEmitNops() const { return EmitNops; }		bool hasEmitNops() const { return EmitNops; }
void setEmitNops(bool Value) { EmitNops = Value; }		void setEmitNops(bool V) { EmitNops = V; }

		bool hasEmitNopsOrValue() const { return EmitNops || Value.hasValue(); }

		uint8_t getMaxBytesToEmit() const { return MaxBytesToEmit; }
		void setMaxBytesToEmit(uint64_t V) { MaxBytesToEmit = V; }

		const MCFragment *getFragment() const { return LastFragment; }
		void setFragment(const MCFragment *F) { LastFragment = F; }

static bool classof(const MCFragment *F) {		static bool classof(const MCFragment *F) {
return F->getKind() == MCFragment::FT_BoundaryAlign;		return F->getKind() == MCFragment::FT_BoundaryAlign;
}		}
};		};
} // end namespace llvm		} // end namespace llvm

#endif // LLVM_MC_MCFRAGMENT_H		#endif // LLVM_MC_MCFRAGMENT_H

llvm/include/llvm/MC/MCObjectStreamer.h

Show First 20 Lines • Show All 81 Lines • ▼ Show 20 Lines	public:
}		}

/// Get a data fragment to write into, creating a new one if the current		/// Get a data fragment to write into, creating a new one if the current
/// fragment is not a data fragment.		/// fragment is not a data fragment.
/// Optionally a \p STI can be passed in so that a new fragment is created		/// Optionally a \p STI can be passed in so that a new fragment is created
/// if the Subtarget differs from the current fragment.		/// if the Subtarget differs from the current fragment.
MCDataFragment *getOrCreateDataFragment(const MCSubtargetInfo *STI = nullptr);		MCDataFragment *getOrCreateDataFragment(const MCSubtargetInfo *STI = nullptr);

		/// Get a boundary-align fragment to write into, creating a new one if the
		/// current fragment is not a boundary-align fragment or has been used to emit
		/// something.
		MCBoundaryAlignFragment *getOrCreateBoundaryAlignFragment();

protected:		protected:
bool changeSectionImpl(MCSection *Section, const MCExpr *Subsection);		bool changeSectionImpl(MCSection *Section, const MCExpr *Subsection);

/// Assign a label to the current Section and Subsection even though a		/// Assign a label to the current Section and Subsection even though a
/// fragment is not yet present. Use flushPendingLabels(F) to associate		/// fragment is not yet present. Use flushPendingLabels(F) to associate
/// a fragment with this label.		/// a fragment with this label.
void addPendingLabel(MCSymbol* label);		void addPendingLabel(MCSymbol* label);


llvm/lib/MC/MCAssembler.cpp

Show First 20 Lines • Show All 607 Lines • ▼ Show 20 Lines	static void writeFragment(raw_ostream &OS, const MCAssembler &Asm,

case MCFragment::FT_LEB: {		case MCFragment::FT_LEB: {
const MCLEBFragment &LF = cast<MCLEBFragment>(F);		const MCLEBFragment &LF = cast<MCLEBFragment>(F);
OS << LF.getContents();		OS << LF.getContents();
break;		break;
}		}

case MCFragment::FT_BoundaryAlign: {		case MCFragment::FT_BoundaryAlign: {
		const MCBoundaryAlignFragment &BF = cast<MCBoundaryAlignFragment>(F);
		if (BF.hasEmitNops()) {
if (!Asm.getBackend().writeNopData(OS, FragmentSize))		if (!Asm.getBackend().writeNopData(OS, FragmentSize))
report_fatal_error("unable to write nop sequence of " +		report_fatal_error("unable to write nop sequence of " +
Twine(FragmentSize) + " bytes");		Twine(FragmentSize) + " bytes");
		} else if (BF.hasValue()) {
		for (uint64_t i = 0; i != FragmentSize; ++i)
		OS << char(BF.getValue());
		}
break;		break;
}		}

case MCFragment::FT_SymbolId: {		case MCFragment::FT_SymbolId: {
const MCSymbolIdFragment &SF = cast<MCSymbolIdFragment>(F);		const MCSymbolIdFragment &SF = cast<MCSymbolIdFragment>(F);
support::endian::write<uint32_t>(OS, SF.getSymbol()->getIndex(), Endian);		support::endian::write<uint32_t>(OS, SF.getSymbol()->getIndex(), Endian);
break;		break;
}		}
▲ Show 20 Lines • Show All 362 Lines • ▼ Show 20 Lines
static bool needPadding(uint64_t StartAddr, uint64_t Size,		static bool needPadding(uint64_t StartAddr, uint64_t Size,
Align BoundaryAlignment) {		Align BoundaryAlignment) {
return mayCrossBoundary(StartAddr, Size, BoundaryAlignment) ||		return mayCrossBoundary(StartAddr, Size, BoundaryAlignment) ||
isAgainstBoundary(StartAddr, Size, BoundaryAlignment);		isAgainstBoundary(StartAddr, Size, BoundaryAlignment);
}		}

bool MCAssembler::relaxBoundaryAlign(MCAsmLayout &Layout,		bool MCAssembler::relaxBoundaryAlign(MCAsmLayout &Layout,
MCBoundaryAlignFragment &BF) {		MCBoundaryAlignFragment &BF) {
// The MCBoundaryAlignFragment that doesn't emit NOP should not be relaxed.		// The MCBoundaryAlignFragment that does not emit anything or not have any
if (!BF.canEmitNops())		// fragment to be aligned should not be relaxed.
		if (!BF.hasEmitNopsOrValue() || !BF.getFragment())
return false;		return false;

uint64_t AlignedOffset = Layout.getFragmentOffset(BF.getNextNode());		// Compute the size of all the fragments in the range we're trying to align.
uint64_t AlignedSize = 0;		const MCFragment *TF = BF.getFragment();
const MCFragment *F = BF.getNextNode();		uint64_t AlignedSize = computeFragmentSize(Layout, *TF);
// If the branch is unfused, it is emitted into one fragment, otherwise it is		uint64_t AlignedOffset = Layout.getFragmentOffset(TF);
// emitted into two fragments at most, the next MCBoundaryAlignFragment(if		// Note: It should be guaranteed that there is a MCBoundaryAlignFragment
// exists) also marks the end of the branch.		// before TF in the same section.
for (auto i = 0, N = BF.isFused() ? 2 : 1;		for (auto *F = TF->getPrevNode(); !isa<MCBoundaryAlignFragment>(F);
i != N && !isa<MCBoundaryAlignFragment>(F); ++i, F = F->getNextNode()) {		F = F->getPrevNode()) {
AlignedSize += computeFragmentSize(Layout, *F);		assert(F->hasInstructions() &&
}		"The fragment doesn't have any instruction.");
uint64_t OldSize = BF.getSize();		uint64_t Size = computeFragmentSize(Layout, *F);
AlignedOffset -= OldSize;		AlignedSize += Size;
		AlignedOffset -= Size;
		}

		// Compute the size of all the MCBoundaryAlignFragments in the range
		// [BF,BF.getFragment).
		uint64_t FixedValue = 0;
		for (const MCFragment *F = &BF; F != TF; F = F->getNextNode())
		if (auto *MBF = dyn_cast<MCBoundaryAlignFragment>(F))
		FixedValue += MBF->getSize();

		AlignedOffset -= FixedValue;
Align BoundaryAlignment = BF.getAlignment();		Align BoundaryAlignment = BF.getAlignment();
uint64_t NewSize = needPadding(AlignedOffset, AlignedSize, BoundaryAlignment)		uint64_t NewSize = needPadding(AlignedOffset, AlignedSize, BoundaryAlignment)
? offsetToAlignment(AlignedOffset, BoundaryAlignment)		? offsetToAlignment(AlignedOffset, BoundaryAlignment)
: 0U;		: 0U;
if (NewSize == OldSize)		if (!BF.hasEmitNops()) {
		assert(BF.getNextNode()->hasInstructions() &&
		"The fragment doesn't have any instruction.");
		if (NewSize > static_cast<uint64_t>(BF.getMaxBytesToEmit()))
		NewSize = 0;
		}
		if (NewSize == BF.getSize())
return false;		return false;
BF.setSize(NewSize);		BF.setSize(NewSize);
Layout.invalidateFragmentsFrom(&BF);		Layout.invalidateFragmentsFrom(&BF);
return true;		return true;
}		}

bool MCAssembler::relaxDwarfLineAddr(MCAsmLayout &Layout,		bool MCAssembler::relaxDwarfLineAddr(MCAsmLayout &Layout,
MCDwarfLineAddrFragment &DF) {		MCDwarfLineAddrFragment &DF) {

llvm/lib/MC/MCFragment.cpp

Show First 20 Lines • Show All 418 Lines • ▼ Show 20 Lines	LLVM_DUMP_METHOD void MCFragment::dump() const {
case MCFragment::FT_LEB: {		case MCFragment::FT_LEB: {
const auto *LF = cast<MCLEBFragment>(this);		const auto *LF = cast<MCLEBFragment>(this);
OS << "\n ";		OS << "\n ";
OS << " Value:" << LF->getValue() << " Signed:" << LF->isSigned();		OS << " Value:" << LF->getValue() << " Signed:" << LF->isSigned();
break;		break;
}		}
case MCFragment::FT_BoundaryAlign: {		case MCFragment::FT_BoundaryAlign: {
const auto *BF = cast<MCBoundaryAlignFragment>(this);		const auto *BF = cast<MCBoundaryAlignFragment>(this);
if (BF->canEmitNops())		if (BF->hasEmitNops())
OS << " (can emit nops to align";		OS << " (emit nops)";
if (BF->isFused())
OS << " fused branch)";
else
OS << " unfused branch)";
OS << "\n ";		OS << "\n ";
		if (BF->hasValue())
		OS << " Value:" << hexdigit(BF->getValue());
OS << " BoundarySize:" << BF->getAlignment().value()		OS << " BoundarySize:" << BF->getAlignment().value()
		<< " MaxBytesToEmit:" << BF->getMaxBytesToEmit()
<< " Size:" << BF->getSize();		<< " Size:" << BF->getSize();
break;		break;
}		}
case MCFragment::FT_SymbolId: {		case MCFragment::FT_SymbolId: {
const auto *F = cast<MCSymbolIdFragment>(this);		const auto *F = cast<MCSymbolIdFragment>(this);
OS << "\n ";		OS << "\n ";
OS << " Sym:" << F->getSymbol();		OS << " Sym:" << F->getSymbol();
break;		break;

llvm/lib/MC/MCObjectStreamer.cpp

Show First 20 Lines • Show All 185 Lines • ▼ Show 20 Lines	MCFragment *MCObjectStreamer::getCurrentFragment() const {
assert(getCurrentSectionOnly() && "No current section!");		assert(getCurrentSectionOnly() && "No current section!");

if (CurInsertionPoint != getCurrentSectionOnly()->getFragmentList().begin())		if (CurInsertionPoint != getCurrentSectionOnly()->getFragmentList().begin())
return &*std::prev(CurInsertionPoint);		return &*std::prev(CurInsertionPoint);

return nullptr;		return nullptr;
}		}

static bool CanReuseDataFragment(const MCDataFragment &F,		static bool CanReuseDataFragment(const MCDataFragment &F, MCObjectStreamer &OS,
const MCAssembler &Assembler,
const MCSubtargetInfo *STI) {		const MCSubtargetInfo *STI) {
if (!F.hasInstructions())		if (!F.hasInstructions())
return true;		return true;

		MCAssembler &Assembler = OS.getAssembler();

// When bundling is enabled, we don't want to add data to a fragment that		// When bundling is enabled, we don't want to add data to a fragment that
// already has instructions (see MCELFStreamer::EmitInstToData for details)		// already has instructions (see MCELFStreamer::EmitInstToData for details)
if (Assembler.isBundlingEnabled())		if (Assembler.isBundlingEnabled())
return Assembler.getRelaxAll();		return Assembler.getRelaxAll();

// If the subtarget is changed mid fragment we start a new fragment to record		// If the subtarget is changed mid fragment we start a new fragment to record
// the new STI.		// the new STI.
return !STI || F.getSubtargetInfo() == STI;		return !STI || F.getSubtargetInfo() == STI;
}		}

MCDataFragment *		MCDataFragment *
MCObjectStreamer::getOrCreateDataFragment(const MCSubtargetInfo *STI) {		MCObjectStreamer::getOrCreateDataFragment(const MCSubtargetInfo *STI) {
MCDataFragment *F = dyn_cast_or_null<MCDataFragment>(getCurrentFragment());		MCDataFragment *F = dyn_cast_or_null<MCDataFragment>(getCurrentFragment());
if (!F || !CanReuseDataFragment(*F, *Assembler, STI)) {		if (!F || !CanReuseDataFragment(*F, *this, STI)) {
F = new MCDataFragment();		F = new MCDataFragment();
insert(F);		insert(F);
}		}
return F;		return F;
}		}

		MCBoundaryAlignFragment *MCObjectStreamer::getOrCreateBoundaryAlignFragment() {
		auto *F = dyn_cast_or_null<MCBoundaryAlignFragment>(getCurrentFragment());
		if (!F || F->hasEmitNopsOrValue()) {
		F = new MCBoundaryAlignFragment();
		insert(F);
		}
		return F;
		}

void MCObjectStreamer::visitUsedSymbol(const MCSymbol &Sym) {		void MCObjectStreamer::visitUsedSymbol(const MCSymbol &Sym) {
Assembler->registerSymbol(Sym);		Assembler->registerSymbol(Sym);
}		}

void MCObjectStreamer::emitCFISections(bool EH, bool Debug) {		void MCObjectStreamer::emitCFISections(bool EH, bool Debug) {
MCStreamer::emitCFISections(EH, Debug);		MCStreamer::emitCFISections(EH, Debug);
EmitEHFrame = EH;		EmitEHFrame = EH;
EmitDebugFrame = Debug;		EmitDebugFrame = Debug;

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp

//===-- X86AsmBackend.cpp - X86 Assembler Backend -------------------------===//		//===-- X86AsmBackend.cpp - X86 Assembler Backend -------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "MCTargetDesc/X86BaseInfo.h"		#include "MCTargetDesc/X86BaseInfo.h"
#include "MCTargetDesc/X86FixupKinds.h"		#include "MCTargetDesc/X86FixupKinds.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/BinaryFormat/ELF.h"		#include "llvm/BinaryFormat/ELF.h"
#include "llvm/BinaryFormat/MachO.h"		#include "llvm/BinaryFormat/MachO.h"
#include "llvm/MC/MCAsmBackend.h"		#include "llvm/MC/MCAsmBackend.h"
#include "llvm/MC/MCAssembler.h"		#include "llvm/MC/MCAssembler.h"
		#include "llvm/MC/MCCodeEmitter.h"
#include "llvm/MC/MCContext.h"		#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCDwarf.h"		#include "llvm/MC/MCDwarf.h"
#include "llvm/MC/MCELFObjectWriter.h"		#include "llvm/MC/MCELFObjectWriter.h"
#include "llvm/MC/MCExpr.h"		#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCFixupKindInfo.h"		#include "llvm/MC/MCFixupKindInfo.h"
#include "llvm/MC/MCInst.h"		#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCInstrInfo.h"		#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCMachObjectWriter.h"		#include "llvm/MC/MCMachObjectWriter.h"
▲ Show 20 Lines • Show All 48 Lines • ▼ Show 20 Lines	public:
void addKind(X86::AlignBranchBoundaryKind Value) { AlignBranchKind |= Value; }		void addKind(X86::AlignBranchBoundaryKind Value) { AlignBranchKind |= Value; }
};		};

X86AlignBranchKind X86AlignBranchKindLoc;		X86AlignBranchKind X86AlignBranchKindLoc;

cl::opt<unsigned> X86AlignBranchBoundary(		cl::opt<unsigned> X86AlignBranchBoundary(
"x86-align-branch-boundary", cl::init(0),		"x86-align-branch-boundary", cl::init(0),
cl::desc(		cl::desc(
"Control how the assembler should align branches with NOP. If the "		"Control how the assembler should align branches with NOP or segment "
"boundary's size is not 0, it should be a power of 2 and no less "		"override prefix. If the boundary's size is not 0, it should be a "
"than 32. Branches will be aligned to prevent from being across or "		"power of 2 and no less than 16. Branches will be aligned to prevent "
"against the boundary of specified size. The default value 0 does not "		"from being across or against the boundary of specified size. The "
"align branches."));		"default value 0 does not align branches."));

cl::opt<X86AlignBranchKind, true, cl::parser<std::string>> X86AlignBranch(		cl::opt<X86AlignBranchKind, true, cl::parser<std::string>> X86AlignBranch(
"x86-align-branch",		"x86-align-branch",
cl::desc(		cl::desc(
"Specify types of branches to align (plus separated list of types):"		"Specify types of branches to align (plus separated list of types):"
"\njcc indicates conditional jumps"		"\njcc indicates conditional jumps"
"\nfused indicates fused conditional jumps"		"\nfused indicates fused conditional jumps"
"\njmp indicates direct unconditional jumps"		"\njmp indicates direct unconditional jumps"
"\ncall indicates direct and indirect calls"		"\ncall indicates direct and indirect calls"
"\nret indicates rets"		"\nret indicates rets"
"\nindirect indicates indirect unconditional jumps"),		"\nindirect indicates indirect unconditional jumps"),
cl::location(X86AlignBranchKindLoc));		cl::location(X86AlignBranchKindLoc));

		cl::opt<unsigned> X86AlignBranchPrefixSize(
		"x86-align-branch-prefix-size", cl::init(0),
		cl::desc("Specify the maximum number of prefixes on an instruction to "
		"align branches. The number should be between 0 and 5."));

cl::opt<bool> X86AlignBranchWithin32BBoundaries(		cl::opt<bool> X86AlignBranchWithin32BBoundaries(
"x86-branches-within-32B-boundaries", cl::init(false),		"x86-branches-within-32B-boundaries", cl::init(false),
cl::desc(		cl::desc(
"Align selected instructions to mitigate negative performance impact "		"Align selected instructions to mitigate negative performance impact "
"of Intel's micro code update for errata skx102. May break "		"of Intel's micro code update for errata skx102. May break "
"assumptions about labels corresponding to particular instructions, "		"assumptions about labels corresponding to particular instructions, "
"and should be used with caution."));		"and should be used with caution."));

class X86ELFObjectWriter : public MCELFObjectTargetWriter {		class X86ELFObjectWriter : public MCELFObjectTargetWriter {
public:		public:
X86ELFObjectWriter(bool is64Bit, uint8_t OSABI, uint16_t EMachine,		X86ELFObjectWriter(bool is64Bit, uint8_t OSABI, uint16_t EMachine,
bool HasRelocationAddend, bool foobar)		bool HasRelocationAddend, bool foobar)
: MCELFObjectTargetWriter(is64Bit, OSABI, EMachine, HasRelocationAddend) {}		: MCELFObjectTargetWriter(is64Bit, OSABI, EMachine, HasRelocationAddend) {}
};		};

class X86AsmBackend : public MCAsmBackend {		class X86AsmBackend : public MCAsmBackend {
const MCSubtargetInfo &STI;		const MCSubtargetInfo &STI;
std::unique_ptr<const MCInstrInfo> MCII;		std::unique_ptr<const MCInstrInfo> MCII;
X86AlignBranchKind AlignBranchType;		X86AlignBranchKind AlignBranchType;
Align AlignBoundary;		Align AlignBoundary;
		uint8_t AlignMaxPrefixSize = 0;
		uint64_t LastDFSizeOfInst = 0;
		bool HasPrefixedInst = false;
		bool IsPrefixedInst = false;

bool isMacroFused(const MCInst &Cmp, const MCInst &Jcc) const;		bool isMacroFused(const MCInst &Cmp, const MCInst &Jcc) const;

bool needAlign(MCObjectStreamer &OS) const;		bool needAlign(MCObjectStreamer &OS) const;
bool needAlignInst(const MCInst &Inst) const;		bool needAlignInst(const MCInst &Inst) const;
MCBoundaryAlignFragment *
getOrCreateBoundaryAlignFragment(MCObjectStreamer &OS) const;		bool shouldAddPrefix(const MCInst &Inst) const;
		uint8_t choosePrefix(const MCInst &Inst) const;
		uint8_t getMaxPrefixSize(MCObjectStreamer &OS, const MCInst &Inst,
		uint8_t InstSize) const;
		const MCFragment *LastFragmentToBeAligned = nullptr;
		const MCDataFragment *LastDFOfInst = nullptr;
MCInst PrevInst;		MCInst PrevInst;

public:		public:
X86AsmBackend(const Target &T, const MCSubtargetInfo &STI)		X86AsmBackend(const Target &T, const MCSubtargetInfo &STI)
: MCAsmBackend(support::little), STI(STI),		: MCAsmBackend(support::little), STI(STI),
MCII(T.createMCInstrInfo()) {		MCII(T.createMCInstrInfo()) {
if (X86AlignBranchWithin32BBoundaries) {		if (X86AlignBranchWithin32BBoundaries) {
// At the moment, this defaults to aligning fused branches, unconditional		// At the moment, this defaults to aligning fused branches, unconditional
// jumps, and (unfused) conditional jumps with nops. Both the		// jumps, and (unfused) conditional jumps with nops. Both the
// instructions aligned and the alignment method (nop vs prefix) may		// instructions aligned and the alignment method (nop vs prefix) may
// change in the future.		// change in the future.
AlignBoundary = assumeAligned(32);;		AlignBoundary = assumeAligned(32);;
AlignBranchType.addKind(X86::AlignBranchFused);		AlignBranchType.addKind(X86::AlignBranchFused);
AlignBranchType.addKind(X86::AlignBranchJcc);		AlignBranchType.addKind(X86::AlignBranchJcc);
AlignBranchType.addKind(X86::AlignBranchJmp);		AlignBranchType.addKind(X86::AlignBranchJmp);
}		}
// Allow overriding defaults set by master flag		// Allow overriding defaults set by master flag
if (X86AlignBranchBoundary.getNumOccurrences())		if (X86AlignBranchBoundary.getNumOccurrences())
AlignBoundary = assumeAligned(X86AlignBranchBoundary);		AlignBoundary = assumeAligned(X86AlignBranchBoundary);
if (X86AlignBranch.getNumOccurrences())		if (X86AlignBranch.getNumOccurrences())
AlignBranchType = X86AlignBranchKindLoc;		AlignBranchType = X86AlignBranchKindLoc;
		if (X86AlignBranchPrefixSize.getNumOccurrences())
		AlignMaxPrefixSize = std::min<uint8_t>(X86AlignBranchPrefixSize, 5);
}		}

bool allowAutoPadding() const override;		bool allowAutoPadding() const override;
void alignBranchesBegin(MCObjectStreamer &OS, const MCInst &Inst) override;		void alignBranchesBegin(MCObjectStreamer &OS, const MCInst &Inst) override;
void alignBranchesEnd(MCObjectStreamer &OS, const MCInst &Inst) override;		void alignBranchesEnd(MCObjectStreamer &OS, const MCInst &Inst) override;

unsigned getNumFixupKinds() const override {		unsigned getNumFixupKinds() const override {
return X86::NumTargetFixupKinds;		return X86::NumTargetFixupKinds;
[205 lines not shown]	bool X86AsmBackend::allowAutoPadding() const {
return (AlignBoundary != Align(1) && AlignBranchType != X86::AlignBranchNone);		return (AlignBoundary != Align(1) && AlignBranchType != X86::AlignBranchNone);
}		}

bool X86AsmBackend::needAlign(MCObjectStreamer &OS) const {		bool X86AsmBackend::needAlign(MCObjectStreamer &OS) const {
if (!OS.getAllowAutoPadding())		if (!OS.getAllowAutoPadding())
return false;		return false;
assert(allowAutoPadding() && "incorrect initialization!");		assert(allowAutoPadding() && "incorrect initialization!");

MCAssembler &Assembler = OS.getAssembler();		// Currently don't deal with Bundle cases.
MCSection *Sec = OS.getCurrentSectionOnly();		if (OS.getAssembler().isBundlingEnabled())
		reames: This looks like a potentially unrelated change. Can it be separated?
		skan (author): Okay.
		skan (author): Separated this piece of code into D75346.
// To be Done: Currently don't deal with Bundle cases.
if (Assembler.isBundlingEnabled() && Sec->isBundleLocked())
return false;		return false;

// Branches only need to be aligned in 32-bit or 64-bit mode.		// Branches only need to be aligned in 32-bit or 64-bit mode.
if (!(STI.hasFeature(X86::Mode64Bit) || STI.hasFeature(X86::Mode32Bit)))		if (!(STI.hasFeature(X86::Mode64Bit) || STI.hasFeature(X86::Mode32Bit)))
return false;		return false;

return true;		return true;
}		}
[13 lines not shown]	return (InstDesc.isConditionalBranch() &&
(InstDesc.isCall() &&		(InstDesc.isCall() &&
(AlignBranchType & X86::AlignBranchCall)) ||		(AlignBranchType & X86::AlignBranchCall)) ||
(InstDesc.isReturn() &&		(InstDesc.isReturn() &&
(AlignBranchType & X86::AlignBranchRet)) ||		(AlignBranchType & X86::AlignBranchRet)) ||
(InstDesc.isIndirectBranch() &&		(InstDesc.isIndirectBranch() &&
(AlignBranchType & X86::AlignBranchIndirect));		(AlignBranchType & X86::AlignBranchIndirect));
}		}

static bool canReuseBoundaryAlignFragment(const MCBoundaryAlignFragment &F) {		/// Check if prefix can be added before instruction \p Inst.
// If a MCBoundaryAlignFragment has not been used to emit NOP,we can reuse it.		bool X86AsmBackend::shouldAddPrefix(const MCInst &Inst) const {
return !F.canEmitNops();		assert(!needAlignInst(Inst) && "Unexpected control flow!");

		// At most one instruction can be prefixed to align one instruction or a fused
		// pair.
		if (HasPrefixedInst)
		return false;

		// No prefix can be added if AlignMaxPrefixSize is 0.
		if (AlignMaxPrefixSize == 0)
		return false;

		unsigned Opcode = Inst.getOpcode();
		const MCInstrDesc &Desc = MCII->get(Opcode);

		// We only add prefix on a real instruction.
		if(Desc.isPseudo() || X86::isPrefix(Opcode))
		Lint (pre-merge checks), clang-format: please reformat the code:
		- if(Desc.isPseudo() || X86::isPrefix(Opcode))
		+ if (Desc.isPseudo() || X86::isPrefix(Opcode))
		return false;

		// Linker may rewrite the instruction with variant symbol operand.
		return !hasVariantSymbol(Inst);
}		}

MCBoundaryAlignFragment *		/// Choose which prefix should be inserted before the instruction.
X86AsmBackend::getOrCreateBoundaryAlignFragment(MCObjectStreamer &OS) const {		///
auto *F = dyn_cast_or_null<MCBoundaryAlignFragment>(OS.getCurrentFragment());		/// If there is one, use the existing segment override prefix.
if (!F || !canReuseBoundaryAlignFragment(*F)) {		/// If the target is 64-bit, use the CS.
F = new MCBoundaryAlignFragment(AlignBoundary);		/// If the target is 32-bit,
OS.insert(F);		/// - If the instruction has a ESP/EBP base register, use SS.
		/// - Otherwise use DS.
		uint8_t X86AsmBackend::choosePrefix(const MCInst &Inst) const {
		assert((STI.hasFeature(X86::Mode32Bit) || STI.hasFeature(X86::Mode64Bit)) &&
		skan (author): Split this piece of code into D75357.
		"Prefixes can be added only in 32-bit or 64-bit mode.");
		unsigned Opcode = Inst.getOpcode();
		const MCInstrDesc &Desc = MCII->get(Opcode);
		uint64_t TSFlags = Desc.TSFlags;

		unsigned CurOp = X86II::getOperandBias(Desc);

		// Determine where the memory operand starts, if present.
		int MemoryOperand = X86II::getMemoryOperandNo(TSFlags);
		if (MemoryOperand != -1)
		MemoryOperand += CurOp;

		unsigned SegmentReg = 0;
		if (MemoryOperand >= 0) {
		// Check for explicit segment override on memory operand.
		SegmentReg = Inst.getOperand(MemoryOperand + X86::AddrSegmentReg).getReg();
}		}
return F;
		uint64_t Form = TSFlags & X86II::FormMask;
		switch (Form) {
		default:
		break;
		case X86II::RawFrmDstSrc: {
		// Check segment override opcode prefix as needed (not for %ds).
		if (Inst.getOperand(2).getReg() != X86::DS)
		SegmentReg = Inst.getOperand(2).getReg();
		break;
		}
		case X86II::RawFrmSrc: {
		// Check segment override opcode prefix as needed (not for %ds).
		if (Inst.getOperand(1).getReg() != X86::DS)
		SegmentReg = Inst.getOperand(1).getReg();
		break;
		}
		case X86II::RawFrmMemOffs: {
		// Check segment override opcode prefix as needed.
		SegmentReg = Inst.getOperand(1).getReg();
		break;
		}
		}

		switch (SegmentReg) {
		case 0:
		break;
		case X86::CS:
		return 0x2e;
		reames: Please replace constants with X86::CS_PREFIX and friends.
		skan (author): X86::CS_PREFIX is an enum value that denotes the CS prefix as an instruction. The constant 0x2e is the encoding value for X86::CS; they are different.
		case X86::SS:
		return 0x36;
		case X86::DS:
		return 0x3e;
		case X86::ES:
		return 0x26;
		case X86::FS:
		return 0x64;
		case X86::GS:
		return 0x65;
		}

		if (STI.hasFeature(X86::Mode64Bit))
		return 0x2e;

		if (MemoryOperand >= 0) {
		unsigned BaseRegNum = MemoryOperand + X86::AddrBaseReg;
		unsigned BaseReg = Inst.getOperand(BaseRegNum).getReg();
		if (BaseReg == X86::ESP || BaseReg == X86::EBP)
		return 0x36;
		}
		return 0x3e;
		}
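
The selection order documented for choosePrefix above can be sketched outside LLVM as follows. This is a minimal illustration, not the patch's code: the Seg and Base enums and the function names are hypothetical stand-ins for the X86:: register values, and the instruction-form handling (RawFrmSrc and friends) is collapsed into a single "existing override" argument. Only the prefix byte values are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for LLVM's register enums; not the real X86:: values.
enum class Seg { None, CS, SS, DS, ES, FS, GS };
enum class Base { Other, ESP, EBP };

// Encoding of each x86 segment-override prefix byte (same values as the diff).
inline uint8_t segmentPrefixByte(Seg S) {
  assert(S != Seg::None && "no prefix byte for an absent override");
  switch (S) {
  case Seg::CS: return 0x2e;
  case Seg::SS: return 0x36;
  case Seg::DS: return 0x3e;
  case Seg::ES: return 0x26;
  case Seg::FS: return 0x64;
  case Seg::GS: return 0x65;
  case Seg::None: break;
  }
  return 0; // not reached when the assertion holds
}

// Decision order described in the doc comment of choosePrefix:
// 1. reuse an existing segment override, 2. CS in 64-bit mode,
// 3. SS for an ESP/EBP-based memory operand in 32-bit mode, 4. DS otherwise.
inline uint8_t choosePaddingPrefix(Seg Existing, bool Is64Bit,
                                   bool HasMemOperand, Base B) {
  if (Existing != Seg::None)
    return segmentPrefixByte(Existing);
  if (Is64Bit)
    return segmentPrefixByte(Seg::CS);
  if (HasMemOperand && (B == Base::ESP || B == Base::EBP))
    return segmentPrefixByte(Seg::SS);
  return segmentPrefixByte(Seg::DS);
}
```

Reusing an existing override keeps the instruction's meaning unchanged; the SS/DS split in 32-bit mode presumably follows the default segment of the addressing mode for the same reason.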

		/// Get the maximum size of prefixes that can be added on the instruction.
		uint8_t X86AsmBackend::getMaxPrefixSize(MCObjectStreamer &OS,
		const MCInst &Inst,
		uint8_t InstSize) const {

		assert(InstSize <= 15 &&
		"The length of instruction must be no longer than 15.");
		SmallString<256> Code;
		raw_svector_ostream VecOS(Code);
		OS.getAssembler().getEmitter().emitPrefix(Inst, VecOS, STI);
		assert(Code.size() < 15 && "The number of prefixes must be less than 15.");
		uint8_t ExistingPrefixSize = static_cast<uint8_t>(Code.size());
		uint8_t MaxPrefixSize = (AlignMaxPrefixSize > ExistingPrefixSize)
		? (AlignMaxPrefixSize - ExistingPrefixSize)
		: 0;
		MaxPrefixSize = std::min(MaxPrefixSize, static_cast<uint8_t>(15 - InstSize));
		return MaxPrefixSize;
}		}
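
The budget computed by getMaxPrefixSize above is the smaller of two limits: what remains of the --x86-align-branch-prefix-size allowance after the prefixes the instruction already carries, and what remains of the 15-byte x86 instruction length limit. A standalone sketch of that arithmetic, assuming a hypothetical function name and plain integer arguments in place of the MC objects:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Limit:              value of --x86-align-branch-prefix-size (capped at 5 in the patch)
// ExistingPrefixSize: prefix bytes the encoder already emits for the instruction
// InstSize:           total encoded size of the instruction, prefixes included
inline uint8_t maxPrefixBytes(uint8_t Limit, uint8_t ExistingPrefixSize,
                              uint8_t InstSize) {
  assert(InstSize <= 15 && "x86 instructions are at most 15 bytes long");
  // Allowance left once existing prefixes are counted against the limit.
  uint8_t Budget = Limit > ExistingPrefixSize
                       ? static_cast<uint8_t>(Limit - ExistingPrefixSize)
                       : 0;
  // Bytes left before the instruction would exceed 15 bytes.
  uint8_t Room = static_cast<uint8_t>(15 - InstSize);
  return std::min(Budget, Room);
}
```

For example, a 6-byte instruction that already carries 3 prefix bytes gets a budget of 1 with a limit of 4 and a budget of 2 with a limit of 5; these inputs are illustrative and not taken from the encoder.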

/// Insert MCBoundaryAlignFragment before instructions to align branches.		/// Insert MCBoundaryAlignFragment before instructions to align branches.
void X86AsmBackend::alignBranchesBegin(MCObjectStreamer &OS,		void X86AsmBackend::alignBranchesBegin(MCObjectStreamer &OS,
const MCInst &Inst) {		const MCInst &Inst) {
if (!needAlign(OS))		if (!needAlign(OS))
return;		return;

		// Summary of inserting scheme(Two Steps):
		// Step 1:
		// If the previous instruction is the first instruction in a fusible pair
		// - If macro fusion actually happens, emit NOP before the first instruction
		// in the fused pair and skip step 2.
		// - If the macro fusion doesn't happen indeed, emit prefix before the
		// previous instruction if necessary.
		//
		// Step 2:
		// If the instruction needs to be aligned, emit NOP before the instruction.
		//
		// If the instruction is the first instruction in a fusible pair, put a
		// placeholder here.
		//
		// Otherwise emit prefix before the instruction if necessary.

MCFragment *CF = OS.getCurrentFragment();		MCFragment *CF = OS.getCurrentFragment();

		// Prefix or NOP shouldn't be inserted after hardcode, e.g.
		//
		// \code
		// .byte 0x2e
		// jmp .Label0
		// \endcode
		//
		// since there is no clear instruction boundary.
		if (auto *F = dyn_cast_or_null<MCDataFragment>(CF)) {
		// FIXME: The method to detect hardcode is tricky here.
		if (F != LastDFOfInst || F->getContents().size() != LastDFSizeOfInst) {
		return;
		}
		}

		// Prefix or NOP shouldn't be inserted after prefix. e.g.
		//
		// \code
		// data16
		// leaq bar@tlsld(%rip), %rdi
		// \endcode
		if(X86::isPrefix(PrevInst.getOpcode()))
		Lint (pre-merge checks), clang-format: please reformat the code:
		- if(X86::isPrefix(PrevInst.getOpcode()))
		+ if (X86::isPrefix(PrevInst.getOpcode()))
		return;

bool NeedAlignFused = AlignBranchType & X86::AlignBranchFused;		bool NeedAlignFused = AlignBranchType & X86::AlignBranchFused;
if (NeedAlignFused && isMacroFused(PrevInst, Inst) && CF) {		// Step 1:
// Macro fusion actually happens and there is no other fragment inserted		// Handle the condition when the previous instruction is the first
// after the previous instruction. NOP can be emitted in PF to align fused		// instruction in a fusible pair. Note: We need to check the previous
// jcc.		// fragment is a BF since we may encounter the case:
if (auto *PF =
dyn_cast_or_null<MCBoundaryAlignFragment>(CF->getPrevNode())) {
const_cast<MCBoundaryAlignFragment *>(PF)->setEmitNops(true);
const_cast<MCBoundaryAlignFragment *>(PF)->setFused(true);
}
} else if (needAlignInst(Inst)) {
// Note: When there is at least one fragment, such as MCAlignFragment,
// inserted after the previous instruction, e.g.
//		//
// \code		// \code
// cmp %rax %rcx		// cmp %rax %rcx
// .align 16		// .align 16
// je .Label0		// je .Label0
// \ endcode		// \endcode
//		//
// We will treat the JCC as a unfused branch although it may be fused		// MCAlignFragment can grow and shrink, so it is not ensured to get a fixed
// with the CMP.		// size after finite times of relaxation. NOP or prefix should not be emitted
auto *F = getOrCreateBoundaryAlignFragment(OS);		// before the CMP since it may cause MCAssembler::relaxBoundaryAlign not to
		// converge.
		if (NeedAlignFused && isFirstMacroFusibleInst(PrevInst, *MCII) && CF &&
		isa_and_nonnull<MCBoundaryAlignFragment>(CF->getPrevNode())) {
		auto PF = const_cast<MCBoundaryAlignFragment *>(
		cast<MCBoundaryAlignFragment>(CF->getPrevNode()));
		// Macro fusion actually happens, so emit NOP before the first instruction in
		// the fused pair.
		if (isMacroFused(PrevInst, Inst)) {
		PF->setAlignment(AlignBoundary);
		PF->setEmitNops(true);
		return;
		} else if (shouldAddPrefix(PrevInst)) {
		// Macro fusion doesn't happen indeed, emit prefix before the previous
		// instruction.
		PF->setAlignment(AlignBoundary);
		PF->setValue(choosePrefix(PrevInst));
		HasPrefixedInst = true;
		if (isa<MCDataFragment>(CF)) {
		uint8_t MaxBytesToEmit = getMaxPrefixSize(
		OS, PrevInst, static_cast<uint8_t>(LastDFSizeOfInst));
		PF->setMaxBytesToEmit(MaxBytesToEmit);
		}
		}
		}

		// Step 2:
		if (needAlignInst(Inst)) {
		// Handle the condition when the instruction to be aligned is unfused.
		// Emit NOP before the instruction to be aligned.
		auto *F = OS.getOrCreateBoundaryAlignFragment();
		F->setAlignment(AlignBoundary);
F->setEmitNops(true);		F->setEmitNops(true);
F->setFused(false);
} else if (NeedAlignFused && isFirstMacroFusibleInst(Inst, *MCII)) {		} else if (NeedAlignFused && isFirstMacroFusibleInst(Inst, *MCII)) {
// We don't know if macro fusion happens until the reaching the next		// We don't know if macro fusion happens until reaching the next
// instruction, so a place holder is put here if necessary.		// instruction, so a placeholder is put here if necessary.
getOrCreateBoundaryAlignFragment(OS);		OS.getOrCreateBoundaryAlignFragment();
		} else if (shouldAddPrefix(Inst)) {
		// Emit prefixes before instruction that doesn't need to be aligned.
		auto *F = OS.getOrCreateBoundaryAlignFragment();
		F->setAlignment(AlignBoundary);
		F->setValue(choosePrefix(Inst));
		HasPrefixedInst = true;
		IsPrefixedInst = true;
}		}

PrevInst = Inst;
}		}
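
The two-step scheme summarized at the top of alignBranchesBegin can be modeled as a pure decision function. The sketch below is an illustration under stated assumptions: Pad, Inputs, Decision and decide are hypothetical names, the fragment bookkeeping is omitted, and the boolean inputs stand in for isFirstMacroFusibleInst, isMacroFused, needAlignInst and shouldAddPrefix.

```cpp
#include <cassert>

enum class Pad { None, Nop, Prefix, Placeholder };

struct Inputs {
  bool PrevIsFirstFusible; // previous instruction may start a fused pair
  bool FusedWithPrev;      // previous and current instruction actually fuse
  bool PrevCanTakePrefix;  // models shouldAddPrefix(PrevInst)
  bool CurNeedsAlign;      // models needAlignInst(Inst)
  bool CurIsFirstFusible;  // current instruction may start a fused pair
  bool CurCanTakePrefix;   // models shouldAddPrefix(Inst)
};

struct Decision {
  Pad OnPrev = Pad::None; // padding placed before the previous instruction
  Pad OnCur = Pad::None;  // padding placed before the current instruction
};

inline Decision decide(const Inputs &In) {
  assert((!In.FusedWithPrev || In.PrevIsFirstFusible) &&
         "a fused pair implies a fusible first instruction");
  Decision D;
  // Step 1: resolve the placeholder left before a possibly fusible instruction.
  if (In.PrevIsFirstFusible) {
    if (In.FusedWithPrev) {
      D.OnPrev = Pad::Nop; // align the fused pair as a unit, skip step 2
      return D;
    }
    if (In.PrevCanTakePrefix)
      D.OnPrev = Pad::Prefix;
  }
  // Step 2: handle the current instruction.
  if (In.CurNeedsAlign)
    D.OnCur = Pad::Nop;
  else if (In.CurIsFirstFusible)
    D.OnCur = Pad::Placeholder;
  else if (In.CurCanTakePrefix && D.OnPrev != Pad::Prefix)
    D.OnCur = Pad::Prefix; // at most one prefixed instruction per aligned branch
  return D;
}
```

The last condition models the HasPrefixedInst flag: once the previous instruction has been chosen to carry prefixes, no further instruction is prefixed for the same branch.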

/// Insert a MCBoundaryAlignFragment to mark the end of the branch to be aligned		/// Set the last fragment in the set of fragments to be aligned (which is
/// if necessary.		/// current fragment indeed) for BF and insert a new BF to prevent further
		/// instruction from being added to the current fragment if necessary.
void X86AsmBackend::alignBranchesEnd(MCObjectStreamer &OS, const MCInst &Inst) {		void X86AsmBackend::alignBranchesEnd(MCObjectStreamer &OS, const MCInst &Inst) {
if (!needAlign(OS))		if (!needAlign(OS))
return;		return;
// If the branch is emitted into a MCRelaxableFragment, we can determine the
// size of the branch easily in MCAssembler::relaxBoundaryAlign. When the		PrevInst = Inst;
// branch is fused, the fused branch(macro fusion pair) must be emitted into		const MCFragment *CF = OS.getCurrentFragment();
// two fragments. Or when the branch is unfused, the branch must be emitted
// into one fragment. The MCRelaxableFragment naturally marks the end of the		if (!MCII->get(Inst.getOpcode()).isPseudo()) {
// fused or unfused branch.		if (auto *F = dyn_cast_or_null<MCDataFragment>(CF)) {
// Otherwise, we need to insert a MCBoundaryAlignFragment to mark the end of		// Record the position and the size of data fragment if any instruction is
// the branch. This MCBoundaryAlignFragment may be reused to emit NOP to align		// emitted into it.
// other branch.		LastDFOfInst = F;
if (needAlignInst(Inst) && !isa<MCRelaxableFragment>(OS.getCurrentFragment()))		LastDFSizeOfInst = F->getContents().size();
OS.insert(new MCBoundaryAlignFragment(AlignBoundary));		// The number of prefixes is limited by AlignMaxPrefixSize for some
		// performance reasons, so we need to compute how many prefixes can be
		// added.
		if (IsPrefixedInst &&
		isa_and_nonnull<MCBoundaryAlignFragment>(F->getPrevNode())) {
		auto BF = const_cast<MCBoundaryAlignFragment *>(
		cast<MCBoundaryAlignFragment>(F->getPrevNode()));
		assert(BF->hasValue() && "Unexpected control flow!");
		uint8_t MaxBytesToEmit =
		getMaxPrefixSize(OS, Inst, static_cast<uint8_t>(LastDFSizeOfInst));
		BF->setMaxBytesToEmit(MaxBytesToEmit);
		}
		}
		}

		IsPrefixedInst = false;

		if (!needAlignInst(Inst))
		return;

		for (auto *F = CF; F && F != LastFragmentToBeAligned &&
		(F->hasInstructions() || isa<MCBoundaryAlignFragment>(F));
		F = F->getPrevNode()) {
		// The fragments to be aligned should be in the same section with this
		// fragment, and each non-BF fragment on the path from this fragment to the
		// fragments to be aligned must have a fixed size after finite times of
		// relaxation. Currently, we conservatively use hasInstruction to ensure
		// that.
		if (auto *BF = dyn_cast<MCBoundaryAlignFragment>(F)) {
		if (BF->hasEmitNopsOrValue())
		const_cast<MCBoundaryAlignFragment *>(BF)->setFragment(CF);
		// There is at most one MCBoundaryAlignFragment to align one instruction
		// if we only emit NOP to align instruction.
		if (AlignMaxPrefixSize == 0)
		break;
		}
		}

		HasPrefixedInst = false;
		LastFragmentToBeAligned = CF;

		// We need to ensure that no further instructions can be emitted into the current fragment.
		//
		// If current fragment is a MCRelaxableFragment, then no more
		// instructions can be pushed into since MCRelaxableFragment only holds one
		// instruction.
		//
		// Otherwise, we need to insert a new BF to truncate the current fragment.
		// This MCBoundaryAlignFragment may be reused to emit NOP or segment override
		// prefix to align other instruction.
		if (!isa<MCRelaxableFragment>(OS.getCurrentFragment()))
		OS.insert(new MCBoundaryAlignFragment());

// Update the maximum alignment on the current section if necessary.		// Update the maximum alignment on the current section if necessary.
MCSection *Sec = OS.getCurrentSectionOnly();		MCSection *Sec = OS.getCurrentSectionOnly();
if (AlignBoundary.value() > Sec->getAlignment())		if (AlignBoundary.value() > Sec->getAlignment())
Sec->setAlignment(AlignBoundary);		Sec->setAlignment(AlignBoundary);
}		}

Optional<MCFixupKind> X86AsmBackend::getFixupKind(StringRef Name) const {		Optional<MCFixupKind> X86AsmBackend::getFixupKind(StringRef Name) const {
[685 lines not shown]

llvm/test/MC/X86/align-branch-32-1a.s

	# Check NOP padding is disabled before instruction that has variant symbol operand.			## Check NOP/Prefix padding is disabled for instruction that has variant symbol operand.
	# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=call %s | llvm-objdump -d - | FileCheck %s			# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=call+jmp %s | llvm-objdump -d - | FileCheck %s
				# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=call+jmp --x86-align-branch-prefix-size=4 %s | llvm-objdump -d - | FileCheck %s

	# CHECK: 00000000 foo:			# CHECK: 00000000 foo:
	# CHECK-COUNT-5: : 64 a3 01 00 00 00 movl %eax, %fs:1			# CHECK-COUNT-5: : 64 a3 01 00 00 00 movl %eax, %fs:1
	# CHECK: 1e: e8 fc ff ff ff calll {{.*}}			# CHECK: 1e: e8 fc ff ff ff calll {{.*}}
	# CHECK-COUNT-4: : 64 a3 01 00 00 00 movl %eax, %fs:1			# CHECK-COUNT-4: : 64 a3 01 00 00 00 movl %eax, %fs:1
	# CHECK: 3b: 55 pushl %ebp			# CHECK: 3b: 55 pushl %ebp
	# CHECK-NEXT: 3c: ff 91 00 00 00 00 calll *(%ecx)			# CHECK-NEXT: 3c: ff 91 00 00 00 00 calll *(%ecx)
	# CHECK-COUNT-4: : 64 a3 01 00 00 00 movl %eax, %fs:1			# CHECK-COUNT-4: : 64 a3 01 00 00 00 movl %eax, %fs:1
	[28 lines not shown]

llvm/test/MC/X86/align-branch-32-2a.s

This file was added.

				## Check no prefix is inserted after hardcode.
				# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc+jmp --x86-align-branch-prefix-size=2 %s | llvm-objdump -d - | FileCheck %s

				# CHECK: 00000000 main:
				# CHECK-NEXT: 0: 2e 55 pushl %ebp
				# CHECK-NEXT: 2: 2e 89 e5 movl %esp, %ebp
				# CHECK-NEXT: 5: 3e 55 pushl %ebp
				# CHECK-COUNT-25: 55 pushl %ebp
				# CHECK-NEXT: 20: e9 fc ff ff ff jmp {{.*}}
				# CHECK: 00000025 infiniteLoop:
				# CHECK-NEXT: 25: eb d9 jmp {{.*}}

				.text
				.globl infiniteLoop
				main:
				.byte 0x2e
				pushl %ebp
				.byte 0x2e
				movl %esp, %ebp
				.rept 26
				pushl %ebp
				.endr
				jmp infiniteLoop
				infiniteLoop:
				jmp main
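
The test above exercises the "hardcode" check in alignBranchesBegin: padding is skipped when the current data fragment holds bytes that were not recorded as part of an instruction, such as the .byte 0x2e directives. A minimal sketch of that bookkeeping, assuming a hypothetical HardcodeTracker class and a plain byte vector in place of MCDataFragment:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a data fragment: just a growable byte buffer.
using DataFragment = std::vector<uint8_t>;

class HardcodeTracker {
  const DataFragment *LastFrag = nullptr; // models LastDFOfInst
  std::size_t LastSize = 0;               // models LastDFSizeOfInst

public:
  // Call right after an instruction has been encoded into F.
  void noteInstruction(const DataFragment &F) {
    assert(!F.empty() && "an encoded instruction has at least one byte");
    LastFrag = &F;
    LastSize = F.size();
  }

  // True if F is not the fragment of the last recorded instruction, or if
  // it has grown since then (raw bytes were appended by a directive).
  bool hasUntrackedBytes(const DataFragment &F) const {
    return &F != LastFrag || F.size() != LastSize;
  }
};
```

As the FIXME in the diff notes, comparing a fragment pointer and a size is a heuristic: it detects appended raw bytes but knows nothing about what they encode.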

llvm/test/MC/X86/align-branch-32-3a.s

This file was added.

				## Check appropriate prefix is chosen to prefix an instruction.
				# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc+jmp --x86-align-branch-prefix-size=3 %s | llvm-objdump -d - | FileCheck %s

				# CHECK: 00000000 foo:
				# CHECK-NEXT: 0: 65 65 65 a3 01 00 00 00 movl %eax, %gs:1
				# CHECK-NEXT: 8: 55 pushl %ebp
				# CHECK-NEXT: 9: 57 pushl %edi
				# CHECK-COUNT-2: : 55 pushl %ebp
				# CHECK: c: 89 e5 movl %esp, %ebp
				# CHECK-NEXT: e: 89 7d f8 movl %edi, -8(%ebp)
				# CHECK-COUNT-5: : 89 75 f4 movl %esi, -12(%ebp)
				# CHECK: 20: 39 c5 cmpl %eax, %ebp
				# CHECK-NEXT: 22: 74 5e je {{.*}}
				# CHECK-NEXT: 24: 3e 89 73 f4 movl %esi, %ds:-12(%ebx)
				# CHECK-NEXT: 28: 89 75 f4 movl %esi, -12(%ebp)
				# CHECK-NEXT: 2b: 89 7d f8 movl %edi, -8(%ebp)
				# CHECK-COUNT-5: : 89 75 f4 movl %esi, -12(%ebp)
				# CHECK-COUNT-3: : 5d popl %ebp
				# CHECK: 40: 74 40 je {{.*}}
				# CHECK-NEXT: 42: 5d popl %ebp
				# CHECK-NEXT: 43: 74 3d je {{.*}}
				# CHECK-NEXT: 45: 36 89 44 24 fc movl %eax, %ss:-4(%esp)
				# CHECK-NEXT: 4a: 89 75 f4 movl %esi, -12(%ebp)
				# CHECK-NEXT: 4d: 89 7d f8 movl %edi, -8(%ebp)
				# CHECK-COUNT-5: : 89 75 f4 movl %esi, -12(%ebp)
				# CHECK: 5f: 5d popl %ebp
				# CHECK-NEXT: 60: eb 26 jmp {{.*}}
				# CHECK-NEXT: 62: eb 24 jmp {{.*}}
				# CHECK-NEXT: 64: eb 22 jmp {{.*}}
				# CHECK-NEXT: 66: 89 45 fc movl %eax, -4(%ebp)
				# CHECK-NEXT: 69: 89 75 f4 movl %esi, -12(%ebp)
				# CHECK-NEXT: 6c: 89 7d f8 movl %edi, -8(%ebp)
				# CHECK-COUNT-3: : 89 75 f4 movl %esi, -12(%ebp)
				# CHECK-COUNT-2: : 5d popl %ebp
				# CHECK-NEXT: 7a: 39 c5 cmpl %eax, %ebp
				# CHECK-NEXT: 7c: 74 04 je {{.*}}
				# CHECK-COUNT-2: : 90 nop
				# CHECK-NEXT: 80: eb 06 jmp {{.*}}
				# CHECK-NEXT: 82: 8b 45 f4 movl -12(%ebp), %eax
				# CHECK-NEXT: 85: 89 45 fc movl %eax, -4(%ebp)
				# CHECK-COUNT-4: : 89 b5 50 fb ff ff movl %esi, -1200(%ebp)
				# CHECK: a0: 89 75 0c movl %esi, 12(%ebp)
				# CHECK-NEXT: a3: e9 fc ff ff ff jmp {{.*}}
				# CHECK-COUNT-3: : 64 8e 15 01 00 00 00 movw %fs:1, %ss
				# CHECK-COUNT-3: : 90 nop
				# CHECK: c0: 39 c5 cmpl %eax, %ebp
				# CHECK-NEXT: c2: 74 c4 je {{.*}}
				.text
				.globl foo
				.p2align 4
				foo:
				movl %eax, %gs:0x1
				pushl %ebp
				pushl %edi
				.rept 2
				pushl %ebp
				.endr
				movl %esp, %ebp
				movl %edi, -8(%ebp)
				.rept 5
				movl %esi, -12(%ebp)
				.endr
				cmp %eax, %ebp
				je .L_2
				movl %esi, -12(%ebx)
				movl %esi, -12(%ebp)
				movl %edi, -8(%ebp)
				.rept 5
				movl %esi, -12(%ebp)
				.endr
				.rept 3
				popl %ebp
				.endr
				je .L_2
				popl %ebp
				je .L_2
				movl %eax, -4(%esp)
				movl %esi, -12(%ebp)
				movl %edi, -8(%ebp)
				.rept 5
				movl %esi, -12(%ebp)
				.endr
				popl %ebp
				jmp .L_3
				jmp .L_3
				jmp .L_3
				movl %eax, -4(%ebp)
				movl %esi, -12(%ebp)
				movl %edi, -8(%ebp)
				.rept 3
				movl %esi, -12(%ebp)
				.endr
				.rept 2
				popl %ebp
				.endr
				cmp %eax, %ebp
				je .L_2
				jmp .L_3
				.L_2:
				movl -12(%ebp), %eax
				movl %eax, -4(%ebp)
				.L_3:
				.rept 4
				movl %esi, -1200(%ebp)
				.endr
				movl %esi, 12(%ebp)
				jmp bar
				.rept 3
				mov %fs:0x1, %ss
				.endr
				cmp %eax, %ebp
				je .L_3

llvm/test/MC/X86/align-branch-32-4a.s

This file was added.

				## Check prefix of instruction is limited by option --x86-align-branch-prefix-size=NUM.
				# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc+jmp --x86-align-branch-prefix-size=4 %s | llvm-objdump -d - | FileCheck %s -check-prefixes=CHECK,PREFIX4

				# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc+jmp --x86-align-branch-prefix-size=5 %s | llvm-objdump -d - | FileCheck %s -check-prefixes=CHECK,PREFIX5


				# CHECK: 00000000 foo:
				# PREFIX4: 0: 66 0f 3a 60 00 03 pcmpestrm $3, (%eax), %xmm0
				# PREFIX4-NEXT: 6: c4 e3 79 60 00 03 vpcmpestrm $3, (%eax), %xmm0
				# PREFIX4-NEXT: c: 65 a3 01 00 00 00 movl %eax, %gs:1
				# PREFIX4-COUNT-4: : 89 75 f4 movl %esi, -12(%ebp)
				# PREFIX4-COUNT-2: : 90 nop

				# PREFIX5: 0: 3e 3e 66 0f 3a 60 00 03 pcmpestrm $3, %ds:(%eax), %xmm0
				# PREFIX5-NEXT: 8: c4 e3 79 60 00 03 vpcmpestrm $3, (%eax), %xmm0
				# PREFIX5-NEXT: e: 65 a3 01 00 00 00 movl %eax, %gs:1
				# PREFIX5-COUNT-4: : 89 75 f4 movl %esi, -12(%ebp)

				# CHECK: 20: a8 04 testb $4, %al
				# CHECK-NEXT: 22: 70 dc jo {{.*}}

				.text
				.globl foo
				.p2align 4
				foo:
				.L1:
				pcmpestrm $3, (%eax), %xmm0
				vpcmpestrm $3, (%eax), %xmm0
				movl %eax, %gs:0x1
				.rept 4
				movl %esi, -12(%ebp)
				.endr
				testb $0x4,%al
				jo .L1

llvm/test/MC/X86/align-branch-64-1e.s

This file was added.

				## Check only fused conditional jumps, conditional jumps and unconditional jumps are aligned with option --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc+jmp --x86-align-branch-prefix-size=4
				# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc+jmp --x86-align-branch-prefix-size=4 %p/Inputs/align-branch-64-1.s | llvm-objdump -d - > %t1
				# RUN: FileCheck --input-file=%t1 %s

				# CHECK: 0000000000000000 foo:
				# CHECK-NEXT: 0: 64 64 64 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK-COUNT-2: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK: 1b: 48 39 c5 cmpq %rax, %rbp
				# CHECK-NEXT: 1e: 31 c0 xorl %eax, %eax
				# CHECK-NEXT: 20: 48 39 c5 cmpq %rax, %rbp
				# CHECK-NEXT: 23: 74 5d je {{.*}}
				# CHECK-NEXT: 25: 64 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK-COUNT-2: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK: 3e: 31 c0 xorl %eax, %eax
				# CHECK-NEXT: 40: 74 40 je {{.*}}
				# CHECK-NEXT: 42: 5d popq %rbp
				# CHECK-NEXT: 43: 74 3d je {{.*}}
				# CHECK-NEXT: 45: 64 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK-COUNT-2: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK: 5e: 31 c0 xorl %eax, %eax
				# CHECK-NEXT: 60: eb 26 jmp {{.*}}
				# CHECK-NEXT: 62: eb 24 jmp {{.*}}
				# CHECK-NEXT: 64: eb 22 jmp {{.*}}
				# CHECK-COUNT-2: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK: 76: 89 45 fc movl %eax, -4(%rbp)
				# CHECK-NEXT: 79: 5d popq %rbp
				# CHECK-NEXT: 7a: 48 39 c5 cmpq %rax, %rbp
				# CHECK-NEXT: 7d: 74 03 je {{.*}}
				# CHECK-NEXT: 7f: 90 nop
				# CHECK-NEXT: 80: eb 06 jmp {{.*}}
				# CHECK-NEXT: 82: 8b 45 f4 movl -12(%rbp), %eax
				# CHECK-NEXT: 85: 89 45 fc movl %eax, -4(%rbp)
				# CHECK-COUNT-10: : 89 b5 50 fb ff ff movl %esi, -1200(%rbp)
				# CHECK: c4: eb c2 jmp {{.*}}
				# CHECK-NEXT: c6: c3 retq

llvm/test/MC/X86/align-branch-64-2d.s

This file was added.

				## Check only indirect jumps and calls are aligned with option --x86-align-branch-boundary=32 --x86-align-branch=indirect+call --x86-align-branch-prefix-size=4
				# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=indirect+call --x86-align-branch-prefix-size=4 %p/Inputs/align-branch-64-2.s | llvm-objdump -d - | FileCheck %s

				# CHECK: 0000000000000000 foo:
				# CHECK-NEXT: 0: 64 64 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK-COUNT-2: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK-COUNT-2: : 89 75 f4 movl %esi, -12(%rbp)
				# CHECK: 20: ff e0 jmpq *%rax
				# CHECK-NEXT: 22: 64 64 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK-COUNT-2: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK: 3c: 89 75 f4 movl %esi, -12(%rbp)
				# CHECK-NEXT: 3f: 55 pushq %rbp
				# CHECK-NEXT: 40: ff d0 callq *%rax
				# CHECK-COUNT-3: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK-NEXT: 5a: 55 pushq %rbp
				# CHECK-COUNT-5: : 90 nop
				# CHECK-NEXT: 60: e8 00 00 00 00 callq {{.*}}
				# CHECK-COUNT-4: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
				# CHECK: 85: ff 14 25 00 00 00 00 callq *0

llvm/test/MC/X86/align-branch-64-7a.s

This file was added.

				## Check no prefix is added to the instruction if there is an align directive between the instruction and the target branch
				# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=jmp --x86-align-branch-prefix-size=5 %s | llvm-objdump -d - | FileCheck %s

				# CHECK: 0000000000000000 test1:
				# CHECK-NEXT: 0: 31 d2 xorl %edx, %edx
				# CHECK-NEXT: 2: 89 8c 24 84 00 00 00 movl %ecx, 132(%rsp)
				# CHECK-NEXT: 9: 4c 89 c1 movq %r8, %rcx
				# CHECK-NEXT: c: 4c 8b 8c 24 88 00 00 00 movq 136(%rsp), %r9
				# CHECK-COUNT-4: : 90 nop
				# CHECK: 18: 66 66 90 nop
				# CHECK-NEXT: 1b: 4c 89 c1 movq %r8, %rcx
				# CHECK-COUNT-2: : 90 nop
				# CHECK-NEXT: 20: eb de jmp {{.*}}
				# CHECK-NEXT: 22: c3 retq

				.text
				.globl test1
				test1:
				.Ltmp0:
				xorl %edx, %edx
				movl %ecx, 132(%rsp)
				movq %r8, %rcx
				movq 136(%rsp), %r9
				.p2align 3, 0x90
				.byte 102
				.byte 102
				nop
				movq %r8, %rcx
				jmp .Ltmp0
				retq

llvm/test/MC/X86/align-branch-64-8a.s

This file was added.

				## Check the case where multiple CMPs followed by a jcc is correctly handled.
				# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc --x86-align-branch-prefix-size=5 %s | llvm-objdump -d - | FileCheck %s

				# CHECK: 0000000000000000 test1:
				# CHECK-NEXT: 0: 2e 2e 48 39 c5 cmpq %rax, %rbp
				# CHECK-COUNT-9: : 48 39 c5 cmpq %rax, %rbp
				# CHECK-NEXT: 20: 48 39 c5 cmpq %rax, %rbp
				# CHECK-NEXT: 23: 74 db je {{.*}}

				.text
				.globl test1
				test1:
				.Ltmp0:
				.rept 11
				cmp %rax, %rbp
				.endr
				je .Ltmp0
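
The two CS prefixes expected on the first cmpq follow from the boundary arithmetic. Without padding, ten 3-byte cmpq instructions put the last cmpq at offset 30, and the fused cmpq/je pair (3 + 2 bytes) would span the 32-byte boundary, so 2 bytes of padding are needed. The sketch below models the rule stated in the summary (a branch must neither cross nor end against the boundary); it is an assumption about how the check in MCAssembler::relaxBoundaryAlign behaves, since that code is not shown here, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True if the byte range [Start, Start + Size) spans two Boundary-sized windows.
inline bool crossesBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  assert(Size > 0 && Boundary > 0 && "empty range or zero boundary");
  return Start / Boundary != (Start + Size - 1) / Boundary;
}

// True if the range ends exactly on a boundary.
inline bool endsAtBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  return (Start + Size) % Boundary == 0;
}

// Bytes of padding (NOPs or prefixes) needed to move Start up to the next
// boundary, or 0 if the range neither crosses nor ends against one.
inline uint64_t paddingNeeded(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  if (!crossesBoundary(Start, Size, Boundary) &&
      !endsAtBoundary(Start, Size, Boundary))
    return 0;
  return (Boundary - Start % Boundary) % Boundary;
}
```

With these definitions, paddingNeeded(30, 5, 32) is 2, which matches the two 0x2e bytes in the expected output and places the fused pair at offset 0x20.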

llvm/test/MC/X86/align-branch-64-9a.s

This file was added.

				## Check prefix won't be prepended to instruction that has variant symbol operand.
				# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc --x86-align-branch-prefix-size=5 %s | llvm-objdump -d - | FileCheck %s

				# CHECK: 0000000000000000 _start:
				# CHECK-NEXT: 0: 66 66 48 48 rex64
				# CHECK-NEXT: 4: 8d 3d 00 00 00 00 leal (%rip), %edi
				# CHECK-NEXT: a: 66 48 8d 3d 00 00 00 00 leaq (%rip), %rdi
				# CHECK-NEXT: 12: e8 00 00 00 00 callq {{.*}}
				# CHECK-NEXT: 17: 48 8b 98 00 00 00 00 movq (%rax), %rbx
				# CHECK-NEXT: 1e: 90 nop
				# CHECK-NEXT: 1f: 90 nop
				# CHECK-NEXT: 20: 48 85 db testq %rbx, %rbx
				# CHECK-NEXT: 23: 74 00 je {{.*}}
				# CHECK-NEXT: 25: c3 retq

				.text
				.globl _start
				_start:
				data16
				data16
				rex64
				leaq bar@tlsld(%rip), %rdi
				data16
				leaq bar@tlsld(%rip), %rdi
				call __tls_get_addr@PLT
				movq bar@DTPOFF(%rax), %rbx
				testq %rbx, %rbx
				je .L1
				.L1:
				ret
				.section ".tdata", "awT", @progbits
				bar:
				.long 10

Revision Contents

Diff 246957

llvm/include/llvm/MC/MCFragment.h

llvm/include/llvm/MC/MCObjectStreamer.h

llvm/lib/MC/MCAssembler.cpp

llvm/lib/MC/MCFragment.cpp

llvm/lib/MC/MCObjectStreamer.cpp

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp

llvm/test/MC/X86/align-branch-32-1a.s

llvm/test/MC/X86/align-branch-32-2a.s

llvm/test/MC/X86/align-branch-32-3a.s

llvm/test/MC/X86/align-branch-32-4a.s

llvm/test/MC/X86/align-branch-64-1e.s

llvm/test/MC/X86/align-branch-64-2d.s

llvm/test/MC/X86/align-branch-64-7a.s

llvm/test/MC/X86/align-branch-64-8a.s

llvm/test/MC/X86/align-branch-64-9a.s