
Details

Reviewers

skan
MaskRay
andreadb
RKSimon

Summary

Introduce an option x86-align-for-macrofusion to prevent a pair of
macro-fusion eligible instructions from being split by a given alignment
boundary by automatically padding the first instruction in a pair with
a minimal size nop.

In effect, it ensures that a pair of macro-fusible instructions is not split by
a cache line boundary, which is a precondition for macro-op fusion in
modern Intel Cores (see Intel Architecture Optimization Reference Manual,
2.3.2.1 Legacy Decode Pipeline: Macro-Fusion).
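
As a rough illustration of the rule described above, the decision reduces to whether the first instruction of the pair would end exactly on the boundary. The following is a minimal standalone sketch, not the code from this diff; the function name `paddingForFirstInst` and its parameters are hypothetical, and a 1-byte minimum nop is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): returns the number of padding bytes
// to place before the first instruction of a macro-fusible pair so that the
// instruction does not end exactly on a Boundary-byte boundary.
uint64_t paddingForFirstInst(uint64_t Offset, uint64_t Size, uint64_t Boundary,
                             uint64_t MinNopSize = 1) {
  // The boundary must be a nonzero power of two (e.g. 64 for a cache line).
  assert(Boundary != 0 && (Boundary & (Boundary - 1)) == 0);
  uint64_t End = Offset + Size;
  // If the first instruction ends on the boundary, the following jump would
  // start on the next line and the pair would be split; shift the pair by one
  // minimal nop. Otherwise no padding is needed.
  return (End & (Boundary - 1)) == 0 ? MinNopSize : 0;
}
```

For example, a 2-byte `test` at offset 0x3e ends at 0x40, so `paddingForFirstInst(0x3e, 2, 64)` yields 1, while the same instruction at offset 0 yields 0.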

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	50 ms	x64 debian > LLVM.CodeGen/X86::align-branch-boundary-suppressions-tls.ll
	50 ms	x64 debian > LLVM.CodeGen/X86::align-branch-boundary-suppressions.ll
	30 ms	x64 debian > LLVM.MC/X86::align-branch-32bit.s
	20 ms	x64 debian > LLVM.MC/X86::align-branch-align.s
	30 ms	x64 debian > LLVM.MC/X86::align-branch-basic.s
		View Full Test Results (26 Failed)

Event Timeline

Amir created this revision. May 3 2021, 11:54 PM

Herald added subscribers: pengfei, hiraditya. May 3 2021, 11:54 PM

Amir requested review of this revision. May 3 2021, 11:54 PM

Herald added a project: Restricted Project. May 3 2021, 11:54 PM

Herald added a subscriber: llvm-commits.

tschuett added a subscriber: tschuett. May 3 2021, 11:59 PM

Harbormaster completed remote builds in B102470: Diff 342656. May 4 2021, 12:43 AM

Addressed clang-format warnings

RKSimon added reviewers: andreadb, RKSimon. May 4 2021, 1:24 AM

Harbormaster completed remote builds in B102478: Diff 342664. May 4 2021, 1:53 AM

skan added inline comments. May 4 2021, 7:43 PM

llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h
364–366	(On Diff #342664)	The extension is definitely wrong. BranchFused already indicates fused macro-fusion pairs, so why do you add something like "AlignMacroFusionCmp, AlignMacroFusionBranch"?

In D97982, the NeverAlign fragment type is introduced to prevent macro-fusion pairs from ending at a specified boundary. I am confused about why you are trying to reuse the BoundaryAlign fragment here to do the same thing. Maybe I do not see the big picture, so could you clarify your design?

Amir mentioned this in D97982: [MC] Introduce NeverAlign fragment type. May 4 2021, 11:48 PM

@skan:
The big picture is that NeverAlign fragment insertion is controlled by a client (BOLT), while BoundaryAlign insertion is done automatically. This automatic macro-fusion alignment might be useful as a standalone performance optimization.
This diff leverages the existing infrastructure that detects eligible macro-fusion pairs and inserts a BoundaryAlign fragment. I've added logic on top of it to perform macro-fusion alignment when that is requested by an option.

Addressed the comment by @skan:

removed newly added BranchKinds, reused existing ones

Amir marked an inline comment as done. May 5 2021, 4:44 PM

Amir added inline comments.

llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h
364–366	(On Diff #342664)	I wanted to avoid overloading the existing types with different actions based on them, i.e. the JCC erratum mitigation would have used the existing BranchBoundaryKinds, while automatic macro-fusion alignment would use new ones. But I guess it's OK to reuse them and make the action depend on a command-line option.

Amir marked an inline comment as done. May 5 2021, 4:45 PM

Harbormaster completed remote builds in B102883: Diff 343238. May 5 2021, 5:34 PM

skan added inline comments. May 6 2021, 6:39 PM

llvm/include/llvm/MC/MCFragment.h
563–568	The comment here reads oddly now that you have added the new usage to the fragment; you need to refine it.
llvm/lib/MC/MCAssembler.cpp
1086–1088	`uint64_t NewSize = 0` is enough; we used the U suffix before because of the conditional operator.
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
81–82	The words "cmp+jcc" and "falling between" do not seem accurate.
167–174	In fact, X86AlignForMacroFusion is compatible with X86AlignBranchWithin32BBoundaries: you can set AlignBoundary to 64 and add AlignBranchFused to AlignBranchType if both of the options are present.

Addressed comments by @skan

Amir marked 2 inline comments as done. May 7 2021, 10:52 PM

Amir added inline comments.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
167–174	No, they're semantically incompatible: X86AlignBranchWithin32BBoundaries would prevent macrofusion pair from crossing the 32B boundary (stronger alignment restriction), while X86AlignForMacroFusion would prevent macrofusion pair being perfectly split by 64B boundary. So if X86AlignForMacroFusion is on, it's known not to satisfy X86AlignBranchWithin32BBoundaries restrictions. (Conversely, if X86AlignBranchWithin32BBoundaries is on, X86AlignForMacroFusion is satisfied automatically). They're further incompatible by the fact that X86AlignForMacroFusion flag enables the new BoundaryAlign behavior `isAvoidEndAlign` which performs the same alignment as NeverAlign.

Amir marked 2 inline comments as done. May 7 2021, 10:54 PM

Amir added inline comments.

llvm/include/llvm/MC/MCFragment.h
563–568	The comments are up-to-date. Please take a closer look at AvoidEndAlign flag usage below

Addressed linter warning

skan added inline comments. May 7 2021, 11:30 PM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
167–174	"So if X86AlignForMacroFusion is on, it's known not to satisfy X86AlignBranchWithin32BBoundaries restrictions. (Conversely, if X86AlignBranchWithin32BBoundaries is on, X86AlignForMacroFusion is satisfied automatically)." The requirements of X86AlignForMacroFusion and X86AlignBranchWithin32BBoundaries can be met at the same time, so they are compatible. `isAvoidEndAlign` is an implementation detail and is not good evidence of incompatibility. Meanwhile, AlignBranchType and AlignBoundary can also be set by the options X86AlignBranchBoundary and X86AlignBranch, so the assert here is not correct.

Harbormaster completed remote builds in B103300: Diff 343814. May 7 2021, 11:47 PM

Amir added inline comments. May 8 2021, 12:27 AM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
167–174	I see what you mean. We can handle X86AlignForMacroFusion first and then override it if X86AlignBranchWithin32BBoundaries is set as it's a stronger alignment requirement. Let me fix that.

Handle X86AlignForMacroFusion option first.
Override it if X86AlignBranchWithin32BBoundaries is set.
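
The "stronger alignment requirement" relationship discussed in this thread can be checked by brute force. The sketch below is standalone and uses hypothetical helper names (`crossesBoundary`, `endsAtBoundary`, `satisfies32BRule`, `splitBy64B`); it is not the code under review. It assumes the 32B rule means the whole pair neither crosses nor ends against a 32-byte boundary, and that "split" means the first instruction ends exactly on a 64-byte boundary.

```cpp
#include <cassert>
#include <cstdint>

// Standalone predicates written for this illustration; they are not the
// helpers used in the diff.
bool crossesBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  assert(Size > 0 && Boundary > 0);
  return Start / Boundary != (Start + Size - 1) / Boundary;
}

bool endsAtBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  assert(Boundary > 0);
  return (Start + Size) % Boundary == 0;
}

// Assumed 32B rule: the whole pair neither crosses nor ends against a
// 32-byte boundary.
bool satisfies32BRule(uint64_t Start, uint64_t FirstSize, uint64_t SecondSize) {
  uint64_t PairSize = FirstSize + SecondSize;
  return !crossesBoundary(Start, PairSize, 32) &&
         !endsAtBoundary(Start, PairSize, 32);
}

// The pair is split if its first instruction ends exactly on a 64-byte
// boundary, so that the second instruction starts the next cache line.
bool splitBy64B(uint64_t Start, uint64_t FirstSize) {
  return endsAtBoundary(Start, FirstSize, 64);
}

// Exhaustively checks small layouts: returns true if no pair satisfies the
// 32B rule while still being split by a 64B boundary.
bool strongerRuleImpliesWeaker() {
  for (uint64_t Start = 0; Start < 256; ++Start)
    for (uint64_t First = 1; First <= 15; ++First)
      for (uint64_t Second = 1; Second <= 15; ++Second)
        if (satisfies32BRule(Start, First, Second) && splitBy64B(Start, First))
          return false;
  return true;
}
```

The reverse direction does not hold: a pair starting at offset 30 with sizes 1 and 3 is not split at a 64B boundary, yet it crosses the 32B boundary and so fails the stricter rule.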

Amir marked an inline comment as done. May 8 2021, 3:02 PM

Addressed linter warning

Harbormaster completed remote builds in B103344: Diff 343870. May 8 2021, 4:00 PM

skan added inline comments. May 10 2021, 8:51 PM

llvm/lib/MC/MCAssembler.cpp
1098–1100	Replace tab with space here.
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
166–171	1. You need a test case for this.
	2. I don't think it's correct to set a value to an option in the constructor; maybe you need to define a class member.
	3. Nesting the if clauses here is incorrect. As I commented before, you should care about the fields set by options rather than the options themselves, because values like `AlignBranchType` and `AlignBoundary` are not set only by X86AlignBranchWithin32BBoundaries. Maybe a simple solution is to add the following code to the end of the constructor:
	    // Clean the alignment request
	    if (AlignBoundary == Align())
	      AlignBranchType.clear();
	    else if (AlignBranchType)
	      AlignBoundary = Align();
	    if (X86AlignForMacroFusion) {
	      if (AlignBoundary == Align())
	        AlignForMacroFusionOnly = true;
	      if (AlignBoundary > Align(64) || AlignBoundary == Align()) {
	        AlignBoundary = assumeAligned(64);
	      }
	      AlignBranchType.addKind(X86::AlignBranchFused);
	    }
663–667	The name `IsBranchFused` is quite misleading. `Inst` here is not a branch and may not be fused when `IsBranchFused` is true. You need a better name here, maybe `IsFirstMacroFusibleInstAndMayNeedAlign`?
670–671	X86AlignForMacroFusion && IsBranchFused -> AlignForMacroFusionOnly // Add some comments ...

Addressed comments by @skan.

llvm/lib/MC/MCAssembler.cpp
1098–1100	Verified that these are spaces (I think Phabricator just shows with >> that the indentation changed)

skan added inline comments. May 13 2021, 12:53 AM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
188	Why do you clear the branch type here?
189–190	We need a test to make sure an unfused jcc is not aligned.

Harbormaster completed remote builds in B104222: Diff 345063. May 13 2021, 1:36 AM

Intel folks have collected perf data for this optimization on SPECint17 workload:

SPECcpu2017

Compiler/optimizer:  bin/clang

Overall score:       228
Breakdown
500.perlbench_r      235
502.gcc_r            238
505.mcf_r            165
520.omnetpp_r        132
523.xalancbmk_r      190
525.x264_r           465
531.deepsjeng_r      228
541.leela_r          232
548.exchange2_r      423
557.xz_r             162

Compiler/optimizer:  bin/clang -mllvm -x86-align-for-macrofusion

Overall score:       228
Breakdown
500.perlbench_r      235
502.gcc_r            238
505.mcf_r            164
520.omnetpp_r        131
523.xalancbmk_r      190
525.x264_r           464
531.deepsjeng_r      228
541.leela_r          234
548.exchange2_r      422
557.xz_r             161

I'll abandon this diff based on that data. This functionality has non-trivial compile-time overhead (as it relies on relaxation for extra fragments), but it doesn't bring any perf benefit to workloads represented by SPECint17.
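
To make the size of the differences in the two runs easier to see, the per-benchmark relative change can be computed directly from the scores listed above. This is only an illustrative sketch: the `Score` struct and the function names are hypothetical, and the numbers are transcribed from the two breakdowns in this comment.

```cpp
#include <cassert>
#include <cmath>

struct Score {
  const char *Name;
  double Base;       // bin/clang
  double WithOption; // bin/clang -mllvm -x86-align-for-macrofusion
};

// Scores transcribed from the two SPECcpu2017 breakdowns above.
const Score Scores[] = {
    {"500.perlbench_r", 235, 235}, {"502.gcc_r", 238, 238},
    {"505.mcf_r", 165, 164},       {"520.omnetpp_r", 132, 131},
    {"523.xalancbmk_r", 190, 190}, {"525.x264_r", 465, 464},
    {"531.deepsjeng_r", 228, 228}, {"541.leela_r", 232, 234},
    {"548.exchange2_r", 423, 422}, {"557.xz_r", 162, 161},
};

// Relative change of the run with the option, in percent of the baseline.
double relativeChangePercent(const Score &S) {
  assert(S.Base > 0);
  return (S.WithOption - S.Base) / S.Base * 100.0;
}

// Largest absolute relative change across all listed benchmarks.
double maxAbsChangePercent() {
  double Max = 0.0;
  for (const Score &S : Scores)
    Max = std::fmax(Max, std::fabs(relativeChangePercent(S)));
  return Max;
}
```

With these numbers every benchmark moves by less than 1%: the largest change is 541.leela_r (232 to 234, about +0.86%), five benchmarks drop by one point, and the rest are unchanged.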

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
188	I think it doesn't make sense to have JCC erratum mitigation for the other branch types,

spupyrev mentioned this in rG6d0528636ae5: Rebase: [Facebook] [MC] Introduce NeverAlign fragment type. Jul 11 2022, 9:33 AM

Diff 345063

llvm/include/llvm/MC/MCFragment.h

[554 lines hidden; context: public:]

StringRef getFixedSizePortion() const { return FixedSizePortion; }		StringRef getFixedSizePortion() const { return FixedSizePortion; }

static bool classof(const MCFragment *F) {		static bool classof(const MCFragment *F) {
return F->getKind() == MCFragment::FT_CVDefRange;		return F->getKind() == MCFragment::FT_CVDefRange;
}		}
};		};

/// Represents required padding such that a particular other set of fragments		/// Represents required padding such that a particular other set of fragments
/// does not cross a particular power-of-two boundary. The other fragments must		/// does not cross a particular power-of-two boundary. The other fragments must
/// follow this one within the same section.		/// follow this one within the same section.
		/// If AvoidEndAlign is set, this fragment will emit a minimum size nop to
		/// prevent the fragment following it from ending at a given \p AlignBoundary.
class MCBoundaryAlignFragment : public MCFragment {		class MCBoundaryAlignFragment : public MCFragment {
		skan (inline comment, Done): The comment here reads oddly now that you have added the new usage to the fragment; you need to refine it.
		Amir (author, inline comment, Done): The comments are up-to-date. Please take a closer look at the AvoidEndAlign flag usage below.
/// The alignment requirement of the branch to be aligned.		/// The alignment requirement of the branch to be aligned.
Align AlignBoundary;		Align AlignBoundary;
/// The last fragment in the set of fragments to be aligned.		/// The last fragment in the set of fragments to be aligned.
const MCFragment *LastFragment = nullptr;		const MCFragment *LastFragment = nullptr;
/// The size of the fragment. The size is lazily set during relaxation, and		/// The size of the fragment. The size is lazily set during relaxation, and
/// is not meaningful before that.		/// is not meaningful before that.
uint64_t Size = 0;		uint64_t Size = 0;
		/// Whether this fragment pads the subsequent fragment to prevent it from
		/// ending at AlignBoundary.
		bool IsAvoidEndAlign = false;

public:		public:
MCBoundaryAlignFragment(Align AlignBoundary, MCSection *Sec = nullptr)		MCBoundaryAlignFragment(Align AlignBoundary, MCSection *Sec = nullptr)
: MCFragment(FT_BoundaryAlign, false, Sec), AlignBoundary(AlignBoundary) {		: MCFragment(FT_BoundaryAlign, false, Sec), AlignBoundary(AlignBoundary) {
}		}

uint64_t getSize() const { return Size; }		uint64_t getSize() const { return Size; }
void setSize(uint64_t Value) { Size = Value; }		void setSize(uint64_t Value) { Size = Value; }

Align getAlignment() const { return AlignBoundary; }		Align getAlignment() const { return AlignBoundary; }
void setAlignment(Align Value) { AlignBoundary = Value; }		void setAlignment(Align Value) { AlignBoundary = Value; }

const MCFragment *getLastFragment() const { return LastFragment; }		const MCFragment *getLastFragment() const { return LastFragment; }
void setLastFragment(const MCFragment *F) {		void setLastFragment(const MCFragment *F) {
assert(!F || getParent() == F->getParent());		assert(!F || getParent() == F->getParent());
LastFragment = F;		LastFragment = F;
}		}

		bool isAvoidEndAlign() const { return IsAvoidEndAlign; }
		void setAvoidEndAlign(bool V) { IsAvoidEndAlign = V; }

static bool classof(const MCFragment *F) {		static bool classof(const MCFragment *F) {
return F->getKind() == MCFragment::FT_BoundaryAlign;		return F->getKind() == MCFragment::FT_BoundaryAlign;
}		}
};		};

class MCPseudoProbeAddrFragment : public MCEncodedFragmentWithFixups<8, 1> {		class MCPseudoProbeAddrFragment : public MCEncodedFragmentWithFixups<8, 1> {
/// The expression for the difference of the two symbols that		/// The expression for the difference of the two symbols that
/// make up the address delta between two .pseudoprobe directives.		/// make up the address delta between two .pseudoprobe directives.
[16 lines hidden]

llvm/lib/MC/MCAssembler.cpp

[1,077 lines hidden; context: bool MCAssembler::relaxBoundaryAlign(MCAsmLayout &Layout,]
MCBoundaryAlignFragment &BF) {		MCBoundaryAlignFragment &BF) {
// BoundaryAlignFragment that doesn't need to align any fragment should not be		// BoundaryAlignFragment that doesn't need to align any fragment should not be
// relaxed.		// relaxed.
if (!BF.getLastFragment())		if (!BF.getLastFragment())
return false;		return false;

uint64_t AlignedOffset = Layout.getFragmentOffset(&BF);		uint64_t AlignedOffset = Layout.getFragmentOffset(&BF);
uint64_t AlignedSize = 0;		uint64_t AlignedSize = 0;
		uint64_t NewSize = 0;
		Align BoundaryAlignment = BF.getAlignment();

		skan (inline comment, Done): `uint64_t NewSize = 0` is enough; we used the U suffix before because of the conditional operator.
		if (BF.isAvoidEndAlign()) {
		// Get fragment size for the fragment following this BoundaryAlign.
		const MCFragment *NF = BF.getNextNode();
		AlignedSize = computeFragmentSize(Layout, *NF);

		// Pad with a minimum size nop.
		if (isAgainstBoundary(AlignedOffset, AlignedSize, BoundaryAlignment))
		NewSize = getBackend().getMinimumNopSize();
		} else {
for (const MCFragment *F = BF.getLastFragment(); F != &BF;		for (const MCFragment *F = BF.getLastFragment(); F != &BF;
F = F->getPrevNode())		F = F->getPrevNode())
AlignedSize += computeFragmentSize(Layout, *F);		AlignedSize += computeFragmentSize(Layout, *F);
		skan (inline comment, Done): Replace tab with space here.
		Amir (author, inline comment, Done): Verified that these are spaces (I think Phabricator just shows with >> that the indentation changed).

Align BoundaryAlignment = BF.getAlignment();		if (needPadding(AlignedOffset, AlignedSize, BoundaryAlignment))
uint64_t NewSize = needPadding(AlignedOffset, AlignedSize, BoundaryAlignment)		NewSize = offsetToAlignment(AlignedOffset, BoundaryAlignment);
? offsetToAlignment(AlignedOffset, BoundaryAlignment)		}
: 0U;
if (NewSize == BF.getSize())		if (NewSize == BF.getSize())
return false;		return false;
BF.setSize(NewSize);		BF.setSize(NewSize);
Layout.invalidateFragmentsFrom(&BF);		Layout.invalidateFragmentsFrom(&BF);
return true;		return true;
}		}

bool MCAssembler::relaxDwarfLineAddr(MCAsmLayout &Layout,		bool MCAssembler::relaxDwarfLineAddr(MCAsmLayout &Layout,
[197 lines hidden]

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp

[65 lines hidden; context: for (auto BranchType : BranchTypes) {]
<< " to -x86-align-branch=; each element must be one of: fused, "		<< " to -x86-align-branch=; each element must be one of: fused, "
"jcc, jmp, call, ret, indirect.(plus separated)\n";		"jcc, jmp, call, ret, indirect.(plus separated)\n";
}		}
}		}
}		}

operator uint8_t() const { return AlignBranchKind; }		operator uint8_t() const { return AlignBranchKind; }
void addKind(X86::AlignBranchBoundaryKind Value) { AlignBranchKind |= Value; }		void addKind(X86::AlignBranchBoundaryKind Value) { AlignBranchKind |= Value; }
		void clear() { AlignBranchKind = 0; }
};		};

X86AlignBranchKind X86AlignBranchKindLoc;		X86AlignBranchKind X86AlignBranchKindLoc;

		cl::opt<bool> X86AlignForMacroFusion(
		"x86-align-for-macrofusion", cl::init(false),
		cl::desc(
		"Align macro-fusion pairs to avoid 64B boundary falling between "
		skan (inline comment, Done): The words "cmp+jcc" and "falling between" do not seem accurate.
		"the instructions. May break assumptions about labels corresponding "
		"to particular instructions, and should be used with caution."));

cl::opt<unsigned> X86AlignBranchBoundary(		cl::opt<unsigned> X86AlignBranchBoundary(
"x86-align-branch-boundary", cl::init(0),		"x86-align-branch-boundary", cl::init(0),
cl::desc(		cl::desc(
"Control how the assembler should align branches with NOP. If the "		"Control how the assembler should align branches with NOP. If the "
"boundary's size is not 0, it should be a power of 2 and no less "		"boundary's size is not 0, it should be a power of 2 and no less "
"than 32. Branches will be aligned to prevent from being across or "		"than 32. Branches will be aligned to prevent from being across or "
"against the boundary of specified size. The default value 0 does not "		"against the boundary of specified size. The default value 0 does not "
"align branches."));		"align branches."));
[43 lines hidden; context: class X86AsmBackend : public MCAsmBackend {]
X86AlignBranchKind AlignBranchType;		X86AlignBranchKind AlignBranchType;
Align AlignBoundary;		Align AlignBoundary;
unsigned TargetPrefixMax = 0;		unsigned TargetPrefixMax = 0;

MCInst PrevInst;		MCInst PrevInst;
MCBoundaryAlignFragment *PendingBA = nullptr;		MCBoundaryAlignFragment *PendingBA = nullptr;
std::pair<MCFragment *, size_t> PrevInstPosition;		std::pair<MCFragment *, size_t> PrevInstPosition;
bool CanPadInst;		bool CanPadInst;
		bool AlignForMacroFusionOnly;

uint8_t determinePaddingPrefix(const MCInst &Inst) const;		uint8_t determinePaddingPrefix(const MCInst &Inst) const;
bool isMacroFused(const MCInst &Cmp, const MCInst &Jcc) const;		bool isMacroFused(const MCInst &Cmp, const MCInst &Jcc) const;
bool needAlign(const MCInst &Inst) const;		bool needAlign(const MCInst &Inst) const;
bool canPadBranches(MCObjectStreamer &OS) const;		bool canPadBranches(MCObjectStreamer &OS) const;
bool canPadInst(const MCInst &Inst, MCObjectStreamer &OS) const;		bool canPadInst(const MCInst &Inst, MCObjectStreamer &OS) const;

public:		public:
X86AsmBackend(const Target &T, const MCSubtargetInfo &STI)		X86AsmBackend(const Target &T, const MCSubtargetInfo &STI)
: MCAsmBackend(support::little), STI(STI),		: MCAsmBackend(support::little), STI(STI),
MCII(T.createMCInstrInfo()) {		MCII(T.createMCInstrInfo()) {
if (X86AlignBranchWithin32BBoundaries) {		if (X86AlignBranchWithin32BBoundaries) {
// At the moment, this defaults to aligning fused branches, unconditional		// At the moment, this defaults to aligning fused branches, unconditional
// jumps, and (unfused) conditional jumps with nops. Both the		// jumps, and (unfused) conditional jumps with nops. Both the
// instructions aligned and the alignment method (nop vs prefix) may		// instructions aligned and the alignment method (nop vs prefix) may
// change in the future.		// change in the future.
AlignBoundary = assumeAligned(32);;		AlignBoundary = assumeAligned(32);
AlignBranchType.addKind(X86::AlignBranchFused);		AlignBranchType.addKind(X86::AlignBranchFused);
AlignBranchType.addKind(X86::AlignBranchJcc);		AlignBranchType.addKind(X86::AlignBranchJcc);
AlignBranchType.addKind(X86::AlignBranchJmp);		AlignBranchType.addKind(X86::AlignBranchJmp);
}		}
// Allow overriding defaults set by master flag		// Allow overriding defaults set by master flag
if (X86AlignBranchBoundary.getNumOccurrences())		if (X86AlignBranchBoundary.getNumOccurrences())
AlignBoundary = assumeAligned(X86AlignBranchBoundary);		AlignBoundary = assumeAligned(X86AlignBranchBoundary);
if (X86AlignBranch.getNumOccurrences())		if (X86AlignBranch.getNumOccurrences())
AlignBranchType = X86AlignBranchKindLoc;		AlignBranchType = X86AlignBranchKindLoc;
		skan (inline comment, Not Done):
		1. You need a test case for this.
		2. I don't think it's correct to set a value to an option in the constructor; maybe you need to define a class member.
		3. Nesting the if clauses here is incorrect. As I commented before, you should care about the fields set by options rather than the options themselves, because values like `AlignBranchType` and `AlignBoundary` are not set only by X86AlignBranchWithin32BBoundaries. Maybe a simple solution is to add the following code to the end of the constructor:
		    // Clean the alignment request
		    if (AlignBoundary == Align())
		      AlignBranchType.clear();
		    else if (AlignBranchType)
		      AlignBoundary = Align();
		    if (X86AlignForMacroFusion) {
		      if (AlignBoundary == Align())
		        AlignForMacroFusionOnly = true;
		      if (AlignBoundary > Align(64) || AlignBoundary == Align()) {
		        AlignBoundary = assumeAligned(64);
		      }
		      AlignBranchType.addKind(X86::AlignBranchFused);
		    }
if (X86PadMaxPrefixSize.getNumOccurrences())		if (X86PadMaxPrefixSize.getNumOccurrences())
TargetPrefixMax = X86PadMaxPrefixSize;		TargetPrefixMax = X86PadMaxPrefixSize;
		// Clean the alignment request
		skan (inline comment, Done): In fact, X86AlignForMacroFusion is compatible with X86AlignBranchWithin32BBoundaries: you can set AlignBoundary to 64 and add AlignBranchFused to AlignBranchType if both of the options are present.
		Amir (author, inline comment, Done): No, they're semantically incompatible: X86AlignBranchWithin32BBoundaries would prevent macrofusion pair from crossing the 32B boundary (stronger alignment restriction), while X86AlignForMacroFusion would prevent macrofusion pair being perfectly split by 64B boundary. So if X86AlignForMacroFusion is on, it's known not to satisfy X86AlignBranchWithin32BBoundaries restrictions. (Conversely, if X86AlignBranchWithin32BBoundaries is on, X86AlignForMacroFusion is satisfied automatically). They're further incompatible by the fact that X86AlignForMacroFusion flag enables the new BoundaryAlign behavior `isAvoidEndAlign` which performs the same alignment as NeverAlign.
		skan (inline comment, Done): "So if X86AlignForMacroFusion is on, it's known not to satisfy X86AlignBranchWithin32BBoundaries restrictions. (Conversely, if X86AlignBranchWithin32BBoundaries is on, X86AlignForMacroFusion is satisfied automatically)." The requirements of X86AlignForMacroFusion and X86AlignBranchWithin32BBoundaries can be met at the same time, so they are compatible. `isAvoidEndAlign` is an implementation detail and is not good evidence of incompatibility. Meanwhile, AlignBranchType and AlignBoundary can also be set by the options X86AlignBranchBoundary and X86AlignBranch, so the assert here is not correct.
		Amir (author, inline comment, Done): I see what you mean. We can handle X86AlignForMacroFusion first and then override it if X86AlignBranchWithin32BBoundaries is set as it's a stronger alignment requirement. Let me fix that.
		if (AlignBoundary == Align())
		AlignBranchType.clear();
		else if (AlignBranchType)
		AlignBoundary = Align();
		// X86AlignForMacroFusion overrides AlignBoundary and AlignBranchType
		if (X86AlignForMacroFusion) {
		AlignForMacroFusionOnly = true;
		// Constrain or initialize AlignBoundary to the boundary for macrofusion
		// alignment (64B).
		if (AlignBoundary > Align(64) || AlignBoundary == Align()) {
		AlignBoundary = assumeAligned(64);
		}
		// Required AlignBranch kinds for macro-fusion alignment
		AlignBranchType.clear();
		skan (inline comment, Not Done): Why do you clear the branch type here?
		Amir (author, inline comment, Done): I think it doesn't make sense to have JCC erratum mitigation for the other branch types,
		AlignBranchType.addKind(X86::AlignBranchFused);
		AlignBranchType.addKind(X86::AlignBranchJcc);
		skan (inline comment, Not Done): We need a test to make sure an unfused jcc is not aligned.
		}
}		}

bool allowAutoPadding() const override;		bool allowAutoPadding() const override;
bool allowEnhancedRelaxation() const override;		bool allowEnhancedRelaxation() const override;
void emitInstructionBegin(MCObjectStreamer &OS, const MCInst &Inst) override;		void emitInstructionBegin(MCObjectStreamer &OS, const MCInst &Inst) override;
void emitInstructionEnd(MCObjectStreamer &OS, const MCInst &Inst) override;		void emitInstructionEnd(MCObjectStreamer &OS, const MCInst &Inst) override;

unsigned getNumFixupKinds() const override {		unsigned getNumFixupKinds() const override {
[437 lines hidden; context: void X86AsmBackend::emitInstructionBegin(MCObjectStreamer &OS,]

if (!CanPadInst)		if (!CanPadInst)
return;		return;

if (PendingBA && OS.getCurrentFragment()->getPrevNode() == PendingBA) {		if (PendingBA && OS.getCurrentFragment()->getPrevNode() == PendingBA) {
// Macro fusion actually happens and there is no other fragment inserted		// Macro fusion actually happens and there is no other fragment inserted
// after the previous instruction.		// after the previous instruction.
//		//
// Do nothing here since we already inserted a BoudaryAlign fragment when		// Do nothing here since we already inserted a BoundaryAlign fragment when
// we met the first instruction in the fused pair and we'll tie them		// we met the first instruction in the fused pair and we'll tie them
// together in emitInstructionEnd.		// together in emitInstructionEnd.
//		//
// Note: When there is at least one fragment, such as MCAlignFragment,		// Note: When there is at least one fragment, such as MCAlignFragment,
// inserted after the previous instruction, e.g.		// inserted after the previous instruction, e.g.
//		//
// \code		// \code
// cmp %rax %rcx		// cmp %rax %rcx
// .align 16		// .align 16
// je .Label0		// je .Label0
// \ endcode		// \ endcode
//		//
// We will treat the JCC as a unfused branch although it may be fused		// We will treat the JCC as a unfused branch although it may be fused
// with the CMP.		// with the CMP.
return;		return;
}		}

if (needAlign(Inst) || ((AlignBranchType & X86::AlignBranchFused) &&		bool IsFirstMacroFusibleInstAndMayNeedAlign =
isFirstMacroFusibleInst(Inst, *MCII))) {		(AlignBranchType & X86::AlignBranchFused) &&
		isFirstMacroFusibleInst(Inst, *MCII);
		if (needAlign(Inst) || IsFirstMacroFusibleInstAndMayNeedAlign) {
// If we meet a unfused branch or the first instuction in a fusiable pair,		// If we meet a unfused branch or the first instuction in a fusiable pair,
		skan (inline comment, Done): Name `IsBranchFused` is quite misleading. `Inst` here is not a branch and may not be fused when `IsBranchFused` is true. You need a better name here, maybe `IsFirstMacroFusibleInstAndMayNeedAlign`?
// insert a BoundaryAlign fragment.		// insert a BoundaryAlign fragment.
OS.insert(PendingBA = new MCBoundaryAlignFragment(AlignBoundary));		OS.insert(PendingBA = new MCBoundaryAlignFragment(AlignBoundary));
		// MacroFusion alignment overrides BoundaryAlign logic:
		// calculate only the size of the first instruction (not the pair) and
		skan (inline comment, Done): X86AlignForMacroFusion && IsBranchFused -> AlignForMacroFusionOnly // Add some comments ...
		// check if it is against the boundary (not if it crosses the boundary).
		if (AlignForMacroFusionOnly)
		PendingBA->setAvoidEndAlign(true);
}		}
}		}

/// Set the last fragment to be aligned for the BoundaryAlignFragment.		/// Set the last fragment to be aligned for the BoundaryAlignFragment.
void X86AsmBackend::emitInstructionEnd(MCObjectStreamer &OS, const MCInst &Inst) {		void X86AsmBackend::emitInstructionEnd(MCObjectStreamer &OS, const MCInst &Inst) {
PrevInst = Inst;		PrevInst = Inst;
MCFragment *CF = OS.getCurrentFragment();		MCFragment *CF = OS.getCurrentFragment();
PrevInstPosition = std::make_pair(CF, getSizeForInstFragment(CF));		PrevInstPosition = std::make_pair(CF, getSizeForInstFragment(CF));
if (auto *F = dyn_cast_or_null<MCRelaxableFragment>(CF))		if (auto *F = dyn_cast_or_null<MCRelaxableFragment>(CF))
F->setAllowAutoPadding(CanPadInst);		F->setAllowAutoPadding(CanPadInst);

if (!canPadBranches(OS))		if (!canPadBranches(OS))
return;		return;

if (!needAlign(Inst) || !PendingBA)		if (!needAlign(Inst) || !PendingBA)
return;		return;

// Tie the aligned instructions into a a pending BoundaryAlign.		// Tie the aligned instructions into a pending BoundaryAlign.
PendingBA->setLastFragment(CF);		PendingBA->setLastFragment(CF);
PendingBA = nullptr;		PendingBA = nullptr;

// We need to ensure that further data isn't added to the current		// We need to ensure that further data isn't added to the current
// DataFragment, so that we can get the size of instructions later in		// DataFragment, so that we can get the size of instructions later in
// MCAssembler::relaxBoundaryAlign. The easiest way is to insert a new empty		// MCAssembler::relaxBoundaryAlign. The easiest way is to insert a new empty
// DataFragment.		// DataFragment.
if (isa_and_nonnull<MCDataFragment>(CF))		if (isa_and_nonnull<MCDataFragment>(CF))
[963 lines hidden]

llvm/test/MC/X86/auto-mf-align.s

This file was added.

				# RUN: llvm-mc -triple=x86_64 -x86-align-for-macrofusion %s -filetype=obj | llvm-objdump --no-show-raw-insn -d - | FileCheck %s

				# no padding is expected since test doesn't end at alignment boundary:
				# CHECK-NOT: nop
				testl %eax, %eax
				# CHECK: testl %eax, %eax
				je .LBB0

				.nops 57
				int3
				# BoundaryAlign followed by MCDataFragment:
				# inserts nop because `test` would end at alignment boundary:
				# CHECK: 3e: nop
				testl %eax, %eax
				# CHECK-NEXT: 3f: testl %eax, %eax
				je .LBB0
				# CHECK-NEXT: 41: je
				.LBB0:
				retq

				.p2align 6
				.L0:
				.nops 57
				int3
				# BoundaryAlign followed by RelaxableFragment:
				# CHECK: ba: nop
				cmpl $(.L1-.L0), %eax
				# CHECK-NEXT: bb: cmpl
				je .L0
				# CHECK-NEXT: c1: je
				.nops 65
				.L1:
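
The offsets expected by the first two CHECK groups in this test can be reproduced with a small layout walk. The sketch below is a simplified model written for illustration, not the assembler's relaxation logic: the `Inst` struct and the `layout` function are hypothetical, and the instruction sizes (2 bytes each for `testl %eax, %eax` and a short `je`, 1 byte for `int3`, 57 bytes for `.nops 57`) are assumptions inferred from the CHECK offsets.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Inst {
  uint64_t Size;           // encoded size in bytes (assumed, see lead-in)
  bool FirstOfFusiblePair; // true for the test/cmp that precedes a jcc
};

// Returns the start offset of every instruction after padding. A one-byte
// nop is placed before the first instruction of a fusible pair whenever that
// instruction would otherwise end exactly on a Boundary-byte boundary.
std::vector<uint64_t> layout(const std::vector<Inst> &Insts,
                             uint64_t Boundary = 64) {
  assert(Boundary > 0);
  std::vector<uint64_t> Starts;
  uint64_t Offset = 0;
  for (const Inst &I : Insts) {
    if (I.FirstOfFusiblePair && (Offset + I.Size) % Boundary == 0)
      Offset += 1; // one-byte nop
    Starts.push_back(Offset);
    Offset += I.Size;
  }
  return Starts;
}
```

Feeding it `{{2, true}, {2, false}, {57, false}, {1, false}, {2, true}, {2, false}}` places the first `testl` at 0x0 with no padding; the second `testl` would start at 0x3e and end at 0x40, so it is moved to 0x3f behind a nop at 0x3e, and the following `je` lands at 0x41.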

This is an archive of the discontinued LLVM Phabricator instance.

[MC][X86] Automatic alignment for Macro-Op Fusion
AbandonedPublic
