This is an archive of the discontinued LLVM Phabricator instance.

Differential D81301

[X86] Emit two-byte NOP when possible
Closed, Public

Authored by aganea on Jun 5 2020, 1:30 PM.

Details

Reviewers

MaskRay
reames
rnk
sanjoy
craig.topper

Commits

rGacb30f6856c3: [X86] For 32-bit targets, emit two-byte NOP when possible

Summary

In order to support hot-patching, we need to make sure the first emitted instruction in a function is a two-byte+ op. This is already the case on x86_64, which seems to always emit two-byte+ ops. However, on 32-bit targets this wasn't the case.

Whenever using the "patchable-function" attribute, a PATCHABLE_OP now lowers to XCHG AX, AX (66 90), like MSVC does. However, when targeting pentium3 or i386 with /arch:IA32 or /arch:SSE, we generate MOV EDI, EDI (8B FF), like MSVC does. This is for compatibility with older tools that rely on this byte pattern.

The goal of this patch, along with D43002 and D80833, is to support the clang-cl flag /hotpatch: https://docs.microsoft.com/sv-se/cpp/build/reference/hotpatch-create-hotpatchable-image?view=vs-2019

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aganea created this revision. Jun 5 2020, 1:30 PM

Herald added a project: Restricted Project. Jun 5 2020, 1:30 PM

Herald added subscribers: llvm-commits, hiraditya.

aganea mentioned this in D19908: [X86] Support the "ms-hotpatch" attribute. Jun 5 2020, 1:37 PM

MaskRay added inline comments. Jun 5 2020, 2:19 PM

llvm/lib/Target/X86/X86MCInstLower.cpp
99–100	Since you are changing the signature, you can rename it to emitNops as well to conform to the coding standards.
1106	Can the x86-64 code path below be reused?
llvm/test/CodeGen/X86/patchable-function-entry.ll
66	32-NEXT

Are two byte nops legal on all 32 bit x86? The comment seems to imply no.

Once this question is answered and we're happy with the general design, I will ask you to split out the NFC signature change and land it separately, but let's *not* do that yet.

This revision now requires changes to proceed. Jun 5 2020, 2:26 PM

Harbormaster failed remote builds in B59317: Diff 268924! Jun 5 2020, 2:37 PM

efriedma added a subscriber: efriedma. Jun 5 2020, 2:44 PM

efriedma added inline comments.

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	I see three issues with using the 64-bit codepath on 32-bit:
1. We can't use the patterns based on REX prefixes.
2. The X86::NOOPL opcode requires FeatureNOPL.
3. If Mode16Bit is enabled, XCHG16ar is actually a one-byte instruction.
So we could sort of reuse the code below, but it would require a lot of refactoring.
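
The third issue comes down to the operand-size prefix. A minimal sketch, assuming a hypothetical `CodeMode` enum and `encodeXchgAxAx` helper (neither is an LLVM name), of how the bytes for `xchg %ax, %ax` depend on the mode the code runs in:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical enum for illustration only; not an LLVM type.
enum class CodeMode { Bits16, Bits32, Bits64 };

// Bytes of `xchg %ax, %ax` in the given mode. Opcode 0x90 exchanges the
// accumulator with itself at the default operand size; the 0x66 prefix
// selects the non-default operand size.
std::vector<uint8_t> encodeXchgAxAx(CodeMode Mode) {
  if (Mode == CodeMode::Bits16)
    return {0x90};     // 16-bit operands are already the default: one byte
  return {0x66, 0x90}; // 32/64-bit code: prefix selects 16-bit operands
}
```

In 16-bit code the result is a single byte, so it cannot serve as a two-byte patchable op there.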

In D81301#2077585, @reames wrote:

Are two byte nops legal on all 32 bit x86? The comment seems to imply no.

Once this question is answered and we're happy with the general design, I will ask you to split out the NFC signature change and land it separately, but let's *not* do that yet.

The encoding for xchg %ax, %ax should be legal all the way back to the introduction of 32-bit mode. It wasn't given special treatment in hardware until 486. Prior to that it really did read ax and swap it with itself.

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	Can Mode16Bit be set here? We're coming from MachineInstr and we don't compile to 16 bit do we? The only part in the rest of the code that is truly 64-bit specific is the IndexReg being RAX. There's nothing in there that requires a REX prefix. So it seems like we just need to replace the Is64Bit check we had before with FeatureNOPL and then pick the index register as RAX or EAX depending on 64 bit or 32 bit.

craig.topper added inline comments. Jun 5 2020, 3:01 PM

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	I guess we'd still need to handle the case where the CPU was set to a 386/486/pentium where FeatureNOPL isn't set.

efriedma added inline comments. Jun 5 2020, 3:27 PM

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	> Can Mode16Bit be set here? We're coming from MachineInstr and we don't compile to 16 bit do we?
We support generating "16-bit" code, yes, if you specify `-m16` to clang. (Semantically, it's actually 32-bit code with a bunch of size prefixes to allow running it in 16-bit mode, but that doesn't really matter from the assembler's perspective.)
> then pick the index register as RAX or EAX depending on 64 bit or 32 bit.
We can't just replace RAX with EAX; that would reduce the length of the opcode by one. We'd actually need to pick different encodings in the cases where that's relevant.

craig.topper added inline comments. Jun 5 2020, 3:49 PM

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	> We can't just replace RAX with EAX; that would reduce the length of the opcode by one. We'd actually need to pick different encodings in the cases where that's relevant.
It's the index register, it only requires REX prefix if it's R8-R15. The jump from 4 bytes to 5 bytes where index was added would have been the SIB byte.
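
The SIB-byte point can be checked with simple arithmetic. The sketch below is an illustration, not LLVM code: `longNopLength` and its parameter names are made up here. It adds up the parts of a long NOP (`0F 1F /0`) when no REX prefix is involved:

```cpp
#include <cassert>

// Length in bytes of a long NOP (0F 1F /0) assembled from its parts.
// Function and parameter names are illustrative, not LLVM's.
unsigned longNopLength(unsigned NumOpSizePrefixes, bool HasSIB,
                       unsigned DispBytes) {
  assert((DispBytes == 0 || DispBytes == 1 || DispBytes == 4) &&
         "displacement is absent, 8-bit, or 32-bit");
  const unsigned OpcodeBytes = 2; // 0F 1F
  const unsigned ModRMBytes = 1;  // always present for this opcode
  return NumOpSizePrefixes + OpcodeBytes + ModRMBytes + (HasSIB ? 1 : 0) +
         DispBytes;
}
```

With an 8-bit displacement, `longNopLength(0, false, 1)` gives 4 (`0F 1F 40 00`) and `longNopLength(0, true, 1)` gives 5 (`0F 1F 44 00 00`). The extra byte is the SIB byte that carries the index register, so swapping RAX for EAX as the index does not change the length.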

efriedma added inline comments. Jun 5 2020, 5:29 PM

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	> It's the index register, it only requires REX prefix if it's `R8-R15`. The jump from 4 bytes to 5 bytes where index was added would have been the SIB byte.
Right; I misread the code.

Whenever using the "patchable-function" attribute, a PATCHABLE_OP now lowers to a xchg ax, ax, like MSVC does.

Is this true? Last I checked--and admittedly, this was a long time ago--MSVC emitted mov edi, edi--more specifically, opcode bytes 8b ff. GCC also emits those same bytes.

There's also a compatibility angle here. The reason this was added to GCC in the first place, and the reason I wrote the original change, is that Wine needed this to make certain Windows programs run correctly. I think they specifically check for that 8b ff signature before installing their hotpatches--which I admit they really shouldn't be doing, and may be a factor in why MS started using xchg ax, ax (66 90), but they do it, and we need specifically that sequence to make them work correctly.

In D81301#2077585, @reames wrote:

Are two byte nops legal on all 32 bit x86? The comment seems to imply no.

Once this question is answered and we're happy with the general design, I will ask you to split out the NFC signature change and land it separately, but let's *not* do that yet.

My understanding is that the story goes like this:

90 NOP - introduced with 8086.
66 90 XCHG AX, AX - supported starting with 80386, with the introduction of instruction prefixes [1]
0F 1F 00 .. NOP DWORD ptr - long NOP variants were introduced with SSE, in Pentium III Katmai [2]

llvm/lib/Target/X86/X86.td specifies FeatureNOPL being supported in Pentium Pro [3] and Pentium 2 [4], but looking at the respective manuals that seems incorrect. There's no trace of long NOP instructions for those CPUs. @craig.topper could you please confirm?

[1] http://css.csail.mit.edu/6.858/2014/readings/i386/s02_04.htm
[2] https://en.wikipedia.org/wiki/X86_instruction_listings
[3] http://www.mathcs.emory.edu/~cheung/Courses/355/Syllabus/6-io/Docs/Intel-instructions.pdf
[4] http://www.archive.ece.cmu.edu/~ece548/localcpy/24281603.pdf

In D81301#2077964, @cdavis5x wrote:

Whenever using the "patchable-function" attribute, a PATCHABLE_OP now lowers to a xchg ax, ax, like MSVC does.

Is this true? Last I checked--and admittedly, this was a long time ago--MSVC emitted mov edi, edi--more specifically, opcode bytes 8b ff. GCC also emits those same bytes.

VS2012 and VS2013 are emitting 8B FF MOV EDI, EDI.
After VS2015, 66 90 XCHG AX, AX is generated, as recommended by Intel, see [5] Vol. 2B page 4-167.

[5] https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html#combined

There's also a compatibility angle here. The reason this was added to GCC in the first place, and the reason I wrote the original change, is that Wine needed this to make certain Windows programs run correctly. I think they specifically check for that 8b ff signature before installing their hotpatches--which I admit they really shouldn't be doing, and may be a factor in why MS started using xchg ax, ax (66 90), but they do it, and we need specifically that sequence to make them work correctly.

I'm a bit confused. Wine lists support up to Windows 10, which means recently-compiled applications should be supported. Meaning they (Wine) should see 66 90, and support that. I asked the question anyway on the wine-devel mailing list.

+ @tentzen @BillyONeal in case they know more about why MSVC moved from 8B FF MOV EDI, EDI to 66 90 XCHG AX, AX.

aganea edited the summary of this revision. Jun 6 2020, 9:26 AM

In D81301#2078189, @aganea wrote:

In D81301#2077585, @reames wrote:

Are two byte nops legal on all 32 bit x86? The comment seems to imply no.

Once this question is answered and we're happy with the general design, I will ask you to split out the NFC signature change and land it separately, but let's *not* do that yet.

My understanding is that the story goes like this:

90 NOP - introduced with 8086.

66 90 XCHG AX, AX - supported starting with 80386, with the introduction of instruction prefixes [1]

0F 1F 00 .. NOP DWORD ptr - long NOP variants were introduced with SSE, in Pentium III Katmai [2]

llvm/lib/Target/X86/X86.td specifies FeatureNOPL being supported in Pentium Pro [3] and Pentium 2 [4], but looking at the respective manuals that seems incorrect. There's no trace of long NOP instructions for those CPUs. @craig.topper could you please confirm?

Other sites on the internet show them as being added in P6 (Pentium Pro). For example, see the row for group 16 here: https://sandpile.org/x86/opc_grp.htm I think they were undocumented for a long time since they were originally intended for future expansion, so they didn't want compilers to start depending on them as NOPs. Sometime later they made 0F 1F an official multibyte NOP. 0F 18 eventually became a prefetch instruction, but you can still execute it on a Pentium Pro; it just won't do anything.

I'm a bit confused. Wine lists support up to Windows 10, which means recently-compiled applications should be supported. Meaning they (Wine) should see 66 90, and support that. I asked the question anyway on the wine-devel mailing list.

No, I meant "they" as in "Windows programs that run on Wine." Older programs compiled before VS 2015 in particular are going to check for 8b ff in functions from the system's DLLs. That's why Wine needed this from GCC and needs this from LLVM: so that functions in its DLLs have the signature so programs will know it's OK to hotpatch them.
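
To make the compatibility concern concrete, here is a hypothetical sketch of the kind of check such a program might perform before patching. It is not taken from Wine or from any real application; `PatchPrologue` and `classifyPrologue` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical classification of the first two bytes of a function.
enum class PatchPrologue { None, LegacyMovEdiEdi, XchgAxAx };

// Inspect the first two bytes of a function body and report which two-byte
// patchable op, if any, it starts with.
PatchPrologue classifyPrologue(const uint8_t *Code, size_t Size) {
  if (Size < 2)
    return PatchPrologue::None;
  if (Code[0] == 0x8B && Code[1] == 0xFF)
    return PatchPrologue::LegacyMovEdiEdi; // mov edi, edi
  if (Code[0] == 0x66 && Code[1] == 0x90)
    return PatchPrologue::XchgAxAx;        // xchg ax, ax
  return PatchPrologue::None;
}
```

A program that only accepts `LegacyMovEdiEdi` would refuse to patch a function that starts with 66 90, which is the situation described above.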

jacek added a subscriber: jacek. Jun 7 2020, 4:18 AM

A ReactOS developer pointed out that the default for x86 cl.exe is /arch:SSE2 which generates 66 90 when specifying /hotpatch. However with /arch:[IA32|SSE] /hotpatch the previous sequence 8B FF is emitted instead. I will replicate the behavior.

Addressed comments.

Fold the codepath for 32-bit and 64-bit.
Emit MOV EDI,EDI when using /arch:IA32 and /arch:SSE.
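
As a rough model of the resulting choice, the sketch below mirrors the condition used in this revision's X86MCInstLower.cpp change (32-bit, MSVC target, CPU either unset or "pentium3"). `TargetInfo`, `wantsLegacyNop` and `twoBytePatchableOp` are hypothetical names for this example, and only 32-bit and 64-bit code is modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the subtarget queries used by the patch.
struct TargetInfo {
  bool Is32Bit;
  bool IsWindowsMSVC;
  std::string CPU; // empty when no -mcpu was given
};

// True when the legacy MOV EDI, EDI form should be used.
bool wantsLegacyNop(const TargetInfo &T) {
  return T.Is32Bit && T.IsWindowsMSVC &&
         (T.CPU.empty() || T.CPU == "pentium3");
}

// Bytes of the two-byte patchable op for 32-bit or 64-bit code.
std::vector<uint8_t> twoBytePatchableOp(const TargetInfo &T) {
  if (wantsLegacyNop(T))
    return {0x8B, 0xFF}; // mov edi, edi
  return {0x66, 0x90};   // xchg ax, ax
}
```

Under this model, i386-windows-msvc with no CPU or with pentium3 yields 8B FF, while pentium4, a non-MSVC i386 target, or x86_64 yields 66 90, which is what the RUN lines in patchable-prologue.ll check.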

aganea edited the summary of this revision. Jun 9 2020, 2:43 PM

aganea marked 8 inline comments as done.

aganea added inline comments.

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	@efriedma Is there a way to generate a 16-bit triple from llc?

craig.topper added inline comments. Jun 9 2020, 3:04 PM

llvm/lib/Target/X86/X86MCInstLower.cpp
1106	I think i386-unknown-linux-code16 might be the 16-bit triple
llvm/test/CodeGen/X86/patchable-prologue.ll
3	Should we add -show-mc-encoding so we make sure we're really emitting the right bytes?

Harbormaster failed remote builds in B59704: Diff 269681! Jun 9 2020, 4:04 PM

As suggested, added -show-mc-encoding to ensure we emit the right opcodes. Added coverage for 16-bit targets as well.

aganea marked 2 inline comments as done. Jun 9 2020, 6:28 PM

Harbormaster failed remote builds in B59724: Diff 269716! Jun 9 2020, 7:52 PM

Ping! Any further comments?

craig.topper added inline comments. Jun 15 2020, 12:53 PM

llvm/lib/Target/X86/X86MCInstLower.cpp
1364	How much code would it be to just emit the legacy nop directly here and call emitNop for the other cases? Rather than pushing this legacy nop concept into emitNop?

Rebase. Remove LegacyNOP param.

llvm/lib/Target/X86/X86MCInstLower.cpp
1364	Changed as suggested, looks better now, thanks!

Harbormaster completed remote builds in B60387: Diff 270887. Jun 15 2020, 6:14 PM

aganea added a reviewer: craig.topper. Jun 16 2020, 9:00 AM

LGTM

This revision was not accepted when it landed; it landed in state Needs Review. Jun 17 2020, 10:46 AM

Closed by commit rGacb30f6856c3: [X86] For 32-bit targets, emit two-byte NOP when possible (authored by aganea).

This revision was automatically updated to reflect the committed changes.

aganea mentioned this in D43002: [CodeView] Emit S_OBJNAME record. Nov 10 2020, 5:54 PM

Revision Contents

Path	Size
llvm/lib/Target/X86/X86MCInstLower.cpp	35 lines
llvm/test/CodeGen/X86/patchable-function-entry.ll	8 lines
llvm/test/CodeGen/X86/patchable-prologue.ll	71 lines

Diff 271412

llvm/lib/Target/X86/X86MCInstLower.cpp

@@ ... @@ void changeAndComment(bool b) {
     if (b)
       OS.emitRawComment("autopadding");
     else
       OS.emitRawComment("noautopadding");
   }
 };

 // Emit a minimal sequence of nops spanning NumBytes bytes.
 static void emitX86Nops(MCStreamer &OS, unsigned NumBytes,
                         const X86Subtarget *Subtarget);

 void X86AsmPrinter::StackMapShadowTracker::count(MCInst &Inst,
                                                  const MCSubtargetInfo &STI,
                                                  MCCodeEmitter *CodeEmitter) {
   if (InShadow) {
     SmallString<256> Code;
     SmallVector<MCFixup, 4> Fixups;
     raw_svector_ostream VecOS(Code);
@@ ... @@ if (Is64Bits) {
     }
   }
 }

 /// Return the longest nop which can be efficiently decoded for the given
 /// target cpu. 15-bytes is the longest single NOP instruction, but some
 /// platforms can't decode the longest forms efficiently.
 static unsigned maxLongNopLength(const X86Subtarget *Subtarget) {
-  uint64_t MaxNopLength = 10;
   if (Subtarget->getFeatureBits()[X86::ProcIntelSLM])
-    MaxNopLength = 7;
-  else if (Subtarget->getFeatureBits()[X86::FeatureFast15ByteNOP])
-    MaxNopLength = 15;
-  else if (Subtarget->getFeatureBits()[X86::FeatureFast11ByteNOP])
-    MaxNopLength = 11;
-  return MaxNopLength;
+    return 7;
+  if (Subtarget->getFeatureBits()[X86::FeatureFast15ByteNOP])
+    return 15;
+  if (Subtarget->getFeatureBits()[X86::FeatureFast11ByteNOP])
+    return 11;
+  if (Subtarget->getFeatureBits()[X86::FeatureNOPL] || Subtarget->is64Bit())
+    return 10;
+  if (Subtarget->is32Bit())
+    return 2;
+  return 1;
 }

 /// Emit the largest nop instruction smaller than or equal to \p NumBytes
 /// bytes. Return the size of nop emitted.
 static unsigned emitNop(MCStreamer &OS, unsigned NumBytes,
                         const X86Subtarget *Subtarget) {
-  if (!Subtarget->is64Bit()) {
-    // TODO Do additional checking if the CPU supports multi-byte nops.
-    OS.emitInstruction(MCInstBuilder(X86::NOOP), *Subtarget);
-    return 1;
-  }
-
   // Cap a single nop emission at the profitable value for the target
   NumBytes = std::min(NumBytes, maxLongNopLength(Subtarget));

   unsigned NopSize;
   unsigned Opc, BaseReg, ScaleVal, IndexReg, Displacement, SegmentReg;
   IndexReg = Displacement = SegmentReg = 0;
   BaseReg = X86::RAX;
   ScaleVal = 1;
   switch (NumBytes) {
   case 0:
     llvm_unreachable("Zero nops?");
     break;
@@ ... @@ if (auto MaybeOperand = MCIL.LowerMachineOperand(&MI, MO))
       MCI.addOperand(MaybeOperand.getValue());

   SmallString<256> Code;
   SmallVector<MCFixup, 4> Fixups;
   raw_svector_ostream VecOS(Code);
   CodeEmitter->encodeInstruction(MCI, VecOS, Fixups, getSubtargetInfo());

   if (Code.size() < MinSize) {
-    if (MinSize == 2 && Opcode == X86::PUSH64r) {
+    if (MinSize == 2 && Subtarget->is32Bit() &&
+        Subtarget->isTargetWindowsMSVC() &&
+        (Subtarget->getCPU().empty() || Subtarget->getCPU() == "pentium3")) {
+      // For compatibilty reasons, when targetting MSVC, is is important to
+      // generate a 'legacy' NOP in the form of a 8B FF MOV EDI, EDI. Some tools
+      // rely specifically on this pattern to be able to patch a function.
+      // This is only for 32-bit targets, when using /arch:IA32 or /arch:SSE.
+      OutStreamer->emitInstruction(
+          MCInstBuilder(X86::MOV32rr_REV).addReg(X86::EDI).addReg(X86::EDI),
+          *Subtarget);
+    } else if (MinSize == 2 && Opcode == X86::PUSH64r) {
       // This is an optimization that lets us get away without emitting a nop in
       // many cases.
       //
       // NB! In some cases the encoding for PUSH64r (e.g. PUSH64r %r9) takes two
       // bytes too, so the check on MinSize is important.
       MCI.setOpcode(X86::PUSH64rmr);
     } else {
       unsigned NopSize = emitNop(*OutStreamer, MinSize, Subtarget);
       assert(NopSize == MinSize && "Could not implement MinSize!");
       (void)NopSize;
     }
   }

   OutStreamer->emitInstruction(MCI, getSubtargetInfo());
 }

 // Lower a stackmap of the form:
 // <id>, <shadowBytes>, ...
 void X86AsmPrinter::LowerSTACKMAP(const MachineInstr &MI) {
   SMShadowTracker.emitShadowPadding(*OutStreamer, getSubtargetInfo());

llvm/test/CodeGen/X86/patchable-function-entry.ll

@@ ... @@
 ;; Without -function-sections, f2 is in the same text section as f1.
 ;; They share the __patchable_function_entries section.
 ;; With -function-sections, f1 and f2 are in different text sections.
 ;; Use separate __patchable_function_entries.
 define void @f2() "patchable-function-entry"="2" {
 ; CHECK-LABEL: f2:
 ; CHECK-NEXT: .Lfunc_begin2:
-; 32-COUNT-2: nop
+; 32: xchgw %ax, %ax
 ; 64: xchgw %ax, %ax
 ; CHECK-NEXT: ret
 ; CHECK: .section __patchable_function_entries,"awo",@progbits,f2{{$}}
 ; 32: .p2align 2
 ; 32-NEXT: .long .Lfunc_begin2
 ; 64: .p2align 3
 ; 64-NEXT: .quad .Lfunc_begin2
   ret void
 }

 $f3 = comdat any
 define void @f3() "patchable-function-entry"="3" comdat {
 ; CHECK-LABEL: f3:
 ; CHECK-NEXT: .Lfunc_begin3:
-; 32-COUNT-3: nop
+; 32: xchgw %ax, %ax
+; 32-NEXT: nop
 ; 64: nopl (%rax)
 ; CHECK: ret
 ; CHECK: .section __patchable_function_entries,"aGwo",@progbits,f3,comdat,f3{{$}}
 ; 32: .p2align 2
 ; 32-NEXT: .long .Lfunc_begin3
 ; 64: .p2align 3
 ; 64-NEXT: .quad .Lfunc_begin3
   ret void
 }

 $f5 = comdat any
 define void @f5() "patchable-function-entry"="5" comdat {
 ; CHECK-LABEL: f5:
 ; CHECK-NEXT: .Lfunc_begin4:
-; 32-COUNT-5: nop
+; 32-COUNT-2: xchgw %ax, %ax
+; 32-NEXT: nop
 ; 64: nopl 8(%rax,%rax)
 ; CHECK-NEXT: ret
 ; CHECK: .section __patchable_function_entries,"aGwo",@progbits,f5,comdat,f5{{$}}
 ; 32: .p2align 2
 ; 32-NEXT: .long .Lfunc_begin4
 ; 64: .p2align 3
 ; 64-NEXT: .quad .Lfunc_begin4
   ret void
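
The updated 32-bit expectations follow from filling the requested padding with the largest NOP allowed, then repeating for the remainder. A minimal sketch of that greedy split, assuming the padding loop simply requests the largest NOP that fits each time (`splitNopPadding` is a hypothetical helper, not the LLVM function):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Split NumBytes of padding into NOP sizes, largest first, where no single
// NOP may exceed MaxNopLength bytes.
std::vector<unsigned> splitNopPadding(unsigned NumBytes,
                                      unsigned MaxNopLength) {
  assert(MaxNopLength >= 1 && "the one-byte NOP must always be available");
  std::vector<unsigned> Sizes;
  while (NumBytes > 0) {
    unsigned Size = std::min(NumBytes, MaxNopLength);
    Sizes.push_back(Size);
    NumBytes -= Size;
  }
  return Sizes;
}
```

With a cap of 2, which is what the patched maxLongNopLength returns for 32-bit targets without FeatureNOPL, 3 bytes split into {2, 1} (one xchgw plus one nop) and 5 bytes into {2, 2, 1} (two xchgw plus one nop), matching the 32 check lines for f3 and f5. With a cap of 10, 5 bytes stay a single 5-byte nopl, as in the 64 check line for f5.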

llvm/test/CodeGen/X86/patchable-prologue.ll

 ; RUN: llc -verify-machineinstrs -filetype=obj -o - -mtriple=x86_64-apple-macosx < %s | llvm-objdump --triple=x86_64-apple-macosx -d - | FileCheck %s
 ; RUN: llc -verify-machineinstrs -mtriple=x86_64-apple-macosx < %s | FileCheck %s --check-prefix=CHECK-ALIGN
+; RUN: llc -verify-machineinstrs -show-mc-encoding -mtriple=i386 < %s | FileCheck %s --check-prefixes=32,32CFI,XCHG
+; RUN: llc -verify-machineinstrs -show-mc-encoding -mtriple=i386-windows-msvc < %s | FileCheck %s --check-prefixes=32,MOV
+; RUN: llc -verify-machineinstrs -show-mc-encoding -mtriple=i386-windows-msvc -mcpu=pentium3 < %s | FileCheck %s --check-prefixes=32,MOV
+; RUN: llc -verify-machineinstrs -show-mc-encoding -mtriple=i386-windows-msvc -mcpu=pentium4 < %s | FileCheck %s --check-prefixes=32,XCHG
+; RUN: llc -verify-machineinstrs -show-mc-encoding -mtriple=x86_64-windows-msvc < %s | FileCheck %s --check-prefix=64
+; RUN: llc -verify-machineinstrs -show-mc-encoding -mtriple=i386-unknown-linux-code16 < %s | FileCheck %s --check-prefix=16
+
+; 16-NOT: movl %edi, %edi
+; 16-NOT: xchgw %ax, %ax

 declare void @callee(i64*)

 define void @f0() "patchable-function"="prologue-short-redirect" {
 ; CHECK-LABEL: _f0{{>?}}:
 ; CHECK-NEXT: 66 90 nop

 ; CHECK-ALIGN: .p2align 4, 0x90
 ; CHECK-ALIGN: _f0:

+; 32: f0:
+; 32CFI-NEXT: .cfi_startproc
+; 32-NEXT: # %bb.0:
+; XCHG-NEXT: xchgw %ax, %ax # encoding: [0x66,0x90]
+; MOV-NEXT: movl %edi, %edi # encoding: [0x8b,0xff]
+; 32-NEXT: retl
+
+; 64: f0:
+; 64-NEXT: # %bb.0:
+; 64-NEXT: xchgw %ax, %ax # encoding: [0x66,0x90]
+; 64-NEXT: retq
+
   ret void
 }

 define void @f1() "patchable-function"="prologue-short-redirect" "frame-pointer"="all" {
 ; CHECK-LABEL: _f1
 ; CHECK-NEXT: ff f5 pushq %rbp

 ; CHECK-ALIGN: .p2align 4, 0x90
 ; CHECK-ALIGN: _f1:

+; 32: f1:
+; 32CFI-NEXT: .cfi_startproc
+; 32-NEXT: # %bb.0:
+; XCHG-NEXT: xchgw %ax, %ax # encoding: [0x66,0x90]
+; MOV-NEXT: movl %edi, %edi # encoding: [0x8b,0xff]
+; 32-NEXT: pushl %ebp
+
+; 64: f1:
+; 64-NEXT: .seh_proc f1
+; 64-NEXT: # %bb.0:
+; 64-NEXT: pushq %rbp
+
   ret void
 }

 define void @f2() "patchable-function"="prologue-short-redirect" {
 ; CHECK-LABEL: _f2
 ; CHECK-NEXT: 48 81 ec a8 00 00 00 subq $168, %rsp

 ; CHECK-ALIGN: .p2align 4, 0x90
 ; CHECK-ALIGN: _f2:

+; 32: f2:
+; 32CFI-NEXT: .cfi_startproc
+; 32-NEXT: # %bb.0:
+; XCHG-NEXT: xchgw %ax, %ax # encoding: [0x66,0x90]
+; MOV-NEXT: movl %edi, %edi # encoding: [0x8b,0xff]
+; 32-NEXT: pushl %ebp
+
+; 64: f2:
+; 64-NEXT: .seh_proc f2
+; 64-NEXT: # %bb.0:
+; 64-NEXT: subq $200, %rsp
+
   %ptr = alloca i64, i32 20
   call void @callee(i64* %ptr)
   ret void
 }

 define void @f3() "patchable-function"="prologue-short-redirect" optsize {
 ; CHECK-LABEL: _f3
 ; CHECK-NEXT: 66 90 nop

 ; CHECK-ALIGN: .p2align 4, 0x90
 ; CHECK-ALIGN: _f3:

+; 32: f3:
+; 32CFI-NEXT: .cfi_startproc
+; 32-NEXT: # %bb.0:
+; XCHG-NEXT: xchgw %ax, %ax
+; MOV-NEXT: movl %edi, %edi
+; 32-NEXT: retl
+
+; 64: f3:
+; 64-NEXT: # %bb.0:
+; 64-NEXT: xchgw %ax, %ax
+; 64-NEXT: retq
+
   ret void
 }

 ; This testcase happens to produce a KILL instruction at the beginning of the
 ; first basic block. In this case the 2nd instruction should be turned into a
 ; patchable one.
 ; CHECK-LABEL: f4{{>?}}:
 ; CHECK-NEXT: 8b 0c 37 movl (%rdi,%rsi), %ecx
+; 32: f4:
+; 32CFI-NEXT: .cfi_startproc
+; 32-NEXT: # %bb.0:
+; XCHG-NEXT: xchgw %ax, %ax
+; MOV-NEXT: movl %edi, %edi
+; 32-NEXT: pushl %ebx
+
+; 64: f4:
+; 64-NEXT: # %bb.0:
+; 64-NOT: xchgw %ax, %ax

 define i32 @f4(i8* %arg1, i64 %arg2, i32 %arg3) "patchable-function"="prologue-short-redirect" {
 bb:
   %tmp10 = getelementptr i8, i8* %arg1, i64 %arg2
   %tmp11 = bitcast i8* %tmp10 to i32*
   %tmp12 = load i32, i32* %tmp11, align 4
   fence acquire
   %tmp13 = add i32 %tmp12, %arg3
   %tmp14 = cmpxchg i32* %tmp11, i32 %tmp12, i32 %tmp13 seq_cst monotonic