Details

Reviewers

craig.topper
MaskRay
jyknight

Group Reviewers

Restricted Project

Commits

rGa9f9ceb35f81: [X86] Use correct padding when in 16-bit mode

Summary

In 16-bit mode, some of the nop patterns used in 32-bit mode can end up
mangling other instructions. For instance, an aligned "movz" instruction
may have the 0x66 and 0x67 prefixes omitted, because the nop that's used
messes things up.

xorl    %ebx, %ebx
.p2align 4, 0x90
movzbl  (%esi,%ebx), %ecx

Instead, use nop patterns that 16-bit mode is known to handle.
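
The review does not spell out why the 32-bit patterns misbehave. One plausible reading is that the long-NOP encodings rely on 32-bit ModRM/SIB addressing, which has a different length under the 16-bit addressing used in 16-bit mode. The sketch below is a hedged illustration of that reading, not code from the patch: `modrmBytes` and `longNopLength` are hypothetical helpers that only compute operand length, with no prefixes or immediates handled.

```cpp
#include <cstdint>

// Bytes occupied by ModRM + optional SIB + displacement, given a pointer to
// the ModRM byte and the effective address size. Hypothetical helper.
unsigned modrmBytes(const uint8_t *P, bool Addr16) {
  const unsigned Mod = P[0] >> 6, RM = P[0] & 7;
  if (Mod == 3)
    return 1; // register operand: ModRM only
  if (Addr16) {
    // 16-bit addressing has no SIB byte; displacements are 8 or 16 bits.
    if (Mod == 0)
      return RM == 6 ? 3 : 1; // rm=110 with mod=00 is a bare disp16
    return Mod == 1 ? 2 : 3;
  }
  // 32-bit addressing: rm=100 adds a SIB byte; displacements are 8 or 32 bits.
  const bool HasSIB = RM == 4;
  unsigned N = 1 + (HasSIB ? 1 : 0);
  if (Mod == 0) {
    if (RM == 5 || (HasSIB && (P[1] & 7) == 5))
      N += 4; // disp32 with no base register
  } else {
    N += Mod == 1 ? 1 : 4;
  }
  return N;
}

// Length of an unprefixed 0x0f 0x1f long NOP starting at Insn.
unsigned longNopLength(const uint8_t *Insn, bool Addr16) {
  return 2 + modrmBytes(Insn + 2, Addr16);
}
```

Under these assumptions, the 5-byte pattern `0f 1f 44 00 00` is 5 bytes long with 32-bit addressing but only 4 bytes with 16-bit addressing, so a 16-bit decoder would treat the trailing `00` as the start of the next instruction and pull in the bytes that follow.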

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

void created this revision. Feb 23 2021, 2:45 AM

Herald added a project: Restricted Project. Feb 23 2021, 2:45 AM

Herald added a reviewer: Restricted Project.

Herald added subscribers: pengfei, steven_wu, hiraditya, emaste.

void requested review of this revision. Feb 23 2021, 2:45 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B90365: Diff 325724. Feb 23 2021, 3:32 AM

This change also modifies (breaks!) the 32-bit NOP set, definitely shouldn't do that. (But I think all that can be reverted, in any case).

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1075–1077	I think we could just return 1 in 16-bit mode. There's no real reason to do nop-optimization in that mode, is there? Then the rest of the patch isn't necessary.

In D97268#2581555, @jyknight wrote:

This change also modifies (breaks!) the 32-bit NOP set, definitely shouldn't do that. (But I think all that can be reverted, in any case).

Which part did it break?

Reduce size and effect of patch.

craig.topper added inline comments. Feb 23 2021, 2:35 PM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1136	Isn't this xchg %eax, %eax since 0x66 would override the register to 32-bit in 16-bit mode?
1138	Should these %esi's be %si like the ones below?

Correct comments with proper disassembly.

void marked an inline comment as done. Feb 23 2021, 2:55 PM

In D97268#2582826, @void wrote:

In D97268#2581555, @jyknight wrote:

This change also modifies (breaks!) the 32-bit NOP set, definitely shouldn't do that. (But I think all that can be reverted, in any case).

Which part did it break?

It replaced the single-instruction NOPs with new sequences consisting of multiple instructions.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1142	If you want to go the complex route, and support multi-byte nops in 16bit mode: The 5 and larger-byte strings here should be removed, as they're combinations of smaller instructions. Leave only the initial 4 elements in the Nops16Bit array. Therefore, getMaximumNopSize() should return 4 in 16bit mode.
1152	This code is unnecessary. Rename Nops to Nops32Bit, make both arrays have a second dimension of 11, and then you can do `const char (*Nops)[11] = STI.getFeatureBits()[X86::Mode16Bit] ? Nops16Bit : Nops32Bit;` -- and then the existing loop below is sufficient.

In D97268#2583270, @jyknight wrote:

In D97268#2582826, @void wrote:

In D97268#2581555, @jyknight wrote:

This change also modifies (breaks!) the 32-bit NOP set, definitely shouldn't do that. (But I think all that can be reverted, in any case).

Which part did it break?

It replaced the single-instruction NOPs with new sequences consisting of multiple instructions.

That...shouldn't matter since a NOP is a NOP. Regardless, I removed that modification to simplify the patch.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1142	That is needlessly complex. These patterns are from binutils, which claims that they're "efficient." The code in this patch is far easier to understand, in my opinion.

Harbormaster completed remote builds in B90470: Diff 325879. Feb 23 2021, 3:52 PM

FWIW: I recall there being a header in the Linux kernel for various nop sleds: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/nops.h. I don't recall the context for why these different sequences exist. There seem to be commit messages about whether certain sequences are atomic or not. (Perhaps longer sequences are not interruptible by faults, for instance.)

Harbormaster completed remote builds in B90486: Diff 325906. Feb 23 2021, 5:46 PM

MaskRay added inline comments. Feb 23 2021, 5:59 PM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1075–1077	I agree that performance in 16-bit mode is not important. We can just return 1 :)
llvm/test/MC/X86/code16gcc-align.s
26	The input contains many instructions that are not checked. They should just be replaced by `.nops number`.

In D97268#2583455, @nickdesaulniers wrote:

FWIW: I recall there being a header in the Linux kernel for various nop sleds: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/nops.h. I don't recall the context for why these different sequences exist. There seem to be commit messages about whether certain sequences are atomic or not. (Perhaps longer sequences are not interruptible by faults, for instance.)

The kernel uses at least 5-byte and 3-byte nops for code patching. Long nops are used purely for performance (they are easier on the decoder). If performance is not a concern, as with the 16-bit code here, 1-byte nops are no worse than long nops.
(For code patching, atomics matter for the leading jmp/call instruction.)

In D97268#2583366, @void wrote:

That...shouldn't matter since a NOP is a NOP. Regardless, I removed that modification to simplify the patch.

It _does_ matter. To start with it can affect performance (marginally).

But more importantly, if you're doing tricky things with runtime code modification, it's a correctness issue. A trap won't leave the instruction pointer in the middle of a long-nop instruction (which you may be about to replace with another instruction), but it can leave it between two distinct nops.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1075–1077	Actually, changing my mind here. Supporting 4-byte NOPs in 16-bit mode is useful, in order that `.nops 4, 4` works (dunno if anyone actually does that in practice, but it is a reasonable thing to support.)
1142	Adding an entirely distinct second method for emitting NOPs is not the answer to complexity of the existing implementation. Using the same code with two sets of data is less complexity overall -- just need a single LOC to choose the correct table...

In D97268#2584819, @jyknight wrote:

In D97268#2583366, @void wrote:

That...shouldn't matter since a NOP is a NOP. Regardless, I removed that modification to simplify the patch.

It _does_ matter. To start with it can affect performance (marginally).

But more importantly, if you're doing tricky things with runtime code modification, it's a correctness issue. A trap won't leave the instruction pointer in the middle of a long-nop instruction (which you may be about to replace with another instruction), but it can leave it between two distinct nops.

I'd argue that because gas is comfortable with it, it's most likely a non-issue, especially since you can't guarantee that a sequence of nops will always be one instruction long. But like I said, I reverted that bit.

void marked an inline comment as done. Feb 24 2021, 2:21 PM

Rework 16-bit nops to use existing loop.

Please reduce the test case per MaskRay's comment, then lg.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1076	Needs to move below the FeatureNOPL stanza.

Reduce testcase.

Harbormaster completed remote builds in B90686: Diff 326205. Feb 24 2021, 5:21 PM

Harbormaster completed remote builds in B90695: Diff 326218. Feb 24 2021, 6:01 PM

MaskRay added inline comments. Feb 24 2021, 7:29 PM

llvm/test/MC/X86/code16gcc-align.s
8	Need a test for the 2-byte nop.
26	All the following instructions and directives can be deleted. You can drop .LBB0_2 too.

Test all nop sizes.

Harbormaster completed remote builds in B90737: Diff 326274. Feb 24 2021, 11:04 PM

craig.topper added inline comments. Feb 25 2021, 12:22 PM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1080	Should this just be above the NOPL check if we're going to AND NOPL with !STI.hasFeature(X86::Mode16Bit)? Should we consistently use hasFeature in this function instead of mixing it with getFeatureBits()?

void added inline comments. Feb 25 2021, 3:37 PM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1080	I was told to place this here and not at the start.

craig.topper added inline comments. Feb 25 2021, 3:40 PM

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1080	Sorry. I realize that, but that request from James doesn't make sense to me unless James thought the 16-bit NOPs required NOPL. Which they don't. @jyknight can you confirm why you asked for that change?

Use "hasFeature"

Harbormaster completed remote builds in B90914: Diff 326533. Feb 25 2021, 6:44 PM

jyknight accepted this revision. Feb 25 2021, 7:35 PM

jyknight added inline comments.

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
1080	What I had intended to ask was that with !NOPL, the function should return 1 rather than 4 even in 16-bit mode. And the reason I asked for that is if we were disabling the 2-byte nop in 32-bit mode, surely we should also be disabling it in 16-bit mode, too -- the 2-byte NOP is the same in the 16 and 32-bit nop lists.
But, looking deeper, this was incorrect, because as far as I can tell there was never a reason for the code to be disabling 0x66 0x90 for !NOPL in the first place. So, we could've/should've been returning at least 2 even with the current single table.
We could also provide a 3rd version of the nop-table for !16bit && !NOPL && !X86_64, having `lea 0(%esi),%esi` and `lea 0(%esi,1),%esi` as the 3/4-byte NOPs. I don't know if that'd be a good idea though -- I could imagine that using 3/4-byte LEA pseudo-NOPs instead of 1/2-byte actual NOPs might marginally degrade performance when people are compiling for generic-i686 and running on a modern CPU -- even though it does result in fewer instructions.
So, upshot: yep, just put this back the way you had it before. (And if you want to tackle the other issues too, that'd be fine, but also fine if not.)
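
The ordering this thread settles on (16-bit check first, then the NOPL check) can be sketched as a standalone function. This is a minimal sketch rather than the LLVM source: the `Features` struct and the `maxNopSize` name are hypothetical stand-ins for the `STI.hasFeature(...)` queries, and the return values are copied from the diff in this revision.

```cpp
// Hypothetical stand-in for the subtarget feature queries used in the patch.
struct Features {
  bool Mode16Bit = false;
  bool Mode64Bit = false;
  bool NOPL = false;
  bool Fast7ByteNOP = false;
  bool Fast15ByteNOP = false;
  bool Fast11ByteNOP = false;
};

// Mirrors the order of checks in the committed getMaximumNopSize().
unsigned maxNopSize(const Features &F) {
  if (F.Mode16Bit)
    return 4; // the 16-bit table has four entries and needs no NOPL
  if (!F.NOPL && !F.Mode64Bit)
    return 1; // no long NOPs available: fall back to single-byte nop
  if (F.Fast7ByteNOP)
    return 7;
  if (F.Fast15ByteNOP)
    return 15;
  if (F.Fast11ByteNOP)
    return 11;
  return 10; // longest pattern that is commonly decoded efficiently
}
```

Because the 16-bit check comes first, a 16-bit subtarget gets 4-byte padding whether or not NOPL is set, which matches the observation above that the 16-bit patterns do not require NOPL.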

This revision is now accepted and ready to land. Feb 25 2021, 7:35 PM

Closed by commit rGa9f9ceb35f81: [X86] Use correct padding when in 16-bit mode (authored by void). Feb 25 2021, 8:05 PM

This revision was automatically updated to reflect the committed changes.

void added a commit: rGa9f9ceb35f81: [X86] Use correct padding when in 16-bit mode.

Diff 326586

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp

 #endif
   // The layout is done. Mark every fragment as valid.
   for (unsigned int i = 0, n = Layout.getSectionOrder().size(); i != n; ++i) {
     MCSection &Section = *Layout.getSectionOrder()[i];
     Layout.getFragmentOffset(&*Section.getFragmentList().rbegin());
     Asm.computeFragmentSize(Layout, *Section.getFragmentList().rbegin());
   }
 }

 unsigned X86AsmBackend::getMaximumNopSize() const {
+  if (STI.hasFeature(X86::Mode16Bit))
+    return 4;
   if (!STI.hasFeature(X86::FeatureNOPL) && !STI.hasFeature(X86::Mode64Bit))
     return 1;
   if (STI.getFeatureBits()[X86::FeatureFast7ByteNOP])
     return 7;
   if (STI.getFeatureBits()[X86::FeatureFast15ByteNOP])
     return 15;
   if (STI.getFeatureBits()[X86::FeatureFast11ByteNOP])
     return 11;
   // FIXME: handle 32-bit mode
   // 15-bytes is the longest single NOP instruction, but 10-bytes is
   // commonly the longest that can be efficiently decoded.
   return 10;
 }

 /// Write a sequence of optimal nops to the output, covering \p Count
 /// bytes.
 /// \return - true on success, false on failure
 bool X86AsmBackend::writeNopData(raw_ostream &OS, uint64_t Count) const {
-  static const char Nops[10][11] = {
+  static const char Nops32Bit[10][11] = {
     // nop
     "\x90",
     // xchg %ax,%ax
     "\x66\x90",
     // nopl (%[re]ax)
     "\x0f\x1f\x00",
     // nopl 0(%[re]ax)
     "\x0f\x1f\x40\x00",
     // nopl 0(%[re]ax,%[re]ax,1)
     "\x0f\x1f\x44\x00\x00",
     // nopw 0(%[re]ax,%[re]ax,1)
     "\x66\x0f\x1f\x44\x00\x00",
     // nopl 0L(%[re]ax)
     "\x0f\x1f\x80\x00\x00\x00\x00",
     // nopl 0L(%[re]ax,%[re]ax,1)
     "\x0f\x1f\x84\x00\x00\x00\x00\x00",
     // nopw 0L(%[re]ax,%[re]ax,1)
     "\x66\x0f\x1f\x84\x00\x00\x00\x00\x00",
     // nopw %cs:0L(%[re]ax,%[re]ax,1)
     "\x66\x2e\x0f\x1f\x84\x00\x00\x00\x00\x00",
   };

+  // 16-bit mode uses different nop patterns than 32-bit.
+  static const char Nops16Bit[4][11] = {
+    // nop
+    "\x90",
+    // xchg %eax,%eax
+    "\x66\x90",
+    // lea 0(%si),%si
+    "\x8d\x74\x00",
+    // lea 0w(%si),%si
+    "\x8d\xb4\x00\x00",
+  };
+
+  const char(*Nops)[11] =
+      STI.getFeatureBits()[X86::Mode16Bit] ? Nops16Bit : Nops32Bit;
+
   uint64_t MaxNopLength = (uint64_t)getMaximumNopSize();

   // Emit as many MaxNopLength NOPs as needed, then emit a NOP of the remaining
   // length.
   do {
     const uint8_t ThisNopLength = (uint8_t) std::min(Count, MaxNopLength);
     const uint8_t Prefixes = ThisNopLength <= 10 ? 0 : ThisNopLength - 10;
     for (uint8_t i = 0; i < Prefixes; i++)
       OS << '\x66';
     const uint8_t Rest = ThisNopLength - Prefixes;
     if (Rest != 0)
       OS.write(Nops[Rest - 1], Rest);
     Count -= ThisNopLength;
   } while (Count != 0);

   return true;
 }

 /* *** */

 namespace {

 class ELFX86AsmBackend : public X86AsmBackend {
 public:
   uint8_t OSABI;
   ELFX86AsmBackend(const Target &T, uint8_t OSABI, const MCSubtargetInfo &STI)
       : X86AsmBackend(T, STI), OSABI(OSABI) {}

llvm/test/MC/X86/code16gcc-align.s

This file was added.

# RUN: llvm-mc -filetype=obj -triple=i386-unknown-unknown-code16 %s | llvm-objdump --triple=i386-unknown-unknown-code16 -d - | FileCheck %s

# Ensure that the "movzbl" is aligned such that the prefixes 0x67 0x66 are
# properly included in the "movz" instruction.

# CHECK-LABEL: <test>:
# CHECK: 1c: 8d b4 00 00 leaw (%si), %si
# CHECK-NEXT: 20: 66 90 nop
# CHECK-NEXT: 22: 66 89 c7 movl %eax, %edi
# CHECK-NEXT: 25: 66 31 db xorl %ebx, %ebx
# CHECK-NEXT: 28: 8d b4 00 00 leaw (%si), %si
# CHECK-NEXT: 2c: 8d b4 00 00 leaw (%si), %si
# CHECK-NEXT: 30: 67 66 0f b6 0c 1e movzbl (%esi,%ebx), %ecx
# CHECK-NEXT: 36: 66 e8 14 00 00 00 calll 0x50 <called>
# CHECK-NEXT: 3c: 8d 74 00 leaw (%si), %si

# CHECK-LABEL: <called>:
# CHECK-NEXT: 50: 90 nop
# CHECK-NEXT: 51: 66 c3 retl

        .text
        .code16gcc
        .globl test
        .p2align 4, 0x90
        .type test,@function
test:
        .nops 34
        movl %eax, %edi
        xorl %ebx, %ebx
        .p2align 4, 0x90
        movzbl (%esi,%ebx), %ecx
        calll called
        .nops 3
        retl

        .p2align 4, 0x90
        .type called,@function
called:
        .nops 1
        retl
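
The offsets in the CHECK lines follow from how padding is split into NOPs: as many maximum-length NOPs as fit, then one shorter NOP for the remainder. The sketch below reproduces that arithmetic outside of LLVM. It assumes the 16-bit maximum of 4 bytes from this patch; `splitNops` and `emitNops16` are hypothetical helpers, and a `std::string` stands in for the output stream.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// 16-bit NOP table from the patch: entry N-1 holds the N-byte pattern.
static const char Nops16Bit[4][11] = {
    "\x90",             // nop
    "\x66\x90",         // 2-byte nop (operand-size prefix + nop)
    "\x8d\x74\x00",     // lea 0(%si),%si
    "\x8d\xb4\x00\x00", // lea 0w(%si),%si
};

// Split Count padding bytes into NOP sizes: as many MaxLen-byte NOPs as fit,
// followed by one shorter NOP covering the remainder.
std::vector<unsigned> splitNops(uint64_t Count, unsigned MaxLen) {
  std::vector<unsigned> Sizes;
  while (Count != 0) {
    const unsigned Len = (unsigned)std::min<uint64_t>(Count, MaxLen);
    Sizes.push_back(Len);
    Count -= Len;
  }
  return Sizes;
}

// Emit Count bytes of 16-bit padding using the table above.
std::string emitNops16(uint64_t Count) {
  std::string Out;
  for (unsigned Len : splitNops(Count, 4))
    Out.append(Nops16Bit[Len - 1], Len);
  return Out;
}
```

Under these assumptions, `.nops 34` becomes eight 4-byte `lea` instructions (the last at offset 0x1c) followed by one 2-byte nop at 0x20, and the 8 bytes of alignment padding between 0x28 and 0x30 become two 4-byte `lea` instructions, which is what the CHECK lines expect.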

This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use correct padding when in 16-bit mode
ClosedPublic
