This is an archive of the discontinued LLVM Phabricator instance.

Differential D130902

[llvm-objdump,ARM] Fix big-endian AArch32 disassembly.
ClosedPublic

Authored by simon_tatham on Aug 1 2022, 7:31 AM.

Details

Reviewers

DavidSpickett
ostannard
MaskRay
jhenderson

Commits

rG72017e9b16b7: [llvm-objdump,ARM] Fix big-endian AArch32 disassembly.

Summary

The ABI for big-endian AArch32, as specified by AAELF32, is above-
averagely complicated. Relocatable object files are expected to store
instruction encodings in byte order matching the ELF file's endianness
(so, big-endian for a BE ELF file). But executable images can
either do that or store instructions little-endian regardless
of data and ELF endianness (to support BE32 and BE8 platforms
respectively). They signal the latter by setting the EF_ARM_BE8 flag
in the ELF header.
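
The rule in the paragraph above can be sketched as a small standalone predicate. This is only an illustration, not the code from the patch: `ArmElfSummary` and `instructionsAreBigEndian` are hypothetical names, and the struct is an assumed, simplified stand-in for the ELF header fields involved. Only the `EF_ARM_BE8` value is taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Flag value as added to llvm/BinaryFormat/ELF.h in this revision.
constexpr uint32_t EF_ARM_BE8 = 0x00800000U;

// Hypothetical, simplified summary of the ELF header fields that matter here.
struct ArmElfSummary {
  bool BigEndianData; // EI_DATA is ELFDATA2MSB
  bool Relocatable;   // e_type is ET_REL
  uint32_t Flags;     // e_flags
};

// Returns true if instruction encodings are stored big-endian.
inline bool instructionsAreBigEndian(const ArmElfSummary &Elf) {
  if (!Elf.BigEndianData)
    return false; // little-endian ELF: instructions are little-endian too
  if (Elf.Relocatable)
    return true; // object files follow the ELF file's endianness
  // Executable images: BE32 unless the header signals BE8.
  return (Elf.Flags & EF_ARM_BE8) == 0;
}

// Usage example covering the three big-endian cases from the summary.
inline void checkEndiannessRule() {
  assert(instructionsAreBigEndian({true, true, 0}));            // BE object
  assert(instructionsAreBigEndian({true, false, 0}));           // BE32 image
  assert(!instructionsAreBigEndian({true, false, EF_ARM_BE8})); // BE8 image
}
```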

(In the case of the Thumb instruction set, this all means that each
16-bit halfword of a Thumb instruction is stored in one or other
endianness. The two halfwords of a 32-bit Thumb instruction must
appear in the same order no matter what, because the first halfword is
the one that must avoid overlapping the encoding of any 16-bit Thumb
instruction.)
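
To make the Thumb case concrete, here is a minimal sketch of assembling a 32-bit Thumb encoding from a byte stream. The helper names `readHalfword` and `readThumb32` are hypothetical; the patch itself uses `llvm::support::endian::read<uint16_t>` for the per-halfword read. The sample bytes are the `add.w` encoding `eb00 3080` from the tests in this revision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read one 16-bit halfword in the requested byte order.
inline uint16_t readHalfword(const uint8_t *P, bool BigEndian) {
  return BigEndian ? uint16_t((P[0] << 8) | P[1])
                   : uint16_t((P[1] << 8) | P[0]);
}

// Assemble a 32-bit Thumb encoding. The first halfword in the stream always
// supplies the high 16 bits; only the bytes inside each halfword swap.
inline uint32_t readThumb32(const uint8_t *P, size_t Size, bool BigEndian) {
  assert(Size >= 4 && "need four bytes for a 32-bit Thumb instruction");
  uint32_t Hi = readHalfword(P, BigEndian);
  uint32_t Lo = readHalfword(P + 2, BigEndian);
  return (Hi << 16) | Lo;
}
```

With big-endian storage the bytes `EB 00 30 80` and with little-endian storage the bytes `00 EB 80 30` both yield `0xEB003080`.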

llvm-objdump was unconditionally expecting Arm instructions to be
stored little-endian. So it would correctly disassemble a BE8 image,
but if you gave it a BE32 image or a BE object file, it would retrieve
every instruction in byte-swapped form and disassemble it to
nonsense. (Even an object file output by LLVM itself, because
ARMMCCodeEmitter outputs instructions big-endian in big-endian mode,
which is correct for writing an object file.)
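
The failure mode can be reproduced outside LLVM with a few lines. This is a sketch under assumptions: `readWord` is a hypothetical stand-in for the endian-aware read, and the sample word is the `bx lr` encoding `e12fff1e` from the tests in this revision.

```cpp
#include <cassert>
#include <cstdint>

// Read a 32-bit Arm instruction word in the requested byte order.
inline uint32_t readWord(const uint8_t *P, bool BigEndian) {
  if (BigEndian)
    return (uint32_t(P[0]) << 24) | (uint32_t(P[1]) << 16) |
           (uint32_t(P[2]) << 8) | uint32_t(P[3]);
  return (uint32_t(P[3]) << 24) | (uint32_t(P[2]) << 16) |
         (uint32_t(P[1]) << 8) | uint32_t(P[0]);
}

// Usage example: `bx lr` as stored in a BE object file or BE32 image.
inline void checkMisread() {
  const uint8_t Stored[] = {0xE1, 0x2F, 0xFF, 0x1E};
  assert(readWord(Stored, /*BigEndian=*/true) == 0xE12FFF1Eu);  // intended
  assert(readWord(Stored, /*BigEndian=*/false) == 0x1EFF2FE1u); // byte-swapped
}
```

Reading the same four bytes little-endian produces the byte-swapped word, which is what the disassembler was being handed before this change.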

This patch allows llvm-objdump to correctly disassemble all three of
those classes of Arm ELF file. It does it by introducing a new
SubtargetFeature for big-endian instructions, setting it from the ELF
image type and flags during llvm-objdump setup, and teaching both
ARMDisassembler and llvm-objdump itself to pay attention to it when
retrieving instruction data from a section being disassembled.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	1,770 ms	x64 debian > SanitizerCommon-asan-x86_64-Linux.SanitizerCommon-asan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
	1,700 ms	x64 debian > SanitizerCommon-lsan-x86_64-Linux.SanitizerCommon-lsan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
	1,640 ms	x64 debian > SanitizerCommon-msan-x86_64-Linux.SanitizerCommon-msan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp

Event Timeline

simon_tatham created this revision. · Aug 1 2022, 7:31 AM

Herald added a reviewer: jhenderson. · Aug 1 2022, 7:31 AM

Herald added a project: Restricted Project.

Herald added subscribers: StephenFan, rupprecht, hiraditya and 2 others.

simon_tatham requested review of this revision. · Aug 1 2022, 7:31 AM

Herald added a project: Restricted Project. · Aug 1 2022, 7:31 AM

Herald added a subscriber: llvm-commits.

Updated with arc lint style fixes.

(I didn't apply them ahead of time because I was confused by output from arc that appeared to want to reformat most of ARMDisassembler.cpp! But it turns out that was some kind of intermediate output, and not part of the final patch it wanted to apply.)

DavidSpickett added inline comments. · Aug 1 2022, 7:54 AM

llvm/lib/Target/ARM/ARM.td
748	When you say "in memory" here, does it refer to "memory" on the device or "memory" in the object file? I think from the last sentence, you mean inside the object file, which is what objdump cares about. Because if I had an object file that was going to be byte reversed by the linker then I'd need to disassemble the object file in one endian, but if I was disassembling the content of memory on the device, I'd have to use the opposite endian.

simon_tatham added inline comments. · Aug 1 2022, 7:57 AM

llvm/lib/Target/ARM/ARM.td
748	Hmmm. It's true, of course, that when you read a relocatable object file, the endianness in memory after linking is not yet specified, because the linker may or may not choose to byte-reverse the instructions and set the BE8 flag. But I had imagined it would be clear that "in memory" here means in the memory that the disassembler is reading from, not the memory on the target! Do you have a suggested alternative wording that doesn't end up being a whole paragraph? Perhaps just deleting "in memory" would be simplest?

Harbormaster completed remote builds in B178568: Diff 449016. · Aug 1 2022, 8:18 AM

MaskRay added inline comments. · Aug 2 2022, 12:19 AM

llvm/tools/llvm-objdump/llvm-objdump.cpp
683	printInst is called for every instruction. Calling checkFeatures here may be too expensive.

MaskRay added inline comments. · Aug 2 2022, 12:21 AM

llvm/test/tools/llvm-objdump/ELF/ARM/be32-image-disasm.test
4 ↗	(On Diff #449016)	Add `-NEXT` whenever appropriate
45 ↗	(On Diff #449016)	delete `...`
llvm/test/tools/llvm-objdump/ELF/ARM/be8-image-disasm.test
1 ↗	(On Diff #449016)	Do we need two files? Grep `yaml2obj .*-D` and use it to combine tests into one file.
3 ↗	(On Diff #449016)	Add -NEXT whenever appropriate
45 ↗	(On Diff #449016)	delete `...`

DavidSpickett added inline comments. · Aug 2 2022, 2:52 AM

llvm/lib/Target/ARM/ARM.td
748	> Perhaps just deleting "in memory" would be simplest?
	Sounds good to me.

Addressed all review comments, I think.

I couldn't find any convenient way to make
STI.getFeatureBits()[ARM::ModeBigEndianInstructions] work in
llvm-objdump.cpp, because that tool doesn't have the right include
path to include the generated file where those enums are defined. So
I've made the prettyprinter have its own endianness flag that's set
directly at the same time as setting that feature in the
SubtargetInfo, and that will save having to check anything in the STI
at all.

Harbormaster completed remote builds in B178744: Diff 449257. · Aug 2 2022, 6:08 AM

LGTM if @MaskRay is cool with it.

I remembered that this patch https://reviews.llvm.org/D48811 was posted ages ago and never went anywhere. Do you have an opinion on that? It makes objdump disassemble as big endian if there is eb in the target name. From what I read here I don't think it's needed because the important thing is the format in the object file. Is that correct?

llvm/lib/Target/ARM/ARM.td
733	`// Endianness of instruction encodings`

I didn't know about that previous patch at all, but off the top of my head, yes, I don't think it should be necessary at least for ELF files, because the combination of ELF header endianness and EF_ARM_BE8 flag is enough to autodetect the right code endianness in all cases.

For other Arm object file formats, on the other hand, I can't say the same. I know nothing at all about big-endian support in Arm COFF, for example, including whether it has any BE support at all!

I wouldn't rule out the possibility that at some point an instance of MCDisassembler might need to have its code endianness manually set to the right state – but the facility provided in this commit can be reused for that, if and when the use case shows up, because all you have to do is to set the new big-endian-instructions subtarget feature.

MaskRay added inline comments. · Aug 5 2022, 10:22 AM

llvm/test/tools/llvm-objdump/ELF/ARM/be-image-disasm.test
1	You can place be-image-disasm.test and be-object-disasm.test in one file. Grep `docnum` on existing tests.
4	Binary utility test directories use `##` for non-RUN non-CHECK comments. They make comments stand out and they may be highlighted with different styles in an editor.

LGTM

This revision was not accepted when it landed; it landed in state Needs Review. · Aug 8 2022, 2:50 AM

This revision was landed with ongoing or failed builds.

Closed by commit rG72017e9b16b7: [llvm-objdump,ARM] Fix big-endian AArch32 disassembly. (authored by simon_tatham).

This revision was automatically updated to reflect the committed changes.

simon_tatham added a commit: rG72017e9b16b7: [llvm-objdump,ARM] Fix big-endian AArch32 disassembly.

Oh, oops, caught out again :-( I really must stop assuming that a Phab comment saying "LGTM" actually marks the patch as accepted. Sorry about that. Will revert if you need me to.

alanphipps added a subscriber: alanphipps. · Oct 7 2022, 8:42 AM

Revision Contents

	Path	Size
	llvm/include/llvm/BinaryFormat/ELF.h	1 line
	llvm/lib/ObjectYAML/ELFYAML.cpp	1 line
	llvm/lib/Target/ARM/ARM.td	18 lines
	llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp	20 lines
	llvm/test/tools/llvm-objdump/ELF/ARM/be-image-disasm.test	58 lines
	llvm/test/tools/llvm-objdump/ELF/ARM/be-object-disasm.test	38 lines
	llvm/tools/llvm-objdump/llvm-objdump.cpp	34 lines

Diff 449257

llvm/include/llvm/BinaryFormat/ELF.h

[429 lines not shown]
 };

 // ARM Specific e_flags
 enum : unsigned {
   EF_ARM_SOFT_FLOAT = 0x00000200U,     // Legacy pre EABI_VER5
   EF_ARM_ABI_FLOAT_SOFT = 0x00000200U, // EABI_VER5
   EF_ARM_VFP_FLOAT = 0x00000400U,      // Legacy pre EABI_VER5
   EF_ARM_ABI_FLOAT_HARD = 0x00000400U, // EABI_VER5
+  EF_ARM_BE8 = 0x00800000U,
   EF_ARM_EABI_UNKNOWN = 0x00000000U,
   EF_ARM_EABI_VER1 = 0x01000000U,
   EF_ARM_EABI_VER2 = 0x02000000U,
   EF_ARM_EABI_VER3 = 0x03000000U,
   EF_ARM_EABI_VER4 = 0x04000000U,
   EF_ARM_EABI_VER5 = 0x05000000U,
   EF_ARM_EABIMASK = 0xFF000000U
 };
[1,371 lines not shown]

llvm/lib/ObjectYAML/ELFYAML.cpp

[418 lines not shown] case ELF::EM_ARM:
     BCase(EF_ARM_SOFT_FLOAT);
     BCase(EF_ARM_VFP_FLOAT);
     BCaseMask(EF_ARM_EABI_UNKNOWN, EF_ARM_EABIMASK);
     BCaseMask(EF_ARM_EABI_VER1, EF_ARM_EABIMASK);
     BCaseMask(EF_ARM_EABI_VER2, EF_ARM_EABIMASK);
     BCaseMask(EF_ARM_EABI_VER3, EF_ARM_EABIMASK);
     BCaseMask(EF_ARM_EABI_VER4, EF_ARM_EABIMASK);
     BCaseMask(EF_ARM_EABI_VER5, EF_ARM_EABIMASK);
+    BCaseMask(EF_ARM_BE8, EF_ARM_BE8);
     break;
   case ELF::EM_MIPS:
     BCase(EF_MIPS_NOREORDER);
     BCase(EF_MIPS_PIC);
     BCase(EF_MIPS_CPIC);
     BCase(EF_MIPS_ABI2);
     BCase(EF_MIPS_32BITMODE);
     BCase(EF_MIPS_FP64);
[1,494 lines not shown]

llvm/lib/Target/ARM/ARM.td

[724 lines not shown] def FeatureHardenSlsBlr : SubtargetFeature<"harden-sls-blr",
   "HardenSlsBlr", "true",
   "Harden against straight line speculation across indirect calls">;
 /// Generate thunk code for SLS mitigation in the normal text section.
 def FeatureHardenSlsNoComdat : SubtargetFeature<"harden-sls-nocomdat",
   "HardenSlsNoComdat", "true",
   "Generate thunk code for SLS mitigation in the normal text section">;

 //===----------------------------------------------------------------------===//
+// Endianness of instruction encodings in memory.
+//
+// In the current Arm architecture, this is usually little-endian regardless of
+// data endianness. But before Armv7 it was typical for instruction endianness
+// to match data endianness, so that a big-endian system was consistently big-
+// endian. And Armv7-R can be configured to use big-endian instructions.
+//
+// Additionally, even when targeting Armv7-A, big-endian instructions can be
+// found in relocatable object files, because the Arm ABI specifies that the
+// linker byte-reverses them depending on the target architecture.
+//
+// So we have a feature here to indicate that instructions are stored big-
+// endian, which you can set when instantiating an MCDisassembler.
+def ModeBigEndianInstructions : SubtargetFeature<"big-endian-instructions",
+  "BigEndianInstructions", "true",
+  "Expect instructions to be stored big-endian.">;
+
+//===----------------------------------------------------------------------===//
 // ARM Processor subtarget features.
 //

 def ProcA5 : SubtargetFeature<"a5", "ARMProcFamily", "CortexA5",
   "Cortex-A5 ARM processors", []>;
 def ProcA7 : SubtargetFeature<"a7", "ARMProcFamily", "CortexA7",
   "Cortex-A7 ARM processors", []>;
 def ProcA8 : SubtargetFeature<"a8", "ARMProcFamily", "CortexA8",
[927 lines not shown]

llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp

[125 lines not shown] private:
   SmallVector<unsigned char, 4> VPTStates;
 };

 /// ARM disassembler for all ARM platforms.
 class ARMDisassembler : public MCDisassembler {
 public:
   ARMDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx) :
     MCDisassembler(STI, Ctx) {
+    InstructionEndianness = STI.getFeatureBits()[ARM::ModeBigEndianInstructions]
+                                ? llvm::support::big
+                                : llvm::support::little;
   }

   ~ARMDisassembler() override = default;

   DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,
                               ArrayRef<uint8_t> Bytes, uint64_t Address,
                               raw_ostream &CStream) const override;

[9 lines not shown] DecodeStatus getThumbInstruction(MCInst &Instr, uint64_t &Size,
                                    ArrayRef<uint8_t> Bytes, uint64_t Address,
                                    raw_ostream &CStream) const;

   mutable ITStatus ITBlock;
   mutable VPTStatus VPTBlock;

   DecodeStatus AddThumbPredicate(MCInst&) const;
   void UpdateThumbVFPPredicate(DecodeStatus &, MCInst&) const;
+
+  llvm::support::endianness InstructionEndianness;
 };

 } // end anonymous namespace

 static bool Check(DecodeStatus &Out, DecodeStatus In) {
   switch (In) {
     case MCDisassembler::Success:
       // Out stays the same.
[593 lines not shown] uint64_t ARMDisassembler::suggestBytesToSkip(ArrayRef<uint8_t> Bytes,
   // second half as something else.
   //
   // If we don't have the instruction data available, we just have to
   // recommend skipping the minimum sensible distance, which is 2
   // bytes.
   if (Bytes.size() < 2)
     return 2;

-  uint16_t Insn16 = (Bytes[1] << 8) | Bytes[0];
+  uint16_t Insn16 = llvm::support::endian::read<uint16_t>(
+      Bytes.data(), InstructionEndianness);
   return Insn16 < 0xE800 ? 2 : 4;
 }

 DecodeStatus ARMDisassembler::getInstruction(MCInst &MI, uint64_t &Size,
                                              ArrayRef<uint8_t> Bytes,
                                              uint64_t Address,
                                              raw_ostream &CS) const {
   if (STI.getFeatureBits()[ARM::ModeThumb])
[12 lines not shown] assert(!STI.getFeatureBits()[ARM::ModeThumb] &&
          "mode!");

   // We want to read exactly 4 bytes of data.
   if (Bytes.size() < 4) {
     Size = 0;
     return MCDisassembler::Fail;
   }

-  // Encoded as a small-endian 32-bit word in the stream.
-  uint32_t Insn =
-      (Bytes[3] << 24) | (Bytes[2] << 16) | (Bytes[1] << 8) | (Bytes[0] << 0);
+  // Encoded as a 32-bit word in the stream.
+  uint32_t Insn = llvm::support::endian::read<uint32_t>(Bytes.data(),
+                                                        InstructionEndianness);

   // Calling the auto-generated decoder function.
   DecodeStatus Result =
       decodeInstruction(DecoderTableARM32, MI, Insn, Address, this, STI);
   if (Result != MCDisassembler::Fail) {
     Size = 4;
     return checkDecodedInstruction(MI, Size, Address, CS, Insn, Result);
   }
[271 lines not shown] assert(STI.getFeatureBits()[ARM::ModeThumb] &&
          "Asked to disassemble in Thumb mode but Subtarget is in ARM mode!");

   // We want to read exactly 2 bytes of data.
   if (Bytes.size() < 2) {
     Size = 0;
     return MCDisassembler::Fail;
   }

-  uint16_t Insn16 = (Bytes[1] << 8) | Bytes[0];
+  uint16_t Insn16 = llvm::support::endian::read<uint16_t>(
+      Bytes.data(), InstructionEndianness);
   DecodeStatus Result =
       decodeInstruction(DecoderTableThumb16, MI, Insn16, Address, this, STI);
   if (Result != MCDisassembler::Fail) {
     Size = 2;
     Check(Result, AddThumbPredicate(MI));
     return Result;
   }

[37 lines not shown] DecodeStatus ARMDisassembler::getThumbInstruction(MCInst &MI, uint64_t &Size,

   // We want to read exactly 4 bytes of data.
   if (Bytes.size() < 4) {
     Size = 0;
     return MCDisassembler::Fail;
   }

   uint32_t Insn32 =
-      (Bytes[3] << 8) | (Bytes[2] << 0) | (Bytes[1] << 24) | (Bytes[0] << 16);
+      (uint32_t(Insn16) << 16) | llvm::support::endian::read<uint16_t>(
+                                     Bytes.data() + 2, InstructionEndianness);

   Result =
       decodeInstruction(DecoderTableMVE32, MI, Insn32, Address, this, STI);
   if (Result != MCDisassembler::Fail) {
     Size = 4;

     // Nested VPT blocks are UNPREDICTABLE. Must be checked before we add
     // the VPT predicate.
[5,873 lines not shown]

llvm/test/tools/llvm-objdump/ELF/ARM/be-image-disasm.test

This file was added.

# RUN: yaml2obj -DCONTENT=FA000002E59F100CE0800001E12FFF1E4802EB00308047703141592627182818 %s | llvm-objdump -d --triple=armv7r - | FileCheck %s
# RUN: yaml2obj -DCONTENT=020000FA0C109FE5010080E01EFF2FE1024800EB803070473141592627182818 -DFLAG=,EF_ARM_BE8 %s | llvm-objdump -d --triple=armv7r - | FileCheck %s

# Test llvm-objdump disassembly of both kinds of AAELF32-compliant
# big-endian AArch32 ELF image. By default, AArch32 ELF stores the
# instructions big-endian ('BE32' style), unless the EF_ARM_BE8 flag
# is set in the ELF header, which indicates that instructions are
# stored little-endian ('BE8' style). llvm-objdump should detect the
# flag and handle both types, using the $a, $t and $d mapping symbols
# to distinguish Arm instructions, Thumb instructions, and data.
#
# The two test runs provide llvm-objdump with the BE32 and BE8
# versions of the same image file, with the code section byte-swapped,
# and the EF_ARM_BE8 flag absent and present respectively to indicate
# that. We expect the identical disassembly from both.

# CHECK: 8000: fa000002 blx 0x8010
# CHECK-NEXT: 8004: e59f100c ldr r1, [pc, #12]
# CHECK-NEXT: 8008: e0800001 add r0, r0, r1
# CHECK-NEXT: 800c: e12fff1e bx lr
# CHECK: 8010: 4802 ldr r0, [pc, #8]
# CHECK-NEXT: 8012: eb00 3080 add.w r0, r0, r0, lsl #14
# CHECK-NEXT: 8016: 4770 bx lr
# CHECK: 8018: 31 41 59 26 .word 0x31415926
# CHECK-NEXT: 801c: 27 18 28 18 .word 0x27182818

--- !ELF
FileHeader:
  Class: ELFCLASS32
  Data: ELFDATA2MSB
  Type: ET_EXEC
  Machine: EM_ARM
  Flags: [ EF_ARM_EABI_UNKNOWN[[FLAG=]] ]
  Entry: 0x8000
ProgramHeaders:
  - Type: PT_LOAD
    Flags: [ PF_X, PF_R ]
    FirstSec: .text
    LastSec: .text
    VAddr: 0x8000
    Align: 0x4
Sections:
  - Name: .text
    Type: SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    Address: 0x8000
    AddressAlign: 0x4
    Content: [[CONTENT]]
Symbols:
  - Name: '$a'
    Section: .text
    Value: 0x8000
  - Name: '$t'
    Section: .text
    Value: 0x8010
  - Name: '$d'
    Section: .text
    Value: 0x8018
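
The relationship between the two `-DCONTENT` strings in this test can be checked with a short standalone sketch. It is not part of the patch: `Kind`, `Region` and `swapCodeBytes` are hypothetical names, and the region list is an assumed stand-in for the `$a`, `$t` and `$d` mapping symbols. The function reverses the bytes of each 4-byte Arm word and each 2-byte Thumb halfword and leaves data untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Arm, Thumb, Data };

// Hypothetical stand-in for a mapping symbol: region start and content kind.
struct Region {
  size_t Offset;
  Kind K;
};

// Byte-reverse every instruction unit in a hex string (two digits per byte).
// Regions must be sorted by offset; each runs until the next region starts.
inline std::string swapCodeBytes(std::string Hex,
                                 const std::vector<Region> &Regions) {
  assert(Hex.size() % 2 == 0 && "hex string must describe whole bytes");
  const size_t NumBytes = Hex.size() / 2;
  for (size_t I = 0; I < Regions.size(); ++I) {
    const size_t End = I + 1 < Regions.size() ? Regions[I + 1].Offset : NumBytes;
    const size_t Unit =
        Regions[I].K == Kind::Arm ? 4 : Regions[I].K == Kind::Thumb ? 2 : 0;
    if (Unit == 0)
      continue; // data keeps the ELF file's byte order
    for (size_t Off = Regions[I].Offset; Off + Unit <= End; Off += Unit)
      for (size_t A = 0, B = Unit - 1; A < B; ++A, --B)
        std::swap_ranges(Hex.begin() + 2 * (Off + A),
                         Hex.begin() + 2 * (Off + A) + 2,
                         Hex.begin() + 2 * (Off + B));
  }
  return Hex;
}
```

Applied to the BE32 content with regions at offsets 0x0 (Arm), 0x10 (Thumb) and 0x18 (data), this yields the BE8 content used in the second RUN line.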

llvm/test/tools/llvm-objdump/ELF/ARM/be-object-disasm.test

This file was added.

# RUN: yaml2obj %s | llvm-objdump -d --triple=armv7r - | FileCheck %s

# Test llvm-objdump disassembly of an AAELF32-compliant big-endian
# AArch32 ELF object file. In this kind of file, the instructions are
# stored big-endian, in the manner of older Arm architecture versions
# and also Armv7-R BE32 mode.

# CHECK: 0: fa000002 blx 0x10
# CHECK-NEXT: 4: e59f100c ldr r1, [pc, #12]
# CHECK-NEXT: 8: e0800001 add r0, r0, r1
# CHECK-NEXT: c: e12fff1e bx lr
# CHECK: 10: 4802 ldr r0, [pc, #8]
# CHECK-NEXT: 12: eb00 3080 add.w r0, r0, r0, lsl #14
# CHECK-NEXT: 16: 4770 bx lr
# CHECK: 18: 31 41 59 26 .word 0x31415926
# CHECK-NEXT: 1c: 27 18 28 18 .word 0x27182818

--- !ELF
FileHeader:
  Class: ELFCLASS32
  Data: ELFDATA2MSB
  Type: ET_REL
  Machine: EM_ARM
Sections:
  - Name: .text
    Type: SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    AddressAlign: 0x4
    Content: FA000002E59F100CE0800001E12FFF1E4802EB00308047703141592627182818
Symbols:
  - Name: '$a'
    Section: .text
  - Name: '$t'
    Section: .text
    Value: 0x10
  - Name: '$d'
    Section: .text
    Value: 0x18

llvm/tools/llvm-objdump/llvm-objdump.cpp

[674 lines not shown] void printInst(MCInstPrinter &IP, const MCInst *MI, ArrayRef<uint8_t> Bytes,
                  object::SectionedAddress Address, formatted_raw_ostream &OS,
                  StringRef Annot, MCSubtargetInfo const &STI, SourcePrinter *SP,
                  StringRef ObjectFilename, std::vector<RelocationRef> *Rels,
                  LiveVariablePrinter &LVP) override {
     if (SP && (PrintSource || PrintLines))
       SP->printSourceLine(OS, Address, ObjectFilename, LVP);
     LVP.printBetweenInsts(OS, false);

     size_t Start = OS.tell();
     if (LeadingAddr)
       OS << format("%8" PRIx64 ":", Address.Address);
     if (ShowRawInsn) {
       size_t Pos = 0, End = Bytes.size();
       if (STI.checkFeatures("+thumb-mode")) {
         for (; Pos + 2 <= End; Pos += 2)
           OS << ' '
              << format_hex_no_prefix(
                     llvm::support::endian::read<uint16_t>(
-                        Bytes.data() + Pos, llvm::support::little),
+                        Bytes.data() + Pos, InstructionEndianness),
                     4);
       } else {
         for (; Pos + 4 <= End; Pos += 4)
           OS << ' '
              << format_hex_no_prefix(
                     llvm::support::endian::read<uint32_t>(
-                        Bytes.data() + Pos, llvm::support::little),
+                        Bytes.data() + Pos, InstructionEndianness),
                     8);
       }
       if (Pos < End) {
         OS << ' ';
         dumpBytes(Bytes.slice(Pos), OS);
       }
     }

     AlignToInstStartColumn(Start, STI, OS);

     if (MI) {
       IP.printInst(MI, Address.Address, "", STI, OS);
     } else
       OS << "\t<unknown>";
   }
+
+  void setInstructionEndianness(llvm::support::endianness Endianness) {
+    InstructionEndianness = Endianness;
+  }
+
+private:
+  llvm::support::endianness InstructionEndianness = llvm::support::little;
 };
 ARMPrettyPrinter ARMPrettyPrinterInst;

 class AArch64PrettyPrinter : public PrettyPrinter {
 public:
   void printInst(MCInstPrinter &IP, const MCInst *MI, ArrayRef<uint8_t> Bytes,
                  object::SectionedAddress Address, formatted_raw_ostream &OS,
                  StringRef Annot, MCSubtargetInfo const &STI, SourcePrinter *SP,
[1,123 lines not shown] std::unique_ptr<const MCAsmInfo> AsmInfo(
       TheTarget->createMCAsmInfo(*MRI, TripleName, MCOptions));
   if (!AsmInfo)
     reportError(Obj->getFileName(),
                 "no assembly info for target " + TripleName);

   if (MCPU.empty())
     MCPU = Obj->tryGetCPUName().value_or("").str();

+  if (isArmElf(*Obj)) {
+    // When disassembling big-endian Arm ELF, the instruction endianness is
+    // determined in a complex way. In relocatable objects, AAELF32 mandates
+    // that instruction endianness matches the ELF file endianness; in
+    // executable images, that's true unless the file header has the EF_ARM_BE8
+    // flag, in which case instructions are little-endian regardless of data
+    // endianness.
+    //
+    // We must set the big-endian-instructions SubtargetFeature to make the
+    // disassembler read the instructions the right way round, and also tell
+    // our own prettyprinter to retrieve the encodings the same way to print in
+    // hex.
+    const auto *Elf32BE = dyn_cast<ELF32BEObjectFile>(Obj);
+
+    if (Elf32BE && (Elf32BE->isRelocatableObject() ||
+                    !(Elf32BE->getPlatformFlags() & ELF::EF_ARM_BE8))) {
+      Features.AddFeature("+big-endian-instructions");
+      ARMPrettyPrinterInst.setInstructionEndianness(llvm::support::big);
+    } else {
+      ARMPrettyPrinterInst.setInstructionEndianness(llvm::support::little);
+    }
+  }
+
   std::unique_ptr<const MCSubtargetInfo> STI(
       TheTarget->createMCSubtargetInfo(TripleName, MCPU, Features.getString()));
   if (!STI)
     reportError(Obj->getFileName(),
                 "no subtarget info for target " + TripleName);
   std::unique_ptr<const MCInstrInfo> MII(TheTarget->createMCInstrInfo());
   if (!MII)
     reportError(Obj->getFileName(),
[1,117 lines not shown]