This is an archive of the discontinued LLVM Phabricator instance.

This seems fine, but we generally trust input and don't do too many error checks, so I wonder what your motivation is for adding this error check. Was this something you can hit without doing assembly programming?

COFF/Chunks.cpp:137 (on Diff #161565): Error messages should start with a lowercase letter.

In D50998#1207125, @ruiu wrote:

This seems fine, but we generally trust input and don't do too many error checks, so I wonder what your motivation is for adding this error check. Was this something you can hit without doing assembly programming?

I ran into a bug where LLVM can produce code in which the movw+movt instructions are separated. (I have a fix for the bug, just missing a good testcase before I submit it.)

This relocation is tricky since it covers both instructions, but the way it is handled within LLVM is that it is only attached to the movw (and the relocation for the movt is dropped). Ideally, LLVM should perhaps have a check at some point to make sure that wherever we have this relocation, it actually points to both these instructions, but I don't really know where such a check would be placed.

COFF/Chunks.cpp:137 (on Diff #161565): Will fix.

Lowercased the error message.

For what it's worth I've checked that the bit-patterns used in the MOVT/MOVW instructions are correct. In assembler it seems like it is possible to create an error that isn't detectable at link time.

    .global main
    .global variable
    .global variable2
    .text
    .thumb
main:
    movw r0, :lower16:variable
    movt r0, :upper16:variable2
    ldr  r0, [r0]
    bx   lr
    .data
variable:
 .long 42

The assembler doesn't detect the different relocation targets and just emits a 0x0 IMAGE_REL_ARM_MOV32T variable relocation. It would be much better if this were caught at assembly time. I don't have any great ideas of how to solve it in MC though as that typically only presents Targets with one fixup at a time.
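For reference, the bit patterns in question can be written out as a small round-trip. This is a minimal sketch, assuming the Thumb-2 MOVW (T3) and MOVT (T1) encodings, where the 16-bit immediate is split into imm4:i:imm3:imm8 across the two halfwords. The names `ThumbMov`, `encodeMov` and `decodeImm` are made up for illustration and are not part of lld or LLVM; the decode expression mirrors the one in `readMOV` in COFF/Chunks.cpp.

```cpp
#include <cassert>
#include <cstdint>

// The two 16-bit halfwords of one 32-bit Thumb-2 instruction.
struct ThumbMov {
  uint16_t Op1;
  uint16_t Op2;
};

// Hypothetical encoder for "movw/movt Rd, #Imm".
// Op1: 11110 i 10 x100 imm4   (x = 0 for MOVW, 1 for MOVT)
// Op2: 0 imm3 Rd imm8
inline ThumbMov encodeMov(bool IsMOVT, unsigned Rd, uint16_t Imm) {
  assert(Rd < 16 && "register number must fit in 4 bits");
  uint16_t Imm4 = (Imm >> 12) & 0xf;
  uint16_t I = (Imm >> 11) & 0x1;
  uint16_t Imm3 = (Imm >> 8) & 0x7;
  uint16_t Imm8 = Imm & 0xff;
  ThumbMov M;
  M.Op1 = static_cast<uint16_t>((IsMOVT ? 0xf2c0 : 0xf240) | (I << 10) | Imm4);
  M.Op2 = static_cast<uint16_t>((Imm3 << 12) | (Rd << 8) | Imm8);
  return M;
}

// Reassemble the 16-bit immediate from the two halfwords.
inline uint16_t decodeImm(ThumbMov M) {
  return static_cast<uint16_t>((M.Op2 & 0x00ff) | ((M.Op2 >> 4) & 0x0700) |
                               ((M.Op1 << 1) & 0x0800) |
                               ((M.Op1 & 0x000f) << 12));
}
```

Masking `Op1` with `0xfbf0` clears exactly the `i` and `imm4` fields, so what remains is the fixed opcode: `0xf240` for MOVW and `0xf2c0` for MOVT.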

In D50998#1207297, @peter.smith wrote:
For what it's worth I've checked that the bit-patterns used in the MOVT/MOVW instructions are correct. In assembler it seems like it is possible to create an error that isn't detectable at link time.
    .global main
    .global variable
    .global variable2
    .text
    .thumb
main:
    movw r0, :lower16:variable
    movt r0, :upper16:variable2
    ldr  r0, [r0]
    bx   lr
    .data
variable:
 .long 42
The assembler doesn't detect the different relocation targets and just emits a 0x0 IMAGE_REL_ARM_MOV32T variable relocation. It would be much better if this were caught at assembly time. I don't have any great ideas of how to solve it in MC though as that typically only presents Targets with one fixup at a time.

Indeed, it's possible to craft other malicious cases that this won't catch - but it would at least catch most other bugs where the ARM target unintentionally produces something broken.

On the LLVM level, this is handled as the movw fixup translating into IMAGE_REL_ARM_MOV32T while the movt fixup just gets dropped; see lib/Target/ARM/MCTargetDesc/ARMWinCOFFObjectWriter.cpp:

unsigned ARMWinCOFFObjectWriter::getRelocType(MCContext &Ctx,
                                              const MCValue &Target,
                                              const MCFixup &Fixup,
                                              bool IsCrossSection,
                                              const MCAsmBackend &MAB) const {
  switch (static_cast<unsigned>(Fixup.getKind())) {
  [snip]
  case ARM::fixup_t2_movw_lo16:
  case ARM::fixup_t2_movt_hi16:
    return COFF::IMAGE_REL_ARM_MOV32T;
  }
}

bool ARMWinCOFFObjectWriter::recordRelocation(const MCFixup &Fixup) const {
  return static_cast<unsigned>(Fixup.getKind()) != ARM::fixup_t2_movt_hi16;
}

So past this stage, the information about the movt instruction's intent is lost.
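To make the information loss concrete, here is a toy model of that step. It is only a sketch: `FixupKind`, `Fixup`, `Reloc` and `emitRelocs` are hypothetical stand-ins and not the MC API; the only behaviour taken from the code above is that movt fixups are never recorded.

```cpp
#include <string>
#include <vector>

// Stand-ins for ARM::fixup_t2_movw_lo16 and ARM::fixup_t2_movt_hi16.
enum class FixupKind { MovwLo16, MovtHi16 };

struct Fixup {
  FixupKind Kind;
  unsigned Offset;
  std::string Symbol;
};

// In this toy model every emitted relocation is an IMAGE_REL_ARM_MOV32T.
struct Reloc {
  unsigned Offset;
  std::string Symbol;
};

// Mirrors the recordRelocation logic quoted above: movt fixups are dropped.
inline bool recordRelocation(const Fixup &F) {
  return F.Kind != FixupKind::MovtHi16;
}

// Hypothetical driver: turn a list of fixups into the relocations that
// would end up in the object file.
inline std::vector<Reloc> emitRelocs(const std::vector<Fixup> &Fixups) {
  std::vector<Reloc> Out;
  for (const Fixup &F : Fixups)
    if (recordRelocation(F))
      Out.push_back(Reloc{F.Offset, F.Symbol});
  return Out;
}
```

Feeding it a movw fixup against `variable` followed by a movt fixup against `variable2` yields a single relocation against `variable`, which is why the mismatched-symbol case from the earlier comment cannot be seen at link time.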

As for the actual bug that I was able to catch with this error, I'll post the patch now.

Does @efriedma have any opinion about this one? It's slightly unconventional to check the instruction contents in the linker, but OTOH it's at least a simple and safe place to check that nothing really did break these pairs. Unless there's some easy way to add a check in LLVM (and I guess the correct solution there is to keep them together as a pseudo instruction longer, as suggested in the other review), do you think this would be worthwhile?

In D50998#1212923, @mstorsjo wrote:

Does @efriedma have any opinion about this one? It's slightly unconventional to check the instruction contents in the linker, but OTOH it's at least a simple and safe place to check that nothing really did break these pairs. Unless there's some easy way to add a check in LLVM (and I guess the correct solution there is to keep them together as a pseudo instruction longer, as suggested in the other review), do you think this would be worthwhile?

Not @efriedma, but that doesn't stop me from opining... I think it's a reasonable safety check. I don't think we have any relaxations in the COFF linker, but they are important elsewhere (turn GOT load into leaq sym(%rip) for ELF), and that requires looking at instructions.

This clearly isn't a substitute for catching this issue in the asm parser; LLVM shouldn't generate invalid object files. That said, it seems reasonable for the linker to check the input is sane. (Probably the testcase should use yaml2obj so it doesn't break when we fix LLVM.)

Changed the test to use yaml2obj instead of assembling it with llvm-mc.

LGTM

It looks like these guards are reasonable given that these relocations are tricky. Our policy is to not protect lld from all possible input corruptions, but that doesn't mean we shouldn't do a reasonable error check.

This revision is now accepted and ready to land. (Aug 26 2018, 7:16 PM)

Closed by commit rL340715: [COFF] Check the instructions in ARM MOV32T relocations (authored by mstorsjo). (Aug 26 2018, 11:06 PM)

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                        Size
lld/trunk/COFF/Chunks.cpp                   12 lines
lld/trunk/test/COFF/broken-arm-reloc.yaml   92 lines

Diff 162613

lld/trunk/COFF/Chunks.cpp

(118 lines above not shown; context: void SectionChunk::applyRelX86(uint8_t Off, uint16_t Type, OutputSection OS,)

 }
 }

 static void applyMOV(uint8_t *Off, uint16_t V) {
   write16le(Off, (read16le(Off) & 0xfbf0) | ((V & 0x800) >> 1) | ((V >> 12) & 0xf));
   write16le(Off + 2, (read16le(Off + 2) & 0x8f00) | ((V & 0x700) << 4) | (V & 0xff));
 }

-static uint16_t readMOV(uint8_t *Off) {
+static uint16_t readMOV(uint8_t *Off, bool MOVT) {
   uint16_t Op1 = read16le(Off);
+  if ((Op1 & 0xfbf0) != (MOVT ? 0xf2c0 : 0xf240))
+    error("unexpected instruction in " + Twine(MOVT ? "MOVT" : "MOVW") +
+          " instruction in MOV32T relocation");
   uint16_t Op2 = read16le(Off + 2);
+  if ((Op2 & 0x8000) != 0)
+    error("unexpected instruction in " + Twine(MOVT ? "MOVT" : "MOVW") +
+          " instruction in MOV32T relocation");
   return (Op2 & 0x00ff) | ((Op2 >> 4) & 0x0700) | ((Op1 << 1) & 0x0800) |
          ((Op1 & 0x000f) << 12);
 }

 void applyMOV32T(uint8_t *Off, uint32_t V) {
-  uint16_t ImmW = readMOV(Off); // read MOVW operand
-  uint16_t ImmT = readMOV(Off + 4); // read MOVT operand
+  uint16_t ImmW = readMOV(Off, false); // read MOVW operand
+  uint16_t ImmT = readMOV(Off + 4, true); // read MOVT operand
   uint32_t Imm = ImmW | (ImmT << 16);
   V += Imm; // add the immediate offset
   applyMOV(Off, V); // set MOVW operand
   applyMOV(Off + 4, V >> 16); // set MOVT operand
 }

 static void applyBranch20T(uint8_t *Off, int32_t V) {
   if (!isInt<21>(V))

(505 lines below not shown)
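The effect of the new check can be tried outside of lld with a standalone sketch. Assumptions: `read16le` here is a local stand-in for the LLVM helper of the same name, `looksLikeMOV` and `isValidMOV32TPair` are hypothetical names, and the functions return `false` where lld would call `error()`. The byte array is the `.text` SectionData from broken-arm-reloc.yaml (movw, nop, movt, ldr, bx).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for LLVM's read16le: read a little-endian halfword.
inline uint16_t read16le(const uint8_t *P) {
  return static_cast<uint16_t>(P[0] | (P[1] << 8));
}

// True if the 4 bytes at Off look like a Thumb-2 MOVT (IsMOVT == true)
// or MOVW (IsMOVT == false), using the same masks as the patch.
inline bool looksLikeMOV(const uint8_t *Off, bool IsMOVT) {
  uint16_t Op1 = read16le(Off);
  if ((Op1 & 0xfbf0) != (IsMOVT ? 0xf2c0 : 0xf240))
    return false;
  uint16_t Op2 = read16le(Off + 2);
  return (Op2 & 0x8000) == 0;
}

// A MOV32T relocation at Offset expects MOVW at Offset and MOVT at Offset + 4.
inline bool isValidMOV32TPair(const std::vector<uint8_t> &Sec, size_t Offset) {
  assert(Offset + 8 <= Sec.size() && "relocation must cover 8 bytes");
  const uint8_t *P = Sec.data() + Offset;
  return looksLikeMOV(P, false) && looksLikeMOV(P + 4, true);
}

// .text bytes from the test: 40F2000000BFC0F2000000687047
const std::vector<uint8_t> BrokenText = {0x40, 0xF2, 0x00, 0x00, 0x00,
                                         0xBF, 0xC0, 0xF2, 0x00, 0x00,
                                         0x00, 0x68, 0x70, 0x47};
```

For the relocation at offset 0, the MOVW half passes, but the halfword at offset 4 is the nop (`0xbf00`), so the MOVT half is rejected. That matches the error the test expects. The real movt sits two bytes later, at offset 6.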

lld/trunk/test/COFF/broken-arm-reloc.yaml

# REQUIRES: arm

# .global main
# .global variable
# .text
# .thumb
#main:
# movw r0, :lower16:variable
# nop
# movt r0, :upper16:variable
# ldr r0, [r0]
# bx lr
# .data
#variable:
# .long 42

# RUN: yaml2obj %s > %t.obj
# RUN: not lld-link -out:%t.exe -entry:main %t.obj 2>&1 | FileCheck %s

# CHECK: error: unexpected instruction in MOVT instruction in MOV32T relocation

--- !COFF
header:
  Machine: IMAGE_FILE_MACHINE_ARMNT
  Characteristics: [ ]
sections:
  - Name: .text
    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
    Alignment: 4
    SectionData: 40F2000000BFC0F2000000687047
    Relocations:
      - VirtualAddress: 0
        SymbolName: variable
        Type: IMAGE_REL_ARM_MOV32T
  - Name: .data
    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
    Alignment: 4
    SectionData: 2A000000
  - Name: .bss
    Characteristics: [ IMAGE_SCN_CNT_UNINITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
    Alignment: 4
    SectionData: ''
symbols:
  - Name: .text
    Value: 0
    SectionNumber: 1
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_STATIC
    SectionDefinition:
      Length: 14
      NumberOfRelocations: 1
      NumberOfLinenumbers: 0
      CheckSum: 2762100735
      Number: 1
  - Name: .data
    Value: 0
    SectionNumber: 2
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_STATIC
    SectionDefinition:
      Length: 4
      NumberOfRelocations: 0
      NumberOfLinenumbers: 0
      CheckSum: 3482275674
      Number: 2
  - Name: .bss
    Value: 0
    SectionNumber: 3
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_STATIC
    SectionDefinition:
      Length: 0
      NumberOfRelocations: 0
      NumberOfLinenumbers: 0
      CheckSum: 0
      Number: 3
  - Name: main
    Value: 0
    SectionNumber: 1
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_EXTERNAL
  - Name: variable
    Value: 0
    SectionNumber: 2
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_EXTERNAL
...

[LLD] [COFF] Check the instructions in ARM MOV32T relocations (Closed, Public)