This is an archive of the discontinued LLVM Phabricator instance.

This patch is currently WIP: I still need to clean it up and add a unit test. I tested it with a few examples and it seems to work. I will update it early next week.

Looks sensible overall, except for two relocations that have incorrect interpretations. You seem to be missing a couple of relocation types as well; check lld/COFF/Chunks.cpp for reference on how to handle the rest of them. Some of them are unlikely to be used in JITed code (TLS code requires space for the variables to be allocated in the TLS section by the system's runtime loader, and the SECREL_LOW/HIGH_12A/L relocations are used for that), but most of the remaining ones are pretty straightforward to handle anyway.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
169	No, this is wrong - and this is one of the less obvious details. If you have an addend stored in the instruction pointed to by `IMAGE_REL_ARM64_PAGEBASE_REL21`, the addend is expressed in bytes, not in 4096 byte pages. Consider you have a symbol close to the end of a page, and you want to express an offset by a few bytes (less than a page), that makes the pointed to location in another page. If the addend would express a number of pages (as this patch expects right now), the addend here would be zero, and you'd end up with this part of the instruction pair pointing at the wrong page. Therefore, the immediate stored in the instruction before handling relocation is expressed as a number of bytes, even though it means a number of pages after the relocation is done and the instruction is executed.
175	Nit: The indentation of the first comment line is off here
180	Nit: Double spaces between "store" and "or"
188	I guess this check could also be for the relocation type `IMAGE_REL_ARM64_PAGEOFFSET_12L`?
210	Would it make more sense, stylistically, to extend the ifdef around the debug statement as well? Right now it does look weird to have code referring to variables that don't exist (even though LLVM_DEBUG will make them disappear).
220	The indentation here is weird. Please run `clang-format-diff -style LLVM` on the changes.
243	This doesn't seem right. `IMAGE_REL_ARM64_ADDR64` is a plain 64 bit integer (just like `IMAGE_REL_ARM64_ADDR32NB`), not a series of instructions that should get immediates added.
287	This switch for checking for alignment is rather verbose - would it make sense to condense it down to a single expression for all alignments?
330	Missing newline at end of file
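
To make the point about line 169 concrete, here is a minimal sketch of resolving `IMAGE_REL_ARM64_PAGEBASE_REL21` in one step, with the pre-relocation immediate treated as a byte addend. It works on a plain instruction word instead of memory, and the helper names (`readAdrImm`, `writeAdrImm`, `applyPageBaseRel21`) are hypothetical, not taken from the patch or from lld. The addend is treated as unsigned here, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract the 21-bit immediate of an ADR/ADRP word
// (immlo in bits [30:29], immhi in bits [23:5]).
static uint64_t readAdrImm(uint32_t Insn) {
  return ((Insn >> 29) & 0x3) | ((Insn >> 3) & 0x1FFFFC);
}

// Hypothetical helper: clear the old immediate, then store the new one.
static uint32_t writeAdrImm(uint32_t Insn, uint64_t Imm) {
  uint32_t ImmLo = static_cast<uint32_t>(Imm & 0x3) << 29;
  uint32_t ImmHi = static_cast<uint32_t>(Imm & 0x1FFFFC) << 3;
  uint32_t Mask = (0x3u << 29) | (0x1FFFFCu << 3);
  return (Insn & ~Mask) | ImmLo | ImmHi;
}

// Sym is the symbol address, Place the address of the adrp instruction.
static uint32_t applyPageBaseRel21(uint32_t Insn, uint64_t Sym,
                                   uint64_t Place) {
  assert((Insn & 0x9F000000u) == 0x90000000u && "expected adrp");
  // The stored immediate is a byte offset, so add it before taking pages.
  uint64_t Target = Sym + readAdrImm(Insn);
  uint64_t PageDelta = (Target >> 12) - (Place >> 12);
  return writeAdrImm(Insn, PageDelta);
}
```

With a symbol at 0x1FF8, a stored addend of 0x10 bytes and the instruction at 0x1000, the target 0x2008 lies in the next page, so the resulting page delta is 1. Reading the addend as a page count would not give that result.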

Martin, thanks for your thoughts. I'm on to fix them.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
169	Thanks for explaining this. It seems I misunderstood this behavior. To be honest, I looked at how MachOAArch64 decodes the addend and compared it with the AArch64 reference manual, and it seemed to me that I could do the same thing as MachO does.
188	It is only for IMAGE_REL_ARM64_PAGEOFFSET_12L that we need to determine the shift value, i.e. only for loads/stores; for ADD/ADDS (immediate) instructions we should use a zero shift.
210	Yes, you're right, I will fix it.
220	Thanks, I will use it.
243	Yes, you're right. Currently I handle a long branch instruction with this relocation: when I need to create a stub function (which generates movz/movk instructions), I set this relocation type to detect the case where we have an external symbol. When I tested it with a small example and SwiftShader, it worked because I never encountered this relocation type. Of course I should use another one, maybe an internal type. What do you think?
287	Sure, it could be.

kaadam added a subscriber: richard.townsend.arm. Oct 29 2019, 10:11 AM

mstorsjo added inline comments. Oct 29 2019, 2:42 PM

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
169	I guess it differs a bit between how the different object file formats encode the relocations and symbol offsets. (IIRC ELF and MachO can say that a relocation is relative to a symbol with offset, while COFF only points at a symbol, and any offset must be applied via the instruction immediates.)
188	Yes, but the relocation names already imply this. The `L` suffixed relocation is used for loads/stores, and the `A` suffixed relocation is used for add instructions. For `IMAGE_REL_ARM64_PAGEOFFSET_12L` we do not need to check whether the instruction is a load/store, but we can go directly to reading out the implicit shift amount, and for `IMAGE_REL_ARM64_PAGEOFFSET_12A` we should not read any implicit shift amount at all. See lld/COFF/Chunks.cpp, SectionChunk::applyRelARM64. For IMAGE_REL_ARM64_PAGEOFFSET_12L we call applyArm64Ldr which reads out the shift amount and then calls applyArm64Imm, while applyArm64Imm is called directly for IMAGE_REL_ARM64_PAGEOFFSET_12A. In general, when a linker (either dynamic or static) resolves a relocation, it should seldom need to inspect the instruction it is applied on, even though it is needed here for reading out the implicit shift amount. In general, the relocation type just encodes a specific action that should be done on that memory location with very little extra logic.
243	Hmm, as I'm not familiar with those bits that generate it, I don't know for sure. As you're already setting a small code model for aarch64/win, doesn't that already achieve this? Otherwise some variant of adrp+add would normally be used for forming any arbitrary address, instead of a series of mov instructions, unless the value to be formed is a constant. If necessary I guess one could consider using private internal relocation types, but I'd at least defer those bits to a later patch where it can be discussed properly on its own, and keep this first for the official relocation types.
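
The 12A/12L distinction described for line 188 can be sketched as follows. This is loosely modeled on the `applyArm64Imm`/`applyArm64Ldr` split mentioned above, but it is a standalone sketch on instruction words rather than a copy of lld; the function names are hypothetical. The alignment check is a single expression for all access sizes, which is one way to condense the switch discussed at line 287.

```cpp
#include <cassert>
#include <cstdint>

// Add Imm to the 12-bit immediate in bits [21:10] and write the sum back,
// clearing the old field first. RangeLimit drops the high bits that a
// scaled load/store offset cannot use.
static uint32_t applyImm12(uint32_t Insn, uint64_t Imm, uint32_t RangeLimit) {
  Imm += (Insn >> 10) & 0xFFF;
  Insn &= ~(0xFFFu << 10);
  return Insn | static_cast<uint32_t>((Imm & (0xFFFu >> RangeLimit)) << 10);
}

// IMAGE_REL_ARM64_PAGEOFFSET_12A: add instruction, no implicit shift.
static uint32_t applyPageOffset12A(uint32_t Insn, uint64_t Sym) {
  return applyImm12(Insn, Sym & 0xFFF, 0);
}

// IMAGE_REL_ARM64_PAGEOFFSET_12L: load/store, offset scaled by access size.
static uint32_t applyPageOffset12L(uint32_t Insn, uint64_t Sym) {
  uint32_t Size = Insn >> 30;              // log2 of the access size
  if ((Insn & 0x04800000u) == 0x04800000u) // 128-bit SIMD/FP access
    Size += 4;
  uint64_t Imm = Sym & 0xFFF;
  // One alignment check for every access size.
  assert((Imm & ((1u << Size) - 1)) == 0 && "misaligned ldr/str offset");
  return applyImm12(Insn, Imm >> Size, Size);
}
```

Note that the relocation type alone selects the path; the only instruction inspection left is reading the implicit shift in the 12L case.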

Martin, sorry for the delayed response.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
169	Okay, I see now. By the way, I don't necessarily need to decode the addend in the processRelocationRef function; it could be encoded right before the relocation is applied, am I right? Which is better? Since the instruction contains the offset in its immediate field, it has to be taken into account (added to 'Value') right before the immediate is rewritten. I will update the change later today.

mstorsjo added inline comments. Nov 5 2019, 6:38 AM

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
169	I presume you meant "it could be decoded before the relocation applied", not encoded? Yes, you could do the decode+update+encode all in one step. When reviewing I noted that this followed such a two-step style, but I presumed this came from general RuntimeDyld design. In lld it's all done in one single function. I'm not familiar enough with RuntimeDyld to say if there are any specific needs here for it to be this way, but if other parts of RuntimeDyld do it this way (in particular, other COFF architectures) it might be good to match the style.

kaadam marked an inline comment as done. Nov 5 2019, 9:41 AM

kaadam added inline comments.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
169	Yes, that's what I meant. Thanks for your suggestion. I followed that style because the RuntimeDyld design is two-step, but as far as I can see there is no consistent convention: in COFF Thumb, for example, the addends of a few instructions are decoded but most are not. I don't see any particular reason for doing it this way, but I would like to take a clean approach here.

kaadam updated this revision to Diff 228226. Nov 7 2019, 6:10 AM

kaadam edited the summary of this revision.

Looking pretty good now, just a few details I found. Then this needs tests to get rid of the WIP status.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
47	Leftover commented out code?
58	Is there any better error reporting mechanism available here? The current form of assert is wrong, as the address of the string will evaluate as nonzero and the assert wouldn't trigger. Maybe `assert(0 && "message")` in case there's no better way of actually reporting the issue to the caller.
118	I can't say I entirely understand this function (I understand what it tries to do, but fail to follow exactly how the details fit together), but as I see that it is very similar to the corresponding existing code for x86_64, I presume it's correct.
172	Nit: Here and below you have pretty superfluous outer parentheses
181	This doesn't seem to be right? For instructions B.cond and e.g. cbz, you have 5 least significant bits of the instruction being other data than the immediate, and after that, the resulting value should be left shifted by 2. So here, it should be `Addend = ((orig & 0x00FFFFE0) >> 5) << 2;` (for clarity) or `Addend = (orig & 0x00FFFFE0) >> 3;` (for more straightforward but less obvious code). Something similar should be done for branch14 below as well. LLD currently actually ignores the existing immediate in these relocations (which hasn't been an issue so far, but technically is an oversight).
206	Do we need to support reading out the immediate from `INTERNAL_REL_ARM64_LONG_BRANCH26` here? Or as the only place that generates it writes a zero immediate I guess it's not necessary?
303	If there actually was a nonzero immediate in the instruction here from before, `or32le` won't do the right thing. We don't handle this in lld right now, but as this code at least tries to read out the immediate further up, it would be good for consistency to actually clear the immediate from the branch instruction here before or'ing in the final value.
339	Hmm, where does Value end up added to this one? I do see that the existing COFF targets do it the same way, and that code does seem to be used and have a working test, but I don't see how it works. Do you have any clue?
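
The decode suggested for line 181 can be sketched like this: the immediate is extracted from above the five low bits, scaled by 4 and sign-extended. The helper names are hypothetical, `signExtend` is a stand-in for `llvm::SignExtend64`, and the branch14 mask is my reading of the tbz/tbnz encoding (imm14 in bits [18:5]), not something taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low Bits bits of X (stand-in for llvm::SignExtend64).
// Relies on arithmetic right shift of signed values, which mainstream
// compilers provide and C++20 guarantees.
static int64_t signExtend(uint64_t X, unsigned Bits) {
  assert(Bits > 0 && Bits <= 64 && "invalid bit width");
  return static_cast<int64_t>(X << (64 - Bits)) >> (64 - Bits);
}

// IMAGE_REL_ARM64_BRANCH19 (b.cond, cbz, cbnz): imm19 lives in bits [23:5]
// and counts 4-byte instructions, so the byte addend has 21 significant bits.
static int64_t decodeBranch19Addend(uint32_t Insn) {
  return signExtend(((Insn & 0x00FFFFE0u) >> 5) << 2, 21);
}

// IMAGE_REL_ARM64_BRANCH14 (tbz, tbnz): imm14 lives in bits [18:5].
static int64_t decodeBranch14Addend(uint32_t Insn) {
  return signExtend(((Insn & 0x0007FFE0u) >> 5) << 2, 16);
}
```

For example, a `b.eq` word of 0x54000040 carries imm19 = 2 and decodes to a byte addend of 8, while 0x54FFFFE1 (imm19 all ones) decodes to -4.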

Thanks for all the comments, they were really helpful. I have updated the patch and included a unit test. There are three test failures (adr, b.cond, secrel); I'm still looking into the reasons.

Expression 'decode_operand(adr1, 1) = (_const[20:0] - adr1[20:0])' is false: 0x2014 != 0x10014
Expression 'decode_operand(bcond, 1)[23:5] = (_foo - bcond)[20:2]' is false: 0x7ffff != 0x7fff8
Expression '*{4}secrel = _foo - section_addr(COFF_AArch64.obj, .text)' is false: 0xfffeffea != 0x4

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
118	Yes, the logic is the same as in the other COFF targets. Basically we need to handle an external symbol that is too far away. The first time we detect such an external symbol, we generate a stub function (movz/movk [1]). At that point Value contains the address of the external symbol for the original branch26 relocation (b/bl instruction). We need Value to point to the stub instead of the external symbol, which is why we call resolveRelocation for the original relocation and pass the stub address as Value. Once the original branch26 relocation is resolved, it points to the stub. After that we return our internal relocation type and the stub offset, create a RelocationEntry from them, and use the original Value for the external symbol. Finally, when the relocation of this internal type is resolved, the symbol address is encoded into the movz/movk stub instructions. All in all we end up with two jumps: bl -> stub -> external symbol. [1]: https://github.com/llvm-mirror/llvm/blob/master/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp#L933
206	Yes, it is not necessary, since the Addend is always zero for this internal type.
303	Yes, you're right. Maybe the immediate for these instructions could be cleared when the addends are decoded.
339	Unfortunately I'm not sure what happens under the hood, but it seems it is correct to use only the addend here, which contains the offset of the item from the beginning of its section in this case.
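
The stub mechanism described for line 118 can be illustrated with a small sketch. It assumes the stub is the five-instruction movz/movk/br sequence through x16, which fits the 20-byte `getMaxStubSize()` in this patch; the exact template RuntimeDyld emits should be checked against `createStubFunction`. The names `makeStubTemplate` and `patchStub` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed stub layout: build a 64-bit address in x16, then branch to it.
static std::array<uint32_t, 5> makeStubTemplate() {
  return {{0xD2E00010u,   // movz x16, #0, lsl #48
           0xF2C00010u,   // movk x16, #0, lsl #32
           0xF2A00010u,   // movk x16, #0, lsl #16
           0xF2800010u,   // movk x16, #0
           0xD61F0200u}}; // br   x16
}

// Write the four 16-bit chunks of Addr into the imm16 fields (bits [20:5]),
// clearing each field first so the stub can be patched more than once.
static void patchStub(std::array<uint32_t, 5> &Stub, uint64_t Addr) {
  assert(Stub[4] == 0xD61F0200u && "expected a stub ending in br x16");
  for (unsigned I = 0; I < 4; ++I) {
    uint32_t Chunk = static_cast<uint32_t>((Addr >> (48 - 16 * I)) & 0xFFFF);
    Stub[I] = (Stub[I] & ~(0xFFFFu << 5)) | (Chunk << 5);
  }
}
```

The original `bl` is then resolved against the stub address, so the call goes bl -> stub -> external symbol.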

The change is updated and the unit test is attached.

mstorsjo added inline comments. Nov 15 2019, 5:48 AM

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h
118	Yes, that much I understand (I implemented the same feature in the lld linker as well) - it was more relating to the exact details of each of the method calls here, but I don't have any concrete question regarding it right now either.
194	You can't do the masking out of the original immediate here; updating `orig` has no effect at all, as that's a local variable. Handling masking out the old value within `processRelocationRef` feels wrong in general (as this function only inspects and gathers info but doesn't update anything yet), I think this should be in `resolveRelocation`. So then you can't use `or32le` there in those cases, but more something like `write32le(P, (read32le(P) & ~Mask) | V);` (which perhaps can warrant a helper function of its own).
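
A possible shape for such a helper, as a self-contained sketch: `read32le`, `write32le` and `or32le` below are local stand-ins for the LLVM endian helpers, and `maskedWrite32le` is a hypothetical name, not what the patch ended up using.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for llvm::support::endian::read32le / write32le.
static uint32_t read32le(const void *P) {
  const uint8_t *B = static_cast<const uint8_t *>(P);
  return uint32_t(B[0]) | (uint32_t(B[1]) << 8) | (uint32_t(B[2]) << 16) |
         (uint32_t(B[3]) << 24);
}

static void write32le(void *P, uint32_t V) {
  uint8_t *B = static_cast<uint8_t *>(P);
  for (unsigned I = 0; I < 4; ++I)
    B[I] = static_cast<uint8_t>(V >> (8 * I));
}

// Plain OR: only correct if the field was zero beforehand.
static void or32le(void *P, uint32_t V) { write32le(P, read32le(P) | V); }

// Hypothetical helper: clear the field selected by Mask, then store V.
static void maskedWrite32le(void *P, uint32_t Mask, uint32_t V) {
  assert((V & ~Mask) == 0 && "value must fit in the masked field");
  write32le(P, (read32le(P) & ~Mask) | V);
}
```

Starting from a `b.ne` word with imm19 = 2 (0x54000041) and writing imm19 = 1, `or32le` yields imm19 = 3, while `maskedWrite32le` with mask 0x00FFFFE0 yields the intended 0x54000021.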

Updated the write method in the branch relocations; the unit test is also updated.

kaadam updated this revision to Diff 230091. Nov 19 2019, 9:08 AM

LGTM

Do you need someone to commit this for you? Do you have a preferred form for the git author line in that case ("User Name <email@address>"); with or without non-ascii diacritics?

This revision is now accepted and ready to land. Nov 19 2019, 1:57 PM

Martin, thanks for the review. Yes, I do. May I ask you to commit this change? I use "Adam Kallai <kadam@inf.u-szeged.hu>" in the git author line.

mstorsjo added inline comments. Nov 20 2019, 1:00 AM

test/ExecutionEngine/RuntimeDyld/AArch64/COFF_AArch64.s
127 ↗	(On Diff #230091)	FWIW, I had to fix up the filename here, to `COFF_AArch64.s.tmp.obj` to make the tests pass. Pushed with that changed.

Closed by commit rGdc3ee330891c: ExecutionEngine: add preliminary support for COFF ARM64 (authored by kaadam, committed by mstorsjo). Nov 20 2019, 1:08 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: hiraditya. Nov 20 2019, 1:08 AM

kaadam marked an inline comment as done. Nov 20 2019, 1:21 AM

kaadam added inline comments.

test/ExecutionEngine/RuntimeDyld/AArch64/COFF_AArch64.s
127 ↗	(On Diff #230091)	That's interesting. I ran the tests manually, which I think is why I didn't notice this failure. So the LLVM test infrastructure uses this form of the object file name? Thanks for fixing that. Was there no problem with the check in line 39? rtdyld-check: decode_operand(brel, 0)[25:0] = (stub_addr(COFF_AArch64.obj/.text, dummy) - brel)[27:2]

kaadam marked an inline comment as done. Nov 20 2019, 1:24 AM

kaadam added inline comments.

test/ExecutionEngine/RuntimeDyld/AArch64/COFF_AArch64.s
127 ↗	(On Diff #230091)	Ahh sorry, I see now, you've modified that as well. Thanks again.

mstorsjo added inline comments. Nov 20 2019, 1:36 AM

test/ExecutionEngine/RuntimeDyld/AArch64/COFF_AArch64.s
127 ↗	(On Diff #230091)	That's how the test infrastructure expands it yes. The %t expands to a temporary file name based on the current file name, plus .tmp. You can manually run the tests for just one file, by running `bin/llvm-lit -v path/to/test/ExecutionEngine/Foo/file.s`, for a subdirectory tree with the same command pointing to a directory, or all of them (with `ninja check-llvm`).

My CL broke the buildbot; a break is needed in the default case. How should I fix it? Do I need to update this change?

In D69434#1753145, @kaadam wrote:

My CL broke the buildbot; a break is needed in the default case. How should I fix it? Do I need to update this change?

I pushed a new commit to fix this now.

You could post a new patch for fixing it, but in this case the fix was trivial enough to just do myself. If fixing takes longer, it might be good to revert the commit in the meantime, and then it might be better to update the full patch so that it can be reapplied in fixed form.

Okay, I see. Thanks Martin.

Coming to this a bit late -- Thanks for working on this kaadam! And thanks for the review mstorsjo.

Does COFF/AArch64 use a GOT? I couldn't see handling for that here?

Revision Contents

Path	Size
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp	3 lines
lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h	329 lines
lib/Target/AArch64/AArch64TargetMachine.cpp	5 lines

Diff 226440

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp

 //===-- RuntimeDyldCOFF.cpp - Run-time dynamic linker for MC-JIT -- C++ --==//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
 //
 // Implementation of COFF support for the MC-JIT runtime dynamic linker.
 //
 //===----------------------------------------------------------------------===//

 #include "RuntimeDyldCOFF.h"
 #include "Targets/RuntimeDyldCOFFI386.h"
 #include "Targets/RuntimeDyldCOFFThumb.h"
 #include "Targets/RuntimeDyldCOFFX86_64.h"
+#include "Targets/RuntimeDyldCOFFAArch64.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/Triple.h"
 #include "llvm/Object/ObjectFile.h"

 using namespace llvm;
 using namespace llvm::object;

 #define DEBUG_TYPE "dyld"
@@ 25 lines not shown @@ llvm::RuntimeDyldCOFF::create(Triple::ArchType Arch,
   switch (Arch) {
   default: llvm_unreachable("Unsupported target for RuntimeDyldCOFF.");
   case Triple::x86:
     return std::make_unique<RuntimeDyldCOFFI386>(MemMgr, Resolver);
   case Triple::thumb:
     return std::make_unique<RuntimeDyldCOFFThumb>(MemMgr, Resolver);
   case Triple::x86_64:
     return std::make_unique<RuntimeDyldCOFFX86_64>(MemMgr, Resolver);
+  case Triple::aarch64:
+    return std::make_unique<RuntimeDyldCOFFAArch64>(MemMgr, Resolver);
   }
 }

 std::unique_ptr<RuntimeDyld::LoadedObjectInfo>
 RuntimeDyldCOFF::loadObject(const object::ObjectFile &O) {
   if (auto ObjSectionToIDOrErr = loadObjectImpl(O)) {
     return std::make_unique<LoadedCOFFObjectInfo>(this, ObjSectionToIDOrErr);
   } else {
@@ 17 lines not shown @@

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h

This file was added.

				//===-- RuntimeDyldCOFFAArch64.h --- COFF/AArch64 specific code ---- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// COFF AArch64 support for MC-JIT runtime dynamic linker.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_LIB_EXECUTIONENGINE_RUNTIMEDYLD_TARGETS_RUNTIMEDYLDCOFFAARCH64_H
				#define LLVM_LIB_EXECUTIONENGINE_RUNTIMEDYLD_TARGETS_RUNTIMEDYLDCOFFAARCH64_H

				#include "../RuntimeDyldCOFF.h"
				#include "llvm/BinaryFormat/COFF.h"
				#include "llvm/Object/COFF.h"
				#include "llvm/Support/Endian.h"

				#define DEBUG_TYPE "dyld"

				using namespace llvm::support::endian;

				namespace llvm {

				static void or32le(void *P, int32_t V) { write32le(P, read32le(P) | V); }

				static void or32AArch64Imm(void *L, uint64_t Imm) {
				or32le(L, (Imm & 0xFFF) << 10);
				}

				static void write32AArch64Addr(void *L, uint64_t Imm) {
				uint32_t ImmLo = (Imm & 0x3) << 29;
				uint32_t ImmHi = (Imm & 0x1FFFFC) << 3;
				uint64_t Mask = (0x3 << 29) | (0x1FFFFC << 3);
				write32le(L, (read32le(L) & ~Mask) | ImmLo | ImmHi);
				}

				// Return the bits [Start, End] from Val shifted Start bits.
				// For instance, getBits(0xF0, 4, 8) returns 0xF.
				static uint64_t getBits(uint64_t Val, int Start, int End) {
				uint64_t Mask = ((uint64_t)1 << (End + 1 - Start)) - 1;
				return (Val >> Start) & Mask;
				}

				class RuntimeDyldCOFFAArch64 : public RuntimeDyldCOFF {

				private:
				// When a module is loaded we save the SectionID of the unwind
				// sections in a table until we receive a request to register all
				// unregisteredEH frame sections with the memory manager.
				SmallVector<SID, 2> UnregisteredEHFrameSections;
				SmallVector<SID, 2> RegisteredEHFrameSections;
				uint64_t ImageBase;

				public:
				RuntimeDyldCOFFAArch64(RuntimeDyld::MemoryManager &MM,
				JITSymbolResolver &Resolver)
				: RuntimeDyldCOFF(MM, Resolver), ImageBase(0) {}

				unsigned getStubAlignment() override { return 8; }

				unsigned getMaxStubSize() const override { return 20; }

				std::tuple<uint64_t, uint64_t, uint64_t>
				generateRelocationStub(unsigned SectionID, StringRef TargetName,
				uint64_t Offset, uint64_t RelType, uint64_t Addend,
				StubMap &Stubs) {
				uintptr_t StubOffset;
				SectionEntry &Section = Sections[SectionID];

				RelocationValueRef OriginalRelValueRef;
				OriginalRelValueRef.SectionID = SectionID;
				OriginalRelValueRef.Offset = Offset;
				OriginalRelValueRef.Addend = Addend;
				OriginalRelValueRef.SymbolName = TargetName.data();

				auto Stub = Stubs.find(OriginalRelValueRef);
				if (Stub == Stubs.end()) {
				LLVM_DEBUG(dbgs() << " Create a new stub function for "
				<< TargetName.data() << "\n");

				StubOffset = Section.getStubOffset();
				Stubs[OriginalRelValueRef] = StubOffset;
				createStubFunction(Section.getAddressWithOffset(StubOffset));
				Section.advanceStubOffset(getMaxStubSize());
				} else {
				LLVM_DEBUG(dbgs() << " Stub function found for " << TargetName.data()
				<< "\n");
				StubOffset = Stub->second;
				}

				// Resolve original relocation to stub function.
				const RelocationEntry RE(SectionID, Offset, RelType, Addend);
				resolveRelocation(RE, Section.getLoadAddressWithOffset(StubOffset));

				// adjust relocation info so resolution writes to the stub function
				Addend = 0;
				Offset = StubOffset;
				RelType = COFF::IMAGE_REL_ARM64_ADDR64;

				return std::make_tuple(Offset, RelType, Addend);
				}

				Expected<object::relocation_iterator>
				processRelocationRef(unsigned SectionID,
				object::relocation_iterator RelI,
				const object::ObjectFile &Obj,
				ObjSectionToIDMap &ObjSectionToID,
				StubMap &Stubs) override {

				auto Symbol = RelI->getSymbol();
				if (Symbol == Obj.symbol_end())
				report_fatal_error("Unknown symbol in relocation");

				Expected<StringRef> TargetNameOrErr = Symbol->getName();
				if (!TargetNameOrErr)
				return TargetNameOrErr.takeError();
				StringRef TargetName = *TargetNameOrErr;

				auto SectionOrErr = Symbol->getSection();
				if (!SectionOrErr)
				return SectionOrErr.takeError();
				auto Section = *SectionOrErr;

				uint64_t RelType = RelI->getType();
				uint64_t Offset = RelI->getOffset();

				// If there is no section, this must be an external reference.
				const bool IsExtern = Section == Obj.section_end();

				// Determine the Addend used to adjust the relocation value.
				uint64_t Addend = 0;
				SectionEntry &AddendSection = Sections[SectionID];
				uintptr_t ObjTarget = AddendSection.getObjAddress() + Offset;
				uint8_t *Displacement = (uint8_t *)ObjTarget;

				switch (RelType) {
				case COFF::IMAGE_REL_ARM64_ADDR32:
				case COFF::IMAGE_REL_ARM64_ADDR32NB:
				case COFF::IMAGE_REL_ARM64_SECREL:
				Addend = readBytesUnaligned(Displacement, 4);
				break;
				case COFF::IMAGE_REL_ARM64_BRANCH26: {
				auto *p = reinterpret_cast<support::aligned_ulittle32_t *>(Displacement);
				assert(((*p & 0xFC000000) == 0x14000000 ||
				(*p & 0xFC000000) == 0x94000000) &&
				"Expected branch instruction.");

				// Get the 26 bit addend encoded in the branch instruction and sign-extend
				// to 64 bit. The lower 2 bits are always zeros and are therefore implicit
				// (<< 2).
				Addend = (*p & 0x03FFFFFF) << 2;
				Addend = SignExtend64(Addend, 28);

				if (IsExtern)
				std::tie(Offset, RelType, Addend) = generateRelocationStub(
				SectionID, TargetName, Offset, RelType, Addend, Stubs);
				break;
				}
				case COFF::IMAGE_REL_ARM64_PAGEBASE_REL21: {
				auto *p = reinterpret_cast<support::aligned_ulittle32_t *>(Displacement);
				assert((*p & 0x9F000000) == 0x90000000 && "Expected adrp instruction.");

				// Get the 21 bit addend encoded in the adrp instruction and sign-extend
				// to 64 bit. The lower 12 bits (4096 byte page) are always zeros and are
				// therefore implicit (<< 12).
				Addend = ((*p & 0x60000000) >> 29) | ((*p & 0x01FFFFE0) >> 3) << 12;
				Addend = SignExtend64(Addend, 33);
				break;
				}
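A minimal sketch of the byte-valued addend discussed for `IMAGE_REL_ARM64_PAGEBASE_REL21`, assuming the standard ADRP layout (immlo in bits 30:29, immhi in bits 23:5). The helper names `decodeAdrpAddend` and `adrpPageDelta` are hypothetical and are not part of the patch or of RuntimeDyld; the point is only that the addend is added in bytes before the page is computed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read the 21-bit ADRP immediate (immhi:immlo) and
// sign-extend it. Before the relocation is applied, this value is treated
// as a byte offset, not as a page count.
inline int64_t decodeAdrpAddend(uint32_t Insn) {
  uint32_t ImmLo = (Insn >> 29) & 0x3;     // bits 30:29
  uint32_t ImmHi = (Insn >> 5) & 0x7FFFF;  // bits 23:5
  int64_t Imm21 = (ImmHi << 2) | ImmLo;
  return (Imm21 ^ 0x100000) - 0x100000;    // sign-extend from bit 20
}

// Hypothetical helper: number of 4 KiB pages between the page of
// (Sym + AddendBytes) and the page of the instruction at Place.
inline int64_t adrpPageDelta(uint64_t Sym, int64_t AddendBytes,
                             uint64_t Place) {
  uint64_t TargetPage = (Sym + AddendBytes) & ~uint64_t(0xFFF);
  uint64_t PlacePage = Place & ~uint64_t(0xFFF);
  // The difference is a multiple of 4096, so the division is exact.
  return static_cast<int64_t>(TargetPage - PlacePage) / 4096;
}
```

With a symbol at 0x1FF8 and a 16-byte addend, the target lands on the next page, so the delta from an instruction at 0x1000 is one page; ignoring the addend would give zero.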
				mstorsjo: Nit: Here and below you have pretty superfluous outer parentheses
				case COFF::IMAGE_REL_ARM64_PAGEOFFSET_12A:
				case COFF::IMAGE_REL_ARM64_PAGEOFFSET_12L: {
				// Verify that the relocation points to one of the expected load / store
				mstorsjo: Nit: The indentation of the first comment line is off here
				// or add / sub instructions.
				auto p = reinterpret_cast<support::aligned_ulittle32_t *>(Displacement);
				assert((((*p & 0x3B000000) == 0x39000000) ||
				((*p & 0x11C00000) == 0x11000000) ) &&
				"Expected load / store or add/sub instruction.");
				mstorsjo: Nit: Double spaces between "store" and "or"

				mstorsjo: This doesn't seem to be right? For instructions B.cond and e.g. cbz, you have 5 least significant bits of the instruction being other data than the immediate, and after that, the resulting value should be left shifted by 2. So here, it should be `Addend = ((orig & 0x00FFFFE0) >> 5) << 2;` (for clarity) or `Addend = (orig & 0x00FFFFE0) >> 3;` (for more straightforward but less obvious code). Something similar should be done for branch14 below as well. LLD currently actually ignores the existing immediate in these relocations (which hasn't been an issue so far, but technically is an oversight).
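The decoding described in the comment above can be sketched as follows. The function names are hypothetical, and the sign extension is an added assumption (the expression quoted in the review does not sign-extend); the bit positions are imm19 in bits 23:5 for B.cond/CBZ/CBNZ and imm14 in bits 18:5 for TBZ/TBNZ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for BRANCH19-style instructions (B.cond, CBZ, CBNZ):
// drop the 5 low non-immediate bits, sign-extend, then scale by 4 bytes.
inline int64_t decodeBranch19Addend(uint32_t Insn) {
  int64_t Imm = (Insn & 0x00FFFFE0) >> 5;
  Imm = (Imm ^ 0x40000) - 0x40000; // sign-extend from bit 18 (assumption)
  return Imm * 4;
}

// Hypothetical helper for BRANCH14-style instructions (TBZ, TBNZ):
// the immediate is narrower, and bits 23:19 hold the tested bit number.
inline int64_t decodeBranch14Addend(uint32_t Insn) {
  int64_t Imm = (Insn & 0x0007FFE0) >> 5;
  Imm = (Imm ^ 0x2000) - 0x2000;   // sign-extend from bit 13 (assumption)
  return Imm * 4;
}
```

For non-negative immediates this gives the same result as `(orig & 0x00FFFFE0) >> 3` from the comment.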
				// Get the 12 bit addend encoded in the instruction.
				Addend = (*p & 0x003FFC00) >> 10;

				// Check which instruction we are decoding to obtain the implicit shift
				// factor of the instruction.
				int ImplicitShift = 0;
				if ((*p & 0x3B000000) == 0x39000000) { // << load / store
				mstorsjo: I guess this check could also be for the relocation type `IMAGE_REL_ARM64_PAGEOFFSET_12L`?
				kaadam: It is just for IMAGE_REL_ARM64_PAGEOFFSET_12L; we need to determine the shift value only for load/store, since for ADD/ADDS (immediate) instructions we should use a zero shift.
				mstorsjo: Yes, but the relocation names already imply this. The `L` suffixed relocation is used for loads/stores, and the `A` suffixed relocation is used for add instructions. For `IMAGE_REL_ARM64_PAGEOFFSET_12L` we do not need to check whether the instruction is a load/store, but we can go directly to reading out the implicit shift amount, and for `IMAGE_REL_ARM64_PAGEOFFSET_12A` we should not read any implicit shift amount at all. See lld/COFF/Chunks.cpp, SectionChunk::applyRelARM64. For IMAGE_REL_ARM64_PAGEOFFSET_12L we call applyArm64Ldr which reads out the shift amount and then calls applyArm64Imm, while applyArm64Imm is called directly for IMAGE_REL_ARM64_PAGEOFFSET_12A. In general, when a linker (either dynamic or static) resolves a relocation, it should seldom need to inspect the instruction it is applied on, even though it is needed here for reading out the implicit shift amount. In general, the relocation type just encodes a specific action that should be done on that memory location with very little extra logic.
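A rough sketch of the relocation-type-driven decoding suggested above, under the assumption that the caller already knows which of the two relocation kinds it is handling. `PageOffsetKind` and both function names are hypothetical stand-ins, not names from the patch or from lld; the masks are the ones used in the code under review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two relocation types discussed above.
enum class PageOffsetKind { Add12A, LoadStore12L };

// For load/store (unsigned offset) the access size is in bits 31:30;
// a size of 0 combined with the vector and 128-bit bits means a 16-byte access.
inline unsigned loadStoreImplicitShift(uint32_t Insn) {
  unsigned Shift = Insn >> 30;
  if (Shift == 0 && (Insn & 0x04800000) == 0x04800000)
    Shift = 4;
  return Shift;
}

// The relocation kind alone decides whether a shift is applied; the
// instruction is only inspected to read the shift amount for 12L.
inline uint64_t decodePageOffsetAddend(uint32_t Insn, PageOffsetKind Kind) {
  uint64_t Imm12 = (Insn >> 10) & 0xFFF;
  if (Kind == PageOffsetKind::LoadStore12L)
    return Imm12 << loadStoreImplicitShift(Insn);
  return Imm12; // 12A: ADD immediate, no implicit shift
}
```

Here `0xF9400420` is `ldr x0, [x1, #8]`, `0x3DC00420` is `ldr q0, [x1, #16]`, and `0x91002020` is `add x0, x1, #8`, so the decoded byte addends are 8, 16, and 8 respectively.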
				// For load / store instructions the size is encoded in bits 31:30.
				ImplicitShift = ((*p >> 30) & 0x3);
				if (ImplicitShift == 0) {
				// Check if this a vector op to get the correct shift value.
				if ((*p & 0x04800000) == 0x04800000)
				ImplicitShift = 4;
				mstorsjo: You can't do the masking out of the original immediate here; updating `orig` has no effect at all, as that's a local variable. Handling masking out the old value within `processRelocationRef` feels wrong in general (as this function only inspects and gathers info but doesn't update anything yet), I think this should be in `resolveRelocation`. So then you can't use `or32le` there in those cases, but more something like `write32le(P, (read32le(P) & ~Mask) | V);` (which perhaps can warrant a helper function of its own).
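The helper hinted at in the comment above might look like the sketch below. To keep it self-contained it uses local `load32le`/`store32le` stand-ins rather than LLVM's `read32le`/`write32le`; `replaceBits` and `updateBits32le` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for little-endian 32-bit accessors (not LLVM's own).
inline uint32_t load32le(const uint8_t *P) {
  return uint32_t(P[0]) | (uint32_t(P[1]) << 8) | (uint32_t(P[2]) << 16) |
         (uint32_t(P[3]) << 24);
}
inline void store32le(uint8_t *P, uint32_t V) {
  P[0] = uint8_t(V);
  P[1] = uint8_t(V >> 8);
  P[2] = uint8_t(V >> 16);
  P[3] = uint8_t(V >> 24);
}

// Clear the bits selected by Mask, then insert V into that field.
inline uint32_t replaceBits(uint32_t Old, uint32_t Mask, uint32_t V) {
  return (Old & ~Mask) | (V & Mask);
}

// Hypothetical helper: the read-mask-write sequence as one operation.
inline void updateBits32le(uint8_t *P, uint32_t Mask, uint32_t V) {
  store32le(P, replaceBits(load32le(P), Mask, V));
}
```

Unlike a plain OR, this overwrites whatever immediate was already stored in the field, which is the behaviour needed once the old immediate has been read out as an addend.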
				}
				}
				// Compensate for implicit shift.
				Addend <<= ImplicitShift;
				}
				default:
				break;
				}

				#if !defined(NDEBUG)
				SmallString<32> RelTypeName;
				RelI->getTypeName(RelTypeName);
				mstorsjo: Do we need to support reading out the immediate from `INTERNAL_REL_ARM64_LONG_BRANCH26` here? Or as the only place that generates it writes a zero immediate I guess it's not necessary?
				kaadam: Yes, it is not necessary, since the Addend is always zero for this internal type.
				#endif
				LLVM_DEBUG(dbgs() << "\t\tIn Section " << SectionID << " Offset " << Offset
				<< " RelType: " << RelTypeName << " TargetName: "
				<< TargetName << " Addend " << Addend << "\n");
				mstorsjo: Would it make more sense, stylistically, to extend the ifdef around the debug statement as well? Right now it does look weird to have code referring to variables that don't exist (even though LLVM_DEBUG will make them disappear).
				kaadam: Yes, you're right, I will fix it.

				unsigned TargetSectionID = -1;
				if (IsExtern) {
				RelocationEntry RE(SectionID, Offset, RelType, Addend);
				addRelocationForSymbol(RE, TargetName);
				} else {
				if (auto TargetSectionIDOrErr =
				findOrEmitSection(Obj, *Section, Section->isText(), ObjSectionToID)) {
				TargetSectionID = *TargetSectionIDOrErr;
				}
				mstorsjo: The indentation here is weird. Please run `clang-format-diff -style LLVM` on the changes.
				kaadam: Thanks, I will use it.
				else
				return TargetSectionIDOrErr.takeError();

				// This relocation is ignored.
				if (RelType != COFF::IMAGE_REL_ARM64_ABSOLUTE) {
				uint64_t TargetOffset = getSymbolOffset(*Symbol);
				RelocationEntry RE(SectionID, Offset, RelType, TargetOffset + Addend);
				addRelocationForSection(RE, TargetSectionID);
				}
				}
				return ++RelI;
				}

				void resolveRelocation(const RelocationEntry &RE, uint64_t Value) override
				{
				const auto Section = Sections[RE.SectionID];
				uint8_t *Target = Section.getAddressWithOffset(RE.Offset);
				uint64_t FinalAddress = Section.getLoadAddressWithOffset(RE.Offset);

				switch (RE.RelType) {
				default: llvm_unreachable("unsupported relocation type");
				case COFF::IMAGE_REL_ARM64_ADDR64: {
				or32le(Target + 12, ((Value + RE.Addend) & 0xFFFF) << 5);
				mstorsjo: This doesn't seem right. `IMAGE_REL_ARM64_ADDR64` is a plain 64 bit integer (just like `IMAGE_REL_ARM64_ADDR32NB`), not a series of instructions that should get immediates added.
				kaadam: Yes, you're right. Currently I handle a long branch instruction with this relocation, so when I needed to create a stub function (which generates movz/movk instructions), I just set this relocation to detect this case when we have an external symbol. When I tested it with some small examples and SwiftShader it worked, because I haven't encountered this relocation type. Of course I should use another one, maybe an internal type. What do you think?
				mstorsjo: Hmm, as I'm not familiar with those bits that generate it, I don't know for sure. As you're already setting a small code model for aarch64/win, doesn't that already achieve this? Otherwise some variant of adrp+add would normally be used for forming any arbitrary address, instead of a series of mov instructions, unless the value to be formed is a constant. If necessary I guess one could consider using private internal relocation types, but I'd at least defer those bits to a later patch where it can be discussed properly on its own, and keep this first for the official relocation types.
				or32le(Target + 8, ((Value + RE.Addend) & 0xFFFF0000) >> 11);
				or32le(Target + 4, ((Value + RE.Addend) & 0xFFFF00000000) >> 27);
				or32le(Target + 0, ((Value + RE.Addend) & 0xFFFF000000000000) >> 43);
				break;
				}
				case COFF::IMAGE_REL_ARM64_PAGEBASE_REL21: {
				uint64_t FinalAddress = Section.getLoadAddressWithOffset(RE.Offset);
				int64_t PCRelVal =
				((Value + RE.Addend) & (-4096)) - (FinalAddress & (-4096));
				write32AArch64Addr(Target, PCRelVal >> 12);
				break;
				}

				case COFF::IMAGE_REL_ARM64_PAGEOFFSET_12A:
				case COFF::IMAGE_REL_ARM64_PAGEOFFSET_12L: {
				Value += RE.Addend;
				Value &= 0xFFF;
				auto p = reinterpret_cast<support::aligned_ulittle32_t *>(Target);
				// Check which instruction we are decoding to obtain the implicit shift
				// factor of the instruction and verify alignment.
				int ImplicitShift = 0;
				if ((*p & 0x3B000000) == 0x39000000) { // << load / store
				// For load / store instructions the size is encoded in bits 31:30.
				ImplicitShift = ((*p >> 30) & 0x3);
				switch (ImplicitShift) {
				case 0:
				// Check if this a vector op to get the correct shift value.
				if ((*p & 0x04800000) == 0x04800000) {
				ImplicitShift = 4;
				assert(((Value & 0xF) == 0) &&
				"128-bit LDR/STR not 16-byte aligned.");
				}
				break;
				case 1:
				assert(((Value & 0x1) == 0) && "16-bit LDR/STR not 2-byte aligned.");
				break;
				case 2:
				assert(((Value & 0x3) == 0) && "32-bit LDR/STR not 4-byte aligned.");
				break;
				case 3:
				assert(((Value & 0x7) == 0) && "64-bit LDR/STR not 8-byte aligned.");
				break;
				}
				}
				mstorsjo: This switch for checking for alignment is rather verbose - would it make sense to condense it down to a single expression for all alignments?
				kaadam: Sure, it could be.
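One possible shape for the condensed check, as a sketch: the value must have no bits set below the implicit shift, whatever the access size. `isAlignedForShift` is a hypothetical name, and the final patch may have chosen a different form.

```cpp
#include <cassert>
#include <cstdint>

// True if Value is a multiple of (1 << ImplicitShift), i.e. the low
// ImplicitShift bits are all zero. Covers the 1, 2, 4, 8 and 16 byte cases.
inline bool isAlignedForShift(uint64_t Value, unsigned ImplicitShift) {
  return (Value & ((uint64_t(1) << ImplicitShift) - 1)) == 0;
}
```

In the code under review this would replace the per-size asserts with a single `assert(isAlignedForShift(Value, ImplicitShift) && ...)`, at the cost of a less specific message.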
				// Compensate for implicit shift.
				Value >>= ImplicitShift;
				assert(isUInt<12>(Value) && "Addend cannot be encoded.");
				*p = (*p & 0xFFC003FF) | ((uint32_t)(Value << 10) & 0x003FFC00);
				break;
				}
				case COFF::IMAGE_REL_ARM64_ADDR32NB: {
				// The target's 32-bit RVA.
				// NOTE: use Section[0].getLoadAddress() as an approximation of ImageBase
				uint64_t Result = FinalAddress -
				Sections[0].getLoadAddress() + RE.Addend;
				writeBytesUnaligned(Result, Target, 4);
				break;
				}
				case COFF::IMAGE_REL_ARM64_BRANCH26: {
				uint64_t FinalAddress = Section.getLoadAddressWithOffset(RE.Offset);
				mstorsjo: If there actually was a nonzero immediate in the instruction here from before, `or32le` won't do the right thing. We don't handle this in lld right now, but as this code at least tries to read out the immediate further up, it would be good for consistency to actually clear the immediate from the branch instruction here before or'ing in the final value.
				kaadam: Yes, you're right. Maybe the immediate for these instructions could be cleared when the addends are decoded.
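To illustrate the concern, here is a small sketch of encoding a 26-bit branch immediate while clearing any stale one first. `encodeBranch26` is a hypothetical value-level helper, not code from the patch; it keeps only the opcode bits 31:26 of the original instruction.

```cpp
#include <cassert>
#include <cstdint>

// Replace the imm26 field of a B/BL instruction with PCRelVal / 4.
// Masking with 0xFC000000 discards any immediate that was already present,
// which a plain OR would otherwise merge into the result.
inline uint32_t encodeBranch26(uint32_t Insn, int64_t PCRelVal) {
  uint32_t Imm26 =
      static_cast<uint32_t>(static_cast<uint64_t>(PCRelVal) >> 2) & 0x03FFFFFF;
  return (Insn & 0xFC000000) | Imm26;
}
```

For a `b` instruction that still carries an old immediate of 3 (`0x14000003`) and a new displacement of 16 bytes, this yields `0x14000004`, whereas OR-ing alone would produce `0x14000007`.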
				int64_t PCRelVal = Value - FinalAddress + RE.Addend;
				auto p = reinterpret_cast<support::aligned_ulittle32_t *>(Target);
				// Verify that the relocation points to the expected branch instruction.
				assert(((*p & 0xFC000000) == 0x14000000 ||
				(*p & 0xFC000000) == 0x94000000) &&
				"Expected branch instruction.");

				// Verify addend value.
				assert((PCRelVal & 0x3) == 0 && "Branch target is not aligned");
				assert(isInt<28>(PCRelVal) && "Branch target is out of range.");

				// Encode the addend as 26 bit immediate in the branch instruction.
				or32le(p, (PCRelVal & 0x0FFFFFFC) >> 2);
				// *p = (*p & 0xFC000000) | ((uint32_t)(PCRelVal >> 2) & 0x03FFFFFF);
				break;
				}
				}
				}


				void registerEHFrames() override {}
				};

				} // End namespace llvm

				#endif
				No newline at end of file
				mstorsjo: Missing newline at end of file
				mstorsjo: Hmm, where does Value end up added to this one? I do see that the existing COFF targets do it the same way, and that code does seem to be used and have a working test, but I don't see how it works. Do you have any clue?
				kaadam: Unfortunately I'm not sure what happens under the hood, but it seems it is correct to use only the addend here, which contains the offset of the item from the beginning of its section in this case.

lib/Target/AArch64/AArch64TargetMachine.cpp

  if (CM != CodeModel::Small && CM != CodeModel::Tiny &&
          "are allowed on AArch64");
    } else if (*CM == CodeModel::Tiny && !TT.isOSBinFormatELF())
      report_fatal_error("tiny code model is only supported on ELF");
    return *CM;
  }
  // The default MCJIT memory managers make no guarantees about where they can
  // find an executable page; JITed code needs to be able to refer to globals
  // no matter how far away they are.
- if (JIT)
+ // We should set the CodeModel::Small for Windows ARM64 in JIT mode,
+ // since with large code model LLVM generating 4 MOV instructions, and
+ // Windows doesn't support relocating these long branch (4 MOVs).
+ if (JIT && !TT.isOSWindows())
    return CodeModel::Large;
  return CodeModel::Small;
}

/// Create an AArch64 architecture model.
///
AArch64TargetMachine::AArch64TargetMachine(const Target &T, const Triple &TT,
                                           StringRef CPU, StringRef FS,

ExecutionEngine: add preliminary support for COFF ARM64 (Closed, Public)

Diff 226440

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h

lib/Target/AArch64/AArch64TargetMachine.cpp