This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lld/
-
ELF/Arch/
-
Arch/
2/20
PPC64.cpp
-
test/ELF/
-
ELF/
3/4
ppc64-pcrel34-reloc.s

Differential D81457

[LLD][PowerPC] Add support for R_PPC64_PCREL34
ClosedPublic

Authored by stefanp on Jun 9 2020, 4:54 AM.

Download Raw Diff

Details

Reviewers

nemanjai
sfertile
MaskRay
ruiu
hfinkel
• espindola

Group Reviewers

Restricted Project

Commits

rG3a55a2a97fd4: [LLD][PowerPC] Add support for R_PPC64_PCREL34

Summary

Add support for the 34bit relocation R_PPC64_PCREL34 for PC Relative in LLD.

Diff Detail

Unit TestsFailed

	Time	Test
	120 ms	Clang-Unit.StaticAnalyzer/_/StaticAnalysisTests::Unknown Unit Message ("")
	40 ms	Clang-Unit.StaticAnalyzer/_/StaticAnalysisTests_exe::Unknown Unit Message ("")
	680 ms	LLVM.CodeGen/AMDGPU::Unknown Unit Message ("")
	470 ms	LLVM.CodeGen/AMDGPU::Unknown Unit Message ("")
	20 ms	LLVM.MC/AArch64::Unknown Unit Message ("")
		View Full Test Results (7 Failed)

Event Timeline

stefanp created this revision.Jun 9 2020, 4:54 AM

Herald added a reviewer: • espindola. · View Herald TranscriptJun 9 2020, 4:54 AM

Herald added a project: Restricted Project. · View Herald Transcript

Herald added subscribers: shchenz, kbarton, arichardson, emaste. · View Herald Transcript

Harbormaster failed remote builds in B59616: Diff 269493!Jun 9 2020, 6:00 AM

Herald added a subscriber: • wuzish. · View Herald TranscriptJun 9 2020, 6:00 AM

MaskRay added inline comments.Jun 10 2020, 9:07 PM

lld/test/ELF/ppc64-pcrel34-reloc.s
3	Delete `-unknown-linux`
21	`# 0x0` is unnecessary ditto below CHECK lines before the code are more common. (A few ppc64 tests may not follow the convention)
51	Delete trailing empty lines

sfertile added inline comments.Jun 11 2020, 8:38 AM

lld/ELF/Arch/PPC64.cpp
382	Can I ask the rational on why we want to treat a prefixed instruction as a single 8-byte instruction as opposed to a 4-byte prefix and a 4-byte instruction? I'm not necessarily disagreeing we should treat them as a single entity, I just want to make sure we have though about what the best representation is considering all the new relocations to be added.
1011	I've got a patch which removes the masking of the instruction bits on the other relocations types we were doing it for (D81106). My understanding is that for REL ELF archs the addend is encoded into the bits to be relocated, and for RELA ELF archs the bits to be relocated are zeros, with the addend in the relocation. The extra masking doesn't hurt anything functionally but it can lead to misunderstandings on what the bits to be relocated should contain so I think its important to be precise and omit the extra masking despite it not hurting anything.

MaskRay added inline comments.Jun 11 2020, 11:50 AM

lld/ELF/Arch/PPC64.cpp
1011	Even if powerpc is a RELA target, if it is easy to drop the requirement, I hope we can make some efforts making that work. See D80496 (-z rel)

stefanp added inline comments.Jun 12 2020, 12:17 PM

lld/ELF/Arch/PPC64.cpp
382	Good question. I like to treat them as one because conceptually they are one instruction. So, in my mind I tend to think of them as 8 byte instructions. Some reasons I tend to think of them this way: We can have the instruction part without the prefix but we can never have a prefix without a following instruction to go with it. Immediates are represented across the boundary of the prefix/instruction. For example `paddi` has 18 bits of immediate in the prefix and 16 bits of the same immediate in the instruction. Those two are concatenated together to form the actual immediate. Flags in the prefix change the meaning of the subsequent instruction. Think of the PC Relative flag. In llc we treat them as one instruction (I know this is a bit circular...) For example in `PPCInstrPrefix.td`we have the base class for all prefixed instructions: // Top-level class for prefixed instructions. class PI<bits<6> pref, bits<6> opcode, dag OOL, dag IOL, string asmstr, InstrItinClass itin> : Instruction { field bits<64> Inst; field bits<64> SoftFail = 0; bit PCRel = 0; // Default value, set by isPCRel. let Size = 8; In the end the reason is purely conceptual from my perspective. I don't have a technical reason to treat them as one.
1011	I do see what you mean. The way I look at this is that we have 3 options: We just ignore it and assume that the zeros are there. This is what you are suggesting. However, my worry is that this way we could be silently producing bad code. Basically if there is a garbage bit in there we end up with an "or" of that bad bit and a valid offset and we sometimes end up with a bad offset. We mask it out. This is what I did. It basically hides the problem. If there are bad bits they get zeroed out and we end up with a valid instruction. We will never know that the initial encoding is wrong. We assert on those bits. This means that if there are non-zero bits in the encoding we don't even link the code. I don't want to do this last one mainly because GCC doesn't do it. I know, it is not necessarily a valid reason to do this. I had originally picked option 2 because that is what GCC did. It produces the correct output despite the garbage I put in there. Again, just because GCC does this it doesn't mean we should. With respect to the second comment: At this point I'm not going to try to change this to support REL yet. I'm just looking at RELA. If we want to add that support I would say doing it in a completely different patch.

Addressed review comments for the test case.

MaskRay added inline comments.Jun 12 2020, 12:50 PM

lld/ELF/Arch/PPC64.cpp
1011	I changed my mind on D81106. We mask it out. This is what I did. It basically hides the problem. If there are bad bits they get zeroed out and we end up with a valid instruction. We will never know that the initial encoding is wrong. This one looks good to me. Why does it "hide the problem"? powerpc{,64} are RELA targets. For a RELA target, the implicit addend should not matter. FWIW, AArch64.cpp marks out the bits as well.

nemanjai added inline comments.Jun 15 2020, 9:29 AM

lld/ELF/Arch/PPC64.cpp
382	Not that it makes a significant difference, but also one store vs. two.
1011	I too am in favour of masking out the bits. This is what `ld.bfd` does. It doesn't hurt performance. It is conservatively correct with minimal overhead.

sfertile added inline comments.Jun 15 2020, 10:08 AM

lld/ELF/Arch/PPC64.cpp
382	I can see why we think of them as a single instruction in the backend. The reason I asked though is that in the object file we have a 4-byte prefix and a 4-byte instruction not an 8-byte instruction (as you can see from how we have to shuffle in the read/writes), and in this case the bits being relocated are broken into bits in the prefix and bits in the instruction. (with the 2 fields being discontinuous) The prefix28 relocations was similar (high 12 in the prefix, lo 16 in the instruction and again the 2 fields are discontinuous). I haven't looked at all the new relocations though, which format does the paddi use?
1011	FWIW, AArch64.cpp marks out the bits as well. AARCH64 is inconsistent: It masks to zero out the bits in `write32AArch64Addr`, but doesn't in `or32le`. static void or32le(uint8_t *p, int32_t v) { write32le(p, read32le(p) \| v); } With respect to D80496 --> I'm not sure how this is relevant. IIUC the option is to use REL instead of RELA for the dynamic relocations. ie LLDs output, the masking we are talking about here is in relation to the linkers input. I had originally picked option 2 because that is what GCC did. It produces the correct output despite the garbage I put in there. Again, just because GCC does this it doesn't mean we should. I had looked at what LLVM outputs, what gas produces for a bunch of inputs (since I can't look at gas source) and the fact that the AARCH64 target doesn't mask out the bits to be relocated (in `or32le`) and though the best way to make this consistent was to omit the masking. I think we should be consistent both for all input relocations but also across targets. The reason I think its important to omit the masking (if its valid to omit) is precisely for the reason you pick option 2. The wrong input (ie with the addend encoded into the instruction) produced the right output which made you think the addend had to be there. The linker being 'nice' lead to us not catching the improper LLVM codegen. If the linker had been stricter we would have realized the problem earlier. A lot of these minor details aren't well documented and the tools implementations essentially becomes the documentation. If it is supposed to work with garbage in the bits then I am all for consistently masking anywhere we need a wider read then the bits to be relocated: but I'd like to see something concrete that suggest it is intended to work that way.

stefanp added inline comments.Jun 15 2020, 2:26 PM

lld/ELF/Arch/PPC64.cpp
382	paddi uses the same idea where it splits the 34 bit relocation bits into 18 bits in the prefix and 16 bits in the instruction. The two fields are discontinuous as you say. So, what you are saying is that you prefer to have this in a struct or class with two parts because we are going to be working with them separately anyway?
1011	By "hide the problem" I'm basically saying that there could be a compiler that is setting garbage bits into where the relocation is supposed to go. If that's the case we won't know about it because the linker is nice enough to fix it for us. It's a bug that is not detected. I have not been able to find a place that specified whether or not the relocation bits need to be zero or not when we are using RELA. What do people think of the following compromise: A warning or a debug message. We do compile everything but if we find that `readPrefixedInstruction(loc) & ~fullMask != readPrefixedInstruction(loc)` we emit a warning (or debug) saying that we have found garbage bits. We can still link everything that LD compiles but we notify potential problems.

sfertile added inline comments.Jun 16 2020, 7:24 AM

lld/ELF/Arch/PPC64.cpp
382	Nothing so heavy weight. I was think just keeping them a 2 separate uint32_t and modifying each individually. Something like: checkInt(loc, val, 34, rel); uint32_t si0Mask = 0x3ffff; uint32_t si0 = (val >> 16) & si0Mask write32(loc, (read32(loc) & ~si0Mask) \| si0); uint32_t si1Mask = 0xffff write32(loc + 4, (read(loc + 4) & ~si1Mask)\| (val & si1mask)); If we didn't need to mask off the read then I think keeping them separate was slightly cleaner. With masking I think its about the same either way so treating it as a 8-byte unit with read and write helpers is probably the better choice.
1011	I have not been able to find a place that specified whether or not the relocation bits need to be zero That's because nothing does specify it. After talking with Alan I have realized i am wrong on this: we can't rely on the bits being zeroed out. The current gas/LLVM behavior of zeroing out the bits isn't required and we should keep masking off any bits to be relocated when the read/write is bigger then the bits being relocated.

I think all of the comments have been addressed at this point as, based on the discussions, we have decided to leave the code as-is.

lld/ELF/Arch/PPC64.cpp
382	Thank you. I will leave it as read/write with helpers.
1011	Ok, in that case I will leave this code as-is.

MaskRay added inline comments.Jun 16 2020, 3:17 PM

lld/ELF/Arch/PPC64.cpp
1004	Nit: Lower-case hexadecimals are more common. (I know that there are a few places in this file where upper-case hexadecimals are inconsistently used)
lld/test/ELF/ppc64-pcrel34-reloc.s
21	Instead of a `.section`, it is better using a label. You'll then able to use `CHECK-NEXT`: # CHECK: <foo>: # CHECK-NEXT: plwa 3, 12(0), 1

Addressed review comments.

Generally looks good, only a few nits.

lld/ELF/Arch/PPC64.cpp
1009	si0 and si1 are only used once. Consider inlining them.
1012	This empty line can be deleted
lld/test/ELF/ppc64-reloc-pcrel34.s
17 ↗	(On Diff #271473)	One empty line is sufficient.
18 ↗	(On Diff #271473)	Delete `.text`

Address review comments.

Do you want to add a test for overflow of a 34-bit value?

lld/ELF/Arch/PPC64.cpp
381	This comments a out of date (ie load/store the pieces separately). If you mention its a 4 byte prefix followed by a 4 byte instruction I think the shifting we have to do for LE becomes pretty obvious.
lld/test/ELF/ppc64-reloc-pcrel34.s
37 ↗	(On Diff #271654)	Do you want to define an int of size 4 or 8? you use an lwa so I would expect a 4 byte int, and you use .long which allocates 4 bytes (I belive its the same as .int), but then give it a size of 8 and name it as int8? Ditto for `glob_int8_big`

stefanp added inline comments.Jun 22 2020, 11:37 AM

lld/test/ELF/ppc64-reloc-pcrel34.s
37 ↗	(On Diff #271654)	Sorry, what I meant to do was read the last 4 bytes of an 8 byte int. I'm using `glob_int8@PCREL+4` so I want to skip over the first 4 bytes and read the last 4. I'm actually testing the `+4`. I guess `.long` is not the right type to use. I think `.quad` is correct here.

Fixed comment and added a new overflow test case.

LGTM.

This revision is now accepted and ready to land.Jun 23 2020, 7:14 AM

sfertile mentioned this in D81106: [PowerPC][NFC] Don't need to and instruction with inverse of mask..Jun 23 2020, 7:17 AM

kamaub added a commit: rG3a55a2a97fd4: [LLD][PowerPC] Add support for R_PPC64_PCREL34.Jun 23 2020, 1:29 PM

kamaub closed this revision.Jun 23 2020, 1:46 PM

Hi, your git commit does not contain Differential Revision:, so the revision was not automatically closed when you pushed the commit.

(
You can drop Reviewers: Subscribers: Tags: and the text Summary: from the git commit with the following script:

arcfilter () {
        arc amend
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}

Reviewed By: is considered important by some people. Please keep the tag. (I have updated my script to use --date=now (setting author date to committer date))

https://reviews.llvm.org/D80978 contains a git pre-push hook to automate this.)

Revision Contents

Path

Size

lld/

ELF/

Arch/

PPC64.cpp

27 lines

test/

ELF/

ppc64-pcrel34-reloc.s

52 lines

Diff 269493

lld/ELF/Arch/PPC64.cpp

Show First 20 Lines • Show All 370 Lines • ▼ Show 20 Lines
static void writeFromHalf16(uint8_t *loc, uint32_t insn) {		static void writeFromHalf16(uint8_t *loc, uint32_t insn) {
write32(config->isLE ? loc : loc - 2, insn);		write32(config->isLE ? loc : loc - 2, insn);
}		}

static uint32_t readFromHalf16(const uint8_t *loc) {		static uint32_t readFromHalf16(const uint8_t *loc) {
return read32(config->isLE ? loc : loc - 2);		return read32(config->isLE ? loc : loc - 2);
}		}

		// The prefix is always in lower memory than the instruction (regardless of
		// endianness). So on little endian machines, we have to load/store the
		// pieces separately and assemble/disassemble them accordingly.
		sfertileUnsubmitted Not Done Reply Inline Actions This comments a out of date (ie load/store the pieces separately). If you mention its a 4 byte prefix followed by a 4 byte instruction I think the shifting we have to do for LE becomes pretty obvious. sfertile: This comments a out of date (ie load/store the pieces separately). If you mention its a 4…
		static void writePrefixedInstruction(uint8_t *loc, uint64_t insn) {
		sfertileUnsubmitted Not Done Reply Inline Actions Can I ask the rational on why we want to treat a prefixed instruction as a single 8-byte instruction as opposed to a 4-byte prefix and a 4-byte instruction? I'm not necessarily disagreeing we should treat them as a single entity, I just want to make sure we have though about what the best representation is considering all the new relocations to be added. sfertile: Can I ask the rational on why we want to treat a prefixed instruction as a single 8-byte…
		stefanpAuthorUnsubmitted Not Done Reply Inline Actions Good question. I like to treat them as one because conceptually they are one instruction. So, in my mind I tend to think of them as 8 byte instructions. Some reasons I tend to think of them this way: We can have the instruction part without the prefix but we can never have a prefix without a following instruction to go with it. Immediates are represented across the boundary of the prefix/instruction. For example `paddi` has 18 bits of immediate in the prefix and 16 bits of the same immediate in the instruction. Those two are concatenated together to form the actual immediate. Flags in the prefix change the meaning of the subsequent instruction. Think of the PC Relative flag. In llc we treat them as one instruction (I know this is a bit circular...) For example in `PPCInstrPrefix.td`we have the base class for all prefixed instructions: // Top-level class for prefixed instructions. class PI<bits<6> pref, bits<6> opcode, dag OOL, dag IOL, string asmstr, InstrItinClass itin> : Instruction { field bits<64> Inst; field bits<64> SoftFail = 0; bit PCRel = 0; // Default value, set by isPCRel. let Size = 8; In the end the reason is purely conceptual from my perspective. I don't have a technical reason to treat them as one. stefanp: Good question. I like to treat them as one because conceptually they are one instruction. So…
		nemanjaiUnsubmitted Not Done Reply Inline Actions Not that it makes a significant difference, but also one store vs. two. nemanjai: Not that it makes a significant difference, but also one store vs. two.
		sfertileUnsubmitted Not Done Reply Inline Actions I can see why we think of them as a single instruction in the backend. The reason I asked though is that in the object file we have a 4-byte prefix and a 4-byte instruction not an 8-byte instruction (as you can see from how we have to shuffle in the read/writes), and in this case the bits being relocated are broken into bits in the prefix and bits in the instruction. (with the 2 fields being discontinuous) The prefix28 relocations was similar (high 12 in the prefix, lo 16 in the instruction and again the 2 fields are discontinuous). I haven't looked at all the new relocations though, which format does the paddi use? sfertile: I can see why we think of them as a single instruction in the backend. The reason I asked…
		stefanpAuthorUnsubmitted Not Done Reply Inline Actions paddi uses the same idea where it splits the 34 bit relocation bits into 18 bits in the prefix and 16 bits in the instruction. The two fields are discontinuous as you say. So, what you are saying is that you prefer to have this in a struct or class with two parts because we are going to be working with them separately anyway? stefanp: paddi uses the same idea where it splits the 34 bit relocation bits into 18 bits in the prefix…
		sfertileUnsubmitted Not Done Reply Inline Actions Nothing so heavy weight. I was think just keeping them a 2 separate uint32_t and modifying each individually. Something like: checkInt(loc, val, 34, rel); uint32_t si0Mask = 0x3ffff; uint32_t si0 = (val >> 16) & si0Mask write32(loc, (read32(loc) & ~si0Mask) \| si0); uint32_t si1Mask = 0xffff write32(loc + 4, (read(loc + 4) & ~si1Mask)\| (val & si1mask)); If we didn't need to mask off the read then I think keeping them separate was slightly cleaner. With masking I think its about the same either way so treating it as a 8-byte unit with read and write helpers is probably the better choice. sfertile: Nothing so heavy weight. I was think just keeping them a 2 separate uint32_t and modifying each…
		stefanpAuthorUnsubmitted Done Reply Inline Actions Thank you. I will leave it as read/write with helpers. stefanp: Thank you. I will leave it as read/write with helpers.
		insn = config->isLE ? insn << 32 \| insn >> 32 : insn;
		write64(loc, insn);
		}

		static uint64_t readPrefixedInstruction(const uint8_t *loc) {
		uint64_t fullInstr = read64(loc);
		return config->isLE ? (fullInstr << 32 \| fullInstr >> 32) : fullInstr;
		}

PPC64::PPC64() {		PPC64::PPC64() {
copyRel = R_PPC64_COPY;		copyRel = R_PPC64_COPY;
gotRel = R_PPC64_GLOB_DAT;		gotRel = R_PPC64_GLOB_DAT;
noneRel = R_PPC64_NONE;		noneRel = R_PPC64_NONE;
pltRel = R_PPC64_JMP_SLOT;		pltRel = R_PPC64_JMP_SLOT;
relativeRel = R_PPC64_RELATIVE;		relativeRel = R_PPC64_RELATIVE;
iRelativeRel = R_PPC64_IRELATIVE;		iRelativeRel = R_PPC64_IRELATIVE;
symbolicRel = R_PPC64_ADDR64;		symbolicRel = R_PPC64_ADDR64;
▲ Show 20 Lines • Show All 278 Lines • ▼ Show 20 Lines	RelExpr PPC64::getRelExpr(RelType type, const Symbol &s,
case R_PPC64_REL14:		case R_PPC64_REL14:
case R_PPC64_REL24:		case R_PPC64_REL24:
return R_PPC64_CALL_PLT;		return R_PPC64_CALL_PLT;
case R_PPC64_REL16_LO:		case R_PPC64_REL16_LO:
case R_PPC64_REL16_HA:		case R_PPC64_REL16_HA:
case R_PPC64_REL16_HI:		case R_PPC64_REL16_HI:
case R_PPC64_REL32:		case R_PPC64_REL32:
case R_PPC64_REL64:		case R_PPC64_REL64:
		case R_PPC64_PCREL34:
return R_PC;		return R_PC;
case R_PPC64_GOT_TLSGD16:		case R_PPC64_GOT_TLSGD16:
case R_PPC64_GOT_TLSGD16_HA:		case R_PPC64_GOT_TLSGD16_HA:
case R_PPC64_GOT_TLSGD16_HI:		case R_PPC64_GOT_TLSGD16_HI:
case R_PPC64_GOT_TLSGD16_LO:		case R_PPC64_GOT_TLSGD16_LO:
return R_TLSGD_GOT;		return R_TLSGD_GOT;
case R_PPC64_GOT_TLSLD16:		case R_PPC64_GOT_TLSLD16:
case R_PPC64_GOT_TLSLD16_HA:		case R_PPC64_GOT_TLSLD16_HA:
▲ Show 20 Lines • Show All 300 Lines • ▼ Show 20 Lines	case R_PPC64_REL24: {
checkInt(loc, val, 26, rel);		checkInt(loc, val, 26, rel);
checkAlignment(loc, val, 4, rel);		checkAlignment(loc, val, 4, rel);
write32(loc, (read32(loc) & ~mask) \| (val & mask));		write32(loc, (read32(loc) & ~mask) \| (val & mask));
break;		break;
}		}
case R_PPC64_DTPREL64:		case R_PPC64_DTPREL64:
write64(loc, val - dynamicThreadPointerOffset);		write64(loc, val - dynamicThreadPointerOffset);
break;		break;
		case R_PPC64_PCREL34: {
		uint64_t si0Mask = 0x00000003FFFF0000;
		MaskRayUnsubmitted Not Done Reply Inline Actions Nit: Lower-case hexadecimals are more common. (I know that there are a few places in this file where upper-case hexadecimals are inconsistently used) MaskRay: Nit: Lower-case hexadecimals are more common. (I know that there are a few places in this file…
		uint64_t si1Mask = 0x000000000000FFFF;
		uint64_t fullMask = 0x0003FFFF0000FFFF;
		checkInt(loc, val, 34, rel);

		uint64_t si0 = (val & si0Mask) << 16;
		MaskRayUnsubmitted Not Done Reply Inline Actions si0 and si1 are only used once. Consider inlining them. MaskRay: si0 and si1 are only used once. Consider inlining them.
		uint64_t si1 = (val & si1Mask);
		uint64_t instr = readPrefixedInstruction(loc) & ~fullMask;
		sfertileUnsubmitted Not Done Reply Inline Actions I've got a patch which removes the masking of the instruction bits on the other relocations types we were doing it for (D81106). My understanding is that for REL ELF archs the addend is encoded into the bits to be relocated, and for RELA ELF archs the bits to be relocated are zeros, with the addend in the relocation. The extra masking doesn't hurt anything functionally but it can lead to misunderstandings on what the bits to be relocated should contain so I think its important to be precise and omit the extra masking despite it not hurting anything. sfertile: I've got a patch which removes the masking of the instruction bits on the other relocations…
		MaskRayUnsubmitted Not Done Reply Inline Actions Even if powerpc is a RELA target, if it is easy to drop the requirement, I hope we can make some efforts making that work. See D80496 (-z rel) MaskRay: Even if powerpc is a RELA target, if it is easy to drop the requirement, I hope we can make…
		stefanpAuthorUnsubmitted Not Done Reply Inline Actions I do see what you mean. The way I look at this is that we have 3 options: We just ignore it and assume that the zeros are there. This is what you are suggesting. However, my worry is that this way we could be silently producing bad code. Basically if there is a garbage bit in there we end up with an "or" of that bad bit and a valid offset and we sometimes end up with a bad offset. We mask it out. This is what I did. It basically hides the problem. If there are bad bits they get zeroed out and we end up with a valid instruction. We will never know that the initial encoding is wrong. We assert on those bits. This means that if there are non-zero bits in the encoding we don't even link the code. I don't want to do this last one mainly because GCC doesn't do it. I know, it is not necessarily a valid reason to do this. I had originally picked option 2 because that is what GCC did. It produces the correct output despite the garbage I put in there. Again, just because GCC does this it doesn't mean we should. With respect to the second comment: At this point I'm not going to try to change this to support REL yet. I'm just looking at RELA. If we want to add that support I would say doing it in a completely different patch. stefanp: I do see what you mean. The way I look at this is that we have 3 options: 1) We just ignore it…
		MaskRayUnsubmitted Not Done Reply Inline Actions I changed my mind on D81106. We mask it out. This is what I did. It basically hides the problem. If there are bad bits they get zeroed out and we end up with a valid instruction. We will never know that the initial encoding is wrong. This one looks good to me. Why does it "hide the problem"? powerpc{,64} are RELA targets. For a RELA target, the implicit addend should not matter. FWIW, AArch64.cpp marks out the bits as well. MaskRay: I changed my mind on D81106. > 2. We mask it out. This is what I did. It basically hides the…
		sfertileUnsubmitted Not Done Reply Inline Actions FWIW, AArch64.cpp marks out the bits as well. AARCH64 is inconsistent: It masks to zero out the bits in `write32AArch64Addr`, but doesn't in `or32le`. static void or32le(uint8_t p, int32_t v) { write32le(p, read32le(p) \| v); } With respect to D80496 --> I'm not sure how this is relevant. IIUC the option is to use REL instead of RELA for the dynamic relocations. ie LLDs output, the masking we are talking about here is in relation to the linkers input. I had originally picked option 2 because that is what GCC did. It produces the correct output despite the garbage I put in there. Again, just because GCC does this it doesn't mean we should. I had looked at what LLVM outputs, what gas produces for a bunch of inputs (since I can't look at gas source) and the fact that the AARCH64 target doesn't mask out the bits to be relocated (in `or32le`) and though the best way to make this consistent was to omit the masking. I think we should be consistent both for all input relocations but also across targets. The reason I think its important to omit the masking (if its valid to omit) is precisely for the reason you pick option 2. The wrong input (ie with the addend encoded into the instruction) produced the right output which made you think the addend had to be there. The linker being 'nice' lead to us not catching the improper LLVM codegen. If the linker had been stricter we would have realized the problem earlier. A lot of these minor details aren't well documented and the tools implementations essentially becomes the documentation. If it is supposed to work with garbage in the bits then I am all for consistently masking anywhere we need a wider read then the bits to be relocated: but I'd like to see something concrete that suggest it is intended to work that way. sfertile:* > FWIW, AArch64.cpp marks out the bits as well. AARCH64 is inconsistent: It masks to zero out…
		nemanjaiUnsubmitted Not Done Reply Inline Actions I too am in favour of masking out the bits. This is what `ld.bfd` does. It doesn't hurt performance. It is conservatively correct with minimal overhead. nemanjai: I too am in favour of masking out the bits. This is what `ld.bfd` does. It doesn't hurt…
		stefanpAuthorUnsubmitted Not Done Reply Inline Actions By "hide the problem" I'm basically saying that there could be a compiler that is setting garbage bits into where the relocation is supposed to go. If that's the case we won't know about it because the linker is nice enough to fix it for us. It's a bug that is not detected. I have not been able to find a place that specified whether or not the relocation bits need to be zero or not when we are using RELA. What do people think of the following compromise: A warning or a debug message. We do compile everything but if we find that `readPrefixedInstruction(loc) & ~fullMask != readPrefixedInstruction(loc)` we emit a warning (or debug) saying that we have found garbage bits. We can still link everything that LD compiles but we notify potential problems. stefanp: By "hide the problem" I'm basically saying that there could be a compiler that is setting…
		sfertileUnsubmitted Not Done Reply Inline Actions I have not been able to find a place that specified whether or not the relocation bits need to be zero That's because nothing does specify it. After talking with Alan I have realized i am wrong on this: we can't rely on the bits being zeroed out. The current gas/LLVM behavior of zeroing out the bits isn't required and we should keep masking off any bits to be relocated when the read/write is bigger then the bits being relocated. sfertile: > I have not been able to find a place that specified whether or not the relocation bits need…
		stefanpAuthorUnsubmitted Done Reply Inline Actions Ok, in that case I will leave this code as-is. stefanp: Ok, in that case I will leave this code as-is.

		MaskRayUnsubmitted Not Done Reply Inline Actions This empty line can be deleted MaskRay: This empty line can be deleted
		writePrefixedInstruction(loc, instr \| si0 \| si1);
		break;
		}
default:		default:
llvm_unreachable("unknown relocation");		llvm_unreachable("unknown relocation");
}		}
}		}

bool PPC64::needsThunk(RelExpr expr, RelType type, const InputFile *file,		bool PPC64::needsThunk(RelExpr expr, RelType type, const InputFile *file,
uint64_t branchAddr, const Symbol &s, int64_t a) const {		uint64_t branchAddr, const Symbol &s, int64_t a) const {
if (type != R_PPC64_REL14 && type != R_PPC64_REL24)		if (type != R_PPC64_REL14 && type != R_PPC64_REL24)
▲ Show 20 Lines • Show All 199 Lines • Show Last 20 Lines

lld/test/ELF/ppc64-pcrel34-reloc.s

This file was added.

				# REQUIRES: ppc

				# RUN: llvm-mc -filetype=obj -triple=powerpc64le-unknown-linux %s -o %t.o
				MaskRayUnsubmitted Done Reply Inline Actions Delete `-unknown-linux` MaskRay: Delete `-unknown-linux`
				# RUN: ld.lld %t.o -o %t
				# RUN: llvm-readelf -s %t \| FileCheck %s --check-prefix=SYMBOL
				# RUN: llvm-objdump -d --no-show-raw-insn --mcpu=future %t \| FileCheck %s

				# RUN: llvm-mc -filetype=obj -triple=powerpc64-unknown-linux %s -o %t.o
				# RUN: ld.lld %t.o -o %t
				# RUN: llvm-readelf -s %t \| FileCheck %s --check-prefix=SYMBOL
				# RUN: llvm-objdump -d --no-show-raw-insn --mcpu=future %t \| FileCheck %s


				.text
				.globl _start
				_start:
				.section .GLOB_INT_PCREL,"ax",@progbits
				plwa 3, glob_int@PCREL(0), 1
				blr
				glob_int:
				.long 0 # 0x0
				MaskRayUnsubmitted Done Reply Inline Actions `# 0x0` is unnecessary ditto below CHECK lines before the code are more common. (A few ppc64 tests may not follow the convention) MaskRay: `# 0x0` is unnecessary ditto below CHECK lines before the code are more common. (A few ppc64…
				MaskRayUnsubmitted Not Done Reply Inline Actions Instead of a `.section`, it is better using a label. You'll then able to use `CHECK-NEXT`: # CHECK: <foo>: # CHECK-NEXT: plwa 3, 12(0), 1 MaskRay: Instead of a `.section`, it is better using a label. You'll then able to use `CHECK-NEXT`: ```…
				.size glob_int, 4

				# CHECK-LABEL: Disassembly of section .GLOB_INT_PCREL
				# CHECK: 10010180: plwa 3, 12(0), 1
				# SYMBOL: 1001018c 4 NOTYPE LOCAL DEFAULT 2 glob_int

				.section .GLOB_INT_PCREL_OFFSET,"ax",@progbits
				plwa 3, glob_int8@PCREL+4(0), 1
				blr
				glob_int8:
				.long 0 # 0x0
				.size glob_int8, 8

				# CHECK-LABEL: Disassembly of section .GLOB_INT_PCREL_OFFSET
				# CHECK: 100101c0: plwa 3, 16(0), 1
				# SYMBOL: 100101cc 8 NOTYPE LOCAL DEFAULT 3 glob_int8

				.section .GLOB_INT_PCREL_BIGOFFSET,"ax",@progbits
				plwa 3, glob_int8_big@PCREL+4(0), 1
				blr
				.space 500000
				glob_int8_big:
				.long 0 # 0x0
				.size glob_int8_big, 8


				# CHECK-LABEL: Disassembly of section .GLOB_INT_PCREL_BIGOFFSET
				# CHECK: 10010200: plwa 3, 500016(0), 1
				# SYMBOL: 1008a32c 8 NOTYPE LOCAL DEFAULT 4 glob_int8_big

				MaskRayUnsubmitted Done Reply Inline Actions Delete trailing empty lines MaskRay: Delete trailing empty lines