Add support for the 34bit relocation R_PPC64_PCREL34 for PC Relative in LLD.
Details
- Reviewers
nemanjai sfertile MaskRay ruiu hfinkel • espindola - Group Reviewers
Restricted Project - Commits
- rG3a55a2a97fd4: [LLD][PowerPC] Add support for R_PPC64_PCREL34
Diff Detail
Event Timeline
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
382 | Can I ask the rational on why we want to treat a prefixed instruction as a single 8-byte instruction as opposed to a 4-byte prefix and a 4-byte instruction? I'm not necessarily disagreeing we should treat them as a single entity, I just want to make sure we have though about what the best representation is considering all the new relocations to be added. | |
1012 | I've got a patch which removes the masking of the instruction bits on the other relocations types we were doing it for (D81106). My understanding is that for REL ELF archs the addend is encoded into the bits to be relocated, and for RELA ELF archs the bits to be relocated are zeros, with the addend in the relocation. The extra masking doesn't hurt anything functionally but it can lead to misunderstandings on what the bits to be relocated should contain so I think its important to be precise and omit the extra masking despite it not hurting anything. |
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
382 | Good question. I like to treat them as one because conceptually they are one instruction. So, in my mind I tend to think of them as 8 byte instructions.
For example in PPCInstrPrefix.tdwe have the base class for all prefixed instructions: // Top-level class for prefixed instructions. class PI<bits<6> pref, bits<6> opcode, dag OOL, dag IOL, string asmstr, InstrItinClass itin> : Instruction { field bits<64> Inst; field bits<64> SoftFail = 0; bit PCRel = 0; // Default value, set by isPCRel. let Size = 8; In the end the reason is purely conceptual from my perspective. I don't have a technical reason to treat them as one. | |
1012 | I do see what you mean. The way I look at this is that we have 3 options:
I had originally picked option 2 because that is what GCC did. It produces the correct output despite the garbage I put in there. Again, just because GCC does this it doesn't mean we should. With respect to the second comment: |
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
1012 | I changed my mind on D81106.
This one looks good to me. Why does it "hide the problem"? powerpc{,64} are RELA targets. For a RELA target, the implicit addend should not matter. FWIW, AArch64.cpp marks out the bits as well. |
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
382 | I can see why we think of them as a single instruction in the backend. The reason I asked though is that in the object file we have a 4-byte prefix and a 4-byte instruction not an 8-byte instruction (as you can see from how we have to shuffle in the read/writes), and in this case the bits being relocated are broken into bits in the prefix and bits in the instruction. (with the 2 fields being discontinuous) The prefix28 relocations was similar (high 12 in the prefix, lo 16 in the instruction and again the 2 fields are discontinuous). I haven't looked at all the new relocations though, which format does the paddi use? | |
1012 |
AARCH64 is inconsistent: It masks to zero out the bits in write32AArch64Addr, but doesn't in or32le. static void or32le(uint8_t *p, int32_t v) { write32le(p, read32le(p) | v); } With respect to D80496 --> I'm not sure how this is relevant. IIUC the option is to use REL instead of RELA for the dynamic relocations. ie LLDs output, the masking we are talking about here is in relation to the linkers input.
I had looked at what LLVM outputs, what gas produces for a bunch of inputs (since I can't look at gas source) and the fact that the AARCH64 target doesn't mask out the bits to be relocated (in or32le) and though the best way to make this consistent was to omit the masking. I think we should be consistent both for all input relocations but also across targets. The reason I think its important to omit the masking (if its valid to omit) is precisely for the reason you pick option 2. The wrong input (ie with the addend encoded into the instruction) produced the right output which made you think the addend had to be there. The linker being 'nice' lead to us not catching the improper LLVM codegen. If the linker had been stricter we would have realized the problem earlier. A lot of these minor details aren't well documented and the tools implementations essentially becomes the documentation. If it is supposed to work with garbage in the bits then I am all for consistently masking anywhere we need a wider read then the bits to be relocated: but I'd like to see something concrete that suggest it is intended to work that way. |
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
382 | paddi uses the same idea where it splits the 34 bit relocation bits into 18 bits in the prefix and 16 bits in the instruction. The two fields are discontinuous as you say. So, what you are saying is that you prefer to have this in a struct or class with two parts because we are going to be working with them separately anyway? | |
1012 | By "hide the problem" I'm basically saying that there could be a compiler that is setting garbage bits into where the relocation is supposed to go. If that's the case we won't know about it because the linker is nice enough to fix it for us. It's a bug that is not detected. I have not been able to find a place that specified whether or not the relocation bits need to be zero or not when we are using RELA. |
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
382 | Nothing so heavy weight. I was think just keeping them a 2 separate uint32_t and modifying each individually. Something like: checkInt(loc, val, 34, rel); uint32_t si0Mask = 0x3ffff; uint32_t si0 = (val >> 16) & si0Mask write32(loc, (read32(loc) & ~si0Mask) | si0); uint32_t si1Mask = 0xffff write32(loc + 4, (read(loc + 4) & ~si1Mask)| (val & si1mask)); If we didn't need to mask off the read then I think keeping them separate was slightly cleaner. With masking I think its about the same either way so treating it as a 8-byte unit with read and write helpers is probably the better choice. | |
1012 |
That's because nothing does specify it. After talking with Alan I have realized i am wrong on this: we can't rely on the bits being zeroed out. The current gas/LLVM behavior of zeroing out the bits isn't required and we should keep masking off any bits to be relocated when the read/write is bigger then the bits being relocated. |
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
1005 | Nit: Lower-case hexadecimals are more common. (I know that there are a few places in this file where upper-case hexadecimals are inconsistently used) | |
lld/test/ELF/ppc64-pcrel34-reloc.s | ||
20 ↗ | (On Diff #270488) | Instead of a .section, it is better using a label. You'll then able to use CHECK-NEXT: # CHECK: <foo>: # CHECK-NEXT: plwa 3, 12(0), 1 |
Do you want to add a test for overflow of a 34-bit value?
lld/ELF/Arch/PPC64.cpp | ||
---|---|---|
381 | This comments a out of date (ie load/store the pieces separately). If you mention its a 4 byte prefix followed by a 4 byte instruction I think the shifting we have to do for LE becomes pretty obvious. | |
lld/test/ELF/ppc64-reloc-pcrel34.s | ||
38 | Do you want to define an int of size 4 or 8? you use an lwa so I would expect a 4 byte int, and you use .long which allocates 4 bytes (I belive its the same as .int), but then give it a size of 8 and name it as int8? Ditto for glob_int8_big |
lld/test/ELF/ppc64-reloc-pcrel34.s | ||
---|---|---|
38 | Sorry, what I meant to do was read the last 4 bytes of an 8 byte int. |
Hi, your git commit does not contain Differential Revision:, so the revision was not automatically closed when you pushed the commit.
(
You can drop Reviewers: Subscribers: Tags: and the text Summary: from the git commit with the following script:
arcfilter () { arc amend git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F - }
Reviewed By: is considered important by some people. Please keep the tag. (I have updated my script to use --date=now (setting author date to committer date))
https://reviews.llvm.org/D80978 contains a git pre-push hook to automate this.)
This comments a out of date (ie load/store the pieces separately). If you mention its a 4 byte prefix followed by a 4 byte instruction I think the shifting we have to do for LE becomes pretty obvious.