Thu, Jan 23
Switch from explicit iterator to recently-added range-based for loop interface. Ok to merge?
Wed, Jan 22
Pass correct target to shouldForceRelocation (although unused).
Addressed review comments.
Mon, Jan 20
Never mind, I see why this is done: we need to respect the ordering, so skipping the SC misses the release.
Tue, Jan 14
Rebased and addressed review comments.
Oh, yes, that was silly.
I think this is good now (and definitely better now that the big comment has moved outside of the let).
Tue, Jan 7
Simplified to avoid duplicating logic.
Perhaps this helps. Let's take:
Yes, FK_IsPCRel isn't really accurate for this, and we have talked in the past about maybe teaching LLVM about indirection in fixups, but the approach that's here has proved sufficient in all the cases we can come up with, and has the advantage of keeping this RISC-V-specific fixup internal to the backend.
If you look at the implementation of getPCRelHiFixup, the fixup returned is normally in the same fragment as the AUIPC symbol, which is what this code was relying on. However, the one case where it isn't is the special case of being at the end of a fragment, where getPCRelHiFixup gets the fixup from the *next* fragment *at offset 0*. This logic therefore needs to be mirrored in evaluatePCRelLo so that the two agree on which fragment they are talking about for the AUIPC fixup. The issue arises because .option (and other directives) are delayed until the next code/data, but emitting a symbol does not flush that, so the local symbols end up at the end of the previous fragment.
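A hypothetical sketch of the troublesome input shape (directive, label, and symbol names are illustrative, not taken from the actual test case):

```asm
# The .option is deferred until the next code/data is emitted, and
# emitting a label does not flush it, so .Lpcrel_hi0 attaches to the
# END of the previous fragment, while the AUIPC (and its %pcrel_hi
# fixup) lands at OFFSET 0 of the NEXT fragment.
        .option norelax
.Lpcrel_hi0:
        auipc   a0, %pcrel_hi(sym)
        addi    a0, a0, %pcrel_lo(.Lpcrel_hi0)
```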
Ping; it would be good to get this in before 10.0 branches.
Lewis, I believe this should now be marked abandoned in light of the other fixes?
Wed, Jan 1
I am still of the view that we should support rewriting the instruction stream in the linker when necessary like BFD does. We need to do this to be able to link in GCC-compiled code, including libraries. It is a very simple thing to do with a patch available that provides consistency between the GNU world and the LLVM world.
Tue, Dec 31
For calls, whilst it differs from BFD, it can work just fine, as it's what AArch64 does. For data references, however, this is highly inappropriate. They *need* to resolve to 0, as *real world code* relies on that, and it's what happens on every architecture out there. Code generated by both GCC and LLVM uses PC-relative relocations to refer to external weak symbols. We can argue all we like about whether that's good or not, but at the end of the day *that is what happens* and we need to not break that.

Non-CALL(_PLT) PC-relative relocations *must not* be mislinked like this. Either do the same rewriting as in John's patch that BFD does, or produce a suitable error, but it is not acceptable to silently mis-link this code.

And I would really like LLD to support linking GCC-produced code; we should not be going off and doing our own incompatible thing. Maybe you don't want to support gcc -fuse-ld=lld, but what about when using Clang and LLD and you're statically linking against a library compiled by GCC? That means coordination; either support what GCC produces (which a patch already exists for), or convince GCC to change.
Dec 23 2019
I would personally rather this remain unindented, much like we do with namespace llvm etc.
Nov 21 2019
I think the fact that many runtime linkers out there don't handle it correctly is a pretty compelling reason in and of itself (but I agree that, being a new ABI, it should avoid introducing more copy relocations). The fact that GCC relies on this behaviour (which glibc and bfd correctly implement) but doesn't document it anywhere other than in the compiler source itself is worrying, since it leads to the situation we have now where binaries built by GCC+bfd for other systems (such as FreeBSD) may not run correctly, as they will copy garbage from early on in the mapping.
Nov 19 2019
I would suggest stepping back and looking at the whole picture. TLS is not special here; for normal globals, both code models also rely on copy relocations, and there are likely to be many more of those. If you think TLS copy relocations are bad then you should also be against using HI/LO and PCREL_HI/LO for extern symbols and require those to be indirected through the GOT.
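For concreteness, the two access styles being compared look roughly like this (a sketch of the psABI-style sequences; symbol and label names are illustrative):

```asm
# PC-relative access: the symbol's address is materialised directly,
# so if `sym` is actually defined in a shared library the static
# linker must create a copy relocation in the executable.
.Lpcrel_hi0:
        auipc   a0, %pcrel_hi(sym)
        lw      a1, %pcrel_lo(.Lpcrel_hi0)(a0)

# GOT-indirected access: the address is loaded from the GOT, so the
# dynamic linker can point it anywhere and no copy relocation is
# needed.
.Lpcrel_hi1:
        auipc   a0, %got_pcrel_hi(sym)
        ld      a0, %pcrel_lo(.Lpcrel_hi1)(a0)
        lw      a1, 0(a0)
```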
What's strange is that TLS GD(/LD)/IE *don't* have linker relaxations currently even in BFD (and no R_RISCV_RELAX relocations are emitted since both compilers know it's not currently a relaxable sequence) despite RISC-V's love of linker relaxations for other things, so the model chosen at compile-time is set in stone. The only exception is that a 32-bit LE offset gets relaxed into a 12-bit LE offset when possible, but that's not changing the model. It really feels like they should have gone with IE and defined linker relaxations (something that even other architectures have here, albeit they pad with NOPs rather than deleting code) instead of this approach, but alas that's what we have.
Nov 18 2019
I would agree that, for other architectures, not supporting copy relocations for TLS makes sense, and a world without copy relocations would be good. However, at the end of the day, this is what the RISC-V toolchains out there currently rely upon, and it would be very sad to force an incompatibility, especially since -fPIC introduces extra indirection in many cases where it isn't needed, in particular because code generation has to assume symbol preemption.

It does seem, at least, that -fPIE yields the saner IE model again for both LLVM and GCC... which raises the question of why, as PIE vs non-PIE makes absolutely zero difference to the TLS addresses and semantics. I haven't yet found a discussion as to why this decision was made; the comment in the GCC source was the only reference I could even find that acknowledged this would require copy relocations.

Ultimately, Clang+LLD can't currently be used to build FreeBSD as a result of this (nor can GCC+LLD), and telling people they can't use the default compiler mode that has always worked doesn't seem like the right answer; either LLVM and GCC need to stop forcing TLS copy relocations for position-dependent executables, or we need to support it here in LLD (or both...).
Note that, for both GCC and LLVM, the default model used for TLS variables in position-dependent code is LE even if extern, relying on TLS copy relocations. GCC explicitly mentions this in gcc/config/riscv/riscv.c with /* Since we support TLS copy relocs, non-PIC TLS accesses may all use LE. */, and LLVM says // Non-PIC TLS lowering should always use the LocalExec model. (although perhaps that should mention *why* this is ok). Hence this is needed in order to link some real-world code.
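For reference, the LE sequence emitted for such a variable looks roughly like this (a sketch of the psABI-style sequence; the symbol name is illustrative). Because the tp-relative offset is fixed at static link time, an extern variable that turns out to live in a shared object forces the linker to carve out space in the executable's TLS block and emit a TLS copy relocation:

```asm
# Local Exec access: the offset of `tls_sym` from the thread pointer
# (tp) is baked into the instructions at static link time.
        lui     a0, %tprel_hi(tls_sym)
        add     a0, a0, tp, %tprel_add(tls_sym)
        lw      a1, %tprel_lo(tls_sym)(a0)
```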
Yes, that looks rather better. Especially SET6, which was previously an ADD6 (although that aspect was likely harmless, as nothing sensible would put something there just to be clobbered by a SET6).
This broke -DBUILD_SHARED_LIBS=ON for me; lld::errs() lives in Common, but Core has also been modified to use it, and Common depends on Core, not vice versa:
R_MIPS_JALR really should not be emitted here; its semantics are a hint saying that you are making a direct call to the referenced symbol (albeit by indirecting through a loaded GOT entry), but this is an indirect call. Interestingly, Local Dynamic is the only case where LLVM wants to emit the relocation; the three other models all leave it out, as does a normal global. This is because it happens to have the same structure as a direct call in the SelectionDAG representation, with the second operand for the instruction defining $t9 being a GlobalAddressSDNode, just this time an MO_DTPREL_LO rather than the expected MO_GOT_CALL (%call16)/MO_CALL_LO16 (%call_lo).
Oct 21 2019
I was not aware of that aside in the spec, which does indeed suggest this specific case should sign-extend, and I guess gives us a reason why this isn't a slippery slope. However, even if this is the case, I still feel code shouldn't be relying on this, simply because it's unclear (and perhaps surprising to those who don't know this particular note from the spec). One might perfectly reasonably assume that both signed and unsigned i32 types (in the original source) are zero-extended, or (probably the default position of people) that the signed type is sign-extended and the unsigned type is zero-extended (which would require passing the extension information through via the asm call instruction's operands).
Oct 19 2019
So, I'm not sure whether this is actually something we want to make work or not. You can already end up with both inconsistencies between compilers and inconsistencies within a compiler for how 16-bit values get represented when passed as register operands, and the fact that we're going against the grain and making i32 an illegal type just makes that also apply to i32; it's nothing new, just more widespread. I wrote a bunch of stuff exploring how this is all already broken for i16 over on the corresponding FreeBSD revision that found this issue (https://reviews.freebsd.org/D22084#482766).

Trying to support this sounds like opening a can of worms, and it seems likely that we will still end up with subtle differences, ones which might rear their heads only once there is sufficient inlining (and perhaps due to LTO), removing the sanitisation applied by the calling convention. Moreover, only guaranteeing GCC compatibility (and self-consistency) for i32 and iPTR (whatever that may be) across architectures seems a bit arbitrary. Yes, i32 is a bit special due to C's integer promotion rules, but it feels like a slippery slope towards standardising on i16 and i8 too.

Yes, I went and wrote this patch, but having further explored the issue and thought about it, my inclination is that we should just tell people not to do this, and put in the explicit sign-extending or zero-extending casts they want in order to get an XLEN-sized value. Especially since there is not likely to be much RISC-V inline assembly out there at the moment that relies on this. What do others think?

Perhaps Clang could gain more than the current warn_asm_mismatched_size_modifier (which is currently about whether the register you're getting due to your modifiers matches what your input is, e.g. if the input gets promoted to an i32 on AArch64 you should be using a W register with %w0 rather than %0; it's not warning about the promotion itself).
Oct 18 2019
Do we need separate schedule information for compressed and uncompressed instructions? I would assume that any sensible RVC implementation would decode the two to exactly the same control logic (other than for PC incrementing and the original instruction for mtval).
I wonder whether, instead of putting all the scheduling resource information as part of the instruction definition, we should be doing something like we do with patterns, i.e. declaring them separately (either in each RISCVInstrInfoX.td, or in RISCVSchedule.td to keep scheduling completely separate from encoding and codegen). For example (formatting aside):
If you have let UnsupportedFeatures = [IsRV32] (or similarly for RV64) in your model definition, you can avoid needing to list a bunch of scheduler resources with let Unsupported = 1;.
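A hypothetical sketch of what that looks like in a .td file (model name and field values made up for illustration):

```tablegen
// Hypothetical RV64 model: instructions predicated on IsRV32 are marked
// unsupported wholesale, so their SchedWrites need no individual
// "let Unsupported = 1;" entries.
def HypotheticalRV64Model : SchedMachineModel {
  let IssueWidth = 1;
  let MicroOpBufferSize = 0;
  let UnsupportedFeatures = [IsRV32];
}
```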
Oct 8 2019
I wonder whether it would be nicer to add CC_RISCV_FastCC as a function, much like CC_RISCV, instead, to match the current code style.
Rebased, added explicit no-warning test, clang-format'ed, updated assertion message.