- User Since
- Nov 23 2012, 10:16 AM (282 w, 2 d)
Fri, Apr 6
Can you make sure that we handle the older ARM versions correctly as well, i.e. v4, v5 and v6? I take it we still have test cases for the arm <-> thumb transition? That's the one part of the triple logic that is really non-trivial.
Fri, Mar 30
The "struct object" is an implementation detail of the unwind implementation. You are guaranteed historically to get at least 8 longs / 8 pointers for internal use statically allocated in each object. What is stored inside is up to the unwind implementation.
Wed, Mar 28
GCC supports -mbss-plt to get the legacy behavior. Not sure if anyone actually uses it though.
Tue, Mar 27
Given that some people like to post-process assembler files, using the section symbol directly is a bad idea. Adding the local symbols is fine.
Mon, Mar 26
Oh, we certainly should never be hitting an assertion on front-end flags. As such, there is a problem to fix here. I still maintain that the combination of flags is nonsense, so the question is:
Mar 23 2018
It should be kept in mind that secure PLT is desirable for certain cases with non-position independent code as well. Even in static binaries it can be desirable... But that is for a follow-up patch.
IMO we should explicitly error out. That combination is nonsense to me. Creating useless JSON database fragments is not an improvement.
Mar 5 2018
The difference is that modsi3 etc are all paired instructions. A backend should not be lowering to one of them if a real division instruction exists and it should be consistent in the lowering.
Feb 21 2018
ARM and x86 implement different chars, don't they?
Feb 19 2018
We tried to keep the condition simple. I.e. does the compiler on any of those platforms ever use the libcall? If not, it is IMO not worth the complexity.
Feb 12 2018
Please stop adding complexity to doctor around the symptoms. There are two real fixes here and this change doesn't help with either:
(1) Emit cross-section pointers as indirect. This increases the binary size, but otherwise ensures that any linker can create read-only .eh_frame on MIPS.
(2) Teach lld on MIPS to properly reassemble the DWARF instructions, similar to what GNU ld can do. The latter is a bit stupid and needs a good kick to work properly, but this is the correct approach forward.
This is not acceptable. If anything, the encoding should be switched to indirect, but that should already be the case.
I really don't like ignoring options that are supposed to provide actual functionality. Most of the other options are for pointless fine tuning and workarounds for broken gcc behavior in ancient versions.
Feb 8 2018
Feb 5 2018
I really, really dislike this patch. It is using very blunt force to work around a GCC bug. The comment is too verbose as well. Please try the following change from NetBSD instead:
Jan 19 2018
Good enough for me.
Jan 18 2018
Do you see the comment just following the code? The patch completely violates that basic design principle. It would be perfectly sensible to hard-code a list of dumb terminals and explicitly default to no colors for them. The reverse (hard-coding a list and assuming it is fine for everything else) is not.
That's no excuse for making the situation even worse.
I completely disagree with this approach. A lot of GNU tools (including GCC) are completely broken. We shouldn't follow them. There are a lot more terminals around than just "dumb", "xterm" and "linux". It is completely unacceptable to just assume ANSI escape sequences work. If Android doesn't ship a usable terminfo implementation, I consider that an Android bug. Wouldn't be the first portability nightmare with Android.
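The denylist idea from the comments above (default to no colors for known-dumb terminals, never assume colors for everything else) could look roughly like this. This is only a sketch: the function name `term_is_known_dumb` and the list entries are hypothetical, and a `false` result means "ask terminfo", not "ANSI escapes are safe".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical denylist; the entries are illustrative assumptions. */
static const char *const no_color_terms[] = { "dumb", "unknown" };

/*
 * Returns true if TERM is unset, empty, or on the denylist, in which
 * case colors default to off. A false result only means the terminal
 * is not known to be dumb; the caller still has to consult terminfo.
 */
bool term_is_known_dumb(const char *term)
{
    size_t n = sizeof(no_color_terms) / sizeof(no_color_terms[0]);

    if (term == NULL || term[0] == '\0')
        return true;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(term, no_color_terms[i]) == 0)
            return true;
    }
    return false;
}
```

A caller would typically pass `getenv("TERM")` and fall through to the terminfo lookup when the function returns false.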
Jan 11 2018
Jan 7 2018
This works in 32-bit mode as well? I'm surprised.
Dec 14 2017
I'm not really a fan of linking libutil into all binaries. Why is this code using forkpty in the first place and not posix_openpt/grantpt?
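For reference, a minimal sketch of opening a pty master with the POSIX calls instead of forkpty, so no libutil is needed. The helper name `open_pty_master` is hypothetical; the fork and slave-side setup that forkpty also performs are left out here.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Opens a pty master and returns its descriptor, or -1 on error.
 * On success *slave_name points to the slave device path, which is
 * storage owned by ptsname() and may be overwritten by later calls.
 */
int open_pty_master(const char **slave_name)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);

    if (master == -1)
        return -1;
    /* Fix up ownership/permissions of the slave, then unlock it. */
    if (grantpt(master) == -1 || unlockpt(master) == -1) {
        close(master);
        return -1;
    }
    *slave_name = ptsname(master);
    if (*slave_name == NULL) {
        close(master);
        return -1;
    }
    return master;
}
```

The child process would then open the returned slave path itself after fork.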
Dec 5 2017
Dec 1 2017
Instead of computing and storing the modulus directly, it is likely better to precompute the inverse and use that to improve the performance of the operation in the first place. Consider using fast_remainder32 and associated functions.
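To illustrate the "precompute the inverse" idea, here is a sketch of a remainder by a fixed 32-bit divisor using a precomputed 64-bit reciprocal. Note that this is the Lemire-style fastmod scheme swapped in for illustration, not NetBSD's fast_remainder32; the names `remainder32_prepare` and `remainder32` are hypothetical, and `__uint128_t` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Precompute M = floor((2^64 - 1) / d) + 1 for a divisor d >= 1.
 * For d == 1 the value wraps to 0, which still yields the correct
 * remainder (always 0) below.
 */
uint64_t remainder32_prepare(uint32_t d)
{
    assert(d != 0);
    return UINT64_C(0xFFFFFFFFFFFFFFFF) / d + 1;
}

/*
 * Compute a % d using the precomputed M: the low 64 bits of M * a
 * hold the fractional part of a / d, and scaling that by d gives
 * the remainder in the upper 64 bits of the 128-bit product.
 */
uint32_t remainder32(uint32_t a, uint64_t M, uint32_t d)
{
    uint64_t lowbits = M * a;

    return (uint32_t)(((__uint128_t)lowbits * d) >> 64);
}
```

The prepare step is done once per divisor; every later remainder costs two multiplications and no division.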
Nov 29 2017
So the next steps if you have the time would IMO be:
Nov 21 2017
Split the verbose conditional register names into a separate function. We likely want to remove them going forward as they are a specific feature of the Darwin assembler and not widely supported.
Nov 19 2017
The public interface for obtaining the TLS storage is reading the DTV vector of a thread in combination with dl_iterate_phdr to find the size of the TLS block of a specific module. That gives you all that you need to know. It is important to keep in mind that the vector can be initialized lazily, so __tls_get_addr and friends will have to be intercepted to update the global view.
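The dl_iterate_phdr half of that can be sketched as below: walk the loaded modules and report the memory size of the PT_TLS segment of the one matching a given name. This assumes a glibc- or BSD-style `<link.h>`; the helper names and the lookup-by-name convention are hypothetical, and reading the DTV itself is platform specific and not shown.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <string.h>

struct tls_query {
    const char *name;   /* module name as reported in dlpi_name */
    size_t size;        /* p_memsz of its PT_TLS segment, 0 if none */
};

/* Callback for dl_iterate_phdr: stop at the first module matching q->name. */
static int find_tls_segment(struct dl_phdr_info *info, size_t size, void *data)
{
    struct tls_query *q = data;

    (void)size;
    if (info->dlpi_name == NULL || strcmp(info->dlpi_name, q->name) != 0)
        return 0;   /* keep iterating */
    for (size_t i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_TLS)
            q->size = info->dlpi_phdr[i].p_memsz;
    }
    return 1;       /* found the module, stop */
}

/* Returns the TLS block size of the named module, or 0 if it has none. */
size_t tls_size_of_module(const char *name)
{
    struct tls_query q = { name, 0 };

    dl_iterate_phdr(find_tls_segment, &q);
    return q.size;
}
```

On glibc the main executable is usually reported with an empty name, so `tls_size_of_module("")` would query it.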
Nov 17 2017
Nov 16 2017
Nov 13 2017
I really dislike this direction. fallocate can double the amount of disk IO and increase cache thrashing, especially when linking large programs with debug information. Keeping more things in memory doesn't sound like an actual improvement either. If the goal is really only to improve the diagnostics in tools, I think a better idea would be to figure out a good way to handle this from a SIGBUS handler based on the passed-in siginfo_t.
Nov 7 2017
No need for a custom container, just allocate the vector dynamically and free it when it becomes empty.
Nov 6 2017
No, __cxa_atexit will always reference the DSO handle. That exists even in the main executable.
Nov 3 2017
Is there any reason for keeping at_exit and __cxa_atexit handling merged? They are pretty much disjoint code paths, especially since the at_exit stack means that the real at_exit can be used.
Oct 27 2017
Oct 24 2017
Let me phrase it differently. What is this patch (and the matching backend PR) supposed to achieve? There are effectively two ways to get rid of PLT entries:
(1) Bind references locally. This is effectively what -Bsymbolic does and what is breaking the ELF interposition rules.
(2) Do an indirect call via the GOT. Requires knowing what an external symbol is, making it non-attractive for anything but LTO, since it will create performance issues for all non-local accesses (i.e. anything private).
Why again is this a good idea? This is an even worse hack than -Bsymbolic, the latter at least is visible in the ELF header without code inspection. This is breaking core premises of ELF.
Oct 23 2017
This is even worse. You can't new and then free(). Please follow the suggestion on just embedding the prefix directly, if desirable.
Even a full static binary will have a PLT when IFUNC is used. As such, a linker has to deal with conversion between direct and PLT branches anyway.
A PLT is used not only by PIC code. It is required for all dynamic entry points and that's not limited to PIC. It's not even limited to dynamically linked binaries. There is no support for the embedded ABIs as I said before. I'm going to stop responding since it is rather pointless now. My objection stands.
Oct 22 2017
A working linker is certainly expected to use the most efficient binding. The assembler (and by extension the compiler) can't tell. This applies in both directions -- in your example, the linker might need to translate the direct call into a PLT call, depending on the target. It all goes back to "use a working linker". So yes, at this point in time the existence of two different relocations for call instructions can be clearly seen as a historic artifact. This doesn't change anything of what I have been saying since the beginning. The PowerPC ABIs are heavily biased towards position independent code and this is just adding complications for no good reason.
I've looked at this in some detail now. I'm not exactly sure yet why it is broken. The patch seems quite wrong to me. DW_CFA_GNU_args_size should be applied only when unwinding a call instruction and in that regard, the commit message of the original change is quite correct. What I am still trying to understand is how the precise unwind frame disagrees with the unwinder.
I think we should special case Darwin and Windows and fall-back to LD_LIBRARY_PATH for the rest. Can't remember if there is a UNIX-like platform left where it doesn't work.
There is no layer of indirection here. The call gets resolved to the local symbol by a working linker.
On the contrary, I find little reason for adding the complexity here. That's even ignoring the question of whether the suggested check even covers the relocation models correctly.
The existing code works fine. What you have demonstrated so far is more an issue with lld and not a problem in the code. GNU ld is perfectly fine with translating the PLT references to direct calls. As such, I see no need for changes here.
Oct 20 2017
The patch is not acceptable in the current form. This includes fixing the memory leaks.
This is not about any operating system, but basic consistent behavior. Either do the canonicalisation or not. Doing it sometimes is just bogus. You've effectively implemented -fcanonical-system-headers=sometimes.
Oct 19 2017
The behavior of sometimes transforming the path and sometimes not is IMO completely unacceptable. I don't care if GCC created that bug first.
Oct 12 2017
Oct 4 2017
In that case: the short version is that there is no EABI support at the moment. That patch changes the behavior for SYSV ABI, it is not acceptable as such. The test case doesn't reflect the ABI variance either.
Let's start with the obvious: I have no idea why you need this code at all. NetBSD builds a mix of static, PIC and PIE code using GNU ld without hitting any unsupported relocations. As I said before, the normal ABI on PowerPC is mostly PIC by default. I wonder if the root of your problem is that you want EABI and not the SYSV ABI.
This doesn't look right to me at all. The normal SYSV ABI on PowerPC is effectively PIC, with a few edge cases. This really looks like a step backwards.
Sep 28 2017
Sep 20 2017
Well, the background for the use of the option in NetBSD is related to induced differences in reproducible builds. From that perspective, it is even worse to sometimes shorten the dependency and sometimes not.
Sep 17 2017
ninja is not the only consumer of this. For human inspection, it can be quite surprising to see symlinks resolved, i.e. /usr/include/machine on NetBSD. Build systems have different ideas on whether they want absolute resolved paths or not, so it seems like ninja should be able to cope with either format. This is a lossy transformation, so I'm somewhat reluctant to agree with it.
The comments at the very least are misleading. Resolving the path doesn't necessarily shorten the string at all, in many cases it will be longer. I'm really not sure if clang should resolve symlinks like this as it can be quite surprising.
Sep 15 2017
So what about targets that don't support subnormals? I'm moderately sure ARM falls into this category given the right phase of the moon.
Sep 13 2017
I can't really comment on the Linux interface, but we generally ignore the system call error in NetBSD where one is necessary. While aborting might be legal, it seems to be the worst possible way of dealing with questionable arguments.
Sep 12 2017
This version is fine with me. The only contentious part is whether it should be opt-in or opt-out for platforms, so getting this version in and revisiting the issue again later is OK from my perspective.
Sep 8 2017
Create a function that does check the guard variable and make sure that it has the correct visibility declaration and/or access.
Aug 29 2017
Aug 28 2017
Aug 22 2017
Aug 21 2017
Well, the libexecinfo one exists as fallback because gcc doesn't provide one.
Kamil, which unwind.h are you using? The outdated copy in libexecinfo.h or the modern one used by libunwind? I see little reason to cater to the bugs in the former...
Aug 18 2017
BTW, I recently spent some time slapping GNU ld in NetBSD into shape so that it can properly support read-only .eh_frame even on MIPS. You might want to look at adopting similar changes.
Aug 17 2017
divtc3 and friends.
Because PPC uses the TC variant.
Just assume that the full 32bit address space is available.
The primary reason for using mmap is not so much performance, but reduced memory footprint.
Aug 12 2017
@spatel: I don't see a reason why we can't (or shouldn't) try to do common-prefix elimination for the memcmp intrinsic. It certainly seems to be better to me to preserve the intrinsics in your case as they should be easier to reason about. That's kind of my question for here too -- why does the expansion allow better code?
What is the advantage of expanding the memcpy intrinsic in InstCombine vs doing it later in the target-specific code?
Aug 10 2017
Aug 9 2017
Aug 3 2017
I don't see any reason why zero-initialised constants should be emitted in BSS. I know that GCC does that and I just fixed bugs in that because it created wrong section flags for it. So yes, I'd prefer to revert this and fix the real problem.
I'm not sure I want anything like this unconditionally. It is going to waste quite a bit of space, i.e. 2KB per executable and shared library sums up.
Aug 1 2017
I had a long discussion with James about this on IRC without reaching a clear consensus. I consider forcing this behavior on all targets to be a major bug. It should be opt-in and opt-in only:
Jul 26 2017
Merging would be reasonable, yes.
Jul 25 2017
Warning for .ctor/.dtor use would IMO be completely bogus. They can easily be translated and they are kind of the LCD for "portable" assembler, i.e. much less problematic to deal with than plain .init/.fini segments.
Jul 23 2017
I don't really like this. The reason why -lm is added explicitly on many targets is because the C++ STL typically depends on it and that means for static linking and broken ELF linkers, it will be necessary to link against it explicitly.
There is also the question on whether any platform we have currently uses separate STL and ABI libraries and it is not clear whether the flag should handle both.
While more localized, this seems to be an even greater hack than adapting the JIT users to use AllocateRWXMemory.
Jul 21 2017
Jul 20 2017
Keep in mind that the SELinux case in libffi is not fork-safe. One important part LLVM needs to consider is whether it wants to enshrine the performance penalty of mprotect-after-commit in its APIs or not. The second part is whether platforms should aim to support hot-patchable JIT for multi-threaded environments or not. If the latter is not considered relevant, the API only needs to provide a function to allocate JIT-safe memory and a function to make it executable. If the latter is relevant, the current AllocateRWX is the interface you will end up with, one way or the other.