- User Since
- Nov 23 2012, 10:16 AM (263 w, 5 d)
Tue, Dec 5
Fri, Dec 1
Instead of computing and storing the modulus directly, it is likely better to precompute the inverse and use that to improve the performance of the operation in first place. Consider using fast_remainder32 and associated functions.
Wed, Nov 29
So the next steps if you have the time would IMO be:
Tue, Nov 21
Split into verbose conditional register names into a separate function. We likely want to remove them going forward as they are a specific feature of the Darwin assembler and not wildly supported.
Sun, Nov 19
The public interface for obtaining the TLS storage is the combination of reading the DTV vector of a thread in combination with dl_iterate_phdr to find the size of the TLS block of a specific module. That gives you all that you need to know. It is important to keep in mind that the vector can be initialized lazily, so __tls_get_addr and friends will have to be intercepted to update the global view.
Fri, Nov 17
Thu, Nov 16
Mon, Nov 13
I really dislike this direction. fallocate can double the amount of disk IO and increase cache trashing, especially when linking large programs with debug information. Keeping more things in memory doesn't sound like an actual improvement either. If the goal is really only to improve the diagnostics in tools, I think a better idea would be to figure out a good way to handle this from a SIGBUS handler based on the passed in siginfo_t.
Nov 7 2017
No need for a custom container, just allocate the vector dynamically and free it when it becomes empty.
Nov 6 2017
No, __cxa_atexit will always reference the DSO handle. That exists even in the main executable.
Nov 3 2017
Is there any reason why keeping at_exit and __cxa_atexit handling merged? They are pretty much disjunct code paths, especially since the at_exit stack means that the real at_exit can be used.
Oct 27 2017
Oct 24 2017
Let me phrase it differently. What is this patch (and the matching backend PR) supposed to achieve? There are effectively two ways to get rid of PLT entries:
(1) Bind references locally. This is effectively what -Bsymbolic does and what is breaking the ELF interposition rules.
(2) Do an indirect call via the GOT. Requires knowing what an external symbol is, making it non-attractive for anything but LTO, since it will create performance issues for all non-local accesses (i.e. anything private).
Why again is this a good idea? This is an even worse hack than -Bsymbolic, the latter at least is visible in ELF header without code inspection. This is breaking core premises of ELF.
Oct 23 2017
This is even worse. You can't new and then free(). Please follow the suggestion on just embedding the prefix directly, if desirable.
Even a full static binary will have a PLT when IFUNC is used. As such, a linker has to deal with conversion between direct and PLT branches anyway.
A PLT is used not only by PIC code. It is required for all dynamic entry points and that's not limited to PIC. It's not even limited to dynamically linked binaries. There is no support for the embedded ABIs as I said before. I'm going to stop responding since it is rather pointless now. My objection stands.
Oct 22 2017
A working linker is certainly expected to use the most efficient binding. The assembler (and by extension the compiler) can't tell. This applies in both directions -- in your example, the linker might need to translate the direct call into a PLT call, depending on the target. It all goes back to "use a working linker". So yes, at this point in time the existance of two different relocations for call instructions can be clearly seen as a historic artifact. This doesn't change anything of what I have been saying since the beginning. The PowerPC ABIs are heavily biases towards position independent code and this is just adding complications for no good reason.
I've looked at this in some detail now. I'm not exactly sure yet why it is broken. The patch seems quite wrong to me. DW_CFA_GNU_args_size should be applied only when unwinding a call instruction and that regard, the commit message of the original change is quite correct. What I am still trying to understand is how the precise unwind frame disagrees with the unwinder.
I think we should special case Darwin and Windows and fall-back to LD_LIBRARY_PATH for the rest. Can't remember if there is a UNIX-like platform left where it doesn't work.
There is no layer of indirection here. The call gets resolved to the local symbol by a working linker.
On the contrary, I find little reason for adding the complexity here. That's even ignoring the question of whether the suggested check even covers the relocation models correctly.
The existing code works fine. What you have demonstrated so far is more an issue with lld and not a problem in the code. GNU ld is perfectly fine with translating the PLT references to direct calls. As such, I see no need for changes here.
Oct 20 2017
The patch is not acceptable in the current form. This includes fixing the memory leaks.
This is not about any operating system, but basic consistent behavior. either do the canonicalisation or not. Doing it sometimes is just bogus. You've effectively implemented -fcanonical-system-headers=sometimes.
Oct 19 2017
The behavior of sometimes transforming the path and sometimes not is IMO completely unacceptable. I don't care if GCC created that bug first.
Oct 12 2017
Oct 4 2017
In that case: the short version is that there is no EABI support at the moment. That patch changes the behavior for SYSV ABI, it is not acceptable as such. The test case doesn't reflect the ABI variance either.
Let's start with the obvious: I have no idea why you need this code at all. NetBSD builds a mix of static, PIC and PIE code using GNU ld without hitting any unsupported relocations. As I said before, the normal ABI on PowerPC is mostly PIC by default. I wonder if the root of your problem is that you want EABI and not the SYSV ABI.
This doesn't look right to me at all. The normal SYSV ABI on PowerPC is effectively PIC, with a few edge cases. This really looks like a step backwards.
Sep 28 2017
Sep 20 2017
Well, the background for the use of the option in NetBSD is related to inducted differences in reproducable builds. From that perspective, it is even worth to sometimes shorten the dependency and sometimes not.
Sep 17 2017
ninja is not the only consumer of this. For human inspection, it can be quite surprising to see symlinks resolved, i.e. /usr/include/machine on NetBSD. Build systems have different ideas on whether they want absolute resolved paths or not, so it seems like ninja should be able to cope with either format. This is a lossy transformation, so I'm somewhat reluctant to agree with it.
The comments at the very least are misleading. Resolving the path doesn't necessary shorten the string at all, in many cases it will be longer. I'm really not sure if clang should resolve symlinks like this as it can be quite surprising.
Sep 15 2017
So what about targets that don't support subnormals? I'm moderately sure ARM falls into this category given the right phase of the moon.
Sep 13 2017
I can't really comment on the Linux interface, but we generally ignore the system call error in NetBSD where one is necessary. While aborting might be legal, it seems to be the worst possible way of dealing with questionable arguments.
Sep 12 2017
This version is fine with me. The only contentious part is whether it should be opt-in or opt-out for platforms, so getting this version in and revisiting the issue again later is OK from my perspective.
Sep 8 2017
Create a function that does check the guard variable and make sure that it has the correct visibility declaration and/or access.
Aug 29 2017
Aug 28 2017
Aug 22 2017
Aug 21 2017
Well, the libexecinfo one exists as fallback because gcc doesn't provide one.
Kamil, which unwind.h are you using? The outdated copy in libexecinfo.h or the modern one used by libunwind? I see little reason to cater to the bugs in the former...
Aug 18 2017
BTW, I recently spend some time slapping GNU ld in NetBSD into shape so that it can properly support read-only .eh_frame even on MIPS. You might want to look at adopting similar changes.
Aug 17 2017
divtc3 and friends.
Because PPC uses the TC variant.
Just assume that the full 32bit address space is available.
The primary reason for using mmap is not so much performance, but reduced memory foot print.
Aug 12 2017
@spatel: I don't see a reason why we can't (or shouldn't) try to do common-prefix elimination for the memcmp intrinsic. It certainly seems to be better to me to preserve the intrinsics in your case as they should be easier to reason about. That's kind of my question for here too -- why does the expansion allow better code?
What is the advantage of expanding the memcpy intrinsic in InstCombine vs doing it later in the target-specific code?
Aug 10 2017
Aug 9 2017
Aug 3 2017
I don't see any reason why zero-initialised constants should be emitted in BSS. I know that GCC does that and I just fixed bugs in that because created wrong section flags for it. So yes, I'd prefer to revert this and fix the real problem.
I'm not sure I want anything like this unconditionally. It is going to waste quite a bit of space, i.e. 2KB per executable and shared library sums up.
Aug 1 2017
I had a long discussion with James about this on IRC without reaching a clear consensus. I consider forcing this behavior on all targets to be a major bug. It should be opt-in and opt-in only:
Jul 26 2017
Merging would be reasonable, yes.
Jul 25 2017
Warning for .ctor/.dtor use would IMO be completely bogus. They can easily be translated and they are kind of the LCD for "portable" assembler, i.e. much less problematic to deal with than plain .init/.fini segments.
Jul 23 2017
I don't really like this. The reason why -lm is added explicitly on many targets is because the C++ STL typically depends on it and that means for static linking and broken ELF linkers, it will be necessary to link against it explicitly.
There is also the question on whether any platform we have currently uses separate STL and ABI libraries and it is not clear whether the flag should handle both.
While more localized, this seems to be an even greater hack than adopting the JIT users to use AllocateRWXMemory.
Jul 21 2017
Jul 20 2017
Keep in mind that the SELinux case in libffi is not fork-safe. One important part LLVM needs to consider is
whether it wants to enshrine the performance penalty of mprotect-after-commit in its APIs or not. The second part
is whether platforms should aim to support hot-patchable JIT for multi-threaded environments or not. If the latter is
not considered relevant, the API only needs to provide a function to allocate JIT-safe memory and a function to make
it executable. If the latter is relevant, the current AllocateRWX is the interface you will end up with, one way or the other.
Jul 19 2017
Jul 18 2017
i386: code requires three push instructions + call + potential stack cleanup.
x86_64: code requires three register loads + call
This doesn't make sense to me, both the existing and new code. off_t is a signed type.
Jul 16 2017
Please mark it as alias for -nopie, otherwise fine.
Jul 12 2017
What if the constexpr function is called with in a non-constexpr context, e.g. with a random global variable as argument?
Jul 11 2017
Strictly speaking, I think this comes from csh, but LGTM.
A better construct would be to compute the full length and change both it and Str in the loop. No point in the complex arithmetic. I generally do prefer this as it improves interaction with pipes etc.
Jul 10 2017
Jul 6 2017
If you want to go this way, rename them consistently and use a different prefix (e.g. AUXV_*) please.
LGTM, as Kamil said, we have the same change in pkgsrc for Illumos.
The bigger question for me is: what depends on the RTLD_GLOBAL behavior? I generally consider using it a design bug. I.e. what breaks if we just change to RTLD_LOCAL?
Jul 1 2017
Lock-free atomic operations are also signal safe. Your code is not. While I don't know whether all this functions are not required to be signal safe, the general assertion is certainly questionable.
Jun 30 2017
LGTM, but please cut down the test cases by hand. Most of the attributes and module flags should be unnecessary.
Jun 29 2017
As I said in the discussion about similar flags for GVN, I'm generally fine with. We should have a big fat warning in the Release Notes that all -fexperimental-* flags are exactly that -- temporary and not intended for long term consumption. I.e. "we can and will remove them whenever it creates the most havoc".