- User Since
- Nov 23 2012, 10:16 AM (239 w, 2 d)
Thu, Jun 22
Do you need someone to commit this?
I don't think it is a good idea to make this function non-transitive.
It would be nice to make the stackprobesize a proper TLI attribute, so that OSes can decide how much guarding they are willing to guarantee. I think I would also like to have an option to inline the probing without the complexity of the function call. But both can be done as a separate step.
Wed, Jun 21
If the jump table is writable, you can just use the absolute pointers, PIC or no PIC. That would be the "block address" encoding.
Tue, Jun 20
If you want to create a stop-gap solution that bad, I would make it emit the jump table as writable for non-PIC && largeish code model. That's a much more contained change. It would be really better to work on the real fix and not add more hacks...
For some reason we are using large code model with PIC in the JIT and this is the only issue I see on x86-64 so far.
The x86_64 code is mixed: it is correct for non-PIC but large, it doesn't work correctly for PIC. That covers the cases most people have been interested in, especially since large model and PIC runs into various other issues too from what I remember.
At least the PPC change is definitely wrong. AArch64 should be wrong as well from what I discussed with Tim.
What about the old TODO item of extending powi to integer types and folding them together in the middle end?
Mon, Jun 19
Not parsers, encoders. Note that the escaping is not correct for control characters other than \n.
I don't disagree with you, but please see the referenced review for further details.
Sun, Jun 18
My suggestion would be to just use the YAML writer for now and leave a comment to make it clear that this is really JSON only.
Sat, Jun 17
Please see the discussion in D31992. This patch seems to go in the wrong direction.
As I said, I don't see the point in pretending we support float128 when the runtime doesn't contain the necessary pieces.
Wed, Jun 14
Tue, Jun 13
Sun, Jun 11
This fixes the problem in mbedtls, thanks.
Thu, Jun 8
Soft-float on the runtime environment part. I.e. non-trivial operations are lowered to library calls.
At the very least, missing test case.
Wed, Jun 7
Actually, for NetBSD we want to use -fomit-frame-pointer by default whenever the implicit -funwind-tables is also present. In general, that should be the justification for the default behavior: "Can I reliably unwind a binary with the available information?" Performance of stack unwinding is IMO secondary and users of sanitizers can disable it where necessary.
Tue, Jun 6
Given that W^X is becoming a lot more popular across systems including SELinux and other variants, I think it would be better to extend MemoryBlock to store separate pointers for W and X mappings. That avoids the complexity of storing the pointer directly in the allocation.
Thu, Jun 1
A small subset can be found by searching for LINKER_RPATH_FLAG in pkgsrc. A classic offender is Emacs. For more, I would have to systematically search.
Wed, May 31
sysroot is already handled in NetBSD.cpp line 118 or so.
Such knowledge is necessary anyway. There is enough software that wants to use the linker directly.
I'm against it. I consider this a strong bug on the LLD side and the behavior of clang correct.
Mon, May 29
May 21 2017
Strictly speaking, the glibc attribute violates the specification of const. I.e. we could turn a call for pthread_self() into a once() initialised static variable and that transform would be valid under the const rules. That clearly doesn't reflect the intention...
May 20 2017
May 19 2017
May 18 2017
I find the use of "must" at the very least inappropriate. If there was no use case for not including it, it wouldn't be an option. There is also nothing really Android-specific here beside maybe the open64 mess.
Can this use ErrorOr?
May 17 2017
Lock-free operations provide two major advantages. You don't need to worry about signal safety and they can be mixed with certain non-atomic operations without creating havoc. All I said is that the presence of a 32bit CAS is enough to ensure lock-free atomic operations for (appropriately aligned) 8bit and 16bit values can be implemented. That will not be worse than any locked atomic operation on any non-brain-dead platform. E.g. it might not be true on SPARC v7 and v8 or the VAX, but then they don't have a 32bit CAS. It doesn't mean that you have to implement 16bit atomics using a 32bit CAS. I would hope the webassembly backend exposes atomics as precise operations, but if it only provides a 32bit CAS, that's still good enough for the needs of any frontend language.
May 16 2017
While I fully agree that the current fallback-to-GCC behavior is far from helpful, I'm not sure how much sense providing linker support for bare-metal makes. I would expect this to changes wildly depending on the specific environment.
I don't get this. If you have a lock-free 32bit CAS, you can implement all the 16bit operations on top of that and they are still lock-free.
May 15 2017
Can you check the countable part in one of the cases? Otherwise it looks good.
May 10 2017
At least NetBSD doesn't support/use DF_TEXTREL and only DT_TEXTREL.
It's actually more a diagnostic hint. Due to things like Nvidia's binary-only libGL, this is actually supported by most TLS implementations as long as no initializers are needed.
Just as a side note: this is not only ARMv4. Older SPARC, M68K, SH, IA64, Alpha all seem to lack a FFS / CLZ instruction from cursory check.
May 9 2017
To expand on what I said on IRC:
Why do you call __enable_execute_stack in first place? I would just leave it and fix any potential callers.
Yeah, but even the generic expansion results in ~19 instructions on ARMv4. Compare that to one instruction in the loop and it can hardly be said to be a general win.
That's not true. ARMv4 for example has no clz, it's a V5T feature. That's my point: if the CPU has no direct lowering for the intrinsic, this transform is beneficial only if the resulting intrinsic can be constant folded. But I wonder if we don't catch those cases already with SCEV based optimisations. If a CPU has no direct lowering like on ARMv4, it will add a libcall and with a high chance of being more expensive than the optimisation.
May 5 2017
Apr 27 2017
I agree. If the CPU has it, it will be beneficial. If it doesn't, it is only a useful transformation if the intrinsic can be constant folded.
What about platforms that don't have ctlz/ffs hardware support?
My primary concern here is that it can make it impossible to hand-patch the resulting assembler. I'm not sure if anyone is doing that, but if I have learned one lesson from pkgsrc development and the set of Linux support patches, it is to expect "yes, someone is really doing" as most likely answer. As such, I would hope for some global option to disable the pass.
Apr 25 2017
Liveness of registers is pretty much ephemeral at this point. The unwind register locations are not recursive either. They are applied in order.
Apr 24 2017
You are missing the first loop in https://nxr.netbsd.org/xref/src/sys/lib/libunwind/DwarfParser.hpp#489
I'm not sure the results will really work that well either.
Apr 22 2017
I think a better idea would be to just allow this tests to fail with EPERM.
The change itself sounds fine, but the referenced documentation can likely be improved, i.e. searching for "qbs compilation database" or even "clangdb" doesn't trigger any hits here.
Apr 20 2017
My point remains.
Apr 16 2017
We already have a couple of case that expect the encoding to be compatible. I'm not very attached to the additional special cases from YAML, but having either a common escape function OR a JSON escape in LLVM/Support is quite important.
Just to avoid any confusion: this should be the generic YAML escape routine in llvm/lib/Support, i.e. IMO we don't want to have separate YAML and JSON escape routines.
Effective, change YAMLParser.cpp line 697 to use \u and drop the whole UTF-8 handling part.
The short version is perfectly fine as long as works for both JSON and YAML. Less output is always good :)
Apr 13 2017
I'm strongly against this patch. Can you give an actual test case for the problematic behavior?
Apr 4 2017
Actually, there are three platforms that use: i386, amd64 and m68k. The latter might not currently be supported by LLVM, but it still exists.
__int64_t and therefore CRT_HAS_128BIT are supported by all 64bit platforms and only those. The two features are orthogonal.
Apr 3 2017
Please do not drop checks for CRT_HAS_128BIT -- the code requirse__ int128_t and doesn't work on platforms without.
Apr 2 2017
That would be perfectly reasonable. Some other places already tag it with HAS_80_BIT_LONG_DOUBLE, but I don't know what is responsible for setting that macro.
This doesn't make sense to me. XF *is* for modes with Intel extended precision. IEEE 128bit floats should be TF.
Mar 31 2017
Mar 25 2017
Mar 20 2017
Mar 19 2017
Personally, I think this should have been handled by whatever tool created the JSON database file in first place.
Mar 18 2017
Can you please also try this on top of the changes from D30333? That's changing one switch transformation to happen much later, making it more sensitive to inline cost estimates.
Mar 13 2017
Does this have any impact on compressibility of executables? E.g. gzip or xz.
Statically linked binaries don't normally have a GOT, but should synthesize an entry in .data for it. Chances are that is exactly the problem you are seeing?
Mar 11 2017
Installing libc++ along clang makes a lot of sense for non-system compilers. It doesn't imply any version locking like in GCC, i.e. you are still free to install whatever version of libunwind and libc++ you want. I'm somewhat conflicted about this patch. I think it is an actual improvement for some deployment cases, but I also consider it a bit of an abuse of the ressource directory. I'm not sure we have a better mechanism at hand right now though.
Are we talking about statically or dynamically linked programs? It makes a different for the former, but shouldn't for the latter.
Mar 10 2017
Add comments on why the split pass is useful.
Mar 9 2017
Move the full version out of the function simplification set into the late optimisation steps after all function inlining is done.
Mar 5 2017
I'm not sure that clang should care at all about relocatable output, it is too much of a special case. IMO if you want to link relocatable output, you should call the linker directly.