User Details
- User Since
- Nov 23 2012, 10:16 AM (425 w, 6 d)
Fri, Jan 15
It might be specific to the jump table case, but it should be instructional on how to do it. One important point is that it avoids inter-section relocations, which are a problem at least on MIPS.
Some targets already do this. Please check that you don't create regressions, especially on PowerPC.
Nov 30 2020
off_t is s signed type. Please fix the description.
Nov 23 2020
Do we really want to change the code here? It is perfectly well defined behavior.
Nov 19 2020
The difference is whether we promise to be instruction precise or not. I'm not sure we do or want to promise that as default.
I have no problem with the change, but please adjust the description to take about -funwind-tables. I don't think we do async by default, at least not by design.
Oct 29 2020
"It compiles with GCC" is not really helpful as argument because the observed behavior depends on a lot of internals. It is quite frankly impossible to be 100% identical to GCC's behavior. What LLVM guarantees is:
(1) Folding to false happens as late as reasonable possible. It happens most importantly after the first round of function and loop optimizations.
(2) Folding to true happens as part of constant folding etc as early as the condition is evaluated.
(3) The result is propagated to derived expression and those are folded recursively, including dead code elimination. This happens even for -O0.
Oct 28 2020
I completely agree with Roman that this patch is very undesirable since it has a high chance of breaking other reasonable uses of the intrinsic. In fact, I would say that the observed behavior here is perfectly reasonable under the definition of @lllvm.is.constant. The original code for LLVM at least should be just __builtin_constant_p(b) && b > -129 && b < 128).
Oct 27 2020
You can do an anon-mapping with size+pagesize in that case and map over it with MAP_FIXED.
Oct 15 2020
Disabling builtin handling when asm is enabled would be a major annoyance for NetBSD. The interaction between symbol renaming and TLI is complicated. There are cases where it would make perfect sense and others were it would be harmful. Example of the latter is the way SSP is implemented by inline functions that call a prototype renamed back to the original symbol name.
Sep 16 2020
I'm still curious about the source of the vptr diff, but that's a minor question, otherwise. LGTM
Sep 14 2020
I don't think this is the right place for this at all. Look at X86Subtarget::initSubtargetFeatures please.
Aug 25 2020
Can you change raw_fd_ostream::has_colors to memoize the result of FileDescriptorHasColors(FD) instead? That should give most of the performance gain without breaking the abstractions as much.
Aug 10 2020
This issue is on my long term TODO list. If there are no 64bit atomics, it is a really stupid idea to use them for timekeeping. I think for the use case of Timer, just splitting it into second and subsecond with explicit overflow handling is perfectly reasonable and does not affect the correctness of the normal use as the aggregates are only accessed on completion?
Jun 13 2020
Please don't change the style of the code, it messes up review and the full source. Please split the introduction and testing of the new relocation into one patch and using a different set of relocations to implement large code model. The latter part needs at least a test as well?
Jun 3 2020
I don't agree with the justification at all, but it also seems that noone else cares about the build option creep here.
May 27 2020
May 21 2020
I see that more as a short-coming in the existing DEFAULT_SYSROOT behavior and less an argument for making more cases like it. So the general idea is that for turnkey toolchains,
we want to allow customizing the "default" target of the toolchain to hard-code options like --sysroot, --target, -Wl,-rpath etc. Those are all related, so when using a different target, they no longer make sense. One way to deal with all those options in a consistent manner is hook into the logic for determining the current target, if none is explicitly specified on the command line or implicit from the executable name, then prepend some additional flags on the command line based on some cmake variable or so. This flags shouldn't trigger the unused argument warnings, so you can always pass -Wl,-rpath, --sysroot etc, independent of whether the compiler is in preprocessing / compile / assemble / link mode. That seems to be a more general and systematic approach than adding many additional build-time options.
I do agree with the feature request, but I'm not sure about the implementation. It doesn't seem to work well with the cross-compiling support in the driver as clearly shown by the amount of tests that need patching. Is cross-compiling a concern for you at all? Otherwise I would suggest going a slightly different route and use default values iff no "-target" or "target-command" mechanism is active. Might need a way to temporarily disable unused flag warnings, but that way would be pretty much toolchain independent?
May 13 2020
LGTM
May 3 2020
ARM's data detection only works for unstripped binaries. Other architectures often don't even have such markers at all.
The more compact output is a lot harder to read when you have to find embedded data. I wouldn't mind dropping half the spaces to have columns of 4 hex digits, but anything more is really hard to read.
I'm not sure about this change. It might make sense on x86, but many RISC architectures love to put data into the instruction stream and being able to pick them out is quite important for understanding the disassembly.
Apr 15 2020
LGTM
Apr 14 2020
Apr 8 2020
@ldionne I was updating libc++ from d42baff45d9700a199982ba0ac04dbc6c6d911bb and LLVM itself from 38aebe5c04ab4cb3695dc1bcc60b9a7b55215aff to 3d1424bc7a0e9a6f9887727e2bc93a10f50e1709 when it started failing. It likely has been present for a while.
This fixes the module build of clang for me.
Apr 5 2020
If no (inline) assembler is involved, you can likely get away with IAS. Just to give two trivial examples that are not handled by IAS right now:
Apr 3 2020
Louis, did I answer your questions?
Mar 28 2020
We tend to test push .section and .pushsection in this file, so it would be nice to continue doing that.
Can you add a variation with .pushsection for @progbits and @nobits? Otherwise LGTM.
Ping?
Mar 24 2020
This summarizes the IRC discussion.
Mar 21 2020
Mar 20 2020
The original code has a functional dependency between sz and bytes and whether they can be constant evaluated. But the code doesn't express that. I don't think we can enforce that in any sensible way. There are valid use cases after all where partial inlining would result in entirely sensible decisions, just think about the more typical case of __builtin_constant_p selecting between inline asm taking immediate operands and one taking register/memory operands. That's why I am saying that I consider it a lot more useful to provide reliable building blocks to express the dependency and make sure they work.
Mar 19 2020
...in the outer condition, I meant.
Yes, builtin_expect is just noise here. The point I wanted to make is that both sz and bytes should be used in builtin_constant_p in the other condition.
IMO the root of the problem here is that the branches are mixing predicated variables (bytes) with non-predicated variables (sz). If users want deterministic behavior here, that shouldn't be done.
Mar 18 2020
I was discussing this with Bill off-list to get a better idea of the original test case. Basically, with the new constant intrinsic pass, we can provide a stronger semantic: the default LLVM pass chain always constant folds expressions involving is.constant and performs DCE trivially dead branches resulting from the former folding. This means that if the original expression is adjusted from:
if (__builtin_expect(!!(sz >= 0 && sz < bytes), 0)) { if (!__builtin_constant_p(bytes)) ... }
to the functionally equivalent:
if (!(__builtin_constant_p(sz) && __builtin_constant_p(bytes) && sz >= 0 && sz >= bytes) && __builtin_expect(!!(sz >= 0 && sz < bytes), 0)) { if (!__builtin_constant_p(bytes)) ... }
then we can actually ensure proper DCE even with -O0 in the default pass chain. That depends over all on two properties:
- If __builtin_constant_p is constant folded by clang, it must DCE the appropiate places.
- the constant intrinsic pass is run
With -O1 or higher, this should work in general.
Mar 17 2020
Committed as 0b999f76575f0196d3cd01c0781b2513b0a1af15 without link.
Require __STDCPP_NEW_ALIGNMENT__ in C++03 mode. Prefer it over max_align_t in a number of tests when allocation alignment is desired. Adjust some tests to do minimal sanity checking of the alignment for C++03 only. Add an explicit check that __STDCPP_NEW_ALIGNMENT__ is more general than max_align_t if both are present.
Mar 16 2020
This looks much better than the in-tree state, thanks.
Mar 15 2020
Mar 14 2020
Mar 13 2020
Mar 12 2020
Mar 11 2020
Ping
Mar 9 2020
It still seems to be a perfectly reasonable optimisation under the semantics of is.constant. The code here seems to be an abuse of it though and makes assumptions that are not sensible.
I'm confused by that example. If jumpthreading makes it possible to answer true, it is no differerent from inlining and other use cases. The reverse is the problematic case, turning constant cases into non-constant cases?
Mar 3 2020
Mar 2 2020
Feb 26 2020
libc++ has no idea what a correct max_align_t is. The internal definition works due to historic requirements that all three fundamental types are supported for new/delete, but we don't have any such guarantees for every other context. A correctly implemented stddef.h does not provide max_align_t in C++03 mode, since that would pollute the global namespace. This means that libc++ currently has two failure modes: on NetBSD, it outright tries to use a non-existing symbol. On other platforms it silently defines max_align_t in a way that can be subtle wrong.
Feb 25 2020
Do not depend on max_align_t in C++03 mode in the test cases.
Feb 24 2020
Feb 21 2020
Feb 20 2020
Feb 19 2020
It is used both in <new> and <memory> and the use in the latter is currently unconditional AFAICT. I don't have a problem splitting the conditional to avoid the typedef. That would address the ODR concern?
Ping2?
Feb 12 2020
Ping?
Feb 4 2020
Jan 30 2020
Let me clarify the situation for a moment:
(1) libc++ does try to work in C++03 mode. See the separate implementation of <functional> for pre-C++11. It is also desirable to support. This is completely beside the question of TR1 support.
(2) The only reason why max_align_t is currently necessary is because it is used as default alignment in <new>.
(3) Most compilers we care about already have a preprocessor symbol specifically for that purpose.
Jan 22 2020
libc++ generally tries to play nice with C++03 code. It doesn't go out of its way to break it and keeping it usable helps dealing with a lot of rusty old code. That's what the patch is all about, not breaking things for no good reason.
Jan 20 2020
Jan 16 2020
Can this be restricted to Linux?
Jan 14 2020
Jan 5 2020
What's the fixup behavior of the linker here, e.g. does it make sense to just use absolute encoding here and let the linker fix it up?
Nov 18 2019
This is normally done by using -Bstatic/-Bdynamic around the library. See tools::addOpenMPRuntime.
Oct 30 2019
Just for the sake of correctness, this is provided by GNU ld as part of the map file (-Wl,-Map=foo.map). It lists all archive members and which object file and symbol triggered the inclusion.
Oct 28 2019
Oct 27 2019
Can we please get this or the other fix in RSN? Mesa hits this.
Oct 24 2019
Oct 21 2019
Oct 15 2019
I'm a bit puzzled by the need for this change. The bump allocator already has logic to do power-of-two scaling for the allocation, so I wonder why it doesn't work properly here.
Oct 14 2019
Oct 13 2019
Oct 11 2019
Adjust based on comments.
Oct 7 2019
Why go back to the large tables for crc32? Just because JamCRC had that bug doesn't mean it should persist.
Oct 5 2019
I wonder if we should actually enumerate evil here, i.e. give the situations in which inlining actually fails. As mentioned on IRC, I wonder if we shouldn't aim for the stronger semantics and at least warn by default of any situation that prevents always_inline from doing its job.
Oct 3 2019
I'd make the argument that it should be strlcpy if present. That covers a lot of existing systems too. I'm not sure this optimisation should be done for freestanding mode though.
Oct 1 2019
2nd ping. Chandler, care to check this please?