User Details
- User Since
- May 5 2018, 9:37 AM (217 w, 4 d)
Today
LG
Compile-time impact is low / acceptable: http://llvm-compile-time-tracker.com/compare.php?from=40a4078e14c2c6c5e2d0a1776285aa7491e791b3&to=d753f388d52778fff19b5a6d82a5b9c4869a4273&stat=instructions
Do you need me to commit this for you? If so, please provide the "Name <email>" to use as attribution for the commit.
Another possibility is to mark llvm.coro.save as IntrNoMerge, though that would prevent any merging (similar to the SimplifyCFG patch, just only for this intrinsic and not other token-returning intrinsics).
Yesterday
Currently, we should also be checking for callbr as well, but I hope that the need for that goes away with D129205 anyway.
Remove one more check.
@nathanchance Thanks for the report. I've applied the obvious fix in https://github.com/llvm/llvm-project/commit/20962c1240691d25b21ce425313c81eed0b1b358. However, I think that this (and similar limitations relating to critical edge splitting) might actually be unnecessary since https://github.com/llvm/llvm-project/commit/7a7bba289521e6d4da199565cf72dc9eaed3d671. callbr has stricter semantics than indirectbr, and apparently updating successors of a callbr is supposed to be possible (because the blockaddresses used by it must be directly used in the call arguments and thus can be updated, while in the indirectbr case they may be passed indirectly).
Tue, Jul 5
I ended up reverting this patch in https://github.com/llvm/llvm-project/commit/a4772cbaf0dc717ab6b4639272ca2910897613f0, because I reduced another infinite loop (https://github.com/llvm/llvm-project/issues/56203) to it as well, which makes it the third one. It's probably not possible to reliably prevent infinite combine loops in this area with a dominator tree.
@HanKuanChen Thanks for the report! I've landed a fix for this test case in https://github.com/llvm/llvm-project/commit/dc969061c68e62328607d68215ed8b9ef4a1e4b1. This is based on making SimplifyCFG behave more like JumpThreading, which does not have an infinite loop for this case. Let me know if you see further issues.
Related: https://reviews.llvm.org/D125774
Yeah, this is on X86. Let me know if I need to extract the IR for this.
This patch causes an assertion failure during LTO when building bullet:
ld: /home/npopov/repos/llvm-project/llvm/include/llvm/Support/Casting.h:578: decltype(auto) llvm::cast(From*) [with To = llvm::ShuffleVectorInst; From = llvm::Value]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
Steps to reproduce:
cmake -G Ninja -Bbuild-ReleaseLTO-g -H. \ -DCMAKE_C_COMPILER=$PATH_TO/bin/clang \ -C cmake/caches/ReleaseLTO-g.cmake cd build-ReleaseLTO-g ninja bullet
LGTM
FWIW the bitcode patch has landed, so implementing the variant using a token type should be possible now. I'm not sure whether it's better to start with the current patch where the intrinsic is optional, or go directly to the one where it is required.
Mon, Jul 4
Check for desirable binops in one more place.
Be more thorough about not creating div/rem expressions, update tests.
I don't think this hack is acceptable. The actual problem here seems to be that the phi (gep(p, c1), gep(p, c2)) doesn't get converted into gep p, phi(c1, c2) afterwards, which is what would give you the best of both worlds: The store is sunk, but we still make use of the common base.
Don't think we have a clear understanding of the transforms that are valid for token types, but this generally looks reasonable to me. We should avoid making token values flow across control flow edges, even if it does not require introducing a phi.
LGTM
Ping
Sun, Jul 3
Sat, Jul 2
memcpy the libcall can be an indirect call target. llvm.memcpy the intrinsic cannot be. If an indirect memcpy becomes direct and is then replaced with the intrinsic, that's a divirtualization, but I don't think it's one relevant for the purposes of DevirtSCC, because such a call cannot be inlined. (Phrased like that, I think the relevant property here is not so much that something is an intrinsic, but that it is a declaration, as declarations are not eligible for inlining.)
Fri, Jul 1
This should be split into a patch for CallSiteSplitting and one for Loads. For CallSiteSplitting, you'll probably want to implement ephemeral value handling along the lines of https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L2941. For Loads, we'd want something along the lines of D101553.
I think the general idea behind this change makes sense. However, we should be treating all intrinsics the same way here (by skipping them), and not special-case assume-like and dbg intrinsics. Intrinsics generally cannot be called indirectly (this is enforced by the IR verifier), so they are irrelevant for the purposes of devirtualization.
LG
When switching this to AggressiveInstCombine, I would strongly recommend to start with a much more minimal patch. Handle only a single simple case, without any of the possible variants. We can build on that base later.
LGTM
Yeah, I think I'm going to drop this for now. What would probably make most sense to me is that the caller of SCEVExpander can provide the builder, so that different users can use different configurations. But that would need some IRBuilder API changes, because we'd still want to use a custom inserter in either case.
Ping :)
LGTM
This looks fine, only bit I'm uncertain about is the special-case for strchr(A, '\0') == null. Please precommit the tests to show current codegen. Possibly we can address that differently, in particular I notice that we're currently missing inbounds on GEPs, which might be limiting subsequent comparison folding.