Sun, Jul 15
Sat, Jul 14
There's nothing special about debug intrinsics here except that there are a lot of them. The problem, as far as I can tell, is that we're repeatedly using dyn_cast on each instruction and doing multiple redundant tests; adding yet another redundant test won't help. Having a lot of debug intrinsics makes things incrementally more expensive for all other kinds of intrinsics, so this doesn't seem like the right way to fix this. We test for IntrinsicInst, then for CallInst (which is always true whenever the IntrinsicInst test is true), and then we always test for StoreInst (we don't use an 'else', so we do this test even when the IntrinsicInst/CallInst test succeeded, which includes debug intrinsics).
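To illustrate the restructuring being suggested, here is a minimal sketch using a stand-in class hierarchy and `dynamic_cast` (real LLVM code would use `llvm::dyn_cast` on the actual `Instruction` hierarchy; the types and function names below are hypothetical):

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the Instruction hierarchy.
struct Instruction { virtual ~Instruction() = default; };
struct StoreInst : Instruction {};
struct CallInst : Instruction {};
struct IntrinsicInst : CallInst {};
struct DbgInfoIntrinsic : IntrinsicInst {};

// Before: three independent tests run for *every* instruction, even though
// IntrinsicInst implies CallInst and excludes StoreInst.
std::string classifyRedundant(const Instruction *I) {
  std::string R;
  if (dynamic_cast<const IntrinsicInst *>(I)) R += "intrinsic;";
  if (dynamic_cast<const CallInst *>(I))      R += "call;";  // always true when the test above succeeded
  if (dynamic_cast<const StoreInst *>(I))     R += "store;"; // tested even for calls/intrinsics
  return R;
}

// After: an else-if chain, most-derived type first, so the tests are mutually
// exclusive and each instruction stops at the first successful one.
std::string classify(const Instruction *I) {
  if (dynamic_cast<const IntrinsicInst *>(I)) return "intrinsic;";
  if (dynamic_cast<const CallInst *>(I))      return "call;";
  if (dynamic_cast<const StoreInst *>(I))     return "store;";
  return "";
}
```

The point is not the cast mechanism but the control flow: mutually exclusive tests avoid paying the StoreInst check on every debug intrinsic.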
Fri, Jul 13
We have a check in aliasSameBasePointerGEPs to avoid PR32314 (which I'll glibly summarize as: we need to be careful about looking through PHIs so that we don't, without realizing it, end up comparing values from different loop iterations). The fact that this looks back through an unknown number of PHIs makes me concerned that we'll get into a similar situation.
Misc cleanup and try to validate ret attrs instead of disqualifying them all.
Thu, Jul 12
Updated to not infer speculatable if there are instructions with non-debug metadata, loads with alignment requirements that we can't independently prove, or functions with value-constraining return-value attributes.
LGTM (there are a lot of changes here, but given that it produces no changes to existing matching tables, that seems like pretty good test coverage).
Tue, Jul 10
I specifically didn't want to do this for the YAML output. The tool consuming the YAML should demangle if it wants. It's hard to go backward, and so if the tool needs to map back to symbols in the program, it needs the mangled name. That having been said:
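As a sketch of what "the tool consuming the YAML should demangle if it wants" could look like, here is a small C++ helper built on `abi::__cxa_demangle` (this assumes an Itanium-ABI toolchain and is an illustration, not part of the patch; the mangled name is kept when demangling fails, so the mapping back to program symbols is never lost):

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium-ABI mangled name; on failure (e.g. the input is not a
// mangled name), return the input unchanged so it can still be used to map
// back to symbols in the program.
std::string demangle(const std::string &Mangled) {
  int Status = 0;
  char *Buf = abi::__cxa_demangle(Mangled.c_str(), nullptr, nullptr, &Status);
  if (Status != 0 || !Buf)
    return Mangled;
  std::string Result(Buf);
  std::free(Buf); // __cxa_demangle returns malloc'd memory
  return Result;
}
```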
Mon, Jul 9
This version detects and reports integers only. If there is interest in merging the tool, I can add the functionality for floats as well.
Sun, Jul 8
Fri, Jul 6
Wed, Jul 4
Mon, Jul 2
Sun, Jul 1
Fri, Jun 29
Tue, Jun 26
Thu, Jun 21
Tue, Jun 19
I've not looked at the patch itself, but regarding the general functionality: Nice! I've wanted this for as long as I've been working on LLVM :-)
Jun 15 2018
Seems like a good idea. A couple of questions below.
Jun 14 2018
I don't think this is right. Saying the result is undefined seems like what we intend. We might choose, as an implementation technique, to limit how we take advantage of that undefinedness in certain cases, but violating the constraint still produces logical inconsistencies that can transfer to other parts of the code and, in general, turn into any other kind of undefined behavior (depending on the structure of the code).
Jun 12 2018
I think that this optimization is going to be very fragile in practice without also handling the case where both sides look like (m-idx). Nevertheless, it's good to make AA changes in small increments. Let's try this single case for now.
I put this on the llvm-dev thread, but to repeat it here:
Jun 11 2018
Jun 7 2018
Jun 6 2018
I don't see how inlining is relevant here. Sinking across function call boundaries is generally also bad (because increasing register pressure across calls tends to increase spilling). I feel like the test you want here is to call TTI::isLoweredToCall.
Bottom line -- the situation is far from perfect, but IMO the patch does the sensible thing if we're compiling IR with mixed target-cpu and target-features attributes using NVPTX.
Jun 5 2018
These are likely to be quite mechanical, and I suspect are probably best done with post-commit review.
Jun 4 2018
That's odd; I can't think of any reason why we'd drop the flags in this case. LGTM.
Jun 3 2018
Can you please update the comment at the top of include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h to define "full vs. partial" unswitching and "trivial vs. non-trivial" unswitching?
May 31 2018
May 30 2018
I've added a few specific comments, but I think that you should move forward with the associated RFC.
May 29 2018
May 28 2018
May 27 2018
LGTM. We're well past LLVM 5 now.
You say that you don't yet know if this is profitable. Do you have reason to believe that it might not be (e.g., some example where it seems like we might want to constrain the behavior)? We almost never favor keeping metadata over doing other transformations, so I think it's worth being explicit about our reasoning here. Just saying "I don't know yet" is probably not sufficient, as we could say that about nearly all metadata, and in that case, change the default logic.
May 25 2018
Does this just not work well with other linkers, do other linkers also have special handling, or does it work better elsewhere for some other reason?
May 24 2018
and instead re-add the now-mutated loop to the pass manager to re-visit (much like we do with non-trivial unswitching) and rely on it then iterating for us. This will be a somewhat surprisingly significant behavior change. I think it's good, but it's worth noting. This will essentially make loop-unswitch a fixed-point pass in the pass pipeline, but it will do so using the pass manager. I'm not aware of any other passes that currently do this. Anyways, brave new world. The pipeline will now be ... truly dynamic.
Out of curiosity, did the original test case come from C/C++? Because if it did, given that we get to assume infinite loops like this don't happen in C/C++, we may end up "breaking" this again in the future.
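To make the concern concrete, here is a hypothetical Collatz-style loop of the kind being discussed (the function and names are illustrative, not the original test case). For inputs that reach 1 it is well defined; if some input never reached 1, the loop would be side-effect-free and infinite, which C/C++ treats as undefined, so the optimizer is allowed to assume it terminates:

```cpp
// Illustrative only: a loop whose termination is not obvious. Because a
// side-effect-free infinite loop is undefined behavior in C/C++, the
// compiler may assume this loop always reaches N == 1 and transform
// accordingly -- which is why such test cases can "break" again later.
unsigned stepsToOne(unsigned long long N) {
  unsigned Steps = 0;
  while (N != 1) {
    N = (N % 2 == 0) ? N / 2 : 3 * N + 1;
    ++Steps;
  }
  return Steps;
}
```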
May 23 2018
I have some doubts whether this will ever be used sensibly in the real world, but it can be useful for testing and was not difficult to put together.
May 22 2018
This is still a bit convoluted, but I think that someone can now figure out what's going on from the comments. They also raise an interesting question of whether we should somehow separate out the capture-only-by-return case from the more-general case (although that's nothing to be addressed here).