- User Since
- May 5 2018, 9:37 AM (141 w, 2 d)
A test case would be nice of course, but with this kind of worklist order issue (select folded before icmp) it's always tricky.
Rebase over more tests, prevent exponential renaming with reused operands.
I'm somewhat skeptical about this change. The motivation seems a bit weak given the costs involved. The costs are:
Does the same issue apply to ThinLTO?
Fix typos in mask test.
As the discussion is spread out across multiple threads, do I understand correctly that the current consensus is to introduce the -disable-noundef-analysis flag, and explicitly add it to all the relevant tests (rather than adding it to the substitutions)?
I've reported the polly bug at https://bugs.llvm.org/show_bug.cgi?id=48783.
LGTM. I'm not particularly familiar with this code, but don't see an issue with exposing this information. Currently MCSubtargetInfo directly incorporates the logic for printing a help text to the error stream, which is very dubious design. It should be possible to obtain the same information for printing a help text manually.
Sat, Jan 16
I posted an alternative patch at D94866. It's potentially more expensive, but I think it will cover more of the cases we need.
I think this is a case where being in select form favors us, and we can handle this through a more general transform, rather than the special case that is needed for and/or. The general transform is something along these lines: https://gist.github.com/nikic/4bad7e7e7ecda8d9c9e9633d86836b4d I've wanted to do this for a while, and this looks like a good excuse to make it happen :)
Fri, Jan 15
I tested the original three patches plus this one, and the results look good (or at least in line with expectations): https://llvm-compile-time-tracker.com/compare.php?from=33be50daa9ce1074c3b423a4ab27c70c0722113a&to=972a64b1331d1b3022154297593837234a958d0b&stat=instructions
@rnk Sorry, I got my binaries mixed up again... I can reproduce the crash now.
Logic looks fine now, though the GEP handling can be made a bit more lax.
The new direction of the patch looks good to me. And yes, I would prefer it if the intrinsic attribute updates were split off into a separate patch.
Thu, Jan 14
Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=a3904cc77f181cff7355357688edfc392a236f5d&to=b5ccbbb5b1d8fc76e2291a860e4cf0a4787b77d4&stat=instructions Seems mostly fine, only outlier is kimwitu++ in NewPM-O3 configuration (k.cc has 1.25% regression).
Over in D91837 the reserveForAndGetAddress() implementation contained a check for TakesParamByValue that seems to be gone in this revision. My first guess is that this is the culprit, and we're now always performing the isReferenceToStorage() check.
The last time I tested this in https://reviews.llvm.org/D91837#2464554, the impact was much smaller. Some change between then and the landed version must have regressed things again. I think the numbers are now similar to what we saw before the "small value" optimization was added.
Add test for infinite recursion and irreducible flow.
Wed, Jan 13
Fix clang tests. Don't disable MemCpyOpt at O1 (previous version disabled it by accident).
Tue, Jan 12
FYI this change had some pretty significant impact on compile-time (https://llvm-compile-time-tracker.com/compare.php?from=4718ec01669b01373180f4cd1256c6e2dd6f3999&to=f748e92295515ea7b39cd687a718915b559de6ec&stat=instructions) and code size (https://llvm-compile-time-tracker.com/compare.php?from=4718ec01669b01373180f4cd1256c6e2dd6f3999&to=f748e92295515ea7b39cd687a718915b559de6ec&stat=size-text). Is that expected?
With this patch applied, this is the impact on InstCombine after flipping the flag: https://gist.github.com/nikic/052815150762859dddaefde10f25374d We can see that there are still quite a few cases where we regress. However, many can be fixed by stronger impliesPoison handling. E.g. we're currently missing icmp X, Y && icmp X, Y and icmp (X & C), C2 && icmp (X, C'), C2'.
LGTM as well.
Yes, I think this is fine to land as-is. Note that SDAG always works on a single basic block, so I don't think there is any value in having "branch on undef is UB" semantics, as we can't infer information from branch conditions anyway.
The thought that the current limitations are compile-time related crossed my mind, but that doesn't seem to be a concern: https://llvm-compile-time-tracker.com/compare.php?from=00f773cf424699d8eb31591fdc95e0ca18b2682c&to=10ec7c7960ea47e899003fcd7d1ba5c389ba1cae&stat=instructions
Mon, Jan 11
@jdoerfert It would be great if you could look over this as well.
Sun, Jan 10
Based on what @RalfJung mentioned on zulip, the question of whether the transform is legal for inbounds comes down to the particular choice of inbounds semantics. I was using the semantics specified in LangRef, which make the optimization illegal, while @nlopes used the semantics from https://people.mpi-sws.org/~jung/twinsem/twinsem.pdf (or something similar), which makes it legal. The relevant difference to the LangRef semantics (if we stick to the gep-inbounds-logical case) would be:
@bjope There are two more revisions based on this one, D86079 implements improved X86 lowerings and is already accepted, and D86078 implements improved AArch64 lowerings. Do you plan to land these as well?
LGTM. I'm not an expert on the pass manager, but after looking into it a bit this seems like the right direction. I don't think not preserving AAResults in HexagonLoopIdiomRecognition/GVNHoist is much of a problem. Preserving AAResults is generally not important (this "analysis" is essentially free) and these are non-default passes anyway.
Sat, Jan 9
LGTM, but please wait for @jdoerfert to check this as well.