User Details
- User Since
- Dec 17 2012, 10:03 AM (422 w, 1 d)
Mon, Jan 11
Thu, Jan 7
Dec 18 2020
Dec 15 2020
Dec 3 2020
Nov 6 2020
Nov 4 2020
Oct 30 2020
This is allowed by the current definition. See D85603
Oct 27 2020
I guess the compiling time increase on these benchmarks should be acceptable?
Oct 26 2020
Oct 21 2020
Thanks all for your help and @nikic in particular for doing the perf measurements and pushing for a better solution.
I think this shows that reviews work and produce better code quality!
- Use Known directly instead of using an intermediate AddrKnownBits variable
- Change AccCstIndices into AccConstIndices
- Move initialization code of IndexBits next to first use
- Regenerate test case using update_test_check
Oct 20 2020
Oups, tagged Eli's old account.
Tagging the new one @efriedma
Did 1 & 2 of @eli.friedman's suggestions.
- Skip zero indices
- Accumulate constant indices with constant scaling factor in a separate variable to avoid calls to computeForAddSub
Thanks @efriedma for the suggestions!
If we remove that from the equation, I expect the compile-time impact of this patch would be pretty minimal. I started some work on this, but I'm not sure when I'll be able to finish it.
Looks reasonable but I'd like to have an idea of the compile time impact of this patch.
Oct 19 2020
Following the conversation in D87342 , this patch has been rebased and all the previous feedbacks have been addressed.
Get rid of TrailZ and track everything with AddrKnownBits.
Oct 16 2020
Rebase patch
Add the reviewers from D87342
Oct 9 2020
The general direction is fine, but IMHO there are needless code churn.
Why can't we stick to getCostPerUse instead of changing all the users?
Nice catch!
Oct 8 2020
Thanks for the quick review @aemerson!
Oct 7 2020
Could use a test, but it generally makes sense to have such a method.
- Add a test for the new method
Thanks guys for the quick review!
- Move method earlier in the header
- Fix comment in test
- Add getMinValue/getMaxValue
Oct 6 2020
Abandoning this diff in favor of D86364 based on our discussion.
- Added the tests in https://reviews.llvm.org/D88934 (without the gep tests since they wouldn't produce the expected results.)
(I'll add the gep test in a separate PR to demonstrate the lack of precision for these.)
- Refactored the compute known bits for mul in https://reviews.llvm.org/D88935
- Added the sextOrTrunc method to KnownBits https://reviews.llvm.org/D88937
I don't have any ideas yet on how to limit the cost, but maybe we can collectively find some more savings in an update of that patch.
If the motivating cases all start with a gep, then could we add the extra logic to a wrapper around the regular computeKnownBits()?
Oct 5 2020
@nikic, @spatel, @lebedev.ri, @RKSimon, thank you for your feedbacks!
LGTM with nitpicks
It's been almost a month and nobody commented on the approach.
Oct 1 2020
Sep 30 2020
Given you've already taken a look, can you do the review or do you want me to step in?
Sep 29 2020
When looking at Greedy, for example, it's easier to read that "weight calculation always happens the same way".
From the perspective of the caller, this patch exposes some implementation details that I am not sure are worth it.
Thanks for doing this.
Sep 28 2020
Gentle ping @nikic, @aqjune, @spatel, @lebedev.ri, @RKSimon.
LGTM.
The test case could be more descriptive though, see my inline comment.
Sep 25 2020
I like that we take the iterator path, but I don't think the current patch is clearer than the previous implementation.
Thanks for looking into a test case.
Sep 23 2020
Any progress on a fix?
The IMPLICIT_DEF is erased only if there some "incoming" value from "other reg".
Sep 22 2020
Why would we need to keep this impdef?
Sep 21 2020
Note: A possible fix could be:
diff --git a/llvm/lib/CodeGen/SplitKit.cpp b/llvm/lib/CodeGen/SplitKit.cpp index 4029c855c910..a1029bcbc71c 100644 --- a/llvm/lib/CodeGen/SplitKit.cpp +++ b/llvm/lib/CodeGen/SplitKit.cpp @@ -655,13 +655,20 @@ VNInfo *SplitEditor::defFromParent(unsigned RegIdx, if (S.liveAt(UseIdx)) LaneMask |= S.LaneMask; } - assert(LaneMask.any() && "Interval has no live subranges"); } else { LaneMask = LaneBitmask::getAll(); }
It does improve the original by considering the COPY from the same subreg.
The comment seems outdated, if my understanding is right, and even the original code cannot perform that change since, once the 2nd COPY with same source is found in L1407, the check @ L1419 just skips that earlier as the 1st COPY has no subreg and the 2nd COPY has sub1.
Ping!
I must be missing something, but it feels to me that this patch is actually making the situation worse.
Sep 18 2020
Weird, I added case test/CodeGen/PowerPC/sink-down-more-instructions-1.mir. Could you do another check?
Sep 17 2020
1: add one more mir test case
Sep 16 2020
LGTM
As expected, the improvement for the benchmark is gone now. : (
Sep 15 2020
- Fix a few call sites where I was passing TTI instead of the boolean for UseInstrInfo (it doesn't help that the compiler didn't warm on these)
- Remove const_cast that was a left over from the proof of concept
Gentle ping @nikic.
The real world case I met is not a loop invariant instruction, it has an operand which is a user of PHI in loop header.
Sep 14 2020
I had a (slightly) closer look and I have a question:
- Use Register default constructor instead of forcing = 0