Page MenuHomePhabricator

ebrevnov (Evgeniy)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 23 2018, 4:17 PM (86 w, 2 d)

Recent Activity

Wed, Sep 18

ebrevnov added a comment to D67690: [LV][NFC] Factor out calculation of "best" estimated trip count..

Thanks Hideki & Ayal for quick response!
If you are satisfied with the current version please go ahead and commit it since I'm not permitted to do that myself.

Wed, Sep 18, 4:54 AM · Restricted Project
ebrevnov updated the diff for D67690: [LV][NFC] Factor out calculation of "best" estimated trip count..

Fixed as requested by Ayal.

Wed, Sep 18, 4:49 AM · Restricted Project

Tue, Sep 17

ebrevnov added reviewers for D67690: [LV][NFC] Factor out calculation of "best" estimated trip count.: hsaito, Ayal, fhahn, reames.
Tue, Sep 17, 9:21 PM · Restricted Project
ebrevnov added a comment to D65276: [SCEV] Disable canonical expansion for nested recurrences..

ping @apilipenko

Tue, Sep 17, 9:04 PM · Restricted Project
ebrevnov created D67690: [LV][NFC] Factor out calculation of "best" estimated trip count..
Tue, Sep 17, 9:03 PM · Restricted Project

Fri, Sep 13

ebrevnov added inline comments to D67408: [IndVars] An implementation of loop predication without a need for speculation.
Fri, Sep 13, 3:38 AM · Restricted Project

Thu, Sep 12

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

ping @reames

Thu, Sep 12, 1:52 AM · Restricted Project
ebrevnov added a comment to D67408: [IndVars] An implementation of loop predication without a need for speculation.

I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops.

Thu, Sep 12, 1:48 AM · Restricted Project

Wed, Sep 4

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

If you like current version please go ahead and commit it since I don't have committer rights yet.

Wed, Sep 4, 6:03 AM · Restricted Project
ebrevnov added inline comments to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.
Wed, Sep 4, 6:00 AM · Restricted Project
ebrevnov updated the diff for D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Corrections requested by Philip.

Wed, Sep 4, 5:54 AM · Restricted Project

Wed, Aug 21

ebrevnov added a comment to D65276: [SCEV] Disable canonical expansion for nested recurrences..

ping @apilipenko

Wed, Aug 21, 3:14 AM · Restricted Project

Aug 11 2019

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

ping @reames

Aug 11 2019, 8:22 PM · Restricted Project

Aug 6 2019

ebrevnov added reviewers for D65845: [SCEV] evaluateAtIteration may produce incorrect result if accuracy of evalutation point is not enough.: reames, apilipenko.
Aug 6 2019, 10:25 PM · Restricted Project
ebrevnov updated the summary of D65845: [SCEV] evaluateAtIteration may produce incorrect result if accuracy of evalutation point is not enough..
Aug 6 2019, 10:23 PM · Restricted Project
ebrevnov created D65845: [SCEV] evaluateAtIteration may produce incorrect result if accuracy of evalutation point is not enough..
Aug 6 2019, 9:35 PM · Restricted Project
ebrevnov updated the diff for D65276: [SCEV] Disable canonical expansion for nested recurrences..
Aug 6 2019, 9:35 PM · Restricted Project
ebrevnov updated the diff for D65276: [SCEV] Disable canonical expansion for nested recurrences..
Aug 6 2019, 9:35 PM · Restricted Project
ebrevnov updated the diff for D65276: [SCEV] Disable canonical expansion for nested recurrences..

Splitting exitsting patch to two parts. The first one is an actual fix for the problem. The second one is an enhancement to evaluateAtIteration which is going to be reviewd and committed separately.

Aug 6 2019, 4:13 AM · Restricted Project
ebrevnov updated the diff for D65276: [SCEV] Disable canonical expansion for nested recurrences..
Aug 6 2019, 3:54 AM · Restricted Project
ebrevnov updated the diff for D65276: [SCEV] Disable canonical expansion for nested recurrences..
Aug 6 2019, 3:46 AM · Restricted Project

Aug 4 2019

ebrevnov added a comment to D65286: [OpenCL] Allow OpenCL C style vector initialization in C++.

Please be aware about build bot failure (http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-full-sh/builds/2185) most likely caused by this change.

Aug 4 2019, 9:53 PM · Restricted Project
ebrevnov added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Reported buildbot fails for Evgueni to follow up as needed:

http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/18311  --- blamelist had only this commit. Resolved by moving the LIT test to X86.
 
http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-full-sh/builds/2185 --- I think this is not relevant. "clang -emit-llvm -O0".
Aug 4 2019, 9:49 PM · Restricted Project
ebrevnov added a comment to D57180: [LV] Avoid adding into interleaved group in presence of WAW dependency.

Original miscompile is expected to be fixed by:

Aug 4 2019, 9:07 PM · Restricted Project

Aug 2 2019

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Hi Philip (@reames ), please take a look when possible. Thanks!

Aug 2 2019, 5:10 AM · Restricted Project

Jul 31 2019

ebrevnov added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Thanks for quick response @hsaito ! May I ask you to commit it as well since I don't have committer rights yet.

Jul 31 2019, 9:28 PM · Restricted Project
ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Do you have any benchmark numbers to show that this is generally profitable? From our downstream testing, it is not clear that this change is beneficial.

We got performance improvement in our internal search benchmark.

How does this transformation impact the benchmark when not using profile data?

In plain mode we also got performance improvement, the speedup is a little smaller than FDO Mode.

I have a general question/comment. By now it's more or less evident that benefit of optimization heavily depends on correctness of profile information. That means in general case there is no way to reason about its effectiveness. Thus I believe it should be turned off if there is no profile.

The optimization decision is based on profile information, if a real profile is not available, the statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. So if an unreasonable probability is generated as in your first case, or if an user program has untypical run time behavior than BPI expected, it may make bad decision.
Since you have strong concern about the optimization without real profile information, I will restore the old behavior if no real profile information is available.

Jul 31 2019, 2:24 AM · Restricted Project
ebrevnov added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Would be nice if you find time to take a look.

Jul 31 2019, 1:54 AM · Restricted Project

Jul 30 2019

ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Do you have any benchmark numbers to show that this is generally profitable? From our downstream testing, it is not clear that this change is beneficial.

We got performance improvement in our internal search benchmark.

How does this transformation impact the benchmark when not using profile data?

In plain mode we also got performance improvement, the speedup is a little smaller than FDO Mode.

I have a general question/comment. By now it's more or less evident that benefit of optimization heavily depends on correctness of profile information. That means in general case there is no way to reason about its effectiveness. Thus I believe it should be turned off if there is no profile.

The optimization decision is based on profile information, if a real profile is not available, the statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. So if an unreasonable probability is generated as in your first case, or if an user program has untypical run time behavior than BPI expected, it may make bad decision.
Since you have strong concern about the optimization without real profile information, I will restore the old behavior if no real profile information is available.

Jul 30 2019, 9:55 PM · Restricted Project

Jul 29 2019

ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Do you have any benchmark numbers to show that this is generally profitable? From our downstream testing, it is not clear that this change is beneficial.

We got performance improvement in our internal search benchmark.

How does this transformation impact the benchmark when not using profile data?

In plain mode we also got performance improvement, the speedup is a little smaller than FDO Mode.

Jul 29 2019, 11:02 PM · Restricted Project
ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

@ebrevnov, it's better to provide a reproducer. Otherwise I can't analyze the problem that impacts your code. Four is not a big number.

Jul 29 2019, 9:53 PM · Restricted Project

Jul 26 2019

ebrevnov added a comment to D65303: [BPI] Adjust the probability for floating point unordered comparison.

I've never looked to this piece of code before. For that reason I would like some one more familiar with this part to make a review.

Jul 26 2019, 3:07 AM · Restricted Project

Jul 25 2019

ebrevnov added reviewers for D65276: [SCEV] Disable canonical expansion for nested recurrences.: apilipenko, reames.
Jul 25 2019, 5:41 AM · Restricted Project
ebrevnov created D65276: [SCEV] Disable canonical expansion for nested recurrences..
Jul 25 2019, 5:41 AM · Restricted Project
ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

I checked your patch on original test case and it does resolve the issue. Unfortunately, we have 4 other test cases regressed from 1% to 5%. Bad news is that we don't have sources for these cases. At glance they don't involve operations with floats. So that is not a surprise they are not affected. I can continue investigation and try to provide short reproducer if needed. My fear here is that original change touches very wide class of cases and we won't be able to fix all of them one by one (or it will take too much time). Can we think of a more realistic approach to deal with regressions?

Jul 25 2019, 4:36 AM · Restricted Project

Jul 23 2019

ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Here is a C++ equivalent of my original code (which is actually java application) for you to reproduce.

Jul 23 2019, 5:04 AM · Restricted Project

Jul 19 2019

ebrevnov updated the diff for D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.
  1. Implemented check for "best" possible result suggested by Johannes.
  2. Merged test cases into one file and applied update_test_checks.py
Jul 19 2019, 5:34 AM · Restricted Project
ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Is the test case slow down with PGO or not?

No, it's not PGO.

Also do you have branch misprediction perf data? (large slowdowns like this is usually triggered by side effect like this).

I do. There is no any branch/data/instruction miss predictions. LSD works 100% in both cases as well. Even after regression CPI is 0.3. That means we are execution bounded. If I run the benchmark under perf scores change a little and there is 19% difference instead of original 35%. About half of slowdown (8.3%) comes from increased path length due to extra jump instruction. Rest comes from CPI increase by 10%. I'm checking different hypotheses what could cause that but no success so far.

Jul 19 2019, 3:58 AM · Restricted Project

Jul 18 2019

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Out of curiosity, would that still be broken with your example?

It should work. Main challenge I see here is how to perform "isBestPossible" check effectively so it doesn't introduce compile time slow down. Because when this check fails we will have to perform full analysis in addition.

Isn't MemDepResult type Other with OtherType::NonFuncLocal the best case possible? Checking for that one should be cheap, or am I missing something? Would you be willing to give it a try and see what the actual performance is?

Jul 18 2019, 5:30 AM · Restricted Project
ebrevnov added inline comments to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.
Jul 18 2019, 5:24 AM · Restricted Project
ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

I'd like to suggest a change of approach here. There's already precedent in the code for dealing with inconsistent results from metadata. See the getNonLocalPointerDepFromBB handling for AATags and Size changes. I don't see a reason why invariant_load can't be handled in the same manner in the same place. Doing that consolidates all the cache invalidation in one place, and has the benefit of caching all accesses with as much precision as possible. (i.e. an invariant_load would be cached as invariant until the first normal load from that location was found, and then cached as normal loads)

Thanks for the suggestion. Will take a look.

Jul 18 2019, 5:16 AM · Restricted Project
ebrevnov added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

This change causes 35% regression on very simple loop which is hot part of our internal micro benchmark.
This loop takes 99% of total execution time and has reasonably large number of iterations.
All measurements were performed on Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz.

Jul 18 2019, 3:54 AM · Restricted Project

Jul 17 2019

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

I'd like to suggest a change of approach here. There's already precedent in the code for dealing with inconsistent results from metadata. See the getNonLocalPointerDepFromBB handling for AATags and Size changes. I don't see a reason why invariant_load can't be handled in the same manner in the same place. Doing that consolidates all the cache invalidation in one place, and has the benefit of caching all accesses with as much precision as possible. (i.e. an invariant_load would be cached as invariant until the first normal load from that location was found, and then cached as normal loads)

Jul 17 2019, 10:05 PM · Restricted Project
ebrevnov added a comment to D64898: Standardize how we check for legality of frame realignment.

I was independently looking to this part and came up with essentially same change. The only difference is that I chose MachineFrameInfo as source of "truth". Not really sure which one is better though...

Jul 17 2019, 9:58 PM · Restricted Project

Jul 16 2019

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Just found small issue with the patch. Will update once testing is good.

Jul 16 2019, 1:38 AM · Restricted Project

Jul 15 2019

ebrevnov updated the diff for D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Updating patch to use implementation proposed by Ayal.

Jul 15 2019, 4:43 AM · Restricted Project
ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

I don't have commit rights. I need someone's assistance with putting the fix in. Thanks!

Jul 15 2019, 4:27 AM · Restricted Project
ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Why do we need both changes, reading the cache and writing it, assuming I understand the code correctly?
I would have assumed not caching "inv-load" results would fix the problem. And I can see we do not want to reuse
non-inv-load results for inv-loads. However, I would argue one could always check for a cached result and use
it if the result is "good". Do I miss something here?

Yes, you are reading the change correctly. The thing is that both results (one for inv-load case another for non inv-load) are "good". They are just different. Each result should be used for corresponding case only and not for the other. In other words using result from non-inv case for inv-case is not correct as well. Does it make sense?

No, but I'm tired. I would have assumed that I can drop the inv-load metadata without semantic change. If that is the case, I should be able to use the non-inv result if it is cached and already good enough. That would mean we can keep the lookup but do only return the cached result if it is "the best". Maybe my problem is that I do not know the lattice of potential values.

Either way, is it possible and useful to test for the number of (un)cached queries?

I believe you assumption regarding 'invariant.load' metadata not changing the semantics is not accurate. According to spec, If such instruction is executed there are number of assumptions the optimizer may do regarding the program. For example it's safe to assume that there are no stores before and no stores changing value after such instruction. Otherwise behavior is undefined. In general case there is no way to know at compile time if invariant load will be exectuted or not. That means you can't blindly propagate results of dependence analysis for invariant load to "normal" loads. And visa-verse results of dependence analysis for "normal" load doesn't take semantic imposed by "invariant.load" into account and would be sub-optimal if used for invariant load.

First, I already agreed, you cannot propagate results from inv. loads to non-inv. loads.

I still think you can drop inv. metadata, you did only say it has more restrictions but that is not a problem. If you think dropping the inv. metdata is a problem, please give me an example.

Now the "sub-optimal" part of propagating non-inv. load results to inv. loads. I also agree that this is in general not what you want. However, I was trying to make the point that hat there is one exception: the best possible result. If we have the best possible result already cached for a non-inv. load, we could reuse it for inv. loads as well because they only have less, not more dependences. If you disagree, an example would be good.

Looks we are totally aligned here. Honestly I don't understand why we can't track this thread down to the end. I agree potentially we may reuse result of non-inv load for inv load if we know it's the best. But supporting this case would be about the same complexity as extending current caching for non-inv as well as inv loads. If I simply drop the first guard around the code which does cache lookup we will end up with inv load not removed on the following example.

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
target triple = "x86_64-unknown-linux-gnu"

declare void @llvm.memset.p0i8.i8(i8*, i8, i32, i1)
declare void @foo(i8*)

define i8 @test(i1 %cmp, i8 *%p) {
entry:

%res1 = load i8, i8* %p
call void @foo(i8 *%p)
br i1 %cmp, label %b2, label %b1

b1:

%res2 = load i8, i8* %p
%res3 = add i8 %res1, %res2
br label %alive

b2:

%v = load i8, i8* %p, !invariant.load !1
%res.dead = add i8 %v, %res1
br label %alive

alive:

%res.phi = phi i8 [%res3, %b1], [%res.dead, %b2]
ret i8 %res.phi

}

!1 = !{}

That means we should either follow current approach and completely exclude non inv loads from caching or implement full support for caching inv.load.

I don't want to block this further, so LGTM.


For the record, I did not want to simply remove the first guard but I thought we could do something like this:

Before:

// If we have a cached entry, and it is non-dirty, use it as the value for
// this dependency.
if (ExistingResult && !ExistingResult->getResult().isDirty()) {
  ++NumCacheNonLocalPtr;
  return ExistingResult->getResult();
}

After:

// If we have a cached entry, and it is non-dirty, use it as the value for
// this dependency. If this is an invariant load, use the existing result
// only if it is the best possible, otherwise we forget we checked for a
// cached value.
if (ExistingResult && !ExistingResult->getResult().isDirty()) {
  if (!isInvariantLoad || isBestPossible(ExistingResult)) {
    ++NumCacheNonLocalPtr;
    return ExistingResult->getResult();
  }
  ExistingResult = nullptr;
}

Out of curiosity, would that still be broken with your example?

Jul 15 2019, 4:27 AM · Restricted Project
ebrevnov updated the diff for D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

One more test case added.

Jul 15 2019, 4:18 AM · Restricted Project
ebrevnov updated the diff for D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Updating comment as requested.

Jul 15 2019, 3:03 AM · Restricted Project

Jul 12 2019

ebrevnov updated subscribers of D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Why do we need both changes, reading the cache and writing it, assuming I understand the code correctly?
I would have assumed not caching "inv-load" results would fix the problem. And I can see we do not want to reuse
non-inv-load results for inv-loads. However, I would argue one could always check for a cached result and use
it if the result is "good". Do I miss something here?

Yes, you are reading the change correctly. The thing is that both results (one for inv-load case another for non inv-load) are "good". They are just different. Each result should be used for corresponding case only and not for the other. In other words using result from non-inv case for inv-case is not correct as well. Does it make sense?

No, but I'm tired. I would have assumed that I can drop the inv-load metadata without semantic change. If that is the case, I should be able to use the non-inv result if it is cached and already good enough. That would mean we can keep the lookup but do only return the cached result if it is "the best". Maybe my problem is that I do not know the lattice of potential values.

Either way, is it possible and useful to test for the number of (un)cached queries?

I believe you assumption regarding 'invariant.load' metadata not changing the semantics is not accurate. According to spec, If such instruction is executed there are number of assumptions the optimizer may do regarding the program. For example it's safe to assume that there are no stores before and no stores changing value after such instruction. Otherwise behavior is undefined. In general case there is no way to know at compile time if invariant load will be exectuted or not. That means you can't blindly propagate results of dependence analysis for invariant load to "normal" loads. And visa-verse results of dependence analysis for "normal" load doesn't take semantic imposed by "invariant.load" into account and would be sub-optimal if used for invariant load.

First, I already agreed, you cannot propagate results from inv. loads to non-inv. loads.

I still think you can drop inv. metadata, you did only say it has more restrictions but that is not a problem. If you think dropping the inv. metdata is a problem, please give me an example.

Now the "sub-optimal" part of propagating non-inv. load results to inv. loads. I also agree that this is in general not what you want. However, I was trying to make the point that hat there is one exception: the best possible result. If we have the best possible result already cached for a non-inv. load, we could reuse it for inv. loads as well because they only have less, not more dependences. If you disagree, an example would be good.

Looks we are totally aligned here. Honestly I don't understand why we can't track this thread down to the end. I agree potentially we may reuse result of non-inv load for inv load if we know it's the best. But supporting this case would be about the same complexity as extending current caching for non-inv as well as inv loads. If I simply drop the first guard around the code which does cache lookup we will end up with inv load not removed on the following example.

Jul 12 2019, 1:41 AM · Restricted Project

Jul 11 2019

ebrevnov added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

From the SUMMARY above:

I don't think we actually need to do any additional book keeping (as suggested in D57180). Current algorithm already examines all pairs and in case of dependence will invalidate interleaving group. I think the real issue is WAW dependence (on the same iteration) is not collected by LoopAccessInfo in the first place and as a result canReorderMemAccessesForInterleavedGroups returns wrong answer. The fix is to explicitly check for WAW dependence in canReorderMemAccessesForInterleavedGroups.

Nice catch! Agree about the real issue being a deficiency of LoopAccessInfo, and that it's better to treat all interleave-group-preventing dependencies the same, w/o dedicating special treatment for same-iteration WAW dependences. Suggest to fix this real issue at its core.

LoopAccessInfo does collect the 6 same-iteration WAR Forward dependences between each of the 3 loads and their 2 subsequent same-address stores, in the testcase below, but these do not interfere with interleave-grouping the stores. So it's perfectly capable of collecting same-iteration WAW Forward dependences as well. (LoopAccessInfo should also already collect any same-iteration RAW dependences, although such cases would hopefully be optimized away.)

The reason why LoopAccessInfo does not collect same-iteration WAW dependences is due to the way areDepsSafe() traverses pairs of DependentAccesses, held in EquivalenceClasses<MemAccessInfo> which clusters stores-to-same-address together (and load-to-same-address together, but separately from the stores). One way of fixing this may be:

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 36bd9a8b7ea..a3f40826861 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1641,13 +1641,21 @@ bool MemoryDepChecker::areDepsSafe(DepCandidates &AccessSets,
     // Check every access pair.
     while (AI != AE) {
       Visited.insert(*AI);
-      EquivalenceClasses<MemAccessInfo>::member_iterator OI = std::next(AI);
+      bool AIIsWrite = AI->getInt();
+      // Check loads only against next equivalent class, but stores also against
+      // other stores in the same equivalence class - to the same address.
+      EquivalenceClasses<MemAccessInfo>::member_iterator OI =
+          (AIIsWrite ? AI : std::next(AI));
       while (OI != AE) {
         // Check every accessing instruction pair in program order.
         for (std::vector<unsigned>::iterator I1 = Accesses[*AI].begin(),
              I1E = Accesses[*AI].end(); I1 != I1E; ++I1)
-          for (std::vector<unsigned>::iterator I2 = Accesses[*OI].begin(),
-               I2E = Accesses[*OI].end(); I2 != I2E; ++I2) {
+          // Scan all accesses of another equivalence class, but only the next
+          // accesses of the same equivalent class.
+          for (std::vector<unsigned>::iterator
+                   I2 = (OI == AI ? std::next(I1) : Accesses[*OI].begin()),
+                   I2E = (OI == AI ? I1E : Accesses[*OI].end());
+               I2 != I2E; ++I2) {
             auto A = std::make_pair(&*AI, *I1);
             auto B = std::make_pair(&*OI, *I2);

(could possibly save dependences by recording only those between pairs of adjacent stores to same address, relying on transitivity)

Jul 11 2019, 4:45 AM · Restricted Project
ebrevnov added inline comments to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.
Jul 11 2019, 4:28 AM · Restricted Project

Jul 10 2019

ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Why do we need both changes, reading the cache and writing it, assuming I understand the code correctly?
I would have assumed not caching "inv-load" results would fix the problem. And I can see we do not want to reuse
non-inv-load results for inv-loads. However, I would argue one could always check for a cached result and use
it if the result is "good". Do I miss something here?

Yes, you are reading the change correctly. The thing is that both results (one for inv-load case another for non inv-load) are "good". They are just different. Each result should be used for corresponding case only and not for the other. In other words using result from non-inv case for inv-case is not correct as well. Does it make sense?

No, but I'm tired. I would have assumed that I can drop the inv-load metadata without semantic change. If that is the case, I should be able to use the non-inv result if it is cached and already good enough. That would mean we can keep the lookup but do only return the cached result if it is "the best". Maybe my problem is that I do not know the lattice of potential values.

Either way, is it possible and useful to test for the number of (un)cached queries?

Jul 10 2019, 3:13 AM · Restricted Project

Jul 9 2019

ebrevnov added inline comments to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.
Jul 9 2019, 9:10 PM · Restricted Project
ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Why do we need both changes, reading the cache and writing it, assuming I understand the code correctly?
I would have assumed not caching "inv-load" results would fix the problem. And I can see we do not want to reuse
non-inv-load results for inv-loads. However, I would argue one could always check for a cached result and use
it if the result is "good". Do I miss something here?

Jul 9 2019, 9:10 PM · Restricted Project
ebrevnov updated the diff for D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Adding unit test.

Jul 9 2019, 5:36 AM · Restricted Project
ebrevnov added a comment to D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

There should be a test.

Jul 9 2019, 5:22 AM · Restricted Project
ebrevnov retitled D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151 from [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. (PR42151) to [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.
Jul 9 2019, 5:14 AM · Restricted Project
ebrevnov updated the diff for D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.

Remove accidental changes.

Jul 9 2019, 5:08 AM · Restricted Project
ebrevnov created D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151.
Jul 9 2019, 4:58 AM · Restricted Project

Jul 8 2019

ebrevnov accepted D63439: Wordsmith the semantics of invariant.load.

LGTM

Jul 8 2019, 9:05 PM · Restricted Project

Jul 7 2019

ebrevnov added a comment to D62563: Fix incorrect expand of non-linear addrecs.

There is a https://bugs.llvm.org/show_bug.cgi?id=42384 failing due same problem.

Jul 7 2019, 9:31 PM · Restricted Project

Jul 3 2019

ebrevnov abandoned D64126: # Enter a commit message. # # Changes: # # lib/Analysis/ScalarEvolution.cpp [Scalar Evolution][Test] Disable distribution of truncation..
Jul 3 2019, 2:57 AM · Restricted Project
ebrevnov created D64126: # Enter a commit message. # # Changes: # # lib/Analysis/ScalarEvolution.cpp [Scalar Evolution][Test] Disable distribution of truncation..
Jul 3 2019, 2:52 AM · Restricted Project

Jun 30 2019

ebrevnov removed a parent revision for D63981: [LV] Avoid building interleaved group in presence of WAW dependency: D57180: [LV] Avoid adding into interleaved group in presence of WAW dependency.
Jun 30 2019, 11:57 PM · Restricted Project
ebrevnov added a parent revision for D63981: [LV] Avoid building interleaved group in presence of WAW dependency: D57180: [LV] Avoid adding into interleaved group in presence of WAW dependency.
Jun 30 2019, 11:57 PM · Restricted Project
ebrevnov removed a child revision for D57180: [LV] Avoid adding into interleaved group in presence of WAW dependency: D63981: [LV] Avoid building interleaved group in presence of WAW dependency.
Jun 30 2019, 11:57 PM · Restricted Project
ebrevnov added a child revision for D57180: [LV] Avoid adding into interleaved group in presence of WAW dependency: D63981: [LV] Avoid building interleaved group in presence of WAW dependency.
Jun 30 2019, 11:57 PM · Restricted Project
ebrevnov removed a reviewer for D57180: [LV] Avoid adding into interleaved group in presence of WAW dependency: ebrevnov.
Jun 30 2019, 11:52 PM · Restricted Project
ebrevnov updated the summary of D63981: [LV] Avoid building interleaved group in presence of WAW dependency.
Jun 30 2019, 11:51 PM · Restricted Project
ebrevnov created D63981: [LV] Avoid building interleaved group in presence of WAW dependency.
Jun 30 2019, 11:36 PM · Restricted Project

Jun 28 2019

ebrevnov added a reviewer for D57180: [LV] Avoid adding into interleaved group in presence of WAW dependency: ebrevnov.
Jun 28 2019, 5:06 AM · Restricted Project