Page MenuHomePhabricator

congzhe (Congzhe Cao)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 2 2020, 7:36 AM (168 w, 3 d)

Recent Activity

Tue, Mar 7

congzhe added inline comments to D144378: [LoopDataPrefetch] Support metadata llvm.loop.prefetch for prefetch (part 2).
Tue, Mar 7, 9:00 PM · Restricted Project, Restricted Project
congzhe added inline comments to D144377: [clang] Add #pragma clang loop [no]prefetch() (part1).
Tue, Mar 7, 8:59 PM · Restricted Project, Restricted Project

Jan 16 2023

congzhe committed rGee7188c8b2ab: [LoopInterchange] Correcting the profitability check (authored by ram-NK).
[LoopInterchange] Correcting the profitability check
Jan 16 2023, 11:37 AM · Restricted Project, Restricted Project
congzhe closed D135808: [LoopInterchange] Correcting the profitability check.
Jan 16 2023, 11:36 AM · Restricted Project, Restricted Project, Restricted Project

Dec 13 2022

congzhe added a reviewer for D135808: [LoopInterchange] Correcting the profitability check: Restricted Project.
Dec 13 2022, 8:13 PM · Restricted Project, Restricted Project, Restricted Project

Dec 12 2022

congzhe added inline comments to D135808: [LoopInterchange] Correcting the profitability check.
Dec 12 2022, 3:28 PM · Restricted Project, Restricted Project, Restricted Project

Dec 1 2022

congzhe added inline comments to D135808: [LoopInterchange] Correcting the profitability check.
Dec 1 2022, 8:13 AM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to D135808: [LoopInterchange] Correcting the profitability check.

I would suggest to revise the summary and/or the title to describe that this patch now also addresses the endless interchange problem, in addition to correction in isProfitableForVectorization().

Dec 1 2022, 8:02 AM · Restricted Project, Restricted Project, Restricted Project

Nov 17 2022

congzhe committed rGcd58333a6214: [LoopInterchange] Refactor and rewrite validDepInterchange() (authored by Narutoworld).
[LoopInterchange] Refactor and rewrite validDepInterchange()
Nov 17 2022, 10:41 AM · Restricted Project, Restricted Project
congzhe closed D137461: [LoopInterchange] Refactor and rewrite validDepInterchange().
Nov 17 2022, 10:41 AM · Restricted Project, Restricted Project, Restricted Project

Nov 11 2022

congzhe committed rGec210f394233: [LoopFuse] Ensure inner loops are in loop simplified form under new PM (authored by Narutoworld).
[LoopFuse] Ensure inner loops are in loop simplified form under new PM
Nov 11 2022, 12:58 PM · Restricted Project, Restricted Project
congzhe closed D137672: [LoopFuse] Ensure inner loops are in loop simplified form under new PM.
Nov 11 2022, 12:58 PM · Restricted Project, Restricted Project
congzhe accepted D137461: [LoopInterchange] Refactor and rewrite validDepInterchange().

Hi @bmahjour and @Meinersbur , would you like to take a look on the patch?

Nov 11 2022, 7:42 AM · Restricted Project, Restricted Project, Restricted Project

Nov 7 2022

congzhe added inline comments to D135808: [LoopInterchange] Correcting the profitability check.
Nov 7 2022, 9:03 PM · Restricted Project, Restricted Project, Restricted Project

Nov 3 2022

congzhe committed rG75b33d6bd518: [LoopInterchange] Check phis in all subloops (authored by congzhe).
[LoopInterchange] Check phis in all subloops
Nov 3 2022, 9:22 PM · Restricted Project, Restricted Project
congzhe closed D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable.
Nov 3 2022, 9:21 PM · Restricted Project, Restricted Project
congzhe added a comment to D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable.

According to the discussion at the LoopOptWG, I will land the patch to fix the miscompile for now. Will look into how to fully support such reduction scenarios as the next steps.

Nov 3 2022, 8:56 PM · Restricted Project, Restricted Project

Nov 1 2022

congzhe added inline comments to D136277: [LoopInterchange] Simplify DepMatrix to a dependency vector..
Nov 1 2022, 6:05 PM · Restricted Project, Restricted Project
congzhe added a comment to D136277: [LoopInterchange] Simplify DepMatrix to a dependency vector..

Just one more question on the summary: when combining the two dependencies [<,=] and [=,<] that generated [<=,<=], I think we could still interchange the combined dependency? [<=,<=] is non-negative and swapping the two elements <= and <= gives a non-negative dependency. Thus we are not converting a positive dependency to a negative dependency, hence not violating the dependency rules for a valid interchange. Perhaps with the current validDepInterchange() we cannot interchange [<=,<=], but like I mentioned in the past meetings I'm rewriting validDepInterchange() to make the legality decisions be purely based on the signs of the dependence vectors before/after interchange. With the rewritten validDepInterchange() I think we can interchange [<=,<=].

Nov 1 2022, 2:59 PM · Restricted Project, Restricted Project
congzhe added inline comments to D136277: [LoopInterchange] Simplify DepMatrix to a dependency vector..
Nov 1 2022, 12:59 PM · Restricted Project, Restricted Project

Oct 31 2022

congzhe updated subscribers of D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable.

LGTM. Thank you.

The additional check make sense.

Oct 31 2022, 7:10 PM · Restricted Project, Restricted Project
congzhe updated the diff for D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable.
Oct 31 2022, 7:09 PM · Restricted Project, Restricted Project
congzhe committed rGeda3c93486ca: [LoopFuse] Ensure loops are in loop simplified form under new PM (authored by Narutoworld).
[LoopFuse] Ensure loops are in loop simplified form under new PM
Oct 31 2022, 8:50 AM · Restricted Project, Restricted Project
congzhe closed D136781: [LoopFuse] Ensure loops are in loop simplified form under new PM.
Oct 31 2022, 8:50 AM · Restricted Project, Restricted Project

Oct 17 2022

congzhe added a comment to D135808: [LoopInterchange] Correcting the profitability check.

If outer loop dependency direction "=" and Inner loop dependency direction is "S" and "I" then, loop interchange is considered as profitable. Only two cases of dependency is profitable. But for vectorization, ">" and "<" dependency in outer loop is more profitable when interchanged. After patch [=,<] and [=,>] will be interchanged for vectorization.

Oct 17 2022, 11:33 AM · Restricted Project, Restricted Project, Restricted Project

Oct 4 2022

congzhe committed rGa58b6acf1f36: [NFC][LoopInterchange] Clean up of irrelevent dependency checking with (authored by ram-NK).
[NFC][LoopInterchange] Clean up of irrelevent dependency checking with
Oct 4 2022, 11:56 AM · Restricted Project, Restricted Project
congzhe closed D132982: [NFC][LoopInterchange] Clean up of Irrelevant dependency checking using isOuterMostDepPositive..
Oct 4 2022, 11:55 AM · Restricted Project, Restricted Project

Sep 29 2022

congzhe added a comment to D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.

Hello @congzhe ,

I think I'm seeing a miscompile with this patch:

opt "-passes=function(loop(loop-interchange))" bbi-74005_x86.ll -S -o -

Now I don't know how this is supposed to work but it looks to me that in the input code we read

%arrayidx14.promoted.i = load i32, ptr %arrayidx14.i, align 1

then do 512 rounds of the inner loop and add the read elements, and then we store what we have so far:

%18 = tail call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %16)
store i32 %18, ptr %arrayidx14.i, align 1

But after loop-interchange we load, then add, but then we dont do the store, but do the load and execute the inner loop again? So it looks like we throw away the calculated addition and just read the same values again?
Or am I missing something here?

Sep 29 2022, 9:14 PM · Restricted Project, Restricted Project, Restricted Project
congzhe retitled D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable from [LoopInterchange] Do not interchange when reduction in subloops is not recognizable to [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable.
Sep 29 2022, 9:13 PM · Restricted Project, Restricted Project
congzhe updated the summary of D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable.
Sep 29 2022, 9:12 PM · Restricted Project, Restricted Project
congzhe requested review of D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable.
Sep 29 2022, 5:46 PM · Restricted Project, Restricted Project

Sep 26 2022

congzhe added a comment to D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.

Hello @congzhe ,

I think I'm seeing a miscompile with this patch:

opt "-passes=function(loop(loop-interchange))" bbi-74005_x86.ll -S -o -

Now I don't know how this is supposed to work but it looks to me that in the input code we read

%arrayidx14.promoted.i = load i32, ptr %arrayidx14.i, align 1

then do 512 rounds of the inner loop and add the read elements, and then we store what we have so far:

%18 = tail call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %16)
store i32 %18, ptr %arrayidx14.i, align 1

But after loop-interchange we load, then add, but then we dont do the store, but do the load and execute the inner loop again? So it looks like we throw away the calculated addition and just read the same values again?
Or am I missing something here?

Sep 26 2022, 10:18 PM · Restricted Project, Restricted Project, Restricted Project

Sep 21 2022

congzhe committed rG22c91df52ccc: [LoopInterchange][PR57148] Ensure the correct form of IR after transformation (authored by congzhe).
[LoopInterchange][PR57148] Ensure the correct form of IR after transformation
Sep 21 2022, 9:21 PM · Restricted Project, Restricted Project
congzhe closed D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.
Sep 21 2022, 9:21 PM · Restricted Project, Restricted Project, Restricted Project
congzhe committed rG6782d71680ea: [LoopPassManager] Ensure to construct loop nests with the outermost loop (authored by congzhe).
[LoopPassManager] Ensure to construct loop nests with the outermost loop
Sep 21 2022, 9:00 PM · Restricted Project, Restricted Project
congzhe closed D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Sep 21 2022, 9:00 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added inline comments to D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.
Sep 21 2022, 8:32 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.
Sep 21 2022, 8:32 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

thanks, lgtm with one nit
sorry for the long back and forth

I do think that the loop nest infrastructure in general needs some work and doesn't fit in with the rest of the new pass manager infrastructure very well. there should have been an RFC (unless I missed it)

Sep 21 2022, 10:37 AM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Sep 21 2022, 10:36 AM · Restricted Project, Restricted Project, Restricted Project
congzhe accepted D132982: [NFC][LoopInterchange] Clean up of Irrelevant dependency checking using isOuterMostDepPositive..

LGTM but I'd suggest to wait a couple of days to see if people have other comments. @fhahn would you like to comment on this patch?

Sep 21 2022, 9:24 AM · Restricted Project, Restricted Project

Sep 19 2022

congzhe added inline comments to D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.
Sep 19 2022, 7:56 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.

I have no idea if this is the right fix, but I've verified that it solves the problem that I reported in pr 57148.
Thanks!

However, that was a reduced version of the problem, and if I try the original non-reduced version (unfortunately for an out-of-tree target) I get another failure with this patch.
So it might be a step in the right direction, but there are more problems lurking.

A reduced version of the new problem is

opt -passes="loop-interchange" bbi-72571_2.ll -o /dev/null

which results in

Instruction does not dominate all uses!
  %i.166.lcssa = phi i16 [ %3, %vector.body85.split ]
  %arrayidx60 = getelementptr inbounds [2 x [4 x i32]], ptr @c, i16 0, i16 %i.166.lcssa, i16 %j.165
LLVM ERROR: Broken module found, compilation aborted!

Sep 19 2022, 7:55 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.
Sep 19 2022, 7:54 PM · Restricted Project, Restricted Project, Restricted Project

Sep 16 2022

congzhe added a comment to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

I think I'm ok with this sort of patch being a temporary workaround
could you add a FIXME pointing to the discussion here?

Using PreservedAnalyses to tell the loop pass manager that the loop nest structure has changed is still IMO an abuse of it. The whole point of PreservedAnalyses is that we determine whether or not some cached analysis for a given IR unit should be invalidated. The loop nest analysis shouldn't be modeled as a loop analysis since it's not at the same IR level (that's why https://reviews.llvm.org/D132581 is wrong, caused issues, and was reverted). This is essentially adding an extra bit to the pass's return value to tell the loop pass manager to change how it behaves, which is not what PreservedAnalyses is intended to be used for. But that's exactly why LPMUpdater exists, for this sort of thing. So I'd still like to go back to the LPMUpdater version.

I also think there may be issues with loop analyses not being invalidated in regards to loop-interchange. If an inner loop has a cached analysis, right now I believe interchanging does not invalidate the (original) inner loop's analyses, but we may end up running a loop nest pass on it with its old analyses once it's the outer loop, even though the loop structure has changed. But that can be a separate patch.

Sep 16 2022, 9:49 AM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Sep 16 2022, 9:49 AM · Restricted Project, Restricted Project, Restricted Project

Sep 13 2022

congzhe added a comment to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

Sorry, I'd meant to reply but didn't find time, I'll try to be more prompt about responding to this

This is definitely better than the previous version.

IMO it would still be best if continually rerunning a pass on some IR would reach a fixed point. For example, when loop unswitching happens, we revisit the current loop (which means restarting the entire loop pipeline), and if we end up creating a new loop, we also add that new loop to the worklist. There has been care taken to ensure that unswitching cannot run forever. If interchange did this, we'd be able to just revisit the entire loop nest again. For the interchange example, you say that loop-interchange may continually swap between L1(L2) and L2(L1), one of them must be better than the other, why can't we converge on the better nesting? I don't quite understand the "locality vs vectorization" argument, there still must be one nesting that's ultimately optimal in the end. Then we can restart the pipeline on the loop nest, or just the loops that got swapped around.

But if that doesn't make sense, then something along these lines seems ok. I think it might be worth revisiting exactly how loop nest passes work, but I haven't thought too hard about it. But then again, this seems like a loop-interchange-specific issue, not a loop nest issue.

also you'll have to rebase again since my patch was reverted
also seems like we've dropped the test?

Sep 13 2022, 3:34 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Sep 13 2022, 3:33 PM · Restricted Project, Restricted Project, Restricted Project

Sep 12 2022

congzhe added a comment to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

ping @aeubanks :)
I'm wondering if you have comments on my most recent change?

Sep 12 2022, 11:43 AM · Restricted Project, Restricted Project, Restricted Project

Sep 6 2022

congzhe added a comment to D131606: [Loop Fusion] Sink/hoist memory instructions between loop fusion candidates.

Like I said I won't block the patch, I'm fine if other reviewers accept this patch.

Sep 6 2022, 12:25 PM · Restricted Project, Restricted Project

Sep 2 2022

congzhe updated the diff for D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

Rebased on D132581.

Sep 2 2022, 1:00 PM · Restricted Project, Restricted Project, Restricted Project

Sep 1 2022

congzhe added a comment to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

Will rebase on D132581 once it is landed.

Sep 1 2022, 7:53 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added inline comments to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Sep 1 2022, 7:50 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Sep 1 2022, 7:50 PM · Restricted Project, Restricted Project, Restricted Project

Aug 25 2022

congzhe added inline comments to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Aug 25 2022, 3:17 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to D131606: [Loop Fusion] Sink/hoist memory instructions between loop fusion candidates.

IIUC the problem you described has been addressed and handled in CodeMoverUtils already.

The function isSafeToMoveBefore() has the following signature where the last argument CheckForEntireBlock can be set to true such that previous instructions (before the current instruction that is under consideration) would be considered when checking if it is legal to move it. If I'm not mistaken, for the IR you posted we can actually move both %i1 and %i2 (by setting CheckForEntireBlock=true). You might want to check out this patch https://reviews.llvm.org/D110378, which introduced CheckForEntireBlock.

bool isSafeToMoveBefore(Instruction &I, Instruction &InsertPoint,
                        DominatorTree &DT,
                        const PostDominatorTree *PDT = nullptr,
                        DependenceInfo *DI = nullptr,
                        bool CheckForEntireBlock = false);

If I'm not mistaken (please correct me if I'm wrong), I wish we could reuse CodeMoverUtils since it could simplify things.

I tried some tests with isSafeToMoveBefore(). The sink_preheader.ll test fails, since %sinkme can be sunk but then %sinkme2 cannot sink, because isSafeToMoveBefore() apparently does not ignore users in the same block like it ignores operands in the same block when hoisting. The condition CheckForEntireBlock for sinking could be added, to mirror the case with hoisting, but I foresee other issues. For example, %sinkme3 in sink_preheader.ll should not be hoisted, though the only obstacle for hoisting it (%sinkme1 ) is within the same block as %sinkme3 itself; we need to know that %sinkme1 also cannot be hoisted, and it is insufficient to just ignore variable.

Aug 25 2022, 2:44 PM · Restricted Project, Restricted Project

Aug 24 2022

congzhe added a comment to D131606: [Loop Fusion] Sink/hoist memory instructions between loop fusion candidates.

The changes made in this patch, notably canHoistInst() and canSinkInst() seems to have duplicate functionality from CodeMoverUtils.cpp. I'm wondering if we can reuse the code from CodeMoverUtils.cpp? Or is there something that canHoistInst() and canSinkInst() can do but functions from CodeMoverUtils.cpp cannot do?

The difficulty I see with utilizing isSafeToMove{Before, After}() from CodeMoverUtils in this context is that when determining if an instruction in the preheader of FC1 can be sunk/hoisted, we need to take into account other instructions in the preheader of FC1 which are eligible for sinking/hoisting.

Consider this example:

define void @sink_preheader(i32 %N) {
pre1:
  br label %body1

body1:  ; preds = %pre1, %body1
  ...
  br i1 %cond, label %body1, label %pre2

pre2:
  %i1 = add i32 1, %N
  %i2 = add i32 1, %i1
  br label %body2

....
}

Both %i1 and %i2 can be hoisted, though isSafeToMoveBefore() would presumably not believe %i2 should be moved to the end of pre1. If you're seeing something I'm not, let me know and I'd be happy to simplify the code.

Aug 24 2022, 2:02 PM · Restricted Project, Restricted Project
congzhe added a comment to D131606: [Loop Fusion] Sink/hoist memory instructions between loop fusion candidates.

The changes made in this patch, notably canHoistInst() and canSinkInst() seems to have duplicate functionality from CodeMoverUtils.cpp. I'm wondering if we can reuse the code from CodeMoverUtils.cpp? Or is there something that canHoistInst() and canSinkInst() can do but functions from CodeMoverUtils.cpp cannot do?

Aug 24 2022, 1:28 PM · Restricted Project, Restricted Project

Aug 22 2022

congzhe updated the diff for D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Aug 22 2022, 3:21 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

LGTM other than the comment from Bardia.

Aug 22 2022, 3:20 PM · Restricted Project, Restricted Project, Restricted Project

Aug 18 2022

congzhe updated the summary of D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Aug 18 2022, 9:10 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to D124926: [LoopInterchange] New cost model for loop interchange.

Hello,

We stumbled upon a case where loop-interchange seems to start hanging and consume more and more memory until it cannot do that any longer with this patch:

opt -passes="loop(loop-interchange,loop-interchange)" bbi-72548.ll -o /dev/null

Hi Mikael, thanks for this. I've quickly looked at it and the problem can be avoided by running opt -passes="loop(loop-interchange),loop(loop-interchange))" bbi-72548.ll -o /dev/null using the new pass manager, or opt -loop-interchange -loop-interchange" bbi-72548.ll -o /dev/null using the legacy pass manager.

My impression is that the root cause is not within this patch but I suspect it is a problem with the loopnest pass pipeline within the new pass manager. This IR is a triply-nested loop, after the first interchange pass, it should still be a triply-nested loop. However, when running opt -passes="loop(loop-interchange,loop-interchange), what is populated into the loopnest data structure in the LoopInterchangePass::run() function at the beginning of the 2nd interchange pass is a 2-level (doubly-nested) loop. This is incorrect and caused the trouble (infinitely looping over the following code in LoopInterchange.cpp).

while (Dep.size() != Level) {
  Dep.push_back('I');
}

Seems like the loopnest pass pipeline does not get the loop correct, when the loop is modified during the pipeline.

When running opt -passes="loop(loop-interchange),loop(loop-interchange))" bbi-72548.ll -o /dev/null or opt -loop-interchange -loop-interchange" bbi-72548.ll -o /dev/null, what is populated into the loopnest data structure in the LoopInterchangePass::run() function at the beginning of the 2nd interchange pass is a 3-level (triply-nested) loop, which is correct.

I'll do more investigations and post updates here.

Aug 18 2022, 9:08 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the summary of D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Aug 18 2022, 9:06 PM · Restricted Project, Restricted Project, Restricted Project
congzhe requested review of D132199: [LoopPassManager] Ensure to construct loop nests with the outermost loop.
Aug 18 2022, 9:02 PM · Restricted Project, Restricted Project, Restricted Project

Aug 17 2022

congzhe added a comment to D124926: [LoopInterchange] New cost model for loop interchange.

I wrote an issue about a failed assertion we started seeing with this patch:
https://github.com/llvm/llvm-project/issues/57148

Note: Both the hanging and the new crash have popped up during fuzz testing with nonstandard pipelines.

Aug 17 2022, 11:36 AM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the summary of D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.
Aug 17 2022, 11:35 AM · Restricted Project, Restricted Project, Restricted Project
congzhe requested review of D132055: [LoopInterchange][PR57148] Ensure LCSSA form after loop interchnange.
Aug 17 2022, 11:34 AM · Restricted Project, Restricted Project, Restricted Project

Aug 12 2022

congzhe added a comment to D124926: [LoopInterchange] New cost model for loop interchange.

Hello,

We stumbled upon a case where loop-interchange seems to start hanging and consume more and more memory until it cannot do that any longer with this patch:

opt -passes="loop(loop-interchange,loop-interchange)" bbi-72548.ll -o /dev/null

Aug 12 2022, 11:58 AM · Restricted Project, Restricted Project, Restricted Project

Aug 3 2022

congzhe committed rG8dc4b2edfad7: [LoopInterchange][PR56275] Fix legality with negative dependence vectors (authored by congzhe).
[LoopInterchange][PR56275] Fix legality with negative dependence vectors
Aug 3 2022, 5:00 PM · Restricted Project, Restricted Project
congzhe committed rG76be5549318a: [DependenceAnalysis][PR56275] Normalize negative dependence analysis results (authored by congzhe).
[DependenceAnalysis][PR56275] Normalize negative dependence analysis results
Aug 3 2022, 5:00 PM · Restricted Project, Restricted Project
congzhe closed D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.
Aug 3 2022, 5:00 PM · Restricted Project, Restricted Project, Restricted Project
congzhe closed D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.
Aug 3 2022, 5:00 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.

Updated to take the bool return value of normalize() into account, will land soon.

Aug 3 2022, 4:36 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.

Addressed comments, will land soon.

Aug 3 2022, 4:24 PM · Restricted Project, Restricted Project, Restricted Project

Jul 29 2022

congzhe updated the diff for D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.

Updated the patch according to the most recent update in D130188

Jul 29 2022, 2:56 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.

Hi @bmahjour @Meinersbur, I've updated the patch such that normalize() changes the Dependence object in place, and I've also leveraged the capability of the NPM to parse pass options instead of using cl::opt. I'd appreciate it if you could take a look at this version, thanks a lot!

Jul 29 2022, 2:48 PM · Restricted Project, Restricted Project, Restricted Project

Jul 26 2022

congzhe added inline comments to D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.
Jul 26 2022, 11:48 AM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.
Jul 26 2022, 11:47 AM · Restricted Project, Restricted Project, Restricted Project
congzhe added inline comments to D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.
Jul 26 2022, 11:46 AM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.
Jul 26 2022, 11:46 AM · Restricted Project, Restricted Project, Restricted Project

Jul 25 2022

congzhe updated the diff for D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.

Thanks @bmahjour, I've tried to address all your comments in the updated version.

Jul 25 2022, 10:41 AM · Restricted Project, Restricted Project, Restricted Project

Jul 24 2022

congzhe added inline comments to D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.
Jul 24 2022, 4:16 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.
Jul 24 2022, 4:15 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.

Since the normalize() function is conceptually an operation that applies to the result (as opposed to the analysis itself), it makes more sense for it to be a member of the Dependence class. Client code, such as loop interchange, would then do something like this:

if (auto D = DI->depends(Src, Dst, true).normalize()) {
   ...

This way the analysis can be done once, but the results can be interpreted differently if the need arises.

Jul 24 2022, 4:08 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the diff for D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.
Jul 24 2022, 4:07 PM · Restricted Project, Restricted Project, Restricted Project

Jul 20 2022

congzhe committed rG05ccde8023a6: [LoopCacheAnalysis] Fix a type mismatch problem in cost calculation (authored by congzhe).
[LoopCacheAnalysis] Fix a type mismatch problem in cost calculation
Jul 20 2022, 10:58 PM · Restricted Project, Restricted Project
congzhe closed D128877: [LoopCacheAnalysis] Fix a type mismatch bug in cost calculation.
Jul 20 2022, 10:58 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added reviewers for D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors: kawashima-fj, bmahjour, Meinersbur, Restricted Project.
Jul 20 2022, 12:04 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added reviewers for D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required: kawashima-fj, bmahjour, Meinersbur, Restricted Project.
Jul 20 2022, 12:03 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the summary of D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.
Jul 20 2022, 12:02 PM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the summary of D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.
Jul 20 2022, 11:48 AM · Restricted Project, Restricted Project, Restricted Project
congzhe updated the summary of D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.
Jul 20 2022, 11:47 AM · Restricted Project, Restricted Project, Restricted Project
congzhe requested review of D130189: [LoopInterchange][PR56275] Fix legality in dependence vectors.
Jul 20 2022, 11:11 AM · Restricted Project, Restricted Project, Restricted Project
congzhe requested review of D130188: [DependenceAnalysis][PR56275] Normalize dependence analysis results to be non-negative when required.
Jul 20 2022, 11:09 AM · Restricted Project, Restricted Project, Restricted Project

Jul 18 2022

congzhe updated the diff for D128877: [LoopCacheAnalysis] Fix a type mismatch bug in cost calculation.

Thanks Michael for your suggestion, I've updated the patch accordingly.

Jul 18 2022, 1:43 PM · Restricted Project, Restricted Project, Restricted Project

Jul 13 2022

congzhe updated the diff for D128877: [LoopCacheAnalysis] Fix a type mismatch bug in cost calculation.

Updated the patch according to the discussion in the loopopt meeting. @Meinersbur I would appreciate it if you could take a second look, thanks a lot!

Jul 13 2022, 12:13 PM · Restricted Project, Restricted Project, Restricted Project

Jul 8 2022

congzhe added a comment to D128877: [LoopCacheAnalysis] Fix a type mismatch bug in cost calculation.

Hi folks, my apologies for the delay, and thanks for the input. I do agree with you that we need to treat this scev expansion more carefully, although for some places where getNoopOrAnyExtend() is called (like the piece of code that Bjorn posted above), both Stride and TripCount are positive so it might be straightforward to use SE.getNoopOrZeroExtend(), or maybe just SE.getNoopOrAnyExtend() as how it is used now.

Jul 8 2022, 12:18 PM · Restricted Project, Restricted Project, Restricted Project

Jun 30 2022

congzhe added a comment to D128877: [LoopCacheAnalysis] Fix a type mismatch bug in cost calculation.

Regarding Michael's question that whether SE.getNoopOrAnyExtend() is signed extension or unsigned extension: it is actually well handled in ScalarEvolution::getAnyExtendExpr() where it could do either signed or unsigned extension depending on the actual SCEV type of the value we want to extend. I'm wondering if it answers your question? @Meinersbur

Whether to use sext or zext can make a semantic difference. Say we have an element size of i16 4 and a stride of -1 (going backwards) in 8-bit precision. That represented as i8 255. zero-extension of that is i16 255, and multiplying both gives i16 1020 which should be -4 (i16 0xFFFC).

getNoopOrAnyExtend() seems to prefer zero-extension over sign-extension.

Jun 30 2022, 12:59 PM · Restricted Project, Restricted Project, Restricted Project

Jun 29 2022

congzhe updated the diff for D128877: [LoopCacheAnalysis] Fix a type mismatch bug in cost calculation.
Jun 29 2022, 10:49 PM · Restricted Project, Restricted Project, Restricted Project
congzhe added a comment to rGb941857b40ed: [LoopInterchange] New cost model for loop interchange.

Notice that the assert is hit also when using -passes='print<loop-cache-cost>' instead of running loop-interchange.
So I do not think the problem is in this patch, but rather that perhaps LoopCacheAnalysis is broken in some way?

Thanks for letting me know! Looking into it now.

Will post a patch for loop cache analysis to fix it.

Jun 29 2022, 10:38 PM · Restricted Project, Restricted Project