
gareevroman (Roman)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 25 2015, 5:06 AM (375 w, 3 d)

Recent Activity

Aug 14 2022

gareevroman added a comment to rGa5d981045de7: [Polly] Remove the test case that depends on InstCombine and DeLICM..

Why? Reason?

Aug 14 2022, 6:07 AM · Restricted Project
gareevroman added inline comments to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
Aug 14 2022, 3:39 AM · Restricted Project, Restricted Project
gareevroman committed rGa5d981045de7: [Polly] Remove the test case that depends on InstCombine and DeLICM. (authored by gareevroman).
[Polly] Remove the test case that depends on InstCombine and DeLICM.
Aug 14 2022, 3:38 AM · Restricted Project

Aug 7 2022

gareevroman committed rGe8c9eb49ead0: [Polly] Suppress the LLVM-IR output for pattern matching tests, if there is no… (authored by gareevroman).
[Polly] Suppress the LLVM-IR output for pattern matching tests, if there is no…
Aug 7 2022, 4:56 AM · Restricted Project
gareevroman added a comment to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..

Thank you Gareev. I think the description can still be improved, but we should also move forward and can improve iteratively.

Looking forward to the actual TC optimization.

Aug 7 2022, 4:27 AM · Restricted Project, Restricted Project
gareevroman committed rGb02c7e2b630a: [Polly] Generalize the pattern matching to the case of tensor contractions (authored by gareevroman).
[Polly] Generalize the pattern matching to the case of tensor contractions
Aug 7 2022, 4:23 AM · Restricted Project
gareevroman closed D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
Aug 7 2022, 4:22 AM · Restricted Project, Restricted Project

Jul 29 2022

gareevroman added inline comments to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
Jul 29 2022, 11:56 PM · Restricted Project, Restricted Project
gareevroman added a comment to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
void foo(int n, double C[1024][1024], double A[1024][64][64], double B[64][1024][64]) {
  for (int i = 0; i < 1024; i++)
    for (int j = 0; j < 1024; j++)
      for (int l = 0; l < 64; l++)
        for (int w = 0; w < 64; ++w)
          if (w != 0)
            C[i][j] += A[i][l][w] * B[w][j][l];
}

ScopBuilder generates the following memory accesses:

{ Stmt3[i0, i1, i2, i3] -> MemRef2[o0, o1] : o0 = i0 and o1 = i1 }
{ Stmt3[i0, i1, i2, i3] -> MemRef2[o0, o1] : o0 = i0 and o1 = i1 }
{ Stmt3[i0, i1, i2, i3] -> MemRef1[o0, o1, o2] : o0 = 1 + i3 and o1 = i1 and o2 = i2 }
{ Stmt3[i0, i1, i2, i3] -> MemRef0[o0, o1, o2] : o0 = i0 and o1 = i2 and o2 = 1 + i3 }

In the context of the previous discussion, I meant that memory accesses are modified in comparison to the previous considered case.

Where does it come from?

If we look at the domain for the i3 variable, we see that the value 0 from the domain of w-loop is excluded and the loop bounds are modified to start from 0. Memory accesses correspond to this.

Looks like some other optimization (maybe JumpThreading?) modifies the loop range. Ideally, the detection would be robust enough to not depend on whether the domain space has an offset.

Jul 29 2022, 11:48 PM · Restricted Project, Restricted Project
gareevroman updated the diff for D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
Jul 29 2022, 11:43 PM · Restricted Project, Restricted Project

Jun 12 2022

gareevroman added a comment to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..

Yes, it was intended. The transformation helps to optimize a class of programs, which is broader than a tensor contraction. However, it heavily depends on the codegen part. I think that the improvement of the detection can be the goal of the future work.

Please document what pattern is intended to be recognized. I don't think the doc for isTCPattern is sufficient, it only mentions what is checked. Documenting the intended pattern would help identify if a check has been forgotten. E.g. for the statement domain.

Jun 12 2022, 4:12 AM · Restricted Project, Restricted Project
gareevroman updated the diff for D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
Jun 12 2022, 3:58 AM · Restricted Project, Restricted Project

May 23 2022

gareevroman added a comment to D125202: [Polly] Disable matmul pattern-match + -polly-parallel.

I would suggest to parallelize the second loop around the micro-kernel by default. It would not violate the dependencies. In general, it can provide a good opportunity for parallelization (please, see [1] and [2]). In particular, the reduction of time spent in this loop may cancel out the cost of packing the elements of the created array Packed_A into the L2 cache.

I fear that $loop_4$ does not have enough work to justify the parallelization overhead.

May 23 2022, 2:48 AM · Restricted Project, Restricted Project, Restricted Project

May 15 2022

gareevroman added a comment to D125202: [Polly] Disable matmul pattern-match + -polly-parallel.

I added this in aa8a976174c7ac08676bbc7bb647f6bc0efd2e72 and I think it does not actually make anything parallel, but I am not sure it is actually allowed due to Packed_A shared between all the threads.

May 15 2022, 11:16 PM · Restricted Project, Restricted Project, Restricted Project

May 9 2022

gareevroman added a comment to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..

The following is successfully detected as tensor contraction. Is this intended?

May 9 2022, 8:06 AM · Restricted Project, Restricted Project
gareevroman updated the diff for D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
May 9 2022, 7:44 AM · Restricted Project, Restricted Project

Apr 24 2022

gareevroman added a comment to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..

Thank you very much for the review! I am sorry for the late response. I will try to address all your comments within the next few weeks.

Apr 24 2022, 10:03 PM · Restricted Project, Restricted Project

Feb 5 2022

gareevroman added a comment to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..

ping

Feb 5 2022, 11:32 PM · Restricted Project, Restricted Project

Dec 12 2021

gareevroman added inline comments to D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
Dec 12 2021, 1:30 AM · Restricted Project, Restricted Project
gareevroman updated the diff for D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..

Thank you very much for the review!

Dec 12 2021, 1:25 AM · Restricted Project, Restricted Project

Nov 21 2021

gareevroman requested review of D114336: [Polly] Generalize the pattern matching to the case of tensor contractions..
Nov 21 2021, 6:10 AM · Restricted Project, Restricted Project

Sep 28 2021

gareevroman added a comment to D110491: [Polly] Check the properties of accesses to operands of a matrix-matrix multiplication.

The test needs an assertions-enabled build (REQUIRES: asserts) to print anything. That is, it will always pass on non-assert builds.

Sep 28 2021, 10:59 AM · Restricted Project
gareevroman committed rG113fa82c3ca4: [Polly] Check the properties of accesses to operands of a matrix-matrix (authored by gareevroman).
[Polly] Check the properties of accesses to operands of a matrix-matrix
Sep 28 2021, 10:59 AM
gareevroman closed D110491: [Polly] Check the properties of accesses to operands of a matrix-matrix multiplication.
Sep 28 2021, 10:59 AM · Restricted Project

Sep 25 2021

gareevroman requested review of D110491: [Polly] Check the properties of accesses to operands of a matrix-matrix multiplication.
Sep 25 2021, 10:38 PM · Restricted Project

Oct 7 2019

gareevroman closed D35761: [Polly][WIP] Use SCEV information for the second level aliasing.
Oct 7 2019, 5:41 AM · Restricted Project

Sep 25 2017

gareevroman added a comment to D38218: [Polly] Information about generalized matrix multiplication.

Hi Tobias,

Sep 25 2017, 5:02 AM
gareevroman created D38218: [Polly] Information about generalized matrix multiplication.
Sep 25 2017, 5:02 AM
gareevroman updated the diff for D38218: [Polly] Information about generalized matrix multiplication.

The png file was missed.

Sep 25 2017, 5:02 AM

Sep 11 2017

gareevroman created D37692: [Polly] Unroll and separate the remaining parts of isolation.
Sep 11 2017, 8:05 AM

Aug 31 2017

gareevroman added a comment to D37340: [Polly] Run GVN during the cleanup.

Any reason to use NewGVN instead of (old) GVN?

Aug 31 2017, 11:07 PM
gareevroman created D37340: [Polly] Run GVN during the cleanup.
Aug 31 2017, 9:22 AM

Aug 26 2017

gareevroman updated subscribers of D37178: [Polly] Use the information about the target cache provided by the TargetTransformInfo.
Aug 26 2017, 2:13 AM
gareevroman created D37178: [Polly] Use the information about the target cache provided by the TargetTransformInfo.
Aug 26 2017, 2:12 AM

Aug 24 2017

gareevroman added a comment to D37051: Model cache size and associativity in TargetTransformInfo.

I think throughput and latency of vector fma instructions are pretty constant across micro-architectures too. Can we also add them?

Sorry, probably, it’d require to specify it for each architecture.

We could do this. Some of the backend experts might be able to help you how to do this best. I think we should do this in a separate patch. Care to propose one?

Aug 24 2017, 12:37 AM

Aug 23 2017

gareevroman added a comment to D37051: Model cache size and associativity in TargetTransformInfo.

I think throughput and latency of vector fma instructions are pretty constant across micro-architectures too. Can we also add them?

Aug 23 2017, 1:02 AM
gareevroman added a comment to D37051: Model cache size and associativity in TargetTransformInfo.

information (based on ideas from "Analytical Models for the BLIS Framework").

Could you add the reference to the paper to the summary?

Aug 23 2017, 12:58 AM

Aug 22 2017

gareevroman added a comment to D36928: [Polly][MatMul][WIP] Disable the Loop Vectorizer.

Hi Hal,

I think this is conceptually the right approach. We currently generate code -- with explicit register unrolling -- and expect the SLP vectorizer to perform the vectorization. I believe communicating this information via explicit metadata is reasonable.

We may want to move towards using the LLVM loop vectorizer rather than the SLP vectorizer, but this requires both changes to the loop vectorizer and to our code generation strategy. We should certainly consider this, but I feel that this could be separate steps. 1) clarify current behavior and fix regressions, 2) expand the loop vectorizer, 3) change our code generation logic.

Roman, I think Hal is right that we should look into how to improve the loop vectorizer.

Aug 22 2017, 10:39 AM

Aug 21 2017

gareevroman added inline comments to D36460: [Polly][MatMul] Make MatMul detection independent of internal isl representations..
Aug 21 2017, 12:00 AM
gareevroman added a comment to D36928: [Polly][MatMul][WIP] Disable the Loop Vectorizer.

I suggest that you wait on this until we understand the regression.

In part, it's possible that this is the wrong fix. Based on the bug report, it is possible that the problem is that the vectorizer is generating runtime checks. Maybe aliasing metadata would help. Maybe it's unrolling too much.

Aug 21 2017, 12:00 AM

Aug 19 2017

gareevroman created D36928: [Polly][MatMul][WIP] Disable the Loop Vectorizer.
Aug 19 2017, 10:58 AM

Aug 8 2017

gareevroman added inline comments to D36460: [Polly][MatMul] Make MatMul detection independent of internal isl representations..
Aug 8 2017, 10:28 AM
gareevroman added a comment to D35761: [Polly][WIP] Use SCEV information for the second level aliasing.

OK, that's fine with me then. Can you possibly add check lines for the load and store to the test case to illustrate which alias data they refer to?

Aug 8 2017, 9:57 AM · Restricted Project
gareevroman added a comment to D36460: [Polly][MatMul] Make MatMul detection independent of internal isl representations..

The pattern recognition for MatMul is restrictive.

Aug 8 2017, 7:19 AM

Aug 5 2017

gareevroman added a comment to D36278: [Polly][WIP] Do not use isl_set_project_out to get all loop prefixes.

Hi Roman,

thank you for the update.

Hi Tobias,

thanks for the reply.

The function drop_constraints_involving_dims seems to accidentally drop more constraints that involve outer dimensions, but this does not seem to be a change we actually understand well.

Please, correct me if I'm wrong, but I think it's fine to drop more constraints because it should help to get the same result:

This will not result in incorrect program output, but it will be rather fragile as it just happens to work by accident due to the way the given set is modeled today. If it does not work with projection, our approach is likely not fully correct. As I know you are under time constraints, I spent the morning debugging this myself. Please see the latest email "Isolation with non-convex conditions" to the isl mailing list for some information about the bug. It seems isl is over-approximating the isolated set in case the set is non-convex. This pulls in iterations that require conditions in the innermost loop. I am fine committing this as a temporary workaround if you are happy to fix it after your deadline. In this case, just put a TODO note for us to not forget.

Aug 5 2017, 6:36 AM
gareevroman updated the diff for D35761: [Polly][WIP] Use SCEV information for the second level aliasing.
  1. Why do we need to use SCEV? Should we not be able to tell from our information which base pointers are expected to be identical?
Aug 5 2017, 2:19 AM · Restricted Project

Aug 4 2017

gareevroman requested review of D36278: [Polly][WIP] Do not use isl_set_project_out to get all loop prefixes.

Hi Roman,

thank you for the update.

Aug 4 2017, 9:30 AM

Aug 3 2017

gareevroman created D36278: [Polly][WIP] Do not use isl_set_project_out to get all loop prefixes.
Aug 3 2017, 11:55 AM

Jul 26 2017

gareevroman added a comment to D35845: [Polly][ScheduleOptimizer] Translate to C++ bindings.

Very straightforward. ++

Jul 26 2017, 7:59 AM

Jul 25 2017

gareevroman created D35845: [Polly][ScheduleOptimizer] Translate to C++ bindings.
Jul 25 2017, 10:14 AM

Jul 22 2017

gareevroman added inline comments to D33138: [Polly][WIP] Make the pattern matching work with modified memory accesses.
Jul 22 2017, 5:14 AM
gareevroman created D35761: [Polly][WIP] Use SCEV information for the second level aliasing.
Jul 22 2017, 12:25 AM · Restricted Project

Jul 19 2017

gareevroman updated the diff for D33138: [Polly][WIP] Make the pattern matching work with modified memory accesses.

PS: Can you try to upload your patches with full context?

Jul 19 2017, 9:31 AM

Jun 26 2017

gareevroman added inline comments to D34609: [Polly][WIP] Insert copy statements into the domain of the schedule tree.
Jun 26 2017, 12:21 AM
gareevroman created D34609: [Polly][WIP] Insert copy statements into the domain of the schedule tree.
Jun 26 2017, 12:20 AM

Jun 18 2017

gareevroman added a comment to D33138: [Polly][WIP] Make the pattern matching work with modified memory accesses.

Hi Roman,

I just tried to run the following command on polybench 3.2 and could not get the pattern based optimizations to work:

clang linear-algebra/kernels/gemm/gemm.c -O3 -DPOLYBENCH_TIME -I utilities/ -mllvm -polly -mllvm -polly-tiling=true -mllvm -polly-position=before-vectorizer -mllvm -polly-enable-delicm -mllvm -debug-only=polly-delicm -mllvm -polly-delicm-overapproximate-writes -mllvm -debug-only=polly-ast -mllvm -polly=true -fno-vectorize -fno-inline -mllvm -polly-enable-simplify utilities/polybench.c

using

git-svn-id: https://llvm.org/svn/llvm-project/polly/trunk@302926 91177308-0d34-0410-b5e6-96231b3b80d8

as well as:

[Polly][WIP] Make the pattern matching work with modified memory accesses
Differential Revision: https://reviews.llvm.org/D33138

[Polly][Simplify] Remove writes that are overwritten.
Differential Revision: https://reviews.llvm.org/D33142

Any idea what might be missing?

Best,
Tobias

Jun 18 2017, 7:26 AM
gareevroman updated the diff for D33138: [Polly][WIP] Make the pattern matching work with modified memory accesses.

Update the revision.

Jun 18 2017, 7:24 AM
gareevroman added a comment to D31842: [Polly] Load hoisting of indirect loads.

Hi,

thanks for the comments!

All the callers of isHoistableLoad do add the load to Context.RequiredILS to ensure that the load is indeed preloaded (a precondition to be a SCoP). With this patch this is only done with the LoadInst itself, but not the loads it depends on.
In your test case it still happens anyway because hoistability/invariance is also checked individually for their base pointer to be invariant so that they get into the RequiredILS. However, by themselves they might not be required to be load-hoisted. This could result in the %tmp3 to be hoisted, but not %tmp2 or %tmp1.
Could you either add a comment explaining why the dependent loads are also always load-hoisted - or - also add the dependent loads to RequiredILS?
I tested your patch on the test-suite. Compilation of test-suite/MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/jidctfst.c does not finish.

AFAIU, Johannes's approach doesn't cause the issue.

Johannes, could you please clarify how isHoistableLoad function works? I just want to make sure I understand it correctly:

  1. It checks that the pointer operand of LInst is unchanging in the loops of region R.
  2. Subsequently, it checks that there is no instruction UserI from R that can modify the pointer operand and/or dominates all predecessors of the region exit of R.

Why should we check that UserI doesn't dominate all predecessors of the region exit of R?

Jun 18 2017, 7:23 AM

May 12 2017

gareevroman added a comment to D33138: [Polly][WIP] Make the pattern matching work with modified memory accesses.

The isLatestArrayKind() part is clear, but what does the isMatMulOperandAcc part do?

May 12 2017, 11:52 PM
gareevroman added inline comments to D33138: [Polly][WIP] Make the pattern matching work with modified memory accesses.
May 12 2017, 10:11 AM
gareevroman created D33138: [Polly][WIP] Make the pattern matching work with modified memory accesses.
May 12 2017, 10:10 AM
gareevroman updated the diff for D31842: [Polly] Load hoisting of indirect loads.

thanks for the comments!

May 12 2017, 10:09 AM

Apr 25 2017

gareevroman added inline comments to D31703: [Polly] [DependenceInfo] Change reduction building to use May-writes..
Apr 25 2017, 11:47 PM

Apr 7 2017

gareevroman created D31842: [Polly] Load hoisting of indirect loads.
Apr 7 2017, 11:48 PM
gareevroman added a comment to D31703: [Polly] [DependenceInfo] Change reduction building to use May-writes..

@gareevroman: Could you take a look at containsOnlyMatMulDep and affirm that the assumption made is correct?

Apr 7 2017, 11:39 PM

Apr 5 2017

gareevroman added inline comments to D31741: [Polly][WIP] Restore the initial ordering of dimensions before applying the pattern matching.
Apr 5 2017, 11:58 PM
gareevroman created D31741: [Polly][WIP] Restore the initial ordering of dimensions before applying the pattern matching.
Apr 5 2017, 11:57 PM

Apr 2 2017

gareevroman added inline comments to D31386: [Polly] [DependenceInfo] change WAR, WAW generation to correct semantics..
Apr 2 2017, 2:55 AM

Mar 24 2017

gareevroman added a comment to D31244: [DependenceInfo] Remove access to WAW and WAR: Expose both as False [NFC].

Hi Siddharth,

Mar 24 2017, 3:43 AM

Mar 22 2017

gareevroman updated the diff for D30605: [Polly] Map the new load to the base pointer of the invariant load hoisted load.

Add the removed line.

Mar 22 2017, 2:41 AM
gareevroman updated the diff for D30605: [Polly] Map the new load to the base pointer of the invariant load hoisted load.

Since, according to "assert(MA->isArrayKind() && MA->isRead())", MA models an array and it is a read memory access, AccInst, the access instruction of MA, is a base pointer and PreloadVal should be always mapped to it.

Mar 22 2017, 2:37 AM

Mar 13 2017

gareevroman updated the diff for D30606: [Polly] Introduce another level of metadata to distinguish non-aliasing accesses.

Add the test case.

Mar 13 2017, 2:13 AM
gareevroman updated the diff for D30606: [Polly] Introduce another level of metadata to distinguish non-aliasing accesses.

Use mark nodes to mark base pointers.

Mar 13 2017, 2:01 AM

Mar 12 2017

gareevroman added a comment to D30815: [Polly][ScheduleOptimizer] Allow tiling after fusion.

Thanks for the patch!

Mar 12 2017, 4:33 AM · Restricted Project

Mar 11 2017

gareevroman updated the diff for D30605: [Polly] Map the new load to the base pointer of the invariant load hoisted load.

Hi Tobias,

Mar 11 2017, 7:42 AM

Mar 4 2017

gareevroman added a comment to D30606: [Polly] Introduce another level of metadata to distinguish non-aliasing accesses.

P.S: This is a draft of the patch. Currently, it's the responsibility of the user of IRBuilder to make sure that only individual non-aliasing accesses have pointer operands marked with "polly.no.inter.iteration.aliasing" metadata.

Mar 4 2017, 5:55 AM
gareevroman created D30606: [Polly] Introduce another level of metadata to distinguish non-aliasing accesses.
Mar 4 2017, 5:54 AM
gareevroman created D30605: [Polly] Map the new load to the base pointer of the invariant load hoisted load.
Mar 4 2017, 5:53 AM

Feb 26 2017

gareevroman created D30394: [Polly] Disable the parallel code generation in case of extension nodes.
Feb 26 2017, 11:09 PM

Feb 23 2017

gareevroman created D30293: [Polly] Make optimizations based on pattern matching be enabled by default.
Feb 23 2017, 3:38 AM

Feb 10 2017

gareevroman added a comment to D29814: [Polly] Check reduction dependencies in case of the matrix multiplication optimization.

2017-02-10 13:33 GMT+05:00 Tobias Grosser via Phabricator <reviews@reviews.llvm.org>:

grosser added a comment.

Otherwise, this looks good to me.

Some of the lower changes seem to be mostly stylistic. In case they are, I suggest to commit them ahead of time and then add the actual functionality change separately. In case it introduces a bug, the surface is smaller.

Feb 10 2017, 11:26 PM
gareevroman added a comment to D29269: [Polly] Use the size of the widest type of the matrix multiplication operands.

getTypeAllocSize() is wrong. E.g. for 8-bit char it would return 64 on most 64-bit platforms (its alignment), but e.g. SSE can put 16 of them into a 128-bit xmm register.

I've tried to reproduce it on x86-64 (the test case can be found in https://reviews.llvm.org/D29814). However, getTypeAllocSize() returns 1 for 8-bit char. Could you please advise me how to reproduce it?

It is a bit more complicated than I imagined. What this takes into account is "ABIAlignment", which depends on the platform. What we are looking for is a type that occupies more space per element in an array than sizeof() returns. This is possible with a struct { int i; char c; },

An example without struct is X86's long double type with a TypeSize of 80 bits and an AllocSize of 128 bits.

Feb 10 2017, 11:10 PM
gareevroman updated the diff for D29269: [Polly] Use the size of the widest type of the matrix multiplication operands.

getTypeAllocSize() is wrong. E.g. for 8-bit char it would return 64 on most 64-bit platforms (its alignment), but e.g. SSE can put 16 of them into a 128-bit xmm register.

Feb 10 2017, 12:31 AM
gareevroman created D29814: [Polly] Check reduction dependencies in case of the matrix multiplication optimization.
Feb 10 2017, 12:30 AM

Feb 8 2017

gareevroman updated the diff for D29244: [Polly] Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication..

With "hurt performance when the gemm is not parametric" I was thinking that unrolling non-parametric loops could be unnecessary, as the loop-vectorizer would handle them already? Instead, (partial?) unrolling could add an additional burden like code size blowup (or additional control flow).

Feb 8 2017, 6:15 AM

Feb 3 2017

gareevroman updated the diff for D29269: [Polly] Use the size of the widest type of the matrix multiplication operands.

Hi Michael,

Feb 3 2017, 11:14 PM
gareevroman updated the diff for D29244: [Polly] Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication..

Hi Michael,

Feb 3 2017, 11:12 PM

Jan 29 2017

gareevroman created D29269: [Polly] Use the size of the widest type of the matrix multiplication operands.
Jan 29 2017, 11:26 PM
gareevroman updated the diff for D28021: [Polly] Update the documentation on how the packing transformation is implemented.

Update according to the comments.

Jan 29 2017, 1:24 AM

Jan 28 2017

gareevroman added a comment to D29244: [Polly] Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication..

This patch also makes ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern. Otherwise, standardBandOpts can try to tile a band node with anchored subtree. Furthermore, it seems that it's not correct to apply standard optimizations, when the matrix multiplication has been detected. Should we commit it in a separate patch?

Jan 28 2017, 12:04 AM
gareevroman created D29244: [Polly] Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication..
Jan 28 2017, 12:04 AM

Jan 27 2017

gareevroman added a comment to D28021: [Polly] Update the documentation on how the packing transformation is implemented.

Thanks for the comments! I'll try to address them soon.

Jan 27 2017, 8:49 AM

Jan 26 2017

gareevroman added a comment to D28020: [Polly] Align newly created arrays to the first level cache line boundary.

Hi Hongbin,

Jan 26 2017, 12:37 AM
gareevroman updated the diff for D28357: [Polly] A new algorithm for identification of a SCoP statement that implement a matrix multiplication.

Hi Michael,

Jan 26 2017, 12:34 AM

Jan 18 2017

gareevroman updated the diff for D28357: [Polly] A new algorithm for identification of a SCoP statement that implement a matrix multiplication.

Thanks for comments! I've updated the patch according to them.

Jan 18 2017, 11:38 PM
gareevroman closed D28020: [Polly] Align newly created arrays to the first level cache line boundary.

Hi Roman,

Jan 18 2017, 4:28 AM

Jan 5 2017

gareevroman added a comment to D28357: [Polly] A new algorithm for identification of a SCoP statement that implement a matrix multiplication.

I've found out that test/ScheduleOptimizer/pattern-matching-based-opts.ll, test/ScheduleOptimizer/pattern-matching-based-opts_2.ll, test/ScheduleOptimizer/pattern-matching-based-opts_3.ll contain code that produced array accesses, which should be delinearized to be identified by the algorithm. Since, as far as I understand, they are delinearized in common case, these test cases were modified. Should we commit it in a separate patch?

Jan 5 2017, 8:19 AM
gareevroman retitled D28357: [Polly] A new algorithm for identification of a SCoP statement that implement a matrix multiplication from to [Polly] A new algorithm for identification of a SCoP statement that implement a matrix multiplication.
Jan 5 2017, 8:18 AM

Dec 25 2016

gareevroman updated the diff for D28090: [Polly] Specify the default values of the cache parameters.

Update according to the comments.

Dec 25 2016, 7:11 AM
gareevroman updated the diff for D28090: [Polly] Specify the default values of the cache parameters.

Hi Roman,

Dec 25 2016, 1:58 AM