Sep 7 2016

JongwonLee retitled D24327: [DAGCombine] Modification of visitBR_CC from to [DAGCombine] Modification of visitBR_CC.
Sep 7 2016, 7:37 PM

May 10 2016

JongwonLee abandoned D18890: [AArch64] add SSA Load Store optimization pass.
May 10 2016, 1:50 AM
JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.

Hi,

So why aren't you doing this in codegen prepare or dag combine?

Cheers,

James

Hi,
I haven't tried implementing this anywhere other than the backend.
As I mentioned before, this work targets only AArch64, not other backends.
Are there any advantages to doing this work in codegen prepare or dag combine?

Hi Jongwon,

Sure, there are advantages. Modifying IR is substantially easier and less prone to error than modifying machine instructions. This is also a generic optimization fixing a problem that likely affects more than just the AArch64 target, therefore the right thing to do is to implement it in such a way that it will benefit other targets (unless that causes a very high cost).

In this case, it seems to me that implementing this higher up in the compiler would be quicker and easier, and would leave less technical debt later.

James

May 10 2016, 1:50 AM

May 4 2016

JongwonLee abandoned D19760: [AArch64] Turn on "aarch64-ssa-load-store-opt" by default.
May 4 2016, 12:12 AM
JongwonLee added a comment to D19760: [AArch64] Turn on "aarch64-ssa-load-store-opt" by default.

This is not just for SPEC. 10% build time regression in an important
benchmark with no performance improvement is a bad thing. The
improvements on the test-suite also show too small a margin to justify
such a big compiler-time regression.

If this was 2012, when Clang was 40~60% faster than GCC, I'd be ok
with the change once you can prove there's a relevant benchmark out
there that benefits immensely. So far, I haven't seen any evidence.

But right now, our compile time for most workloads has become worse
than GCC's, and 10% is a very big deal, especially now that James wants
to see the validity of this pass on other AArch64 cores.

Finally, such a big hit on compile time for a small back-end pass may
be an indication that you're doing something wrong. There may be a
much easier and faster way to do what you want, and I suggest you work
with James to make sure that situation improves.

cheers,
--renato

May 4 2016, 12:12 AM

May 1 2016

JongwonLee added a comment to D19760: [AArch64] Turn on "aarch64-ssa-load-store-opt" by default.

Hi,

Did the ssa patch get approved? I had outstanding questions on that that weren't resolved...

Cheers,

James

May 1 2016, 9:45 PM
JongwonLee added a comment to D19760: [AArch64] Turn on "aarch64-ssa-load-store-opt" by default.

This change request should be accompanied by benchmark numbers. If the ones you shared in the other request are accurate, a 10% compile-time regression for zero execution-time improvement in SPEC is unacceptable.

May 1 2016, 9:42 PM
JongwonLee added a comment to D19151: [SLPVectorizer] Set MinVecRegSize via a target hook.

ping

May 1 2016, 9:25 PM
JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.

Hi,

So why aren't you doing this in codegen prepare or dag combine?

Cheers,

James

May 1 2016, 9:24 PM

Apr 29 2016

JongwonLee retitled D19760: [AArch64] Turn on "aarch64-ssa-load-store-opt" by default from to [AArch64] Turn on "aarch64-ssa-load-store-opt" by default.
Apr 29 2016, 7:13 PM
JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.

This patch was motivated by a performance improvement in a commercial benchmark.
However, it behaves differently on SPEC and the LLVM test-suite.
On SPEC, compile time increases by about 10%. On the LLVM test-suite, compile time decreases by about 0.02%. Execution time does not regress on either suite (0.04% and 0.37% improvement, respectively). Each figure is the average of three runs.

Apr 29 2016, 6:59 PM
JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.

OK, but this functionality is useful for other backends too. It isn't ideal in the long term to put generic functionality needlessly in target-specific areas.

If you're implementing this in the SLP vectorizer, why do you need to do it here as well?

Apr 29 2016, 6:38 PM
JongwonLee added a comment to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.

On SPEC, compile time increases by 2.8%. On the LLVM test-suite, compile time decreases by 9%. Each figure is the average of three runs.

Apr 29 2016, 5:18 PM

Apr 27 2016

JongwonLee added a comment to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.

This patch doesn't change the default value of MinVecRegSize, so it does not affect performance.
(On top of this patch, another patch (http://reviews.llvm.org/D19151) is needed to change the default value of MinVecRegSize.)
In a commercial benchmark, I found that clang misses opportunities for 64-bit SLP vectorization on AArch64.
Although extending SLP vectorization to 64 bits may have side effects, the compiler should handle 64-bit SLP vectorization on AArch64. Any side effects should be handled in a separate patch.
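
To make the effect of MinVecRegSize concrete, here is a minimal sketch of trying register widths from the maximum down to the minimum. It is not the SLPVectorizer code: the function name `candidateVFs` and its parameters are hypothetical, and the halving step is an assumption made for illustration.

```cpp
#include <vector>

// Hypothetical helper (not from SLPVectorizer): list the vectorization
// factors that would be tried, widest register first, for a run of
// `numElems` consecutive scalars of `elemBits` bits each.
std::vector<unsigned> candidateVFs(unsigned elemBits, unsigned numElems,
                                   unsigned maxVecRegBits,
                                   unsigned minVecRegBits) {
  std::vector<unsigned> vfs;
  if (elemBits == 0)
    return vfs;
  // Assumption: widths are halved on each step. A width is only useful
  // if it holds at least two elements.
  for (unsigned regBits = maxVecRegBits;
       regBits >= minVecRegBits && regBits / 2 >= elemBits; regBits /= 2) {
    unsigned vf = regBits / elemBits;
    if (vf <= numElems) // enough scalars to fill this register width
      vfs.push_back(vf);
  }
  return vfs;
}
```

Under this model, two adjacent 32-bit stores yield no candidate when the minimum is 128 bits, and a single candidate (a factor of 2) once the minimum is lowered to 64 bits, which is the missed case described above.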

Apr 27 2016, 3:36 AM
JongwonLee updated the diff for D18890: [AArch64] add SSA Load Store optimization pass.
Apr 27 2016, 3:31 AM
JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.

Hi,

I'm confused about why you're doing this optimization here instead of much earlier in the compiler. This is basically memcpy idiom recognition: taking multiple 32-bit loads/stores and converting them into an llvm.memcpy intrinsic for perfect lowering should do the same job, and would work for all backends.

Have you investigated doing this much earlier (IR, pre-ISel)?

James
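
For readers unfamiliar with the idiom, the source-level pattern in question looks roughly like the sketch below. The type `Quad` and the function names are hypothetical; the code only shows that the field-by-field copy and the block copy produce the same bytes, and makes no claim about which passes currently recognize the pattern.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical example type: four adjacent 32-bit fields.
struct Quad {
  std::uint32_t a, b, c, d;
};
static_assert(sizeof(Quad) == 16, "four packed 32-bit fields expected");

// Four separate 32-bit loads and stores to adjacent addresses.
void copyFieldwise(Quad &dst, const Quad &src) {
  dst.a = src.a;
  dst.b = src.b;
  dst.c = src.c;
  dst.d = src.d;
}

// The same copy expressed as one 16-byte block copy.
void copyBlock(Quad &dst, const Quad &src) {
  std::memcpy(&dst, &src, sizeof(Quad));
}

// Returns true when both forms leave identical bytes in the destination.
bool sameResult(const Quad &src) {
  Quad x{}, y{};
  copyFieldwise(x, src);
  copyBlock(y, src);
  return std::memcmp(&x, &y, sizeof(Quad)) == 0;
}
```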

Apr 27 2016, 3:04 AM

Apr 22 2016

JongwonLee added a comment to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Apr 22 2016, 3:31 AM

Apr 15 2016

JongwonLee added a comment to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.

ping~!

Apr 15 2016, 2:03 AM
JongwonLee retitled D19151: [SLPVectorizer] Set MinVecRegSize via a target hook from to [SLPVectorizer] Set MinVecRegSize via a target hook.
Apr 15 2016, 1:49 AM
JongwonLee updated the diff for D18890: [AArch64] add SSA Load Store optimization pass.
Apr 15 2016, 12:33 AM

Apr 14 2016

JongwonLee updated the diff for D18890: [AArch64] add SSA Load Store optimization pass.
Apr 14 2016, 7:38 PM
JongwonLee added inline comments to D18890: [AArch64] add SSA Load Store optimization pass.
Apr 14 2016, 7:38 PM
JongwonLee updated the diff for D18890: [AArch64] add SSA Load Store optimization pass.
Apr 14 2016, 3:58 AM
JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.

Hi Jongwon,
Thank you for the update with the additional alias check. However, since you merge the second load/store into the first load/store, I believe you need to check whether any instruction between the first and second load/store may alias with the second load/store, not just the first store and the second load.

I'm also curious how this pass was motivated. Did you see any performance gain with this change?

Please see my inline comments for minor issues.
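
The check requested above can be sketched on a simplified model. This is not the pass's code and does not use LLVM's alias analysis: `MemOp`, `overlaps`, and `canHoist` are hypothetical names, and every access is assumed to use the same base pointer so that byte ranges can be compared directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a memory access: a byte range off one shared base.
struct MemOp {
  bool isStore;
  std::int64_t offset;
  std::int64_t size;
};

// True when the two byte ranges intersect.
bool overlaps(const MemOp &a, const MemOp &b) {
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// Can ops[second] be moved up next to ops[first]? Every access strictly
// between them is checked, not only the first/second pair.
bool canHoist(const std::vector<MemOp> &ops, std::size_t first,
              std::size_t second) {
  if (first >= second || second >= ops.size())
    return false;
  const MemOp &moved = ops[second];
  for (std::size_t i = first + 1; i < second; ++i) {
    const MemOp &mid = ops[i];
    // Two loads never conflict; any pair involving a store may.
    bool mayConflict = moved.isStore || mid.isStore;
    if (mayConflict && overlaps(moved, mid))
      return false;
  }
  return true;
}
```

A real implementation would ask alias analysis instead of comparing offsets, and would have to treat unknown accesses (calls, volatile operations) conservatively.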

Apr 14 2016, 3:58 AM

Apr 12 2016

JongwonLee updated the diff for D18890: [AArch64] add SSA Load Store optimization pass.
Apr 12 2016, 3:54 AM
JongwonLee added inline comments to D18890: [AArch64] add SSA Load Store optimization pass.
Apr 12 2016, 3:53 AM
JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.
Apr 12 2016, 2:49 AM

Apr 11 2016

JongwonLee added a comment to D18890: [AArch64] add SSA Load Store optimization pass.

I have answered some of the questions. The remaining questions will be answered later.

Apr 11 2016, 3:40 AM

Apr 8 2016

JongwonLee added a comment to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.

ping!

Apr 8 2016, 2:42 AM
JongwonLee retitled D18890: [AArch64] add SSA Load Store optimization pass from to [AArch64] add SSA Load Store optimization pass.
Apr 8 2016, 2:34 AM

Mar 23 2016

JongwonLee updated the diff for D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 23 2016, 6:59 PM
JongwonLee added inline comments to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 23 2016, 6:57 PM
JongwonLee updated the diff for D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.

Removed the function 'vectorizeStoreChain' and modified the function 'tryToVectorizeList'.

Mar 23 2016, 1:03 AM
JongwonLee added inline comments to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 23 2016, 1:01 AM

Mar 21 2016

JongwonLee updated the diff for D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 21 2016, 10:05 PM
JongwonLee added inline comments to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 21 2016, 10:01 PM
JongwonLee updated the diff for D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 21 2016, 12:46 AM
JongwonLee retitled D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize from [SLPVectorizer] Change MinVecRegSize from 128 bits to 64 bits to [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 21 2016, 12:45 AM
JongwonLee added inline comments to D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 21 2016, 12:32 AM

Mar 17 2016

JongwonLee updated the diff for D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize.
Mar 17 2016, 12:13 AM

Mar 16 2016

JongwonLee added reviewers for D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize: jmolloy, mzolotukhin.
Mar 16 2016, 10:36 PM
JongwonLee retitled D18237: [SLPVectorizer] Try to vectorize in the range from MaxVecRegSize to MinVecRegSize from to [SLPVectorizer] Change MinVecRegSize from 128 bits to 64 bits.
Mar 16 2016, 10:06 PM