mcrosier (Chad Rosier)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 12 2013, 7:42 AM (489 w, 4 d)

Recent Activity

Nov 1 2018

mcrosier committed rL345827: [AArch64] Add support for ARMv8.4 in Saphira..
Nov 1 2018, 6:48 AM

Jun 18 2018

mcrosier added a comment to D48056: [AArch64] Implement FLT_ROUNDS macro.

Thanks for reviewing this! I don't have commit access, can someone commit this for me?

Jun 18 2018, 1:13 PM

May 24 2018

mcrosier resigned from D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.
May 24 2018, 12:57 PM
mcrosier committed rL333193: [InstCombine] Combine XOR and AES instructions on ARM/ARM64..
May 24 2018, 8:30 AM
mcrosier closed D47239: [InstCombine] Combine XOR and AES insructions on ARM/ARM64.
May 24 2018, 8:30 AM

May 23 2018

mcrosier committed rL333107: [CodeGen][AArch64] Use RegUnits to track register aliases. (NFC).
May 23 2018, 10:56 AM
mcrosier closed D47269: [CodeGen] Use RegUnits to track register aliases in AArch64RedundantCopyElimination. (NFC).
May 23 2018, 10:56 AM
mcrosier added reviewers for D47239: [InstCombine] Combine XOR and AES insructions on ARM/ARM64: t.p.northover, eli.friedman, sdesmalen, majnemer.
May 23 2018, 10:26 AM
mcrosier created D47269: [CodeGen] Use RegUnits to track register aliases in AArch64RedundantCopyElimination. (NFC).
May 23 2018, 9:50 AM

May 21 2018

mcrosier accepted D46953: [FastISel] Permit instructions to be skipped for FastISel generation..

LGTM.

May 21 2018, 11:51 AM
mcrosier abandoned D45225: [WIP] Add IR function attributes to represent codegen optimization level.
May 21 2018, 8:36 AM
mcrosier abandoned D45226: [WIP] Add IR function attributes to represent codegen optimization level.
May 21 2018, 8:33 AM
mcrosier resigned from D46230: For x86_64, gcc 7.2 under Amazon Linux AMI sets its paths to x86_64-amazon-linux.
May 21 2018, 8:33 AM · Restricted Project
mcrosier added a comment to D46953: [FastISel] Permit instructions to be skipped for FastISel generation..

Okay, now I understand. Seems like a reasonable extension.

May 21 2018, 8:32 AM

May 16 2018

mcrosier added a comment to D46953: [FastISel] Permit instructions to be skipped for FastISel generation..

Can you please add a test case?

May 16 2018, 9:57 AM

Apr 27 2018

Herald added a reviewer for D45098: [AArch64] Fix PR32384: bump up the number of stores per memset and memcpy: javed.absar.
Apr 27 2018, 12:01 PM

Apr 12 2018

mcrosier accepted D45109: Remove -cc1 option "-backend-option".

SGTM!

Apr 12 2018, 10:33 AM
mcrosier accepted D45208: [LoopInterchange] Make isProfitableForVectorization slightly more conservative..

This makes sense to me. LGTM.

Apr 12 2018, 8:51 AM

Apr 11 2018

mcrosier resigned from D39976: [AArch64] Query the target when folding loads and stores.
Apr 11 2018, 11:31 AM
mcrosier resigned from D37461: [X86][AsmParser] re-introduce 'offset' operator.
Apr 11 2018, 11:31 AM · Restricted Project
mcrosier added a comment to D45374: [LoopUnroll] Limit peeling to conds in BBs executed on every iteration..

Thanks for the update Chad. There is no need for this change then I think.

Apr 11 2018, 8:02 AM
mcrosier committed rL329810: [Driver] Don't forward -m[no-]unaligned-access options to GCC when….
Apr 11 2018, 7:23 AM
mcrosier committed rC329810: [Driver] Don't forward -m[no-]unaligned-access options to GCC when….
Apr 11 2018, 7:23 AM
mcrosier closed D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.
Apr 11 2018, 7:23 AM
mcrosier added a comment to D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.

Thanks, Eric!

Apr 11 2018, 6:48 AM

Apr 10 2018

mcrosier updated the diff for D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.
Apr 10 2018, 2:13 PM
mcrosier accepted D45502: [AArch64][Falkor] Fix bug in Falkor HWPF collision avoidance pass..

LGTM. Thanks, Geoff. It would be nice to get this into the 6.0.1 release, if at all possible.

Apr 10 2018, 1:39 PM
mcrosier added a comment to D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.

Should we instead be translating and passing the option expected?

Apr 10 2018, 1:37 PM
mcrosier updated the diff for D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.

Update based on Eric's feedback.

Apr 10 2018, 1:35 PM
mcrosier committed rL329754: [Driver] Handle the default case missed in r329748..
Apr 10 2018, 1:33 PM
mcrosier committed rC329754: [Driver] Handle the default case missed in r329748..
Apr 10 2018, 1:33 PM
mcrosier closed D45499: [Driver] Handle the default case.
Apr 10 2018, 1:33 PM
mcrosier resigned from D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions.
Apr 10 2018, 1:10 PM
mcrosier resigned from D42087: [DSE] Improve handling of noop stores exposed after dead interfering stores are removed.
Apr 10 2018, 1:09 PM
mcrosier added a comment to D45374: [LoopUnroll] Limit peeling to conds in BBs executed on every iteration..

Thanks for having a look and sorry for not being clearer. Chad discovered a regression in SPEC2006's h264ref with LTO, caused by this change. The problem was that we peeled off an iteration of a big loop before LTO. That caused the function to be too big for the inliner during LTO, whereas it would be inlined before. We based the peeling decision on a nested condition. With this patch I tried to find a balance between increasing code size and benefits of peeling (simplifying nested conditionals is likely to have less positive impact than simplifying top-level ones).

@mcrosier did some more digging and found that we might just want to run simple unrolling before LTO and normal unrolling/peeling during LTO, which makes sense to me. With that, we would not need this patch (or we could only consider top-level conditionals for "simple" peeling) and IMO that is what we should try to do.

Prior to loop peeling, the function we'd like to inline in h264ref has a single use.

Currently, (non-simple) loop peeling will not peel a loop if it includes a function call that is likely to be inlined (i.e., is not marked with a noinline attribute, has internal linkage and has a single use). This is exactly the case we're dealing with in h264ref, except the function to be inlined isn't marked as internal until the LTO phase of compilation. Thus, one possible approach would be to defer peeling until the LTO phase.
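The "likely to be inlined" condition described above can be sketched as a small predicate. This is only an illustration of the three stated criteria: the `callee_info` struct and the `likely_to_be_inlined` function are hypothetical names made up for this sketch, not LLVM types or APIs.

```c
#include <stdbool.h>

/* Hypothetical summary of a called function; this is not an LLVM type. */
struct callee_info {
  bool is_noinline;          /* marked with a noinline attribute */
  bool has_internal_linkage; /* not visible outside the current module */
  unsigned num_uses;         /* number of uses seen in the module */
};

/* True when the callee matches the description in the comment:
   not noinline, internal linkage, and a single use. */
bool likely_to_be_inlined(const struct callee_info *c) {
  return !c->is_noinline && c->has_internal_linkage && c->num_uses == 1;
}
```

Under this sketch, a callee that has a single use but is not yet internal (the pre-LTO situation described for h264ref) fails the check, so peeling would go ahead; once the function is marked internal during LTO, the same check passes.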

Apr 10 2018, 12:42 PM
mcrosier committed rL329709: Fix spelling. NFC..
Apr 10 2018, 8:01 AM

Apr 6 2018

mcrosier added a comment to D43876: [LoopUnroll] Peel off iterations if it makes conditions true/false..

This is exactly the case we're dealing with here except FullPelBlockMotionBiPred isn't marked as internal until the LTO phase of compilation. Thus, one possible approach would be to defer peeling until the LTO phase. After r329392, this can be accomplished with a small change to the pass manager:

ThinLTO doesn't run vectorization or unrolling before link-time; among other reasons, it avoids problems like this. You might want to consider making non-thin LTO work the same way.

Apr 6 2018, 12:49 PM · Restricted Project, Restricted Project
mcrosier added a comment to D43876: [LoopUnroll] Peel off iterations if it makes conditions true/false..

@mcrosier I've submitted D44983 for review. It prevents peeling, if we cannot simplify the loop body after peeling. Peeling if we cannot simplify the loop body afterwards is likely not beneficial. It would be great if you could check if that helps in your case. If it's not easy for you to check, I can try and test it myself.

Coming up with some additional heuristics, e.g. based on the number of instructions peeled and not eliminated, should be possible, but without knowing the inlining situation we would probably have to choose a rather arbitrary threshold.

Sure, I'll take a look now and let you know! Sorry I didn't see this sooner.

Unfortunately, D44983 does not fix this case. I'm going to dig into this now. I'll update you once I have some additional findings.

I had a closer look at FullPelBlockMotionBiPred and we peeled off an iteration because we have something like

if (i % 2) {
  ...
  if (i != 0) {...}
  ...
}

Peeling based on those nested conditionals is likely to increase the code size too much compared to the benefit. D45374 only considers conditions in blocks that are executed on every iteration.
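As a hand-written illustration of what peeling does to a loop of this shape, the sketch below compares a loop with a nested `i != 0` check against a manually peeled version. The function names and loop bodies are invented for this example; the actual h264ref code is not shown in this thread, and this is not the output of the LoopUnroll pass.

```c
#include <stddef.h>

/* Original shape: the `i != 0` check is nested inside `i % 2`. */
long walk_original(const int *v, size_t n) {
  long acc = 0;
  for (size_t i = 0; i < n; ++i) {
    if (i % 2) {
      acc += v[i];
      if (i != 0)
        acc += 1;
    }
  }
  return acc;
}

/* Same computation with the first iteration peeled off by hand. */
long walk_peeled(const int *v, size_t n) {
  long acc = 0;
  if (n == 0)
    return acc;
  /* Peeled iteration (i == 0): `i % 2` is false, so nothing happens. */
  for (size_t i = 1; i < n; ++i) {
    if (i % 2) {
      acc += v[i];
      acc += 1; /* `i != 0` is known to be true from i == 1 onward */
    }
  }
  return acc;
}
```

In this toy case the peeled iteration folds away to nothing. In a large loop the peeled copy duplicates the whole body, including any calls it contains, and removing one nested check buys little in return, which is the code-size concern raised in the comment.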

Apr 6 2018, 10:42 AM · Restricted Project, Restricted Project
mcrosier committed rL329395: [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference..
Apr 6 2018, 7:00 AM
mcrosier closed D45334: [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference..
Apr 6 2018, 7:00 AM
mcrosier added a comment to D45334: [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference..

Thanks everyone for the quick review. I'll commit shortly.

Apr 6 2018, 6:51 AM

Apr 5 2018

mcrosier created D45334: [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference..
Apr 5 2018, 1:52 PM

Apr 4 2018

mcrosier added a comment to D43876: [LoopUnroll] Peel off iterations if it makes conditions true/false..

@mcrosier I've submitted D44983 for review. It prevents peeling, if we cannot simplify the loop body after peeling. Peeling if we cannot simplify the loop body afterwards is likely not beneficial. It would be great if you could check if that helps in your case. If it's not easy for you to check, I can try and test it myself.

Coming up with some additional heuristics, e.g. based on the number of instructions peeled and not eliminated, should be possible, but without knowing the inlining situation we would probably have to choose a rather arbitrary threshold.

Sure, I'll take a look now and let you know! Sorry I didn't see this sooner.

Apr 4 2018, 12:02 PM · Restricted Project, Restricted Project
mcrosier added a comment to D43876: [LoopUnroll] Peel off iterations if it makes conditions true/false..

@mcrosier I've submitted D44983 for review. It prevents peeling, if we cannot simplify the loop body after peeling. Peeling if we cannot simplify the loop body afterwards is likely not beneficial. It would be great if you could check if that helps in your case. If it's not easy for you to check, I can try and test it myself.

Coming up with some additional heuristics, e.g. based on the number of instructions peeled and not eliminated, should be possible, but without knowing the inlining situation we would probably have to choose a rather arbitrary threshold.

Apr 4 2018, 11:04 AM · Restricted Project, Restricted Project
mcrosier added inline comments to D45225: [WIP] Add IR function attributes to represent codegen optimization level.
Apr 4 2018, 7:35 AM

Apr 3 2018

mcrosier added inline comments to D45225: [WIP] Add IR function attributes to represent codegen optimization level.
Apr 3 2018, 1:51 PM
mcrosier created D45226: [WIP] Add IR function attributes to represent codegen optimization level.
Apr 3 2018, 12:45 PM
mcrosier created D45225: [WIP] Add IR function attributes to represent codegen optimization level.
Apr 3 2018, 12:45 PM

Mar 30 2018

mcrosier updated the summary of D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.
Mar 30 2018, 8:01 AM
mcrosier updated subscribers of D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.
Mar 30 2018, 7:59 AM
mcrosier created D45092: [Driver] Don't forward -m[no-]unaligned-access options to GCC when assembling/linking.
Mar 30 2018, 7:55 AM

Mar 29 2018

mcrosier accepted D42260: [JumpThreading] Don't select an edge that we know we can't thread.

Still LGTM. Thanks for the fix, Haicheng.

Mar 29 2018, 6:50 AM

Mar 28 2018

mcrosier added a reviewer for D44983: [LoopUnroll] Only peel if a predicate becomes known in the loop body.: junbuml.
Mar 28 2018, 11:47 AM
mcrosier updated subscribers of rL324557: gold-plugin: Do not set codegen opt level based on LTO opt level..

Hi Peter/Rafael,
After this commit, I noticed a ~3.3% regression in SPEC2006/libquantum when compiling with -O3 -flto and the gold-plugin. In short, this change disabled tail duplication during machine block placement as that optimization is only performed when the optlevel is >= Aggressive (and we now default to -O2/Default). In turn, I've started looking into adding function attributes to control the code generation optimization level, per the suggestion in the commit message. Before I get too far into the implementation, I was wondering if there had been some discussion on adding these function attributes? If so, can you point me toward these discussions?

Mar 28 2018, 11:28 AM
mcrosier updated subscribers of rL324557: gold-plugin: Do not set codegen opt level based on LTO opt level..
Mar 28 2018, 11:18 AM

Mar 23 2018

mcrosier added a comment to D43876: [LoopUnroll] Peel off iterations if it makes conditions true/false..

Hi Chad,

Hi Florian,
We identified a 2.15% regression in SPEC2006/h264ref due to this commit. After this change, the FullPelBlockMotionBiPred() function is no longer inlined into the hottest function, BlockMotionSearch(). Previous to this change, the function was inlined because there was a single callsite in the entire program (known only when compiling in LTO) and the original definition could be removed after inlining. However, after loop peeling the callsite of FullPelBlockMotionBiPred() is replicated, which prevents inlining.

I was wondering if we could avoid peeling in this case until we have some type of cost model that can determine if peeling would prevent inlining. Also, after looking at the code (which I can't share here) you might also notice that the amount of code being peeled in this case is fairly large relative to the amount of code being removed from the loop. It might also make sense to have a heuristic that takes code size into consideration when peeling, if that hasn't already been done.

Thoughts?

Thanks for making me aware of this, I originally thought considering MaxPeelCount should help us avoid those cases.

I will follow this up with patches in the following days. I think there are a couple of things that can be done to make the peeling more conservative for now. First, only peel if we can prove that the condition is known to be true in the peeled part and false in the loop (or vice versa). Otherwise we cannot simplify the loop body and peeling is likely not very beneficial. Second, have a simple cost function that takes the size of the loop body vs. the eliminated instructions into account.

Also, D43878, which enables induction variable simplification after peeling is not committed yet, so currently the loop body may not be simplified after peeling, even if it could be.

Cheers,
Florian

Mar 23 2018, 7:20 AM · Restricted Project, Restricted Project

Mar 22 2018

mcrosier added a comment to D43876: [LoopUnroll] Peel off iterations if it makes conditions true/false..

Hi Florian,
We identified a 2.15% regression in SPEC2006/h264ref due to this commit. After this change, the FullPelBlockMotionBiPred() function is no longer inlined into the hottest function, BlockMotionSearch(). Previous to this change, the function was inlined because there was a single callsite in the entire program (known only when compiling in LTO) and the original definition could be removed after inlining. However, after loop peeling the callsite of FullPelBlockMotionBiPred() is replicated, which prevents inlining.

Mar 22 2018, 12:02 PM · Restricted Project, Restricted Project

Mar 9 2018

mcrosier committed rL327150: [JumpThreading] Don't restrict cast-traversal to i1.
Mar 9 2018, 8:47 AM
mcrosier closed D42262: [JumpThreading] Don't restrict cast-traversal to i1.
Mar 9 2018, 8:47 AM

Mar 5 2018

mcrosier resigned from D40831: [AArch64] Only use writeback in the load/store optimizer when needed.
Mar 5 2018, 8:32 AM · Restricted Project

Feb 13 2018

mcrosier added a comment to D41463: [CodeGen] Add a new pass for PostRA sink.

LGTM, but formal approval should probably come from someone outside of our group (ping! :).

Feb 13 2018, 7:15 AM

Feb 12 2018

mcrosier added a comment to D42759: [CGP] Split large data structres to sink more GEPs.

A few quick comments. Will follow up with a more complete review later this week.

Feb 12 2018, 9:53 AM · Restricted Project, Restricted Project

Feb 4 2018

mcrosier committed rL324195: [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking.
Feb 4 2018, 7:46 AM
mcrosier closed D42309: [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking.
Feb 4 2018, 7:46 AM

Feb 2 2018

mcrosier removed a reviewer for D42179: [NewGVN] Re-evaluate phi of ops after moving an instr to new class : mcrosier.
Feb 2 2018, 8:45 AM
mcrosier removed a reviewer for D42180: [NewGVN] Add ops as dependency if we cannot find a leader for ValueOp.: mcrosier.
Feb 2 2018, 8:44 AM
mcrosier added a comment to D42309: [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking.

Hi Diego,
Matt will be out of the office for a few weeks and he asked me to follow up on this patch. Hopefully, the new version addresses all of your concerns.

Feb 2 2018, 8:11 AM
mcrosier updated the diff for D42309: [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking.

Address Diego's comments.

Feb 2 2018, 8:10 AM
mcrosier commandeered D42309: [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking.
Feb 2 2018, 8:08 AM

Jan 31 2018

mcrosier added a comment to D42309: [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking.

ping.

Jan 31 2018, 8:59 AM
mcrosier added a reviewer for D42260: [JumpThreading] Don't select an edge that we know we can't thread: bmakam.
Jan 31 2018, 7:00 AM

Jan 29 2018

mcrosier added a comment to D42006: AArch64: Omit callframe setup/destroy when not necessary.

Thanks, Matthias/Jun!

Jan 29 2018, 3:01 PM

Jan 28 2018

mcrosier added a comment to D42006: AArch64: Omit callframe setup/destroy when not necessary.

Would it be possible to revert r322917 while we investigate the regressions? We also identified a 3.61% regression in SPEC2006/bzip2, so here's the complete list of regressions we are currently seeing due to this change:

Jan 28 2018, 7:33 AM

Jan 23 2018

mcrosier added a reviewer for D42260: [JumpThreading] Don't select an edge that we know we can't thread: haicheng.
Jan 23 2018, 6:53 AM
mcrosier resigned from D42293: [TableGen][AsmMatcherEmitter] Fix tied-constraint checking for InstAliases.
Jan 23 2018, 6:39 AM
mcrosier resigned from D39830: [DAGCombine] Transform (A + -2.0*B*C) -> (A - (B+B)*C).
Jan 23 2018, 6:39 AM

Jan 18 2018

mcrosier abandoned D42070: [msan] Put send/recvmmsg behind check for mmsghdr.

I think we've identified a work around internally, so I'm just going to abandon this patch.

Jan 18 2018, 12:15 PM

Jan 15 2018

mcrosier added a comment to D42070: [msan] Put send/recvmmsg behind check for mmsghdr.

2.11.3 was released in 2010; do we support releases that old?

Jan 15 2018, 7:05 AM
mcrosier created D42070: [msan] Put send/recvmmsg behind check for mmsghdr.
Jan 15 2018, 6:52 AM

Jan 8 2018

mcrosier added inline comments to D41835: [MachineCopyPropagation] Extend pass to do COPY source forwarding.
Jan 8 2018, 3:32 PM

Jan 5 2018

mcrosier added inline comments to D41782: [CallSiteSplitting]use constrained argument from single predecessors.
Jan 5 2018, 1:29 PM

Dec 26 2017

mcrosier added a comment to D41463: [CodeGen] Add a new pass for PostRA sink.

I'd like to suggest that this implementation be included in MachineSink.cpp, but continue to live as a separate pass. The two passes have the same intent, but must be independent passes due to the previously mentioned constraints; I'd almost want to refer to this as the PostRAMachineSink pass... The point being I'd like to have one place where I can see what's sunk pre-RA and what's sunk post-RA.

Dec 26 2017, 12:00 PM

Dec 11 2017

mcrosier accepted D33946: [InlineCost] Find identical loads in the callee.

All of the review feedback has been addressed and I have no additional comments. LGTM. Thanks, Haicheng.

Dec 11 2017, 6:44 AM

Nov 29 2017

mcrosier added a comment to D40476: Switch kryo to use -mcpu=cortex-a57 when invoking the assembler.

Thanks for the review. Now let's just hope the windows bots stay happy :)

Actually, I just checked and it looks like falkor and saphira were both added as of a few weeks ago. I'll revert this part of the patch shortly.

Nov 29 2017, 8:43 AM
mcrosier committed rL319323: [Driver] Turns out the GNU assembler does support falkor/saphira..
Nov 29 2017, 8:43 AM
mcrosier committed rC319323: [Driver] Turns out the GNU assembler does support falkor/saphira..
Nov 29 2017, 8:43 AM

Nov 27 2017

mcrosier added a comment to D40476: Switch kryo to use -mcpu=cortex-a57 when invoking the assembler.

Thanks for the review. Now let's just hope the windows bots stay happy :)

Nov 27 2017, 1:55 PM
mcrosier accepted D40476: Switch kryo to use -mcpu=cortex-a57 when invoking the assembler.
Nov 27 2017, 11:08 AM
mcrosier added a comment to D40476: Switch kryo to use -mcpu=cortex-a57 when invoking the assembler.

Am I correct in assuming this is going to be a problem for Falkor and Saphira as well? If so, can you add solutions for those as well? Cortex-a57 should be good enough for those targets as well.

Nov 27 2017, 10:05 AM

Nov 21 2017

mcrosier committed rL318788: [AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as *having* side effects..
Nov 21 2017, 10:10 AM

Nov 16 2017

mcrosier added inline comments to D40107: [AArch64] Remove obsoleted feature.
Nov 16 2017, 5:53 AM

Nov 15 2017

mcrosier accepted D40090: [AArch64] Refactor the loads and stores optimizer.

Please run clang-format, but otherwise LGTM.

Nov 15 2017, 11:40 AM

Nov 14 2017

mcrosier added a comment to D39976: [AArch64] Query the target when folding loads and stores.

Please, let me know if I addressed your concerns before I split the code refactoring in another patch.

Nov 14 2017, 1:45 PM
mcrosier added a reviewer for D39976: [AArch64] Query the target when folding loads and stores: junbuml.
Nov 14 2017, 6:49 AM
mcrosier added a comment to D39976: [AArch64] Query the target when folding loads and stores.

The load/store opt pass is already pretty expensive in terms of compile-time. Did you see any compile-time regressions in your testing? Also, what performance results have you collected?

Nov 14 2017, 6:48 AM

Nov 13 2017

mcrosier accepted D39915: [clang] Remove redundant return [NFC].
Nov 13 2017, 10:34 AM · Restricted Project

Nov 10 2017

mcrosier added a comment to D39830: [DAGCombine] Transform (A + -2.0*B*C) -> (A - (B+B)*C).

This solution doesn't seem very general; it won't catch:

double test2(double a, double b, double c, double d) {
  return a + -2.0*b*c*d;
}

The constant can be many layers of multiplies away. Reassociate pushes constants down the tree. Should reassociate be pulling out the negate when it factors the tree?
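To make the two shapes concrete, the sketch below writes out the source form from the revision title next to the rewritten form, and then the four-operand case from the comment next to a hand-rewritten equivalent. Apart from `test2`, the function names are hypothetical, and the rewritten versions are manual illustrations rather than the output of DAGCombine or Reassociate.

```c
/* Form from the revision title: A + -2.0*B*C. */
double with_negative_constant(double a, double b, double c) {
  return a + -2.0 * b * c;
}

/* Target form from the revision title: A - (B+B)*C. */
double with_subtraction(double a, double b, double c) {
  return a - (b + b) * c;
}

/* Case from the comment: the constant is one more multiply away
   from the add. */
double test2(double a, double b, double c, double d) {
  return a + -2.0 * b * c * d;
}

/* Hand-written equivalent of test2 with the negation pulled out. */
double test2_rewritten(double a, double b, double c, double d) {
  return a - (b + b) * c * d;
}
```

Here `test2` has the same structure as the three-operand case except for the extra multiply by `d`, which is why a pattern that only inspects the multiply feeding the add directly would not match it.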

Nov 10 2017, 11:10 AM
mcrosier added inline comments to D39830: [DAGCombine] Transform (A + -2.0*B*C) -> (A - (B+B)*C).
Nov 10 2017, 9:24 AM
mcrosier added a comment to D39830: [DAGCombine] Transform (A + -2.0*B*C) -> (A - (B+B)*C).

What is the purpose of this transform?

Nov 10 2017, 9:21 AM