dberlin (Daniel Berlin)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 13 2012, 1:07 PM (274 w, 3 d)

Recent Activity

Today

dberlin added inline comments to D38944: [GVN] Handle removal of first implicit CF instruction correctly.
Mon, Oct 16, 7:37 AM

Fri, Oct 13

dberlin accepted D37353: [SparsePropagation] Enable interprocedural analysis.

Implement the previously discussed approach using generic LatticeKeys. A few things to note about this version of the patch:

First, there's a requirement that LatticeKeys are able to be the key_type of whatever mapping structure the generic solver uses internally (i.e., DenseMap). If a client uses a non-trivial type for LatticeKey, it will have to provide a specialization of DenseMapInfo. But we already have specializations for pointers and PointerIntPairs, which I think probably covers most potential clients. I've added a comment at the top of AbstractLatticeFunction mentioning this.

Fri, Oct 13, 7:59 PM
dberlin added a comment to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.

I attached a patch for this to PR 34937.
I don't have time ATM to perform testing and shepherding of it through review, however.

Fri, Oct 13, 5:48 PM
dberlin added a comment to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.

This is use after free.
When the first implicit control flow instruction for a bb gets erased, we now have wrong info.

Fri, Oct 13, 5:38 PM

Thu, Oct 12

dberlin added inline comments to D38856: [IPSCCP] Remove calls without side effects.
Thu, Oct 12, 10:12 PM

Wed, Oct 11

dberlin added a comment to rL313701: Revert "[GVNSink] Remove dependency on SmallPtrSet iteration order.".

I do not plan to address these at the moment, because the order is consistent in the pass and doesn't affect correctness or what gets optimized.
If someone else wants to try to make everything reverse order clean, they are welcome to, it's not a priority for me.

Wed, Oct 11, 11:26 AM
dberlin added a comment to rL313701: Revert "[GVNSink] Remove dependency on SmallPtrSet iteration order.".

Yes, this is known. The original "fix" this reverts is clearly broken. There is a discussion on the original revert thread about this and ways to fix it properly.

Wed, Oct 11, 11:19 AM
dberlin created D38806: DepthFirstIterator.h: Use C++11 features to call a completed method on the set type, instead of requiring that one exists..
Wed, Oct 11, 9:42 AM

Tue, Oct 10

dberlin added a comment to D37353: [SparsePropagation] Enable interprocedural analysis.

This actually sounds like a great idea, IMHO.
Let's try it and see how well it works in practice.
It certainly would address my current concerns.

Tue, Oct 10, 10:40 PM
dberlin accepted D37638: [IPSCCP] Move common functions to IPOUtils (NFC).
Tue, Oct 10, 10:38 PM
dberlin accepted D38765: Don't replace constants with constants in GVN.

Okay if you address the remaining comments.
Thanks!

Tue, Oct 10, 8:58 PM
dberlin added inline comments to D38765: Don't replace constants with constants in GVN.
Tue, Oct 10, 8:56 PM
dberlin added a comment to D38765: Don't replace constants with constants in GVN.

Suggesting a slightly different fix.

Tue, Oct 10, 6:00 PM
dberlin added a comment to D36656: [SCCP] Propagate integer range info for parameters in IPSCCP..

I suspect at least one of your bugs is that the param state lattice doesn't get updated when the value state lattice changes.

Tue, Oct 10, 10:51 AM

Mon, Oct 9

dberlin added a comment to D38569: Expose must/may alias info in MemorySSA..

I'd reuse the type we already have to express alias results.
While you don't care about partial alias, it makes all code have the same meaning for the same thing :)

Mon, Oct 9, 7:04 PM
dberlin added a comment to D37353: [SparsePropagation] Enable interprocedural analysis.

(I think this needs a little thought)

Mon, Oct 9, 1:30 PM
dberlin added inline comments to D37353: [SparsePropagation] Enable interprocedural analysis.
Mon, Oct 9, 1:30 PM
dberlin added a comment to D37638: [IPSCCP] Move common functions to IPOUtils (NFC).

So, i can't find what the suggestion was, but these are fairly specific to value lattices and don't belong being used by most other things.
There are lattices where they are correct, and lattices where they make no sense.
It's also kinda conservative.

Mon, Oct 9, 1:27 PM
dberlin accepted D37355: Add CalledValuePropagation pass.

thanks for working through this :)

Mon, Oct 9, 11:26 AM
dberlin accepted D36656: [SCCP] Propagate integer range info for parameters in IPSCCP..

Sounds reasonable to start with.

Mon, Oct 9, 11:10 AM
dberlin added inline comments to D36656: [SCCP] Propagate integer range info for parameters in IPSCCP..
Mon, Oct 9, 7:25 AM
dberlin added a comment to D37355: Add CalledValuePropagation pass.

Also, if you used sets, you could just use set.swap in the constructor (or move constructors) to avoid the extra copying.

Mon, Oct 9, 6:51 AM
dberlin added a comment to D37355: Add CalledValuePropagation pass.

I'm a bit confused why you use a vector instead of std::set or DenseSet. It looks like either would be better in every way?

Mon, Oct 9, 6:50 AM

Sat, Oct 7

dberlin added a comment to D38586: SLPVectorizer.cpp: Ensure SLPVectorizer can visit each block by dominated order.

Preds will in fact be thousands sometimes with switch statements. If you don't want to fix it, let me know and I'll fix the sort on Monday

Sat, Oct 7, 7:37 AM

Fri, Oct 6

dberlin added inline comments to D38569: Expose must/may alias info in MemorySSA..
Fri, Oct 6, 7:56 AM

Thu, Oct 5

dberlin added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

I've taken some time to go through the code in NewGVN that computes similar things based on the suggestion from Danny. These now do *very* similar things in terms of walk. The differences I've seen are:

  1. While the safety checks are both preorder, my code uses DFS instead of BFS. The reason is that we walk up the stack to mark more nodes as unsafe when we discover an unsafe node.
Thu, Oct 5, 6:34 PM
dberlin requested changes to D38586: SLPVectorizer.cpp: Ensure SLPVectorizer can visit each block by dominated order.

Also this needs a test

Thu, Oct 5, 3:07 PM
dberlin added a comment to D38586: SLPVectorizer.cpp: Ensure SLPVectorizer can visit each block by dominated order.

We sort siblings by RPO in newgvn. If you want to guarantee parent before child, however, just sort by the dominator tree dfs in and out numbers.

Thu, Oct 5, 3:01 PM

Wed, Oct 4

dberlin added inline comments to D38569: Expose must/may alias info in MemorySSA..
Wed, Oct 4, 6:30 PM
dberlin added a comment to D37343: [CGP] Merge empty case blocks if no extra moves are added..

(Gonna let jakub comment on what should work/not work here)

Wed, Oct 4, 4:20 PM
dberlin accepted D38561: [SparsePropagation] Move member definitions to header (NFC).
Wed, Oct 4, 2:06 PM
dberlin added a comment to D38561: [SparsePropagation] Move member definitions to header (NFC).

LGTM

Wed, Oct 4, 2:05 PM
dberlin accepted D38541: [Dominators] Take fast path when applying <=1 updates.
Wed, Oct 4, 10:09 AM
dberlin added a comment to D38541: [Dominators] Take fast path when applying <=1 updates.

This looks right to me, since people were trying to do it anyway :)

Wed, Oct 4, 10:09 AM

Tue, Oct 3

dberlin added a comment to D38476: Template the sparse propagation solver instead of using void pointers.

Ah, the joy of not having users to break.
Yes, i expect you will have to move a bunch if not all of these to the header.

Tue, Oct 3, 1:09 PM

Mon, Oct 2

dberlin added a comment to D37353: [SparsePropagation] Enable interprocedural analysis.

FYI: I'm about to rename getOrInitValueState to getValueState.
That is what SCCP calls it, and IMHO, we should be consistent.

Mon, Oct 2, 4:08 PM
dberlin created D38476: Template the sparse propagation solver instead of using void pointers.
Mon, Oct 2, 1:26 PM
dberlin added a comment to D37353: [SparsePropagation] Enable interprocedural analysis.

It's worth noting: You can make propagation much faster by keeping a separate worklist for overdefined , and prioritizing it over the normal one.
This will move things to overdefined as quickly as possible.
(Because anything meet overdefined is just about always overdefined).

Mon, Oct 2, 11:38 AM
dberlin added a comment to D37355: Add CalledValuePropagation pass.

I see this review has been hanging around a while, so i'll take a stab at reviewing it.

Mon, Oct 2, 11:19 AM
dberlin accepted D38331: [Dominators] Add DFS number verification.
Mon, Oct 2, 10:51 AM
dberlin added inline comments to D38331: [Dominators] Add DFS number verification.
Mon, Oct 2, 8:47 AM
dberlin added inline comments to D38331: [Dominators] Add DFS number verification.
Mon, Oct 2, 8:40 AM

Fri, Sep 29

dberlin added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Fri, Sep 29, 1:14 PM
dberlin added a comment to D38336: Add an @llvm.sideeffect intrinsic.

This update revises the documentation to clarify that the intrinsic doesn't inherently prevent loops from being optimized away.

If it supports C++, ADCE needs to preserve calls to a library I/O functions, volatile operations, thread synchronizations, and atomic operations, and calls to functions which may perform any of these. Whatever it does to handle those should handle llvm.sideeffect as well.

This seems pretty incongruous to me with your answers to my earlier questions, so i feel like i'm missing something.

The basic idea of llvm.sideeffect is to just be a thing that gets treated like it has side effects. It's just like a volatile store through a non-escaped memory address.

I may have been unclear when you asked

Do you expect it to keep provably finite loops that have it alive.
IE a loop where we can prove the backedge is not taken?

and I said

No; a backedge can still be folded away in the normal ways. The llvm.sideeffect call would remain, no longer needed, but not invalid. If it were important, we could add code to remove it in that case, but the patch here doesn't have that.

I interpreted your question to be just about loops where we can prove the backedge is not taken. llvm.sideeffect does keep finite loops alive under the same conditions that other operations with side effects would.

In the future, we may want to optimize away redundant llvm.sideeffect calls, and an advanced form of this could allow optimizing away provably-finite loops containing llvm.sideeffect, however, the current patch doesn't depend on this.

Fri, Sep 29, 8:33 AM

Thu, Sep 28

dberlin added a comment to D38336: Add an @llvm.sideeffect intrinsic.

One reason i ask is because ADCE/etc wouldn't even see it. they won't process the block in such a case.
Hence, if the intrinsic relies on always being looked at and then deciding to keep everything containing it, i'd be opposed (since that adds O(N) walks to everything :P).
Right now, it doesn't sound like that's the case.

If it supports C++, ADCE needs to preserve calls to a library I/O functions, volatile operations, thread synchronizations, and atomic operations, and calls to functions which may perform any of these. Whatever it does to handle those should handle llvm.sideeffect as well.

Thu, Sep 28, 10:37 PM
dberlin added a comment to D38374: [memcpyopt] Memcpy-memcpy dependence isn't detected across basic blocks.

"Your suggestions above both involve adding functionality under a flag/option. That alone makes them different situations from just landing the current patch, in terms of seeing the work all the way through into production.
"

Thu, Sep 28, 10:20 PM
dberlin added a comment to D38374: [memcpyopt] Memcpy-memcpy dependence isn't detected across basic blocks.

So, if the work is done (I think you would have to change some getDefiningAccess calls to getClobberingAccess), and it works, i'd prefer for us to just do that and declare victory.
If it's *not* done, okay, fine, that sucks but not gonna hold up the world here.

Thu, Sep 28, 6:41 PM
dberlin added a comment to D38374: [memcpyopt] Memcpy-memcpy dependence isn't detected across basic blocks.
  1. non-local calls in GVN are the cause of pretty much all of it's massive slowdown.

    Have you measured the effect this has on larger memory-using testcases?

I've measured compile times on some C++ and Rust testcases, and it doesn't appear to significantly slow down -O2. memcpyopt remains under 1% of the time, according to -time-passes.

Thu, Sep 28, 2:36 PM
dberlin added a comment to D38374: [memcpyopt] Memcpy-memcpy dependence isn't detected across basic blocks.
  1. non-local calls in GVN are the cause of pretty much all of it's massive slowdown.
Thu, Sep 28, 12:00 PM
dberlin accepted D37528: [JumpThreading] Preserve DT and LVI across the pass..
Thu, Sep 28, 8:10 AM

Wed, Sep 27

dberlin added a comment to D38336: Add an @llvm.sideeffect intrinsic.

Is literally the only purpose to handle infinite loops?
If so, please tighten up the description a bit.
(Right now "indicate that the loop shouldn't be optimized away even if it's an infinite
loop with no other side effects." would cover unreachable loops, etc).

Another potential use was mentioned here.

Wed, Sep 27, 7:44 PM
dberlin added a comment to D38336: Add an @llvm.sideeffect intrinsic.

Is literally the only purpose to handle infinite loops?
If so, please tighten up the description a bit.
(Right now "indicate that the loop shouldn't be optimized away even if it's an infinite
loop with no other side effects." would cover unreachable loops, etc).

Wed, Sep 27, 4:56 PM
dberlin added a comment to D37575: [BasicBlock] add new function removeEdge().

(repeating comments i forgot to add to review)
This definitely needs an RFC.

Wed, Sep 27, 8:45 AM

Tue, Sep 26

dberlin added a comment to D36656: [SCCP] Propagate integer range info for parameters in IPSCCP..

FWIW: I think this is a good approach to start with (and is small enough that if we decide to go another way, it doesn't hurt anything).

Tue, Sep 26, 3:35 PM

Fri, Sep 22

dberlin added a comment to D38097: [IVUsers] Changes to make IVUsers's results robust to instruction and uselist ordering.

So, TL;DR, i'm not sure how much you really care, this isn't going to make your ordering completely consistent in the face of use list reordering or instruction ordering. It should work if there is a single cycle, but not if there are nested cycles.

Fri, Sep 22, 8:58 AM

Sep 15 2017

dberlin added inline comments to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.
Sep 15 2017, 9:08 PM
dberlin added a comment to D37940: Peel off the dominant case in switch statement.

We've already discovered a slew of problem from over-optimizing things in simplifycfg, which is also being used as a canonicalization pass.
Either it needs a set of flags controlling what it does, or it needs to not be trying to do "performance improvement" and "canonicalization" as a single pass.
See, e.g, https://bugs.llvm.org/show_bug.cgi?id=34603

Sep 15 2017, 4:35 PM
dberlin added a comment to D37902: [CodeExtractor] Fix multiple bugs under certain shape of extracted region.

Please don't update it manually.
Pretty much every manual update LLVM does has been proven incorrect over time.
Please just express the added/removed edges if you have to.
If you are cutting out a region, this should not be difficult.

Sep 15 2017, 9:31 AM

Sep 13 2017

dberlin added a comment to D37762: [InstCombine] Remove single use restriction from InstCombine's explicit sinking code..

The assembly is clearly worse, the IR is clearly better.
The IR has removed all computation except a mask of the input.
(Note, in the dump i gave, it has not run DCE, so there's a bunch of dead phis).
In essence, it has turned the entire function into a nested if statement based on the bar & x values, where all branches are storing a constant value.
(or selects, or however you want to canonicalize this)

Sep 13 2017, 6:00 PM
dberlin added a comment to D37762: [InstCombine] Remove single use restriction from InstCombine's explicit sinking code..

So GCC-7 definitely eliminates all computation in your benchmark at -O3 except for one and, and we should be able to do the same with a newgvn PRE.
It won't do it at O2 because it requires partial antic, which gcc only does at O3.

Sep 13 2017, 5:04 PM
dberlin added a comment to D37762: [InstCombine] Remove single use restriction from InstCombine's explicit sinking code..

A couple things:

Sep 13 2017, 3:17 PM
dberlin added inline comments to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.
Sep 13 2017, 7:39 AM

Sep 12 2017

dberlin added inline comments to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.
Sep 12 2017, 10:09 PM

Sep 11 2017

dberlin added inline comments to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.
Sep 11 2017, 7:53 PM
dberlin added a comment to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.

Eli, i presume an updated version with isGuaranteed... should fix your testcases from 21041, right?
If so, Max, can you add them to this patch?

Sep 11 2017, 12:57 PM

Sep 8 2017

dberlin added a comment to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.

(and using isGuarante* would handle guard since guard is marked as throwing)

Sep 8 2017, 12:56 PM
dberlin added inline comments to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.
Sep 8 2017, 12:55 PM
dberlin added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

I'll admit to not staring at the cost model too hard (I rarely have useful input on that kind of target specific thing), but it looked at a glance like you trying to calculate which might constant fold as well.
If not, or it's not part of the goal, awesome.

Sep 8 2017, 8:57 AM
dberlin added a comment to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.

Actually we don't even need OrderedInstructions to do that. It's enough to maintain a set of blocks where we have ever seen the guards. We iterate through blocks in reverse postorder which guarantees us that if a block has one predecessor it will be visited before the block. We also traverse block instructions in direct order. Using this, we just mark a block as having a guard when we meet one and know for sure that:

  1. if the current block is marked, there is a guard above the current instruction, and we cannot hoist;
  2. if the only predecessor (recursively) is marked, it also blocks us from hoisting.

    I've updated the patch using this solution. Please let me know if it makes sense to you.

Sure, this will work, but it requires clearing the guard set between iterations :)
Given how much faster either solution is than the original option, seems fine to me. I doubt it will ever be a real performance issue.

Sep 8 2017, 1:36 AM
dberlin added a comment to D37463: Fix miscompile in LoopSink pass.

I need to read what C++ specification says about this particular issue, but basically LLVM is not only used to compile C++. This situation can be illegal in other languages (again, need to dig more through specifications). My proposal is to add an option that prohibits this transform and set it to false by default, with abitily to turn it off for languages where it is prohibited.

Sep 8 2017, 12:51 AM

Sep 7 2017

dberlin added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Other random note:
NewGVN can do this value-wise, not just lexically.
So in some future when NewGVN is the default, you may just want to make this part of one of the things it does after the analysis (probably after elimination too).
The eliminator will already have eliminated all cases where doing the above eliminates redundancies without costing anything (IE the part where you see how it simplifies)
We could also, for zero cost, track the things where it it would have just been code motion (IE one operand constant, one operand not), and where we didn't do it because it required speculation of multiple operands.

Sep 7 2017, 9:24 PM
dberlin added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

I feel unclear on what you see the safety conditions of this transform are. I believe yours are off a little :)

Sep 7 2017, 8:57 PM
dberlin added a comment to D37528: [JumpThreading] Preserve DT and LVI across the pass..

@brzycki, one related thing: have you tried preserving postdominators as well? I'm not sure if that would be beneficial here, but in theory, it should be pretty simple, as you can pass exactly the same update sequences to the DominatorTree and PostDominatorTree.

I have not. I'll talk to @sebpop and see if he thinks this makes sense to include in this patch. Depending on our follow-on updates to JT we may need post dominators as well.

Sep 7 2017, 5:20 PM
dberlin added a comment to D37163: [LICM] sink through non-trivially replicable PHI.

Okay.
One reason we don't split critical edges by default in most passes in LLVM is that edges coming out of switches that lead to the same block are pretty much always critical.
Doing so by default for large switches leads to *huge* code blowup.
IE it goes from 1 block to N blocks, where N is the number of switch cases.
If there are multiple layers of switches, it's obviously even worse.
Have you tested on this on any switch heavy code to ensure it does not blow up code size and compile time?

I tried to build spec2000/spec2006/spec2017 with/without this change for aarch64. This change was applied widely and as far as I know perlbench use lots of big switches, but I didn't see any significant changes in code size and compile time across all benchmarks.
However, as you point out, it could possibly increase code size even in some case where splitting is required for sinking. Just to be safe what about checking if the number of predecessors is larger than some threshold from cl::opt before splitting?

Sep 7 2017, 12:43 PM
dberlin added a comment to D37163: [LICM] sink through non-trivially replicable PHI.

Okay.
One reason we don't split critical edges by default in most passes in LLVM is that edges coming out of switches that lead to the same block are pretty much always critical.
Doing so by default for large switches leads to *huge* code blowup.

Sep 7 2017, 10:39 AM
dberlin added a comment to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.

Of course. Just in case you try to use the code, i'm pretty sure if (OrderedInstructions->dominates(*Iter, guard))
should be if (OrderedInstructions->dominates(guard, *Iter))

Sep 7 2017, 7:37 AM

Sep 6 2017

dberlin added a comment to D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly.

Sure, it's a function argument in a few other places.
The other places, you should be using auto :)
I marked a bunch of them.

Sep 6 2017, 10:15 PM
dberlin added a comment to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.

The problem:
Right now, for each load, you go through the loadbb, and see if you hit a guard.
This information does not change for each load. At most it changes as GVN removes things (and even then, no).
The number of guards is smaller than the number of blocks.

Sep 6 2017, 1:36 PM

Sep 5 2017

dberlin requested changes to D37460: [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.
  1. This seems wrong for assumes (I don't know enough about guards).

It is not incorrect to hoist through an assume, nor should doing so generate wrong code.
Assumes are essentially metadata, and documented as such.

Sep 5 2017, 6:42 AM

Sep 1 2017

dberlin added a comment to D33946: [InlineCost] Find identical loads in the callee.

(I suspect most folks are waiting for chandler to say whether he feels this is good enough at this point or not)

Sep 1 2017, 12:37 PM
dberlin added a comment to D37163: [LICM] sink through non-trivially replicable PHI.

Of course, it would be good to avoid unnecessary splitting in the fist place if it's not very costly. Let me try to find a reasonable approach for this.

Sep 1 2017, 9:02 AM
dberlin updated the diff for D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..
  • NewGVN: Cleanup comments
Sep 1 2017, 8:34 AM
dberlin added a comment to D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..

This has now passed all testing i can throw at it, so i'm going to commit it in a day or two, but please feel free to do post-commit review :)

Sep 1 2017, 12:27 AM
dberlin updated the diff for D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..
  • Update patch for correctness
  • Add test
  • NewGVN: Add debug message when we can't find a phi of ops operand but looked
  • NewGVN: Revert now-unnecessary changes
Sep 1 2017, 12:12 AM

Aug 31 2017

dberlin added a comment to D37340: [Polly] Run GVN during the cleanup.

Thanks Dani. Then we should certainly go for the new GVN.

Aug 31 2017, 11:38 PM
dberlin added a comment to D37340: [Polly] Run GVN during the cleanup.

Any reason to use NewGVN instead of (old) GVN?

I thought that NewGVN uses a stronger algorithm.

Aug 31 2017, 11:18 PM
dberlin added a comment to D37215: [ValueTracking] improve reverse assumption inference.

(At some point, now that we could use post-dominance, you may want to see if it helps. For starters, anything if control dependent on the entry block is guaranteed to execute)

Aug 31 2017, 1:58 PM
dberlin added a comment to D37163: [LICM] sink through non-trivially replicable PHI.

I agree that the test doesn't really require splitting since the operations of incoming values are the same (sub/mul) and all operands used in the sinkable chain (%N_addr.0.pn, %N, and %N2) dominate the PHI, so we can directly add PHIs for operands and share the same operation in the exit block like you comment. For me, it seems that the test case is a special case in which splitting is not required. In most case, I think splitting is required to sink through a non-trivially replicable PHI. For example, if operations of incoming values are different or any operands in the sinkable chain do not dominate the PHI, we should split preds.

Aug 31 2017, 11:34 AM
dberlin added a comment to D33987: [MergeICmps][WIP] Initial MergeICmps prototype.

So, LGTM at this point.
Have you performance tested it while being on by default?
Usually, passes that are off by default are undergoing work, with the expectation they will be on by default at some opt level at some point.

Aug 31 2017, 8:42 AM

Aug 30 2017

dberlin added a comment to D37163: [LICM] sink through non-trivially replicable PHI.

Overall, i'm unclear on why you can't compute which ones will actually be sunk, and then split the preds.
Also, the examples does not even require splitting preds to effect the transform.

Aug 30 2017, 12:17 PM

Aug 29 2017

dberlin added inline comments to D33987: [MergeICmps][WIP] Initial MergeICmps prototype.
Aug 29 2017, 9:54 AM
dberlin resigned from D37195: [InstCombine] Teach canEvaluateTruncated and EvaluateInDifferentType to handle expression tree with multi-used nodes..
Aug 29 2017, 8:18 AM

Aug 28 2017

dberlin added inline comments to D37239: [Instruction] add moveAfter() convenience function; NFCI.
Aug 28 2017, 5:28 PM

Aug 27 2017

dberlin added a comment to D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..

This also fixes PR 33432, i forgot to add it to the log :)
With the latest update it should be correct now.

Aug 27 2017, 6:39 PM
dberlin updated the diff for D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..
  • Update patch for correctness
  • Add test
Aug 27 2017, 6:30 PM
dberlin added a comment to D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..

Unfortunately, i can prove we have to give up rather than try to constant fold in this case.

Aug 27 2017, 10:38 AM

Aug 25 2017

dberlin added a comment to D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..

(In particular, i believe it may be correct to only care about phi nodes in the same block as the one we are trying to translate through, which would fix 33461 back to the better codegen)

Aug 25 2017, 8:41 PM
dberlin added a comment to D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..

The 33461 change is an example where i believe what we were doing is safe, but i want to see if we still find correctness bugs first :)

Aug 25 2017, 8:41 PM
dberlin created D37175: Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders..
Aug 25 2017, 8:36 PM
dberlin updated the diff for D37174: NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops.

Add testcase

Aug 25 2017, 8:26 PM