dberlin (Daniel Berlin)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 13 2012, 1:07 PM (305 w, 5 d)

Recent Activity

Today

dberlin accepted D46866: [EarlyCSE] Avoid a poorly defined instruction comparison.

Thanks.
This makes me wonder if there should be a real class hierarchy here, but i think this is something we can worry about in the future.

Thu, May 24, 10:34 AM

Tue, May 22

dberlin updated subscribers of D47143: [NewGVN] Do not treat LoadExpressions with differing nonnull as equal..

SGTM.
If it turns out to be way too ugly. let's circle back and figure out what
we want to do.

Tue, May 22, 10:38 AM

Mon, May 21

dberlin added a comment to D47143: [NewGVN] Do not treat LoadExpressions with differing nonnull as equal..

Sure, you will never find a single flag to control all of this, my goal was to be able to have flags at all to control the various issues.

Mon, May 21, 2:29 PM
dberlin added a comment to D47143: [NewGVN] Do not treat LoadExpressions with differing nonnull as equal..

Right, this is why we haven't done anything yet past taht - i ran out of time to start driving through a real plan :)

Mon, May 21, 2:12 PM
dberlin added a comment to D47143: [NewGVN] Do not treat LoadExpressions with differing nonnull as equal..

One of the reasons i moved Simplify* to take a SimplifyQuery is so that things like this are possible to add without a tremendous amount of work in Simplify*.

Mon, May 21, 1:56 PM
dberlin added a comment to D47149: [SCCP] Mark DominatorTreeAnalysis as preserved..

Davide beat me to it.

Mon, May 21, 10:01 AM
dberlin added a comment to D47149: [SCCP] Mark DominatorTreeAnalysis as preserved..

This is, AFAIK, right but your assumption is wrong.

Mon, May 21, 10:01 AM
dberlin added a comment to D47143: [NewGVN] Do not treat LoadExpressions with differing nonnull as equal..

So, actually, this *is* a variant the same issue we have elsewhere with Simplify* (and there are a few newgvn bugs open about it). These actually *are* congruent, we just don't want it simplifying this hard.

Mon, May 21, 9:16 AM
dberlin added a comment to D47143: [NewGVN] Do not treat LoadExpressions with differing nonnull as equal..

Yeah, some of this metadata travels backwards in time (IE if you can prove they are equivalent, the metadata must apply) and some of it is not.

Mon, May 21, 9:10 AM
dberlin added a comment to D47143: [NewGVN] Do not treat LoadExpressions with differing nonnull as equal..

Hey Florian,
Thanks for looking at this:
IMHO, We should be storing and hashing/comparing the relevant info in the loadexpression (IE store a bit that it's nonnull)
I really do not want to end up with complex business logic in the equals function, that's a thing the callers should be worrying about.

Mon, May 21, 9:09 AM

Fri, May 18

dberlin accepted D45320: [MemDep] Fixed handling of invariant.group.
Fri, May 18, 9:35 AM
dberlin accepted D46893: [CVP] Require DomTree for new Pass Manager.

I may have sowed some confusion here by not updating commit messages as the code changed.

The code here in CVP currently:

  • use getBestSimplifyQuery to get the best SQ available (but not require a DT in the NPM)
  • use the SQ in SQ.DT->dominates (causing the problem) and SimplifyInstruction(.., SQ)

Yeah, this is a bad idea :)

Fri, May 18, 8:49 AM

Thu, May 17

dberlin requested changes to D46866: [EarlyCSE] Avoid a poorly defined instruction comparison.

So, IMHO, the swapping should not be *determined* as part of the equality function, and that is your underlying problem.
This feels like a workaround instead of fixing that problem.

Thu, May 17, 11:24 AM
dberlin updated subscribers of D47010: [NewGVN] Minor cleanups.
Simplify handling of the false edge of branch predicates. We can handle

this the same way as the true edge with an inverted predicate. This is both

>more general and simpler than the previous code.

Thu, May 17, 8:18 AM

Wed, May 16

dberlin accepted D46974: [NewGVN] Fix handling of assumes.

Thanks!

Wed, May 16, 12:52 PM
dberlin added a comment to D46426: [SROA] Handle PHI with multiple duplicate predecessors.

Ugh, yes, i was focused on the error issuing part and not the hey you have to create real phi nodes part.

Wed, May 16, 12:20 PM
dberlin added a comment to D46426: [SROA] Handle PHI with multiple duplicate predecessors.

Oh looks like Eli beat me by a few minutes :)

Wed, May 16, 12:13 PM
dberlin added a comment to D46426: [SROA] Handle PHI with multiple duplicate predecessors.
  1. If this is common enough, should we just have a unique_predecessors iterator?
Wed, May 16, 12:12 PM
dberlin added a comment to D46893: [CVP] Require DomTree for new Pass Manager.

So, this seems very confused.
getBestSimplifyQuery simply queries the analysis manager to see what passes are available.

Wed, May 16, 7:53 AM

Tue, May 15

dberlin accepted D46899: [MemorySSA] Don't sort IDF blocks..

Thanks for doing this!

Tue, May 15, 11:35 AM

Mon, May 14

dberlin accepted D45330: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions..

I think the numbers are good enough to be worth it perf wise.
Did you post compile time numbers?

Mon, May 14, 1:11 PM
dberlin accepted D46775: [LICM] Preserve DT and LoopInfo specifically.

In a perfect world we should update LICM to preserve all CFG analysis.
For now, this at least makes it correct.

Mon, May 14, 8:01 AM

Fri, May 11

dberlin accepted D46646: [IDF] Enforce the returned blocks to be sorted..

Looks good to me. You may want to make a unit test for this.
It should only require three blocks.

Fri, May 11, 5:44 PM

Thu, May 10

dberlin updated subscribers of D46265: StackColoring: better handling of statically unreachable code.

This looks pretty obviously right, i'd commit it.

Thu, May 10, 10:48 AM
dberlin accepted D46696: [IRTests] Verify PDT instead of DT.

I can verify this is correct :)

Thu, May 10, 8:34 AM

Wed, May 9

dberlin added a comment to D45150: Less conservative LoopSafetyInfo for headers.

We created an abstraction for the ordered basic block caching that works and is used in a few passes.
Transforms/Utils/OrderedInstructions.h

Wed, May 9, 6:10 PM

Tue, May 8

dberlin added inline comments to D46600: [MergedLoadStoreMotion] Fix a debug invariant bug in mergeStores.
Tue, May 8, 2:50 PM

Mon, May 7

dberlin added a comment to D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..

The solution in IDF should just be changing domtreenodepair to a tuple, with the last item being domtreenode->getDFSNumIn()

Just to make sure I understand you correctly: we need to change pair<DomTreeNode *, unsigned> to tuple<DomTreeNode *, unsigned, unsigned> and use the second and third elements as keys for the priority queue (i.e. we will need to implement a compare function for it, which will look at RootLevel first and at DfsNumber second if RootLevels are the same). Is it what you meant?

Michael

Mon, May 7, 6:12 PM
dberlin added a comment to D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..

IDF does not guarantee that returned basic blocks are sorted in any
order,

Mon, May 7, 5:56 PM
dberlin accepted D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

Nope, LGTM.

Mon, May 7, 11:38 AM

Sat, May 5

dberlin updated subscribers of D45320: [MemDep] Fixed handling of invariant.group.
Sat, May 5, 7:44 AM
dberlin added a comment to D45320: [MemDep] Fixed handling of invariant.group.

During MemorySSA development, i'm pretty positive we tried some similar types of reverse maps, and while it did relatively okay on "llvm + clang testsuite", it fell down pretty hard in the real world from memory usage.

Sat, May 5, 7:44 AM

Fri, May 4

dberlin added a comment to D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

A few notes:

  1. Because there is no single root here reaching all nodes, you can't guarantee the ordering will give a single iteration order for any sort order of the list.
  2. We are assuming no cycles (you could find sccs here to deal with the cycles, it's the same cost as finding RPO because they are done the same way)
  3. Off the top of my head, my intuition is that if the list is in dominator tree order (which is an RPO except for siblings) to start, the RPO generated by DFS of the operand tree should avoid the problem mentioned in 1, because it would initially order it the same way it would be ordered in the presence of a single root.
Fri, May 4, 5:05 PM
dberlin added a comment to D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

Is it guaranteed that a PHI node, without uses, being inserted in PHIsToRemove only can be used by later created PHI nodes

PHI-nodes are only inserted when we call SSAUpdater.RewriteUse. I did not expected that their use lists change afterwards, so frankly, I'm surprised that inserted PHIs get more uses on later iterations - that's why I'm asking for a simpler test that could help to understand how and why this happens.

Then it might be enough to iterate in reverse insertion order...

Why do we need to reverse the order?

As seen in my earlier post we remove the following PHI nodes when using my patch:

%value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ]
%value3.lcssa = phi i16 [ %value3, %foo ]
%value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ]
%value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ]

My idea was that if we iterated in reverse order we would remove the use of %value3.lcssa4 before trying to remove the PHI node defining that value.
But as seen above we wouldn't be able to remove the PHI node defining %value3.lcssa instead (as it is used by the first PHI node in the list).
So it will not help by changing the iteration order in case we desire to remove all those four PHI nodes.

Fri, May 4, 4:58 PM
dberlin updated subscribers of D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

Is there an intermediate IR dump you can send me of where you end up with
phis using phis that are useless?

Fri, May 4, 3:40 PM

Thu, May 3

dberlin accepted D45320: [MemDep] Fixed handling of invariant.group.
Thu, May 3, 7:55 AM
dberlin added inline comments to D45320: [MemDep] Fixed handling of invariant.group.
Thu, May 3, 7:55 AM
dberlin updated subscribers of D45320: [MemDep] Fixed handling of invariant.group.

So, please don't let me stand in the way of reasonable fixes. I only get
~20 minutes a day to look at llvm stuff these days unless i'm on vacation.
Maybe some day it'll be better again, but i suspect it'll be a while ;)

Thu, May 3, 7:54 AM

Wed, May 2

dberlin added inline comments to D45734: [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions..
Wed, May 2, 9:52 AM
dberlin added inline comments to D45734: [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions..
Wed, May 2, 9:42 AM
dberlin added a comment to D45734: [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions..

This looks right.
There is an optimal order to visit and revisit, and if you switch them, you will definitely take an extra N iterations per instruction :)

Wed, May 2, 9:38 AM
dberlin accepted D45734: [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions..
Wed, May 2, 9:31 AM
dberlin added a comment to D44626: [InstCombine] Fold (A OR B) AND B code sequence over Phi node .

I'm still not sure this is really as general as it could be, but I guess it's okay.

I think what this patch really wants to ask/do is: "Does this binop simplify with the incoming value of the phi to 1 of the binop operands? If yes, substitute that value into the phi."

If you look at it that way, then it should be some add-on to the existing foldOpIntoPhi() - and that's probably a smaller and more general patch.

That substitution analysis seems to fall into the gray area -- or not gray depending on your viewpoint :) -- of whether this belongs in (New)GVN or instcombine. (cc @fhahn).

Wed, May 2, 7:58 AM

Tue, May 1

dberlin added a comment to D46259: [CFLGraph][NFC] Simplify/reorder switch in visitConstantExpr.

I don't understand at all why this is better/simpler

Tue, May 1, 12:47 PM
dberlin added a comment to D46279: [LVI] Remove an assert on case which could happen in LazyValueInfo..

The assertion is checking the a value is not live-in to the entry block unless it is a global. That assert is correct. From your test case, it looks like you've actually found a place where we recurse on a non-local search we shouldn't have. (i.e. a local add with two constant operands should never trigger the backwards walk.)

Tue, May 1, 9:20 AM
dberlin added inline comments to D46279: [LVI] Remove an assert on case which could happen in LazyValueInfo..
Tue, May 1, 8:54 AM

Mon, Apr 30

dberlin added a comment to D44564: [BasicAA] Relax restriction on PHI node handling..

You can't really have something it uses do caching, because there is no way to "invalidate" BasicAA.

But it looks like that's already happening?

Not really intentionally, it just grew this way :-)

Mon, Apr 30, 12:14 PM

Thu, Apr 26

dberlin added a comment to D44564: [BasicAA] Relax restriction on PHI node handling..

Hmm. I don't see anything wrong with the code, but how often will this trigger? instcombine will remove the PHI nodes in your testcase, so this is only a problem if you have a pass which creates PHI cycles like this. I guess GVN does in some cases? Can we avoid doing that?

Yes, this comes from GVN where it temporarily introduces phi nodes like this that are later removed, but which in the meantime prevent PRE from happening because of this inaccurate alias analysis. I'd previously gone for the approach of fixing this in GVN in D44160, but the conclusion there was to do it in alias analysis.

Yeah, i'd really like to understand the value of this before make this code more complex and handle more cases.

Really, as we've discussed many times, BasicAA's complex phi/cycle handling should probably be moved out to a caching forward pass that properly resolves cycles, instead of a lazy backwards pass that tries depth limiting to handle cycles. This was waiting on the new pass manager work, AFAIK (because you couldn't cache it otherwise)

Now that this is done (again, AFAIK), one way to start doing that would be to make PhiAA here, have it handle the cycles you are talking about (it's trivial to compute phi-induced cycles and it's not going to affect compile time in any meaningful way even if you invalidate it a lot. There is code to do it already if you want it, it's <100 lines) and do caching, and fall back to BasicAA.

Then we can slowly move functionality into that, from BasicAA.

If I understand you correctly, you're suggesting adding a new alias analysis pass which just handles phis (but by analyising them forwards, instead of backwards as BasicAA does), yes?

Thu, Apr 26, 12:08 PM

Wed, Apr 25

dberlin added a comment to D44564: [BasicAA] Relax restriction on PHI node handling..

Yeah, i'd really like to understand the value of this before make this code more complex and handle more cases.

Wed, Apr 25, 1:20 PM

Apr 21 2018

dberlin updated subscribers of D45910: [BasicAA] Don't assume a PHI node has inputs.

Trying to handle the invalid code we allow in unreachable blocks had so far
been a generally losing proposition. AFAIK we've always eventually given up
before getting any pass actually bug free. Can we just stop asking aa
about this?

Apr 21 2018, 10:17 AM

Apr 20 2018

dberlin added a comment to D42179: [NewGVN] Re-evaluate phi of ops after moving an instr to new class .

Okay. There is definitely at least 1 cause of value numbers changing after
the main loop completes that is not phi-of-ops related
(it's SCC related)

Apr 20 2018, 10:23 AM

Apr 19 2018

dberlin added a comment to D43865: [NewGVN] Split OpPHI detection and creation..

So, just to put this on the review: the current code on this review definitely not the right fix compared to the patch i sent :)
I think we should commit that split/cleanup (not sure if you did it as part of the last patch.
It sounds like you had time to bootstrap/test it. If so, please feel free to commit it

Apr 19 2018, 7:35 AM

Apr 18 2018

dberlin accepted D42180: [NewGVN] Add ops as dependency if we cannot find a leader for ValueOp..

Sorry for the delay. I've mostly convinced myself this isn't wrong, so let's go with it for now.

Apr 18 2018, 3:23 PM

Apr 12 2018

dberlin added a comment to D45599: [SimplifyLibcalls] Memcpy + strlen propagation.

GVN should already be able to get this, and if it can't, improve it's intrinsic handling (Either in VNCoercion's memory transfer intrinsic handling, or ...).

Apr 12 2018, 4:44 PM
dberlin updated subscribers of D45448: [CVP] simplify phi with constant incoming values that match common variable edge values .

https://bugs.llvm.org/show_bug.cgi?id=34531

Apr 12 2018, 4:03 PM
dberlin added a comment to D45510: [RFC][AliasAnalysis][BasicAA] Return MayAlias for the pointer plus variable offset to structure object member.

This C case is actually not legal (it's been discussed before many times).
You cannot take the address of .a field, add an offset, and get to another field legally.
So i wouldn't express it in these terms, since in that language, this is clearly illegal.

Apr 12 2018, 12:53 PM

Apr 11 2018

dberlin added a comment to D45448: [CVP] simplify phi with constant incoming values that match common variable edge values .

oh phabricator email and it's inability to be consistent about putting emails in comments.
Read the thread on llvm-commits and it'll make more sense :)

Apr 11 2018, 3:38 PM

Apr 10 2018

dberlin added a comment to D45448: [CVP] simplify phi with constant incoming values that match common variable edge values .

(and i have patches to start down this path if anyone is ever interested)

Apr 10 2018, 2:36 PM

Apr 9 2018

dberlin added a comment to D45330: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions..

whoops, hit send early.
That said, if you find this actually generates gains (papers on it basically say it should be a few percent improvement in found constants), i don't have any intrinsic reason not to do it just make sure we get the cost vs gain tradeoff right.

Apr 9 2018, 3:02 PM
dberlin added a comment to D45330: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions..

So, predicateinfo is one of the fastest passes (or was) that we had. I'm not horribly worried about speed (i could make it faster, too).
It was faster than anything else we do that isn't lazy (IE it's faster than computing dominators, etc).
It wasn't even really measurable except on hundreds of thousands or millions of blocks.

Apr 9 2018, 2:30 PM
dberlin added a comment to D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..

I'm fine with this, we can always improve it more later.

Apr 9 2018, 12:31 PM

Apr 5 2018

dberlin added a comment to D45320: [MemDep] Fixed handling of invariant.group.

"I know, I also wish we would get rid of MemDep asap, but as long as it's there, it would be good to avoid weird miscompilations that causes calls to vtable."

Apr 5 2018, 1:55 PM
dberlin added a comment to D45320: [MemDep] Fixed handling of invariant.group.

It's kind of hard for me to know what to do with this.
The cache invalidation is still very very broken with or without this patch.
So this patch is not wrong but everything it does probably needs to be rewritten anyway

Apr 5 2018, 10:52 AM

Apr 4 2018

dberlin added inline comments to D45299: API to update MemorySSA for cloned blocks..
Apr 4 2018, 11:23 PM
dberlin accepted D44257: [MemorySSAUpdater] Mark non-trivial Phi for not to optimize.

This looks right to me (IE if it's wrong, i'm wrong about what will work, it looks implemented correctly)

Apr 4 2018, 12:06 PM

Apr 3 2018

dberlin updated the diff for D45238: MemCpyOpt does not preserve MemDep, unforunately. See PR 36965, 36964, 36940, 36944, 36943, and 36941 Testcases coming..
  1. Updating D45238: MemCpyOpt does not preserve MemDep, unforunately. See PR 36965, 36964, 36940, 36944, 36943, and 36941
Apr 3 2018, 4:46 PM
dberlin created D45238: MemCpyOpt does not preserve MemDep, unforunately. See PR 36965, 36964, 36940, 36944, 36943, and 36941 Testcases coming..
Apr 3 2018, 4:40 PM

Mar 28 2018

dberlin updated subscribers of D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..

I'm ... confused.

Mar 28 2018, 4:38 PM
dberlin updated subscribers of D44891: [RFC][GVN] Remove redundant load by GVN.

Sorry, again, i don't think this is a thing memdep should be doing.
If GVN can't handle this sanely, that's on GVN.
So i'm not going t approve this.

Mar 28 2018, 10:40 AM

Mar 26 2018

dberlin requested changes to D44891: [RFC][GVN] Remove redundant load by GVN.
Mar 26 2018, 8:54 AM
dberlin added a comment to D44891: [RFC][GVN] Remove redundant load by GVN.

This isn't correct in the face of atomics and volatiles, and IMHO, MemoryDependence should not be performing this optimization.
If you want, teach GVN to requery MemDep in these cases.

Mar 26 2018, 8:40 AM

Mar 23 2018

dberlin accepted D44715: [MemorySSA] Fix exponential compile-time updating MemorySSA..

(I do wonder if we shouldn't add a unit test to the MemorySSAUpdater unit tests that just has 100 nested if or something that will take forever with exponential time but be fine with N^2 time, but otherwise, ...)

Mar 23 2018, 3:19 PM

Mar 22 2018

dberlin accepted D44817: Fix a block color copying problem in LICM.
Mar 22 2018, 10:11 PM
dberlin added a comment to D44817: Fix a block color copying problem in LICM.

Sigh.
(a quick scan of things named Map in LLVM doesn't find any other obvious cases of this).

Mar 22 2018, 10:11 PM

Mar 21 2018

dberlin requested changes to D44715: [MemorySSA] Fix exponential compile-time updating MemorySSA..

(there is a longer report version that describes the marker variation in detail that is escaping me at the moment. There is also code that implements it in clang).

Mar 21 2018, 3:00 PM

Mar 20 2018

dberlin added a comment to D44715: [MemorySSA] Fix exponential compile-time updating MemorySSA..

(but otherwise, if we discover the paper is wrong/etc, this change looks reasnable)

Mar 20 2018, 9:40 PM
dberlin added a comment to D44715: [MemorySSA] Fix exponential compile-time updating MemorySSA..

In particular, they should have the same time bounds as us, if both algorithms operate on blocks that each contains 1 instruction.
If the exponential factor comes from repeated walking, they should be just as exponential
Thus, my assumption is we screwed up.

Mar 20 2018, 7:41 PM
dberlin added a comment to D44715: [MemorySSA] Fix exponential compile-time updating MemorySSA..

So, staring at https://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf, i don't see how we can get exponential time, and thus wonder if i screwed this up somehow.
I can see we walk blocks to find the defs and they don't have to (because of how they use it), but that should only add a factor of O(N).

Mar 20 2018, 7:35 PM

Mar 19 2018

dberlin updated subscribers of D40146: [JumpThreading] Preservation of DT and LVI across the pass.

+1. Real time updating in the presence of how JT operates right now is too
costly.
The lazy analysis it uses require DT be updated at all times, forever
(instead of say, once an iteration).
There is no way to update the domtree 100000 times a pass and make it fast
:)
What we should be targeting is O(10)-O(100) updates.

Mar 19 2018, 11:55 AM

Mar 16 2018

dberlin added a comment to D44564: [BasicAA] Relax restriction on PHI node handling..

Yeah, that is why on the other thread i suggested it probably makes sense
to just change how AA is doing it rather than all simplification users

Mar 16 2018, 1:31 PM
dberlin updated subscribers of D44564: [BasicAA] Relax restriction on PHI node handling..

https://reviews.llvm.org/D28934 has a version of the SCC finder you can
borrow if you want (TarjanSCC class)

Mar 16 2018, 8:28 AM
dberlin added a comment to D44564: [BasicAA] Relax restriction on PHI node handling..

What you've implemented (or trying to) is really the SCC algorithm from from "Simple and Efficient
Construction of Static Single Assignment Form" at
http://compilers.cs.uni-saarland.de/papers/bbhlmz13cc.pdf

Mar 16 2018, 8:26 AM
dberlin added a comment to D44160: [GVN] Don't use the eliminated load as an available value in phi construction.

FWIW: It's also possible to just fix AA without InstSimplify just by
replacing what it uses with something more powerful.
That would avoid disturbing the other users.

Mar 16 2018, 8:23 AM

Mar 12 2018

dberlin updated subscribers of D44160: [GVN] Don't use the eliminated load as an available value in phi construction.

In particular, i feel this way because GVN should already get your case
without this.
The fact that it doesn't is an issue with AA.
Working around it like this only fixes GVN's ability to optimize this.
There will be plenty of cases something *else* could eliminate code like
that will not be able to
Fixing AA fixes it for everyone (and doesn't require pass-specific
workarounds).

Mar 12 2018, 9:18 PM
dberlin added a comment to D34135: [JumpThreading] Use DT to avoid processing dead blocks.

Yeah, my review window didn't update for some reason till just now.
LGTM

Mar 12 2018, 12:19 PM
dberlin added a comment to D44160: [GVN] Don't use the eliminated load as an available value in phi construction.

So this looks like a really weird optimization to avoid AA being broken.
As you say, the code it inserts will be fixed up later (and you are taking an O(1) function and making it O(N).
We should just fix AA.

Mar 12 2018, 9:07 AM
dberlin accepted D34135: [JumpThreading] Use DT to avoid processing dead blocks.

I'm a little worried about the duplicated code here. I feel like you are having to patch way too many points in the same function, and it will be easy to miss one in the future if someone changes the logicc slightly.

Mar 12 2018, 8:53 AM

Mar 8 2018

dberlin added a comment to D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..

So, FWIW:
It's definitely possible to make the SSAUpdater rewrite faster in bulk insertion. I just never bothered because it wasn't used that way :)
You will never make it as fast as exploiting the problem structure, but it could be made a *lot* faster.

Mar 8 2018, 6:14 PM
dberlin updated subscribers of D44147: [InstCombine] Fold add/sub over phi node.

Yeah, this is a generally tricky case to get.
It requires a value based PRE that is going to do a phi of ops transform,
because this testcase is really:

Mar 8 2018, 4:43 AM

Mar 6 2018

dberlin added a comment to D44147: [InstCombine] Fold add/sub over phi node.

This transform is a variation of some of the other phi of ops vs op of phi transforms we already do, just applied to a very limited case of expressions that would not normally be fully available, but where you've decide the cost is low.
It is also likely to screw up loop induction variables :)

Mar 6 2018, 2:02 PM
dberlin added a comment to D43865: [NewGVN] Split OpPHI detection and creation..

At a minimum, the patch needs bootstrap and testing, i didn't test it except on this case and test/Transforms/NewGVN.
:)

Mar 6 2018, 9:55 AM

Mar 2 2018

dberlin added a comment to D43865: [NewGVN] Split OpPHI detection and creation..

There is also a case 4 here i forgot, which is get unlucky, find something (where all blocks have the same number of preds) and probably generate wrong code :)

Mar 2 2018, 1:56 PM
dberlin added a comment to D43865: [NewGVN] Split OpPHI detection and creation..

So, we are definitely "doing it wrong".

Mar 2 2018, 1:54 PM

Feb 28 2018

dberlin accepted D40874: [LV][LoopInfo] Add irreducible CFG detection for outer loops.

Looks good to me, thanks so much for the updates!

Feb 28 2018, 9:31 PM
dberlin added a comment to D43865: [NewGVN] Split OpPHI detection and creation..

I don't understand how this can happen, since the value phi should be equivalent to an op phi in the same block.
I need to look at this more.

Feb 28 2018, 7:36 AM

Feb 26 2018

dberlin accepted D43770: [NewGVN] Update phi-of-ops def block when updating existing ValuePHI..

Can you just use update_test_checks to generate the CHECK lines? We are trying to use it for all the newgvn tests.

Feb 26 2018, 11:21 AM
dberlin added a comment to D43770: [NewGVN] Update phi-of-ops def block when updating existing ValuePHI..

This is definitely correct.
Note that this also means we are likely leaving around TempToBlock mappings for ops because we don't remove them in removePhiOfOps.
(But that would require reference counting ATM, i believe, because they may be reused).
This shouldn't hurt anything, i believe.

Feb 26 2018, 11:20 AM

Feb 25 2018

dberlin added inline comments to D43743: StructurizeCFG: Test for branch divergence correctly.
Feb 25 2018, 8:31 AM

Feb 21 2018

dberlin added a comment to D40874: [LV][LoopInfo] Add irreducible CFG detection for outer loops.

Thanks, that is very helpful. It's much more likely that Dominator info is up to date than loop info, but it's not worth getting into here. The only thing I request is that you update the comments to suggest this only probably makes sense to use if loopinfo is computed already, and please just add your description of how it works with the example to the comments.

Feb 21 2018, 6:11 PM

Feb 20 2018

dberlin added a comment to D40874: [LV][LoopInfo] Add irreducible CFG detection for outer loops.

Hey, can you help me understand the algorithm?
I don't have a good feel for the invariants you expect going in.
It just says "it works on anything given loop info".
IE What info you expect the loopinfo to provide vs the topo ordering.

Feb 20 2018, 12:58 PM