dberlin (Daniel Berlin)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 13 2012, 1:07 PM (266 w, 4 d)

Recent Activity

Yesterday

dberlin added a comment to D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly.

I think this is mostly reasonable at this point.
If you clean up the naming so it matches the style guide (IE you have a lot of things ending in T for no reason, etc), i'll approve it.

Tue, Aug 22, 2:31 PM
dberlin added a comment to D36331: Add ARC backend.

These copyright notices definitely should not be here :)

Tue, Aug 22, 8:59 AM

Mon, Aug 21

dberlin added a comment to D36878: Inst Combine GEP Flatten.

I don't think there is any ideal here. In the example you give, it reduces the number of bitcasts. In some of the tests you've changed, it adds more.

Mon, Aug 21, 7:07 PM

Thu, Aug 17

dberlin added a comment to D36680: [test-suite] Adding Pathfinder Benchmark.

This is the same as what i said on D36682 :)

Thu, Aug 17, 10:13 PM
dberlin added a comment to D36682: [test-suite] Adding miniAMR Benchmark.

TL;DR This statement is slightyl different, but also has no effect.
It is also a statement that is more or less required by statute/regulation.

Thu, Aug 17, 10:13 PM
dberlin added a comment to D36683: [test-suite] Adding miniFE Benchmark.

The statements are, AFAIK, best viewed as statements of provenance, and are fairly normal for government released works (some are even required by statute)
They do not change or affect the license in any way.

Thu, Aug 17, 10:07 PM

Tue, Aug 15

dberlin added a comment to D36113: [Loop Vectorize] Vectorize Loops with Backward Dependence.
Tue, Aug 15, 8:05 AM

Mon, Aug 14

dberlin added inline comments to D36113: [Loop Vectorize] Vectorize Loops with Backward Dependence.
Mon, Aug 14, 3:33 PM
dberlin added a comment to D36656: [SCCP] Propagate integer range information in IPSCCP (WIP)..

Should we use range/nonnull metadata or assumptions to generate ranges?

Mon, Aug 14, 12:37 PM
dberlin added a comment to D36656: [SCCP] Propagate integer range information in IPSCCP (WIP)..

Thanks for working on this!

Mon, Aug 14, 10:41 AM

Sat, Aug 12

dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

A few things, just so they end up here:
The NTSCD folks, unfortunately, seem to prove all of the algorithms for weak order dependence are equivalent to those that require with a unique end node. See, http://people.cs.ksu.edu/~tamtoft/Papers/Amt+al:WODvsCD/short.pdf . This means, i believe, outside of the NTSCD algorithms, we won't have a lot of luck with infinite loops.

Sat, Aug 12, 8:29 AM

Fri, Aug 11

dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

I think that we should go ahead with this approach.

Thanks for reviewing this.

Fri, Aug 11, 8:32 PM
dberlin added a comment to D36432: [IPSCCP] Add function specialization ability.

Danny/Davide,

Thanks very much for the feedback. I have some replies to your comments below (abridged, so it's easier to read in Phab.), but I thought I would first summarize things so far. It sounds like your main points are that (1) the cost model for function specialization is difficult to get right in practice, and that (2) our current IPSCCP infrastructure could be improved to do better than pass-through for arguments and returns. I agree with both of these points. So do we think we should iterate on this patch and add function specialization to our current infrastructure?

Fri, Aug 11, 1:24 PM
dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

I think this patch {c,sh}ould be considered for commit as is.

OK, another person is voting against me. I certainly won't single handedly block this this. As I said, I would like to have a brief opinion from Hal, as we discussed extensively. I will ask him. If he agrees with you guys, I am out.

The reason is twofold:

  1. The current postdom behaviour has some known brokeness that caused several bugs. PDSE is just an example, but try to fuzz LLVM and you'll find all this sort of craziness. I've reviewed the original change in January to change this behaviour, which was reverted because you disagreed. I'm afraid that until somebody sits down and implements what you propose (probably you :) the discussion can go on and on indefinitely.

I implemented an alternative (and proposed a second which I implemented earlier). I am still waiting for a test-case to break them.

Fri, Aug 11, 1:07 PM
dberlin added a comment to D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly.

This looks reasonable at a glance, note it is likely to be subtly broken in some cases until D35381 goes in.
I've added a few coments, i'll take a deeper look in a little bit.

Fri, Aug 11, 11:43 AM

Thu, Aug 10

dberlin added a comment to D36553: [InstCombine] Add a DEBUG_COUNTER to InstCombine to limit how many instructions are visited for debug.

The debug counter infrastructure also supports skipping some number of calls at the beginning as well, but that feels like it would generate very odd behavior with InstCombine.

Thu, Aug 10, 12:47 AM

Wed, Aug 9

dberlin added inline comments to D36113: [Loop Vectorize] Vectorize Loops with Backward Dependence.
Wed, Aug 9, 11:06 PM
dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

BTW, if you wish to convince yourself what i said about two identical PDTs not generating the same PDF, here is an example:

Wed, Aug 9, 5:55 PM
dberlin added a comment to D36539: [NewGVN] Add command-line option to control the generation of phi-of-ops (disable by default).

I think where i come down on this is: disabled by default , and modify the existing test arguments to enable it, so that you don't have to xfail or modify a ton of tests :)
(then when it's fixed we can just remove the test arguments again).
But i'm also not opposed to doing the test updating.

Wed, Aug 9, 1:17 PM
dberlin added a comment to D36432: [IPSCCP] Add function specialization ability.

Very neat work, thanks, this was in my todolist for a while :)
Some meta comments, I'll try to find some time to review this more carefully this week.

Thanks for taking a look!

I wonder if we're at a point where it makes sense to split the lattice solver from the transforms, as with your partial specialization code SCCP is growing a lot :)
This would have also a bunch of other advantages, e.g. we could more easily try to plug arbitrary lattices (for example for variables/ranges/bits).
IIRC Chris Lattner started a propagation enging a while ago, it should still be in tree, but unused. What are your thoughts?

Yes, I think I see what you're talking about. Analysis/SparsePropagation.h? I haven't looked at this closely yet, but I can see some advantages to using it. Any idea why SCCP hasn't already been switched over to use it?

Same as anything else - lack of time for someone ;)

Wed, Aug 9, 1:07 PM
dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

If you start with a CFG where you require every node go to exit, you don't need to do anything special just when computing postdom, because you just added the previously virtual edges to the actual CFG instead :)

Very right, this phrase is very common (and basically the indication that they did not think about incomplete CFGs at all).
This is why none of the above papers directly applies.

Wed, Aug 9, 8:33 AM
dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

Add Dani's email reply to Phab:

> If you generate that, you will eliminate the store a, 5 again, and you should not.

Interesting, but slightly confusing.

This is what i get for trying to reduce an algorithm quickly.

That's OK. It gives us an example we can work with. At least we are now on the same page. We still need to correct the example and identify the property you are interested in, but I feel we make progress. Thanks a lot again!

I'm sure i incorrectly removed some postdominance frontier checks or something.

I doubt a post-dominance frontier check can be the reason. The post-dominance frontier is based on the PDT, so it should be identical for the two examples.

It's probably easiest to take one of the working PDSE patches and look at that.

AFAIK, you have been working closely with the PDSE people. The patches seem still WIP, so I will have a harder time digging through them to find the part that matters. Any chance you happen to have one of the authors in the office, such that you can ask him/her over lunch?

Wed, Aug 9, 8:25 AM
dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

Add Dani's email reply to Phab:

> If you generate that, you will eliminate the store a, 5 again, and you should not.

Interesting, but slightly confusing.

This is what i get for trying to reduce an algorithm quickly.

That's OK. It gives us an example we can work with. At least we are now on the same page. We still need to correct the example and identify the property you are interested in, but I feel we make progress. Thanks a lot again!

I'm sure i incorrectly removed some postdominance frontier checks or something.

I doubt a post-dominance frontier check can be the reason. The post-dominance frontier is based on the PDT, so it should be identical for the two examples.

This reasoning is not quite correct:
FIrst, remember that with this patch, we generate a different PDT for your example and mine. That's in fact, part of the point :)

Wed, Aug 9, 8:06 AM

Tue, Aug 8

dberlin accepted D36478: [NewGVN] Use a cast instead of a dyn_cast..

thank you!

Tue, Aug 8, 11:15 AM

Mon, Aug 7

dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

(Copying this comment because it ended up on the wrong review thread)
Note: There is also a rewritten DSE that uses PostDom as well: https://reviews.llvm.org/D29624

Mon, Aug 7, 10:15 AM

Thu, Aug 3

dberlin added a comment to D36287: [SimplifyCFG] Simplify based on a dominating condition (looking past hammocks and diamonds)..

EarlyCSE should be able to do this as well, and PredicateInfo will build a form that makes this trivial for anything to do it, in linear time, without any limits necessary.
(By numbers, predicateinfo was the fastest analysis we had that walked the IR at the time it was committed, so ...)

Thu, Aug 3, 4:15 PM

Wed, Aug 2

dberlin added a comment to D35851: [Dominators] Include infinite loops in PostDominatorTree.

While I currently do not have any benchmark where optimizing infinite loops is useful (is there one I could have a look?), let's working under the assumption that it actually is. Chandler and Daniel raised the point that just the fact that the PDT contains all nodes makes user code simpler, which is another argument for this kind of modeling. In general, both points together make a good motivation to include all nodes in the PDT. I agree, assuming we understand and are happy with the resulting implications.

Wed, Aug 2, 1:17 PM
dberlin accepted D35391: [Dominators] Teach LoopDeletion to use the new incremental API.

This looks fine to me.
I think it may make sense to try to add a few simple cfg drawings to show what machinations are really happening here, but if others undersrtand them just from the text descriptions, awesome.

Wed, Aug 2, 9:34 AM
dberlin added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.

Let be very clear: I'm actually talking about *this specific transformation*, not instcombine.

The general transformations between
a = x op y
b = x op y
result = phi(a, b)
and
1 = phi(x, y)
2 = phi(x, y)
result = 1 op 2

is exponential when applied repeatedly.

Well, yes, because that would introduce two phis where there originally was just one. But InstCombine doesn't do that, either with or without my patch. Before the patch, InstCombine used to transform:

Wed, Aug 2, 9:25 AM
dberlin added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.

The testcase is extracted from a real-word program. On that program, the transformation (moving some of those operations out of a hot loop) is a significant overall win (about 10% improved performance of the whole program). I agree that this application is a quite special case -- this patch doesn't make much difference to the overall performance of the platform in general.

I'd be against doing it for this reason, but i'll leave it to others.

Agreed that the overall InstCombine algorithm can exhibit performance issues, and should probably be replaced in the long term. But I don't think this particular patch makes those issues significantly worse :-)

Let be very clear: I'm actually talking about *this specific transformation*, not instcombine.

The general transformations between
a = x op y
b = x op y
result = phi(a, b)
and
1 = phi(x, y)
2 = phi(x, y)
result = 1 op 2

is exponential when applied repeatedly.

Wed, Aug 2, 8:48 AM
dberlin added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.

The testcase is extracted from a real-word program. On that program, the transformation (moving some of those operations out of a hot loop) is a significant overall win (about 10% improved performance of the whole program). I agree that this application is a quite special case -- this patch doesn't make much difference to the overall performance of the platform in general.

Wed, Aug 2, 8:45 AM
dberlin added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.

NewGVN currently has a limitation that davide is fixing around recursive translation, so it won't get it quite yet.
Note also that the usual transform it performs is to remove the redundancy, not to just sink (which was the original usecase for the transform).
If you mainly care about the sinking, it may require a bit of thought. It's also worth noting our dividing line for instcombine was one expressed to me as "local transforms", and this is most certainly not :)
(IE if this is a local transform, i can express PRE in InstCombine, and have it do global PRE).

Wed, Aug 2, 8:09 AM

Tue, Aug 1

dberlin added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.

Please please don't :)
Doing it the way this codepath does it can be exponential time/space
NewGVN will do the same thing, and in fact, catch all possible cases that can exist here in polynomial time.

Tue, Aug 1, 2:34 PM

Mon, Jul 31

dberlin accepted D35441: Allow None as a MemoryLocation to getModRefInfo.

So, this is a subset of an already approved patch, so LGTM.

Mon, Jul 31, 1:42 PM
dberlin added inline comments to D36073: [CGP] Extends the scope of optimizeMemoryInst optimization.
Mon, Jul 31, 10:40 AM
dberlin added inline comments to D35851: [Dominators] Include infinite loops in PostDominatorTree.
Mon, Jul 31, 8:48 AM

Thu, Jul 27

dberlin added a comment to D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly.

FYI: IDFCalculator can compute the PostDominanceFrontier in linear time.

I was unable to find how to pass the root node of inverse graph, because there might be more than one root.
Thanks,

No.
The graph always has one root, virtual or not.
Once kuba's latest patch is committed, it will *always* be virtual, actually.

But i'm not sure why that would stop you one way or the other?
The IDF calculator only asks for the defining blocks, not the root.
You can hand it all CFG blocks and it should work fine.

What I understand is that the API of IDFCalculator::calculate, populates a vector with all the dominance frontiers of the defining blocks (AFAICT from the usage in ADCE.cpp).
How can I create a mapping of a basic block vs. its (post) dominance frontier. For this pass I would need to have such a mapping. I guess I'm having difficulty understanding the code. THanks for the help.

Thu, Jul 27, 3:58 PM
dberlin accepted D35851: [Dominators] Include infinite loops in PostDominatorTree.

I'm going to accept this.
But please give Tobias until early next week to comment, given his past concerns.
(IMHO, if he wants to suggest we do something different, at this point, I believe the onus is on him to implement such a thing and show it works)

Thu, Jul 27, 8:06 AM

Wed, Jul 26

dberlin added a comment to D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly.

FYI: IDFCalculator can compute the PostDominanceFrontier in linear time.

I was unable to find how to pass the root node of inverse graph, because there might be more than one root.
Thanks,

Wed, Jul 26, 3:47 PM
dberlin added a comment to D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly.

FYI: IDFCalculator can compute the PostDominanceFrontier in linear time.

Wed, Jul 26, 3:05 PM
dberlin accepted D35581: [LoopRotate][Dominators] Use the incremental API to update DomTree.
Wed, Jul 26, 10:11 AM
dberlin accepted D35589: [LoopUnswitch][Dominators] Don't recalculate DominatorTree.
Wed, Jul 26, 10:10 AM
dberlin accepted D35597: [Dominators] Move root-finding out of DomTreeBase and simplify it.
Wed, Jul 26, 10:08 AM
dberlin added inline comments to D35851: [Dominators] Include infinite loops in PostDominatorTree.
Wed, Jul 26, 10:05 AM

Tue, Jul 25

dberlin added inline comments to D35851: [Dominators] Include infinite loops in PostDominatorTree.
Tue, Jul 25, 3:35 PM

Jul 21 2017

dberlin added a comment to D35741: Add MemorySSA alternative to AliasSetTracker in LICM..

Thanks, I'll review this soon.
A quick question. a concern people have about MemorySSA is that you pay upfront for O(1) queries, which may have some impact on compile time.

FWIW: This is and always has been false accounting in most passes as you know.
Most passes that could use MemorySSA already ask about every load and store in the function.
The only reason people think it's faster to not use MemorySSA, or to avoid the upfront cost, is because our time-passes doesn't account function time to lazy utilities/analysis properly.
If it did, AA/etc time would look *much* larger than it does now, because in reality, it *is* much larger than it seems.

Jul 21 2017, 5:15 PM

Jul 20 2017

dberlin added a comment to D34822: [LVI] Constant-propagate a zero extension of the switch condition value through case edges.

NewGVN, and other things will already get such cases.
I have a bug open for LVI to use predicateinfo, which would make it sensitive to this, as well as make it much faster, but that is a big amount of work.

Jul 20 2017, 5:44 PM

Jul 18 2017

dberlin accepted D35571: [Dominators] Improve error checking in deleteEdge.
Jul 18 2017, 11:24 AM

Jul 17 2017

dberlin added a comment to D35474: SSAUpdater: Add mode when Phi creation is not allowed.

SSA Updater exists to perform SSA Updates for complicated cases (I also have a patch out to rewrite it, because it's super-slow. The algorithm i gave you is a variant of that patch. See https://reviews.llvm.org/D28934).

Jul 17 2017, 10:42 PM
dberlin added a comment to D35474: SSAUpdater: Add mode when Phi creation is not allowed.

Hi Daniel, I plan to upload the patch for PR26233 this week. I'd like to think a bit more about relation of recursion and the patch I have - whether it works correctly and whether SSAUpdater work correctly with this pattern.

For now, shortly, the idea of future patch is as follows: in CodeGenPrepare::OptimizeMemoryInst I'd like to collect all cases where address computation is identical but base might be different. So to sink address computation to memory instruction I need to find a common Phi node for all bases which can be used in BB where memory instruction is located. For that purpose I'd like to use SSAUpdater. However it is possible that Phi node for bases may not exists (for example if there is no further uses of base it can be removed).

Jul 17 2017, 10:18 PM
dberlin added a comment to D35503: Make OrderedInstructions able to handle queries that Dominators could do, but faster (OrderedInstructions is O(N) once, O(1) after that, Dominators is O(N) all the time).

(Sorry, this is a work in progress, as you can see from the OldResult stuff that i'm using for verification)

Jul 17 2017, 2:05 PM
dberlin added a comment to D35474: SSAUpdater: Add mode when Phi creation is not allowed.
I need this patch for implementation of PR26223.
Jul 17 2017, 2:04 PM
dberlin added a comment to D35503: Make OrderedInstructions able to handle queries that Dominators could do, but faster (OrderedInstructions is O(N) once, O(1) after that, Dominators is O(N) all the time).

Note: I don't claim i believe the answers to be 100% correct from a theory perspective, they just match what dominators gives :)

Jul 17 2017, 2:03 PM
dberlin created D35503: Make OrderedInstructions able to handle queries that Dominators could do, but faster (OrderedInstructions is O(N) once, O(1) after that, Dominators is O(N) all the time).
Jul 17 2017, 1:46 PM
dberlin added a comment to D35475: [PATCH] [GVN] Ensure replacement instruction dominates its uses.

I set a breakpoint in the R->moveBefore(I) which I added and this was the block in question:

L.LB1_346:                                        ; preds = %L.LB1_423
  %6 = sext i32 %1 to i64
  %7 = mul i64 %6, 4
  %8 = getelementptr i8, i8* null, i64 %7
  %9 = bitcast void (...)* @poisn2_ to void ([...])*
  call [...]
  %.pre = sext i32 %1 to i64
  br label %L.LB1_350

The way I debugged this was to track the creation of the "%.pre" instruction. This led me to "GVN::performScalarPREInsertion". I set a breakpoint in that function and it gets hit before the BasicBlock above has a chance to go through "processBlock" (during iterateOnFunction). Hence the %7 has not been found yet (not yet in the leader list). This violates the assumption in performScalarPREInsertion, as you stated before (I think) which is documented by the comment in that function:

Jul 17 2017, 9:08 AM
dberlin added a comment to D35475: [PATCH] [GVN] Ensure replacement instruction dominates its uses.

Analyzing more, i'm confused, see the last paragraph.

Jul 17 2017, 8:10 AM

Jul 14 2017

dberlin accepted D35342: [Dominators] Implement incremental deletions.

Thank you!

Jul 14 2017, 2:08 PM
dberlin added a comment to D35317: [EarlyCSE] Handle calls with no MemorySSA info..

Yeah, i forgot to say, i'm fine if you just want to fix this as you have already, but if you do, you should leave a note saying it's not optimal or something.

Jul 14 2017, 9:47 AM
dberlin accepted D35315: [Dominators] Make IsPostDominator a template parameter.

(and feel free to commit once you've update it for all that)

Jul 14 2017, 12:16 AM
dberlin added a comment to D35317: [EarlyCSE] Handle calls with no MemorySSA info..

You may want to not treat these as memory at all, as Vedant suggested on list.

Jul 14 2017, 12:15 AM
dberlin added a comment to D35315: [Dominators] Make IsPostDominator a template parameter.

For the cases that are always true, can you just use PostDominatorTree instead?

From what I understand, PostDominatorTree lives in lib/Analysis and I'm not sure if I can include it everywhere.

How about we add something like:

template <typename T> using ForwardDomTreeBase = DominatorTreeBase<T, false>;
template <typename T> using PostDomTreeBase = DominatorTreeBase<T, true>;

at the bottom of GenericDominatorTree.h and use it instead of DominatorTreeBase<T, true/false>?

Jul 14 2017, 12:08 AM

Jul 13 2017

dberlin accepted D35316: [Dominators] Update Clang's DominatorTree to use the new template argument.
Jul 13 2017, 10:33 AM
dberlin accepted D35342: [Dominators] Implement incremental deletions.

This looks right, but can you give an explicit paper reference in the comments of deleteedge?

Jul 13 2017, 10:33 AM
dberlin accepted D35341: [Dominators] Implement incremental insertions.
Jul 13 2017, 10:31 AM
dberlin added a comment to D35315: [Dominators] Make IsPostDominator a template parameter.

For the cases that are always true, can you just use PostDominatorTree instead?

Jul 13 2017, 10:29 AM
dberlin accepted D35282: [Dominators] Split SemiNCA into smaller functions.

You should submit the machinecombiner change separately, but otherwise okay.

Jul 13 2017, 10:20 AM
dberlin accepted D35279: [Dominators] Improve reachability verification.
Jul 13 2017, 10:14 AM

Jul 12 2017

dberlin added inline comments to D35279: [Dominators] Improve reachability verification.
Jul 12 2017, 8:18 AM

Jul 11 2017

dberlin accepted D35074: [LoopRotate] Fix DomTree update logic for unreachable nodes. Fix PR33701..

I think this is fine for now (the test case actually tests the patch), especially if we are going to move it to use the incremental updater.
It tends to be hard to formulate certain types of testcases right now due to the preconditions of loop-rotate

Jul 11 2017, 8:37 PM
dberlin added inline comments to D35286: [Dominators] Simplify block and node printing.
Jul 11 2017, 8:35 PM
dberlin accepted D35285: [Dominators] Simplify templates.
Jul 11 2017, 8:34 PM
dberlin added inline comments to D35282: [Dominators] Split SemiNCA into smaller functions.
Jul 11 2017, 8:33 PM
dberlin added inline comments to D35279: [Dominators] Improve reachability verification.
Jul 11 2017, 8:31 PM
dberlin accepted D34651: [Dominators] Use a custom DFS implementation.

At some point we should make the DFS iterators more flexible, but i don't think we need to design that for this

Jul 11 2017, 2:56 PM
dberlin added inline comments to D35264: [LICM] Teach LICM to hoist conditional loads.
Jul 11 2017, 11:46 AM

Jul 9 2017

dberlin added inline comments to D35142: WIP! [CFG] Create a new removeUnreachable utility that updates the dom in place.
Jul 9 2017, 8:32 AM

Jul 5 2017

dberlin added a comment to D35028: [GlobalOpt] Remove unreachable blocks before optimizing a function.
for X in depth_first (DT->getNode(block)):

the depth_first here should be post_order, i forgot depth_first is preorder

Jul 5 2017, 3:27 PM
dberlin added a comment to D35028: [GlobalOpt] Remove unreachable blocks before optimizing a function.

Actually, thinking about this more, i'm confused.

Jul 5 2017, 3:26 PM
dberlin added a comment to D35028: [GlobalOpt] Remove unreachable blocks before optimizing a function.

Yeah, this is a very clear case where we should make removeUnreachableBlocks use the new incremental API, and declare victory.

Jul 5 2017, 3:14 PM
dberlin accepted D34478: [BasicAliasAnalysis] Allow idAddofNonZero() for values coming from the same loop iteration..

I'm fine with this for the moment, but, as you know, i think we gotta stop thinking "bandaids" :)

Jul 5 2017, 9:38 AM

Jun 30 2017

dberlin added a comment to D34482: [Dominators] Add parent and sibling property verification (non-hacky).

Tobias,
I added some simple explanation and references in r306912 and wordsmithed it a bit in r306916.
Please let me know if you think they are sufficient :)

Jun 30 2017, 4:51 PM
dberlin accepted D34894: [Dominators] Do not perform expensive checks by default. Fix PR33656..
Jun 30 2017, 9:30 AM

Jun 29 2017

dberlin added a comment to D32720: [LICM] Introduce a finer granularity option to compute early exits..

Actually, that won't entirely work without the DFS numbers.
You can shortcut the false cases using the level numbers (IE it's not possible for Inst to have level number 5, an early exit have level number 3, and you dominate it), but you still need to know what part of the dom tree you are in.

Jun 29 2017, 7:34 PM
dberlin added inline comments to D32720: [LICM] Introduce a finer granularity option to compute early exits..
Jun 29 2017, 7:29 PM
dberlin requested changes to D34823: [PredicateInfo] Invalidate an OrderedBasicBlock when instruction in it is deleted..

Without a reason to do this specifically, it seems a waste of time

Jun 29 2017, 1:22 PM

Jun 26 2017

dberlin accepted D34258: [Dominators] Use Semi-NCA instead of SLT to calculate dominators.

Given we haven't found anything fuzzing wise, i think it's fine to land this, but let's watch out for performance regressions and be ready to revert if we need to.

Jun 26 2017, 6:51 PM
dberlin added a comment to D28934: Write a new SSAUpdater.

Thanks for your comment, I'll see what I can do given this shows up in some of our internal workload.
One thing, I agree we should change the API at some point (and maybe use a O(def + uses) renamer as we do in PredInfo) but, for starters, we should be able to have a drop-in replacement for RewriteUse, i.e. the new algorithm will always pick the correct value?

void SSAUpdater::RewriteUse(Use &U) {
  Instruction *User = cast<Instruction>(U.getUser());

  Value *V;
  if (PHINode *UserPN = dyn_cast<PHINode>(User))
    V = GetValueAtEndOfBlock(UserPN->getIncomingBlock(U));
  else
    V = GetValueInMiddleOfBlock(User->getParent());
[...]
Jun 26 2017, 6:19 PM
dberlin added a comment to D34651: [Dominators] Use a custom DFS implementation.

So, just to note: It's generally nonsense to try to calculate the root list prior to DFS (and a source of bugs in the current postdominator tree). By nonsense i mean "impossible".
You need to see what's reachable to know what ends up needing to be connected to a virtual root.

Jun 26 2017, 4:34 PM
dberlin added a comment to D28934: Write a new SSAUpdater.

@dberlin I just came back to this while I was looking at NewGVN PRE.
I decided to give it a try to see how the API looked like and I personally didn't find any problems with it (I might just used it wrong, FWIW).

https://reviews.llvm.org/P7997

In particular, the old API getValueInTheMiddleOfTheBlock has been source of confusion (at least to me).

With your patch and my patch on top of it, I saw still one test failing:

--- /home/davide/old.ll 2017-05-30 20:56:21.058869000 -0700
+++ /home/davide/new.ll 2017-05-30 20:56:30.450514797 -0700
@@ -316,7 +316,7 @@
   br label %block4

 block4:                                           ; preds = %block3, %block2
-  %D = phi i32 [ 87, %block2 ], [ 97, %block3 ]
+  %D = phi i32 [ 97, %block3 ], [ 87, %block2 ]
   %A = phi i32 [ -1, %block2 ], [ 42, %block3 ]
   br i1 %cmpxy, label %block5, label %exit

@@ -350,7 +350,7 @@
   br label %loop

 loop:                                             ; preds = %loop, %entry
-  %Y2 = phi i8 [ %Y, %entry ], [ 0, %loop ]
+  %Y2 = phi i8 [ 0, %loop ], [ %Y, %entry ]
   %i = phi i32 [ 4, %entry ], [ 192, %loop ]
   %X2 = getelementptr i8, i8* %p, i32 %i
   %cond = call i1 @cond2()
@@ -372,7 +372,7 @@
   br label %loop

 loop:                                             ; preds = %cont, %entry
-  %Y2 = phi i8 [ %Y, %entry ], [ %Y2.pre, %cont ]
+  %Y2 = phi i8 [ %Y2.pre, %cont ], [ %Y, %entry ]
   %i = phi i32 [ 4, %entry ], [ 3, %cont ]
   %X2 = getelementptr i8, i8* %p, i32 %i
   %cond = call i1 @cond2()

I guess this is because we don't have a consistent ordering of PHI args. If that's the only problem, I might consider looking at it next (and then keep converting the remaining users).

Jun 26 2017, 7:45 AM

Jun 25 2017

dberlin added inline comments to D34285: [SROA] Be smarter when copying metadata to retyped loads.
Jun 25 2017, 7:34 PM

Jun 23 2017

dberlin accepted D34420: [Dominators] Move SemiNCAInfo and helper functions out of DominatorTreeBase.

Fine once namespace is renamed

Jun 23 2017, 7:23 PM
dberlin accepted D34482: [Dominators] Add parent and sibling property verification (non-hacky).

This is fine once the properties are documented.

Jun 23 2017, 7:22 PM
dberlin accepted D34527: [Dominators] Rearrange access specifiers in DominatorTreeBase.
Jun 23 2017, 7:22 PM
dberlin accepted D34548: [Dominators] Keep tree level in DomTreeNode and use it to find NCD and answer dominance queries.

Can you also change IteratedDominanceFrontier to use this level info instead of computing it on it's own?

Jun 23 2017, 7:21 PM
dberlin accepted D34575: [Dominators] Add NearestCommonDominator verification.
Jun 23 2017, 7:19 PM
dberlin added a comment to D34478: [BasicAliasAnalysis] Allow idAddofNonZero() for values coming from the same loop iteration..

"It makes the assumption that the values(GEP indices) are from the same iteration if the analysis has not visited any phi's yet."

Can you prove that this is actually true for all codepaths through here?

I agree that my comment about loop iteration is misleading.

In order to call isAddOfNonZero() on index1 and index2 and check whether index2 equals to index1 + x( and, vice versa), we need to prove that we are working on the same copy of index1. Which means the instance that used in index1 is the same as the one used in index2.
If we haven’t visited any phis yet, we can assume that part of the code is flat and we have the same copy, right?

Jun 23 2017, 7:40 AM

Jun 21 2017

dberlin added a comment to D34420: [Dominators] Move SemiNCAInfo and helper functions out of DominatorTreeBase.

I would just name the namespace seminca or something
detail is a random generic name :)

Jun 21 2017, 7:13 PM
dberlin accepted D34427: [Dominators] Move helper functions into SemiNCAInfo.

The formatting looks a little odd.
I'm not sure we would name the namespace detail (i'll comment in the patch that adds it)
Otherwise, looks fine.

Jun 21 2017, 7:12 PM
dberlin added a comment to D34482: [Dominators] Add parent and sibling property verification (non-hacky).

You should document what the parent and sibling properties are.

Jun 21 2017, 7:11 PM
dberlin accepted D34493: [Dominators] Remove DominatorBase class.
Jun 21 2017, 7:10 PM