mzolotukhin (Michael Zolotukhin)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 5 2014, 3:20 PM (180 w, 5 d)

Recent Activity

Yesterday

mzolotukhin added a comment to D41574: [Transforms] Adding a WeakReassociate pass.

What are the benchmark results with this pass (compile-time/performance)? Do we really need it at all opt levels, or can we include it only at -O3?

Wed, May 23, 5:44 PM
mzolotukhin added inline comments to D46775: [LICM] Preserve DT and LoopInfo specifically.
Wed, May 23, 9:05 AM

Mon, May 21

mzolotukhin accepted D46045: [LoopUnswitch] Fix SCEV invalidation in unswitching.

Looks correct to me.

Mon, May 21, 1:57 PM
mzolotukhin accepted D47134: [LoopVersioning] Don't modify the list that we iterate over in addPHINodes.

Looks good to me! (+ a couple of nitpicks inline)

Mon, May 21, 12:05 PM

Fri, May 18

mzolotukhin added a comment to D47023: Limit the number of phis in intptr/ptrint folding .

I thought about this transformation more, and I no longer think that we even need to move it to aggressive-instcombine (or another FunctionPass). What we need is to just change it from top-down to bottom-up: i.e. to start looking not from phi-nodes, but rather from inttoptr instructions. That is, the algorithm would look like:

visitIntToPtr(Instruction &I) {
  Value *Def = I.getOperand(0);
  if (!Def->hasOneUse())
     return;
  if (isa<PtrToIntInst>(Def)) {   // Simple case without phi - it's probably already handled somewhere else, but I'm putting it here for completeness
     I.replaceAllUsesWith(cast<PtrToIntInst>(Def)->getOperand(0));
     return;
  }
  if (isa<PHINode>(Def)) {    // Interesting case where we have a phi-node
     if (all operands are PtrToInt with a single use) {
       NewPHI = RewritePHI();
       I.replaceAllUsesWith(NewPHI);
     }
  }
}

What do you think? Would it work?

Fri, May 18, 5:49 PM

Thu, May 17

mzolotukhin added a comment to D47023: Limit the number of phis in intptr/ptrint folding .

Hi David,

Thu, May 17, 12:44 PM
mzolotukhin accepted D47023: Limit the number of phis in intptr/ptrint folding .

Do you mind adding a TODO note describing the problem and another way to fix it? Otherwise, LGTM.

Thu, May 17, 12:13 PM

Wed, May 16

mzolotukhin added inline comments to D46775: [LICM] Preserve DT and LoopInfo specifically.
Wed, May 16, 2:33 PM
mzolotukhin added inline comments to D46775: [LICM] Preserve DT and LoopInfo specifically.
Wed, May 16, 11:31 AM

Tue, May 15

mzolotukhin added a comment to D46899: [MemorySSA] Don't sort IDF blocks..

Thanks!

Tue, May 15, 11:53 AM
mzolotukhin committed rL332385: [MemorySSA] Don't sort IDF blocks..
Tue, May 15, 11:44 AM
mzolotukhin closed D46899: [MemorySSA] Don't sort IDF blocks..
Tue, May 15, 11:44 AM
mzolotukhin created D46899: [MemorySSA] Don't sort IDF blocks..
Tue, May 15, 11:26 AM

Fri, May 11

mzolotukhin committed rL332168: Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading.".
Fri, May 11, 6:58 PM
mzolotukhin added a comment to D46646: [IDF] Enforce the returned blocks to be sorted..

Thanks!

Fri, May 11, 6:53 PM
mzolotukhin committed rL332167: [IDF] Enforce the returned blocks to be sorted..
Fri, May 11, 6:48 PM
mzolotukhin closed D46646: [IDF] Enforce the returned blocks to be sorted..
Fri, May 11, 6:48 PM

Wed, May 9

mzolotukhin added a comment to D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..

Thanks, Daniel! A fix for IDF was posted for review in D46646. If that fix is accepted, we should no longer need the current patch. Also, we should be able to remove the block sorting from MemorySSA.

Wed, May 9, 9:44 AM
mzolotukhin created D46646: [IDF] Enforce the returned blocks to be sorted..
Wed, May 9, 9:40 AM

Mon, May 7

mzolotukhin added a comment to D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..

The solution in IDF should just be changing domtreenodepair to a tuple, with the last item being domtreenode->getDFSNumIn()

Just to make sure I understand you correctly: we need to change pair<DomTreeNode *, unsigned> to tuple<DomTreeNode *, unsigned, unsigned> and use the second and third elements as keys for the priority queue (i.e. we will need to implement a compare function for it, which will look at RootLevel first and at DfsNumber second if the RootLevels are the same). Is that what you meant?
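
The two-key comparison described above can be sketched as follows. This is only an illustration under stated assumptions: Node, Entry, LevelThenDfs, NodeQueue and popOrder are hypothetical names, not LLVM's types, and the tie-breaking direction (larger DFS number first) is an arbitrary choice for the sketch.

```cpp
#include <cassert>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a dominator tree node; not LLVM's DomTreeNode.
struct Node {
  int Id;
};

// (node, level in the dominator tree, DFS-in number).
using Entry = std::tuple<const Node *, unsigned, unsigned>;

// Compares by level first and by DFS number second. With std::priority_queue
// this pops the largest level first; ties on level are resolved by the larger
// DFS number, so the order never depends on pointer values.
struct LevelThenDfs {
  bool operator()(const Entry &A, const Entry &B) const {
    if (std::get<1>(A) != std::get<1>(B))
      return std::get<1>(A) < std::get<1>(B);
    return std::get<2>(A) < std::get<2>(B);
  }
};

using NodeQueue = std::priority_queue<Entry, std::vector<Entry>, LevelThenDfs>;

// Drains a copy of the queue and returns the node ids in pop order.
std::vector<int> popOrder(NodeQueue Q) {
  std::vector<int> Order;
  while (!Q.empty()) {
    const Node *N = std::get<0>(Q.top());
    assert(N != nullptr && "queue entries must refer to a node");
    Order.push_back(N->Id);
    Q.pop();
  }
  return Order;
}
```

Because both keys are plain numbers assigned by the tree walk, two runs over the same input produce the same pop order even if the nodes are allocated at different addresses.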

Mon, May 7, 6:07 PM
mzolotukhin added inline comments to D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..
Mon, May 7, 5:53 PM
mzolotukhin added inline comments to D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..
Mon, May 7, 5:38 PM
mzolotukhin added a comment to D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..

This patch includes two pieces: fixing non-determinism in SSAUpdaterBulk and enabling it in JumpThreading. When approved, I'll commit them separately, but I included them both here for convenience. The fix for SSAUpdaterBulk requires basic-block enumeration, which should be passed from the SSAUpdaterBulk user, and thus it incurs the corresponding changes in JumpThreading. I'm not a big fan of an extra parameter in the RewriteAllUses function and will be happy to discuss alternatives, but for now I followed an example from MemorySSA. What do you think?
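
A minimal sketch of why a caller-supplied block enumeration removes the non-determinism, assuming a toy Block type. All names here (Block, BlockNumbering, numberBlocks, sortDeterministically) are hypothetical and do not come from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a basic block; not LLVM's BasicBlock.
struct Block {
  const char *Name;
};

// Maps each block to a number assigned by walking the function in order.
using BlockNumbering = std::unordered_map<const Block *, unsigned>;

// Numbers blocks in layout order. The caller owns this map and passes it in,
// which corresponds to the extra parameter mentioned above.
BlockNumbering numberBlocks(const std::vector<Block> &Function) {
  BlockNumbering Numbers;
  unsigned Next = 0;
  for (const Block &B : Function)
    Numbers[&B] = Next++;
  return Numbers;
}

// Sorts by the assigned number rather than by pointer value, so the result
// does not depend on where the blocks happen to be allocated.
void sortDeterministically(std::vector<const Block *> &Blocks,
                           const BlockNumbering &Numbers) {
  std::sort(Blocks.begin(), Blocks.end(),
            [&](const Block *A, const Block *B) {
              assert(Numbers.count(A) && Numbers.count(B));
              return Numbers.at(A) < Numbers.at(B);
            });
}
```

The cost of this design is exactly the one raised in the comment: every user has to build the numbering and thread it through the call.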

Mon, May 7, 5:26 PM
mzolotukhin created D46564: [SSAUpdaterBulk] Sort blocks in IDF to avoid non-determinism..
Mon, May 7, 5:21 PM
mzolotukhin added a comment to D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

I meant to LGTM it provided @dberlin didn't have objections.

Mon, May 7, 11:16 AM
mzolotukhin accepted D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

Looks good to me now, thanks!

Mon, May 7, 11:15 AM

Fri, May 4

mzolotukhin added a comment to D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

I looked deeper into this, and I think you exposed several issues here:

Fri, May 4, 5:30 PM
mzolotukhin added a comment to D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

Ok, I will make an attempt to reduce it further, but if there was an easy way to trigger the problem, then I guess we would have seen it a long time ago (and bugpoint would have been able to remove some more basic blocks).
If it is preferred, then I can remove all the checks and keep it as a "just check that this no longer asserts" kind of test.

Yes, it is much preferred, thank you! If we have more than one problem here, then we need more small tests, but not one huge test.

Fri, May 4, 1:04 PM
mzolotukhin requested changes to D46422: [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions.

First of all, please clean up the testcase. The bugpoint-output tests are impossible to understand and maintain (after some time, it is impossible to say whether such a test checks anything at all). I'm pretty sure you can expose the same issue with a much smaller test.

Fri, May 4, 11:58 AM

Thu, May 3

mzolotukhin committed rL331502: [MachineCSE] Rewrite a loop checking if a block is in a set of blocks without….
Thu, May 3, 6:44 PM
mzolotukhin closed D46411: [MachineCSE] Rewrite a loop checking if a block is in a set of blocks without using a set. NFC..
Thu, May 3, 6:44 PM
mzolotukhin created D46411: [MachineCSE] Rewrite a loop checking if a block is in a set of blocks without using a set. NFC..
Thu, May 3, 4:47 PM

Apr 20 2018

mzolotukhin committed rL330446: Revert r330431..
Apr 20 2018, 10:01 AM
mzolotukhin committed rL330434: Fix typo in a test..
Apr 20 2018, 6:54 AM
mzolotukhin committed rL330431: Revert "Revert r330403 and r330413.".
Apr 20 2018, 6:38 AM
mzolotukhin committed rL330413: [SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites..
Apr 20 2018, 3:35 AM
mzolotukhin committed rL330403: Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time..
Apr 20 2018, 1:04 AM
mzolotukhin committed rL330402: [SSAUpdaterBulk] Add an assert..
Apr 20 2018, 1:03 AM
mzolotukhin committed rL330400: [SSAUpdaterBulk] Add * and & to auto..
Apr 20 2018, 1:02 AM
mzolotukhin committed rL330399: [SSAUpdaterBulk] Use PredCache in ComputeLiveInBlocks..
Apr 20 2018, 1:00 AM
mzolotukhin committed rL330398: [SSAUpdaterBulk] Use SmallVector instead of SmallPtrSet for uses..
Apr 20 2018, 12:59 AM

Apr 19 2018

mzolotukhin added inline comments to D45673: [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses across basic blocks in the limited cases where it is very straight forward to do so..
Apr 19 2018, 12:43 AM

Apr 18 2018

mzolotukhin added inline comments to D45673: [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses across basic blocks in the limited cases where it is very straight forward to do so..
Apr 18 2018, 11:13 PM

Apr 17 2018

mzolotukhin committed rL330180: Revert "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again.".
Apr 17 2018, 12:34 AM

Apr 16 2018

mzolotukhin committed rL330176: [SSAUpdaterBulk] Add debug logging..
Apr 16 2018, 9:49 PM
mzolotukhin committed rL330175: Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again..
Apr 16 2018, 9:48 PM
mzolotukhin added a comment to D43578: -ftime-report switch support in Clang.

What kinds of <some part> would be useful to you? (How do you want the runtime of Clang broken down?) Are vertical slices (through all Clang's various layers and potentially parts of LLVM) -- as this patch will produce -- useful, or would you really want horizontal slices (as in, what part of Clang is actually spending the time)? Or just anything that's basically expected to be consistent from run to run, so you can identify that something has slowed down, even if you don't have enough information to really know what?

For me "something has slowed down" would be enough. I.e., even if "parse templates" were erroneously reported as 90% of the time spent in the front-end, I would still be able to see a jump from 90% to 95%. Although these numbers would not reflect the actual ratio, they would still indicate changes.

Apr 16 2018, 1:23 AM

Apr 14 2018

mzolotukhin added a comment to D43578: -ftime-report switch support in Clang.

I've been monitoring compile-time for quite a while, so let me put my 2 cents here too.

Apr 14 2018, 9:36 PM

Apr 11 2018

mzolotukhin committed rL329865: Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time..
Apr 11 2018, 4:41 PM
mzolotukhin committed rL329864: [SSAUpdaterBulk] Fix linux bootstrap/sanitizer failures: explicitly specify….
Apr 11 2018, 4:40 PM

Apr 9 2018

mzolotukhin committed rL329666: Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time..
Apr 9 2018, 8:43 PM
mzolotukhin committed rL329661: Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."".
Apr 9 2018, 7:20 PM
mzolotukhin committed rL329660: [SSAUpdaterBulk] Handle CFG with unreachable from entry blocks..
Apr 9 2018, 7:20 PM
mzolotukhin committed rL329650: Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading.".
Apr 9 2018, 5:48 PM
mzolotukhin added a comment to D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..

Thanks! I committed the patch in two parts: r329643 (Add SSAUpdaterBulk) and r329644 (Use SSAUpdaterBulk in JumpThreading).

Apr 9 2018, 4:41 PM
mzolotukhin committed rL329644: [PR16756] Use SSAUpdaterBulk in JumpThreading..
Apr 9 2018, 4:40 PM
mzolotukhin committed rL329643: [PR16756] Add SSAUpdaterBulk..
Apr 9 2018, 4:40 PM
mzolotukhin closed D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..
Apr 9 2018, 4:40 PM
mzolotukhin accepted D43578: -ftime-report switch support in Clang.

LGTM. Please don't forget to commit the header sorting (and other NFC changes, if any) in a separate patch.

Apr 9 2018, 8:28 AM

Apr 8 2018

mzolotukhin committed rL329542: Remove MachineLoopInfo dependency from AsmPrinter..
Apr 8 2018, 6:00 PM
mzolotukhin closed D44812: Remove MachineLoopInfo dependency from AsmPrinter..
Apr 8 2018, 6:00 PM

Apr 5 2018

mzolotukhin added a comment to D44812: Remove MachineLoopInfo dependency from AsmPrinter..

Thanks, Renato! I'll wait a little to give other reviewers more time to take a look.

Apr 5 2018, 10:57 AM

Apr 4 2018

mzolotukhin added a comment to D43578: -ftime-report switch support in Clang.

I think it would be really instrumental to have a finer timing report, so thanks for your work! I tested the patch on CTMark with O0-g and didn't see a detectable change either, so I think this change is good from a compile-time perspective. The patch needs some clean-up, but otherwise looks good to me.

Apr 4 2018, 9:23 PM
mzolotukhin added inline comments to D44812: Remove MachineLoopInfo dependency from AsmPrinter..
Apr 4 2018, 6:21 PM
mzolotukhin updated the diff for D44812: Remove MachineLoopInfo dependency from AsmPrinter..
  • Don't recompute MLI and MDT on visit of every basic block.
  • Rebase.
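
The first bullet amounts to computing an analysis once and reusing it. A minimal sketch of that caching pattern, assuming a toy analysis result: LoopInfoResult, LazyLoopInfo and the placeholder value 3 are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for an expensive analysis result.
struct LoopInfoResult {
  int NumLoops;
};

// Computes the analysis on first use and reuses the cached result afterwards,
// instead of recomputing it for every basic block that is visited.
class LazyLoopInfo {
  bool Computed = false;
  LoopInfoResult Cached{0};
  int Computations = 0; // counts how often the expensive path ran

  LoopInfoResult computeExpensively() {
    ++Computations;
    return LoopInfoResult{3}; // placeholder value for the sketch
  }

public:
  const LoopInfoResult &get() {
    if (!Computed) {
      Cached = computeExpensively();
      Computed = true;
    }
    assert(Computed);
    return Cached;
  }

  int computations() const { return Computations; }
};
```

Calling get() once per visited block then triggers the expensive computation only on the first call.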
Apr 4 2018, 2:45 PM

Apr 2 2018

mzolotukhin added a comment to D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..

I don't know how I feel about this patch. It looks like it's papering over a huge problem, which is basically the fact that some passes down the road are either quadratic or linear with large constant factor.
If we have examples of the broken passes, maybe we should consider whether it's feasible to fix them instead of applying this hack here?

We do have examples of such passes, and we should fix them, I agree. However, I don't consider this patch a fix for them. The purpose of this patch is to prevent compiler hangs in the future - even when we fix the known issues, there is no guarantee that there are no more places like them. When we specifically want to look for such problematic spots, we can always set the option to -1, but by default it will just save us from "compiler hangs" (at least, from those coming from the backend).

Apr 2 2018, 3:50 PM
mzolotukhin added a comment to D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..

Ping!

Apr 2 2018, 10:53 AM

Mar 29 2018

mzolotukhin added a comment to D44812: Remove MachineLoopInfo dependency from AsmPrinter..

Ping!

Mar 29 2018, 4:00 PM

Mar 28 2018

mzolotukhin added a comment to D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..

I don't mind this approach, but as discussed offline we should consider also moving LCSSA to make sure the API makes sense.
In general, what's your plan for this? You want it to replace the SSA updater for all the instances in llvm? If so, we should carefully plan the transition costs.

I looked into LCSSA, and indeed it seems that it can also be improved. I tried a direct replacement of the old SSAUpdater with the new one, but that didn't give any benefits. However, I think we can simplify the code in LCSSA by passing LoopInfo to SSAUpdaterBulk, which will then use it to insert a phi node whenever it crosses a loop boundary when rewriting a use. I don't know yet how much work it would take, but I don't think it would require much rewriting - it would probably be an addition to what we currently have in this patch.

Mar 28 2018, 4:33 PM
mzolotukhin updated the diff for D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..
  • Rebase.
  • Address Davide's remarks.
Mar 28 2018, 4:16 PM

Mar 23 2018

mzolotukhin added a comment to D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..

Adrian convinced me that we need to properly ignore debug-info intrinsics to guarantee the same code generation with and without debug info. I've updated the patch.

Mar 23 2018, 6:11 PM
mzolotukhin updated subscribers of D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..
Mar 23 2018, 6:10 PM
mzolotukhin updated the diff for D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..
  • Ignore DbgInfo intrinsics.
Mar 23 2018, 6:10 PM
mzolotukhin added a comment to D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..

Hi Eli,

Mar 23 2018, 5:40 PM
mzolotukhin updated the diff for D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..
  • Rewrite algorithm to avoid using BB->size().
  • Skip PHI-nodes and EH-pads.
  • Don't split on terminators.
  • Add tests.
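
The rules in this list can be illustrated with a toy model of a block. This is a sketch under stated assumptions, not the patch's implementation: Kind, findSplitPoints and MaxSize are hypothetical names, debug intrinsics are ignored as a later update of the patch does, and the sketch treats every threshold below 1 as "disabled" (the discussion itself only mentions -1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds for a toy block; not LLVM's classes.
enum class Kind { Phi, EHPad, Debug, Normal, Terminator };

// Returns the indices at which a new block would start. MaxSize is a
// hypothetical threshold; values below 1 disable splitting in this sketch.
std::vector<std::size_t> findSplitPoints(const std::vector<Kind> &Block,
                                         int MaxSize) {
  std::vector<std::size_t> Splits;
  if (MaxSize < 1)
    return Splits;
  int Counted = 0; // running count, so the total length is never needed
  std::size_t Index = 0;
  for (Kind K : Block) {
    const std::size_t Here = Index++;
    // PHIs and EH pads are skipped, and debug intrinsics are not counted so
    // that the split points are the same with and without debug info.
    if (K == Kind::Phi || K == Kind::EHPad || K == Kind::Debug)
      continue;
    // Never start a new block at the terminator.
    if (K == Kind::Terminator)
      break;
    if (Counted == MaxSize) {
      Splits.push_back(Here);
      Counted = 0;
    }
    ++Counted;
    assert(Counted <= MaxSize);
  }
  return Splits;
}
```

Removing the Debug entries from an input moves the returned indices but leaves the splits in front of the same non-debug instructions, which is the property the debug-info rule is meant to guarantee.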
Mar 23 2018, 5:37 PM
mzolotukhin accepted D44845: [PostRAMachineSink] preserve CFG.

LGTM!

Mar 23 2018, 1:55 PM
mzolotukhin added a comment to D41463: [CodeGen] Add a new pass for PostRA sink.

I measured the compile time impact of this patch for spec2000/2006/2017. Overall, I wasn't able to see any reproducible regression; all ups and downs are within the noise range. There is no change in the CFG in this pass, so preserving DT should be fine, and I will submit a follow-up patch for it.

Sounds good, thank you!

Mar 23 2018, 1:50 PM

Mar 22 2018

mzolotukhin added a comment to D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..

BB->size() has multiple problems: one, it's linear time, and two, it doesn't ignore debug info.

What would be a better way to get BB size?

Mar 22 2018, 7:02 PM
mzolotukhin added a reviewer for D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation.: davide.
Mar 22 2018, 6:51 PM
mzolotukhin created D44814: [CodeGenPrepare] Split huge basic blocks for faster compilation..
Mar 22 2018, 6:51 PM
mzolotukhin committed rT328269: Replace calls to 'system' with an error message and abort..
Mar 22 2018, 5:07 PM
mzolotukhin created D44812: Remove MachineLoopInfo dependency from AsmPrinter..
Mar 22 2018, 4:57 PM
mzolotukhin committed rL328272: State that CFG is preserved in 'Falkor HW Prefetch Fix Late Phase'..
Mar 22 2018, 4:47 PM
mzolotukhin committed rL328269: Replace calls to 'system' with an error message and abort..
Mar 22 2018, 4:29 PM
This revision was not accepted when it landed; it landed in state Needs Review.
Mar 22 2018, 4:29 PM
mzolotukhin committed rL328267: Reapply "[test] Add tests for llc passes pipelines." with a fix for bots with….
Mar 22 2018, 4:05 PM
mzolotukhin added a comment to D41463: [CodeGen] Add a new pass for PostRA sink.

This pass destroys DominatorInfo and we have to recompute it from scratch right after the pass. Is it possible to preserve it? Also, have you measured the compile time impact of the patch?

Mar 22 2018, 3:45 PM
mzolotukhin added a comment to rL328160: [test] Add tests for opt passes pipelines for O0, O2, O3, and Os..

Hi Matthew,

Mar 22 2018, 2:46 PM

Mar 21 2018

mzolotukhin committed rL328167: [test] Try to unbreak hexagon bots after r328160..
Mar 21 2018, 4:00 PM
mzolotukhin committed rL328160: [test] Add tests for opt passes pipelines for O0, O2, O3, and Os..
Mar 21 2018, 3:20 PM
mzolotukhin committed rL328159: [test] Add tests for llc passes pipelines..
Mar 21 2018, 3:20 PM

Mar 20 2018

mzolotukhin added a comment to D44551: Replace calls to 'system' with an error message and abort..

Ping!

Mar 20 2018, 3:08 PM
mzolotukhin added a comment to D44551: Replace calls to 'system' with an error message and abort..

Ping!

Mar 20 2018, 3:06 PM
mzolotukhin added a comment to D44666: [XRay] Lazily compute MachineLoopInfo instead of requiring it..

Thanks!

Mar 20 2018, 10:26 AM
mzolotukhin committed rL327999: [XRay] Lazily compute MachineLoopInfo instead of requiring it..
Mar 20 2018, 10:05 AM
mzolotukhin closed D44666: [XRay] Lazily compute MachineLoopInfo instead of requiring it..
Mar 20 2018, 10:05 AM

Mar 19 2018

mzolotukhin created D44666: [XRay] Lazily compute MachineLoopInfo instead of requiring it..
Mar 19 2018, 5:47 PM
mzolotukhin added a comment to D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..

Ping!

Mar 19 2018, 3:53 PM

Mar 15 2018

mzolotukhin created D44551: Replace calls to 'system' with an error message and abort..
Mar 15 2018, 6:38 PM

Mar 12 2018

mzolotukhin added a comment to D44282: [PR16756] JumpThreading: explicitly update SSA rather than use SSAUpdater..

I created a separate class for bulk SSA updates. With that implementation, we recompute IDF for every individual variable, but usually the subgraph we're working with is smaller. In the previous implementation we computed IDF once for the union of these subgraphs - that was faster, but we might also accidentally insert unneeded phi-nodes, which we later had to clean up.
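
The trade-off between per-variable IDF and one IDF over the union can be seen on a small example. The sketch below assumes the dominance frontiers are already known and uses hypothetical names (BlockId, Frontiers, computeIDF); it is a textbook worklist formulation, not the code from the patch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using BlockId = int;
// Dominance frontier of each block, assumed to be precomputed.
using Frontiers = std::map<BlockId, std::set<BlockId>>;

// Iterated dominance frontier of Defs: keep adding frontier blocks and
// treating them as new definitions until nothing changes.
std::set<BlockId> computeIDF(const Frontiers &DF,
                             const std::set<BlockId> &Defs) {
  std::set<BlockId> Result;
  std::vector<BlockId> Worklist(Defs.begin(), Defs.end());
  while (!Worklist.empty()) {
    BlockId B = Worklist.back();
    Worklist.pop_back();
    auto It = DF.find(B);
    if (It == DF.end())
      continue; // block has an empty frontier
    for (BlockId F : It->second)
      if (Result.insert(F).second)
        Worklist.push_back(F); // newly found block acts as a definition too
  }
  return Result;
}
```

Take two consecutive diamonds, 0 -> {1, 2} -> 3 -> {4, 5} -> 6, so that blocks 1 and 2 have frontier {3} and blocks 4 and 5 have frontier {6}. If one variable is defined in block 1 and another in block 4, the per-variable results are {3} and {6}, while the IDF of the union {1, 4} is {3, 6}. Placing phi-nodes for both variables at every block of the union result would therefore create a phi in block 6 for the first variable and one in block 3 for the second, neither of which is needed.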

Mar 12 2018, 3:14 PM