wmi (Wei Mi)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 20 2015, 10:57 AM (130 w, 2 d)

Recent Activity

Thu, Aug 10

wmi added a comment to D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

In the last update I limited bitfield separation to the beginning of a run, so no bitfield combining is blocked.

Thu, Aug 10, 3:45 PM
wmi updated the diff for D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

Don't separate a bitfield in the middle of a run, because doing so can hinder the combining of bitfield accesses. Only separate a bitfield at the beginning of a run.

Thu, Aug 10, 3:36 PM

Wed, Aug 9

wmi added a comment to D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

This has been discussed before and I still pretty strongly disagree with it.

This cripples the ability of TSan to find race conditions between accesses to consecutive bitfields -- and these bugs have actually come up.

Wed, Aug 9, 10:42 PM
wmi updated the summary of D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.
Wed, Aug 9, 5:13 PM
wmi updated the summary of D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.
Wed, Aug 9, 5:05 PM
wmi created D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.
Wed, Aug 9, 5:04 PM

Tue, Aug 8

wmi committed rL310421: [GVN] Remove stale entries in phitranslate cache when new phi is generated for….
Tue, Aug 8, 2:41 PM
wmi closed D36124: [GVN] Remove stale entry in phitranslate cache when new phi is generated for PRE by committing rL310421: [GVN] Remove stale entries in phitranslate cache when new phi is generated for….
Tue, Aug 8, 2:41 PM

Mon, Aug 7

wmi added a comment to D36124: [GVN] Remove stale entry in phitranslate cache when new phi is generated for PRE.

Ping.

Mon, Aug 7, 2:56 PM

Thu, Aug 3

wmi added a comment to D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.

This patch caused regressions of 5% to 23% in two of our internal benchmarks on Cortex-M23 and Cortex-M0+. I attached test.ll, which is reduced from the benchmarks. I used LLVM revision 309830. 'test.good.ll' is the result when filtering is disabled. 'test.bad.ll' is the result when filtering is enabled.
Comparing them, I can see that this optimization changes how an induction variable is updated. Originally it is incremented from 0 to 256. The optimization changes this into decrementing from 0 to -256. This induction variable is also used as an offset into memory, so to preserve the semantics a conversion of the induction variable from a negative value to a positive value is inserted. This conversion is lowered to additional instructions, which cause the performance regressions.
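A source-level sketch of the two loop shapes described above may make the extra cost easier to see. It is only an illustration: the function names, the `int` element type, and the constant `kCount` are assumptions, and the code does not reproduce the attached test.ll or the IR that LSR actually emitted.

```cpp
#include <array>
#include <cstddef>

// Hypothetical element count, taken from the "0 to 256" range in the comment.
constexpr int kCount = 256;

// Original shape: the induction variable counts up and is used
// directly as the memory offset.
int sumIncrementing(const std::array<int, kCount>& data) {
  int sum = 0;
  for (int i = 0; i < kCount; ++i)
    sum += data[static_cast<std::size_t>(i)];
  return sum;
}

// Transformed shape: the induction variable counts down from 0 toward
// -256, so each access first converts it back to a positive offset.
int sumDecrementing(const std::array<int, kCount>& data) {
  int sum = 0;
  for (int j = 0; j > -kCount; --j) {
    const int offset = -j;  // the extra negative-to-positive conversion
    sum += data[static_cast<std::size_t>(offset)];
  }
  return sum;
}
```

Both functions visit the same elements; the second one simply pays for one more operation per iteration, which is the kind of overhead the comment attributes to the regression.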

Could you please have a look at this issue?

Thanks,
Evgeny Astigeevich
The ARM Compiler Optimization team leader

Thu, Aug 3, 11:12 AM

Mon, Jul 31

wmi created D36124: [GVN] Remove stale entry in phitranslate cache when new phi is generated for PRE.
Mon, Jul 31, 5:25 PM

Fri, Jul 28

wmi committed rL309397: [GVN] Recommit the patch "Add phi-translate support in scalarpre".
Fri, Jul 28, 8:48 AM

Tue, Jul 25

wmi committed rL309073: Add "REQUIRES: asserts" for test unswitch-equality-undef.ll.
Tue, Jul 25, 6:35 PM
wmi committed rL309059: Disable loop unswitching for some patterns containing equality comparison with….
Tue, Jul 25, 4:38 PM
wmi closed D35811: A workaround for the bug caused by discrepancy between loop-unswitch and GVN about branch on undef by committing rL309059: Disable loop unswitching for some patterns containing equality comparison with….
Tue, Jul 25, 4:38 PM

Mon, Jul 24

wmi updated subscribers of D35811: A workaround for the bug caused by discrepancy between loop-unswitch and GVN about branch on undef.
Mon, Jul 24, 12:04 PM
wmi created D35811: A workaround for the bug caused by discrepancy between loop-unswitch and GVN about branch on undef.
Mon, Jul 24, 12:04 PM

Jul 20 2017

wmi added a comment to D34822: [LVI] Constant-propagate a zero extension of the switch condition value through case edges.

I think we can find many similar problems.

Jul 20 2017, 3:46 PM

Jul 11 2017

wmi added a comment to D34150: [LV] Test once if vector trip count is zero, instead of twice.

Re overflow: the point is that getOrCreateTripCount() returns, basically, PSE.getBackedgeTakenCount() + 1, and that addition may overflow, so the "trip count" may end up being 0 when the backedge-taken count is the maximum value of its type. I don't think this is outdated, and this is behavior we want to preserve. But this patch should preserve this behavior, IIUC. Can you make sure there's a test for this?
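The wrap-around can be shown with plain fixed-width arithmetic. The sketch below is not the vectorizer's code: `tripCountFromBackedgeTaken` is a hypothetical helper, and a 32-bit unsigned counter is assumed purely for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper modelling "trip count = backedge-taken count + 1"
// in 32-bit unsigned arithmetic, where overflow wraps around to 0.
std::uint32_t tripCountFromBackedgeTaken(std::uint32_t backedgeTaken) {
  return backedgeTaken + 1u;
}

// A loop whose backedge is taken the maximum representable number of
// times therefore reports a trip count of 0, so a "trip count is zero"
// check has to account for this case as well.
bool tripCountWrapped(std::uint32_t backedgeTaken) {
  return backedgeTaken == std::numeric_limits<std::uint32_t>::max();
}
```

A test for the behavior discussed here would exercise exactly that boundary value in addition to small counts.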

Jul 11 2017, 9:49 AM

Jul 6 2017

wmi committed rL307338: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 5:11 PM
wmi closed D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default by committing rL307338: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 5:11 PM
wmi committed rL307328: [ConstHoisting] choose to hoist when frequency is the same.
Jul 6 2017, 3:33 PM
wmi closed D35084: [ConstHoisting] choose to hoist when frequency is the same by committing rL307328: [ConstHoisting] choose to hoist when frequency is the same.
Jul 6 2017, 3:32 PM
wmi created D35084: [ConstHoisting] choose to hoist when frequency is the same.
Jul 6 2017, 1:47 PM
wmi added inline comments to D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 10:46 AM
wmi added inline comments to D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 10:13 AM
wmi added inline comments to D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 9:44 AM
wmi created D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 9:17 AM
wmi committed rL307269: [LSR] Narrow search space by filtering non-optimal formulae with the same….
Jul 6 2017, 8:52 AM
wmi closed D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale. by committing rL307269: [LSR] Narrow search space by filtering non-optimal formulae with the same….
Jul 6 2017, 8:52 AM

Jun 30 2017

wmi updated the diff for D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.

Cleanup and reduce the testcase.

Jun 30 2017, 10:13 AM
wmi added inline comments to D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.
Jun 30 2017, 10:10 AM

Jun 29 2017

wmi updated the diff for D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.

Address Sanjoy's comments.

Jun 29 2017, 5:22 PM
wmi added inline comments to D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.
Jun 29 2017, 5:04 PM
wmi accepted D31821: Remove redundant copy in recurrences.

LGTM.

Jun 29 2017, 2:06 PM
wmi added inline comments to D31821: Remove redundant copy in recurrences.
Jun 29 2017, 11:42 AM
wmi accepted D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.
Jun 29 2017, 9:42 AM

Jun 28 2017

wmi added inline comments to D34608: [WIP][AArch64] Increase CSR cost when defering use of CSR is preferred.
Jun 28 2017, 4:50 PM
wmi added inline comments to D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.
Jun 28 2017, 4:13 PM
wmi added a comment to D31821: Remove redundant copy in recurrences.

For the example below, findTargetRecurrence starts from r2 and r3 and searches for a def reg equal to r1. There are a lot of possibilities to explore. That is where the complexity of findTargetRecurrence comes from.

Jun 28 2017, 3:50 PM

Jun 26 2017

wmi added a reviewer for D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.: sanjoy.
Jun 26 2017, 3:09 PM
wmi committed rL306313: [GVN] Recommit the patch "Add phi-translate support in scalarpre".
Jun 26 2017, 11:16 AM

Jun 23 2017

wmi created D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.
Jun 23 2017, 5:02 PM
wmi added inline comments to D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.
Jun 23 2017, 8:22 AM

Jun 22 2017

wmi added a comment to D33928: [LoopStrengthReduction] Treat SCEVUnknown pessimistically in LSR.

Hi Max,

Jun 22 2017, 10:13 AM

Jun 21 2017

wmi added a comment to D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.

Thanks for helping to fix the bug!

Jun 21 2017, 3:30 PM

Jun 16 2017

wmi committed rL305603: Revert rL305578. There is still some buildbot failure to be fixed.
Jun 16 2017, 4:15 PM
wmi added a reverting commit for rL305578: [GVN] Recommit the patch "Add phi-translate support in scalarpre".: rL305603: Revert rL305578. There is still some buildbot failure to be fixed.
Jun 16 2017, 4:15 PM
wmi committed rL305578: [GVN] Recommit the patch "Add phi-translate support in scalarpre".
Jun 16 2017, 1:21 PM

Jun 10 2017

wmi added a comment to D31821: Remove redundant copy in recurrences.

Sorry for the delay. The rewrite based on SSA looks much cleaner now. About the algorithm: IIUC, it tries to find a loop based on the def-use chain of a tied operand, or of an operand commutable with the tied operand. However, I am still concerned that the method can sometimes increase the number of redundant copies.

Jun 10 2017, 6:31 PM

Jun 7 2017

wmi added a comment to D33847: [PartialInlining] Enhance code outliner to sink locals declared outside the outline region.

One comment about simplifying the test. Other than that, LGTM.

Jun 7 2017, 9:45 AM

Jun 5 2017

wmi added inline comments to D33847: [PartialInlining] Enhance code outliner to sink locals declared outside the outline region.
Jun 5 2017, 8:36 AM

May 31 2017

wmi accepted D33694: [PartialInlining] : Partial inlining Overhead reduction: eliminate unnecessary live-out(s).

LGTM. Only a minor comment.

May 31 2017, 4:14 PM
wmi committed rL304350: Revert rL304050. It may break sanitizer bootstrap. Revert it for now while….
May 31 2017, 2:30 PM
wmi added a reverting commit for rL304050: [GVN] Recommit the patch "Add phi-translate support in scalarpre".: rL304350: Revert rL304050. It may break sanitizer bootstrap. Revert it for now while….
May 31 2017, 2:30 PM

May 30 2017

wmi added a comment to D33618: [PartialInlining] Reduce function outlining overhead.

I have no other comments. See whether Davide has further comments.

May 30 2017, 2:07 PM

May 28 2017

wmi added inline comments to D33618: [PartialInlining] Reduce function outlining overhead.
May 28 2017, 5:33 PM

May 26 2017

wmi committed rL304050: [GVN] Recommit the patch "Add phi-translate support in scalarpre".
May 26 2017, 5:54 PM

May 25 2017

wmi committed rL303969: Revert rL303923 since it broke the sanitizer bootstrap build bot.
May 25 2017, 10:43 PM
wmi added a reverting commit for rL303923: [GVN] Add phi-translate support in scalarpre.: rL303969: Revert rL303923 since it broke the sanitizer bootstrap build bot.
May 25 2017, 10:43 PM
wmi committed rL303923: [GVN] Add phi-translate support in scalarpre.
May 25 2017, 2:49 PM
wmi closed D32252: [GVN] Add phi-translate for scalarpre as a temporary solution by committing rL303923: [GVN] Add phi-translate support in scalarpre.
May 25 2017, 2:49 PM
wmi added a comment to D32252: [GVN] Add phi-translate for scalarpre as a temporary solution.

I have seen no further comments, so I am preparing to commit the patch. Thanks for the review!

May 25 2017, 2:34 PM

May 19 2017

wmi added a comment to D31821: Remove redundant copy in recurrences.

Hi Taewook,

May 19 2017, 12:04 PM
wmi updated the diff for D32252: [GVN] Add phi-translate for scalarpre as a temporary solution.

Initialize commutative in Expression's constructor.
Fix a bug related to commutativity: the cmp predicate must be swapped whenever the operands are swapped.
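The rule in this fix can be sketched independently of GVN. Everything below is hypothetical and for illustration only: `Pred`, `CmpExpr`, `swappedPred`, and `canonicalize` are made-up names, and ordering operands by value number is an assumed canonicalization, not a description of the patch. LLVM's own predicates live in `CmpInst::Predicate`.

```cpp
#include <utility>

// Hypothetical predicate set standing in for a compare instruction's predicate.
enum class Pred { EQ, NE, LT, LE, GT, GE };

// Returns the predicate that yields the same result once the two
// operands trade places: "a < b" is equivalent to "b > a".
Pred swappedPred(Pred p) {
  switch (p) {
    case Pred::LT: return Pred::GT;
    case Pred::LE: return Pred::GE;
    case Pred::GT: return Pred::LT;
    case Pred::GE: return Pred::LE;
    default: return p;  // EQ and NE are symmetric
  }
}

// Hypothetical compare expression whose operands are value numbers.
struct CmpExpr {
  Pred pred;
  unsigned lhs;
  unsigned rhs;
};

// Orders the operands by value number. Swapping the operands without
// also swapping the predicate would change the expression's meaning.
CmpExpr canonicalize(CmpExpr e) {
  if (e.lhs > e.rhs) {
    std::swap(e.lhs, e.rhs);
    e.pred = swappedPred(e.pred);
  }
  return e;
}
```

With this rule, an expression such as "7 < 3" (in value numbers) and "3 > 7" canonicalize to the same form, which is what lets equivalent compares share one entry.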

May 19 2017, 10:01 AM

May 18 2017

wmi committed rL303361: [LSR] Call canonicalize after we generate a new Formula in GenerateTruncates.
May 18 2017, 10:34 AM

May 16 2017

wmi added a comment to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.

I discussed this with Chandler offline, and we decided to split the patch and try to commit the store shrinking first.

May 16 2017, 5:20 PM

May 14 2017

wmi added a comment to D33164: [Profile] Enhance expect lowering to handle correlated branches.

Overall looks good. Some minor comments inlined.

May 14 2017, 5:37 PM

May 11 2017

wmi added inline comments to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.
May 11 2017, 7:26 AM

May 10 2017

wmi added inline comments to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.
May 10 2017, 10:37 PM
wmi added a comment to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.

Chandler, thanks for the comments. They are very helpful. I will address them in the next revision. I have replied only to the comments about which I had questions or concerns.

May 10 2017, 10:43 AM

May 9 2017

wmi updated the diff for D32252: [GVN] Add phi-translate for scalarpre as a temporary solution.

I changed the DenseMap numberingExpression (mapping from value number to GVN::Expression) to a std::vector. There is still a minor increase, but it is better than before.

May 9 2017, 3:59 PM

May 5 2017

wmi updated the diff for D32252: [GVN] Add phi-translate for scalarpre as a temporary solution.

Forgot to update the patch.

May 5 2017, 9:42 AM

May 2 2017

wmi added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB.

Ideally there should be a separate pass that runs on the SSA machine code (before register coalescing) to minimize register pressure and hide latency for chains of loads or FP ops. It should work across calls and loop boundaries. You could even do a kind of poor-man's cyclic scheduling this way.

MISched works at the lower level of instruction groups and CPU pipeline hazards. It would be nice if MISched worked at the level of extended basic blocks (it would be easy to implement and has been done out of tree). I don't think it makes as much sense for it to work across call sites, though. That is not hard to implement, but it seems it would generate large DAGs and be a bad compile-time tradeoff.

MISched is not a scheduling algorithm, it's a scheduling framework. The generic scheduler is a pile of heuristics that exercise most of the functionality and seems to be working ok for several popular targets. The strategy that it takes is:

  • Make a single scheduling pass handling all heuristics at once. Don't reorder the instructions at all unless the heuristics identify a register pressure or latency problem.
  • Try to determine, before scheduling a block, whether register pressure or latency is likely to become a problem. This avoids the scheduler backing itself into a corner (we don't want the scheduler to backtrack).

You'll notice that this is very conservative with respect to managing compile time and preserving the decisions made by earlier passes.

You could follow that basic strategy and simply adjust the priority of heuristics for your target. You can add better up-front analysis to detect code patterns for each block before prioritizing heuristics. Or you could implement a completely different strategy. For example, schedule for register pressure first, then reschedule for ILP.

I probably won't be able to help you much more than that. Matthias has taken over maintenance of MISched. I think it would help if you gave more background on your situation (sorry if I haven't paid attention or have forgotten). Is this PowerPC? In-order or out-of-order?

May 2 2017, 5:12 PM
wmi added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB.

Why are the adds "sunk down" in the first place? Is this reassociation at work?

May 2 2017, 2:31 PM

May 1 2017

wmi added inline comments to D32563: Add LiveRangeShrink pass to shrink live range within BB.
May 1 2017, 4:58 PM
wmi updated subscribers of D32563: Add LiveRangeShrink pass to shrink live range within BB.

+ Andy, for the history on pre-RA-sched and misched.

May 1 2017, 4:43 PM

Apr 28 2017

wmi updated the diff for D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.

Address Eli, Matt and Chandler's comments.

Apr 28 2017, 4:20 PM
wmi added inline comments to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.
Apr 28 2017, 4:11 PM

Apr 26 2017

wmi accepted D32249: [PartialInl] Enhance partial inliner to handle more complex conditions.

LGTM.

Apr 26 2017, 1:52 PM
wmi added a comment to D32249: [PartialInl] Enhance partial inliner to handle more complex conditions.

Some minor comments.

Apr 26 2017, 10:47 AM

Apr 25 2017

wmi added a comment to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.

Thanks for drafting the comments. They are clearly more descriptive and easier to follow, and I like the variable names (LargeVal and SmallVal), which are much better than the ones I used (OrigVal and MaskedVal). I will rewrite the comments based on your draft.

Apr 25 2017, 10:29 AM

Apr 21 2017

wmi added a comment to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.

Thanks for bearing with my poor English. I will fix the terminology and comments according to your suggestions.

Apr 21 2017, 10:11 PM
wmi added inline comments to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.
Apr 21 2017, 6:07 PM
wmi added inline comments to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.
Apr 21 2017, 4:31 PM
wmi added a comment to D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently.

Ping.

Apr 21 2017, 10:16 AM
wmi committed rL300989: [ConstHoisting] Add BFI in constanthoisting pass and select the best insertion.
Apr 21 2017, 9:03 AM
wmi closed D28962: Add BFI in constanthoisting pass and do the hoisting selectively by committing rL300989: [ConstHoisting] Add BFI in constanthoisting pass and select the best insertion.
Apr 21 2017, 9:03 AM

Apr 20 2017

wmi accepted D32308: Use BasicBlock Util SplitBlock interface to update DT.

LGTM.

Apr 20 2017, 2:06 PM
wmi added a comment to D32249: [PartialInl] Enhance partial inliner to handle more complex conditions.

I am thinking of another two issues:

  1. Is it possible that some of the BBs on the chain may be very big and we don't want to partial inline them?
  2. The existing pattern handles the if (a || b || c ...) case, but it may not be easy to extend to cases like (a && b && c) and ((a && b) || c). Basically, we want to find a collection of small BBs starting from the entry. The BB collection has only two exits: one of them is ReturnBlock and the other is NonReturnBlock. The edges inside the collection can follow any pattern.
Apr 20 2017, 10:18 AM

Apr 19 2017

wmi added a comment to D32252: [GVN] Add phi-translate for scalarpre as a temporary solution.

Daniel, thanks for the comments.

Apr 19 2017, 10:41 PM
wmi added inline comments to D32249: [PartialInl] Enhance partial inliner to handle more complex conditions.
Apr 19 2017, 4:58 PM
wmi created D32252: [GVN] Add phi-translate for scalarpre as a temporary solution.
Apr 19 2017, 3:45 PM

Apr 18 2017

wmi created D32201: [RALLOC] Increase CSR cost in RegAllocGreedy to favour splitting over CSR first use.
Apr 18 2017, 5:02 PM

Apr 17 2017

wmi added a comment to D32037: Change the testcase tail-merge-after-mbp.ll to tail-merge-after-mbp.mir.

Haicheng, the testcase LGTM. Thanks for working on it!

Apr 17 2017, 2:45 PM
wmi committed rL300499: Fix an unused variable error in rL300494.
Apr 17 2017, 2:13 PM
wmi committed rL300494: [SCEV] Add a local cache for getZeroExtendExpr and getSignExtendExpr to prevent.
Apr 17 2017, 1:52 PM
wmi closed D30350: [LSR] Add a cap for reassociation of AllFixupsOutsideLoop type LSRUse to protect compile time by committing rL300494: [SCEV] Add a local cache for getZeroExtendExpr and getSignExtendExpr to prevent.
Apr 17 2017, 1:52 PM
wmi added a comment to D30350: [LSR] Add a cap for reassociation of AllFixupsOutsideLoop type LSRUse to protect compile time.

Sanjoy, thanks for all the helpful comments.

Apr 17 2017, 10:52 AM

Apr 13 2017

wmi updated the diff for D32037: Change the testcase tail-merge-after-mbp.ll to tail-merge-after-mbp.mir.

Remove some code from the test that doesn't impact it.

Apr 13 2017, 1:54 PM