wmi (Wei Mi)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 20 2015, 10:57 AM (146 w, 5 d)

Recent Activity

Nov 13 2017

wmi added a comment to D39053: [Bitfield] Add more cases to making the bitfield a separate location.

I think it may be hard to fix the problem in the backend. The same store-to-load forwarding issue will remain if the transformation happens in some places but, for whatever reason, not in others.

Nov 13 2017, 11:16 AM

Oct 16 2017

wmi committed rL315915: [Bitfield] Add an option to access bitfield in a fine-grained manner..
[Bitfield] Add an option to access bitfield in a fine-grained manner.
Oct 16 2017, 9:50 AM
wmi closed D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type by committing rL315915: [Bitfield] Add an option to access bitfield in a fine-grained manner..
Oct 16 2017, 9:50 AM

Oct 11 2017

wmi added a comment to D34077: DAGCombine: Combine BUILD_VECTOR to TRUNCATE.

Reverted r307036 in r315540 because of PR34919.

Oct 11 2017, 5:27 PM
wmi committed rL315540: Revert r307036 because of PR34919..
Revert r307036 because of PR34919.
Oct 11 2017, 5:25 PM
wmi added a comment to rL307036: DAGCombine: Combine BUILD_VECTOR to TRUNCATE.

Hello, we found a bug caused by this patch. Could you take a look?

Oct 11 2017, 2:41 PM

Oct 8 2017

wmi updated the diff for D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

Address Hal's comments.

Oct 8 2017, 10:17 PM

Oct 5 2017

wmi updated the diff for D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

Address Hal's comment.

Oct 5 2017, 6:22 PM
wmi added inline comments to D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.
Oct 5 2017, 6:22 PM

Sep 27 2017

wmi updated the diff for D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

Address Hal's comment. Separate bitfields into shards separated by the naturally-sized-and-aligned fields.

Sep 27 2017, 3:54 PM
wmi accepted D37832: Eliminate PHI (int typed) which is only used by inttoptr.
Sep 27 2017, 12:05 PM
wmi added inline comments to D37832: Eliminate PHI (int typed) which is only used by inttoptr.
Sep 27 2017, 10:12 AM

Sep 26 2017

wmi added a comment to D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

You seem to be only changing the behavior for the "separatable" fields, but I suspect you want to change the behavior for the others too. The bitfield would be decomposed into shards, separated by the naturally-sized-and-aligned fields. Each access only loads its shard. For example, in your test case you have:

struct S3 {
  unsigned long f1:14;
  unsigned long f2:18;
  unsigned long f3:32;
};

and you test that, with this option, loading/storing to a3.f3 only accesses the specific 4 bytes composing f3. But if you load f1 or f2, we're still loading all 8 bytes, right? I think we should only load/store the lower 4 bytes when we access a3.f1 and/or a3.f2.
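
The "naturally-sized-and-aligned" condition from the revision title can be sketched as a small predicate. This is only an illustration, not code from D36562: the helper name `isNaturallySizedAndAligned` is hypothetical, the set of legal integer widths (8, 16, 32, 64 bits) is an assumption because legality is target-dependent, and the bit offsets 0, 14 and 32 for f1, f2 and f3 assume a typical layout with a 64-bit unsigned long.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from the patch): returns true when a
// bitfield could be accessed as its own integer-sized memory location.
// Assumption: the legal integer widths are 8, 16, 32 and 64 bits.
constexpr bool isNaturallySizedAndAligned(std::uint64_t bitOffset,
                                          std::uint64_t bitWidth) {
  // Natural alignment: the field starts at a multiple of its own width.
  return (bitWidth == 8 || bitWidth == 16 || bitWidth == 32 ||
          bitWidth == 64) &&
         bitOffset % bitWidth == 0;
}

// Fields of struct S3, at the assumed bit offsets 0, 14 and 32.
static_assert(!isNaturallySizedAndAligned(0, 14), "f1: 14 is not a legal width");
static_assert(!isNaturallySizedAndAligned(14, 18), "f2: 18 is not a legal width");
static_assert(isNaturallySizedAndAligned(32, 32), "f3 can be its own location");
```

Under these assumptions only f3 qualifies on its own, so f1 and f2 together form the remaining 4-byte shard that the comment suggests accessing separately.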

Sep 26 2017, 4:20 PM

Sep 25 2017

wmi committed rL314145: Reinstall the patch "Use EmitPointerWithAlignment to get alignment information….
Reinstall the patch "Use EmitPointerWithAlignment to get alignment information…
Sep 25 2017, 12:59 PM

Sep 22 2017

wmi committed rL313992: [Atomic][X8664] set max atomic inline width according to the target.
[Atomic][X8664] set max atomic inline width according to the target
Sep 22 2017, 9:31 AM
wmi closed D38046: [Atomic][X8664] set max atomic inline/promote width according to the target by committing rL313992: [Atomic][X8664] set max atomic inline width according to the target.
Sep 22 2017, 9:31 AM

Sep 21 2017

wmi updated the diff for D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

Changes following the discussion:

Sep 21 2017, 11:29 AM

Sep 20 2017

wmi updated the diff for D38046: [Atomic][X8664] set max atomic inline/promote width according to the target.

Address Eli's comments.

Sep 20 2017, 4:10 PM
wmi added inline comments to D38046: [Atomic][X8664] set max atomic inline/promote width according to the target.
Sep 20 2017, 4:08 PM
wmi added inline comments to D38046: [Atomic][X8664] set max atomic inline/promote width according to the target.
Sep 20 2017, 2:43 PM

Sep 19 2017

wmi updated the diff for D38046: [Atomic][X8664] set max atomic inline/promote width according to the target.

Address Eli's comment.

Sep 19 2017, 6:09 PM
wmi retitled D38046: [Atomic][X8664] set max atomic inline/promote width according to the target from [AtomicExpandPass][X86] set MaxAtomicSizeInBitsSupported according to the target to [Atomic][X8664] set max atomic inline/promote width according to the target.
Sep 19 2017, 6:07 PM
wmi created D38046: [Atomic][X8664] set max atomic inline/promote width according to the target.
Sep 19 2017, 11:23 AM

Sep 15 2017

wmi added inline comments to D37832: Eliminate PHI (int typed) which is only used by inttoptr.
Sep 15 2017, 3:01 PM

Sep 14 2017

wmi added a comment to D18201: Switch over targets to use AtomicExpandPass, and clean up target atomics code..

Is there any plan to land this patch soon? After https://reviews.llvm.org/rL312830, with better alignment information for atomic objects, more atomic loads/stores are generated for 128-bit atomic objects instead of atomic libcalls. Those 128-bit atomic loads/stores are translated into sync_* libcalls on x86-64 targets without cmpxchg16b support. This patch is needed so that AtomicExpandPass generates atomic libcalls before isel generates sync_* libcalls.

Sep 14 2017, 5:14 PM

Sep 13 2017

wmi committed rL313199: Add a comment for the test. NFC..
Add a comment for the test. NFC.
Sep 13 2017, 2:48 PM
wmi committed rL313197: [RegAlloc] Keep a copy of live interval for the spilled vregs in….
[RegAlloc] Keep a copy of live interval for the spilled vregs in…
Sep 13 2017, 2:43 PM
wmi closed D37578: [RegAlloc] Keep a copy of live interval for the spilled vregs in HoistSpillHelper by committing rL313197: [RegAlloc] Keep a copy of live interval for the spilled vregs in….
Sep 13 2017, 2:43 PM

Sep 8 2017

wmi committed rL312830: Reinstall the patch "Use EmitPointerWithAlignment to get alignment information….
Reinstall the patch "Use EmitPointerWithAlignment to get alignment information…
Sep 8 2017, 3:00 PM
wmi committed rL312810: Delete empty file test/CodeGenCXX/atomic-align.cpp after the revert at rL312805..
Delete empty file test/CodeGenCXX/atomic-align.cpp after the revert at rL312805.
Sep 8 2017, 11:33 AM
wmi committed rL312805: Revert rL312801 since it generated some calls from libatomic and broke some….
Revert rL312801 since it generated some calls from libatomic and broke some…
Sep 8 2017, 11:11 AM
wmi added a reverting commit for rL312801: Use EmitPointerWithAlignment to get alignment information of the pointer used…: rL312805: Revert rL312801 since it generated some calls from libatomic and broke some….
Sep 8 2017, 11:11 AM
wmi committed rL312801: Use EmitPointerWithAlignment to get alignment information of the pointer used….
Use EmitPointerWithAlignment to get alignment information of the pointer used…
Sep 8 2017, 10:09 AM
wmi closed D37310: [Atomic] Merge alignment information from Decl and from Type when emit atomic expression. by committing rL312801: Use EmitPointerWithAlignment to get alignment information of the pointer used….
Sep 8 2017, 10:09 AM
wmi committed rL312799: Fix a bug for rL312641..
Fix a bug for rL312641.
Sep 8 2017, 9:46 AM

Sep 7 2017

wmi updated the diff for D37310: [Atomic] Merge alignment information from Decl and from Type when emit atomic expression..

Address John's comment.

Sep 7 2017, 6:13 PM
wmi created D37578: [RegAlloc] Keep a copy of live interval for the spilled vregs in HoistSpillHelper.
Sep 7 2017, 11:29 AM

Sep 6 2017

wmi added a comment to D37310: [Atomic] Merge alignment information from Decl and from Type when emit atomic expression..

Ping

Sep 6 2017, 9:40 AM
wmi committed rL312641: [TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when parent.
[TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when parent
Sep 6 2017, 9:06 AM
wmi closed D37406: [TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when parent function return the intrinsics's first argument by committing rL312641: [TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when parent.
Sep 6 2017, 9:06 AM

Sep 1 2017

wmi created D37406: [TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when parent function return the intrinsics's first argument.
Sep 1 2017, 4:53 PM

Aug 30 2017

wmi created D37310: [Atomic] Merge alignment information from Decl and from Type when emit atomic expression..
Aug 30 2017, 2:09 PM
wmi abandoned D37221: [AtomicExpand][X86] Let atomic expand generate inline sequence for unaligned load/store of atomic primitive integer types on x86_64.
Aug 30 2017, 2:03 PM

Aug 29 2017

wmi committed rL312045: [LoopUnswitch] Fix a simple bug which disables loop unswitch for select….
[LoopUnswitch] Fix a simple bug which disables loop unswitch for select…
Aug 29 2017, 2:46 PM
wmi closed D36985: [LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement by committing rL312045: [LoopUnswitch] Fix a simple bug which disables loop unswitch for select….
Aug 29 2017, 2:46 PM

Aug 28 2017

wmi added a comment to D36985: [LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement.

Ping.

Aug 28 2017, 5:08 PM
wmi created D37221: [AtomicExpand][X86] Let atomic expand generate inline sequence for unaligned load/store of atomic primitive integer types on x86_64.
Aug 28 2017, 10:36 AM

Aug 22 2017

wmi updated the diff for D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

Try another idea suggested by David.

Aug 22 2017, 6:31 PM

Aug 21 2017

wmi created D36985: [LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement.
Aug 21 2017, 3:37 PM

Aug 10 2017

wmi added a comment to D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

In the last update, I limited bitfield separation to the beginning of a run, so no bitfield combining will be blocked.

Aug 10 2017, 3:45 PM
wmi updated the diff for D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

Don't separate a bitfield in the middle of a run, because doing so can hinder the combining of bitfield accesses. Only separate a bitfield at the beginning of a run.

Aug 10 2017, 3:36 PM

Aug 9 2017

wmi added a comment to D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.

This has been discussed before and I still pretty strongly disagree with it.

This cripples the ability of TSan to find race conditions between accesses to consecutive bitfields -- and these bugs have actually come up.

Aug 9 2017, 10:42 PM
wmi updated the summary of D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.
Aug 9 2017, 5:13 PM
wmi updated the summary of D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.
Aug 9 2017, 5:05 PM
wmi created D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type.
Aug 9 2017, 5:04 PM

Aug 8 2017

wmi committed rL310421: [GVN] Remove stale entries in phitranslate cache when new phi is generated for….
[GVN] Remove stale entries in phitranslate cache when new phi is generated for…
Aug 8 2017, 2:41 PM
wmi closed D36124: [GVN] Remove stale entry in phitranslate cache when new phi is generated for PRE by committing rL310421: [GVN] Remove stale entries in phitranslate cache when new phi is generated for….
Aug 8 2017, 2:41 PM

Aug 7 2017

wmi added a comment to D36124: [GVN] Remove stale entry in phitranslate cache when new phi is generated for PRE.

Ping.

Aug 7 2017, 2:56 PM

Aug 3 2017

wmi added a comment to D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale..

This patch caused regressions from 5% to 23% in two of our internal benchmarks on Cortex-M23 and Cortex-M0+. I attached test.ll, which is reduced from the benchmarks. I used LLVM revision 309830. 'test.good.ll' is the result when filtering is disabled. 'test.bad.ll' is the result when filtering is enabled.
Comparing them, I can see that this optimization changes how an induction variable is updated. Originally it is incremented from 0 to 256. The optimization changes this into decrementing from 0 to -256. This induction variable is also used as an offset to memory. So, to preserve the semantics, a conversion of the induction variable from a negative value to a positive value is inserted. This is lowered to additional instructions, which cause the performance regressions.
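
A source-level analogy of the effect described above, as a rough sketch only: the real transformation happens on LLVM IR inside LSR, and the attached test.ll is not reproduced here. The function names and the 4-element array are hypothetical; the point is only that the count-down form needs an extra negation to form the memory offset. The constexpr loops require C++14 or later.

```cpp
// Hypothetical data, used only to check that both loops agree.
constexpr int kData[4] = {1, 2, 3, 4};

// Original shape: the induction variable counts up from 0 to n and is used
// directly as the memory offset.
constexpr int sumCountingUp(const int *base, int n) {
  int sum = 0;
  for (int i = 0; i != n; ++i)
    sum += base[i];
  return sum;
}

// Shape after the change described above: the induction variable counts down
// from 0 to -n, so it has to be negated before it can serve as the offset.
constexpr int sumCountingDown(const int *base, int n) {
  int sum = 0;
  for (int i = 0; i != -n; --i)
    sum += base[-i];  // extra negation on every iteration
  return sum;
}

// Both loops visit the same elements and compute the same result.
static_assert(sumCountingUp(kData, 4) == sumCountingDown(kData, 4),
              "the two loop shapes are semantically equivalent");
```

On small cores the extra negation per iteration is exactly the kind of additional instruction that can show up as a measurable slowdown.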

Could you please have a look at this issue?

Thanks,
Evgeny Astigeevich
The ARM Compiler Optimization team leader

Aug 3 2017, 11:12 AM

Jul 31 2017

wmi created D36124: [GVN] Remove stale entry in phitranslate cache when new phi is generated for PRE.
Jul 31 2017, 5:25 PM

Jul 28 2017

wmi committed rL309397: [GVN] Recommit the patch "Add phi-translate support in scalarpre".
[GVN] Recommit the patch "Add phi-translate support in scalarpre"
Jul 28 2017, 8:48 AM

Jul 25 2017

wmi committed rL309073: Add "REQUIRES: asserts" for test unswitch-equality-undef.ll..
Add "REQUIRES: asserts" for test unswitch-equality-undef.ll.
Jul 25 2017, 6:35 PM
wmi committed rL309059: Disable loop unswitching for some patterns containing equality comparison with….
Disable loop unswitching for some patterns containing equality comparison with…
Jul 25 2017, 4:38 PM
wmi closed D35811: A workaround for the bug caused by descrepancy between loop-unswitch and GVN about branch on undef by committing rL309059: Disable loop unswitching for some patterns containing equality comparison with….
Jul 25 2017, 4:38 PM

Jul 24 2017

wmi updated subscribers of D35811: A workaround for the bug caused by descrepancy between loop-unswitch and GVN about branch on undef.
Jul 24 2017, 12:04 PM
wmi created D35811: A workaround for the bug caused by descrepancy between loop-unswitch and GVN about branch on undef.
Jul 24 2017, 12:04 PM

Jul 20 2017

wmi added a comment to D34822: [LVI] Constant-propagate a zero extension of the switch condition value through case edges.

I think we can find many similar problems.

Jul 20 2017, 3:46 PM

Jul 11 2017

wmi added a comment to D34150: [LV] Test once if vector trip count is zero, instead of twice.

Re overflow - the point is that getOrCreateTripCount() returns, basically, PSE.getBackedgeTakenCount() + 1, and that may overflow, so the "trip count" may end up being 0 if the backedge-taken count is the maximum unsigned value (all ones). I don't think this is outdated, and this is behavior we want to preserve. But this patch should preserve this behavior IIUC. Can you make sure there's a test for this?
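
The wrap-around can be illustrated with a fixed-width unsigned integer. This is a minimal sketch, not LLVM code: the function name is hypothetical, and the 32-bit width is an assumption chosen for the example, whereas the real computation is done on SCEV expressions of the loop's own type.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical model of "backedge-taken count + 1" using a 32-bit unsigned
// type. Unsigned arithmetic wraps modulo 2^32, so the result can be 0.
constexpr std::uint32_t tripCountFromBackedgeTakenCount(std::uint32_t btc) {
  return btc + 1u;
}

// A loop whose backedge is never taken still runs once.
static_assert(tripCountFromBackedgeTakenCount(0) == 1, "one iteration");

// When the backedge-taken count is the maximum value, the +1 overflows and
// the computed "trip count" becomes 0.
static_assert(tripCountFromBackedgeTakenCount(
                  std::numeric_limits<std::uint32_t>::max()) == 0,
              "trip count wraps to zero");
```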

Jul 11 2017, 9:49 AM

Jul 6 2017

wmi committed rL307338: [ConstHoisting] Turn on consthoist-with-block-frequency by default..
[ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 5:11 PM
wmi closed D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default by committing rL307338: [ConstHoisting] Turn on consthoist-with-block-frequency by default..
Jul 6 2017, 5:11 PM
wmi committed rL307328: [ConstHoisting] choose to hoist when frequency is the same..
[ConstHoisting] choose to hoist when frequency is the same.
Jul 6 2017, 3:33 PM
wmi closed D35084: [ConstHoisting] choose to hoist when frequency is the same by committing rL307328: [ConstHoisting] choose to hoist when frequency is the same..
Jul 6 2017, 3:32 PM
wmi created D35084: [ConstHoisting] choose to hoist when frequency is the same.
Jul 6 2017, 1:47 PM
wmi added inline comments to D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 10:46 AM
wmi added inline comments to D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 10:13 AM
wmi added inline comments to D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 9:44 AM
wmi created D35063: [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Jul 6 2017, 9:17 AM
wmi committed rL307269: [LSR] Narrow search space by filtering non-optimal formulae with the same….
[LSR] Narrow search space by filtering non-optimal formulae with the same…
Jul 6 2017, 8:52 AM
wmi closed D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale. by committing rL307269: [LSR] Narrow search space by filtering non-optimal formulae with the same….
Jul 6 2017, 8:52 AM

Jun 30 2017

wmi updated the diff for D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale..

Cleanup and reduce the testcase.

Jun 30 2017, 10:13 AM
wmi added inline comments to D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale..
Jun 30 2017, 10:10 AM

Jun 29 2017

wmi updated the diff for D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale..

Address Sanjoy's comments.

Jun 29 2017, 5:22 PM
wmi added inline comments to D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale..
Jun 29 2017, 5:04 PM
wmi accepted D31821: Remove redundant copy in recurrences.

LGTM.

Jun 29 2017, 2:06 PM
wmi added inline comments to D31821: Remove redundant copy in recurrences.
Jun 29 2017, 11:42 AM
wmi accepted D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.
Jun 29 2017, 9:42 AM

Jun 28 2017

wmi added inline comments to D34608: [WIP][AArch64] Increase CSR cost when defering use of CSR is preferred.
Jun 28 2017, 4:50 PM
wmi added inline comments to D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.
Jun 28 2017, 4:13 PM
wmi added a comment to D31821: Remove redundant copy in recurrences.

For the example below, findTargetRecurrence starts from r2 and r3 and searches for a def reg equal to r1. There are a lot of possibilities to explore. That is where the complexity of findTargetRecurrence comes from.

Jun 28 2017, 3:50 PM

Jun 26 2017

wmi added a reviewer for D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.: sanjoy.
Jun 26 2017, 3:09 PM
wmi committed rL306313: [GVN] Recommit the patch "Add phi-translate support in scalarpre"..
[GVN] Recommit the patch "Add phi-translate support in scalarpre".
Jun 26 2017, 11:16 AM

Jun 23 2017

wmi created D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale..
Jun 23 2017, 5:02 PM
wmi added inline comments to D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.
Jun 23 2017, 8:22 AM

Jun 22 2017

wmi added a comment to D33928: [LoopStrengthReduction] Treat SCEVUnknown pessimistically in LSR.

Hi Max,

Jun 22 2017, 10:13 AM

Jun 21 2017

wmi added a comment to D34273: [SCEV] Use depth limit instead of local cache for SExt and ZExt.

Thanks for helping to fix the bug!

Jun 21 2017, 3:30 PM

Jun 16 2017

wmi committed rL305603: Revert rL305578. There is still some buildbot failure to be fixed..
Revert rL305578. There is still some buildbot failure to be fixed.
Jun 16 2017, 4:15 PM
wmi added a reverting commit for rL305578: [GVN] Recommit the patch "Add phi-translate support in scalarpre".: rL305603: Revert rL305578. There is still some buildbot failure to be fixed..
Jun 16 2017, 4:15 PM
wmi committed rL305578: [GVN] Recommit the patch "Add phi-translate support in scalarpre"..
[GVN] Recommit the patch "Add phi-translate support in scalarpre".
Jun 16 2017, 1:21 PM

Jun 10 2017

wmi added a comment to D31821: Remove redundant copy in recurrences.

Sorry for the delay. The rewrite based on SSA looks much cleaner now. About the algorithm, IIUC it tries to find a loop based on the def-use chain of a tied operand or an operand commutable with a tied operand. However, I still have a concern that the method can sometimes increase redundant copies.

Jun 10 2017, 6:31 PM

Jun 7 2017

wmi added a comment to D33847: [PartialInlining] Enhance code outliner to sink locals declared outside the outline region.

One comment about simplifying the test. Other than that, LGTM.

Jun 7 2017, 9:45 AM