cfang (Changpeng Fang)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 30 2015, 11:41 AM (186 w, 1 d)

Recent Activity

May 8 2019

cfang committed rG73b7272e7a87: AMDGPU: Fix a mis-placed bracket (authored by cfang).
May 8 2019, 12:44 PM

Mar 29 2019

cfang added inline comments to D59829: AMDGPU: An extension to promote constant offset to the immediate.
Mar 29 2019, 3:03 PM

Mar 26 2019

cfang added inline comments to D59829: AMDGPU: An extension to promote constant offset to the immediate.
Mar 26 2019, 11:45 AM
cfang created D59829: AMDGPU: An extension to promote constant offset to the immediate.
Mar 26 2019, 11:16 AM

Mar 25 2019

cfang added a comment to D51584: [IndVars] Smart hard uses detection.

I am working on a regression caused by this patch. It is a memory access fault actually.
However, if we put back the original condition, my test would pass.

Mar 25 2019, 2:13 PM

Mar 15 2019

cfang committed rG989ec59c9f04: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges. (authored by cfang).
Mar 15 2019, 2:03 PM
cfang updated the diff for D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

Add comment to the test

Mar 15 2019, 1:28 PM · Restricted Project
cfang updated the diff for D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

update the test with update_test_checks.py

Mar 15 2019, 11:13 AM · Restricted Project

Mar 14 2019

cfang added inline comments to D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..
Mar 14 2019, 9:20 PM · Restricted Project
cfang updated the diff for D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

Fix a typo and add a test.

Mar 14 2019, 10:18 AM · Restricted Project

Mar 13 2019

cfang created D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..
Mar 13 2019, 10:50 AM · Restricted Project

Feb 26 2019

cfang abandoned D55241: AMDGPU: Should always start from the first register in VGPR indexing..
Feb 26 2019, 10:39 AM

Feb 20 2019

cfang abandoned D58479: This is just a try.
Feb 20 2019, 2:34 PM
cfang created D58479: This is just a try.
Feb 20 2019, 2:30 PM

Feb 18 2019

cfang created D58360: Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint.
Feb 18 2019, 4:04 PM
cfang committed rG4cabf6d3b52b: AMDGPU: Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint… (authored by cfang).
Feb 18 2019, 3:02 PM
cfang added inline comments to D58295: AMDGPU: Fix memory dependence analysis by considering the offset..
Feb 18 2019, 2:44 PM · Restricted Project
cfang updated the diff for D58295: AMDGPU: Fix memory dependence analysis by considering the offset..

Remove the unused Target Instruction Info argument in a couple of functions.

Feb 18 2019, 2:33 PM · Restricted Project
cfang updated the diff for D58295: AMDGPU: Fix memory dependence analysis by considering the offset..

Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint in LoadStoreOptimizer pass.

Feb 18 2019, 2:09 PM · Restricted Project

Feb 15 2019

cfang added inline comments to D58295: AMDGPU: Fix memory dependence analysis by considering the offset..
Feb 15 2019, 1:57 PM · Restricted Project
cfang created D58295: AMDGPU: Fix memory dependence analysis by considering the offset..
Feb 15 2019, 11:10 AM · Restricted Project

Jan 16 2019

cfang updated the diff for D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

Update based on reviewer's suggestions:

  1. -check-prefix=GCN
  2. addrspace(0) is not needed.
Jan 16 2019, 1:25 PM
cfang added inline comments to D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..
Jan 16 2019, 1:23 PM
cfang added a comment to D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

This problem isn't limited to private address space. This should have tests for every address space, and with cases using unrelated bases

Jan 16 2019, 12:17 PM
cfang added a comment to D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

Can you add a test where low half does not produce a chain? An arithmetic operation and an undef.

Jan 16 2019, 12:16 PM
cfang updated the diff for D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

Update based on the comments from the reviewers:

Jan 16 2019, 12:15 PM

Jan 15 2019

cfang updated the diff for D56745: AMDGPU: Raise the priority of MAD24 in instruction selection..

update the test based on the following suggestion:

There should be no unnamed variables left.

Jan 15 2019, 3:05 PM
cfang created D56745: AMDGPU: Raise the priority of MAD24 in instruction selection..
Jan 15 2019, 2:46 PM

Jan 8 2019

cfang created D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..
Jan 8 2019, 2:19 PM

Jan 4 2019

cfang added inline comments to D56291: ScheduleDAG: Don't break the dependence in clustering neighboring loads..
Jan 4 2019, 10:52 AM

Jan 3 2019

cfang created D56291: ScheduleDAG: Don't break the dependence in clustering neighboring loads..
Jan 3 2019, 2:16 PM

Dec 17 2018

cfang updated the diff for D55568: AMDGPU: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing..

Cheap check first.

Dec 17 2018, 10:00 AM

Dec 13 2018

cfang added inline comments to D55568: AMDGPU: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing..
Dec 13 2018, 8:29 AM

Dec 11 2018

cfang created D55568: AMDGPU: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing..
Dec 11 2018, 12:25 PM

Dec 10 2018

cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

No, it's not committed. Variable + constant is a common case in general.

It would probably be better to do this fold in the DAG for now though

Dec 10 2018, 10:18 AM

Dec 5 2018

cfang updated the diff for D55241: AMDGPU: Should always start from the first register in VGPR indexing..

Fix typos.

Dec 5 2018, 2:36 PM
cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

D30466 is the primitive computeKnownBits

Dec 5 2018, 11:23 AM
cfang added inline comments to D55241: AMDGPU: Should always start from the first register in VGPR indexing..
Dec 5 2018, 10:28 AM
cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

We should try to use some known bits information to keep this. I have a patch to add a machine version, but there might be a better way

Would you please explain how your known-bits approach would resolve the negative index issue while keeping the optimization for gfx9+?
Or just post your patch. Thanks.

If you know the base index isn't negative, you don't need to disable this

Dec 5 2018, 10:28 AM

Dec 3 2018

cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

We should try to use some known bits information to keep this. I have a patch to add a machine version, but there might be a better way

Dec 3 2018, 4:00 PM
cfang created D55241: AMDGPU: Should always start from the first register in VGPR indexing..
Dec 3 2018, 3:47 PM

Oct 15 2018

cfang updated the diff for D52946: AMDGPU: Add support pattern for SUB of one bit.

Add test for non-uniform case.

Oct 15 2018, 3:10 PM

Oct 5 2018

cfang created D52946: AMDGPU: Add support pattern for SUB of one bit.
Oct 5 2018, 1:49 PM

Sep 25 2018

cfang created D52518: AMDGPU: Add Selection patterns to support add of one bit..
Sep 25 2018, 1:26 PM

May 17 2018

cfang added a reviewer for D46912: StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly: dberlin.

Thanks Justin for your comment. I added Daniel as a reviewer.

May 17 2018, 9:25 AM
cfang added a reviewer for D46912: StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly: jlebar.
May 17 2018, 8:44 AM

May 16 2018

cfang updated the diff for D46912: StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly.

Updated the test based on the comments

May 16 2018, 1:33 PM
cfang added a comment to D46912: StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly.

Right. Reverse Post Order (RPO) should be the fundamental order. But apparently there are some cases where pure RPO does not work.
That is why loop depth was introduced to guard the ordering of the nodes.
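
For readers unfamiliar with RPO, here is a minimal sketch of how a reverse post order can be computed over a small control-flow graph. It is illustrative only: the function name `reverse_post_order`, the dictionary-based graph, and the block names are hypothetical, and it is not the StructurizeCFG implementation.

```python
def reverse_post_order(succs, entry):
    """Return the nodes reachable from `entry` in reverse post order."""
    visited = set()
    post = []

    def dfs(node):
        visited.add(node)
        for s in succs.get(node, []):
            if s not in visited:
                dfs(s)
        # A node is appended only after all of its unvisited successors.
        post.append(node)

    dfs(entry)
    return post[::-1]


# Hypothetical toy CFG: a single loop (loop <-> body) with one exit block.
cfg = {
    "entry": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}
order = reverse_post_order(cfg, "entry")
print(order)  # ['entry', 'loop', 'exit', 'body']
```

In this toy graph a plain RPO happens to place `exit` ahead of the loop block `body`. That is the kind of ordering where an extra key such as loop depth could matter, though whether it matches the cases behind this patch is an assumption.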

May 16 2018, 1:32 PM
cfang added a comment to D46340: AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops..

We need this fix to unblock some important tasks. So if there are no additional comments, we need permission
to integrate the patch. Thanks.

May 16 2018, 10:04 AM

May 15 2018

cfang created D46912: StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly.
May 15 2018, 4:06 PM

May 10 2018

cfang added a comment to D46085: AMDGPU/SI: Don't promote alloca to vector for atomic load/store.

ping

May 10 2018, 8:19 AM

May 9 2018

cfang added a comment to D46340: AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops..

Any additional comments/suggestions? Thanks!

May 9 2018, 2:54 PM
cfang updated the diff for D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector.

Update based on Matt's comments.

May 9 2018, 2:52 PM
cfang added inline comments to D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector.
May 9 2018, 2:50 PM

May 7 2018

cfang accepted D46532: AMDGPU: Stop special casing constant indexes of extract_vector_elt.
May 7 2018, 2:18 PM
cfang updated the diff for D46340: AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops..

Correct indentation error and typo.

May 7 2018, 9:31 AM
cfang added a comment to D46340: AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops..
May 7 2018, 9:28 AM

May 4 2018

cfang created D46438: AMDGPU: Use eraseFromParent to delete am instruction when it is no longer needed..
May 4 2018, 9:34 AM

May 3 2018

cfang added a comment to D46340: AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops..

So this does not cover all possible cases of "irreducible" infinite loops, e.g. consider the case with basic blocks A and B where both end with a conditional branch that can go to either A or B. I.e., there are no unconditional branches in the loop.

A more robust approach would be to also transform conditional branches like this:

br i1 %cc, label %A, label %B

becomes

  br i1 true, label %DummyReturnBB, label %local_dummy
local_dummy:
  br i1 %cc, label %A, label %B
May 3 2018, 4:45 PM
cfang updated the diff for D46340: AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops..

Handle the case where the infinite loop is controlled by a conditional branch. In such a case, both edges of the branch
are backedges.

May 3 2018, 4:40 PM

May 2 2018

cfang updated the diff for D45993: AMDGPU/SI: Don't promote alloca to vector for AddrSpaceCast instruction..

Update the test:

May 2 2018, 2:40 PM
cfang updated the diff for D46085: AMDGPU/SI: Don't promote alloca to vector for atomic load/store.

Update the test:

  1. Remove the -amdgiz from the triple since it is no longer necessary;
  2. Add "-data-layout=A5" to explicitly specify the data layout for address space 5 for alloca,
     and remove the "target:" line that served the same data layout purpose.
May 2 2018, 2:23 PM

May 1 2018

cfang created D46340: AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops..
May 1 2018, 4:11 PM

Apr 27 2018

cfang accepted D46154: [AMDGPU][Waitcnt] Update a few lit tests to use the default waitcnt pass.

LGTM

Apr 27 2018, 9:51 AM · Restricted Project

Apr 26 2018

cfang added inline comments to D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector.
Apr 26 2018, 2:10 PM
cfang added a comment to D46085: AMDGPU/SI: Don't promote alloca to vector for atomic load/store.

What do you think of the test?
What should we do if we don't add the following line?
target datalayout = "A5"

Apr 26 2018, 1:48 PM
cfang added a comment to D45993: AMDGPU/SI: Don't promote alloca to vector for AddrSpaceCast instruction..

What do you think of the test? Thanks.

Apr 26 2018, 1:46 PM

Apr 25 2018

cfang accepted D46089: AMDGPU: Extend extract_vector_elt fneg combine to fabs.

LGTM.

Apr 25 2018, 3:49 PM
cfang added inline comments to D46085: AMDGPU/SI: Don't promote alloca to vector for atomic load/store.
Apr 25 2018, 3:48 PM
cfang updated the diff for D46085: AMDGPU/SI: Don't promote alloca to vector for atomic load/store.

Combine !isAtomic and !isVolatile checks as isSimple

Apr 25 2018, 3:43 PM
cfang created D46085: AMDGPU/SI: Don't promote alloca to vector for atomic load/store.
Apr 25 2018, 2:33 PM
cfang updated the diff for D45993: AMDGPU/SI: Don't promote alloca to vector for AddrSpaceCast instruction..

Fix addrspacecast in the test since generic address space is 0 (default) now. Thanks, Sam!

Apr 25 2018, 10:08 AM

Apr 24 2018

cfang updated the diff for D45993: AMDGPU/SI: Don't promote alloca to vector for AddrSpaceCast instruction..

Add a test.

Apr 24 2018, 4:16 PM

Apr 23 2018

cfang created D45993: AMDGPU/SI: Don't promote alloca to vector for AddrSpaceCast instruction..
Apr 23 2018, 3:02 PM
cfang accepted D45973: [AMDGPU][Waitcnt] NFC. Cleanup some code.
Apr 23 2018, 10:53 AM · Restricted Project
cfang added reviewers for D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector: msearles, kzhuravl.

Add reviewers and ping.

Apr 23 2018, 10:50 AM

Apr 12 2018

cfang added a comment to D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector.

ping

Apr 12 2018, 4:10 PM

Apr 3 2018

cfang created D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector.
Apr 3 2018, 1:50 PM

Feb 16 2018

cfang updated the diff for D33559: AMDGPU/SI: Extend promoting alloca to vector to arrays of up to 16 elements.

Update LIT tests before landing since the patch was developed a long time back.

Feb 16 2018, 11:04 AM

Feb 15 2018

cfang updated the diff for D43297: AMDGPU/SI: Turn off GPR Indexing Mode immediately after the interested instruction..

full context diff

Feb 15 2018, 4:06 PM
cfang added a comment to D43297: AMDGPU/SI: Turn off GPR Indexing Mode immediately after the interested instruction..

Ping

Feb 15 2018, 4:02 PM
cfang added a reviewer for D43297: AMDGPU/SI: Turn off GPR Indexing Mode immediately after the interested instruction.: rampitec.
Feb 15 2018, 2:58 PM

Feb 14 2018

cfang created D43297: AMDGPU/SI: Turn off GPR Indexing Mode immediately after the interested instruction..
Feb 14 2018, 9:39 AM

Feb 8 2018

cfang accepted D42760: AMDGPU: Remove tied operand from si_else.
Feb 8 2018, 4:10 PM

Feb 7 2018

cfang accepted D43049: AMDGPU: Don't crash when trying to fold implicit operands.

LGTM

Feb 7 2018, 4:03 PM

Jan 31 2018

cfang added a comment to D42548: AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature..

Needs test

Jan 31 2018, 3:14 PM
cfang updated the diff for D42548: AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature..

Update tests to expose the LLVM-ERROR without the patch.

Jan 31 2018, 3:14 PM

Jan 29 2018

cfang added a comment to D42548: AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature..

ping! Thanks.

Jan 29 2018, 1:22 PM
cfang added a comment to D42596: AMDGPU/SI: Add decoding in the GFX80_UNPACKED decoding namespace..
In D42596#990403, @dp wrote:

It would be nice to have a few decoding tests as there is currently zero _d16 coverage for disassembler.

BTW, there is no difference in data size for affected MIMG instructions. Currently data size in dwords = count_population(dmask) + tfe. How does d16 affect this number?

I was unable to find anything helpful in docs and SP3 output is not affected by d16 for gfx9. Are data in VGPRs really packed for MIMG? :-)

For MIMG, determining the data size for d16 requires knowing whether the target has the UnpackedD16VMem feature (gfx8.0):

  1. if that feature is set, the data size is the same as without the D16 bit set;
  2. if that feature is not set, the data size is "half" of that size, because two f16 values can be packed into one register.
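
The rule above can be sketched as a small function. This is only an illustration under stated assumptions: the name `mimg_data_dwords` and its parameters are hypothetical, dmask is treated as a 4-bit channel mask, the packed case rounds up for an odd channel count, and the TFE dword is added after halving. None of these details are confirmed by the patch itself.

```python
def mimg_data_dwords(dmask: int, tfe: bool, d16: bool, unpacked_d16_vmem: bool) -> int:
    """Estimate the number of data dwords for a MIMG access.

    Base rule quoted in the discussion: count_population(dmask) + tfe.
    """
    # Assumption: dmask is a 4-bit channel mask.
    channels = bin(dmask & 0xF).count("1")
    if d16 and not unpacked_d16_vmem:
        # Packed D16: two f16 values share one register.
        # Assumption: an odd channel count rounds up.
        channels = (channels + 1) // 2
    # Assumption: the TFE dword is not halved.
    return channels + (1 if tfe else 0)


# Unpacked target (feature set): same size with or without D16.
print(mimg_data_dwords(0xF, tfe=False, d16=True, unpacked_d16_vmem=True))   # 4
# Packed target (feature not set): half the size.
print(mimg_data_dwords(0xF, tfe=False, d16=True, unpacked_d16_vmem=False))  # 2
```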
Jan 29 2018, 12:07 PM
cfang updated the diff for D42596: AMDGPU/SI: Add decoding in the GFX80_UNPACKED decoding namespace..

Add Disassembler tests based on Reviewers' suggestion. Thanks.

Jan 29 2018, 11:51 AM

Jan 26 2018

cfang added a comment to D42596: AMDGPU/SI: Add decoding in the GFX80_UNPACKED decoding namespace..

Can this be tested?

Jan 26 2018, 1:46 PM
cfang created D42596: AMDGPU/SI: Add decoding in the GFX80_UNPACKED decoding namespace..
Jan 26 2018, 12:16 PM

Jan 25 2018

cfang created D42548: AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature..
Jan 25 2018, 9:55 AM

Jan 18 2018

cfang closed D39912: AMDGPU/SI: Implement d16 support for image intrinsics.

Patch committed to trunk:
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322903

Jan 18 2018, 3:03 PM
cfang added a comment to D39912: AMDGPU/SI: Implement d16 support for image intrinsics.

Patch updated! Requesting the reviewers' check. Thanks.
(I still did not receive a message about the diff update.)

Jan 18 2018, 8:35 AM
cfang updated the diff for D39912: AMDGPU/SI: Implement d16 support for image intrinsics.

Update based on Matt's suggestion: Factor out the common defs.

Jan 18 2018, 8:28 AM

Jan 17 2018

cfang added inline comments to D39912: AMDGPU/SI: Implement d16 support for image intrinsics.
Jan 17 2018, 10:04 AM

Jan 16 2018

cfang added a comment to D39912: AMDGPU/SI: Implement d16 support for image intrinsics.

I don't know why I didn't receive a message after I updated the patch, so I am pinging here with the update message:

Jan 16 2018, 8:17 AM

Jan 15 2018

cfang updated the diff for D39912: AMDGPU/SI: Implement d16 support for image intrinsics.
  1. Sync with the buffer d16 support patch to reuse some of its functionality;
  2. Remove unnecessary functions that print "d16" in the AsmPrinter;
  3. Update based on Matt's other comments.
Jan 15 2018, 3:18 PM

Jan 11 2018

cfang added inline comments to D38906: AMDGPU/SI: Implement d16 support for buffer intrinsics.
Jan 11 2018, 11:25 AM