
cfang (Changpeng Fang)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 30 2015, 11:41 AM (225 w, 5 d)

Recent Activity

Feb 27 2020

cfang committed rG8629cfdd7d5c: AMDGPU: fix multipe definitions of DEBUG_TYPE in amd-common (authored by cfang).
AMDGPU: fix multipe definitions of DEBUG_TYPE in amd-common
Feb 27 2020, 4:15 AM
cfang committed rGee2e845ba7fd: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 27 2020, 4:14 AM
cfang committed rGe9fea9aba9a8: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 27 2020, 3:52 AM
cfang committed rG98ae090e0088: AMDGPU/SI: Fix MIN/UMIN Selection in ConvertAtomicLibCalls (authored by cfang).
AMDGPU/SI: Fix MIN/UMIN Selection in ConvertAtomicLibCalls
Feb 27 2020, 3:41 AM
cfang committed rGbfcd3423e569: AMDGPU/SI: implements image load/store intrinsics (authored by cfang).
AMDGPU/SI: implements image load/store intrinsics
Feb 27 2020, 3:11 AM
cfang committed rG971896d73390: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 27 2020, 1:52 AM

Feb 26 2020

cfang committed rG23a2858a123d: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 26 2020, 11:45 PM

Feb 11 2020

cfang accepted D74408: AMDGPU: Don't create potentially dead rcp declarations.

LGTM

Feb 11 2020, 9:35 AM · Restricted Project

Feb 7 2020

cfang closed D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

commit 2531535984ad989ce88aeee23cb92a827da6686e
Author: Changpeng Fang <changpeng.fang@gmail.com>
Date: Thu Jan 23 16:57:43 2020 -0800

Feb 7 2020, 1:03 PM · Restricted Project
cfang committed rG884acbb9e167: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare (authored by cfang).
AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare
Feb 7 2020, 11:49 AM
cfang closed D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 7 2020, 11:49 AM · Restricted Project
cfang committed rG6370c7c13e6d: AMDGPU: Limit the search in finding the instruction pattern for v_swap… (authored by cfang).
AMDGPU: Limit the search in finding the instruction pattern for v_swap…
Feb 7 2020, 11:12 AM
cfang closed D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 7 2020, 11:12 AM · Restricted Project
cfang updated the diff for D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Fix the LIT test failures.
(Yes, I messed up with a release build check. Thanks for catching these.)

Feb 7 2020, 10:54 AM · Restricted Project

Feb 6 2020

cfang added a comment to D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Where are these failing LIT tests located? I ran the LIT tests before posting for review and again before integrating.
Maybe my check was incomplete. Thanks.

Feb 6 2020, 7:58 PM · Restricted Project
cfang committed rG982780648124: AMDGPU: Limit the search in finding the instruction pattern for v_swap… (authored by cfang).
AMDGPU: Limit the search in finding the instruction pattern for v_swap…
Feb 6 2020, 4:42 PM
cfang closed D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 6 2020, 4:42 PM · Restricted Project
cfang created D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 6 2020, 4:23 PM · Restricted Project
cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

Updated based on feedback:

Feb 6 2020, 11:07 AM · Restricted Project

Feb 5 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 5 2020, 3:11 PM · Restricted Project
cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

Rename a few functions and variables:

Feb 5 2020, 2:16 PM · Restricted Project
cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 5 2020, 12:10 PM · Restricted Project

Feb 3 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 3 2020, 10:01 AM · Restricted Project

Jan 31 2020

cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

update based on comment.

Jan 31 2020, 2:25 PM · Restricted Project

Jan 30 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 30 2020, 3:16 PM · Restricted Project

Jan 29 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 29 2020, 2:53 PM · Restricted Project

Jan 28 2020

cfang created D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 28 2020, 3:03 PM · Restricted Project

Jan 23 2020

cfang committed rG2531535984ad: AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare (authored by cfang).
AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare
Jan 23 2020, 5:01 PM
cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Update based on feedback:

Jan 23 2020, 11:56 AM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 23 2020, 11:51 AM · Restricted Project

Jan 22 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

update based on feedback.

Jan 22 2020, 4:00 PM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 22 2020, 2:31 PM · Restricted Project
cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Update based on the comments.

Jan 22 2020, 12:05 PM · Restricted Project

Jan 21 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Update based on feedback from the reviewer.

Jan 21 2020, 5:08 PM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 21 2020, 9:38 AM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 21 2020, 8:36 AM · Restricted Project

Jan 20 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Implement the rcp optimization for fdiv in AMDGPUCodeGenPrepare by inserting the amdgcn_rcp intrinsic. For an f32 fdiv,
if fpmath metadata is unavailable, we cannot apply the rcp optimization unless fast unsafe math is specified.

Jan 20 2020, 4:27 PM · Restricted Project
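
The rule stated in the diff summary above can be sketched as a small decision function. This is only an illustrative model, not the code from the patch: the function name `may_use_rcp`, its parameters, and the accuracy threshold are all hypothetical. The summary only says that an f32 fdiv without fpmath metadata may not use rcp unless unsafe math is enabled; how the pass treats the case where metadata is present is an assumption here.

```python
from typing import Optional

# Hypothetical cutoff in ULP. The review thread mentions a "2.5 ULP" fast
# division, but the exact threshold the pass uses is an assumption.
ASSUMED_MIN_ULP_FOR_RCP = 2.5


def may_use_rcp(fpmath_ulp: Optional[float], unsafe_math: bool) -> bool:
    """Model of the f32 fdiv rule: decide whether rcp may be used.

    fpmath_ulp  -- accuracy allowed by fpmath metadata, or None if absent
    unsafe_math -- True if fast unsafe math is specified
    """
    if unsafe_math:
        # Unsafe math permits the optimization regardless of metadata.
        return True
    if fpmath_ulp is None:
        # No metadata and no unsafe math: the optimization is not allowed.
        return False
    # Assumption: with metadata present, the allowed error must be loose enough.
    return fpmath_ulp >= ASSUMED_MIN_ULP_FOR_RCP
```

For example, `may_use_rcp(None, False)` is false, while `may_use_rcp(None, True)` is true, matching the two cases the summary describes.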

Jan 10 2020

cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 10 2020, 9:53 AM · Restricted Project

Jan 7 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Introduce an intrinsic in AMDGPUCodeGenPrepare to generate correctly rounded fdiv32.

Jan 7 2020, 3:14 PM · Restricted Project

Dec 12 2019

cfang added a comment to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

The attribute should not be directly checked (we probably shouldn’t even be putting it on the function). The proper thing to check is the fpmath metadata on the individual instruction. This isn’t propagated into the DAG, so AMDGPUCodeGenPrepare inserts intrinsic calls, which isn’t ideal.

So what's your suggestion here? The current logic in AMDGPUCodeGenPrepare is to find cases where we can insert the intrinsic to generate "Faster 2.5 ULP division that does not support denormals."
Otherwise SIISelLowering will lower FDIV32 according to UnsafeMath and denorm support.

Dec 12 2019, 1:28 PM · Restricted Project

Dec 10 2019

cfang created D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Dec 10 2019, 11:23 AM · Restricted Project

Oct 25 2019

cfang committed rG1ce552f3ef8d: AMDGPU: Fix the broken dominator tree when creating waterfall loop for resource… (authored by cfang).
AMDGPU: Fix the broken dominator tree when creating waterfall loop for resource…
Oct 25 2019, 1:15 PM
cfang closed D69358: AMDGPU: Fix the broken dominator tree.
Oct 25 2019, 1:15 PM · Restricted Project

Oct 24 2019

cfang added inline comments to D69358: AMDGPU: Fix the broken dominator tree.
Oct 24 2019, 4:36 PM · Restricted Project
cfang updated the diff for D69358: AMDGPU: Fix the broken dominator tree.

Update based on the comments

Oct 24 2019, 4:35 PM · Restricted Project
cfang updated the diff for D69358: AMDGPU: Fix the broken dominator tree.

Add test case.

Oct 24 2019, 9:23 AM · Restricted Project

Oct 23 2019

cfang created D69358: AMDGPU: Fix the broken dominator tree.
Oct 23 2019, 2:49 PM · Restricted Project

Oct 1 2019

cfang committed rGe4ee28d14ce6: AMDGPU: Fix an out of date assert in addressing FrameIndex (authored by cfang).
AMDGPU: Fix an out of date assert in addressing FrameIndex
Oct 1 2019, 4:07 PM

Sep 30 2019

cfang updated the diff for D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.

update the test based on the comment:

  1. use a defined comparison
  2. remove "unreachable", and make it branch to exit.
Sep 30 2019, 1:38 PM · Restricted Project

Sep 26 2019

cfang committed rGf5524f04512d: Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint (authored by cfang).
Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint
Sep 26 2019, 3:54 PM
cfang updated the diff for D58360: Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint.

Rebase and update the diffs.

Sep 26 2019, 3:38 PM · Restricted Project
cfang updated the diff for D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.
  1. Remove the use of structs to simplify the test
  2. generate llc checks with utils/update_llc_test_checks.py
Sep 26 2019, 1:13 PM · Restricted Project

Sep 24 2019

cfang added inline comments to D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.
Sep 24 2019, 3:42 PM · Restricted Project

Sep 13 2019

cfang created D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.
Sep 13 2019, 3:01 PM · Restricted Project

Aug 26 2019

cfang accepted D66772: AMDGPU: Combine directly on mul24 intrinsics.
Aug 26 2019, 4:13 PM

May 8 2019

cfang committed rG73b7272e7a87: AMDGPU: Fix a mis-placed bracket (authored by cfang).
AMDGPU: Fix a mis-placed bracket
May 8 2019, 12:44 PM

Mar 29 2019

cfang added inline comments to D59829: AMDGPU: An extension to promote constant offset to the immediate.
Mar 29 2019, 3:03 PM

Mar 26 2019

cfang added inline comments to D59829: AMDGPU: An extension to promote constant offset to the immediate.
Mar 26 2019, 11:45 AM
cfang created D59829: AMDGPU: An extension to promote constant offset to the immediate.
Mar 26 2019, 11:16 AM

Mar 25 2019

cfang added a comment to D51584: [IndVars] Smart hard uses detection.

I am working on a regression caused by this patch. It is actually a memory access fault.
However, if we put back the original condition, my test passes.

Mar 25 2019, 2:13 PM

Mar 15 2019

cfang committed rG989ec59c9f04: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges. (authored by cfang).
AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges.
Mar 15 2019, 2:03 PM
cfang updated the diff for D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

Add comment to the test

Mar 15 2019, 1:28 PM · Restricted Project
cfang updated the diff for D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

update the test with update_test_checks.py

Mar 15 2019, 11:13 AM · Restricted Project

Mar 14 2019

cfang added inline comments to D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..
Mar 14 2019, 9:20 PM · Restricted Project
cfang updated the diff for D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

Fix a typo and add a test.

Mar 14 2019, 10:18 AM · Restricted Project

Mar 13 2019

cfang created D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..
Mar 13 2019, 10:50 AM · Restricted Project

Feb 26 2019

cfang abandoned D55241: AMDGPU: Should always start from the first register in VGPR indexing..
Feb 26 2019, 10:39 AM

Feb 20 2019

cfang abandoned D58479: This is just a try.
Feb 20 2019, 2:34 PM
cfang created D58479: This is just a try.
Feb 20 2019, 2:30 PM

Feb 18 2019

cfang created D58360: Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint.
Feb 18 2019, 4:04 PM · Restricted Project
cfang committed rG4cabf6d3b52b: AMDGPU: Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint… (authored by cfang).
AMDGPU: Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint…
Feb 18 2019, 3:02 PM
cfang added inline comments to D58295: AMDGPU: Fix memory dependence analysis by considering the offset..
Feb 18 2019, 2:44 PM · Restricted Project
cfang updated the diff for D58295: AMDGPU: Fix memory dependence analysis by considering the offset..

Remove the unused Target Instruction Info argument in a couple functions.

Feb 18 2019, 2:33 PM · Restricted Project
cfang updated the diff for D58295: AMDGPU: Fix memory dependence analysis by considering the offset..

Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint in LoadStoreOptimizer pass.

Feb 18 2019, 2:09 PM · Restricted Project

Feb 15 2019

cfang added inline comments to D58295: AMDGPU: Fix memory dependence analysis by considering the offset..
Feb 15 2019, 1:57 PM · Restricted Project
cfang created D58295: AMDGPU: Fix memory dependence analysis by considering the offset..
Feb 15 2019, 11:10 AM · Restricted Project

Jan 16 2019

cfang updated the diff for D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

Update based on reviewer's suggestions:

  1. -check-prefix=GCN
  2. addrspace(0) is not needed.
Jan 16 2019, 1:25 PM
cfang added inline comments to D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..
Jan 16 2019, 1:23 PM
cfang added a comment to D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

This problem isn't limited to private address space. This should have tests for every address space, and with cases using unrelated bases

Jan 16 2019, 12:17 PM
cfang added a comment to D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

Can you add a test where low half does not produce a chain? An arithmetic operation and an undef.

Jan 16 2019, 12:16 PM
cfang updated the diff for D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..

Update based on the comments from the reviewers:

Jan 16 2019, 12:15 PM

Jan 15 2019

cfang updated the diff for D56745: AMDGPU: Raise the priority of MAD24 in instruction selection..

update the test based on the following suggestion:

There should be no unnamed variables left.

Jan 15 2019, 3:05 PM
cfang created D56745: AMDGPU: Raise the priority of MAD24 in instruction selection..
Jan 15 2019, 2:46 PM

Jan 8 2019

cfang created D56454: AMDGPU: Adjust the chain for loads writing to the HI part of a register..
Jan 8 2019, 2:19 PM

Jan 4 2019

cfang added inline comments to D56291: ScheduleDAG: Don't break the dependence in clustering neighboring loads..
Jan 4 2019, 10:52 AM

Jan 3 2019

cfang created D56291: ScheduleDAG: Don't break the dependence in clustering neighboring loads..
Jan 3 2019, 2:16 PM

Dec 17 2018

cfang updated the diff for D55568: AMDGPU: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing..

Cheap check first.

Dec 17 2018, 10:00 AM

Dec 13 2018

cfang added inline comments to D55568: AMDGPU: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing..
Dec 13 2018, 8:29 AM

Dec 11 2018

cfang created D55568: AMDGPU: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing..
Dec 11 2018, 12:25 PM

Dec 10 2018

cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

No, it's not committed. Variable + constant is a common case in general.

It would probably be better to do this fold in the DAG for now though

Dec 10 2018, 10:18 AM

Dec 5 2018

cfang updated the diff for D55241: AMDGPU: Should always start from the first register in VGPR indexing..

Fix typos.

Dec 5 2018, 2:36 PM
cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

D30466 is the primitive computeKnownBits

Dec 5 2018, 11:23 AM
cfang added inline comments to D55241: AMDGPU: Should always start from the first register in VGPR indexing..
Dec 5 2018, 10:28 AM
cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

We should try to use some known bits information to keep this. I have a patch to add a machine version, but there might be a better way

Would you please explain how your known-bits approach would resolve the negative index issue while keeping the optimization for gfx9+?
Or just post your patch. Thanks.

If you know the base index isn't negative, you don't need to disable this

Dec 5 2018, 10:28 AM

Dec 3 2018

cfang added a comment to D55241: AMDGPU: Should always start from the first register in VGPR indexing..

We should try to use some known bits information to keep this. I have a patch to add a machine version, but there might be a better way

Dec 3 2018, 4:00 PM
cfang created D55241: AMDGPU: Should always start from the first register in VGPR indexing..
Dec 3 2018, 3:47 PM

Oct 15 2018

cfang updated the diff for D52946: AMDGPU: Add support pattern for SUB of one bit.

Add test for non-uniform case.

Oct 15 2018, 3:10 PM

Oct 5 2018

cfang created D52946: AMDGPU: Add support pattern for SUB of one bit.
Oct 5 2018, 1:49 PM

Sep 25 2018

cfang created D52518: AMDGPU: Add Selection patterns to support add of one bit..
Sep 25 2018, 1:26 PM

May 17 2018

cfang added a reviewer for D46912: StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly: dberlin.

Thanks Justin for your comment. I added Daniel as a reviewer.

May 17 2018, 9:25 AM