cfang (Changpeng Fang)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 30 2015, 11:41 AM (314 w, 2 d)

Recent Activity

Wed, Nov 17

cfang added a comment to D100464: [DSE] Remove stores in the same loop iteration.

Right. If we manually LICM'ed the loads/stores in group 1 and group 2 (with something like sum_i),
we would see the higher register pressure. Further, if we intentionally insert dead stores to store
the intermediate sum_i values, the register pressure can be lowered. Some later passes should be responsible.

Wed, Nov 17, 10:56 AM · Restricted Project

Tue, Nov 16

cfang added a comment to D100464: [DSE] Remove stores in the same loop iteration.

We see a big performance regression caused by this patch due to an increase in register pressure.
We found that removing stores in the same iteration can increase
the register pressure dramatically. In the following piece of code, the group 1 and group 2 stores
are eliminated and the register usage increases from 70 to 171. This is from our
critical application. I am not sure whether DSE could estimate register pressure (RP)?

Tue, Nov 16, 9:59 PM · Restricted Project

May 12 2021

cfang abandoned D59829: AMDGPU: An extension to promote constant offset to the immediate.
May 12 2021, 10:49 AM

Mar 29 2021

cfang added inline comments to D99352: [AMDGPU] ds_read_*/ds_write_* operations require strict alignment..
Mar 29 2021, 4:04 PM · Restricted Project

Jan 25 2021

cfang committed rG5b648df1a842: AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause (authored by cfang).
Jan 25 2021, 4:09 PM
cfang closed D95273: AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause.
Jan 25 2021, 4:09 PM · Restricted Project
cfang updated the diff for D95273: AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause.

Fix LIT failures.

Jan 25 2021, 3:28 PM · Restricted Project

Jan 23 2021

cfang updated the diff for D95273: AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause.

Make a few changes based on comments. Thanks!

Jan 23 2021, 10:41 PM · Restricted Project

Jan 22 2021

cfang added inline comments to D95273: AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause.
Jan 22 2021, 6:17 PM · Restricted Project
cfang requested review of D95273: AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause.
Jan 22 2021, 4:21 PM · Restricted Project

Jan 12 2021

cfang abandoned D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

The issue has been worked around by https://reviews.llvm.org/D94107,
so I am abandoning this one.

Jan 12 2021, 2:44 PM · Restricted Project

Jan 5 2021

cfang committed rGcb5b52a06eeb: AMDGPU: Annotate amdgpu.noclobber for global loads only (authored by cfang).
Jan 5 2021, 2:48 PM
cfang closed D94107: AMDGPU: Annotate amdgpu.noclobber for global loads only.
Jan 5 2021, 2:48 PM · Restricted Project
cfang requested review of D94107: AMDGPU: Annotate amdgpu.noclobber for global loads only.
Jan 5 2021, 11:24 AM · Restricted Project

Dec 14 2020

cfang committed rGce0c0013d8b1: AMDGPU: If a store defines (alias) a load, it clobbers the load. (authored by cfang).
Dec 14 2020, 4:35 PM
cfang closed D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..
Dec 14 2020, 4:35 PM · Restricted Project
cfang added inline comments to D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..
Dec 14 2020, 3:40 PM · Restricted Project
cfang added inline comments to D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..
Dec 14 2020, 10:57 AM · Restricted Project

Dec 11 2020

cfang updated the diff for D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..

Add a new test case.

Dec 11 2020, 2:27 PM · Restricted Project
cfang added inline comments to D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..
Dec 11 2020, 12:34 PM · Restricted Project

Dec 9 2020

cfang added inline comments to D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..
Dec 9 2020, 9:55 PM · Restricted Project
cfang updated the diff for D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..

Change store to any instructions that may write to memory.

Dec 9 2020, 11:51 AM · Restricted Project
cfang requested review of D92951: AMDGPU: If a store defines (alias) a load, it clobbers the load..
Dec 9 2020, 11:00 AM · Restricted Project

Dec 1 2020

cfang added a comment to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

Ping!

Should we commit this patch to fix the compilation time for now? Then we may look at the possibility of replacing
MemoryDependenceAnalysis in the AnnotateUniform pass.

This doesn't sound like a commitment to me

Dec 1 2020, 4:03 PM · Restricted Project

Oct 27 2020

cfang added a comment to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

Should we commit this patch to fix the compilation time for now? Then we may look at the possibility of replacing
MemoryDependenceAnalysis in the AnnotateUniform pass.

Oct 27 2020, 2:10 PM · Restricted Project

Oct 12 2020

cfang added a comment to D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.

Why does this require a commit upstream?

Oct 12 2020, 10:53 AM · Restricted Project
cfang added a comment to D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.

I agree with Matt here. You should be able to do experiments locally. Perhaps sinking should be disabled entirely, or perhaps sinking should be improved to take register liveness into account.

Oct 12 2020, 10:40 AM · Restricted Project

Oct 9 2020

cfang committed rGf192a27ed3ba: Sink: Handle instruction sink when a user is dead (authored by cfang).
Oct 9 2020, 4:21 PM
cfang closed D89166: Sink: Handle instruction sink when a user is dead.
Oct 9 2020, 4:21 PM · Restricted Project
cfang added inline comments to D89166: Sink: Handle instruction sink when a user is dead.
Oct 9 2020, 3:50 PM · Restricted Project
cfang updated the diff for D89166: Sink: Handle instruction sink when a user is dead.

Remove the check for findNearestCommonDominator failures.

Oct 9 2020, 3:49 PM · Restricted Project
cfang requested review of D89166: Sink: Handle instruction sink when a user is dead.
Oct 9 2020, 3:38 PM · Restricted Project
cfang added a comment to D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.

Just adding another flag isn't really fixing anything

Oct 9 2020, 12:33 PM · Restricted Project

Oct 8 2020

cfang abandoned D88839: SINK: Sink instructions to the block that the current block immediately dominates. .
Oct 8 2020, 10:37 PM · Restricted Project
cfang added reviewers for D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass: sameerds, msearles.
Oct 8 2020, 10:33 PM · Restricted Project
cfang requested review of D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.
Oct 8 2020, 10:04 PM · Restricted Project

Oct 5 2020

cfang requested review of D88839: SINK: Sink instructions to the block that the current block immediately dominates. .
Oct 5 2020, 10:10 AM · Restricted Project

Aug 18 2020

cfang committed rGe7081d117a72: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc (authored by cfang).
Aug 18 2020, 4:28 PM
cfang closed D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.
Aug 18 2020, 4:28 PM · Restricted Project
cfang added inline comments to D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.
Aug 18 2020, 4:03 PM · Restricted Project
cfang added a comment to D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Ping. Are there any further suggestions to move this patch ahead? Thanks.

Aug 18 2020, 3:03 PM · Restricted Project

Aug 17 2020

cfang updated the diff for D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

A minor change to use the register directly instead of getting it from the instruction.

NOTE: this is based on the version in which we generalized the resource register size, following the implementation from GISel.
Aug 17 2020, 10:43 AM · Restricted Project

Aug 16 2020

cfang updated the diff for D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Update based on the reviewers' request to generate the waterfall loop for resource registers of
arbitrary sizes.

Aug 16 2020, 12:34 PM · Restricted Project

Aug 4 2020

cfang added inline comments to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..
Aug 4 2020, 4:02 PM · Restricted Project

Jul 30 2020

cfang added a comment to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

How much improvement does D84890 give vs. this?

Jul 30 2020, 3:21 PM · Restricted Project
cfang updated the diff for D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

Rebase after https://reviews.llvm.org/D84890

Jul 30 2020, 2:51 PM · Restricted Project
cfang committed rG243376cdc7b7: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst (authored by cfang).
Jul 30 2020, 2:38 PM
cfang closed D84890: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst.
Jul 30 2020, 2:37 PM · Restricted Project
cfang updated the diff for D84890: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst.
  1. Add a missing space;
  2. Early exit when it is not an EntryFunction.
Jul 30 2020, 2:20 PM · Restricted Project

Jul 29 2020

cfang added a comment to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

Does MemorySSA have the same problem? Could we just switch this to use MemorySSA?

Disclaimer: I know nothing about this pass or the purpose of this patch, just trying to answer this question.
MemorySSA has its own internal threshold limiting the number of memory instructions that are traversed upwards. It does not care how many blocks those memory instructions are spread over.

Jul 29 2020, 2:49 PM · Restricted Project
cfang added inline comments to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..
Jul 29 2020, 2:41 PM · Restricted Project
cfang requested review of D84890: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst.
Jul 29 2020, 2:37 PM · Restricted Project
cfang abandoned D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.

https://reviews.llvm.org/D84873 addresses the same issue. So this is no longer needed.

Jul 29 2020, 10:19 AM · Restricted Project
cfang requested review of D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..
Jul 29 2020, 10:18 AM · Restricted Project
cfang abandoned D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector.

This is no longer needed.

Jul 29 2020, 10:05 AM

Jul 25 2020

cfang committed rG9162b70e5104: DADCombiner: Don't simplify the token factor if the node's number of operands… (authored by cfang).
Jul 25 2020, 9:22 PM
cfang closed D84204: DADCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit.
Jul 25 2020, 9:22 PM · Restricted Project

Jul 24 2020

cfang added a comment to D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Could you make this generic over the VGPR register class instead? That code duplication is rather annoying.

Jul 24 2020, 5:13 PM · Restricted Project
cfang updated the diff for D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Add waterfall loop tests, and update existing tests.

Jul 24 2020, 5:03 PM · Restricted Project

Jul 23 2020

cfang updated the diff for D84204: DADCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit.

Add a test. Thanks!

Jul 23 2020, 3:34 PM · Restricted Project

Jul 20 2020

Herald added a project to D84204: DADCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit: Restricted Project.
Jul 20 2020, 9:16 PM · Restricted Project

Jun 25 2020

cfang created D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.
Jun 25 2020, 2:44 PM · Restricted Project

Jun 23 2020

cfang added a comment to D72841: Add support for pragma float_control, to control precision and exception behavior at the source level.

The -ffast-math flag got lost in the Builder after this change.

Jun 23 2020, 3:38 PM · Restricted Project, Restricted Project

Jun 8 2020

cfang added inline comments to D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.
Jun 8 2020, 9:00 PM · Restricted Project
cfang added inline comments to D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.
Jun 8 2020, 5:12 PM · Restricted Project
cfang created D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.
Jun 8 2020, 2:58 PM · Restricted Project

Jun 4 2020

cfang added inline comments to D81211: [AMDGPU] Enable structurizer workarounds by default.
Jun 4 2020, 8:56 PM · Restricted Project

May 30 2020

cfang committed rG234eba90f4f3: AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently (authored by cfang).
May 30 2020, 9:14 PM
cfang closed D80853: AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently.
May 30 2020, 9:14 PM · Restricted Project

May 29 2020

cfang created D80853: AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently.
May 29 2020, 4:57 PM · Restricted Project

Feb 27 2020

cfang committed rG8629cfdd7d5c: AMDGPU: fix multipe definitions of DEBUG_TYPE in amd-common (authored by cfang).
Feb 27 2020, 4:15 AM
cfang committed rGee2e845ba7fd: Merge branch amd-master into amd-common (authored by cfang).
Feb 27 2020, 4:14 AM
cfang committed rGe9fea9aba9a8: Merge branch amd-master into amd-common (authored by cfang).
Feb 27 2020, 3:52 AM
cfang committed rG98ae090e0088: AMDGPU/SI: Fix MIN/UMIN Selection in ConvertAtomicLibCalls (authored by cfang).
Feb 27 2020, 3:41 AM
cfang committed rGbfcd3423e569: AMDGPU/SI: implements image load/store intrinsics (authored by cfang).
Feb 27 2020, 3:11 AM
cfang committed rG971896d73390: Merge branch amd-master into amd-common (authored by cfang).
Feb 27 2020, 1:52 AM

Feb 26 2020

cfang committed rG23a2858a123d: Merge branch amd-master into amd-common (authored by cfang).
Feb 26 2020, 11:45 PM

Feb 11 2020

cfang accepted D74408: AMDGPU: Don't create potentially dead rcp declarations.

LGTM

Feb 11 2020, 9:35 AM · Restricted Project

Feb 7 2020

cfang closed D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

commit 2531535984ad989ce88aeee23cb92a827da6686e
Author: Changpeng Fang <changpeng.fang@gmail.com>
Date: Thu Jan 23 16:57:43 2020 -0800

Feb 7 2020, 1:03 PM · Restricted Project
cfang committed rG884acbb9e167: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare (authored by cfang).
Feb 7 2020, 11:49 AM
cfang closed D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 7 2020, 11:49 AM · Restricted Project
cfang committed rG6370c7c13e6d: AMDGPU: Limit the search in finding the instruction pattern for v_swap… (authored by cfang).
Feb 7 2020, 11:12 AM
cfang closed D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 7 2020, 11:12 AM · Restricted Project
cfang updated the diff for D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Fix the LIT test failures.
(Yes, I messed up with a release build check. Thanks for catching these.)

Feb 7 2020, 10:54 AM · Restricted Project

Feb 6 2020

cfang added a comment to D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Where are these failing LIT tests located? I ran the LIT tests before posting for review and also before integrating.
Maybe my check was incomplete. Thanks.

Feb 6 2020, 7:58 PM · Restricted Project
cfang committed rG982780648124: AMDGPU: Limit the search in finding the instruction pattern for v_swap… (authored by cfang).
Feb 6 2020, 4:42 PM
cfang closed D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 6 2020, 4:42 PM · Restricted Project
cfang created D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 6 2020, 4:23 PM · Restricted Project
cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

Updated based on feedback:

Feb 6 2020, 11:07 AM · Restricted Project

Feb 5 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 5 2020, 3:11 PM · Restricted Project
cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

Rename a few functions and variables:

Feb 5 2020, 2:16 PM · Restricted Project
cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 5 2020, 12:10 PM · Restricted Project

Feb 3 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 3 2020, 10:01 AM · Restricted Project

Jan 31 2020

cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

update based on comment.

Jan 31 2020, 2:25 PM · Restricted Project

Jan 30 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 30 2020, 3:16 PM · Restricted Project

Jan 29 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 29 2020, 2:53 PM · Restricted Project

Jan 28 2020

cfang created D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 28 2020, 3:03 PM · Restricted Project

Jan 23 2020

cfang committed rG2531535984ad: AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare (authored by cfang).
Jan 23 2020, 5:01 PM
cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Update based on feedback:

Jan 23 2020, 11:56 AM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 23 2020, 11:51 AM · Restricted Project