cfang (Changpeng Fang)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 30 2015, 11:41 AM (255 w, 1 d)

Recent Activity

Mon, Oct 12

cfang added a comment to D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.

Why does this require a commit upstream?

Mon, Oct 12, 10:53 AM · Restricted Project
cfang added a comment to D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.

I agree with Matt here. You should be able to do experiments locally. Perhaps sinking should be disabled entirely, or perhaps sinking should be improved to take register liveness into account.

Mon, Oct 12, 10:40 AM · Restricted Project

Fri, Oct 9

cfang committed rGf192a27ed3ba: Sink: Handle instruction sink when a user is dead (authored by cfang).
Sink: Handle instruction sink when a user is dead
Fri, Oct 9, 4:21 PM
cfang closed D89166: Sink: Handle instruction sink when a user is dead.
Fri, Oct 9, 4:21 PM · Restricted Project
cfang added inline comments to D89166: Sink: Handle instruction sink when a user is dead.
Fri, Oct 9, 3:50 PM · Restricted Project
cfang updated the diff for D89166: Sink: Handle instruction sink when a user is dead.

remove checking for findNearestCommonDominator failures.

Fri, Oct 9, 3:49 PM · Restricted Project
cfang requested review of D89166: Sink: Handle instruction sink when a user is dead.
Fri, Oct 9, 3:38 PM · Restricted Project
cfang added a comment to D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.

Just adding another flag isn't really fixing anything

Fri, Oct 9, 12:33 PM · Restricted Project

Thu, Oct 8

cfang abandoned D88839: SINK: Sink instructions to the block that the current block immediately dominates. .
Thu, Oct 8, 10:37 PM · Restricted Project
cfang added reviewers for D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass: sameerds, msearles.
Thu, Oct 8, 10:33 PM · Restricted Project
cfang requested review of D89095: AMDGPU: Introduce a flag to control enable/disable instruction sink pass.
Thu, Oct 8, 10:04 PM · Restricted Project

Mon, Oct 5

cfang requested review of D88839: SINK: Sink instructions to the block that the current block immediately dominates. .
Mon, Oct 5, 10:10 AM · Restricted Project

Aug 18 2020

cfang committed rGe7081d117a72: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc (authored by cfang).
AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc
Aug 18 2020, 4:28 PM
cfang closed D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.
Aug 18 2020, 4:28 PM · Restricted Project
cfang added inline comments to D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.
Aug 18 2020, 4:03 PM · Restricted Project
cfang added a comment to D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Ping. Are there any further suggestions to move this patch ahead? Thanks.

Aug 18 2020, 3:03 PM · Restricted Project

Aug 17 2020

cfang updated the diff for D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

A minor change to use the register directly instead of getting it from the instruction.

NOTE: this is based on the version in which we generalized the resource register size, following the implementation from GISel.
Aug 17 2020, 10:43 AM · Restricted Project

Aug 16 2020

cfang updated the diff for D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Update based on the reviewers' request to generate the waterfall loop for resource registers of
arbitrary sizes.

Aug 16 2020, 12:34 PM · Restricted Project

Aug 4 2020

cfang added inline comments to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..
Aug 4 2020, 4:02 PM · Restricted Project

Jul 30 2020

cfang added a comment to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

How much improvement does D84890 give vs. this?

Jul 30 2020, 3:21 PM · Restricted Project
cfang updated the diff for D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

Rebase after https://reviews.llvm.org/D84890

Jul 30 2020, 2:51 PM · Restricted Project
cfang committed rG243376cdc7b7: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst (authored by cfang).
AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst
Jul 30 2020, 2:38 PM
cfang closed D84890: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst.
Jul 30 2020, 2:37 PM · Restricted Project
cfang updated the diff for D84890: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst.
  1. Add a missing space;
  2. Early exit when it is not an EntryFunction.
Jul 30 2020, 2:20 PM · Restricted Project

Jul 29 2020

cfang added a comment to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..

Does MemorySSA have the same problem? Could we just switch this to use MemorySSA?

Disclaimer: I know nothing about this pass or the purpose of this patch, just trying to answer this question.
MemorySSA has its own internal threshold limiting the number of memory instructions that are traversed upwards. It does not care how many blocks those memory instructions are spread over.

Jul 29 2020, 2:49 PM · Restricted Project
cfang added inline comments to D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..
Jul 29 2020, 2:41 PM · Restricted Project
cfang requested review of D84890: AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst.
Jul 29 2020, 2:37 PM · Restricted Project
cfang abandoned D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.

https://reviews.llvm.org/D84873 addresses the same issue, so this is no longer needed.

Jul 29 2020, 10:19 AM · Restricted Project
cfang requested review of D84873: AMDGPU: In determining load clobbering in AnnotateUniform, don't scan if there are too many blocks..
Jul 29 2020, 10:18 AM · Restricted Project
cfang abandoned D45228: AMDGPU/SI: Handle BitCast of GEP in promoting alloca to vector.

This is no longer needed.

Jul 29 2020, 10:05 AM

Jul 25 2020

cfang committed rG9162b70e5104: DADCombiner: Don't simplify the token factor if the node's number of operands… (authored by cfang).
DADCombiner: Don't simplify the token factor if the node's number of operands…
Jul 25 2020, 9:22 PM
cfang closed D84204: DADCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit.
Jul 25 2020, 9:22 PM · Restricted Project

Jul 24 2020

cfang added a comment to D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Could you make this generic over the VGPR register class instead? That code duplication is rather annoying.

Jul 24 2020, 5:13 PM · Restricted Project
cfang updated the diff for D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.

Add waterfall loop tests, and update existing tests.

Jul 24 2020, 5:03 PM · Restricted Project

Jul 23 2020

cfang updated the diff for D84204: DADCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit.

Add a test. Thanks!

Jul 23 2020, 3:34 PM · Restricted Project

Jul 20 2020

Herald added a project to D84204: DADCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit: Restricted Project.
Jul 20 2020, 9:16 PM · Restricted Project

Jun 25 2020

cfang created D82603: AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc.
Jun 25 2020, 2:44 PM · Restricted Project

Jun 23 2020

cfang added a comment to D72841: Add support for pragma float_control, to control precision and exception behavior at the source level.

The -ffast-math flag got lost in the Builder after this change.

Jun 23 2020, 3:38 PM · Restricted Project, Restricted Project

Jun 8 2020

cfang added inline comments to D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.
Jun 8 2020, 9:00 PM · Restricted Project
cfang added inline comments to D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.
Jun 8 2020, 5:12 PM · Restricted Project
cfang created D81433: AMDGPU: Restrict the number of instructions to scan for getPointerDependencyFrom.
Jun 8 2020, 2:58 PM · Restricted Project

Jun 4 2020

cfang added inline comments to D81211: [AMDGPU] Enable structurizer workarounds by default.
Jun 4 2020, 8:56 PM · Restricted Project

May 30 2020

cfang committed rG234eba90f4f3: AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently (authored by cfang).
AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently
May 30 2020, 9:14 PM
cfang closed D80853: AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently.
May 30 2020, 9:14 PM · Restricted Project

May 29 2020

cfang created D80853: AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently.
May 29 2020, 4:57 PM · Restricted Project

Feb 27 2020

cfang committed rG8629cfdd7d5c: AMDGPU: fix multipe definitions of DEBUG_TYPE in amd-common (authored by cfang).
AMDGPU: fix multipe definitions of DEBUG_TYPE in amd-common
Feb 27 2020, 4:15 AM
cfang committed rGee2e845ba7fd: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 27 2020, 4:14 AM
cfang committed rGe9fea9aba9a8: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 27 2020, 3:52 AM
cfang committed rG98ae090e0088: AMDGPU/SI: Fix MIN/UMIN Selection in ConvertAtomicLibCalls (authored by cfang).
AMDGPU/SI: Fix MIN/UMIN Selection in ConvertAtomicLibCalls
Feb 27 2020, 3:41 AM
cfang committed rGbfcd3423e569: AMDGPU/SI: implements image load/store intrinsics (authored by cfang).
AMDGPU/SI: implements image load/store intrinsics
Feb 27 2020, 3:11 AM
cfang committed rG971896d73390: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 27 2020, 1:52 AM

Feb 26 2020

cfang committed rG23a2858a123d: Merge branch amd-master into amd-common (authored by cfang).
Merge branch amd-master into amd-common
Feb 26 2020, 11:45 PM

Feb 11 2020

cfang accepted D74408: AMDGPU: Don't create potentially dead rcp declarations.

LGTM

Feb 11 2020, 9:35 AM · Restricted Project

Feb 7 2020

cfang closed D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

commit 2531535984ad989ce88aeee23cb92a827da6686e
Author: Changpeng Fang <changpeng.fang@gmail.com>
Date: Thu Jan 23 16:57:43 2020 -0800

Feb 7 2020, 1:03 PM · Restricted Project
cfang committed rG884acbb9e167: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare (authored by cfang).
AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare
Feb 7 2020, 11:49 AM
cfang closed D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 7 2020, 11:49 AM · Restricted Project
cfang committed rG6370c7c13e6d: AMDGPU: Limit the search in finding the instruction pattern for v_swap… (authored by cfang).
AMDGPU: Limit the search in finding the instruction pattern for v_swap…
Feb 7 2020, 11:12 AM
cfang closed D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 7 2020, 11:12 AM · Restricted Project
cfang updated the diff for D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Fix the LIT test failures.
(Yes, I messed up with a release build check. Thanks for catching these.)

Feb 7 2020, 10:54 AM · Restricted Project

Feb 6 2020

cfang added a comment to D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Where are these failing LIT tests located? I did LIT tests before posting for review and also before integrating.
Maybe my check is incomplete. Thanks.

Feb 6 2020, 7:58 PM · Restricted Project
cfang committed rG982780648124: AMDGPU: Limit the search in finding the instruction pattern for v_swap… (authored by cfang).
AMDGPU: Limit the search in finding the instruction pattern for v_swap…
Feb 6 2020, 4:42 PM
cfang closed D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 6 2020, 4:42 PM · Restricted Project
cfang created D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Feb 6 2020, 4:23 PM · Restricted Project
cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

Updated based on feedback:

Feb 6 2020, 11:07 AM · Restricted Project

Feb 5 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 5 2020, 3:11 PM · Restricted Project
cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

Rename a few functions and variables:

Feb 5 2020, 2:16 PM · Restricted Project
cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 5 2020, 12:10 PM · Restricted Project

Feb 3 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Feb 3 2020, 10:01 AM · Restricted Project

Jan 31 2020

cfang updated the diff for D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .

update based on comment.

Jan 31 2020, 2:25 PM · Restricted Project

Jan 30 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 30 2020, 3:16 PM · Restricted Project

Jan 29 2020

cfang added inline comments to D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 29 2020, 2:53 PM · Restricted Project

Jan 28 2020

cfang created D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare .
Jan 28 2020, 3:03 PM · Restricted Project

Jan 23 2020

cfang committed rG2531535984ad: AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare (authored by cfang).
AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare
Jan 23 2020, 5:01 PM
cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Update based on feedback:

Jan 23 2020, 11:56 AM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 23 2020, 11:51 AM · Restricted Project

Jan 22 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

update based on feedback.

Jan 22 2020, 4:00 PM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 22 2020, 2:31 PM · Restricted Project
cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Update based on the comments.

Jan 22 2020, 12:05 PM · Restricted Project

Jan 21 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Update based on feedback from the reviewer.

Jan 21 2020, 5:08 PM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 21 2020, 9:38 AM · Restricted Project
cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 21 2020, 8:36 AM · Restricted Project

Jan 20 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Implement rcp optimization for fdiv in AMDGPUCodeGenPrepare to insert the amdgcn_rcp intrinsic. For f32 fdiv,
if fpmath metadata is unavailable, we cannot do the rcp optimization unless fast unsafe math is specified.

Jan 20 2020, 4:27 PM · Restricted Project

Jan 10 2020

cfang added inline comments to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Jan 10 2020, 9:53 AM · Restricted Project

Jan 7 2020

cfang updated the diff for D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

Introduce an intrinsic in AMDGPUCodeGenPrepare to generate correctly rounded fdiv32.

Jan 7 2020, 3:14 PM · Restricted Project

Dec 12 2019

cfang added a comment to D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.

The attribute should not be directly checked (we probably shouldn’t even be putting it on the function). The proper thing to check is the fpmath metadata on the individual instruction. This isn’t propagated into the DAG, so AMDGPUCodeGenPrepare inserts intrinsic calls, which isn’t ideal.

So what's your suggestion here? The current logic in AMDGPUCodeGenPrepare is to find cases where we can insert the intrinsic to generate "Faster 2.5 ULP division that does not support denormals."
Otherwise, SIISelLowering will lower FDIV32 according to UnsafeMath and Denorm support.

Dec 12 2019, 1:28 PM · Restricted Project

Dec 10 2019

cfang created D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set.
Dec 10 2019, 11:23 AM · Restricted Project

Oct 25 2019

cfang committed rG1ce552f3ef8d: AMDGPU: Fix the broken dominator tree when creating waterfall loop for resource… (authored by cfang).
AMDGPU: Fix the broken dominator tree when creating waterfall loop for resource…
Oct 25 2019, 1:15 PM
cfang closed D69358: AMDGPU: Fix the broken dominator tree.
Oct 25 2019, 1:15 PM · Restricted Project

Oct 24 2019

cfang added inline comments to D69358: AMDGPU: Fix the broken dominator tree.
Oct 24 2019, 4:36 PM · Restricted Project
cfang updated the diff for D69358: AMDGPU: Fix the broken dominator tree.

Update based on the comments

Oct 24 2019, 4:35 PM · Restricted Project
cfang updated the diff for D69358: AMDGPU: Fix the broken dominator tree.

Add test case.

Oct 24 2019, 9:23 AM · Restricted Project

Oct 23 2019

cfang created D69358: AMDGPU: Fix the broken dominator tree.
Oct 23 2019, 2:49 PM · Restricted Project

Oct 1 2019

cfang committed rGe4ee28d14ce6: AMDGPU: Fix an out of date assert in addressing FrameIndex (authored by cfang).
AMDGPU: Fix an out of date assert in addressing FrameIndex
Oct 1 2019, 4:07 PM

Sep 30 2019

cfang updated the diff for D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.

update the test based on the comment:

  1. use a defined comparison
  2. remove "unreachable", and make it branch to exit.
Sep 30 2019, 1:38 PM · Restricted Project

Sep 26 2019

cfang committed rGf5524f04512d: Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint (authored by cfang).
Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint
Sep 26 2019, 3:54 PM
cfang updated the diff for D58360: Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint.

Rebase and update the diffs.

Sep 26 2019, 3:38 PM · Restricted Project
cfang updated the diff for D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.
  1. Remove the use of structs to simplify the test
  2. generate llc checks with utils/update_llc_test_checks.py
Sep 26 2019, 1:13 PM · Restricted Project

Sep 24 2019

cfang added inline comments to D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.
Sep 24 2019, 3:42 PM · Restricted Project

Sep 13 2019

cfang created D67574: AMDGPU: Fix an out of date assert in addressing FrameIndex.
Sep 13 2019, 3:01 PM · Restricted Project

Aug 26 2019

cfang accepted D66772: AMDGPU: Combine directly on mul24 intrinsics.
Aug 26 2019, 4:13 PM