
cfang (Changpeng Fang)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 30 2015, 11:41 AM (381 w, 4 d)

Recent Activity

Feb 10 2023

cfang committed rG7ca3444fba73: AMDGPU: Use module flag to get code object version at IR level folow-up (authored by cfang).
AMDGPU: Use module flag to get code object version at IR level folow-up
Feb 10 2023, 11:17 AM · Restricted Project, Restricted Project

Feb 9 2023

cfang updated the diff for D143293: AMDGPU: Use module flag to get code object version at IR level folow-up.

define an enum type for code object version number.

Feb 9 2023, 2:56 PM · Restricted Project, Restricted Project

Feb 7 2023

cfang added inline comments to D143293: AMDGPU: Use module flag to get code object version at IR level folow-up.
Feb 7 2023, 1:57 PM · Restricted Project, Restricted Project

Feb 6 2023

cfang updated the diff for D143293: AMDGPU: Use module flag to get code object version at IR level folow-up.

Update based on reviewers' comments. Thanks.

Feb 6 2023, 10:37 AM · Restricted Project, Restricted Project
cfang added a comment to D143293: AMDGPU: Use module flag to get code object version at IR level folow-up.

Do we still need getHsaAbiVersion() and the ELFABIVERSION_AMDGPU_HSA_* constants?

Feb 6 2023, 10:31 AM · Restricted Project, Restricted Project

Feb 3 2023

cfang added inline comments to D143293: AMDGPU: Use module flag to get code object version at IR level folow-up.
Feb 3 2023, 3:25 PM · Restricted Project, Restricted Project
cfang requested review of D143293: AMDGPU: Use module flag to get code object version at IR level folow-up.
Feb 3 2023, 1:54 PM · Restricted Project, Restricted Project

Feb 2 2023

cfang committed rG54cf69c9d54e: AMDGPU: Use module flag to get code object version at IR level (authored by cfang).
AMDGPU: Use module flag to get code object version at IR level
Feb 2 2023, 6:58 PM · Restricted Project, Restricted Project
Herald added projects to D14313: Add a libLTO diagnostic handler that supports lto_get_error_message API: Restricted Project, Restricted Project.
Feb 2 2023, 6:58 PM · Restricted Project, Restricted Project
cfang updated the diff for D143138: AMDGPU: Use module flag to get code object version at IR level.

Update based on Matt's comments.

Feb 2 2023, 6:00 PM · Restricted Project, Restricted Project
cfang updated the diff for D143138: AMDGPU: Use module flag to get code object version at IR level.

Update based on Matt's comments:

  1. Add a test case for out of range code object version;
  2. Move code object version into AMDGPUInformationCache
  3. Pass code object version as an argument in MetadataStreamerMsgPackV3::getHSAKernelProps
  4. Remove unnecessary parens

Feb 2 2023, 4:01 PM · Restricted Project, Restricted Project
cfang added inline comments to D143138: AMDGPU: Use module flag to get code object version at IR level.
Feb 2 2023, 3:55 PM · Restricted Project, Restricted Project
cfang added inline comments to D143138: AMDGPU: Use module flag to get code object version at IR level.
Feb 2 2023, 11:47 AM · Restricted Project, Restricted Project

Feb 1 2023

cfang requested review of D143138: AMDGPU: Use module flag to get code object version at IR level.
Feb 1 2023, 11:20 PM · Restricted Project, Restricted Project

Jan 20 2023

cfang committed rG3bde23c5e0b2: AMDGPU: Put un-initiaized enumerators together in an enum definition. (authored by cfang).
AMDGPU: Put un-initiaized enumerators together in an enum definition.
Jan 20 2023, 2:40 PM · Restricted Project, Restricted Project
cfang updated the diff for D141643: AMDGPU: Put un-initiaized enumerators together in an enum definition..

update test

Jan 20 2023, 12:49 PM · Restricted Project, Restricted Project

Jan 19 2023

cfang updated the diff for D141643: AMDGPU: Put un-initiaized enumerators together in an enum definition..

Add a test.

Jan 19 2023, 4:24 PM · Restricted Project, Restricted Project

Jan 12 2023

cfang added inline comments to D138661: [AMDGPU][MC] Correct handling of mandatory literals.
Jan 12 2023, 4:30 PM · Restricted Project, Restricted Project
cfang requested review of D141643: AMDGPU: Put un-initiaized enumerators together in an enum definition..
Jan 12 2023, 4:22 PM · Restricted Project, Restricted Project

Sep 26 2022

cfang committed rGdee4bc4a4ecc: AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by… (authored by cfang).
AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by…
Sep 26 2022, 9:33 AM · Restricted Project, Restricted Project
cfang closed D134596: AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by opaque pointers.
Sep 26 2022, 9:33 AM · Restricted Project, Restricted Project

Sep 25 2022

cfang updated the diff for D134596: AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by opaque pointers.

Updated based on arsenm's comments.

Sep 25 2022, 4:13 PM · Restricted Project, Restricted Project
cfang added inline comments to D134596: AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by opaque pointers.
Sep 25 2022, 3:56 PM · Restricted Project, Restricted Project

Sep 24 2022

cfang added reviewers for D134596: AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by opaque pointers: arsenm, bcahoon, ronl.
Sep 24 2022, 5:56 PM · Restricted Project, Restricted Project
cfang requested review of D134596: AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by opaque pointers.
Sep 24 2022, 5:55 PM · Restricted Project, Restricted Project

Sep 21 2022

cfang accepted D134355: [AMDGPU] Emit module flag for all code object versions.

Should the module flag name be amdgpu_code_object_version or amdhsa_code_object_version?

Sep 21 2022, 5:35 PM · Restricted Project, Restricted Project

Sep 20 2022

cfang committed rG3ae4c3589ec7: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup… (authored by cfang).
AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup…
Sep 20 2022, 5:27 PM · Restricted Project, Restricted Project
cfang closed D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.
Sep 20 2022, 5:27 PM · Restricted Project, Restricted Project
cfang updated the diff for D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.

Updated based on arsenm's comment to merge two cases.

Sep 20 2022, 5:14 PM · Restricted Project, Restricted Project
cfang added inline comments to D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.
Sep 20 2022, 5:12 PM · Restricted Project, Restricted Project
cfang updated the diff for D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.

Update based on arsenm's comments.

Sep 20 2022, 2:32 PM · Restricted Project, Restricted Project
cfang added inline comments to D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.
Sep 20 2022, 2:29 PM · Restricted Project, Restricted Project

Sep 19 2022

cfang updated the diff for D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.

Merge two ProcessUse functions, and do selection based on code object version.

Sep 19 2022, 4:56 PM · Restricted Project, Restricted Project

Aug 8 2022

cfang updated the diff for D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.

Still use the existing AMDGPU::getAmdhsaCodeObjectVersion() to check code object version.
This is for consistency in the backend. Plan to use module flag in a later patch for all cases.

Aug 8 2022, 11:09 AM · Restricted Project, Restricted Project

Aug 5 2022

cfang added inline comments to D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.
Aug 5 2022, 11:50 PM · Restricted Project, Restricted Project
cfang updated the diff for D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.

Apply the pattern-matching optimizations only where they exist for the given code object version.
Get the code object version from the amdhsa-code-object-version module flag. Note that, for now,
I just declare a static function as a proof of concept, because we don't know what the default
code object version is if the module flag does not exist.
Also, getting the code object version from the module flag everywhere in the compiler is beyond the scope of this work.

Aug 5 2022, 11:46 PM · Restricted Project, Restricted Project
cfang added a comment to D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.

I also think we're still missing a module flag to indicate the code object version

The module flag was implemented quite a while back. See D119026 from February. From a recent compile:

!llvm.module.flags = !{!0, !1, !2}
!0 = !{i32 1, !"amdgpu_code_object_version", i32 500}

We were using --amdhsa-code-object-version=5 to run LIT tests. Do you mean this flag will no longer take effect if we
switch to module flag for code object version?

I think the flag is distinct from the metadata. The module metadata just records which version is being used. You can access it in the code:

auto Ver = mdconst::dyn_extract_or_null<ConstantInt>(M.getModuleFlag("amdgpu_code_object_version"));

and then, if Ver != 5, there is no need to execute the loop with the calls to processImplicitArgUse().
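
The check described in this comment can be sketched without LLVM headers. This is only an illustration under stated assumptions: `ModuleFlagValue`, `codeObjectVersionSketch`, and `shouldProcessImplicitArgs` are hypothetical names, the flag is modeled as an optional integer, and the stored value is assumed to be the version times 100 (the metadata quoted above shows `i32 500` for version 5). The default version is a parameter because the fallback when the flag is missing was still undecided in this thread.

```cpp
#include <optional>

// Hypothetical model of the "amdgpu_code_object_version" module flag.
// std::nullopt stands for a module that does not carry the flag.
using ModuleFlagValue = std::optional<unsigned>;

// Assumption: the flag stores version * 100 (e.g. 500 for version 5).
// When the flag is absent, fall back to a caller-supplied default.
inline unsigned codeObjectVersionSketch(ModuleFlagValue Flag,
                                        unsigned DefaultVersion) {
  return Flag.has_value() ? *Flag / 100 : DefaultVersion;
}

// The implicit-argument loop is only worth running for version 5.
inline bool shouldProcessImplicitArgs(ModuleFlagValue Flag,
                                      unsigned DefaultVersion) {
  return codeObjectVersionSketch(Flag, DefaultVersion) == 5;
}
```

With this model, a flag value of 500 enables the loop, while 400 or a missing flag with a default of 4 skips it.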

Aug 5 2022, 10:22 PM · Restricted Project, Restricted Project
cfang added a comment to D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.

I also think we're still missing a module flag to indicate the code object version

The module flag was implemented quite a while back. See D119026 from February. From a recent compile:

!llvm.module.flags = !{!0, !1, !2}
!0 = !{i32 1, !"amdgpu_code_object_version", i32 500}

Aug 5 2022, 2:37 PM · Restricted Project, Restricted Project
cfang requested review of D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true.
Aug 5 2022, 11:28 AM · Restricted Project, Restricted Project

Jul 28 2022

cfang committed rG2b731b30a7e7: AMDGPU: Take care of "tied" operand when removeOperand (authored by cfang).
AMDGPU: Take care of "tied" operand when removeOperand
Jul 28 2022, 5:32 PM · Restricted Project, Restricted Project
cfang closed D130537: AMDGPU: Take care of "tied" operand when removeOperand.
Jul 28 2022, 5:32 PM · Restricted Project, Restricted Project

Jul 27 2022

cfang updated the diff for D130537: AMDGPU: Take care of "tied" operand when removeOperand.

Merge the new test into frame-index-elimination.ll

Jul 27 2022, 5:08 PM · Restricted Project, Restricted Project
cfang added inline comments to D130537: AMDGPU: Take care of "tied" operand when removeOperand.
Jul 27 2022, 2:57 PM · Restricted Project, Restricted Project
cfang updated the diff for D130537: AMDGPU: Take care of "tied" operand when removeOperand.

Add a mir test, and remove the use of undef in the test.

Jul 27 2022, 2:53 PM · Restricted Project, Restricted Project

Jul 26 2022

cfang updated the diff for D130537: AMDGPU: Take care of "tied" operand when removeOperand.

Rename the LIT test to frame-index-elimination-tied-operand.ll as suggested.

Jul 26 2022, 10:57 AM · Restricted Project, Restricted Project
cfang added inline comments to D130537: AMDGPU: Take care of "tied" operand when removeOperand.
Jul 26 2022, 10:55 AM · Restricted Project, Restricted Project

Jul 25 2022

cfang requested review of D130537: AMDGPU: Take care of "tied" operand when removeOperand.
Jul 25 2022, 7:06 PM · Restricted Project, Restricted Project

Jul 15 2022

cfang updated the diff for D129818: AMDGPU: Make default AMDHSA Code Object Version to be 5.

Update https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Driver/Options.td#L3626
as well as the corresponding LIT tests.

Jul 15 2022, 1:10 PM · Restricted Project, Restricted Project

Jul 14 2022

cfang retitled D129818: AMDGPU: Make default AMDHSA Code Object Version to be 5 from AMDGPU: Make default AMDHSA Code Pbject Version to be 5 to AMDGPU: Make default AMDHSA Code Object Version to be 5.
Jul 14 2022, 5:18 PM · Restricted Project, Restricted Project
cfang requested review of D129818: AMDGPU: Make default AMDHSA Code Object Version to be 5.
Jul 14 2022, 4:29 PM · Restricted Project, Restricted Project

Jul 12 2022

cfang updated subscribers of D128796: [SCCP] Simplify CFG in SCCP as well.
Jul 12 2022, 10:09 AM · Restricted Project, Restricted Project
cfang added a comment to D128796: [SCCP] Simplify CFG in SCCP as well.

This patch triggered a correctness issue when running mixbench-ocl-alt.
I am not familiar with the CFG handling in the SCCP pass at all, but the comments
in the code seem to suggest that we should not change the CFG:

Jul 12 2022, 10:08 AM · Restricted Project, Restricted Project

Apr 13 2022

cfang committed rG8edaf25986a4: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally (authored by cfang).
AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally
Apr 13 2022, 2:32 PM · Restricted Project, Restricted Project
cfang closed D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally.
Apr 13 2022, 2:31 PM · Restricted Project, Restricted Project

Apr 12 2022

cfang added a comment to D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally.

Typos in description: "and also and also", "WE".

Apr 12 2022, 12:38 PM · Restricted Project, Restricted Project
cfang added inline comments to D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally.
Apr 12 2022, 11:41 AM · Restricted Project, Restricted Project
cfang added inline comments to D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally.
Apr 12 2022, 11:24 AM · Restricted Project, Restricted Project

Apr 11 2022

cfang updated the diff for D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally.
  1. Correct the function funcRetrievesMultigridSyncArg;
  2. update after rebase.
Apr 11 2022, 5:53 PM · Restricted Project, Restricted Project
cfang added inline comments to D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally.
Apr 11 2022, 5:19 PM · Restricted Project, Restricted Project
cfang committed rG7f9868f9b765: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5 (authored by cfang).
AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5
Apr 11 2022, 4:13 PM · Restricted Project, Restricted Project
cfang closed D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.
Apr 11 2022, 4:13 PM · Restricted Project, Restricted Project
cfang added reviewers for D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5: sameerds, bcahoon.
Apr 11 2022, 3:06 PM · Restricted Project, Restricted Project
cfang added a reviewer for D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally: Restricted Project.
Apr 11 2022, 3:05 PM · Restricted Project, Restricted Project
cfang added a reviewer for D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally: bcahoon. cfang removed 1 blocking reviewer(s) for D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally: sameerds.
Apr 11 2022, 3:04 PM · Restricted Project, Restricted Project
cfang requested review of D123548: AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally.
Apr 11 2022, 3:02 PM · Restricted Project, Restricted Project
cfang added a comment to D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.

Ping.
We have now agreed that alignTo must be used to align the Offset to what the implicitarg_ptr requires,
and the LIT tests have been updated to show the alignment-related layout. Thanks.

Apr 11 2022, 10:22 AM · Restricted Project, Restricted Project

Apr 8 2022

cfang updated the diff for D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.

Ran the correct "git diff" so that deleted and added files are included in the diff.

Apr 8 2022, 2:53 PM · Restricted Project, Restricted Project
cfang added a comment to D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.

These test changes don't show a change in behavior

Apr 8 2022, 2:51 PM · Restricted Project, Restricted Project
cfang added inline comments to D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.
Apr 8 2022, 2:05 PM · Restricted Project, Restricted Project
cfang added inline comments to D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.
Apr 8 2022, 9:26 AM · Restricted Project, Restricted Project
cfang updated the diff for D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.

Use alignTo to force the alignment for the implicit kernarg segment:
Offset = alignTo(Offset, ST.getAlignmentForImplicitArgPtr());
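
The rounding that this update relies on can be illustrated as follows. This is a minimal sketch: `alignToSketch` is a hypothetical stand-in that rounds an offset up to the next multiple of an alignment, and the offsets in the checks are made-up examples, not values from the patch. The 8-byte alignment comes from the revision title.

```cpp
#include <cstdint>

// Hypothetical stand-in for the alignTo call above: round Offset up to
// the next multiple of Alignment. Alignment must be nonzero.
constexpr std::uint64_t alignToSketch(std::uint64_t Offset,
                                      std::uint64_t Alignment) {
  return (Offset + Alignment - 1) / Alignment * Alignment;
}

// Illustrative offsets only: explicit arguments ending at byte 36 push the
// implicit kernarg segment to byte 40 under an 8-byte alignment.
static_assert(alignToSketch(36, 8) == 40, "rounds up to next multiple of 8");
// An offset that is already aligned is left unchanged.
static_assert(alignToSketch(40, 8) == 40, "aligned offset is unchanged");
static_assert(alignToSketch(0, 8) == 0, "zero stays zero");
```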

Apr 8 2022, 9:24 AM · Restricted Project, Restricted Project
cfang added a comment to D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.

[AMD Official Use Only]

Apr 8 2022, 7:14 AM · Restricted Project, Restricted Project

Apr 7 2022

cfang requested review of D123346: AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5.
Apr 7 2022, 5:04 PM · Restricted Project, Restricted Project
cfang committed rG6733590db284: AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5 (authored by cfang).
AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5
Apr 7 2022, 8:36 AM · Restricted Project, Restricted Project
cfang closed D123262: AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5.
Apr 7 2022, 8:36 AM · Restricted Project, Restricted Project

Apr 6 2022

cfang updated the diff for D123262: AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5.

For the LIT test, add back the GCN check prefix. For the GCN checks that are common to code object version 2 and MESA but different from code object version 5, we split the checks to HSA (version 2) and MESA.

Apr 6 2022, 9:07 PM · Restricted Project, Restricted Project
cfang added inline comments to D123262: AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5.
Apr 6 2022, 8:46 PM · Restricted Project, Restricted Project
cfang updated the diff for D123262: AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5.

update LIT test

Apr 6 2022, 6:26 PM · Restricted Project, Restricted Project
cfang added a comment to D123262: AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5.

This can be tested in the existing llvm/test/CodeGen/AMDGPU/llvm.amdgcn.implicitarg.ptr.ll

Apr 6 2022, 5:05 PM · Restricted Project, Restricted Project
cfang requested review of D123262: AMDGPU: Set implicit kernarg size to be of 256 bytes for code object version 5.
Apr 6 2022, 3:31 PM · Restricted Project, Restricted Project

Mar 31 2022

cfang committed rG1711020c3769: AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be… (authored by cfang).
AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be…
Mar 31 2022, 8:07 AM · Restricted Project, Restricted Project
cfang closed D122778: AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be literal.
Mar 31 2022, 8:07 AM · Restricted Project, Restricted Project

Mar 30 2022

cfang added a reviewer for D122778: AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be literal: Restricted Project.
Mar 30 2022, 7:11 PM · Restricted Project, Restricted Project
cfang added reviewers for D122778: AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be literal: arsenm, bcahoon.
Mar 30 2022, 7:10 PM · Restricted Project, Restricted Project
cfang requested review of D122778: AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be literal.
Mar 30 2022, 7:09 PM · Restricted Project, Restricted Project

Mar 28 2022

cfang committed rG8384ced974c6: [AMDGPU][NFC]: Remove unnecessary MFI functions (authored by cfang).
[AMDGPU][NFC]: Remove unnecessary MFI functions
Mar 28 2022, 12:14 PM · Restricted Project, Restricted Project
cfang closed D122600: [AMDGPU][NFC]: Remove unnecessary MFI functions.
Mar 28 2022, 12:14 PM · Restricted Project, Restricted Project
cfang requested review of D122600: [AMDGPU][NFC]: Remove unnecessary MFI functions.
Mar 28 2022, 10:50 AM · Restricted Project, Restricted Project

Mar 17 2022

cfang committed rGdd5895cc3986: AMDGPU: Use the implicit kernargs for code object version 5 (authored by cfang).
AMDGPU: Use the implicit kernargs for code object version 5
Mar 17 2022, 2:13 PM · Restricted Project
cfang closed D120265: AMDGPU: Use the implicit kernargs for code object version 5.
Mar 17 2022, 2:13 PM · Restricted Project, Restricted Project, Restricted Project
cfang added inline comments to D120265: AMDGPU: Use the implicit kernargs for code object version 5.
Mar 17 2022, 1:59 PM · Restricted Project, Restricted Project, Restricted Project
cfang updated the diff for D120265: AMDGPU: Use the implicit kernargs for code object version 5.

A minor change: add suffix to the enum itself instead of the individual field.
Also remove the "Fixes" field in the summary (commit message).

Mar 17 2022, 12:11 AM · Restricted Project, Restricted Project, Restricted Project
cfang added inline comments to D120265: AMDGPU: Use the implicit kernargs for code object version 5.
Mar 17 2022, 12:07 AM · Restricted Project, Restricted Project, Restricted Project

Mar 16 2022

cfang updated the diff for D120265: AMDGPU: Use the implicit kernargs for code object version 5.

Update based on Matt's comments:

  1. Use buildPtrAdd
  2. Remove a space
  3. Add suffix for the enum definition and also wrap with a namespace
  4. Remove the redundant def of ST (SubTarget)
  5. Updated according to clang-format
Mar 16 2022, 4:15 PM · Restricted Project, Restricted Project, Restricted Project
cfang added inline comments to D120265: AMDGPU: Use the implicit kernargs for code object version 5.
Mar 16 2022, 3:06 PM · Restricted Project, Restricted Project, Restricted Project
cfang added a comment to D120265: AMDGPU: Use the implicit kernargs for code object version 5.

Ping!

Mar 16 2022, 1:56 PM · Restricted Project, Restricted Project, Restricted Project

Mar 14 2022

cfang updated the diff for D120265: AMDGPU: Use the implicit kernargs for code object version 5.
  1. Introduce a common function, SITargetLowering::loadImplicitKernelArgument, which is used in both getSegmentAperture and lowerTrapHsaQueuePtr.

Mar 14 2022, 3:57 PM · Restricted Project, Restricted Project, Restricted Project

Mar 11 2022

cfang updated the diff for D120265: AMDGPU: Use the implicit kernargs for code object version 5.

Rebase and update LIT tests.

Mar 11 2022, 11:24 AM · Restricted Project, Restricted Project, Restricted Project

Mar 9 2022

cfang committed rG0f20a35b9e4b: AMDGPU: Set up User SGPRs for queue_ptr only when necessary (authored by cfang).
AMDGPU: Set up User SGPRs for queue_ptr only when necessary
Mar 9 2022, 10:15 AM · Restricted Project
cfang closed D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary.
Mar 9 2022, 10:15 AM · Restricted Project, Restricted Project