
hsmhsm (Mahesha S)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 14 2020, 12:20 AM (20 w, 2 d)

Recent Activity

Today

hsmhsm updated the diff for D81085: [AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size.

Fixed LLVM lit test regressions.

Thu, Jun 4, 1:03 AM · Restricted Project

Yesterday

hsmhsm updated the summary of D81085: [AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size.
Wed, Jun 3, 4:53 AM · Restricted Project
hsmhsm created D81085: [AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size.
Wed, Jun 3, 4:53 AM · Restricted Project
hsmhsm committed rG29c17ed96ed5: [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width (authored by hsmhsm).
[AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width
Wed, Jun 3, 1:37 AM
hsmhsm closed D80946: [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width.
Wed, Jun 3, 1:36 AM · Restricted Project

Tue, Jun 2

hsmhsm updated the diff for D80946: [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width.

Have taken care of review comments by Jay.

Tue, Jun 2, 10:25 AM · Restricted Project

Mon, Jun 1

hsmhsm added inline comments to D80946: [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width.
Mon, Jun 1, 8:33 PM · Restricted Project
hsmhsm updated the diff for D80946: [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width.

Addressed Matt's second review comment; I think the first one does not need to be addressed.

Mon, Jun 1, 8:33 PM · Restricted Project
hsmhsm added inline comments to D80946: [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width.
Mon, Jun 1, 7:30 PM · Restricted Project
hsmhsm created D80946: [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width.
Mon, Jun 1, 12:25 PM · Restricted Project
hsmhsm committed rG0ed2c046362e: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of… (authored by hsmhsm).
[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of…
Mon, Jun 1, 10:47 AM
hsmhsm closed D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Mon, Jun 1, 10:47 AM · Restricted Project
hsmhsm added a comment to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Thanks Jay, I will take care of these minor comments in an immediate follow-up patch.

Mon, Jun 1, 10:45 AM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Make sure that Width is properly computed within SIInstrInfo::getMemOperandsWithOffsetWidth()

Mon, Jun 1, 8:33 AM · Restricted Project

Sat, May 30

hsmhsm added inline comments to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Sat, May 30, 6:20 AM · Restricted Project

Fri, May 29

hsmhsm added inline comments to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Fri, May 29, 7:34 AM · Restricted Project
hsmhsm added inline comments to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Fri, May 29, 6:29 AM · Restricted Project

Thu, May 28

hsmhsm added inline comments to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Thu, May 28, 7:34 AM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

One of the fixes for review comments caused a unit test failure, so it has been reverted to its original state.

Thu, May 28, 7:34 AM · Restricted Project
hsmhsm added inline comments to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Thu, May 28, 7:02 AM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Have taken care of review comments made by Jay.

Thu, May 28, 6:30 AM · Restricted Project
hsmhsm added a comment to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

PING (since this is somewhat high priority)

Thu, May 28, 3:12 AM · Restricted Project

Wed, May 27

hsmhsm added inline comments to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Wed, May 27, 9:11 AM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Compute Width within SIInstrInfo::getMemOperandsWithOffset()

Wed, May 27, 8:05 AM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

getMemOperandsWithOffset() as implemented by other targets needs to compute Width too.
If a target does not compute it, the default computation is based on the memoperands() API.

Wed, May 27, 6:28 AM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Modified getMemOperandsWithOffset() to pass Width as an argument. Note that
SIInstrInfo::getMemOperandsWithOffset() needs to be extended to compute and
assign Width. That will be implemented in the next patch.

Wed, May 27, 5:22 AM · Restricted Project
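
As a rough illustration of the Width handling described in the D80545 updates above (a target hook that may report Width, with a memory-operand-based default), here is a minimal, self-contained C++ sketch. It does not use LLVM's API: `MemOperand`, `MemInstr`, `getAccessWidth`, and `shouldCluster` are hypothetical names, summing the operand sizes is only an assumed reading of "based on the memoperands() API", and the op and byte limits are made-up parameters rather than values from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine memory operand; only its size matters.
struct MemOperand {
  unsigned SizeInBytes;
};

// Hypothetical stand-in for a memory instruction.
struct MemInstr {
  std::vector<MemOperand> MemOperands;
  unsigned TargetWidth; // 0 means "the target hook did not compute Width"
};

// Returns the access width in bytes: the target-provided value when present,
// otherwise (assumption) the sum of the memory operand sizes as a default.
unsigned getAccessWidth(const MemInstr &MI) {
  if (MI.TargetWidth != 0)
    return MI.TargetWidth;
  unsigned Width = 0;
  for (const MemOperand &MO : MI.MemOperands)
    Width += MO.SizeInBytes;
  return Width;
}

// Decides whether a cluster of NumOps memory ops covering NumBytes bytes is
// acceptable, limiting both the op count and the total clustered bytes.
bool shouldCluster(unsigned NumOps, unsigned NumBytes, unsigned MaxOps,
                   unsigned MaxBytes) {
  assert(MaxOps > 0 && MaxBytes > 0 && "limits must be positive");
  return NumOps <= MaxOps && NumBytes <= MaxBytes;
}
```

In this sketch a caller would accumulate `getAccessWidth()` over the candidate instructions and pass the running totals to `shouldCluster()`, so a cluster can be rejected for being too wide in bytes even when it has few ops.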

Tue, May 26

hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Replace a few if statements with ternary expressions

Tue, May 26, 11:22 PM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Have taken care of second set of review comments (by Matt)

Tue, May 26, 10:50 PM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Have taken care of Matt's review comments

Tue, May 26, 10:16 AM · Restricted Project
hsmhsm added a comment to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Missing tests

Tue, May 26, 9:44 AM · Restricted Project
hsmhsm added inline comments to D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Tue, May 26, 9:11 AM · Restricted Project
hsmhsm updated the diff for D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.

Taken care of clang formatting.

Tue, May 26, 4:17 AM · Restricted Project
hsmhsm committed rG09f7dcb64e1b: [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic (authored by hsmhsm).
[AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic
Tue, May 26, 3:45 AM
hsmhsm closed D80119: [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic.
Tue, May 26, 3:45 AM · Restricted Project
hsmhsm updated the summary of D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Tue, May 26, 3:44 AM · Restricted Project
hsmhsm created D80545: [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes.
Tue, May 26, 3:44 AM · Restricted Project
hsmhsm updated the summary of D80119: [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic.
Tue, May 26, 3:44 AM · Restricted Project

Fri, May 22

hsmhsm added a comment to D80119: [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic.

Hi Jay,

Fri, May 22, 12:52 PM · Restricted Project
hsmhsm updated the diff for D80119: [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic.

Since it is safer to do the clean-up step by step, the previous, somewhat larger
patch has been reverted; this modified patch takes the initial step
towards the clean-up.

Fri, May 22, 12:52 PM · Restricted Project

Mon, May 18

hsmhsm created D80119: [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic.
Mon, May 18, 5:52 AM · Restricted Project

Mar 4 2020

hsmhsm committed rG3fda1fde8f7b: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics (authored by hsmhsm).
AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics
Mar 4 2020, 7:04 PM
hsmhsm closed D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.
Mar 4 2020, 7:03 PM · Restricted Project

Mar 3 2020

hsmhsm committed rGcac068600e55: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code (authored by hsmhsm).
[HIP] Make sure, unused hip-pinned-shadow global var is kept within device code
Mar 3 2020, 9:48 PM
hsmhsm closed D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code.
Mar 3 2020, 9:47 PM · Restricted Project
hsmhsm added a comment to D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code.

Take care of review comments by hliao.

Mar 3 2020, 10:56 AM · Restricted Project

Mar 2 2020

hsmhsm updated the diff for D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

Take care of the latest review comments by Matt about setting the type of the virtual register.

Mar 2 2020, 12:55 PM · Restricted Project
hsmhsm added a reviewer for D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code: hliao.
Mar 2 2020, 11:36 AM · Restricted Project
hsmhsm updated the diff for D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code.

Take care of review comments by hliao.

Mar 2 2020, 11:36 AM · Restricted Project
hsmhsm added a comment to D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code.

BTW, why can that variable not have an initializer? Suppose the initializer is a trivial one, initializing to 0; would that cause any issue in the compilation?

Mar 2 2020, 10:20 AM · Restricted Project
hsmhsm added inline comments to D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code.
Mar 2 2020, 9:56 AM · Restricted Project
hsmhsm added inline comments to D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code.
Mar 2 2020, 9:55 AM · Restricted Project

Mar 1 2020

hsmhsm added a comment to D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

I believe that the debug_trap trap handler support no longer needs the queue_ptr to be passed in, as it is internally computing it from the doorbell ID available from a GETREG together with a doorbell-to-queue mapping maintained by the ROCm Runtime that is accessible through the TMB register.

So should that change be reflected in this to simplify things? This does change the ABI, so it needs the HSA ABI number to be incremented in the ELF header and the AMDGPUUsage documentation updated. However, from the compiler's point of view it is not ABI breaking, as old code will still work: it is just generating code to set the queue_ptr that is unnecessary.

Adding @kzhuravl for ELF ABI version help.

Hi Tony,

If you are asking specifically about the llvm.debugtrap() intrinsic here, then we have already taken care of it, and we are not adding queue_ptr in this case, as you can see from the code changes at AMDGPULegalizerInfo.cpp:3602. The queue_ptr-related discussions here are specific to the llvm.trap() intrinsic only.

Looking at AMDGPULegalizerInfo.cpp:3602, it appears the queue_ptr is still being set up, so I am not sure what you mean by it already being taken care of.

Mar 1 2020, 8:59 PM · Restricted Project

Feb 29 2020

hsmhsm updated the diff for D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

A couple of additional required changes

Feb 29 2020, 8:47 AM · Restricted Project
hsmhsm created D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code.
Feb 29 2020, 12:21 AM · Restricted Project

Feb 24 2020

hsmhsm added a comment to D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

I believe that the debug_trap trap handler support no longer needs the queue_ptr to be passed in, as it is internally computing it from the doorbell ID available from a GETREG together with a doorbell-to-queue mapping maintained by the ROCm Runtime that is accessible through the TMB register.

So should that change be reflected in this to simplify things? This does change the ABI, so it needs the HSA ABI number to be incremented in the ELF header and the AMDGPUUsage documentation updated. However, from the compiler's point of view it is not ABI breaking, as old code will still work: it is just generating code to set the queue_ptr that is unnecessary.

Adding @kzhuravl for ELF ABI version help.

Feb 24 2020, 2:03 AM · Restricted Project

Feb 22 2020

hsmhsm added inline comments to D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.
Feb 22 2020, 10:34 PM · Restricted Project
hsmhsm updated the diff for D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

Take care of further review comments by arsenm.

Feb 22 2020, 10:34 PM · Restricted Project

Feb 19 2020

hsmhsm added inline comments to D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.
Feb 19 2020, 10:09 PM · Restricted Project
hsmhsm updated the diff for D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

Changes as per further review comments:

Feb 19 2020, 10:00 PM · Restricted Project
hsmhsm added inline comments to D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.
Feb 19 2020, 1:17 AM · Restricted Project
hsmhsm updated the diff for D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

Take care of further review comments by arsenm.

Feb 19 2020, 1:08 AM · Restricted Project

Feb 18 2020

hsmhsm added inline comments to D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.
Feb 18 2020, 12:15 AM · Restricted Project

Feb 17 2020

hsmhsm updated the diff for D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

Share the original DAG test with GlobalISel, and do not reinvent a new one here.

Feb 17 2020, 11:48 PM · Restricted Project
hsmhsm added a comment to D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

Regarding sharing the original ISelDAG test with GlobalISel, it does not seem to work, since there are checks like the ones below in the ISelDAG path. Hence, let's keep both tests separate for now.

Feb 17 2020, 12:02 PM · Restricted Project
hsmhsm updated the diff for D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.

Take care of review comments by arsenm.

  1. Introduce an intermediate virtual register copy
  2. Remove DAG tests in the GlobalISel test directory
  3. Add LLVM IR that uses the stack
  4. Assert that the destination register is virtual within loadInputValue()
Feb 17 2020, 11:53 AM · Restricted Project

Feb 16 2020

hsmhsm added a comment to D74527: AMDGPU/GlobalISel: Support llvm.trap intrinsic.

Based on the review comments by arsenm, I have moved the handling of the trap and debugtrap intrinsics to the Legalizer, and submitted a new Phabricator review request - https://reviews.llvm.org/D74688. Hence closing this review request.

You could also have just updated this one with a new diff, instead of creating a separate review. Splitting the review usually makes it harder to track.

Feb 16 2020, 7:56 PM · Restricted Project
hsmhsm abandoned D74527: AMDGPU/GlobalISel: Support llvm.trap intrinsic.

Based on the review comments by arsenm, I have moved the handling of the trap and debugtrap intrinsics to the Legalizer, and submitted a new Phabricator review request - https://reviews.llvm.org/D74688. Hence closing this review request.

Feb 16 2020, 3:35 AM · Restricted Project
hsmhsm created D74688: AMDGPU/GlobalISel: Support llvm.trap and llvm.debugtrap intrinsics.
Feb 16 2020, 3:33 AM · Restricted Project

Feb 13 2020

hsmhsm added inline comments to D74527: AMDGPU/GlobalISel: Support llvm.trap intrinsic.
Feb 13 2020, 9:10 AM · Restricted Project

Feb 12 2020

hsmhsm created D74527: AMDGPU/GlobalISel: Support llvm.trap intrinsic.
Feb 12 2020, 10:53 PM · Restricted Project

Jan 30 2020

hsmhsm committed rG1d9e08ec35a5: [AMDGPU] Add file headers for few files where it is missing. (authored by hsmhsm).
[AMDGPU] Add file headers for few files where it is missing.
Jan 30 2020, 12:44 PM
hsmhsm closed D73417: [AMDGPU] Add file headers for few files where it is missing..
Jan 30 2020, 12:44 PM · Restricted Project

Jan 26 2020

hsmhsm added a reviewer for D73417: [AMDGPU] Add file headers for few files where it is missing.: cdevadas.

Hello CD,

Jan 26 2020, 1:36 AM · Restricted Project

Jan 25 2020

hsmhsm created D73417: [AMDGPU] Add file headers for few files where it is missing..
Jan 25 2020, 7:34 AM · Restricted Project