
pravinjagtap (Pravin Jagtap)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 2 2022, 7:38 PM (73 w, 2 d)

Recent Activity

Today

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Addressed review comments of @ruiling and @arsenm

Wed, May 31, 5:39 AM · Restricted Project, Restricted Project

Wed, May 24

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Ping & Rebase

Wed, May 24, 6:40 AM · Restricted Project, Restricted Project

Thu, May 18

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Rebased. I think the work is in good shape now. Please let me know if there are any other concerns; if not, we can move forward.

Thu, May 18, 6:51 AM · Restricted Project, Restricted Project
pravinjagtap added inline comments to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..
Thu, May 18, 5:40 AM · Restricted Project, Restricted Project

Thu, May 11

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Rebased.

Thu, May 11, 4:43 AM · Restricted Project, Restricted Project

Apr 29 2023

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Addressed review comments

Apr 29 2023, 2:39 AM · Restricted Project, Restricted Project

Apr 28 2023

pravinjagtap added inline comments to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..
Apr 28 2023, 9:47 PM · Restricted Project, Restricted Project
pravinjagtap added inline comments to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..
Apr 28 2023, 9:44 PM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Thank you @foad for comments. Addressed most of them.

Apr 28 2023, 9:43 PM · Restricted Project, Restricted Project

Apr 25 2023

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Rebased & Ping

Apr 25 2023, 3:33 AM · Restricted Project, Restricted Project

Apr 21 2023

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Introduced a new command line flag, amdgpu-atomic-optimizer-use-dpp, which selects the scan implementation (DPP or iterative).

Apr 21 2023, 12:01 AM · Restricted Project, Restricted Project

Apr 19 2023

pravinjagtap committed rG21a69bdb66e3: [NewPM][AMDGPU] Port amdgpu-atomic-optimizer (authored by pravinjagtap).
[NewPM][AMDGPU] Port amdgpu-atomic-optimizer
Apr 19 2023, 9:32 PM · Restricted Project, Restricted Project
pravinjagtap closed D148628: [NewPM][AMDGPU] Port amdgpu-atomic-optimizer.
Apr 19 2023, 9:31 PM · Restricted Project, Restricted Project
pravinjagtap added inline comments to D148199: [AMDGPU] Add a new command line argument amdgpu-atomic-optimizer-use-dpp.
Apr 19 2023, 2:10 AM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D148628: [NewPM][AMDGPU] Port amdgpu-atomic-optimizer.

Rebased and addressed comments

Apr 19 2023, 1:56 AM · Restricted Project, Restricted Project

Apr 18 2023

pravinjagtap added a reviewer for D148628: [NewPM][AMDGPU] Port amdgpu-atomic-optimizer: gandhi21299.
Apr 18 2023, 6:09 AM · Restricted Project, Restricted Project
pravinjagtap added inline comments to D148628: [NewPM][AMDGPU] Port amdgpu-atomic-optimizer.
Apr 18 2023, 6:07 AM · Restricted Project, Restricted Project
pravinjagtap requested review of D148628: [NewPM][AMDGPU] Port amdgpu-atomic-optimizer.
Apr 18 2023, 5:08 AM · Restricted Project, Restricted Project

Apr 16 2023

pravinjagtap added inline comments to D148199: [AMDGPU] Add a new command line argument amdgpu-atomic-optimizer-use-dpp.
Apr 16 2023, 11:54 PM · Restricted Project, Restricted Project

Apr 13 2023

pravinjagtap added inline comments to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..
Apr 13 2023, 3:58 AM · Restricted Project, Restricted Project
pravinjagtap updated the summary of D148199: [AMDGPU] Add a new command line argument amdgpu-atomic-optimizer-use-dpp.
Apr 13 2023, 3:48 AM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D148199: [AMDGPU] Add a new command line argument amdgpu-atomic-optimizer-use-dpp.

Rebased and addressed review comment about test flag

Apr 13 2023, 3:47 AM · Restricted Project, Restricted Project
pravinjagtap added inline comments to D148199: [AMDGPU] Add a new command line argument amdgpu-atomic-optimizer-use-dpp.
Apr 13 2023, 2:26 AM · Restricted Project, Restricted Project
pravinjagtap requested review of D148199: [AMDGPU] Add a new command line argument amdgpu-atomic-optimizer-use-dpp.
Apr 13 2023, 12:18 AM · Restricted Project, Restricted Project

Apr 12 2023

pravinjagtap added a comment to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Shouldn't this new lowering get enabled for device functions too?

Hello @cdevadas. The current AtomicRMWInst visitor considers only AMDGPUAS::GLOBAL_ADDRESS and AMDGPUAS::LOCAL_ADDRESS as candidates for atomic optimization, and *not* AMDGPUAS::FLAT_ADDRESS. For device functions, I observe that the pointer argument (a device function doing an atomic add must be passed the address) is addrspacecast to AMDGPUAS::FLAT_ADDRESS in the caller (i.e. the global function) before it is passed to the device function. That is why this lowering is not enabled for device functions. I will talk to @b-sumner and @arsenm about how to handle this.
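The reasoning in this comment can be modeled with a small sketch. It is only an illustration, not the pass's code: the helper names `is_candidate` and `space_seen_by_callee` are hypothetical, and the numeric address-space values are assumed to match the AMDGPUAS enum (flat 0, global 1, local 3).

```python
from enum import IntEnum


class AddrSpace(IntEnum):
    # Assumed to mirror AMDGPUAS numbering; treat the values as illustrative.
    FLAT = 0
    GLOBAL = 1
    LOCAL = 3


# Per the comment: only global and local pointers are considered.
OPTIMIZABLE = {AddrSpace.GLOBAL, AddrSpace.LOCAL}


def is_candidate(pointer_space):
    """Hypothetical stand-in for the address-space check in the visitor."""
    return pointer_space in OPTIMIZABLE


def space_seen_by_callee(caller_space):
    """Models the caller casting the pointer to flat before the call."""
    return AddrSpace.FLAT


# Inside the kernel the atomic sees a global pointer and qualifies.
kernel_ok = is_candidate(AddrSpace.GLOBAL)
# Inside the device function the same pointer arrives as flat and is skipped.
device_ok = is_candidate(space_seen_by_callee(AddrSpace.GLOBAL))
```

Under these assumptions `kernel_ok` is true and `device_ok` is false, which matches the behavior described: the atomic inside the device function is never visited as a candidate.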

Apr 12 2023, 2:53 AM · Restricted Project, Restricted Project

Apr 11 2023

pravinjagtap added inline comments to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..
Apr 11 2023, 11:18 PM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Implemented @ruiling's suggestions. In this approach, we iterate over only the active lanes of a wavefront, using llvm.cttz, to precompute an exclusive scan.
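A minimal host-side sketch of that idea follows. It simulates the wavefront in plain Python rather than emitting IR, and it is not the code from the diff: the function names, the list-based lane model, and the use of `None` for inactive lanes are all assumptions made for illustration.

```python
def cttz(mask):
    """Count trailing zeros: index of the lowest set bit (mask must be non-zero)."""
    return (mask & -mask).bit_length() - 1


def iterative_exclusive_scan(values, exec_mask, identity=0, op=lambda a, b: a + b):
    """Visit only active lanes, lowest first; return (per-lane prefix, total)."""
    scan = [None] * len(values)   # inactive lanes are never written
    accumulator = identity
    remaining = exec_mask
    while remaining:
        lane = cttz(remaining)                        # next active lane
        scan[lane] = accumulator                      # models a writelane
        accumulator = op(accumulator, values[lane])   # models a readlane + combine
        remaining &= remaining - 1                    # clear the bit just handled
    return scan, accumulator


# Eight-lane toy wavefront with lanes 2, 4, 5 and 7 active.
values = [5, 7, 11, 13, 17, 19, 23, 29]
scan, total = iterative_exclusive_scan(values, 0b10110100)
```

The loop runs once per active lane, so its trip count depends on the population count of the mask rather than on the wavefront size. Each active lane receives the combined value of the active lanes below it, and `total` is the value a single lane would then use for the atomic operation.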

Apr 11 2023, 6:09 AM · Restricted Project, Restricted Project

Apr 4 2023

pravinjagtap added inline comments to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..
Apr 4 2023, 2:32 AM · Restricted Project, Restricted Project

Apr 3 2023

pravinjagtap updated the diff for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

Addressed @cdevadas's comment. Used isGraphics so that graphics shaders keep the DPP implementation and are guarded against this new iterative approach, which uses readlane and writelane.

Apr 3 2023, 9:03 AM · Restricted Project, Restricted Project
pravinjagtap added a comment to D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..

This is the same as D147303?

Apr 3 2023, 1:54 AM · Restricted Project, Restricted Project
pravinjagtap abandoned D147303: [WIP] Alternative implementation for Scan computations.
Apr 3 2023, 1:53 AM · Restricted Project, Restricted Project

Apr 2 2023

pravinjagtap added a reviewer for D147408: [AMDGPU] Iterative scan implementation for atomic optimizer.: ruiling.
Apr 2 2023, 4:15 AM · Restricted Project, Restricted Project
pravinjagtap requested review of D147408: [AMDGPU] Iterative scan implementation for atomic optimizer..
Apr 2 2023, 2:16 AM · Restricted Project, Restricted Project

Apr 1 2023

pravinjagtap updated the diff for D147303: [WIP] Alternative implementation for Scan computations.

Customizing patch testing

Apr 1 2023, 11:11 PM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D147303: [WIP] Alternative implementation for Scan computations.

Updated the lit tests and applied clang-format. One lit test still needs to be fixed: llvm/test/CodeGen/AMDGPU/noclobber-barrier.ll

Apr 1 2023, 10:16 PM · Restricted Project, Restricted Project

Mar 31 2023

pravinjagtap requested review of D147303: [WIP] Alternative implementation for Scan computations.
Mar 31 2023, 12:04 AM · Restricted Project, Restricted Project

Mar 22 2023

pravinjagtap added a comment to D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy.

I don't think this patch solves any real problem, it just raises a bunch of questions about what you're trying to do.

If you want to read values from inactive lanes of a vgpr robustly then you need something like WWM - but I guess you don't trust the WWM implementation, so you're back to square one.

However... why are you doing readlane inside the part of the code that only has a single active lane? You can write a reduction across all active lanes like this without changing EXEC. (This is the unrolled version but you can put it in a loop if you want; that's irrelevant.)

s_mov s0, 0 ; initialize accumulator
; conditionally add in lane 0
v_readlane s1, v0, 0
s_bitcmp1 exec, 0
s_cselect s1, s1, 0
s_add s0, s0, s1
; conditionally add in lane 1
v_readlane s1, v0, 1
s_bitcmp1 exec, 1
s_cselect s1, s1, 0
s_add s0, s0, s1
...
; conditionally add in lane 31
v_readlane s1, v0, 31
s_bitcmp1 exec, 31
s_cselect s1, s1, 0
s_add s0, s0, s1
; result is in s0

You should be able to generate code like this from regular IR using the readlane intrinsic, which is already marked as Convergent. Once you've done the reduction you can do your atomic operation with only one lane active by generating regular IR like the AtomicOptimizer pass does.
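The unrolled sequence above can be mirrored by a short simulation, written here as a loop. This is a sketch under assumptions, not generated code: the function name is hypothetical, the VGPR is modeled as a Python list with one entry per lane, and EXEC is modeled as an integer bitmask.

```python
def reduce_active_lanes(vgpr, exec_mask, wave_size=32):
    """Sum the values held by active lanes without changing the exec mask."""
    acc = 0                                          # s_mov s0, 0
    for lane in range(wave_size):
        lane_value = vgpr[lane]                      # v_readlane s1, v0, lane
        active = (exec_mask >> lane) & 1             # s_bitcmp1 exec, lane
        contribution = lane_value if active else 0   # s_cselect s1, s1, 0
        acc += contribution                          # s_add s0, s0, s1
    return acc


# 32-lane toy register holding 1..32, with lanes 0, 1, 3 and 31 active.
vgpr = list(range(1, 33))
total = reduce_active_lanes(vgpr, 0x8000000B)
```

Every lane is read regardless of whether it is active, and the select discards the values from inactive lanes, so the result depends only on lanes whose exec bit is set.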

Mar 22 2023, 2:04 AM · Restricted Project, Restricted Project

Mar 21 2023

pravinjagtap added a comment to D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy.

FWIW, there is no desire to read from inactive lanes. The loop is supposed to read from, and write to, only lanes that were active before the for loop is executed by a single selected lane.

Then I'm back to not understanding what this convergent copy is for. I'd need to see a more complete example.

Mar 21 2023, 10:04 PM · Restricted Project, Restricted Project
pravinjagtap added a comment to D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy.

What does this do and what is it for?

This convergent copy intrinsic acts here as a form of barrier, which makes sure that all the active lanes of the VGPR (i.e. the result of the intrinsic) are computed before its use.

Can you give an example?

Mar 21 2023, 9:43 AM · Restricted Project, Restricted Project
pravinjagtap added a comment to D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy.

What does this do and what is it for?

Mar 21 2023, 7:53 AM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy.

Clang format

Mar 21 2023, 6:36 AM · Restricted Project, Restricted Project
pravinjagtap requested review of D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy.
Mar 21 2023, 6:26 AM · Restricted Project, Restricted Project

Feb 18 2022

pravinjagtap updated the diff for D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..

Addressed vpykhtin's comments

Feb 18 2022, 3:23 AM · Restricted Project, Restricted Project

Feb 9 2022

pravinjagtap added a comment to D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..

It looks like you need to update a number of existing tests.

You should also test the case where the GEP inbounds is conditional -- I think it might still be okay based on a dereferenceability argument, but it's not entirely obvious to me.

Thank you for your input. I am not sure I fully understood what you mean by "GEP inbounds is conditional". I am assuming the test should include GEPs both with and without inbounds in the callee, so that the other conditional branch is exercised.

Feb 9 2022, 3:50 AM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..

Added a test which exercises the conditional GEP inbounds

Feb 9 2022, 3:44 AM · Restricted Project, Restricted Project
pravinjagtap updated the diff for D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..

Applied git-clang-format to the change. Sorry for the inconvenience.

Feb 9 2022, 1:54 AM · Restricted Project, Restricted Project

Feb 8 2022

pravinjagtap updated the diff for D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..

Addressed the comments by nikic and fixed the unit tests

Feb 8 2022, 5:56 AM · Restricted Project, Restricted Project
pravinjagtap planned changes to D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..
Feb 8 2022, 5:39 AM · Restricted Project, Restricted Project
pravinjagtap added a comment to D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..

It looks like you need to update a number of existing tests.

You should also test the case where the GEP inbounds is conditional -- I think it might still be okay based on a dereferenceability argument, but it's not entirely obvious to me.

Feb 8 2022, 5:16 AM · Restricted Project, Restricted Project

Feb 7 2022

pravinjagtap added a reviewer for D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers.: rampitec.
Feb 7 2022, 4:08 AM · Restricted Project, Restricted Project
pravinjagtap updated the summary of D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..
Feb 7 2022, 3:43 AM · Restricted Project, Restricted Project
pravinjagtap requested review of D119127: Preserve inbounds information of GEP during Argument Promotion Pass across callee and its callers..
Feb 7 2022, 3:39 AM · Restricted Project, Restricted Project

Feb 6 2022

pravinjagtap abandoned D119102: NFC formating.
Feb 6 2022, 8:27 PM · Restricted Project
pravinjagtap requested review of D119102: NFC formating.
Feb 6 2022, 8:26 PM · Restricted Project

Jan 31 2022

pravinjagtap abandoned D116523: Understanding how to create and upload patch .
Jan 31 2022, 9:08 AM
pravinjagtap abandoned D116650: learning Arc tool.
Jan 31 2022, 9:08 AM · Restricted Project

Jan 5 2022

pravinjagtap requested review of D116650: learning Arc tool.
Jan 5 2022, 4:16 AM · Restricted Project

Jan 4 2022

pravinjagtap committed rG2899e8de67aa: [AMDGPU] Test commit. NFC. (authored by pravinjagtap).
[AMDGPU] Test commit. NFC.
Jan 4 2022, 8:22 PM
pravinjagtap closed D116641: [AMDGPU] Test commit. NFC..
Jan 4 2022, 8:22 PM · Restricted Project
pravinjagtap requested review of D116641: [AMDGPU] Test commit. NFC..
Jan 4 2022, 8:12 PM · Restricted Project

Jan 2 2022

pravinjagtap requested review of D116523: Understanding how to create and upload patch .
Jan 2 2022, 11:49 PM