
rampitec (Stanislav Mekhanoshin)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 4 2014, 4:14 AM (354 w, 3 d)

Recent Activity

Fri, Jan 15

rampitec accepted D94823: AMDGPU: Add occupancy to serialized MachineFunctionInfo.
Fri, Jan 15, 1:40 PM · Restricted Project
rampitec accepted D94777: [AMDGPU][MC] Refactored parsing of dpp ctrl.

LGTM

Fri, Jan 15, 11:34 AM · Restricted Project
rampitec accepted D94756: [AMDGPU][MC][GFX10] Improved dpp8 errors handling.

LGTM

Fri, Jan 15, 11:34 AM · Restricted Project
rampitec added inline comments to D94777: [AMDGPU][MC] Refactored parsing of dpp ctrl.
Fri, Jan 15, 10:53 AM · Restricted Project
rampitec added inline comments to D94756: [AMDGPU][MC][GFX10] Improved dpp8 errors handling.
Fri, Jan 15, 10:48 AM · Restricted Project
rampitec added inline comments to D94585: [IndirectFunctions] Skip propagating attributes to address taken functions.
Fri, Jan 15, 10:47 AM · Restricted Project

Wed, Jan 13

rampitec added a comment to D94585: [IndirectFunctions] Skip propagating attributes to address taken functions.

With this patch you would set features on an address-taken function and ignore the whole call stack below it, i.e. its own callees will not be processed. I think you need to continue the traversal and just skip the actual setting of attributes on such a function. Setting these attributes on the functions it may call in turn should be fine.

Wed, Jan 13, 10:37 AM · Restricted Project
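
The traversal suggested in the comment above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from D94585: the `Fn` struct, the `Module` map, and the `propagate` function are hypothetical stand-ins rather than LLVM classes. It walks the call graph from a root, skips only the attribute update on address-taken functions, and still visits their callees.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a function in a module (not llvm::Function).
struct Fn {
  bool addressTaken = false;          // true if the function's address escapes
  std::vector<std::string> callees;   // names of directly called functions
  std::set<std::string> attrs;        // attributes set on this function
};

// Hypothetical module: function name -> function.
using Module = std::map<std::string, Fn>;

// Propagate `attr` from `root` down the call graph. Address-taken functions
// are left unmodified, but the walk continues into their callees.
void propagate(Module &m, const std::string &root, const std::string &attr) {
  std::set<std::string> visited;
  std::vector<std::string> worklist{root};
  while (!worklist.empty()) {
    std::string name = worklist.back();
    worklist.pop_back();
    if (!visited.insert(name).second)
      continue; // already processed, also guards against recursion cycles
    auto it = m.find(name);
    assert(it != m.end() && "callee must be defined in the module");
    Fn &f = it->second;
    if (!f.addressTaken)
      f.attrs.insert(attr); // skip only the update, not the traversal
    for (const std::string &callee : f.callees)
      worklist.push_back(callee);
  }
}
```

With a chain such as kernel, address-taken function, leaf, the kernel and the leaf receive the attribute while the address-taken function in the middle is left untouched.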

Mon, Jan 11

rampitec accepted D94406: [AMDGPU] Fix failing assert with scratch ST mode.
Mon, Jan 11, 10:30 AM · Restricted Project
rampitec accepted D94358: [NFC][AMDGPU] Clarify memory model support for volatile.
Mon, Jan 11, 10:11 AM · Restricted Project

Fri, Jan 8

rampitec added a reviewer for D94341: [AMDGPU] Add _e64 suffix to VOP3 Insts: arsenm.
Fri, Jan 8, 2:09 PM · Restricted Project
rampitec added a reviewer for D94341: [AMDGPU] Add _e64 suffix to VOP3 Insts: dp.
Fri, Jan 8, 2:09 PM · Restricted Project
rampitec accepted D94214: [AMDGPU] Add volatile support to SIMemoryLegalizer.
Fri, Jan 8, 12:02 PM · Restricted Project

Thu, Jan 7

rampitec added a comment to D94214: [AMDGPU] Add volatile support to SIMemoryLegalizer.

LGTM. Make sure to rerun tests before push as a lot of tests are affected.

Thu, Jan 7, 11:46 AM · Restricted Project

Wed, Jan 6

rampitec added inline comments to D94153: [AMDGPU][Inliner] Remove amdgpu-inline and add new TTI inline hooks.
Wed, Jan 6, 10:33 AM · Restricted Project
rampitec accepted D93813: [NFC][AMDGPU] Reduce include files dependency..

LGTM

Wed, Jan 6, 10:28 AM · Restricted Project
rampitec added inline comments to D94153: [AMDGPU][Inliner] Remove amdgpu-inline and add new TTI inline hooks.
Wed, Jan 6, 10:24 AM · Restricted Project

Mon, Jan 4

rampitec accepted D94020: [AMDGPU] Remove deprecated V_MUL_LO_I32 from GFX10.

LGTM

Mon, Jan 4, 4:26 PM · Restricted Project
rampitec requested changes to D94020: [AMDGPU] Remove deprecated V_MUL_LO_I32 from GFX10.
Mon, Jan 4, 4:22 PM · Restricted Project
rampitec added a comment to D94020: [AMDGPU] Remove deprecated V_MUL_LO_I32 from GFX10.

Should the disassembler still understand it?

Mon, Jan 4, 11:06 AM · Restricted Project
rampitec added a comment to D94020: [AMDGPU] Remove deprecated V_MUL_LO_I32 from GFX10.

Generally speaking it is deprecated but it is supported. Well, not in gfx1030.

Mon, Jan 4, 11:01 AM · Restricted Project
rampitec accepted D94010: [AMDGPU] Handle v_fmac_legacy_f32 in SIFoldOperands.
Mon, Jan 4, 10:53 AM · Restricted Project
rampitec accepted D94009: [AMDGPU] Split out new helper function macToMad in SIFoldOperands. NFC..
Mon, Jan 4, 10:50 AM · Restricted Project
rampitec added inline comments to D93813: [NFC][AMDGPU] Reduce include files dependency..
Mon, Jan 4, 10:46 AM · Restricted Project

Wed, Dec 23

rampitec committed rG747f67e034a9: [AMDGPU] Fix adjustWritemask subreg handling (authored by rampitec).
[AMDGPU] Fix adjustWritemask subreg handling
Wed, Dec 23, 2:44 PM
rampitec closed D93782: [AMDGPU] Fix adjustWritemask subreg handling.
Wed, Dec 23, 2:43 PM · Restricted Project
rampitec requested review of D93782: [AMDGPU] Fix adjustWritemask subreg handling.
Wed, Dec 23, 2:24 PM · Restricted Project
rampitec added inline comments to D55301: RegAlloc: Allow targets to split register allocation.
Wed, Dec 23, 10:48 AM · Restricted Project
rampitec added inline comments to D55301: RegAlloc: Allow targets to split register allocation.
Wed, Dec 23, 10:29 AM · Restricted Project
rampitec accepted D93757: [AMDGPU][MC] Improved diagnostics messages for v_interp* operands.
Wed, Dec 23, 9:47 AM · Restricted Project
rampitec accepted D93756: [AMDGPU][MC][NFC] Parser refactoring.
Wed, Dec 23, 9:45 AM · Restricted Project

Tue, Dec 22

rampitec committed rGd15119a02d92: [AMDGPU][GlobalISel] GlobalISel for flat scratch (authored by rampitec).
[AMDGPU][GlobalISel] GlobalISel for flat scratch
Tue, Dec 22, 4:45 PM
rampitec closed D93670: [AMDGPU][GlobalISel] GlobalISel for flat scratch.
Tue, Dec 22, 4:45 PM · Restricted Project
rampitec committed rGca4bf58e4ee5: [AMDGPU] Support unaligned flat scratch in TLI (authored by rampitec).
[AMDGPU] Support unaligned flat scratch in TLI
Tue, Dec 22, 4:31 PM
rampitec closed D93669: [AMDGPU] Support unaligned flat scratch in TLI.
Tue, Dec 22, 4:31 PM · Restricted Project
rampitec updated the diff for D93669: [AMDGPU] Support unaligned flat scratch in TLI.

Added run line to the unaligned-load-store.ll.

Tue, Dec 22, 1:54 PM · Restricted Project
rampitec added a comment to D93669: [AMDGPU] Support unaligned flat scratch in TLI.

I think there's some missing tests, these test changes look incidental. Can you add some checks to the existing unaligned load/store base tests

Which tests do you mean? As far as I understand this mostly affects GlobalISel and child patch D93670 contains the actual test, all unaligned cases.

This should cover both. I mean unaligned-load-store.ll (and some others, it's annoying how the test names for these aren't consistent and there is redundancy between files)

It should not change without flat scratch enabled?

Tue, Dec 22, 1:34 PM · Restricted Project
rampitec added a comment to D93669: [AMDGPU] Support unaligned flat scratch in TLI.

I think there's some missing tests, these test changes look incidental. Can you add some checks to the existing unaligned load/store base tests

Which tests do you mean? As far as I understand this mostly affects GlobalISel and child patch D93670 contains the actual test, all unaligned cases.

This should cover both. I mean unaligned-load-store.ll (and some others, it's annoying how the test names for these aren't consistent and there is redundancy between files)

Tue, Dec 22, 1:30 PM · Restricted Project
rampitec added a comment to D93669: [AMDGPU] Support unaligned flat scratch in TLI.

I think there's some missing tests, these test changes look incidental. Can you add some checks to the existing unaligned load/store base tests

Tue, Dec 22, 1:20 PM · Restricted Project
rampitec accepted D92483: AMDGPU - Use MUBUF instructions for global address space access.

LGTM, but please check with Tony for the documentation.

Tue, Dec 22, 12:45 PM · Restricted Project
rampitec added a reviewer for D93669: [AMDGPU] Support unaligned flat scratch in TLI: sebastian-ne.
Tue, Dec 22, 12:35 PM · Restricted Project
rampitec added a comment to D92483: AMDGPU - Use MUBUF instructions for global address space access.

I like what Scott describes as it seems intuitively obvious and cleaner. Are there any concerns for making these changes?

Tue, Dec 22, 11:04 AM · Restricted Project
rampitec committed rGae8f4b2178c4: [AMDGPU] Folding of FI operand with flat scratch (authored by rampitec).
[AMDGPU] Folding of FI operand with flat scratch
Tue, Dec 22, 10:48 AM
rampitec closed D93501: [AMDGPU] Folding of FI operand with flat scratch.
Tue, Dec 22, 10:48 AM · Restricted Project
rampitec added inline comments to D93715: AMDGPU: Don't fold AGPR copy pairs that need a temp VGPR.
Tue, Dec 22, 10:46 AM · Restricted Project
rampitec added inline comments to D93670: [AMDGPU][GlobalISel] GlobalISel for flat scratch.
Tue, Dec 22, 10:36 AM · Restricted Project
rampitec added inline comments to D93669: [AMDGPU] Support unaligned flat scratch in TLI.
Tue, Dec 22, 10:34 AM · Restricted Project
rampitec accepted D93692: [AMDGPU][GlobalISel] Fold flat vgpr + constant addresses.

LGTM

Tue, Dec 22, 10:31 AM · Restricted Project
rampitec added inline comments to D93670: [AMDGPU][GlobalISel] GlobalISel for flat scratch.
Tue, Dec 22, 10:29 AM · Restricted Project

Mon, Dec 21

rampitec requested review of D93670: [AMDGPU][GlobalISel] GlobalISel for flat scratch.
Mon, Dec 21, 5:08 PM · Restricted Project
rampitec requested review of D93669: [AMDGPU] Support unaligned flat scratch in TLI.
Mon, Dec 21, 4:01 PM · Restricted Project
rampitec added a comment to D92483: AMDGPU - Use MUBUF instructions for global address space access.

We did discuss this further and decided that gfx6* should report it supports amdhsa with the restriction that it does not support generic addresses. AMDGPUUsage already notes the restriction in all relevant sections. The LIT tests would also need updating accordingly. The COMgr tests would also need updating.

AMDGPUUsage does not list amdhsa for all 3 SI targets, yet it is going to be accepted with this patch. If that is so then amdhsa has to be listed in the table for these targets.

The table is indicating what runtimes support which ABIs. Since ROCm does not support gfx6* it would still be correct to not list it in the table. It may be useful to have another column to indicate which ABIs are supported by which processors.

Mon, Dec 21, 3:09 PM · Restricted Project
rampitec added a comment to D92483: AMDGPU - Use MUBUF instructions for global address space access.

I updated AMDGPUUsage to reflect the current state which is that gfx6* does not support amdhsa. But if this patch allows gfx6* to be supported without generic addresses then that table should also be updated as part of the patch.

Mon, Dec 21, 2:40 PM · Restricted Project
rampitec accepted D93652: AMDGPU: Fix assert when checking for implicit operand legality.
Mon, Dec 21, 11:56 AM · Restricted Project
rampitec accepted D93652: AMDGPU: Fix assert when checking for implicit operand legality.

LGTM

Mon, Dec 21, 11:12 AM · Restricted Project
rampitec added a comment to D92483: AMDGPU - Use MUBUF instructions for global address space access.

But I still do not think it is the right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support the AMDHSA OS because they are capable of supporting OpenCL 1.2, which does not need generic pointers. This topic was discussed in the amdgcn weekly, and it was decided to use MUBUF instructions for the global address space on gfx60x. The same is also documented in the AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

At one point, while you were on vacation, the consensus was "support amdhsa OS with gfx6, and fail to compile in the presence of e.g. generic pointers".

Has there been more discussion since then?

Mon, Dec 21, 9:19 AM · Restricted Project

Dec 18 2020

rampitec accepted D93550: [AMDGPU][MC][NFC] Lit tests cleanup.
Dec 18 2020, 12:23 PM · Restricted Project
rampitec accepted D93548: [AMDGPU][MC][NFC] Parser refactoring.
Dec 18 2020, 12:22 PM · Restricted Project
rampitec accepted D93551: AMDGPU: Add spilled CSR SGPRs to entry block live ins.
Dec 18 2020, 12:18 PM · Restricted Project
rampitec requested changes to D92483: AMDGPU - Use MUBUF instructions for global address space access.

But I still do not think it is the right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support the AMDHSA OS because they are capable of supporting OpenCL 1.2, which does not need generic pointers. This topic was discussed in the amdgcn weekly, and it was decided to use MUBUF instructions for the global address space on gfx60x. The same is also documented in the AMDGPU User Guide earlier.

Dec 18 2020, 12:16 PM · Restricted Project

Dec 17 2020

rampitec requested review of D93501: [AMDGPU] Folding of FI operand with flat scratch.
Dec 17 2020, 4:57 PM · Restricted Project
rampitec accepted D93440: [NFC][AMDGPU] Format change to processr table in AMGPUUsage.rst.
Dec 17 2020, 10:45 AM · Restricted Project

Dec 16 2020

rampitec accepted D93440: [NFC][AMDGPU] Format change to processr table in AMGPUUsage.rst.
Dec 16 2020, 11:39 PM · Restricted Project
rampitec accepted D93302: Disable Jump Threading for the targets with divergent control flow.

LGTM, but please fix typo in the commit message before submit: "unniform".

Dec 16 2020, 2:03 PM · Restricted Project
rampitec added a comment to D92483: AMDGPU - Use MUBUF instructions for global address space access.

If mcpu is specified, set Gen to the correct generation that the CPU belongs to, by initializing it with SOUTHERN_ISLANDS.
If mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to SEA_ISLANDS to be in parity with the previous code.
> Comments inline

You are forcing SI for any SPECIFIED GPU instead:

!GPU.contains("generic") ? SOUTHERN_ISLANDS

Read this code line: if CPU is NOT generic, force it to SI.

Dec 16 2020, 1:01 PM · Restricted Project
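
To make the reading of that expression concrete, here is a small standalone sketch of what the initial value evaluates to for different GPU names. It is an illustration under assumptions, not the code from D92483: `initialGen` is a hypothetical helper, `std::string::find` stands in for the `contains` call in the quoted line, and the `SEA_ISLANDS` fallback is taken from the quoted description. Whether later feature processing overrides this initial value is not shown in the thread.

```cpp
#include <cassert>
#include <string>

// Only the two generations mentioned in the discussion are modeled here.
enum Generation { SOUTHERN_ISLANDS, SEA_ISLANDS };

// Hypothetical helper mirroring the quoted expression:
//   !GPU.contains("generic") ? SOUTHERN_ISLANDS : <fallback>
// Any name without "generic" in it starts out as SOUTHERN_ISLANDS.
Generation initialGen(const std::string &gpu) {
  assert(!gpu.empty() && "sketch expects a non-empty GPU name");
  bool isGeneric = gpu.find("generic") != std::string::npos;
  return !isGeneric ? SOUTHERN_ISLANDS : SEA_ISLANDS;
}
```

Under this sketch, `initialGen("gfx900")` and `initialGen("tahiti")` both start as `SOUTHERN_ISLANDS`, and only `initialGen("generic-hsa")` yields `SEA_ISLANDS`, which is the behavior the comment objects to.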
rampitec added a comment to D92483: AMDGPU - Use MUBUF instructions for global address space access.

If mcpu is specified, set Gen to the correct generation that the CPU belongs to, by initializing it with SOUTHERN_ISLANDS.
If mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to SEA_ISLANDS to be in parity with the previous code.
> Comments inline

Dec 16 2020, 12:03 PM · Restricted Project
rampitec accepted D93403: AMDGPU: Remove SGPRSpillVGPRDefinedSet hack.
Dec 16 2020, 11:56 AM · Restricted Project

Dec 15 2020

rampitec committed rGeb66bf0802f9: [AMDGPU] Print SCRATCH_EN field after the kernel (authored by rampitec).
[AMDGPU] Print SCRATCH_EN field after the kernel
Dec 15 2020, 10:45 PM
rampitec closed D93353: [AMDGPU] Print SCRATCH_EN field after the kernel.
Dec 15 2020, 10:44 PM · Restricted Project
rampitec requested review of D93353: [AMDGPU] Print SCRATCH_EN field after the kernel.
Dec 15 2020, 3:33 PM · Restricted Project
rampitec accepted D93288: [AMDGPU] Allow no saddr for global addtid insts.

LGTM

Dec 15 2020, 10:26 AM · Restricted Project
rampitec added a comment to D93302: Disable Jump Threading for the targets with divergent control flow.

Needs test in Transforms/JumpThreading.
Typo in the commit message "unniform".

Dec 15 2020, 10:18 AM · Restricted Project
rampitec accepted D93271: [AMDGPU] Clarify scratch initialization.

LGTM, thanks!

Dec 15 2020, 9:57 AM · Restricted Project

Dec 14 2020

rampitec committed rGcf5845d6c428: [AMDGPU] Use multi-dword flat scratch for spilling (authored by rampitec).
[AMDGPU] Use multi-dword flat scratch for spilling
Dec 14 2020, 2:20 PM
rampitec closed D93067: [AMDGPU] Use multi-dword flat scratch for spilling.
Dec 14 2020, 2:19 PM · Restricted Project
rampitec added inline comments to D93067: [AMDGPU] Use multi-dword flat scratch for spilling.
Dec 14 2020, 11:01 AM · Restricted Project
rampitec updated the diff for D93067: [AMDGPU] Use multi-dword flat scratch for spilling.

Removed CHAR_WIDTH.

Dec 14 2020, 11:01 AM · Restricted Project
rampitec committed rG87d7757bbe14: [SLP] Control maximum vectorization factor from TTI (authored by rampitec).
[SLP] Control maximum vectorization factor from TTI
Dec 14 2020, 8:50 AM
rampitec closed D92059: [SLP] Control maximum vectorization factor from TTI.
Dec 14 2020, 8:49 AM · Restricted Project
rampitec accepted D93202: [AMDGPU] Make use of HasSMemRealTime predicate. NFC..

LGTM

Dec 14 2020, 8:31 AM · Restricted Project

Dec 11 2020

rampitec updated the diff for D92059: [SLP] Control maximum vectorization factor from TTI.

Added 0=unlimited to the option description.

Dec 11 2020, 4:42 PM · Restricted Project
rampitec added a comment to D92059: [SLP] Control maximum vectorization factor from TTI.

ping.

Dec 11 2020, 12:05 PM · Restricted Project
rampitec added inline comments to D92483: AMDGPU - Use MUBUF instructions for global address space access.
Dec 11 2020, 11:42 AM · Restricted Project
rampitec added inline comments to D93067: [AMDGPU] Use multi-dword flat scratch for spilling.
Dec 11 2020, 10:51 AM · Restricted Project

Dec 10 2020

rampitec added inline comments to D93067: [AMDGPU] Use multi-dword flat scratch for spilling.
Dec 10 2020, 3:01 PM · Restricted Project
rampitec updated the diff for D93067: [AMDGPU] Use multi-dword flat scratch for spilling.
Dec 10 2020, 3:01 PM · Restricted Project
rampitec requested review of D93067: [AMDGPU] Use multi-dword flat scratch for spilling.
Dec 10 2020, 1:32 PM · Restricted Project
rampitec added inline comments to D92483: AMDGPU - Use MUBUF instructions for global address space access.
Dec 10 2020, 11:30 AM · Restricted Project

Dec 9 2020

rampitec committed rG4617cc68f64a: [AMDGPU] Fix expansion of 192 bit spills in PEI (authored by rampitec).
[AMDGPU] Fix expansion of 192 bit spills in PEI
Dec 9 2020, 4:37 PM
rampitec closed D92979: [AMDGPU] Fix expansion of 192 bit spills in PEI.
Dec 9 2020, 4:36 PM · Restricted Project
rampitec requested review of D92979: [AMDGPU] Fix expansion of 192 bit spills in PEI.
Dec 9 2020, 3:59 PM · Restricted Project

Dec 8 2020

rampitec accepted D91048: [AMDGPU] Add new pseudos for indirect addressing with VGPR Indexing.

LGTM. Not sure why MODE was also defined by set_gpr_idx, but that can be corrected later if needed.

Dec 8 2020, 12:17 PM · Restricted Project
rampitec added inline comments to D92483: AMDGPU - Use MUBUF instructions for global address space access.
Dec 8 2020, 12:16 PM · Restricted Project

Dec 7 2020

rampitec added a comment to D91048: [AMDGPU] Add new pseudos for indirect addressing with VGPR Indexing.

JBTW, having a need to def M0 probably defeats the idea of potential rescheduling which justifies the separation of the indirect access methods.

The GPR_IDX variants are where we are speaking of adding the def M0 right? These are not expanded until after the scheduler. The MOVREL version's M0 def is expanded earlier during ISel finalization.

My thinking was that we wanted rescheduling the M0 def with the MOVREL version but not the GPR_IDX version and that combining these methods wouldn't allow that.

Dec 7 2020, 8:41 PM · Restricted Project
rampitec added a comment to D91048: [AMDGPU] Add new pseudos for indirect addressing with VGPR Indexing.

JBTW, having a need to def M0 probably defeats the idea of potential rescheduling which justifies the separation of the indirect access methods.

Dec 7 2020, 8:15 PM · Restricted Project
rampitec added inline comments to D91048: [AMDGPU] Add new pseudos for indirect addressing with VGPR Indexing.
Dec 7 2020, 8:13 PM · Restricted Project
rampitec added inline comments to D91048: [AMDGPU] Add new pseudos for indirect addressing with VGPR Indexing.
Dec 7 2020, 3:48 PM · Restricted Project
rampitec accepted D92665: RegisterCoalescer: Remove phi-only subranges when erasing identity copies.

I think I've seen similar cases before. LGTM, but will be nice if @qcolombet will look at this too.

Dec 7 2020, 3:40 PM · Restricted Project
rampitec added a comment to D91487: [AMDGPU] Don't require swz operand for non-return Atomics..

Needs test.

Hi @rampitec,

I've tried writing a test for this change, but I've been having trouble (I'm new to LLVM so I might be missing the obvious answer). From what I can see, I thought the correct option would be to create a MIR test, using the same instruction that I saw when investigating the original problem. I started by copying and modifying an existing test in fix-sgpr-copies.mir:

Dec 7 2020, 3:31 PM · Restricted Project
rampitec added a comment to D92059: [SLP] Control maximum vectorization factor from TTI.

@rampitec : It would be nice to see this go in soon...

Dec 7 2020, 3:13 PM · Restricted Project
rampitec updated the diff for D92059: [SLP] Control maximum vectorization factor from TTI.

Updated option description.
Updated callback comment to clarify this is used by SLP only now.
Dropped BE test.
Added a store to the general test to make it useful if anyone needs to run the AMDGPU BE on it too.

Dec 7 2020, 3:12 PM · Restricted Project