vpykhtin (Valery Pykhtin)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 28 2016, 8:30 AM (89 w, 5 d)

Recent Activity

Tue, Sep 19

vpykhtin accepted D38014: [AMDGPU] Prevent post-RA scheduler from breaking memory clauses.

LGTM.

Tue, Sep 19, 1:33 PM

Sep 12 2017

vpykhtin accepted D37698: Allow target to decide when to cluster loads/stores in misched.

SIInstrInfo::doMemOpsHaveSameBasePtr does look generic enough to be the default, but should we commit this first and then make doMemOpsHaveSameBasePtr the default, so we can roll back to this one in case of severe regressions? LGTM, by the way.

Sep 12 2017, 9:07 AM

Sep 8 2017

vpykhtin accepted D37594: AMDGPU: Start using !con operator.

Wow! I wish I had this before! LGTM.

Sep 8 2017, 7:36 AM

Sep 6 2017

vpykhtin accepted D37502: [AMDGPU] Fix shouldClusterMemOps to process flat loads.

LGTM. However, it looks like we should fix TD to follow a single naming convention.

Sep 6 2017, 7:22 AM

Sep 4 2017

vpykhtin added inline comments to D36831: [AMDGPU] Transform __read_pipe_* and __write_pipe_*.
Sep 4 2017, 3:15 AM

Sep 1 2017

vpykhtin accepted D36831: [AMDGPU] Transform __read_pipe_* and __write_pipe_*.

OK, the unmangled part does look different. If the issue with pre-link checking is solved, this patch is OK with me.

Sep 1 2017, 9:32 AM
vpykhtin added a comment to D36831: [AMDGPU] Transform __read_pipe_* and __write_pipe_*.

Splitting AMDGPULibFunc into two classes looks like huge overkill. How about modifying AMDGPULibFunc::parse so that it accepts unmangled names and just returns an enum id for the function (using some fast lookup approach)? Type info for such functions can be left unpopulated, to be handled by the client (as in fold_read_write_pipe).

Sep 1 2017, 6:36 AM

Aug 28 2017

vpykhtin accepted D37223: [AMDGPU] Fix regression in AMDGPULibCalls allowing native for doubles.

LGTM.

Aug 28 2017, 10:56 AM

Aug 11 2017

vpykhtin accepted D36436: [AMDGPU] Ported and adopted AMDLibCalls pass.

LGTM.

Aug 11 2017, 8:35 AM

Jul 21 2017

vpykhtin added a comment to D35435: [AMDGPU] Produce flat|global_dwordx3 instructions.

The implementation of this approach looks good to me. The only question is which way to go to implement v3 vector.

Jul 21 2017, 12:49 PM
vpykhtin accepted D35710: AMDGPU: Add instruction definitions for some scratch_* instructions.

LGTM.

Jul 21 2017, 6:35 AM

Jul 20 2017

vpykhtin accepted D35659: AMDGPU: Add encodings for global atomics.

LGTM.

Jul 20 2017, 7:21 AM
vpykhtin accepted D35658: AMDGPU: Rename _RTN atomic instructions.

LGTM, Thanks!

Jul 20 2017, 6:37 AM

Jun 27 2017

vpykhtin accepted D34626: [AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions.

LGTM.

Jun 27 2017, 7:50 AM
vpykhtin accepted D34655: [AMDGPU] Add 2 new alignbit patterns.

LGTM.

Jun 27 2017, 3:13 AM
vpykhtin accepted D34579: Fold fneg and fabs like multiplications.

LGTM.

Jun 27 2017, 3:12 AM
vpykhtin accepted D34545: [AMDGPU] Simplify setcc (sext from i1 b), -1|0, cc.

LGTM.

Jun 27 2017, 3:05 AM
vpykhtin accepted D34500: [AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0.

LGTM.

Jun 27 2017, 3:03 AM

Jun 20 2017

vpykhtin accepted D34291: [AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32.

LGTM.

Jun 20 2017, 7:42 AM
vpykhtin accepted D34300: [AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setcc.

LGTM.

Jun 20 2017, 3:31 AM
vpykhtin accepted D34374: [AMDGPU] Combine add and adde, sub and sube.

LGTM.

Jun 20 2017, 3:31 AM
vpykhtin accepted D34241: [AMDGPU] SDWA: add support for GFX9 in peephole pass.

LGTM, with nit.

Jun 20 2017, 3:29 AM

May 26 2017

vpykhtin accepted D33064: AMDGPU: Start adding offset fields to flat instructions.

LGTM.

May 26 2017, 5:35 AM

May 23 2017

vpykhtin accepted D33432: [AMDGPU] Convert shl (add) into add (shl).

LGTM.

May 23 2017, 4:15 AM

May 22 2017

vpykhtin accepted D33367: [AMDGPU] Narrow lshl from 64 to 32 bit if possible.

LGTM.

May 22 2017, 9:34 AM
vpykhtin added inline comments to D33367: [AMDGPU] Narrow lshl from 64 to 32 bit if possible.
May 22 2017, 9:18 AM
vpykhtin added inline comments to D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.
May 22 2017, 6:11 AM
vpykhtin committed rL303548: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.
May 22 2017, 6:09 AM
vpykhtin closed D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker by committing rL303548: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.
May 22 2017, 6:09 AM

May 19 2017

vpykhtin updated the diff for D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.

fixed as per comments

May 19 2017, 5:08 AM

May 18 2017

vpykhtin added inline comments to D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.
May 18 2017, 10:37 AM
vpykhtin added a comment to D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.

I inserted MachineInstrRegs between reset and recede; all functions are in their old places. I should move MachineInstrRegs higher.

I see it now, thanks. Can you move it higher please?

May 18 2017, 10:21 AM
vpykhtin accepted D32804: [AMDGPU] SDWA operands should not intersect with potential MIs.

I agree if this is a bugfix.

May 18 2017, 4:40 AM
vpykhtin added a reviewer for D33132: [AMDGPU] SDWA: Add assembler support for GFX9: dp.
May 18 2017, 4:05 AM
vpykhtin added a comment to D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.

I inserted MachineInstrRegs between reset and recede; all functions are in their old places. I should move MachineInstrRegs higher.

May 18 2017, 3:25 AM
vpykhtin added inline comments to D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.
May 18 2017, 3:15 AM

May 17 2017

vpykhtin created D33289: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker.
May 17 2017, 10:25 AM

May 16 2017

vpykhtin accepted D33244: [AMDGPU] Use GCNRPTracker dumper methods in scheduler.

LGTM.

May 16 2017, 9:34 AM
vpykhtin added inline comments to D33117: [AMDGPU] Cache live-ins and register pressure in scheduler.
May 16 2017, 8:53 AM
vpykhtin accepted D33117: [AMDGPU] Cache live-ins and register pressure in scheduler.

LGTM.

May 16 2017, 8:01 AM
vpykhtin accepted D33105: [AMDGPU] Turn register pressure estimation into forward tracker.
May 16 2017, 8:00 AM
vpykhtin added a comment to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.

I'm a bit confused by all of these advance... but let's submit and improve this later.

May 16 2017, 5:54 AM

May 15 2017

vpykhtin added a comment to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.

You can do the reset with a move; this would not require a copy.

May 15 2017, 10:29 AM
vpykhtin added a comment to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.

I really don't understand it all. Why not just have reset(MachineInstr *, LiveRegSet) and do a reset on the next BB with the live set from the previous BB? It looks like your code does that. Another question: your reset skips debug instructions, but the caller doesn't know about it and can supply an unskipped iterator to the next advance.

May 15 2017, 10:19 AM
vpykhtin added a comment to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.

This looks a bit like using car keys to open a bottle. Should we have a function that returns the RP diff for a single instruction without changing any state?

We probably should, but it probably does not belong in this change.

May 15 2017, 9:53 AM
vpykhtin added a comment to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.

This looks a bit like using car keys to open a bottle. Should we have a function that returns the RP diff for a single instruction without changing any state?

May 15 2017, 9:45 AM
vpykhtin added a comment to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.

If I'm not mistaken, the downward tracker cannot work on an arbitrary instruction order, so let's change its interface so that it tracks the current instruction and moves it, like the standard LLVM tracker does, that is: reset(MI), advance().

In general it cannot work on an arbitrary order, but it can be used as a probe to schedule the next arbitrary instruction and then reset to the previous state.
Then in D33117 I'm using advanceBefore to cross a basic block boundary, and I do not really want to slow down the advance method by checking for the iterator end condition, which is needed relatively seldom.

May 15 2017, 9:27 AM
vpykhtin added a comment to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.

If I'm not mistaken, the downward tracker cannot work on an arbitrary instruction order, so let's change its interface so that it tracks the current instruction and moves it, like the standard LLVM tracker does, that is: reset(MI), advance().

May 15 2017, 7:12 AM

May 12 2017

vpykhtin added inline comments to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.
May 12 2017, 10:52 AM
vpykhtin added inline comments to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.
May 12 2017, 10:44 AM
vpykhtin added inline comments to D33105: [AMDGPU] Turn register pressure estimation into forward tracker.
May 12 2017, 7:18 AM

May 11 2017

vpykhtin accepted D33086: [AMDGPU] Fix incorrect register pressure calculation.

LGTM.

May 11 2017, 10:21 AM
vpykhtin added inline comments to D33086: [AMDGPU] Fix incorrect register pressure calculation.
May 11 2017, 8:29 AM

Apr 26 2017

vpykhtin accepted D32546: [AMDGPU][MC] Added arg checks for vmcnt() expcnt() lgkmcnt() helpers.

LGTM, Thanks!

Apr 26 2017, 10:43 AM
vpykhtin accepted D32535: [AMDGPU][MC] Added check for truncation of SOPK imm operand.

LGTM.

Apr 26 2017, 6:57 AM

Apr 25 2017

vpykhtin added a comment to D32101: Skip bitcasts while looking for GEP in LoadStoreVectorizer.

Looks good.

Apr 25 2017, 10:36 AM
vpykhtin accepted D32493: [TableGen] Add EncoderMethod to RegisterOperand.

LGTM.

Apr 25 2017, 9:30 AM
vpykhtin accepted D32101: Skip bitcasts while looking for GEP in LoadStoreVectorizer.
Apr 25 2017, 5:37 AM
vpykhtin added a comment to D32101: Skip bitcasts while looking for GEP in LoadStoreVectorizer.

LGTM.

Apr 25 2017, 5:37 AM

Apr 21 2017

vpykhtin accepted D32279: [AMDGPU] Merge M0 initializations.

Looks very good now! Thanks!

Apr 21 2017, 4:10 AM

Apr 20 2017

vpykhtin added inline comments to D32279: [AMDGPU] Merge M0 initializations.
Apr 20 2017, 11:01 AM
vpykhtin added inline comments to D32279: [AMDGPU] Merge M0 initializations.
Apr 20 2017, 10:38 AM
vpykhtin added a comment to D32279: [AMDGPU] Merge M0 initializations.

Thank you for doing this! I really need it.

Apr 20 2017, 6:57 AM

Apr 13 2017

vpykhtin accepted D31993: [AMDGPU] Combine DS operations with offsets bigger than byte.
Apr 13 2017, 10:48 AM
vpykhtin added inline comments to D31993: [AMDGPU] Combine DS operations with offsets bigger than byte.
Apr 13 2017, 10:44 AM
vpykhtin added a comment to D31993: [AMDGPU] Combine DS operations with offsets bigger than byte.

Looks good.

Apr 13 2017, 10:25 AM

Apr 12 2017

vpykhtin accepted D31587: MachineScheduler/ScheduleDAG: Add support for getSUTopoIndex.

LGTM.

Apr 12 2017, 3:58 AM
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

The second assert may be a case of iterator invalidation, though I haven't checked.

Apr 12 2017, 3:55 AM · Restricted Project
vpykhtin accepted D31820: [AMDGPU][MC] Fix for bug 32565 + LIT tests.

LGTM.

Apr 12 2017, 3:49 AM
vpykhtin accepted D31810: [AMDGPU][MC] Fix for bug 32552 + LIT tests.

LGTM.

Apr 12 2017, 3:48 AM
vpykhtin accepted D31809: [AMDGPU][MC] Fix for Bug 32551 + LIT tests.

LGTM.

Apr 12 2017, 3:48 AM
vpykhtin accepted D31808: [AMDGPU][MC] Fix for Bug 28227 + LIT tests.

LGTM.

Apr 12 2017, 3:47 AM
vpykhtin accepted D31595: [AMDGPU][MC] Fix for Bug 28159 + LIT tests.

LGTM.

Apr 12 2017, 3:30 AM

Apr 10 2017

vpykhtin accepted D31854: AMDGPU: Fix crash when disassembling VOP3 mac.

LGTM.

Apr 10 2017, 11:07 AM
vpykhtin added a reviewer for D31854: AMDGPU: Fix crash when disassembling VOP3 mac: dp.
Apr 10 2017, 11:06 AM

Apr 7 2017

vpykhtin accepted D31693: [AMDGPU] Unroll more to eliminate phis and conditions.

LGTM.

Apr 7 2017, 4:16 AM

Apr 6 2017

vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

Thanks Axel!

Apr 6 2017, 2:49 PM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

The following tests assert with this patch (and its predecessors) when SISched is turned on by default:

Apr 6 2017, 9:41 AM · Restricted Project

Apr 5 2017

vpykhtin accepted D31707: [AMDGPU][MC] Fix for Bug 28211 + LIT tests.

LGTM.

Apr 5 2017, 7:02 AM

Apr 4 2017

vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

The first part of the comments relates only to C++ issues

Apr 4 2017, 8:02 AM · Restricted Project

Mar 31 2017

vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

If SU(i) uses register produced by SI(j):

Mar 31 2017, 10:32 AM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

I may be missing something, but it looks like you can build data edges when building a superdag consisting of blocks. Incoming data edges would be liveins, outgoing ones liveouts.

Mar 31 2017, 10:05 AM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

In general, I think moving instructions just to use the standard RP tracker to discover liveins/liveouts isn't a good idea. It is not only slow but also doesn't look reliable. Why not discover these sets using the DAG directly?

Mar 31 2017, 9:48 AM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

Stack trace:

Mar 31 2017, 9:39 AM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

I didn't debug it, and I don't know why you decided so.

Mar 31 2017, 9:05 AM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

I ran the lit tests with SISched and ShouldTrackLaneMasks=true enabled by default with this patch; the following tests asserted:

Mar 31 2017, 5:54 AM · Restricted Project
vpykhtin added inline comments to D30147: AMDGPU/SI: Add new SISched policy to reduce register usage.
Mar 31 2017, 3:24 AM · Restricted Project
vpykhtin added inline comments to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.
Mar 31 2017, 3:22 AM · Restricted Project

Mar 29 2017

vpykhtin accepted D31469: [AMDGPU][MC] Fix for Bug 28158 + LIT tests.

LGTM.

Mar 29 2017, 9:23 AM
vpykhtin accepted D31463: [AMDGPU][MC] Fix for Bug 28167 + LIT tests.

LGTM.

Mar 29 2017, 9:02 AM
vpykhtin accepted D31455: [AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns.

LGTM.

Mar 29 2017, 5:37 AM
vpykhtin added a comment to D31455: [AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns.

Mostly good.

Mar 29 2017, 4:35 AM

Mar 28 2017

vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

Axel, thanks for the update.

Mar 28 2017, 8:33 PM · Restricted Project
vpykhtin accepted D31434: [AMDGPU] Fix recorded region boundaries in max-occupancy scheduler.

This looks elegant, mine was longer :), thanks!

Mar 28 2017, 12:57 PM
vpykhtin added inline comments to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.
Mar 28 2017, 11:32 AM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

OK, I understand that SIScheduleBlockCreator::scheduleInsideBlocks() moves instructions to actually get the LiveIn and LiveOut sets for a block, but this is rather heavy. Have you thought about getting those from the DAG directly, rather than from the register pressure tracker? By common sense, the dependencies between blocks correspond to that liveness info. There is a problem, however: LiveIn and LiveOut dependencies aren't modelled for boundary SUs. I have a local patch that builds such dependencies: scheduling region LiveIn edges come from EntrySU, and LiveOut edges go to ExitSU. Another problem: dependency edges don't have a lanemask, so we need to think about how to deal with this.

Mar 28 2017, 11:01 AM · Restricted Project
vpykhtin added a comment to D31124: AMDGPU/SI: Add lane tracking to SI Scheduler.

Why does SIScheduleBlockCreator::scheduleInsideBlocks() actually move instructions? Why isn't that done during the final scheduling?

Mar 28 2017, 6:51 AM · Restricted Project
vpykhtin committed rL298902: [AMDGPU] Update SI scheduler colorHighLatenciesGroups.
Mar 28 2017, 12:32 AM
vpykhtin closed D30152: AMDGPU/SI: Update SI scheduler colorHighLatenciesGroups by committing rL298902: [AMDGPU] Update SI scheduler colorHighLatenciesGroups.
Mar 28 2017, 12:32 AM · Restricted Project

Mar 27 2017

vpykhtin committed rL298896: MachineScheduler/ScheduleDAG: Add support for GetSubGraph.
Mar 27 2017, 10:25 PM