rampitec (Stanislav Mekhanoshin)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 4 2014, 4:14 AM (219 w, 5 d)

Recent Activity

Yesterday

rampitec updated the diff for D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".

Addressed review comments.

Wed, Jun 20, 5:43 PM
rampitec added inline comments to D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".
Wed, Jun 20, 5:42 PM
rampitec added a comment to D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".

All prerequisites are submitted, this change is ready now.

Wed, Jun 20, 4:06 PM
rampitec updated the diff for D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".

Rebased to master.

Wed, Jun 20, 1:33 PM
rampitec committed rL335167: Allow binop C1, (select cc, CF, CT) -> select folding.
Wed, Jun 20, 1:29 PM
rampitec closed D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Wed, Jun 20, 1:29 PM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Wed, Jun 20, 11:30 AM

Tue, Jun 19

rampitec updated the diff for D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".

Updated test.

Tue, Jun 19, 11:26 AM
rampitec updated the diff for D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".

Rebased.

Tue, Jun 19, 11:17 AM
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Added some i16/f16/vector tests.

Tue, Jun 19, 11:14 AM

Mon, Jun 18

rampitec updated the diff for D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".

Updated x86 test after rebase.

Mon, Jun 18, 5:16 PM
rampitec updated the diff for D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".

Rebased.

Mon, Jun 18, 5:06 PM
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Added x86 test for shifts with non-reversed operands (as before the change).

Mon, Jun 18, 4:53 PM
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
  • Added comment about shift VTs
  • Renamed C1 into CBO
Mon, Jun 18, 4:49 PM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Mon, Jun 18, 4:40 PM
rampitec added a dependency for D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1": D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Mon, Jun 18, 3:15 PM
rampitec added a dependent revision for D48223: Allow binop C1, (select cc, CF, CT) -> select folding: D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".
Mon, Jun 18, 3:15 PM
rampitec created D48301: DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1".
Mon, Jun 18, 3:15 PM
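The fold named in the D48301 title can be sanity-checked on a small model. The sketch below is not the patch's code: it models 8-bit values in Python, the names `select`, `MASK`, and `check_and_or_select` are hypothetical, and it reads the title's shorthand as `and` producing `select c, x, 0` and `or` producing `select c, -1, x`.

```python
MASK = 0xFF  # 8-bit model; the all-ones value stands in for -1


def select(cond, true_val, false_val):
    """Model of a select node: pick true_val when cond holds, else false_val."""
    return true_val if cond else false_val


def check_and_or_select() -> bool:
    """Exhaustively compare the bitwise form with the select form on 8-bit x."""
    all_ones = MASK
    for cond in (False, True):
        mask_or_zero = select(cond, all_ones, 0)
        for x in range(MASK + 1):
            # and (select c, -1, 0), x  ==  select c, x, 0
            if (mask_or_zero & x) != select(cond, x, 0):
                return False
            # or (select c, -1, 0), x  ==  select c, -1, x
            if (mask_or_zero | x) != select(cond, all_ones, x):
                return False
    return True
```

Calling `check_and_or_select()` walks both condition values and every 8-bit `x`, so it only demonstrates the identity for this toy width.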
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Only commute part of the change.

Mon, Jun 18, 3:08 PM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Mon, Jun 18, 2:58 PM
rampitec committed rL334987: Tests for dag combine select (binop) -> select. NFC.
Mon, Jun 18, 2:53 PM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Mon, Jun 18, 2:25 PM
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
  • Fixed handling of non-commutative operations if arguments are swapped.
  • Added tests for non-commutative operations with all-const value.
  • Retitled patch accordingly.
Mon, Jun 18, 12:27 PM
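The point about non-commutative operations can be illustrated with a small model. This is a sketch under assumptions, not the code from D48223: values are 8-bit, and `binop_of_select`, `select_of_binops`, and `fold_is_sound` are hypothetical helper names. It shows that pushing a binop with a constant operand into both arms of a select preserves the result only if the constant stays on the same side.

```python
import operator

MASK = 0xFF  # 8-bit model


def select(cond, true_val, false_val):
    """Model of a select node."""
    return true_val if cond else false_val


def binop_of_select(op, const, cond, true_val, false_val, const_on_left):
    """Unfolded form: apply op to the constant and the selected value."""
    chosen = select(cond, true_val, false_val)
    result = op(const, chosen) if const_on_left else op(chosen, const)
    return result & MASK


def select_of_binops(op, const, cond, true_val, false_val, const_on_left):
    """Folded form: apply op to each arm, keeping the constant on its side."""
    if const_on_left:
        folded_true, folded_false = op(const, true_val), op(const, false_val)
    else:
        folded_true, folded_false = op(true_val, const), op(false_val, const)
    return select(cond, folded_true & MASK, folded_false & MASK)


def fold_is_sound(op) -> bool:
    """Compare both forms over a sample of 8-bit inputs and both operand orders."""
    return all(
        binop_of_select(op, c, cond, t, f, left)
        == select_of_binops(op, c, cond, t, f, left)
        for c in range(0, 256, 17)
        for t in range(0, 256, 13)
        for f in range(0, 256, 11)
        for cond in (False, True)
        for left in (False, True)
    )
```

For subtraction, `binop_of_select(operator.sub, 10, True, 3, 7, True)` computes 10 - 3, while passing `const_on_left=False` computes 3 - 10 modulo 256, which is why the operand order has to be tracked for non-commutative operations.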
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Mon, Jun 18, 12:16 PM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Mon, Jun 18, 10:23 AM

Sun, Jun 17

rampitec added inline comments to D48246: [AMDGPU] setcc (select cc, CT, CF), CF, eq | ne -> xor cc, -1 | cc.
Sun, Jun 17, 12:45 AM

Sat, Jun 16

rampitec added a comment to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

What about
(X / Y) != 0 -> X >= Y ?

(X, Y are unsigned)

Sat, Jun 16, 10:17 AM
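The suggested rewrite `(X / Y) != 0 -> X >= Y` for unsigned operands can be checked exhaustively on a narrow type. A minimal sketch, assuming a nonzero divisor and using a hypothetical helper name `udiv_nonzero_matches_uge`:

```python
def udiv_nonzero_matches_uge(bits: int = 8) -> bool:
    """Compare (x // y) != 0 with x >= y for all unsigned values of the given width."""
    limit = 1 << bits
    return all(
        ((x // y) != 0) == (x >= y)
        for x in range(limit)
        for y in range(1, limit)  # y == 0 is skipped: division by zero is undefined
    )
```

The check covers only the chosen bit width, but the argument is width-independent: for unsigned `x` and nonzero `y`, the quotient is at least 1 exactly when `x` is at least `y`.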
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Updated x86 test with utils/update_llc_test_checks.py

Sat, Jun 16, 2:25 AM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Sat, Jun 16, 2:24 AM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Sat, Jun 16, 1:42 AM
rampitec added inline comments to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Sat, Jun 16, 1:37 AM
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Updated x86 asm to contain full ISA.

Sat, Jun 16, 1:31 AM
rampitec added a reviewer for D48223: Allow binop C1, (select cc, CF, CT) -> select folding: lebedev.ri.
Sat, Jun 16, 1:00 AM
rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Added x86 test.
Updated amdgcn test.

Sat, Jun 16, 1:00 AM
rampitec added a comment to D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Would be good to have an additional small test for some other target (e.g. x86), too.

Sat, Jun 16, 1:00 AM

Fri, Jun 15

rampitec updated the diff for D48223: Allow binop C1, (select cc, CF, CT) -> select folding.

Rebase.

Fri, Jun 15, 10:19 PM
rampitec committed rL334882: [AMDGPU] setcc (select cc, CT, CF), CF, eq | ne -> xor cc, -1 | cc.
Fri, Jun 15, 8:51 PM
rampitec closed D48246: [AMDGPU] setcc (select cc, CT, CF), CF, eq | ne -> xor cc, -1 | cc.
Fri, Jun 15, 8:51 PM
rampitec created D48246: [AMDGPU] setcc (select cc, CT, CF), CF, eq | ne -> xor cc, -1 | cc.
Fri, Jun 15, 3:45 PM
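The identity behind the D48246 title can be modeled in a few lines. This is a sketch under assumptions rather than the AMDGPU implementation: it assumes the two select arms are different constants, treats `xor cc, -1` on a boolean as logical negation, and uses hypothetical names `setcc_eq_cf`, `setcc_ne_cf`, and `check_setcc_fold`.

```python
def select(cond, true_val, false_val):
    """Model of a select node."""
    return true_val if cond else false_val


def setcc_eq_cf(cond, true_val, false_val):
    """setcc (select cc, CT, CF), CF, eq"""
    return select(cond, true_val, false_val) == false_val


def setcc_ne_cf(cond, true_val, false_val):
    """setcc (select cc, CT, CF), CF, ne"""
    return select(cond, true_val, false_val) != false_val


def check_setcc_fold(true_val, false_val) -> bool:
    """With distinct arms, eq folds to the negated condition and ne to the condition."""
    if true_val == false_val:
        raise ValueError("the fold assumes CT != CF")
    return all(
        setcc_eq_cf(cond, true_val, false_val) == (not cond)
        and setcc_ne_cf(cond, true_val, false_val) == cond
        for cond in (False, True)
    )
```

The distinct-arms requirement matters: with equal arms, `setcc_eq_cf(True, 3, 3)` is true even though the negated condition is false.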
rampitec added a comment to D47918: Utilize new SDNode flag functionality to expand current support for fma.

We do have tests for those in AMDGPU, with -enable-unsafe-fp-math. We don't have systematic tests with the new contract/reassoc bits. At least the Mesa frontend doesn't generate those anyway at the moment.

@tpr, @rampitec maybe you want to add some tests for those for HSA / LLPC? I don't know if you're using the new bits yet.

In any case, the change looks good to me.

We should mostly just convert all the tests using the global flag to using the per-instruction ones.

The HSA case just inherits them from clang if it's using them, which I assume it is.

Fri, Jun 15, 9:21 AM
rampitec accepted D47918: Utilize new SDNode flag functionality to expand current support for fma.

LGTM

Fri, Jun 15, 9:21 AM
rampitec created D48223: Allow binop C1, (select cc, CF, CT) -> select folding.
Fri, Jun 15, 9:11 AM

Thu, Jun 14

rampitec accepted D48168: AMDGPU: Remove redundant MIMG instruction variants.

LGTM

Thu, Jun 14, 9:42 AM
rampitec accepted D48167: AMDGPU: Remove old-style image intrinsics.

LGTM

Thu, Jun 14, 9:39 AM
rampitec accepted D48165: InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded.

LGTM

Thu, Jun 14, 9:38 AM

Wed, Jun 13

rampitec committed rL334640: [AMDGPU] Corrected computeKnownBits for V_PERM_B32.
Wed, Jun 13, 11:57 AM
rampitec closed D48133: [AMDGPU] Corrected computeKnownBits for V_PERM_B32.
Wed, Jun 13, 11:57 AM
rampitec accepted D48126: AMDGPU: Add combine for short vector extract_vector_elts.

LGTM

Wed, Jun 13, 11:23 AM
rampitec created D48133: [AMDGPU] Corrected computeKnownBits for V_PERM_B32.
Wed, Jun 13, 9:32 AM

Tue, Jun 12

rampitec committed rL334559: [AMDGPU] DAG combine to produce V_PERM_B32.
Tue, Jun 12, 4:55 PM
rampitec closed D48099: [AMDGPU] DAG combine to produce V_PERM_B32.
Tue, Jun 12, 4:55 PM
rampitec updated the diff for D48099: [AMDGPU] DAG combine to produce V_PERM_B32.

Changed comments as requested.

Tue, Jun 12, 4:23 PM
rampitec created D48099: [AMDGPU] DAG combine to produce V_PERM_B32.
Tue, Jun 12, 2:45 PM

Mon, Jun 11

rampitec accepted D48047: [AMDGPU] findMaskOperands() - prevent hitting Assertion `isReg() && "Wrong MachineOperand accessor"'.

LGTM

Mon, Jun 11, 1:17 PM
rampitec accepted D48017: AMDGPU: Select MIMG instructions manually in SITargetLowering.

Thank you! I am in favor of this change despite what we discussed today. If we eventually want to create target ISD nodes, that can be done separately on top of it, but currently I see no such need.
A separate note: SIISelLowering.cpp has grown too large; it is already about 8000 lines. Maybe we should consider splitting it into separate pieces as a separate change. For example, this code could live in something like SIImageLowering.cpp.

Mon, Jun 11, 10:34 AM
rampitec accepted D48018: AMDGPU: Convert test cases to the dimension-aware intrinsics.

LGTM

Mon, Jun 11, 10:19 AM
rampitec accepted D48016: AMDGPU: Refactor MIMG instruction TableGen using generic tables.

LGTM

Mon, Jun 11, 10:15 AM
rampitec accepted D48014: AMDGPU: Use generic tables instead of SearchableTable.

LGTM

Mon, Jun 11, 10:09 AM
rampitec accepted D48011: AMDGPU: Pass AMDGPUSampleVariant to MIMG_{Sampler,Gather}(_WQM).

LGTM

Mon, Jun 11, 10:06 AM
rampitec accepted D47980: [InstCombine] Fold (x << y) >> y -> x & (-1 >> y).

Ah! I see the reviews to fold it into a bfe. I have no concern then, LGTM.

Mon, Jun 11, 10:04 AM
rampitec added a comment to D47980: [InstCombine] Fold (x << y) >> y -> x & (-1 >> y).

In general, two shifts are better for AMDGPU. Any shift immediate can be folded right into the shift instruction, while the rather big mask produced by this change would require either an extra 4 bytes in the encoding or, even worse, a move and a register.

Mon, Jun 11, 10:01 AM
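The fold under discussion in D47980 can be checked on a 32-bit model. The sketch below is an illustration, not InstCombine code: it assumes logical (unsigned) right shifts and shift amounts below the bit width, and the names `shl`, `lshr`, `two_shifts`, and `masked` are hypothetical.

```python
BITS = 32
MASK = (1 << BITS) - 1  # all ones, the 32-bit value of -1


def shl(x: int, y: int) -> int:
    """Left shift that discards bits shifted past the 32-bit width."""
    return (x << y) & MASK


def lshr(x: int, y: int) -> int:
    """Logical right shift on a 32-bit value."""
    return (x & MASK) >> y


def two_shifts(x: int, y: int) -> int:
    """Original form: (x << y) >> y."""
    return lshr(shl(x, y), y)


def masked(x: int, y: int) -> int:
    """Folded form: x & (-1 >> y)."""
    return x & lshr(MASK, y)
```

For a constant shift of 4, the folded form needs the mask `lshr(MASK, 4)`, which is `0x0FFFFFFF`; that is the kind of wide constant the comment above is concerned about.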
rampitec committed rL334420: [AMDGPU] Do not consider indirect acces through phi for wave limiter.
Mon, Jun 11, 9:55 AM
rampitec closed D47740: [AMDGPU] Do not consider indirect acces through phi for wave limiter.
Mon, Jun 11, 9:55 AM

Thu, Jun 7

rampitec accepted D47828: AMDGPU: Make v4i16/v4f16 legal.

LGTM.

Thu, Jun 7, 2:21 AM
rampitec accepted D47823: AMDGPU: Try a lot harder to emit scalar loads.

LGTM. Thanks.

Thu, Jun 7, 2:19 AM

Wed, Jun 6

rampitec committed rL334142: [AMDGPU] Improve reciprocal handling.
Wed, Jun 6, 3:27 PM
rampitec closed D47805: [AMDGPU] Improve reciprocal handling.
Wed, Jun 6, 3:27 PM
rampitec added inline comments to D47823: AMDGPU: Try a lot harder to emit scalar loads.
Wed, Jun 6, 2:58 PM
rampitec added inline comments to D47805: [AMDGPU] Improve reciprocal handling.
Wed, Jun 6, 12:16 PM
rampitec added inline comments to D47823: AMDGPU: Try a lot harder to emit scalar loads.
Wed, Jun 6, 12:05 PM
rampitec added inline comments to D47828: AMDGPU: Make v4i16/v4f16 legal.
Wed, Jun 6, 11:37 AM
rampitec accepted D47827: AMDGPU: Use scalar operations for f16 fabs/fneg patterns.

LGTM

Wed, Jun 6, 11:29 AM
rampitec added inline comments to D47823: AMDGPU: Try a lot harder to emit scalar loads.
Wed, Jun 6, 11:29 AM
rampitec added inline comments to D47805: [AMDGPU] Improve reciprocal handling.
Wed, Jun 6, 10:28 AM
rampitec updated the diff for D47805: [AMDGPU] Improve reciprocal handling.
Wed, Jun 6, 10:28 AM

Tue, Jun 5

rampitec created D47805: [AMDGPU] Improve reciprocal handling.
Tue, Jun 5, 5:25 PM
rampitec accepted D47782: AMDGPU: Custom lower v2f16 fneg/fabs with illegal f16.

LGTM

Tue, Jun 5, 10:34 AM
rampitec added inline comments to D47782: AMDGPU: Custom lower v2f16 fneg/fabs with illegal f16.
Tue, Jun 5, 9:11 AM
rampitec accepted D47761: AMDGPU: Add implicit def of SCC to kill and indirect pseudos.

LGTM

Tue, Jun 5, 9:06 AM

Mon, Jun 4

rampitec created D47740: [AMDGPU] Do not consider indirect acces through phi for wave limiter.
Mon, Jun 4, 12:48 PM
rampitec committed rL333934: [AMDGPU] Small refactoring in the scheduler.
Mon, Jun 4, 11:02 AM
rampitec closed D47661: [AMDGPU] Small refactoring in the scheduler.
Mon, Jun 4, 11:02 AM
rampitec committed rL333931: [AMDGPU] Factored out common part of GCNRPTracker::reset().
Mon, Jun 4, 10:28 AM
rampitec closed D47664: [AMDGPU] Factored out common part of GCNRPTracker::reset().
Mon, Jun 4, 10:28 AM
rampitec accepted D47710: AMDGPU: Use more custom insert/extract_vector_elt lowering.

LGTM

Mon, Jun 4, 9:56 AM

Fri, Jun 1

rampitec accepted D47504: [AMDGPU] Simplify memory legalizer.

LGTM

Fri, Jun 1, 3:48 PM
rampitec created D47664: [AMDGPU] Factored out common part of GCNRPTracker::reset().
Fri, Jun 1, 3:26 PM
rampitec created D47661: [AMDGPU] Small refactoring in the scheduler.
Fri, Jun 1, 3:15 PM
rampitec accepted D47580: AMDGPU: Preserve metadata when widening loads.

LGTM

Fri, Jun 1, 12:36 PM

Thu, May 31

rampitec committed rL333691: [AMDGPU] Construct memory clauses before RA.
Thu, May 31, 1:19 PM
rampitec closed D47511: [AMDGPU] Construct memory clauses before RA.
Thu, May 31, 1:19 PM
rampitec committed rL333687: [AMDGPU] Fixed incorrect -mcpu=gfx800 in xnor.ll test. NFC.
Thu, May 31, 12:44 PM
rampitec added inline comments to D47580: AMDGPU: Preserve metadata when widening loads.
Thu, May 31, 10:20 AM

Wed, May 30

rampitec added inline comments to D47511: [AMDGPU] Construct memory clauses before RA.
Wed, May 30, 11:58 PM
rampitec committed rL333629: [AMDGPU] Track occupancy in MFI.
Wed, May 30, 10:40 PM
rampitec closed D47509: [AMDGPU] Track occupancy in MFI.
Wed, May 30, 10:40 PM
rampitec added inline comments to D47509: [AMDGPU] Track occupancy in MFI.
Wed, May 30, 10:10 PM
rampitec added inline comments to D47511: [AMDGPU] Construct memory clauses before RA.
Wed, May 30, 5:41 PM
rampitec updated the diff for D47511: [AMDGPU] Construct memory clauses before RA.

Replaced custom instruction with TargetOpcode::BUNDLE.

Wed, May 30, 5:41 PM
rampitec added inline comments to D47504: [AMDGPU] Simplify memory legalizer.
Wed, May 30, 4:08 PM