arsenm (Matt Arsenault)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 5 2012, 4:53 PM (358 w, 6 d)

Recent Activity

Mon, Oct 21

arsenm added inline comments to D69288: [GISel][ArtifactCombiner] Relax the constraint to combine unmerge with concat_vectors.
Mon, Oct 21, 7:27 PM · Restricted Project
arsenm created D69284: AMDGPU: Use default operands for clamp/omod.
Mon, Oct 21, 3:56 PM
arsenm added a comment to D69182: [AMDGPU] Fix Vreg_1 PHI lowering in SILowerI1Copies..

@nhaehnle didn't want to use RPO for some reason last time this fixed a problem here. Which way do you prefer here?

Mon, Oct 21, 3:46 PM · Restricted Project
arsenm committed rGef9a0278f0ac: AMDGPU: Select basic interp directly from intrinsics (authored by arsenm).
AMDGPU: Select basic interp directly from intrinsics
Mon, Oct 21, 2:51 PM
arsenm committed rL375457: AMDGPU: Select basic interp directly from intrinsics.
AMDGPU: Select basic interp directly from intrinsics
Mon, Oct 21, 2:51 PM
arsenm closed D69279: AMDGPU: Select basic interp directly from intrinsics.

r375457

Mon, Oct 21, 2:50 PM
arsenm added inline comments to D69280: [AMDGPU] Allow folding of sgpr to vgpr copy.
Mon, Oct 21, 2:50 PM · Restricted Project
arsenm added inline comments to D69280: [AMDGPU] Allow folding of sgpr to vgpr copy.
Mon, Oct 21, 2:32 PM · Restricted Project
arsenm created D69279: AMDGPU: Select basic interp directly from intrinsics.
Mon, Oct 21, 1:36 PM
arsenm committed rG38038f116f7b: AMDGPU: Use CopyToReg for interp intrinsic lowering (authored by arsenm).
AMDGPU: Use CopyToReg for interp intrinsic lowering
Mon, Oct 21, 12:59 PM
arsenm committed rG8ebbf25cb1e9: AMDGPU: Erase redundant redefs of m0 in SIFoldOperands (authored by arsenm).
AMDGPU: Erase redundant redefs of m0 in SIFoldOperands
Mon, Oct 21, 12:59 PM
arsenm committed rL375450: AMDGPU: Use CopyToReg for interp intrinsic lowering.
AMDGPU: Use CopyToReg for interp intrinsic lowering
Mon, Oct 21, 12:59 PM
arsenm committed rL375449: AMDGPU: Erase redundant redefs of m0 in SIFoldOperands.
AMDGPU: Erase redundant redefs of m0 in SIFoldOperands
Mon, Oct 21, 12:59 PM
arsenm closed D69269: AMDGPU: Use CopyToReg for interp intrinsic lowering.

r375450

Mon, Oct 21, 12:58 PM
arsenm closed D68895: AMDGPU: Erase redundant redefs of m0 in SIFoldOperands.

r375449

Mon, Oct 21, 12:58 PM
arsenm committed rGdd6cf159bab7: AMDGPU: Stop adding m0 implicit def to SGPR spills (authored by arsenm).
AMDGPU: Stop adding m0 implicit def to SGPR spills
Mon, Oct 21, 12:49 PM
arsenm committed rGb5234b64af83: AMDGPU: Slightly restructure m0 init code (authored by arsenm).
AMDGPU: Slightly restructure m0 init code
Mon, Oct 21, 12:49 PM
arsenm closed D69274: AMDGPU: Stop adding m0 implicit def to SGPR spills.

r375448

Mon, Oct 21, 12:49 PM
arsenm closed D69270: AMDGPU: Slightly restructure m0 init code.

r375447

Mon, Oct 21, 12:49 PM
arsenm committed rL375448: AMDGPU: Stop adding m0 implicit def to SGPR spills.
AMDGPU: Stop adding m0 implicit def to SGPR spills
Mon, Oct 21, 12:40 PM
arsenm committed rL375447: AMDGPU: Slightly restructure m0 init code.
AMDGPU: Slightly restructure m0 init code
Mon, Oct 21, 12:40 PM
arsenm added a comment to D68233: [FPEnv] [WIP] Verify strictfp attribute correctness.

Needs tests

Mon, Oct 21, 12:29 PM · Restricted Project
arsenm accepted D69206: [AMDGPU] Select AGPR in PHI operand legalization.

LGTM

Mon, Oct 21, 12:10 PM · Restricted Project
arsenm accepted D69231: AMDGPU/GlobalISel: Legalize fast unsafe FDIV.

LGTM with minor fix

Mon, Oct 21, 12:03 PM · Restricted Project
arsenm created D69274: AMDGPU: Stop adding m0 implicit def to SGPR spills.
Mon, Oct 21, 12:01 PM
arsenm updated the diff for D69274: AMDGPU: Stop adding m0 implicit def to SGPR spills.

Remove unused variable

Mon, Oct 21, 12:01 PM
arsenm added a comment to D68585: AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG.

ping

Mon, Oct 21, 10:46 AM
arsenm accepted D68946: [MIParser] Set RegClassOrRegBank during instruction parsing.

LGTM. I think the "dummy register class" FIXME is a pretty definite no

Mon, Oct 21, 10:46 AM · Restricted Project
arsenm accepted D69149: [GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectors.

LGTM

Mon, Oct 21, 10:36 AM · Restricted Project
arsenm created D69270: AMDGPU: Slightly restructure m0 init code.
Mon, Oct 21, 10:36 AM
arsenm created D69269: AMDGPU: Use CopyToReg for interp intrinsic lowering.
Mon, Oct 21, 10:36 AM
arsenm added inline comments to D69231: AMDGPU/GlobalISel: Legalize fast unsafe FDIV.
Mon, Oct 21, 10:36 AM · Restricted Project
arsenm added inline comments to D69231: AMDGPU/GlobalISel: Legalize fast unsafe FDIV.
Mon, Oct 21, 8:51 AM · Restricted Project

Sun, Oct 20

arsenm committed rGe5be543a5598: AMDGPU: Increase vcc liveness scan threshold (authored by arsenm).
AMDGPU: Increase vcc liveness scan threshold
Sun, Oct 20, 10:49 AM
arsenm committed rL375367: AMDGPU: Increase vcc liveness scan threshold.
AMDGPU: Increase vcc liveness scan threshold
Sun, Oct 20, 10:42 AM
arsenm closed D68894: AMDGPU: Increase vcc liveness scan threshold.

r375367

Sun, Oct 20, 10:41 AM
arsenm committed rG7cd57dcd5b71: AMDGPU: Split flat offsets that don't fit in DAG (authored by arsenm).
AMDGPU: Split flat offsets that don't fit in DAG
Sun, Oct 20, 10:40 AM
arsenm closed D68893: AMDGPU: Split flat offsets that don't fit in DAG.

r375366

Sun, Oct 20, 10:33 AM
arsenm committed rL375366: AMDGPU: Split flat offsets that don't fit in DAG.
AMDGPU: Split flat offsets that don't fit in DAG
Sun, Oct 20, 10:32 AM
arsenm committed rG1aad3835f869: AMDGPU: Fix missing OPERAND_IMMEDIATE (authored by arsenm).
AMDGPU: Fix missing OPERAND_IMMEDIATE
Sun, Oct 20, 9:55 AM
arsenm committed rL375365: AMDGPU: Fix missing OPERAND_IMMEDIATE.
AMDGPU: Fix missing OPERAND_IMMEDIATE
Sun, Oct 20, 9:54 AM
arsenm committed rGbba8fd713249: AMDGPU: Add baseline tests for flat offset splitting (authored by arsenm).
AMDGPU: Add baseline tests for flat offset splitting
Sun, Oct 20, 9:36 AM
arsenm committed rL375364: AMDGPU: Add baseline tests for flat offset splitting.
AMDGPU: Add baseline tests for flat offset splitting
Sun, Oct 20, 9:36 AM
arsenm committed rGfc205f1d118a: AMDGPU: Don't re-get the subtarget (authored by arsenm).
AMDGPU: Don't re-get the subtarget
Sun, Oct 20, 9:27 AM
arsenm committed rL375363: AMDGPU: Don't re-get the subtarget.
AMDGPU: Don't re-get the subtarget
Sun, Oct 20, 9:27 AM
arsenm committed rG8a8b317460ff: AMDGPU: Don't error on calls to null or undef (authored by arsenm).
AMDGPU: Don't error on calls to null or undef
Sun, Oct 20, 12:49 AM
arsenm committed rL375356: AMDGPU: Don't error on calls to null or undef.
AMDGPU: Don't error on calls to null or undef
Sun, Oct 20, 12:48 AM
arsenm closed D51794: AMDGPU: Don't error on calls to null or undef.

r375356

Sun, Oct 20, 12:48 AM

Fri, Oct 18

arsenm updated the diff for D51794: AMDGPU: Don't error on calls to null or undef.

Rebase

Fri, Oct 18, 9:17 PM
arsenm added a comment to D51794: AMDGPU: Don't error on calls to null or undef.

Considering emitting traps requires fixing traps first, otherwise a program that should work will incorrectly trap

Do you agree that is how we have to handle it in principle? If that requires further fixes, I do not object to a TODO comment here.

Fri, Oct 18, 9:17 PM
arsenm added a comment to D69206: [AMDGPU] Select AGPR in PHI operand legalization.

Testcase?

Probably I can forge a mir which should not have vgpr inputs for agpr result. So far I cannot figure out why a .mir dumped with -stop-before does not run the next pass (si-fix-sgpr-copies).

Fri, Oct 18, 6:42 PM · Restricted Project
arsenm committed rG1aae510893e6: AMDGPU: Remove optnone from a test (authored by arsenm).
AMDGPU: Remove optnone from a test
Fri, Oct 18, 6:34 PM
arsenm committed rL375321: AMDGPU: Remove optnone from a test.
AMDGPU: Remove optnone from a test
Fri, Oct 18, 6:33 PM
arsenm added a comment to D68893: AMDGPU: Split flat offsets that don't fit in DAG.

I.e. ideally we want:

load p
load p+128
load p+256
...
load p+2048-128
p1 = p + 2048
load p1
load p1 + 128
load p1 + 256
...

etc for a 128 byte stride.

This would be better, but picking the base constant to use is more difficult. I think this is a next step beyond this patch. I'm not sure splitting this in the IR will work out, as the DAG will try to fold the adds of constants back together.
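
As a rough illustration of the split in the quoted example, the sketch below divides a constant byte offset into a part added to the base pointer and a remainder kept as the instruction's immediate. The names `SplitOffset` and `splitFlatOffset` are hypothetical and are not LLVM's API, and the 11-bit unsigned immediate width is an assumption taken from the 2048-byte window in the example, not from the patch itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's API: the result of splitting a byte offset.
struct SplitOffset {
  uint64_t BaseAdd; // folded into the pointer with a separate add
  uint64_t Imm;     // kept in the instruction's immediate offset field
};

// ImmBits is an assumed width for an unsigned immediate field; 11 bits gives
// the 2048-byte window used in the quoted example.
SplitOffset splitFlatOffset(uint64_t Offset, unsigned ImmBits) {
  assert(ImmBits > 0 && ImmBits < 64 && "immediate width out of range");
  const uint64_t Mask = (uint64_t{1} << ImmBits) - 1;
  // Align the base part down to the window size so that every access inside
  // the same window shares one base add.
  return {Offset & ~Mask, Offset & Mask};
}
```

With `ImmBits = 11`, every offset below 2048 keeps `BaseAdd == 0`, and offsets from 2048 up to 4095 share `BaseAdd == 2048`, which corresponds to `p1 = p + 2048` in the listing. This only covers the simple aligned-window case; choosing a base that suits an arbitrary set of accesses is the harder problem the comment refers to.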

Fri, Oct 18, 6:33 PM
arsenm created D69211: AMDGPU: Enable shouldConsiderGEPOffsetSplit.
Fri, Oct 18, 6:24 PM
arsenm added a comment to D69206: [AMDGPU] Select AGPR in PHI operand legalization.

Testcase?

Fri, Oct 18, 6:24 PM · Restricted Project
arsenm abandoned D13527: AMDGPU: Exclude SGPRs except m0 from movrel operands.
Fri, Oct 18, 5:28 PM
arsenm committed rGd4274f0174ff: LiveIntervals: Fix handleMoveUp with subreg def moving across a def (authored by arsenm).
LiveIntervals: Fix handleMoveUp with subreg def moving across a def
Fri, Oct 18, 4:23 PM
arsenm committed rL375300: LiveIntervals: Fix handleMoveUp with subreg def moving across a def.
LiveIntervals: Fix handleMoveUp with subreg def moving across a def
Fri, Oct 18, 4:23 PM
arsenm closed D68149: LiveIntervals: Fix handleMoveUp with subreg def moving across a def.

r375300

Fri, Oct 18, 4:23 PM
arsenm accepted D69200: [AMDGPU] move PHI nodes to AGPR class.

LGTM

Fri, Oct 18, 3:27 PM · Restricted Project
arsenm updated the diff for D68893: AMDGPU: Split flat offsets that don't fit in DAG.

Remove dead code. Add new test for different offsets

Fri, Oct 18, 1:36 PM
arsenm added inline comments to D67602: GlobalISel: Handle llvm.read_register.
Fri, Oct 18, 12:12 PM
arsenm added a comment to D68149: LiveIntervals: Fix handleMoveUp with subreg def moving across a def.

ping

Fri, Oct 18, 11:34 AM
arsenm committed rL375267: AMDGPU: Relax 32-bit SGPR register class.
AMDGPU: Relax 32-bit SGPR register class
Fri, Oct 18, 11:26 AM
arsenm committed rGf9a42ed0a7f6: AMDGPU: Relax 32-bit SGPR register class (authored by arsenm).
AMDGPU: Relax 32-bit SGPR register class
Fri, Oct 18, 11:26 AM
arsenm closed D68821: AMDGPU: Relax 32-bit SGPR register class.

r375267

Fri, Oct 18, 11:25 AM
arsenm added inline comments to D67602: GlobalISel: Handle llvm.read_register.
Fri, Oct 18, 10:58 AM
arsenm added inline comments to D69182: [AMDGPU] Fix Vreg_1 PHI lowering in SILowerI1Copies..
Fri, Oct 18, 10:48 AM · Restricted Project
arsenm added a comment to D69172: AMDGPU: Fix SMEM WAR hazard for gfx10 readlane.

It is OK as a w/a, but real opcode should not appear that early.
@arsenm What is the reason to use MC opcode in the SIFrameLowering::emitEpilogue()?

BuildMI(MBB, MBBI, DL, TII->getMCOpcodeFromPseudo(AMDGPU::V_READLANE_B32),
        FuncInfo->getFrameOffsetReg())

There is a reason for it, but I don't remember what it is. I've tried to fix this multiple times in the past, and then rediscovered why. I think it had something to do with an operand constraint/encoding change in VI

Fri, Oct 18, 10:48 AM · Restricted Project
arsenm added a comment to D69172: AMDGPU: Fix SMEM WAR hazard for gfx10 readlane.

It is OK as a w/a, but real opcode should not appear that early.
@arsenm What is the reason to use MC opcode in the SIFrameLowering::emitEpilogue()?

BuildMI(MBB, MBBI, DL, TII->getMCOpcodeFromPseudo(AMDGPU::V_READLANE_B32),
        FuncInfo->getFrameOffsetReg())
Fri, Oct 18, 10:39 AM · Restricted Project
arsenm accepted D69163: [AMDGPU] Remove -amdgpu-spill-sgpr-to-smem..

LGTM

Fri, Oct 18, 10:39 AM · Restricted Project
arsenm accepted D69086: [IR] Fix mayReadFromMemory() for writeonly calls.

LGTM with minor test change

Fri, Oct 18, 10:39 AM · Restricted Project
arsenm added a comment to D69161: [IR] Allow fast math flags on calls with floating point array type..

Also should have some real IR tests that show this parses and preserves the flags

Fri, Oct 18, 9:52 AM · Restricted Project

Thu, Oct 17

arsenm added inline comments to D69149: [GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectors.
Thu, Oct 17, 4:30 PM · Restricted Project
arsenm accepted D69138: [AMDGPU] drop getIsFP td helper.
Thu, Oct 17, 2:37 PM · Restricted Project
arsenm added a comment to D68946: [MIParser] Set RegClassOrRegBank during instruction parsing.

I'm still not sure I mechanically follow how this ends up causing a problem? Surely the VRegInfo field was zero initialized already?

Thu, Oct 17, 1:50 PM · Restricted Project
arsenm accepted D68563: [AMDGPU] Disable a test that was relying on misched behavior.

LGTM. Are you going to remove the smem spill path?

Thu, Oct 17, 1:50 PM · Restricted Project

Wed, Oct 16

arsenm added a comment to D69078: Move LiveRangeCalc header to the llvm/include directory to make it publicly available.

I was just about to do the same

Wed, Oct 16, 7:14 PM · Restricted Project
arsenm accepted D69071: [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector.

LGTM

Wed, Oct 16, 5:24 PM · Restricted Project
arsenm accepted D69071: [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector.

LGTM with nit

Wed, Oct 16, 3:45 PM · Restricted Project
arsenm added inline comments to D69071: [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector.
Wed, Oct 16, 3:08 PM · Restricted Project
arsenm added inline comments to D69071: [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector.
Wed, Oct 16, 2:47 PM · Restricted Project
arsenm committed rL375042: GlobalISel: Implement lower for G_SADDO/G_SSUBO.
GlobalISel: Implement lower for G_SADDO/G_SSUBO
Wed, Oct 16, 2:35 PM
arsenm accepted D53768: Add VerboseOutputStream to CompilerInstance.

LGTM

Wed, Oct 16, 1:55 PM · Restricted Project
arsenm closed D68628: GlobalISel: Implement lower for G_SADDO/G_SSUBO.

r375042

Wed, Oct 16, 1:50 PM
arsenm committed rG34ed76e1803c: GlobalISel: Implement lower for G_SADDO/G_SSUBO (authored by arsenm).
GlobalISel: Implement lower for G_SADDO/G_SSUBO
Wed, Oct 16, 1:48 PM
arsenm updated the diff for D68309: GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT.

Use G_ANYEXT for vector/value

Wed, Oct 16, 1:18 PM
arsenm added a comment to D68585: AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG.

ping

Wed, Oct 16, 1:18 PM
arsenm accepted D68881: [AMDGPU] Improve code size cost model.

LGTM but I still find TTI's set of cost functions incomprehensible

Wed, Oct 16, 12:59 PM · Restricted Project
arsenm accepted D68666: LiveIntervals: Split live intervals on multiple dead defs.

LGTM

Wed, Oct 16, 12:40 PM · Restricted Project
arsenm added a comment to D68628: GlobalISel: Implement lower for G_SADDO/G_SSUBO.

ping

Wed, Oct 16, 12:40 PM
arsenm accepted D69065: [AMDGPU] Do not combine dpp mov reading physregs.

LGTM

Wed, Oct 16, 12:31 PM · Restricted Project
arsenm accepted D69063: [AMDGPU] Do not combine dpp with physreg def.

LGTM

Wed, Oct 16, 11:42 AM · Restricted Project

Tue, Oct 15

arsenm added a comment to D68893: AMDGPU: Split flat offsets that don't fit in DAG.

Mostly LGTM, but I wonder about the high level intention here. Is this intended to expose new load/store merging opportunities? If so, is there a test for this? Or is there some part of SIFoldOperands that can now be removed?

Tue, Oct 15, 1:19 PM
arsenm accepted D68865: [InstCombine][AMDGPU] Fix crash with v3i16/v3f16 buffer intrinsics.

LGTM

Tue, Oct 15, 1:10 PM · Restricted Project
arsenm added inline comments to D68881: [AMDGPU] Improve code size cost model.
Tue, Oct 15, 1:10 PM · Restricted Project
arsenm added inline comments to D68970: AMDGPU: Fix infinite searches in SIFixSGPRCopies.
Tue, Oct 15, 12:23 PM · Restricted Project
arsenm added inline comments to D68946: [MIParser] Set RegClassOrRegBank during instruction parsing.
Tue, Oct 15, 11:07 AM · Restricted Project