arsenm (Matt Arsenault)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 5 2012, 4:53 PM (468 w, 3 d)

Recent Activity

Thu, Nov 25

arsenm added a comment to D114533: LLVM IR should allow bitcast between address spaces with the same size..

This seems like it should not apply to non-integral address spaces?

No, it shouldn’t care about the individual address spaces. It’s a reinterpretation of the bits; the type meanings don’t matter.

Thu, Nov 25, 9:04 AM · Restricted Project, Restricted Project
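
As a loose analogy for the "reinterpret of the bits" remark above, here is a minimal C++ sketch of a same-size bit reinterpretation. It is not LLVM code and is not taken from D114533; the helper name `reinterpret_bits` is hypothetical, and the example assumes `float` is a 32-bit IEEE-754 type.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not an LLVM API): copy the raw bits of `from` into a
// value of type To. Only the size has to match; what the two types "mean"
// plays no role in the conversion.
template <typename To, typename From>
To reinterpret_bits(const From &from) {
    static_assert(sizeof(To) == sizeof(From), "source and destination sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}
```

The size check is the only constraint the helper enforces, which mirrors the point of the comment: the operation depends on the width of the value, not on the meaning of either type.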

Wed, Nov 24

arsenm added a comment to D114533: LLVM IR should allow bitcast between address spaces with the same size..

Patch description should include this avoids a need to introduce ptrtoint/inttoptr pairs

That is a good point, but all the motivational cases seem to be about changing the address space of the pointee pointer.
Does this come up for changing the address space of the pointer itself? I'm just wondering if this is still relevant with opaque pointers.

Wed, Nov 24, 7:15 AM · Restricted Project, Restricted Project
arsenm added a comment to D114533: LLVM IR should allow bitcast between address spaces with the same size..

Patch description should include this avoids a need to introduce ptrtoint/inttoptr pairs

Wed, Nov 24, 6:39 AM · Restricted Project, Restricted Project

Tue, Nov 23

arsenm accepted D113448: [AMDGPU] Check for unneeded shift mask in shift PatFrags..

LGTM. Solving constant regbankselect is not really related and shouldn't hold this up

Tue, Nov 23, 3:12 PM · Restricted Project
arsenm accepted D113986: [AMDGPU] Implement widening multiplies with v_mad_i64_i32/v_mad_u64_u32.
Tue, Nov 23, 3:04 PM · Restricted Project
arsenm accepted D114252: [AMDGPU] Only select VOP3 forms of VOP2 instructions.
Tue, Nov 23, 3:03 PM · Restricted Project
arsenm added inline comments to D114232: [AMDGPU] Fold more inline constant operands by commuting instructions.
Tue, Nov 23, 3:02 PM · Restricted Project
arsenm committed rG273a0c8bc9c7: PrologEpilogInserter: Use explicit control for scavenge slot placement (authored by arsenm).
Tue, Nov 23, 3:01 PM
arsenm closed D113960: PrologEpilogInserter: Use explicit control for scavenge slot placement.

273a0c8bc9c774aa0d5982c23dc3d62b68ef4338

Tue, Nov 23, 3:01 PM · Restricted Project

Mon, Nov 22

arsenm added a comment to D114274: [openmp][amdgpu] Make plugin robust to presence of explicit implicit arguments.

This makes my assert problem go away

Mon, Nov 22, 8:26 AM · Restricted Project

Fri, Nov 19

arsenm added inline comments to D114274: [openmp][amdgpu] Make plugin robust to presence of explicit implicit arguments.
Fri, Nov 19, 12:54 PM · Restricted Project
arsenm added inline comments to D114270: [openmp][amdgpu][nfc] Simplify implicit args handling.
Fri, Nov 19, 11:50 AM · Restricted Project
arsenm added inline comments to D114270: [openmp][amdgpu][nfc] Simplify implicit args handling.
Fri, Nov 19, 11:48 AM · Restricted Project
arsenm requested review of D114247: OpenMP: Correctly query location for amdgpu-arch.
Fri, Nov 19, 7:39 AM
arsenm added inline comments to D114230: [AMDGPU] Use new opcode for indexed vgpr reads.
Fri, Nov 19, 5:06 AM · Restricted Project
arsenm accepted D114230: [AMDGPU] Use new opcode for indexed vgpr reads.

LGTM, I've meant to do this before

Fri, Nov 19, 5:05 AM · Restricted Project
arsenm added a comment to D114232: [AMDGPU] Fold more inline constant operands by commuting instructions.

A dedicated test (maybe MIR) would be nice

Fri, Nov 19, 5:03 AM · Restricted Project

Thu, Nov 18

arsenm accepted D114158: [AMDGPU] Remove redundant optimization in selectG_BUILD_VECTOR_TRUNC.
Thu, Nov 18, 6:23 PM · Restricted Project
arsenm added inline comments to D114202: [AMDGPU] Fix SIPostRABundler crash on null register used by dbg value.
Thu, Nov 18, 4:05 PM · Restricted Project
arsenm added inline comments to D113538: OpenMP: Start calling setTargetAttributes for generated kernels.
Thu, Nov 18, 7:52 AM

Tue, Nov 16

arsenm added a comment to D113538: OpenMP: Start calling setTargetAttributes for generated kernels.

ping

Tue, Nov 16, 6:58 AM
arsenm added a comment to D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas.

So you won't articulate or document the new invariant and you think there's a llvm-dev discussion that says we can't verify the invariant which you won't reference, but means you won't add this to the verifier.

Tue, Nov 16, 6:18 AM · Restricted Project
arsenm added a comment to D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas.

This is not something specific to AMDGPU backend, but AMDGPU backend at present requires this canonical form.

Tue, Nov 16, 6:16 AM · Restricted Project
arsenm accepted D113493: [CodeGen] Update LiveIntervals in TargetInstrInfo::convertToThreeAddress.
Tue, Nov 16, 6:15 AM · Restricted Project
arsenm accepted D113985: [AMDGPU] Generate test checks for mad_64_32.ll.
Tue, Nov 16, 6:09 AM · Restricted Project
arsenm added a comment to D113986: [AMDGPU] Implement widening multiplies with v_mad_i64_i32/v_mad_u64_u32.

GlobalISel version?

Tue, Nov 16, 6:08 AM · Restricted Project

Mon, Nov 15

arsenm accepted D113800: [amdgpu] Don't crash on empty global ctor/dtor.

I'm still not a huge fan of purely negative checks since it's so easy for them to break

Mon, Nov 15, 6:50 PM · Restricted Project
arsenm committed rG659887b40562: AMDGPU: Mark prolog/epilog SCC defs as dead (authored by arsenm).
Mon, Nov 15, 6:35 PM
arsenm committed rGe6bfbd7e0dc4: AMDGPU: Regenerate test checks (authored by arsenm).
Mon, Nov 15, 6:35 PM
arsenm closed D113829: AMDGPU: Mark prolog/epilog SCC defs as dead.

659887b4056236d376c0ac1218ca3f7a0dd75604

Mon, Nov 15, 6:35 PM · Restricted Project
arsenm requested review of D113960: PrologEpilogInserter: Use explicit control for scavenge slot placement.
Mon, Nov 15, 6:15 PM · Restricted Project
arsenm accepted D112827: [AMDGPU][GlobalISel] Fold G_FNEG above when users cannot fold mods.
Mon, Nov 15, 6:08 PM · Restricted Project
arsenm added a comment to D113784: [RFC][AMDGPU][GlobalISel] Fix RegBanks for G_CONSTANT.

Last time I thought about this I thought it would be easier to have the post-regbank combiner handle this. In some situations it makes most sense to completely rematerialize the constant value in each regbank, not just reassign it. If you have an inline constant, it would be better to just emit a new constant for each bank. For multiple uses of literals it's trickier, since there's a code size or instruction count tradeoff based on the uses.

Mon, Nov 15, 6:01 PM · Restricted Project
arsenm added reviewers for D113784: [RFC][AMDGPU][GlobalISel] Fix RegBanks for G_CONSTANT: qcolombet, aemerson, paquette.
Mon, Nov 15, 5:58 PM · Restricted Project

Sat, Nov 13

arsenm requested review of D113829: AMDGPU: Mark prolog/epilog SCC defs as dead.
Sat, Nov 13, 9:55 AM · Restricted Project
arsenm committed rG54172326e095: AMDGPU: Regenerate test checks (authored by arsenm).
Sat, Nov 13, 8:35 AM

Fri, Nov 12

arsenm added inline comments to D113800: [amdgpu] Don't crash on empty global ctor/dtor.
Fri, Nov 12, 1:19 PM · Restricted Project
arsenm added a comment to D113778: [AMDGPU] Use shift for b64 mov.

64-bit shifts were quarter rate instructions last I checked, so this is slower

The Write64Bit definitions in SISchedule.td suggest they are half rate on most subtargets and full rate on gfx90a.

Fri, Nov 12, 8:59 AM · Restricted Project
arsenm added a comment to D113778: [AMDGPU] Use shift for b64 mov.

64-bit shifts were quarter rate instructions last I checked, so this is slower

Fri, Nov 12, 8:49 AM · Restricted Project

Thu, Nov 11

arsenm accepted D113679: [AMDGPU] Simplify 64-bit division/remainder expansion.
Thu, Nov 11, 8:04 AM · Restricted Project
arsenm accepted D112639: [openmp][amdgpu] Add comment warning that libm may be broken.
Thu, Nov 11, 7:37 AM · Restricted Project
arsenm accepted D113671: [CodeGen] Tweak whitespace in LiveInterval printing.
Thu, Nov 11, 6:57 AM · Restricted Project
arsenm accepted D113671: [CodeGen] Tweak whitespace in LiveInterval printing.

I don't see the difference in the example, but maybe phabricator just ate the whitespace differences

Thu, Nov 11, 6:30 AM · Restricted Project
arsenm added inline comments to D112025: Intrinsic for checking floating point class.
Thu, Nov 11, 6:19 AM · Restricted Project
arsenm added a comment to D112696: CycleInfo: Introduce cycles as a generalization of loops.

Do you have any users of the analysis or any plans? Otherwise, it could be seen as dead code?

Thu, Nov 11, 6:11 AM · Restricted Project

Wed, Nov 10

arsenm accepted D110411: [LiveIntervals] Update subranges in processTiedPairs.
Wed, Nov 10, 5:27 PM · Restricted Project
arsenm added a comment to D112488: AMDGPU: Assume all amdhsa kernarg passed implicit arguments by default.

Can we land this? Seems likely to step on many latent bugs all at once

The only holdup is surviving openmp precheckin, which is mostly complicated due to a reverted commit

Wed, Nov 10, 5:20 PM · Restricted Project
arsenm accepted D113628: [AMDGPU] Fixed stack pointer init with architected flat scratch.
Wed, Nov 10, 5:05 PM · Restricted Project
arsenm closed D113627: AMDGPU: Report large stack usage for recursive calls.

c7a0c2d0f7be2f456bd72b5c3508966d5b10233b

Wed, Nov 10, 5:03 PM · Restricted Project
arsenm committed rGc7a0c2d0f7be: AMDGPU: Report large stack usage for recursive calls (authored by arsenm).
Wed, Nov 10, 5:02 PM
arsenm added inline comments to D113629: Add path to lower addrspacecasts in constant exprs for __ptr32/__ptr64..
Wed, Nov 10, 4:51 PM · Restricted Project
arsenm requested review of D113627: AMDGPU: Report large stack usage for recursive calls.
Wed, Nov 10, 4:31 PM · Restricted Project
arsenm added a comment to D113624: Revert "[amdgpu] Enable selection of `s_cselect_b64`.".

Description missing justification

Wed, Nov 10, 4:23 PM · Restricted Project
arsenm added inline comments to D112827: [AMDGPU][GlobalISel] Fold G_FNEG above when users cannot fold mods.
Wed, Nov 10, 2:40 PM · Restricted Project
arsenm added a comment to D112488: AMDGPU: Assume all amdhsa kernarg passed implicit arguments by default.

Can we land this? Seems likely to step on many latent bugs all at once

Wed, Nov 10, 1:59 PM · Restricted Project
arsenm updated the diff for D113538: OpenMP: Start calling setTargetAttributes for generated kernels.

Also test non-kernel

Wed, Nov 10, 6:17 AM
arsenm accepted D113044: [TwoAddressInstruction] Update LiveIntervals after rewriting INSERT_SUBREG to COPY.
Wed, Nov 10, 6:03 AM · Restricted Project
arsenm added a comment to D113538: OpenMP: Start calling setTargetAttributes for generated kernels.

That seems important. What was the symptom of failing to set these? We may now be redundantly setting some, e.g. I think convergent is set somewhere else before this patch.

Wed, Nov 10, 5:52 AM

Tue, Nov 9

arsenm requested review of D113538: OpenMP: Start calling setTargetAttributes for generated kernels.
Tue, Nov 9, 7:20 PM
arsenm closed D112716: AMDGPU: Account for implicit argument alignment for kernarg segment.

90ff14871904881fb156a1d4d5fb083ca75998ab

Tue, Nov 9, 2:49 PM · Restricted Project
arsenm committed rG90ff14871904: AMDGPU: Account for implicit argument alignment for kernarg segment (authored by arsenm).
Tue, Nov 9, 2:49 PM
arsenm accepted D113437: [GlobalISel] Ensure that translateInvoke adds all successors for inlineasm.
Tue, Nov 9, 12:51 PM · Restricted Project
arsenm committed rG62ffcc5f3793: AMDGPU: Regenerate test checks (authored by arsenm).
Tue, Nov 9, 12:29 PM
arsenm added a comment to D112716: AMDGPU: Account for implicit argument alignment for kernarg segment.

ping

Tue, Nov 9, 12:27 PM · Restricted Project
arsenm requested changes to D55067: [HIP] Fix offset of kernel argument for AMDGPU target.

Is this still relevant? We want to move towards consistently using byref for kernel arguments anyway

Tue, Nov 9, 9:30 AM
arsenm added a comment to D94264: [GlobalISel] Add MachineInstNumbering to CSEInfo and propagate CSE throughout AArch64 pipeline..

Is this still relevant? I thought this was done already

Tue, Nov 9, 9:29 AM · Restricted Project
arsenm requested changes to D112635: [AMDGPU] Disable d16 loads/stores to high halves on gfx90a.

You're probably experiencing a driver issue

Tue, Nov 9, 9:26 AM · Restricted Project
arsenm added inline comments to D113448: [AMDGPU] Check for unneeded shift mask in shift PatFrags..
Tue, Nov 9, 5:47 AM · Restricted Project

Mon, Nov 8

arsenm accepted D112041: [InferAddressSpaces] Support assumed addrspaces from addrspace predicates..
Mon, Nov 8, 7:51 AM · Restricted Project, Restricted Project
arsenm added inline comments to D80804: [AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics.
Mon, Nov 8, 5:54 AM · Restricted Project
arsenm accepted D113318: [AMDGPU] Make getInstSizeInBytes more generic.
Mon, Nov 8, 5:43 AM · Restricted Project

Fri, Nov 5

arsenm accepted D113191: [TwoAddressInstructionPass] Update existing physreg live intervals.
Fri, Nov 5, 9:47 AM · Restricted Project

Wed, Nov 3

arsenm added inline comments to D93154: GlobalISel: remove assert that memcpy Src and Dst addrspace must be identical.
Wed, Nov 3, 3:36 PM · Restricted Project
arsenm added inline comments to D112041: [InferAddressSpaces] Support assumed addrspaces from addrspace predicates..
Wed, Nov 3, 8:14 AM · Restricted Project, Restricted Project

Tue, Nov 2

arsenm added inline comments to D113005: [AMDGPU] Fix SGPR checks in S_MOV_B64_IMM_PSEUDO generation..
Tue, Nov 2, 7:07 AM · Restricted Project

Mon, Nov 1

arsenm added a comment to D112716: AMDGPU: Account for implicit argument alignment for kernarg segment.

ping

Mon, Nov 1, 4:05 PM · Restricted Project
arsenm added a comment to D112760: Require 'contract' fast-math flag for FMA generation.

I would rephrase the description as removing the global flag for contraction

This change also removes the behavior of the function attribute "unsafe-fp-math" enabling fp-contraction.

Mon, Nov 1, 3:05 PM · Restricted Project
arsenm accepted D109032: [AMDGPU][NFC] Alter ComplexPattern types to be consistent with their uses.

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

Those examples may not be what you intend them to be, but I can promise you that if you go read AMDGPUGenDAGISel.inc yourself you will find that every single one of them is currently matching an iPTR. For example:

// Src: (ld:{ *:[i32] } (ScratchOffset:{ *:[iPTR] } VGPR_32:{ *:[i32] }:$vaddr, i16:{ *:[i16] }:$offset))<<P:Predicate_unindexedload>><<P:Predicate_extload>><<P:Predicate_extloadi8_private>> - Complexity = 22
// Dst: (SCRATCH_LOAD_UBYTE:{ *:[i32] } ?:{ *:[i32] }:$vaddr, ?:{ *:[i16] }:$offset)

is what I see *without* any of my patches, so I'm just preserving that. Things blow up with failures to select, or poor codegen that regresses tests, when these instead infer i32. If you want to see for yourself, apply D109035, note that the TableGen output changes to honour the existing intended types and that many AMDGPU tests fail.

Ping regarding this explanation?

Mon, Nov 1, 2:33 PM · Restricted Project
arsenm added inline comments to D110950: [AMDGPU] Enable divergence-driven BFE selection.
Mon, Nov 1, 1:34 PM · Restricted Project
arsenm accepted D112717: [IR] Replace *all* uses of a constant expression by corresponding instruction.
Mon, Nov 1, 9:48 AM · Restricted Project
arsenm accepted D112917: [AMDGPU] Shrink v_mac_legacy_f32 and v_fmac_legacy_f32.
Mon, Nov 1, 6:53 AM · Restricted Project

Sat, Oct 30

arsenm added inline comments to D112820: Emit hidden hostcall argument for sanitized kernels..
Sat, Oct 30, 7:35 AM · Restricted Project, Restricted Project

Fri, Oct 29

arsenm added a comment to D112760: Require 'contract' fast-math flag for FMA generation.

I would rephrase the description as removing the global flag for contraction

Fri, Oct 29, 2:05 PM · Restricted Project
arsenm accepted D112323: GlobalISel/Utils: Use incoming regbank while constraining the superclasses.
Fri, Oct 29, 2:04 PM · Restricted Project
arsenm accepted D112644: [AMDGPU] Fix global isel for kernels using agprs on gfx90a.
Fri, Oct 29, 2:03 PM · Restricted Project
arsenm added a comment to D112644: [AMDGPU] Fix global isel for kernels using agprs on gfx90a.

How was this breaking?

LLVM ERROR: no registers from class available to allocate

What happens is that usesAGPRs() does not see any AGPRs used, and then the whole register budget is allocated to VGPRs.

Wouldn't it work correctly after later getReservedRegs calls? I don't think we should be calling this before finalizeIsel?

It does work correctly with a later call from freezeReservedRegs(). The first call though comes from MachineVerifier::visitMachineFunctionBefore().

Fri, Oct 29, 1:47 PM · Restricted Project
arsenm added inline comments to D112717: [IR] Replace *all* uses of a constant expression by corresponding instruction.
Fri, Oct 29, 1:42 PM · Restricted Project
arsenm added a comment to D112827: [AMDGPU][GlobalISel] Fold G_FNEG above when users cannot fold mods.

We should also apply these pre-legalization like in the DAG

Fri, Oct 29, 11:22 AM · Restricted Project
arsenm closed D112715: AMDGPU: Check kernarg alignments in test.

52fc2edb5357075c7c746adc274d513f48d412b8 plus rebase

Fri, Oct 29, 9:43 AM · Restricted Project
arsenm committed rG52fc2edb5357: AMDGPU: Check kernarg alignments in test (authored by arsenm).
Fri, Oct 29, 9:42 AM
arsenm requested changes to D112717: [IR] Replace *all* uses of a constant expression by corresponding instruction.
Fri, Oct 29, 6:17 AM · Restricted Project
arsenm added inline comments to D112717: [IR] Replace *all* uses of a constant expression by corresponding instruction.
Fri, Oct 29, 6:16 AM · Restricted Project
arsenm added inline comments to D112696: CycleInfo: Introduce cycles as a generalization of loops.
Fri, Oct 29, 6:13 AM · Restricted Project
arsenm accepted D112813: [AMDGPU] Change numBitsSigned for simplicity and document it. NFC..
Fri, Oct 29, 6:11 AM · Restricted Project

Oct 28 2021

arsenm accepted D112733: [AMDGPU] Fix cvt_f32_ubyte combine with shl.
Oct 28 2021, 4:40 PM · Restricted Project
arsenm added inline comments to D112733: [AMDGPU] Fix cvt_f32_ubyte combine with shl.
Oct 28 2021, 12:55 PM · Restricted Project
arsenm added inline comments to D112731: [AMDGPU] Really preserve LiveVariables in SILowerControlFlow.
Oct 28 2021, 11:49 AM · Restricted Project
arsenm accepted D112733: [AMDGPU] Fix cvt_f32_ubyte combine with shl.
Oct 28 2021, 11:06 AM · Restricted Project
arsenm accepted D110076: [AMDGPU][GlobalISel] Code quality: Combine V_RSQ.
Oct 28 2021, 7:03 AM · Restricted Project
arsenm requested review of D112716: AMDGPU: Account for implicit argument alignment for kernarg segment.
Oct 28 2021, 6:28 AM · Restricted Project