Page MenuHomePhabricator

arsenm (Matt Arsenault)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 5 2012, 4:53 PM (435 w, 6 d)

Recent Activity

Today

arsenm added a comment to D100481: [AMDGPU] Disable forceful inline of non-kernel functions which use LDS..

Graphics does use region, so we can't break that

Wed, Apr 14, 2:25 PM · Restricted Project
arsenm requested changes to D100430: [AMDGPU][GlobalISel] Widen 1 and 2 byte scalar loads.

Needs some pure MIR tests too (also including negative tests)

Wed, Apr 14, 2:23 PM · Restricted Project
arsenm added a comment to D100453: [MIR][NFC] Introduce a new method to check a MachineInstr contains implicit register.

This is also further confused by the two types of implicit operands. There are implicit physical register uses as present in the instruction definition, but also implicit uses that code can arbitrarily append to an instruction

Wed, Apr 14, 5:43 AM · Restricted Project
arsenm requested changes to D100453: [MIR][NFC] Introduce a new method to check a MachineInstr contains implicit register.
Wed, Apr 14, 5:42 AM · Restricted Project
arsenm accepted D100461: [AMDGPU] Mark scavenged SGPR as used.
Wed, Apr 14, 5:40 AM · Restricted Project
arsenm added a comment to D100453: [MIR][NFC] Introduce a new method to check a MachineInstr contains implicit register.

I don't think a helper that's only useful in the middle of a partial instruction modification is useful to generally expose

Wed, Apr 14, 5:40 AM · Restricted Project

Yesterday

arsenm added a comment to D99833: [TableGen] generate `getRegBankFromRegClass`.

When I build llc with this patch, with AMDGPU as a target, I get a lot of warnings.

E.g.

Included from /Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPU.td:1439:
/Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBanks.td:13:1: warning: Register class 'AV_128' is already mapped to register bank 'AGPR'.
def VGPRRegBank : RegisterBank<"VGPR",
^
Included from /Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPU.td:1439:
/Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBanks.td:13:1: warning: Register class 'AV_160' is already mapped to register bank 'AGPR'.
def VGPRRegBank : RegisterBank<"VGPR",
^

Is that expected?

AV_* classes are for operands that can be one of two different regbanks. You can't unambiguously go back from the class to the register bank. At the moment these are unallocatable classes, although this will probably be changing soon. I'm not really sure what this code should do in this case.

Tue, Apr 13, 11:19 AM · Restricted Project
arsenm added a comment to D99833: [TableGen] generate `getRegBankFromRegClass`.

When I build llc with this patch, with AMDGPU as a target, I get a lot of warnings.

E.g.

Included from /Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPU.td:1439:
/Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBanks.td:13:1: warning: Register class 'AV_128' is already mapped to register bank 'AGPR'.
def VGPRRegBank : RegisterBank<"VGPR",
^
Included from /Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPU.td:1439:
/Users/gruyere/llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBanks.td:13:1: warning: Register class 'AV_160' is already mapped to register bank 'AGPR'.
def VGPRRegBank : RegisterBank<"VGPR",
^

Is that expected?

Tue, Apr 13, 11:16 AM · Restricted Project

Mon, Apr 12

arsenm requested review of D100358: GlobalISel: Defer register creation in handleAssignments.
Mon, Apr 12, 7:24 PM · Restricted Project
arsenm accepted D99347: [AMDGPU] Set implicit arg attributes for indirect calls.
Mon, Apr 12, 12:10 PM · Restricted Project
arsenm added inline comments to D100300: [AMDGPU] Refactor ds_read/ds_write related select code for better readability..
Mon, Apr 12, 5:16 AM · Restricted Project
arsenm accepted D100295: [AMDGPU] Kill temporary register after restoring.

This is why we should work to migrate to the reverse scavenger which does not depend on kill flags

Mon, Apr 12, 5:14 AM · Restricted Project

Sat, Apr 10

arsenm added a comment to D99469: GlobalISel: Restrict narrow scalar for fptoui/fptosi results.

The code change looks fine to me. The added tests aren't actually testing any behaviour that has changed, are they? Is there any reasonable way to add tests for the unable-to-legalize case?

Sat, Apr 10, 6:58 AM · Restricted Project

Fri, Apr 9

arsenm added a comment to D100236: AMDGPU: Restore atomic fp feature on FP atomic instruction definitions.

We should not produce this for gfx803.

Fri, Apr 9, 3:56 PM · Restricted Project
arsenm requested review of D100236: AMDGPU: Restore atomic fp feature on FP atomic instruction definitions.
Fri, Apr 9, 3:46 PM · Restricted Project
arsenm accepted D96336: [AMDGPU] Save VGPR of whole wave when spilling.

LGTM, although should look into updating the MFI serialization

Fri, Apr 9, 2:08 PM · Restricted Project
arsenm requested changes to D87903: [CSInfo][GlobalISel] CallSiteInfo support when using GlobalISel.
Fri, Apr 9, 1:36 PM · debug-info, Restricted Project
arsenm added inline comments to D99635: [SelectionDAG] Add extra check on asm operand legalization..
Fri, Apr 9, 1:19 PM · Restricted Project
arsenm added a comment to D98884: [IR] Ignore bitcasts of function pointers which are only used as callees in callbase instruction.

This change has wider impact than was expected as it affects several components and backends. Bootstrap builds fail with this change and as the buildbots show, this change generates a buggy Tablegen binary in release mode.

Debugging:

  1. Disabling the newly added code in hasAddressTaken(), we get sane Tablegen and builds pass. To note, debug builds pass with the change. This clearly shows that codegen is functionally different in release and debug mode for Tablegen.
  2. Doing “ninja check-all” on stage1 build does not help although we have ~90000 tests.
  3. Running valgrind also did not reveal any new things.
  4. Doing ASAN build of the compiler led to several errors and drawing any conclusion from it is close to impossible for this patch.

However, if I disable IPSCCP pass on top of this patch then everything turns green. I tried to fix trivial issues in the pass but it seems the pass needs deeper and major fixes to make this patch land.

Till we fix issues in IPSCCP, this patch stands reverted. The whole intention of this patch was to handle a subset of cases for AMDGPU backend, specifically, needed for https://reviews.llvm.org/D99347.

Fri, Apr 9, 1:11 PM · Restricted Project
arsenm accepted D100117: [AMDGPU] SIFoldOperands: eagerly delete dead copies.
Fri, Apr 9, 5:28 AM · Restricted Project

Thu, Apr 8

arsenm added inline comments to D100149: [AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX.
Thu, Apr 8, 4:06 PM · Restricted Project
arsenm accepted D100135: [AMDGPU] Check for all meta instrs in GCNRegBankReassign.
Thu, Apr 8, 1:37 PM · Restricted Project
arsenm added a comment to D100135: [AMDGPU] Check for all meta instrs in GCNRegBankReassign.

If it doesn't make an observable difference, there's no real reason to add the extra check?

Thu, Apr 8, 1:31 PM · Restricted Project
arsenm accepted D100109: [RegisterScavenging] Add asserts for better errors.
Thu, Apr 8, 10:31 AM · Restricted Project
arsenm added a comment to D100117: [AMDGPU] SIFoldOperands: eagerly delete dead copies.

Can you add a test for the DBG_VALUE case?

Thu, Apr 8, 10:30 AM · Restricted Project
arsenm added inline comments to D100109: [RegisterScavenging] Add asserts for better errors.
Thu, Apr 8, 8:47 AM · Restricted Project
arsenm added inline comments to D100117: [AMDGPU] SIFoldOperands: eagerly delete dead copies.
Thu, Apr 8, 8:45 AM · Restricted Project
arsenm accepted D100100: [AMDGPU] SIFoldOperands: try harder to fold cndmask instructions.
Thu, Apr 8, 6:21 AM · Restricted Project
arsenm added a comment to D99743: [AMDGPU] Use enum for flat variants. NFC.

Can we just drop the Is from the bit name?

Thu, Apr 8, 5:54 AM · Restricted Project
arsenm accepted D100098: [AMDGPU] Fix computing live registers in prolog.
Thu, Apr 8, 5:41 AM · Restricted Project

Wed, Apr 7

arsenm added a comment to D100072: [AMDGPU] Allow -amdgpu-unsafe-fp-atomics to ignore denorm mode.

I guess it would be nicer

Wed, Apr 7, 3:38 PM · Restricted Project, Restricted Project
arsenm added a comment to D99865: [AMDGPU, test] Fix use of undef FileCheck var.

It's really hard to write checks for R600, but it's in maintenance mode. The comment doesn't really make sense to me. I doubt this check was ever really correct or stable. You can't really write this test stably and expect specific things in T registers at a specific time. I guess just go with this is it passes?

Unfortunately the test case as updated by this diff FAILs, as can be seen above. Unfortunately the current testcase only succeeds due to a bug in FileCheck which I have a patch for. FileCheck currently count an undefined variable as a match failure which makes a CHECK-NOT with an undefined variable succeed. Does the comment make sense with the old CHECKS (see https://reviews.llvm.org/rG880a80ad07b6ea7ead3a842fc03c74c2247c9486)?

Should we remove those checks altogether?

Wed, Apr 7, 3:02 PM · Restricted Project
arsenm accepted D99865: [AMDGPU, test] Fix use of undef FileCheck var.

It's really hard to write checks for R600, but it's in maintenance mode. The comment doesn't really make sense to me. I doubt this check was ever really correct or stable. You can't really write this test stably and expect specific things in T registers at a specific time. I guess just go with this is it passes?

Wed, Apr 7, 2:30 PM · Restricted Project
arsenm added a comment to D100069: Disable use of SCC bit from asm.

Message / name is somewhat confusing since I assumed this meant the scc register

Wed, Apr 7, 2:10 PM · Restricted Project
arsenm accepted D100063: [AMDGPU] Split GCNRegBankReassign.

If you will run rewriter once no changes are needed at all. Just run the pass right before the rewriter. Although currently you are running it twice. What are your plans about this?

Wed, Apr 7, 1:45 PM · Restricted Project
arsenm added a comment to D100063: [AMDGPU] Split GCNRegBankReassign.

Can't this just skip virtual registers? Does it really need to know which mode its in?

All registers are virtual here. It will skip a register if there is no assignment in VRM (and obviously skip physregs).
This is slower as it is though. I can add the check "Reg.isPhysical() || !VRM->isAssignedReg(Reg)" earlier to mitigate it.
Will that help?

Wed, Apr 7, 1:20 PM · Restricted Project
arsenm added a comment to D100063: [AMDGPU] Split GCNRegBankReassign.

Can't this just skip virtual registers? Does it really need to know which mode its in?

Wed, Apr 7, 1:10 PM · Restricted Project
arsenm accepted D100031: [AMDGPU] SIFoldOperands: clean up tryConstantFoldOp.
Wed, Apr 7, 5:58 AM · Restricted Project

Tue, Apr 6

arsenm added inline comments to D99908: [GlobalISel] Simplify G_ICMP against true/false when boolean contents are 0/1.
Tue, Apr 6, 2:24 PM · Restricted Project
arsenm accepted D99955: AMDGPU: Add isBranch=1 to SOPP branch instructions.
Tue, Apr 6, 7:28 AM · Restricted Project
arsenm accepted D99722: [AMDGPU] Update SGPRSpillVGPRCSR name. NFC.
Tue, Apr 6, 6:01 AM · Restricted Project

Mon, Apr 5

arsenm accepted D99902: Copy syncscope when expanding atomicrmw into cmpxchg loop.
Mon, Apr 5, 5:29 PM · Restricted Project
arsenm added inline comments to D99908: [GlobalISel] Simplify G_ICMP against true/false when boolean contents are 0/1.
Mon, Apr 5, 5:07 PM · Restricted Project
arsenm accepted D99902: Copy syncscope when expanding atomicrmw into cmpxchg loop.
Mon, Apr 5, 4:48 PM · Restricted Project

Thu, Apr 1

arsenm requested changes to D99772: [AMDGPU] Check for NaN when folding output modifiers.

Oh whoops, I forgot the hardware totally ignores the modifiers with IEEE mode on (not that this behavior makes any sense). We can't actually do this

Thu, Apr 1, 3:19 PM · Restricted Project
arsenm accepted D99733: [GlobalISel]: Add a getConstantIntVRegVal utility.

LGTM, although we have a few too many constant lookup functions at this point

Thu, Apr 1, 12:53 PM · Restricted Project
arsenm accepted D99739: [GlobalISel] Allow different tyeps for G_SBFX and G_UBFX operands.

Typo tyeps in commit message

Thu, Apr 1, 12:46 PM · Restricted Project
arsenm accepted D99670: [AMDGPU] Don't rely on SIAddIMGInit for GlobalISel.
Thu, Apr 1, 10:11 AM · Restricted Project
arsenm accepted D99748: [AMDGPU] Remove SIAddIMGInit pass which is now unused.
Thu, Apr 1, 10:11 AM · Restricted Project
arsenm accepted D99747: [AMDGPU][SDag] Add IMG init in AdjustInstrPostInstrSelection.
Thu, Apr 1, 10:11 AM · Restricted Project
arsenm added a comment to D99735: [MIPatternMatch]: Add mi_match for MachineInstr.

Probably should have some unit tests for this

Thu, Apr 1, 9:18 AM · Restricted Project
arsenm added reviewers for D99735: [MIPatternMatch]: Add mi_match for MachineInstr: aemerson, paquette, aditya_nandakumar, bogner, dsanders.
Thu, Apr 1, 9:18 AM · Restricted Project
arsenm added reviewers for D99736: [MIPatternMatch]: Add matchers for binary instructions: aemerson, paquette, aditya_nandakumar, bogner, dsanders.
Thu, Apr 1, 9:12 AM · Restricted Project
arsenm added a comment to D99739: [GlobalISel] Allow different tyeps for G_SBFX and G_UBFX operands.

Description is misleading. This should be allow a different type for the offset/width operands, which do not necessarily need to be constant

Thu, Apr 1, 9:11 AM · Restricted Project

Wed, Mar 31

arsenm requested changes to D92999: [amdgpu] Enhance load widening in the constant address space..
Wed, Mar 31, 4:44 PM · Restricted Project
arsenm added inline comments to D92999: [amdgpu] Enhance load widening in the constant address space..
Wed, Mar 31, 4:44 PM · Restricted Project
arsenm requested changes to D98050: [AMDGPU][GlobalISel] Transform (fsub (fpext (fneg (fmul x, y))), z) -> (fneg (fma (fpext x), (fpext y), z)).
Wed, Mar 31, 4:23 PM · Restricted Project
arsenm added inline comments to D99269: [AMDGPU] Unify spill code.
Wed, Mar 31, 4:12 PM · Restricted Project
arsenm requested changes to D99635: [SelectionDAG] Add extra check on asm operand legalization..

Missing test

Wed, Mar 31, 3:57 PM · Restricted Project
arsenm added inline comments to D99284: [RegAllocFast] properly handle STATEPOINT instruction..
Wed, Mar 31, 3:08 PM · Restricted Project
arsenm added inline comments to D90050: AMDGPU/GlobalISel: Add integer med3 combines.
Wed, Mar 31, 10:46 AM · Restricted Project
arsenm requested changes to D99670: [AMDGPU] Don't rely on SIAddIMGInit for GlobalISel.

If the fallback is enabled, this pass will still run. You need to have the pass check if isel failed and skip the function if it suceeded

Wed, Mar 31, 9:55 AM · Restricted Project
arsenm added inline comments to D96605: AMDGPU/GlobalISel: Remove redundant G_FCANONICALIZE.
Wed, Mar 31, 9:19 AM · Restricted Project
arsenm accepted D99505: [AMDGPU] Enable output modifiers for double precision instructions.
Wed, Mar 31, 9:15 AM · Restricted Project
arsenm added a comment to D98515: [AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks.

%1:vgpr(s32) = COPY $sgpr0

I wonder if we should ban this in the verifier. It's not wrong, but it sure feels like bad form to allow cross bank copies involving physical registers

Wed, Mar 31, 6:18 AM · Restricted Project

Tue, Mar 30

arsenm updated the diff for D99469: GlobalISel: Restrict narrow scalar for fptoui/fptosi results.

Style changes

Tue, Mar 30, 7:01 PM · Restricted Project
arsenm added a comment to D81899: [gicombiner] Unify common state for current targets into CommonTargetCombinerHelperState.

reverse ping

Tue, Mar 30, 6:27 PM · Restricted Project
arsenm added inline comments to D84397: [MSP430] Replace known epilogues with branches to __mspabi_func_epilog_N.
Tue, Mar 30, 6:26 PM · Restricted Project
arsenm resigned from D88485: [SDag][AMDGPU] Maintain DAG divergence through instruction selection.

I guess this is technically correct but I wouldn't spend time getting this accurate

Tue, Mar 30, 6:19 PM · Restricted Project
arsenm requested changes to D82713: Improve stack object printing..
Tue, Mar 30, 6:17 PM · Restricted Project
arsenm requested changes to D86269: [RFC][Target] Add a new backend target called CSKY.

Can this be abandoned now?

Tue, Mar 30, 6:12 PM · Restricted Project
arsenm requested changes to D88667: [GlobalISel] Change asserting conditions when initializing call site info.

I think the block numbering is meaningless and shouldn't be any different between selectors

Tue, Mar 30, 6:07 PM · Restricted Project, debug-info
arsenm added a comment to D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space.

Is this still needed?

Tue, Mar 30, 6:03 PM · Restricted Project
arsenm requested changes to D90052: AMDGPU/GlobalISel: Add clamp combine for IEEE=false.
Tue, Mar 30, 6:02 PM · Restricted Project
arsenm requested changes to D93411: [GlobalISel] Transform sext (cmp pred, x, y) -> select (cmp pred, x, y) tval, 0.
Tue, Mar 30, 5:56 PM · Restricted Project
arsenm requested changes to D98491: [AMDGPU] Split GCN subtarget features for unaligned access.
Tue, Mar 30, 5:36 PM · Restricted Project
arsenm added inline comments to D99505: [AMDGPU] Enable output modifiers for double precision instructions.
Tue, Mar 30, 5:19 PM · Restricted Project
arsenm added inline comments to D99505: [AMDGPU] Enable output modifiers for double precision instructions.
Tue, Mar 30, 5:18 PM · Restricted Project
arsenm requested changes to D91753: [GlobalISel] Add an isExtendedTrueVal helper..
Tue, Mar 30, 3:59 PM · Restricted Project
arsenm requested changes to D92394: [amdgpu] Teach one more case for assumed global pointers..
Tue, Mar 30, 3:57 PM · Restricted Project
arsenm requested changes to D95795: [AMDGPU] Add new CostPerUse values for VGPRs.
Tue, Mar 30, 3:55 PM · Restricted Project
arsenm added a comment to D95664: [AVR] Fix the eliminateCallFramePseudos method so that it always expands STDWSPQRr and STDSPQRr.

I just realized I've been reading "frame pointer elimination" as "frame index elimination" but this change still doesn't make sense to me. Why do you want to treat these stores as frame instructions when they don't modify the stack pointer?

Tue, Mar 30, 3:52 PM · Restricted Project
arsenm requested changes to D95664: [AVR] Fix the eliminateCallFramePseudos method so that it always expands STDWSPQRr and STDSPQRr.

This seems like a confusion of what frame instructions means. What do these stores have to do with call frames? The stores should reference frame indexes representing fixed frame objects in the incoming arguments or local objects. The frame adjust instructions are for SP modifications in call sequences where instructions are directly referencing offsets off of the SP, which do not have corresponding frame indexes

Tue, Mar 30, 3:45 PM · Restricted Project
arsenm added a reviewer for D96179: Ignore assume like calls by default in hasAddressTaken(): jdoerfert.
Tue, Mar 30, 3:37 PM · Restricted Project
arsenm accepted D95554: [BitcodeReader] Validate Strtab before accessing..
Tue, Mar 30, 3:37 PM · Restricted Project
arsenm accepted D96869: [AMDGPU] Fix saving fp and bp.

LGTM. I think "lanes do not die in a function" may not be true in the future, but is good enough for now (not sure whether callers or callees should be responsible for wave termination on kill)

Tue, Mar 30, 3:35 PM · Restricted Project
arsenm accepted D97767: [AMDGPU][GlobalISel] Add support for global atomicrmw fadd.

LGTM, though ensuring the right mode in the lowering. We probably won't have MIR atomic expansions anytime soon

Tue, Mar 30, 3:33 PM · Restricted Project
arsenm requested changes to D96336: [AMDGPU] Save VGPR of whole wave when spilling.
Tue, Mar 30, 3:31 PM · Restricted Project
arsenm requested changes to D96605: AMDGPU/GlobalISel: Remove redundant G_FCANONICALIZE.

Should have isCanonicalize utility function

Tue, Mar 30, 3:31 PM · Restricted Project
arsenm requested changes to D99269: [AMDGPU] Unify spill code.
Tue, Mar 30, 3:28 PM · Restricted Project
arsenm requested changes to D97054: [MachineVerifier] Confirm that both ends of a tied def/use are live together.

Implicit operands aren't special. I also think the subreg handling is off

Tue, Mar 30, 3:25 PM · Restricted Project
arsenm requested changes to D98040: [AMDGPU][GlobalISel] Improve constant offset lookup for llvm.amdgcn.s.buffer.
Tue, Mar 30, 3:20 PM · Restricted Project
arsenm requested changes to D99284: [RegAllocFast] properly handle STATEPOINT instruction..
Tue, Mar 30, 3:20 PM · Restricted Project
arsenm added a comment to D99284: [RegAllocFast] properly handle STATEPOINT instruction..

I don't think this should special case statepoints. Any regmask with tied operands should behave the same way

Tue, Mar 30, 3:19 PM · Restricted Project
arsenm added inline comments to D99429: [AMDGPU] Save WWM registers in functions.
Tue, Mar 30, 3:17 PM · Restricted Project
arsenm updated the diff for D99541: GlobalISel: Check for powers of 2 for inverse funnel shift lowering.

Fix being totally wrong

Tue, Mar 30, 3:12 PM · Restricted Project
arsenm added inline comments to D99505: [AMDGPU] Enable output modifiers for double precision instructions.
Tue, Mar 30, 11:24 AM · Restricted Project
arsenm requested changes to D99352: [AMDGPU] ds_read_*/ds_write_* operations require strict alignment..
Tue, Mar 30, 9:38 AM · Restricted Project
arsenm added a comment to D99587: [AMDGPU] Deduce sign from address space.

We already use this connection in SIInstrInfo::isLegalFLATOffset, there is a check for ST.hasFlatSegmentOffsetBug() && AddrSpace == AMDGPUAS::FLAT_ADDRESS. I guess this is already broken then.

Yeah, that's wrong

Tue, Mar 30, 7:28 AM · Restricted Project
arsenm added a comment to D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask..

It seems to me that we may need to revise CFG lowering to avoid updating EXEC directly and later revise it based on whether the restoring mask needs reloading or not. Here's the brief thought in my mind:

  • Instead of lowering CFG early before RA, lower it after RA. As a byproduct, it also remove the need of "terminator" version of exec mask manipulation instructions.
  • When CFG is being lowered, it could update EXEC eagerly if the merge point doesn't need to reload the mask; Otherwise, it just needs to translate as what we currently did.

Any suggestions and comments?

Tue, Mar 30, 6:30 AM · Restricted Project