
foad (Jay Foad)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 29 2014, 9:58 AM (245 w, 6 d)

Recent Activity

Yesterday

foad added inline comments to D64328: [AMDGPU] Optimize atomic max/min.
Tue, Jul 16, 10:53 AM · Restricted Project
foad created D64809: [AMDGPU] Optimize atomic AND/OR/XOR.
Tue, Jul 16, 10:53 AM · Restricted Project
foad committed rG17060f0a54b6: [AMDGPU] Optimize atomic max/min (authored by foad).
[AMDGPU] Optimize atomic max/min
Tue, Jul 16, 10:47 AM
foad committed rL366235: [AMDGPU] Optimize atomic max/min.
[AMDGPU] Optimize atomic max/min
Tue, Jul 16, 10:45 AM
foad closed D64328: [AMDGPU] Optimize atomic max/min.
Tue, Jul 16, 10:44 AM · Restricted Project
foad added a comment to D64328: [AMDGPU] Optimize atomic max/min.

Ping?

Tue, Jul 16, 9:31 AM · Restricted Project
foad added inline comments to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.
Tue, Jul 16, 1:25 AM · Restricted Project
foad added a comment to rL366121: AMDGPU/GlobalISel: Select G_AND/G_OR/G_XOR.

The new tests are failing in my Release build:

Failing Tests (3):
    LLVM :: CodeGen/AMDGPU/GlobalISel/inst-select-and.mir
    LLVM :: CodeGen/AMDGPU/GlobalISel/inst-select-or.mir
    LLVM :: CodeGen/AMDGPU/GlobalISel/inst-select-xor.mir

The failures all look like:

/home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-and.mir:120:12: error: WAVE64: expected string not found in input
 ; WAVE64: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
           ^
<stdin>:375:2: note: scanning from here
 %0:sreg_32_xm0 = COPY $sgpr0
 ^
Tue, Jul 16, 1:23 AM

Mon, Jul 15

foad added inline comments to rL366048: [Loop Peeling] Enable peeling for loops with multiple exits.
Mon, Jul 15, 5:09 AM
foad updated the diff for D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.

Rewrite to remove duplicated code.

Mon, Jul 15, 4:12 AM · Restricted Project
foad added inline comments to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.
Mon, Jul 15, 3:42 AM · Restricted Project
foad updated the diff for D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.

Add a negative test for fmul without fast math flags.

Mon, Jul 15, 3:35 AM · Restricted Project
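
The negative test mentioned above matters because the fold is not exact under strict IEEE 754 rules. The following is a minimal sketch, not code from D64713: the function names `mulBySelect`, `selectFolded` and `formsAgree` are hypothetical, and it only illustrates in plain C++ (compiled without fast-math options) where `X * (C ? 1.0 : 0.0)` and `C ? X : 0.0` differ.

```cpp
#include <cmath>
#include <limits>

// Original form: X * (C ? 1.0 : 0.0).
double mulBySelect(double x, bool c) { return x * (c ? 1.0 : 0.0); }

// Folded form: C ? X : 0.0.
double selectFolded(double x, bool c) { return c ? x : 0.0; }

// Hypothetical helper: true when both forms produce the same result,
// treating NaN as equal to NaN and distinguishing +0.0 from -0.0.
bool formsAgree(double x, bool c) {
  double a = mulBySelect(x, c);
  double b = selectFolded(x, c);
  if (std::isnan(a) || std::isnan(b))
    return std::isnan(a) && std::isnan(b);
  return a == b && std::signbit(a) == std::signbit(b);
}
```

When `C` is false, the two forms disagree for NaN inputs (NaN * 0.0 is NaN), infinite inputs (inf * 0.0 is NaN) and negative inputs (the product is -0.0 rather than +0.0). These are the kinds of cases that fast-math flags let the optimizer disregard, so an fmul without those flags should be left alone.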
foad added a comment to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.

b) yes, one more test never hurts

Mon, Jul 15, 3:23 AM · Restricted Project
foad added a comment to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.

I think we need a few more tests here

a) vector tests
b) 'extra use' tests
c) negative tests

Mon, Jul 15, 3:01 AM · Restricted Project
foad updated the diff for D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.

Add a vector test.

Mon, Jul 15, 2:59 AM · Restricted Project
foad added a comment to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.

I'm not sure about other reviewers, but I find the use of 'C' confusing here, since 'C' usually denotes a constant. I would recommend avoiding 'C' if it is not a constant.

Edit: I see the original code also used 'C', so feel free to ignore me :)

Mon, Jul 15, 2:54 AM · Restricted Project
foad added inline comments to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.
Mon, Jul 15, 2:30 AM · Restricted Project
foad created D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.
Mon, Jul 15, 2:24 AM · Restricted Project

Fri, Jul 12

foad committed rG27ec195f391c: [AMDGPU] Fix DPP combiner check for exec modification (authored by foad).
[AMDGPU] Fix DPP combiner check for exec modification
Fri, Jul 12, 9:00 AM
foad committed rL365910: [AMDGPU] Fix DPP combiner check for exec modification.
[AMDGPU] Fix DPP combiner check for exec modification
Fri, Jul 12, 8:59 AM
foad closed D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Fri, Jul 12, 8:59 AM · Restricted Project
foad updated the diff for D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Count instructions instead of uses.

Fri, Jul 12, 8:51 AM · Restricted Project
foad updated the diff for D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Remove unnecessary check for debug operands.

Fri, Jul 12, 8:35 AM · Restricted Project
foad added inline comments to D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Fri, Jul 12, 8:32 AM · Restricted Project
foad added inline comments to D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Fri, Jul 12, 8:13 AM · Restricted Project
foad committed rG7816ad918ff2: [AMDGPU] Restrict v_cndmask_b32 abs/neg modifiers to f32 (authored by foad).
[AMDGPU] Restrict v_cndmask_b32 abs/neg modifiers to f32
Fri, Jul 12, 8:05 AM
foad committed rL365904: [AMDGPU] Restrict v_cndmask_b32 abs/neg modifiers to f32.
[AMDGPU] Restrict v_cndmask_b32 abs/neg modifiers to f32
Fri, Jul 12, 8:02 AM
foad closed D64636: [AMDGPU] Restrict v_cndmask_b32 abs/neg modifiers to f32.
Fri, Jul 12, 8:02 AM · Restricted Project
foad added a comment to D64497: [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32.

This change introduced CTS regressions with RADV, such as dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_3.atan2_frag and friends.
Can you investigate?

Fri, Jul 12, 7:53 AM · Restricted Project
foad created D64636: [AMDGPU] Restrict v_cndmask_b32 abs/neg modifiers to f32.
Fri, Jul 12, 7:53 AM · Restricted Project
foad added a comment to D64497: [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32.

This change introduced CTS regressions with RADV, such as dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_3.atan2_frag and friends.
Can you investigate?

Fri, Jul 12, 5:42 AM · Restricted Project

Thu, Jul 11

foad added inline comments to D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Thu, Jul 11, 6:01 AM · Restricted Project
foad added inline comments to D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Thu, Jul 11, 4:59 AM · Restricted Project
foad updated the diff for D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Add a DPP combiner test that would have caught the problem.

Thu, Jul 11, 2:39 AM · Restricted Project
foad committed rGc1b7db9edaaf: Remove some redundant code from r290372 and improve a comment. (authored by foad).
Remove some redundant code from r290372 and improve a comment.
Thu, Jul 11, 1:51 AM
foad committed rL365741: Remove some redundant code from r290372 and improve a comment.
Remove some redundant code from r290372 and improve a comment.
Thu, Jul 11, 1:49 AM
foad updated the diff for D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Count non-debug use operands of non-debug instructions instead of using
a SmallPtrSet.

Thu, Jul 11, 1:41 AM · Restricted Project

Wed, Jul 10

foad added a comment to D64393: [AMDGPU] Fix DPP combiner check for exec modification.

I'll add tests tomorrow.

Wed, Jul 10, 1:56 PM · Restricted Project
foad updated the diff for D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Address some review comments.

Wed, Jul 10, 1:54 PM · Restricted Project
foad added inline comments to D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Wed, Jul 10, 1:52 PM · Restricted Project
foad added a comment to D64497: [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32.

We still need more work to handle the integer cases in the complex pattern.

Wed, Jul 10, 8:02 AM · Restricted Project
foad committed rGbba37e89a57a: [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32 (authored by foad).
[AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32
Wed, Jul 10, 7:55 AM
foad committed rL365640: [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32.
[AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32
Wed, Jul 10, 7:54 AM
foad closed D64497: [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32.
Wed, Jul 10, 7:54 AM · Restricted Project
foad created D64497: [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32.
Wed, Jul 10, 7:41 AM · Restricted Project
foad added inline comments to rL290372: AMDGPU: Invert cmp + select with constant.
Wed, Jul 10, 6:35 AM
foad added a comment to D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Sorry, there is code for returning unique instructions, so we can use the count:

defusechain_instr_iterator &operator++() {          // Preincrement
  assert(Op && "Cannot increment end iterator!");
  if (ByOperand)
    advance();
  else if (ByInstr) {
    MachineInstr *P = Op->getParent();
    do {
      advance();
    } while (Op && Op->getParent() == P); // <- removes duplicates
Wed, Jul 10, 6:08 AM · Restricted Project
foad added inline comments to rL290372: AMDGPU: Invert cmp + select with constant.
Wed, Jul 10, 6:06 AM
foad added a comment to D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Jay, I'm OK with your patch; just a note that the ptr set can be avoided if we count the number of use instructions and use the count to limit the scan.

Wed, Jul 10, 5:38 AM · Restricted Project
foad updated the diff for D64393: [AMDGPU] Fix DPP combiner check for exec modification.

Try a different approach.

Wed, Jul 10, 2:30 AM · Restricted Project
foad added inline comments to D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Wed, Jul 10, 2:30 AM · Restricted Project
foad added inline comments to D64328: [AMDGPU] Optimize atomic max/min.
Wed, Jul 10, 1:24 AM · Restricted Project
foad updated the diff for D64328: [AMDGPU] Optimize atomic max/min.

New helper function getIdentityValueForAtomicOp.

Wed, Jul 10, 1:13 AM · Restricted Project

Tue, Jul 9

foad updated the diff for D64328: [AMDGPU] Optimize atomic max/min.

Fix identity for signed min.

Tue, Jul 9, 6:31 AM · Restricted Project
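
The identity values behind a helper such as getIdentityValueForAtomicOp are easy to mix up for the signed cases. The sketch below is an illustration under assumptions, not the code from D64328: the `AtomicOp` enum and the functions `identityFor` and `applyOp` are hypothetical names, and only 32-bit operands are modelled. The identity of an operation is the value `e` such that `op(x, e) == x` for every `x`.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical set of read-modify-write operations (not LLVM's enum).
enum class AtomicOp { Add, Sub, And, Or, Xor, SMax, SMin, UMax, UMin };

// Returns the 32-bit pattern e such that applyOp(op, x, e) == x for all x.
uint32_t identityFor(AtomicOp op) {
  switch (op) {
  case AtomicOp::Add:
  case AtomicOp::Sub:
  case AtomicOp::Or:
  case AtomicOp::Xor:
  case AtomicOp::UMax:
    return 0u;
  case AtomicOp::And:
  case AtomicOp::UMin:
    return std::numeric_limits<uint32_t>::max();
  case AtomicOp::SMax: // smallest signed value, bit pattern 0x80000000
    return static_cast<uint32_t>(std::numeric_limits<int32_t>::min());
  case AtomicOp::SMin: // largest signed value, bit pattern 0x7fffffff
    return static_cast<uint32_t>(std::numeric_limits<int32_t>::max());
  }
  return 0u;
}

// Reference semantics of each operation on 32-bit values.
uint32_t applyOp(AtomicOp op, uint32_t a, uint32_t b) {
  const int32_t sa = static_cast<int32_t>(a);
  const int32_t sb = static_cast<int32_t>(b);
  switch (op) {
  case AtomicOp::Add:  return a + b;
  case AtomicOp::Sub:  return a - b;
  case AtomicOp::And:  return a & b;
  case AtomicOp::Or:   return a | b;
  case AtomicOp::Xor:  return a ^ b;
  case AtomicOp::SMax: return static_cast<uint32_t>(std::max(sa, sb));
  case AtomicOp::SMin: return static_cast<uint32_t>(std::min(sa, sb));
  case AtomicOp::UMax: return std::max(a, b);
  case AtomicOp::UMin: return std::min(a, b);
  }
  return a;
}
```

Checking `applyOp(op, x, identityFor(op)) == x` over a handful of boundary values (0, 1, 0x7fffffff, 0x80000000, 0xffffffff) is a cheap way to catch a swapped signed min/max identity.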
foad added a comment to D64411: [AMDGPU] Simplify the exclusive scan used for optimized atomics.

See also comments here about whether to include the shift by 3: https://reviews.llvm.org/D57737#1392824

Tue, Jul 9, 6:18 AM · Restricted Project
foad created D64411: [AMDGPU] Simplify the exclusive scan used for optimized atomics.
Tue, Jul 9, 6:13 AM · Restricted Project
foad added a comment to rL363675: AMDGPU: Change API for checking for exec modification.

I rather strongly dislike this "check all uses" API. I would expect this fold to only be done for hasOneNonDBGUse anyway, but at the time didn't want to spend time fixing the DPP pas

Tue, Jul 9, 1:53 AM
foad created D64393: [AMDGPU] Fix DPP combiner check for exec modification.
Tue, Jul 9, 1:50 AM · Restricted Project

Mon, Jul 8

foad added a comment to rL363675: AMDGPU: Change API for checking for exec modification.

I guess the point of "using a set to track seen uses" was for exactly this case: when no use is specified, but all uses are in the same BB, we want to scan as far as the last use, but no further. But perhaps this could be implemented with a count of uses instead of a set.

Mon, Jul 8, 9:11 AM
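
The idea of using a count of uses instead of a set can be sketched as follows. This is a simplified model under stated assumptions, not LLVM code: `Instr`, `countUseInstrs` and `execModifiedBeforeLastUse` are hypothetical names, a basic block is a plain vector, and all uses of the register are assumed to sit in the same block after the definition.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine instruction.
struct Instr {
  std::vector<int> uses; // registers read by this instruction
  bool modifiesExec;     // whether it writes the exec mask
};

static bool usesReg(const Instr &I, int reg) {
  return std::find(I.uses.begin(), I.uses.end(), reg) != I.uses.end();
}

// Counts instructions (not operands) after defIdx that read reg.
unsigned countUseInstrs(const std::vector<Instr> &block, std::size_t defIdx,
                        int reg) {
  unsigned n = 0;
  for (std::size_t i = defIdx + 1; i < block.size(); ++i)
    if (usesReg(block[i], reg))
      ++n;
  return n;
}

// Scans forward from the definition and stops once every using
// instruction has been seen, so nothing past the last use is inspected.
bool execModifiedBeforeLastUse(const std::vector<Instr> &block,
                               std::size_t defIdx, int reg,
                               unsigned numUseInstrs) {
  unsigned remaining = numUseInstrs;
  for (std::size_t i = defIdx + 1; i < block.size() && remaining > 0; ++i) {
    if (block[i].modifiesExec)
      return true;
    if (usesReg(block[i], reg))
      --remaining; // one decrement per instruction, even with repeated operands
  }
  return false;
}
```

Counting per instruction rather than per operand means an instruction that reads the register twice still ends the scan at the right place, and no set of already-seen uses is needed.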
foad updated the diff for D64328: [AMDGPU] Optimize atomic max/min.

Add i64 tests.

Mon, Jul 8, 8:12 AM · Restricted Project
foad added inline comments to D64328: [AMDGPU] Optimize atomic max/min.
Mon, Jul 8, 7:36 AM · Restricted Project
foad added a comment to rL363675: AMDGPU: Change API for checking for exec modification.

We don't want to rely on kill flags

Mon, Jul 8, 7:30 AM
foad created D64328: [AMDGPU] Optimize atomic max/min.
Mon, Jul 8, 6:00 AM · Restricted Project
foad added a comment to rL363675: AMDGPU: Change API for checking for exec modification.

This commit stopped the DPP combiner from working on code sequences generated by the atomic optimizer. For example:

declare i32 @llvm.amdgcn.workitem.id.x()
@local_var32 = addrspace(3) global i32 undef, align 4
define amdgpu_kernel void @add_i32_varying() {
entry:
  %lane = call i32 @llvm.amdgcn.workitem.id.x()
  %old = atomicrmw add i32 addrspace(3)* @local_var32, i32 %lane monotonic
  ret void
}

With llc -print-before-all -debug-only=gcn-dpp-combine -march=amdgcn -mcpu=gfx900 -amdgpu-atomic-optimizations < test.ll I get:

Mon, Jul 8, 4:07 AM
foad committed rG38902350ef4a: [AMDGPU] Use a named predicate instead of a magic number. (authored by foad).
[AMDGPU] Use a named predicate instead of a magic number.
Mon, Jul 8, 12:06 AM
foad committed rL365294: [AMDGPU] Use a named predicate instead of a magic number.
[AMDGPU] Use a named predicate instead of a magic number.
Mon, Jul 8, 12:04 AM
foad closed D64201: [AMDGPU] Use a named predicate instead of a magic number.
Mon, Jul 8, 12:04 AM · Restricted Project

Fri, Jul 5

foad committed rG7e0c10b55ff7: [AMDGPU] DPP combiner: recognize identities for more opcodes (authored by foad).
[AMDGPU] DPP combiner: recognize identities for more opcodes
Fri, Jul 5, 7:55 AM
foad committed rL365211: [AMDGPU] DPP combiner: recognize identities for more opcodes.
[AMDGPU] DPP combiner: recognize identities for more opcodes
Fri, Jul 5, 7:55 AM
foad closed D64207: [AMDGPU] DPP combiner: recognize identities for more opcodes.
Fri, Jul 5, 7:55 AM · Restricted Project
foad updated the diff for D64207: [AMDGPU] DPP combiner: recognize identities for more opcodes.

Add a test case, and change the tests to run on gfx9 because it needs
add-no-carry instructions.

Fri, Jul 5, 1:37 AM · Restricted Project
foad added a comment to D64207: [AMDGPU] DPP combiner: recognize identities for more opcodes.

Right, this is hard to follow even for me :). The 3rd operand is src1_modifiers; you can use a junk value for this to check that the DPP combiner doesn't crash and doesn't combine it.

Fri, Jul 5, 1:21 AM · Restricted Project

Thu, Jul 4

foad added a comment to D64207: [AMDGPU] DPP combiner: recognize identities for more opcodes.

I think modifiers are checked correctly by the existing code, but can you add a test for e64 encodings into dpp_combine.mir similar to what is under "check for floating point modifiers" comment?

Thu, Jul 4, 8:59 AM · Restricted Project
foad updated the diff for D64207: [AMDGPU] DPP combiner: recognize identities for more opcodes.

Typo fixes have been committed separately.

Thu, Jul 4, 8:10 AM · Restricted Project
foad committed rG0cd50b2a95d3: Fix typos in comments and debug output. (authored by foad).
Fix typos in comments and debug output.
Thu, Jul 4, 8:06 AM
foad committed rL365146: Fix typos in comments and debug output.
Fix typos in comments and debug output.
Thu, Jul 4, 8:06 AM
foad created D64207: [AMDGPU] DPP combiner: recognize identities for more opcodes.
Thu, Jul 4, 7:05 AM · Restricted Project
foad created D64201: [AMDGPU] Use a named predicate instead of a magic number.
Thu, Jul 4, 3:59 AM · Restricted Project

Thu, Jun 27

foad committed rG8479240b0a62: [AMDGPU] Fix +DumpCode to print an entry label for the first function (authored by foad).
[AMDGPU] Fix +DumpCode to print an entry label for the first function
Thu, Jun 27, 1:20 AM
foad committed rL364508: [AMDGPU] Fix +DumpCode to print an entry label for the first function.
[AMDGPU] Fix +DumpCode to print an entry label for the first function
Thu, Jun 27, 1:19 AM
foad closed D63712: [AMDGPU] Fix +DumpCode to print an entry label for the first function.
Thu, Jun 27, 1:19 AM · Restricted Project

Mon, Jun 24

foad added reviewers for D63712: [AMDGPU] Fix +DumpCode to print an entry label for the first function: arsenm, tpr, kzhuravl.
Mon, Jun 24, 6:08 AM · Restricted Project
foad created D63712: [AMDGPU] Fix +DumpCode to print an entry label for the first function.
Mon, Jun 24, 6:02 AM · Restricted Project

Fri, Jun 21

foad committed rGd9d3c91b48c6: [Scalarizer] Propagate IR flags (authored by foad).
[Scalarizer] Propagate IR flags
Fri, Jun 21, 7:11 AM
foad committed rL364051: [Scalarizer] Propagate IR flags.
[Scalarizer] Propagate IR flags
Fri, Jun 21, 7:10 AM
foad closed D63593: [Scalarizer] Propagate IR flags.
Fri, Jun 21, 7:10 AM · Restricted Project

Thu, Jun 20

foad added inline comments to D63593: [Scalarizer] Propagate IR flags.
Thu, Jun 20, 1:25 PM · Restricted Project
foad updated the diff for D63593: [Scalarizer] Propagate IR flags.

Add fcmp and intrinsic tests.

Thu, Jun 20, 1:24 PM · Restricted Project
foad added inline comments to D63593: [Scalarizer] Propagate IR flags.
Thu, Jun 20, 8:15 AM · Restricted Project
foad updated the diff for D63593: [Scalarizer] Propagate IR flags.

Add fneg test.

Thu, Jun 20, 8:15 AM · Restricted Project
foad requested review of D63593: [Scalarizer] Propagate IR flags.

Hoping for review from someone outside AMD.

Thu, Jun 20, 5:29 AM · Restricted Project
foad added reviewers for D63593: [Scalarizer] Propagate IR flags: arsenm, rsandifo, dstuttard, tpr, sheredom, patrik.h.hagglund, uabelho.
Thu, Jun 20, 3:37 AM · Restricted Project
foad created D63593: [Scalarizer] Propagate IR flags.
Thu, Jun 20, 3:32 AM · Restricted Project

Wed, Jun 19

foad committed rG45d19fb47061: [ConstantFolding] Fix assertion failure on non-power-of-two vector load. (authored by foad).
[ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Wed, Jun 19, 3:26 AM
foad committed rL363784: [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
[ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Wed, Jun 19, 3:25 AM
foad closed D63375: [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Wed, Jun 19, 3:25 AM · Restricted Project

Jun 15 2019

foad added a comment to D63375: [ConstantFolding] Fix assertion failure on non-power-of-two vector load.

The test case does an (out of bounds) load from a global constant with type <3 x float>.

Do you need an out-of-bounds access to trigger the problem?

Jun 15 2019, 7:49 AM · Restricted Project
foad created D63375: [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Jun 15 2019, 6:49 AM · Restricted Project

Dec 14 2015

foad added inline comments to rL255057: [PPC64, TSAN] LLVM basic enablement of thread sanitizer for PPC64 (BE and LE).
Dec 14 2015, 1:25 AM

Dec 2 2015

foad added a comment to D15108: [asan] Fix dynamic allocas unpoisoning on PowerPC{64}.

Jay, could you please check it on your PPC box?

Dec 2 2015, 12:31 PM