
foad (Jay Foad)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 29 2014, 9:58 AM (311 w, 5 d)

Recent Activity

Today

foad committed rG56f6bf1a8d6c: [AMDGPU] Remove MUL_LOHI_U24/MUL_LOHI_I24 (authored by foad).
[AMDGPU] Remove MUL_LOHI_U24/MUL_LOHI_I24
Mon, Oct 19, 11:21 AM
foad closed D89706: [AMDGPU] Remove MUL_LOHI_U24/MUL_LOHI_I24.
Mon, Oct 19, 11:20 AM · Restricted Project
foad requested review of D89706: [AMDGPU] Remove MUL_LOHI_U24/MUL_LOHI_I24.
Mon, Oct 19, 9:00 AM · Restricted Project
foad added a comment to D89501: [AMDGPU] flat scratch ST addressing mode on gfx10.

LGTM.

Mon, Oct 19, 6:15 AM · Restricted Project
foad added a comment to D89619: [AMDGPU][NFC] Tidy SIOptimizeExecMaskingPreRA for extensibility.

Looks fine to me. More ideas inline.

Mon, Oct 19, 4:27 AM · Restricted Project
foad updated the diff for D88955: [AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy.

Rebase on D89038.

Mon, Oct 19, 4:15 AM · Restricted Project

Fri, Oct 16

foad committed rG1417abe54c28: [AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsic (authored by foad).
[AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsic
Fri, Oct 16, 9:16 AM
foad closed D89558: [AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsic.
Fri, Oct 16, 9:16 AM · Restricted Project
foad added inline comments to D89487: [AMDGPU] gfx1032 target.
Fri, Oct 16, 9:15 AM · Restricted Project, Restricted Project
foad committed rG0c1381d79567: [llc] Use -filetype=null to disable MIR printing (authored by foad).
[llc] Use -filetype=null to disable MIR printing
Fri, Oct 16, 9:04 AM
foad closed D89476: [llc] Use -filetype=null to disable MIR printing.
Fri, Oct 16, 9:04 AM · Restricted Project
foad requested review of D89558: [AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsic.
Fri, Oct 16, 8:51 AM · Restricted Project
foad added a comment to D89525: [amdgpu] Enhance AMDGPU AA..

@yaxunl could you double-check that OpenCL also follows that rule.
@nhaehnle could you check whether that potentially breaks graphics.

Fri, Oct 16, 8:31 AM · Restricted Project
foad added inline comments to D89038: [PatternMatch] Add new FP matchers. NFC..
Fri, Oct 16, 5:59 AM · Restricted Project
foad updated the diff for D89038: [PatternMatch] Add new FP matchers. NFC..

Add a vector test case for matchFastFloatClamp.

Fri, Oct 16, 5:57 AM · Restricted Project
foad added inline comments to D89038: [PatternMatch] Add new FP matchers. NFC..
Fri, Oct 16, 5:05 AM · Restricted Project
foad updated the diff for D89038: [PatternMatch] Add new FP matchers. NFC..

Test simplifySelectWithFCmp on non-uniform vectors.

Fri, Oct 16, 5:02 AM · Restricted Project
foad updated the diff for D89476: [llc] Use -filetype=null to disable MIR printing.

Address feedback.

Fri, Oct 16, 1:34 AM · Restricted Project

Thu, Oct 15

foad added inline comments to D88060: [GISel]: Few InsertVecElt combines.
Thu, Oct 15, 2:08 PM · Restricted Project
foad added inline comments to D89476: [llc] Use -filetype=null to disable MIR printing.
Thu, Oct 15, 9:42 AM · Restricted Project
foad added inline comments to D89480: [GlobalISel][Legalizer] Implement lower action for G_FSHL/G_FSHR.
Thu, Oct 15, 9:22 AM · Restricted Project
foad added reviewers for D89476: [llc] Use -filetype=null to disable MIR printing: MaskRay, efriedma, ychen.
Thu, Oct 15, 9:00 AM · Restricted Project
foad requested review of D89476: [llc] Use -filetype=null to disable MIR printing.
Thu, Oct 15, 8:55 AM · Restricted Project
foad added inline comments to D86878: [AMDGPU] Fix a miscompile with S_ADD/S_SUB.
Thu, Oct 15, 8:14 AM · Restricted Project
foad added inline comments to D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.
Thu, Oct 15, 5:11 AM · Restricted Project
foad added a comment to D89392: [GlobalISel] Fold unary opcodes in CSEMIRBuilder.

But it would be nice to somehow get the codegen improvements shown in some of the MIPS and X86 tests.

Thu, Oct 15, 1:53 AM · Restricted Project
foad added inline comments to D88060: [GISel]: Few InsertVecElt combines.
Thu, Oct 15, 1:37 AM · Restricted Project
foad abandoned D89392: [GlobalISel] Fold unary opcodes in CSEMIRBuilder.

Thanks for testing @gargaroff.

Thu, Oct 15, 1:19 AM · Restricted Project
foad accepted D89187: [AMDGPU] Minimize number of s_mov generated by copyPhysReg.

Remove peephole

Thu, Oct 15, 1:17 AM · Restricted Project
foad added a comment to D89392: [GlobalISel] Fold unary opcodes in CSEMIRBuilder.

I attempted this in the past but abandoned it due to infinite loops in legalization, as mentioned earlier. Constant folding during legalization seems okay as long as it operates on legal types (i.e. folding an operation on constants, which are already legal, to the same type).

Thu, Oct 15, 1:07 AM · Restricted Project

Wed, Oct 14

foad added inline comments to D88060: [GISel]: Few InsertVecElt combines.
Wed, Oct 14, 10:49 AM · Restricted Project
foad added a comment to D89392: [GlobalISel] Fold unary opcodes in CSEMIRBuilder.

Is there anything I need to do to use the CSEMIRBuilder?

Wed, Oct 14, 7:51 AM · Restricted Project
foad added a comment to D89392: [GlobalISel] Fold unary opcodes in CSEMIRBuilder.

Automatically folding legalization artifacts makes me a bit nervous. What happens if you want to legalize a constant by widening, and the legalizer wants to insert the trunc from a widened constant?

Wed, Oct 14, 7:41 AM · Restricted Project
foad added inline comments to D88572: AMDGPU/SelectionDAG Check for NaN, DX10Clamp and IEEE in fmed3 combine.
Wed, Oct 14, 7:36 AM · Restricted Project
foad requested review of D89392: [GlobalISel] Fold unary opcodes in CSEMIRBuilder.
Wed, Oct 14, 6:57 AM · Restricted Project
foad added inline comments to D89386: [AMDGPU] Fix access beyond the end of the basic block in execMayBeModifiedBeforeAnyUse..
Wed, Oct 14, 6:00 AM · Restricted Project
foad added a comment to D89388: [AMDGPU] Fix ieee mode default value.

Makes sense to me. As I understand it the biggest practical effect is that the compiler will start making use of output modifiers in compute shaders.

Wed, Oct 14, 5:53 AM · Restricted Project
foad added a comment to D88060: [GISel]: Few InsertVecElt combines.

Reverse ping! If you just don't have time to work on this, I'd be interested in commandeering the "insert_vec_elt(build_vector) -> build_vector" part.

Wed, Oct 14, 3:09 AM · Restricted Project
foad accepted D89217: [AMDGPU] Base getSubRegFromChannel on TableGen data.
Wed, Oct 14, 2:03 AM · Restricted Project
foad added inline comments to D89217: [AMDGPU] Base getSubRegFromChannel on TableGen data.
Wed, Oct 14, 12:34 AM · Restricted Project
foad accepted D89217: [AMDGPU] Base getSubRegFromChannel on TableGen data.

Looks good. Some nits inline.

Wed, Oct 14, 12:14 AM · Restricted Project

Tue, Oct 13

foad committed rGedc37baca6d6: [AMDGPU] Add MC layer support for v_fmac_legacy_f32 (authored by foad).
[AMDGPU] Add MC layer support for v_fmac_legacy_f32
Tue, Oct 13, 2:07 PM
foad closed D89247: [AMDGPU] Add MC layer support for v_fmac_legacy_f32.
Tue, Oct 13, 2:07 PM · Restricted Project
foad added inline comments to D89217: [AMDGPU] Base getSubRegFromChannel on TableGen data.
Tue, Oct 13, 9:09 AM · Restricted Project
foad committed rGb59d8d7c7254: [AMDGPU][GlobalISel] Compute known bits for zero-extending loads (authored by foad).
[AMDGPU][GlobalISel] Compute known bits for zero-extending loads
Tue, Oct 13, 8:22 AM
foad closed D89316: [AMDGPU][GlobalISel] Compute known bits for zero-extending loads.
Tue, Oct 13, 8:22 AM · Restricted Project
foad requested review of D89316: [AMDGPU][GlobalISel] Compute known bits for zero-extending loads.
Tue, Oct 13, 6:51 AM · Restricted Project
foad retitled D89247: [AMDGPU] Add MC layer support for v_fmac_legacy_f32 from [AMDGPU] Add v_fmac_legacy_f32 to [AMDGPU] Add MC layer support for v_fmac_legacy_f32.
Tue, Oct 13, 5:41 AM · Restricted Project
foad added inline comments to D88572: AMDGPU/SelectionDAG Check for NaN, DX10Clamp and IEEE in fmed3 combine.
Tue, Oct 13, 5:15 AM · Restricted Project
foad accepted D87140: [GlobalISel] Avoid making G_PTR_ADD with nullptr.

LGTM.

Tue, Oct 13, 2:58 AM · Restricted Project
foad updated the diff for D89247: [AMDGPU] Add MC layer support for v_fmac_legacy_f32.

Lowercase subtarget feature names.

Tue, Oct 13, 2:56 AM · Restricted Project
foad added inline comments to D87140: [GlobalISel] Avoid making G_PTR_ADD with nullptr.
Tue, Oct 13, 2:27 AM · Restricted Project
foad committed rGcdf0214845a1: [AMDGPU] v_mac_legacy_f32 does not support DPP (authored by foad).
[AMDGPU] v_mac_legacy_f32 does not support DPP
Tue, Oct 13, 2:03 AM
foad closed D89245: [AMDGPU] v_mac_legacy_f32 does not support DPP.
Tue, Oct 13, 2:03 AM · Restricted Project
foad added a comment to D89247: [AMDGPU] Add MC layer support for v_fmac_legacy_f32.

It also seems to lack changes in SIFoldOperands, SIInstrInfo::FoldImmediate, SIInstrInfo::convertToThreeAddress, SIInstrInfo::canShrink. I think these can go in a separate change.

Tue, Oct 13, 1:05 AM · Restricted Project
foad committed rGacd0dd3a62d1: [AMDGPU] Use lowercase for subtarget feature names in RUN lines (authored by foad).
[AMDGPU] Use lowercase for subtarget feature names in RUN lines
Tue, Oct 13, 1:02 AM

Mon, Oct 12

foad requested review of D89247: [AMDGPU] Add MC layer support for v_fmac_legacy_f32.
Mon, Oct 12, 8:56 AM · Restricted Project
foad requested review of D89245: [AMDGPU] v_mac_legacy_f32 does not support DPP.
Mon, Oct 12, 8:18 AM · Restricted Project
foad committed rGb8901230c07e: [AMDGPU] Use @LINE for error checking in gfx10 assembler tests (authored by foad).
[AMDGPU] Use @LINE for error checking in gfx10 assembler tests
Mon, Oct 12, 8:12 AM
foad added inline comments to D89187: [AMDGPU] Minimize number of s_mov generated by copyPhysReg.
Mon, Oct 12, 3:36 AM · Restricted Project
foad added inline comments to D89187: [AMDGPU] Minimize number of s_mov generated by copyPhysReg.
Mon, Oct 12, 2:33 AM · Restricted Project
foad added inline comments to D86878: [AMDGPU] Fix a miscompile with S_ADD/S_SUB.
Mon, Oct 12, 2:14 AM · Restricted Project
foad accepted D89139: [DAG][ARM][MIPS][RISCV] Improve funnel shift promotion to use 'double shift' patterns.

Looks good to me if the target maintainers are happy.

Mon, Oct 12, 1:56 AM · Restricted Project
foad added a comment to D86203: [GlobalISel][TableGen] Add handling of unannotated dst pattern ops.

Querying the register bank for VS_32 doesn't make any sense, and I suspect it would assert.

Mon, Oct 12, 1:49 AM · Restricted Project
foad added a comment to D64393: [AMDGPU] Fix DPP combiner check for exec modification.

I've found a case where execMayBeModifiedBeforeAnyUse randomly leads to a core dump, which is hard to debug. Most likely it's because an instruction beyond the end of a basic block is accessed. This means that the first loop counts some instructions twice, and I was wrong to assume that use_nodbg_instructions doesn't repeat them. In fact, no code in MachineRegisterInfo::verifyUseList ensures that uses belonging to one instruction are consecutive in the use list, nor can traces of such ordering be found in MachineRegisterInfo::addRegOperandToUseList. I'm going to fix this code.

Mon, Oct 12, 1:24 AM · Restricted Project

Fri, Oct 9

foad committed rG1dfbc2ea1441: [AMDGPU] Only enable mad/mac legacy f32 patterns if denormals may be flushed (authored by foad).
[AMDGPU] Only enable mad/mac legacy f32 patterns if denormals may be flushed
Fri, Oct 9, 9:18 AM
foad closed D89123: [AMDGPU] Only enable mad/mac legacy f32 patterns if denormals may be flushed.
Fri, Oct 9, 9:18 AM · Restricted Project
foad requested review of D89123: [AMDGPU] Only enable mad/mac legacy f32 patterns if denormals may be flushed.
Fri, Oct 9, 5:38 AM · Restricted Project
foad added inline comments to D89077: [AMDGPU] Run hazard recognizer pass later.
Fri, Oct 9, 3:07 AM · Restricted Project
foad added inline comments to D89077: [AMDGPU] Run hazard recognizer pass later.
Fri, Oct 9, 2:17 AM · Restricted Project
foad added a comment to D89077: [AMDGPU] Run hazard recognizer pass later.

Is this now running after the waitcnt insertion pass? That would avoid the NOPs currently inserted to split memory clauses, which are unnecessary because the waitcnt instructions will split the clauses.

We also insert nops in the post-RA scheduler.

In an earlier conversation, the spurious NOPs were explained as the hazard recognizer inserting them to break memory clauses before the waitcnt pass ran. There would be no need to insert the NOPs if the waitcnt instructions were already there. So it seems that was not a valid explanation; perhaps the post-RA scheduler is one, but I am unclear why it put those in. @rampitec can you help explain?

The post-RA scheduler and the hazard recognizer are the same pass if you run the post-RA scheduler. If not, the hazard recognizer is a separate pass.

Fri, Oct 9, 2:01 AM · Restricted Project

Thu, Oct 8

foad committed rG7238faa4ae97: [AMDGPU] Add patterns for mad/mac legacy f32 instructions (authored by foad).
[AMDGPU] Add patterns for mad/mac legacy f32 instructions
Thu, Oct 8, 7:24 AM
foad closed D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.
Thu, Oct 8, 7:24 AM · Restricted Project
foad updated the diff for D89038: [PatternMatch] Add new FP matchers. NFC..

Use matchers to handle fixed width vectors.

Thu, Oct 8, 7:19 AM · Restricted Project
foad added a comment to D88955: [AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy.

It might make sense to rebase this on D89038.

Thu, Oct 8, 6:36 AM · Restricted Project
foad requested review of D89038: [PatternMatch] Add new FP matchers. NFC..
Thu, Oct 8, 6:09 AM · Restricted Project
foad added a reviewer for D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions: dp.
Thu, Oct 8, 4:33 AM · Restricted Project
foad updated the diff for D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.

Rebase after D89000.

Thu, Oct 8, 4:32 AM · Restricted Project
foad added inline comments to D87621: [AMDGPU] Add XDL resource to scheduling model.
Thu, Oct 8, 2:24 AM · Restricted Project
foad added a comment to D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.

I'll rebase this after D89000 lands, since it touches some of the same code in VOP2Instructions.td and gfx1030_err.s.

Thu, Oct 8, 2:13 AM · Restricted Project

Wed, Oct 7

foad added inline comments to D88955: [AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy.
Wed, Oct 7, 9:46 AM · Restricted Project
foad updated the diff for D88955: [AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy.

Add some matchers to PatternMatch and use them.

Wed, Oct 7, 9:38 AM · Restricted Project
foad updated the diff for D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.

Fix a couple more problems shown up by running the MC tests:

  • Update asm/dis special cases for V_MAC_LEGACY_F32 now that it uses the VOP_MAC_F32 profile.
  • Fix use of OtherPredicates in VOP2 pseudos.
Wed, Oct 7, 8:28 AM · Restricted Project
foad committed rGfc819b692561: [AMDGPU] Use @LINE for error checking in gfx10.3 assembler tests (authored by foad).
[AMDGPU] Use @LINE for error checking in gfx10.3 assembler tests
Wed, Oct 7, 7:49 AM
foad requested review of D88955: [AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy.
Wed, Oct 7, 4:13 AM · Restricted Project
foad updated the diff for D88569: [DAGCombiner] Call SimplifyDemandedBits to simplify EXTRACT_VECTOR_ELT.

Rebase.

Wed, Oct 7, 2:57 AM · Restricted Project
foad committed rG1aa8e6a51a0e: [SDag] SimplifyDemandedBits: simplify to FP constant if all bits known (authored by foad).
[SDag] SimplifyDemandedBits: simplify to FP constant if all bits known
Wed, Oct 7, 1:31 AM
foad closed D88570: [SDag] SimplifyDemandedBits: simplify to FP constant if all bits known.
Wed, Oct 7, 1:31 AM · Restricted Project

Tue, Oct 6

foad updated the diff for D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.

Update comments for NoMods patterns.

Tue, Oct 6, 10:49 AM · Restricted Project
foad added inline comments to D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.
Tue, Oct 6, 9:12 AM · Restricted Project
foad added inline comments to D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.
Tue, Oct 6, 7:30 AM · Restricted Project
foad added a comment to D88895: [AMDGPU] Add test with redundant copies to temporary stack slot produced by expandUnalignedLoad.

Is "spill" the right name? They come from TargetLowering::expandUnalignedLoad copying the data to a temporary stack slot, not from the register allocator.

Tue, Oct 6, 6:37 AM · Restricted Project
foad requested review of D88890: [AMDGPU] Add patterns for mad/mac legacy f32 instructions.
Tue, Oct 6, 5:36 AM · Restricted Project
foad added a comment to D88882: [AMDGPU] Prefer SplitVectorLoad/Store over expandUnalignedLoad/Store..

Could you pre-commit the test case?

Tue, Oct 6, 3:04 AM · Restricted Project
foad added a reviewer for D88882: [AMDGPU] Prefer SplitVectorLoad/Store over expandUnalignedLoad/Store.: rampitec.
Tue, Oct 6, 3:03 AM · Restricted Project

Mon, Oct 5

foad accepted D88821: Fix reordering of instructions during VirtRegRewriter unbundling.

LGTM, though I can't help feeling the whole function could be simpler.

Mon, Oct 5, 5:34 AM · Restricted Project
foad added inline comments to D88821: Fix reordering of instructions during VirtRegRewriter unbundling.
Mon, Oct 5, 5:24 AM · Restricted Project
foad added a reviewer for D88570: [SDag] SimplifyDemandedBits: simplify to FP constant if all bits known: t.p.northover.

This looks like a clear win to me on all the affected test cases except for one slight regression noted inline.

Mon, Oct 5, 3:07 AM · Restricted Project
foad accepted D88573: [SelectionDAG] Add check for BUILD_VECTOR in isKnownNeverNaN.

LGTM apart from the nitpick about comments.

Mon, Oct 5, 2:25 AM · Restricted Project
foad added a comment to D88573: [SelectionDAG] Add check for BUILD_VECTOR in isKnownNeverNaN.

@ecnelises did you really mean to add yourself as a "blocking reviewer" rather than a normal reviewer?

Mon, Oct 5, 2:14 AM · Restricted Project