foad (Jay Foad)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 29 2014, 9:58 AM (377 w, 1 d)

Recent Activity

Today

foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

I guess this particular case could be handled by improving the v_perm matching in SITargetLowering::performOrCombine, e.g. add cases for or (op x, c1), c2 -> perm x, x, permute_mask(c1, c2) and or (perm x, x, c1), (op y, c2) -> perm x, y, permute_mask(c1, c2).

Thu, Jan 20, 9:53 AM · Restricted Project
foad added inline comments to D116270: [AMDGPU] Enable divergence-driven XNOR selection.
Thu, Jan 20, 9:25 AM · Restricted Project
foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

Overall this seems reasonable. The only alternative I can think of would need more complicated isel patterns that do the reassociation gated by predicates that check the divergence, something like:

let Predicate = doesNotHaveXNOR in
def : GCNPat<
  (i32 (xor (xor_oneuse i32:$src0, i32:$src1), i32:$src2)),
  (i32 (V_XOR_B32 $src0, (V_XOR_B32 $src1, $src2))),
  [{ return src0->isDivergent() && !src1->isDivergent() && !src2->isDivergent(); }]
>;

... and lots of commuted versions of the same thing, and the same for any other isel pattern that matches something that could be reassociated. So that doesn't sound very scalable.
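
To illustrate why the reassociation in the pattern above matters, here is a minimal C++ sketch. It is a toy cost model, not LLVM code. It assumes that an xor with any divergent input needs a vector instruction, while an xor of two uniform inputs can use a scalar one. The names Val, OpCount, xorOp, asWritten, reassociated and vectorOpsSaved are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// A value plus a flag saying whether it may differ between lanes.
struct Val {
  uint32_t bits;
  bool divergent;
};

// Tally of where each xor would execute in this toy model.
struct OpCount {
  int vector = 0; // xors with at least one divergent input
  int scalar = 0; // xors whose inputs are all uniform
};

// Xor two values; the result is divergent if either input is.
Val xorOp(Val a, Val b, OpCount &count) {
  bool divergent = a.divergent || b.divergent;
  if (divergent)
    ++count.vector;
  else
    ++count.scalar;
  return {a.bits ^ b.bits, divergent};
}

// (xor (xor src0, src1), src2), evaluated as written.
Val asWritten(Val src0, Val src1, Val src2, OpCount &count) {
  return xorOp(xorOp(src0, src1, count), src2, count);
}

// (xor src0, (xor src1, src2)), the reassociated form.
Val reassociated(Val src0, Val src1, Val src2, OpCount &count) {
  return xorOp(src0, xorOp(src1, src2, count), count);
}

// Number of vector xors saved by reassociating. Both forms must agree
// on the computed value because xor is associative.
int vectorOpsSaved(Val src0, Val src1, Val src2) {
  OpCount before, after;
  Val a = asWritten(src0, src1, src2, before);
  Val b = reassociated(src0, src1, src2, after);
  assert(a.bits == b.bits && "xor is associative");
  return before.vector - after.vector;
}
```

With one divergent and two uniform inputs the model saves one vector op. With any other mix it saves none, which is why the pattern above is gated on the divergence of each operand.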

Thu, Jan 20, 8:10 AM · Restricted Project
foad added inline comments to D116529: [GlobalISel] Fold or of shifts with constant amount to funnel shift..
Thu, Jan 20, 4:54 AM · Restricted Project
foad committed rG847bb26820b1: [AMDGPU] Regenerate some MIR checks (authored by foad).
[AMDGPU] Regenerate some MIR checks
Thu, Jan 20, 4:45 AM
foad accepted D117710: AMDGPU/GlobalISel: Mostly fix BFI patterns.

LGTM, thanks!

Thu, Jan 20, 2:51 AM · Restricted Project
foad accepted D117762: [AMDGPU] Set MemoryVT for truncstores in tblgen..

LGTM. It would be nice to improve the codegen as noted inline but I'm not sure how to implement that.

Thu, Jan 20, 2:46 AM · Restricted Project
foad accepted D117719: AMDGPU/GlobalISel: Try to use s_and_b64 in ptrmask selection.

LGTM, thanks!

Thu, Jan 20, 2:34 AM · Restricted Project
foad added inline comments to D117720: AMDGPU/GlobalISel: Do not create readfirstlane with non-s32 type.
Thu, Jan 20, 2:09 AM · Restricted Project
foad added a comment to D117026: GlobalISel: Fix CSEMIRBuilder mishandling constant folds of vectors.

or have the utils function return a vector of APInt

Thu, Jan 20, 2:03 AM · Restricted Project
foad requested review of D117758: [GlobalISel] Change ConstantFoldVectorBinop to return vector of APInt.
Thu, Jan 20, 2:02 AM · Restricted Project
foad added a comment to D117690: AMDGPU/GlobalISel: Directly diagnose return value use for FP atomics.

Previously we were falling back to the DAG on selection
failure, where it would emit this error and then fail again.

Thu, Jan 20, 12:42 AM · Restricted Project
foad added a comment to D117721: [AMDGPU] Make v8i16/v8f16 legal.

Make v8i16/v8f16 legal

Thu, Jan 20, 12:34 AM · Restricted Project

Yesterday

foad committed rG63eea41de63a: [AMDGPU] Simplify SILoadStoreOptimizer::getSubRegIdxs. NFC. (authored by foad).
[AMDGPU] Simplify SILoadStoreOptimizer::getSubRegIdxs. NFC.
Wed, Jan 19, 7:20 AM
foad added inline comments to D86578: [TargetLowering] Combine known bits for icmp in SimplifySetCC (PR41182).
Wed, Jan 19, 4:52 AM · Restricted Project
foad committed rG0bc14a0a989f: [AMDGPU] Tweak some compares in wqm.ll test (authored by foad).
[AMDGPU] Tweak some compares in wqm.ll test
Wed, Jan 19, 4:51 AM
foad added inline comments to D86578: [TargetLowering] Combine known bits for icmp in SimplifySetCC (PR41182).
Wed, Jan 19, 12:26 AM · Restricted Project
foad committed rG7af959673e67: [AMDGPU] Tweak some compares in wave32.ll test (authored by foad).
[AMDGPU] Tweak some compares in wave32.ll test
Wed, Jan 19, 12:26 AM
foad added inline comments to D117620: AMDGPU/GlobalISel: Fix assert on invalid cond code for llvm.amdgcn.icmp.
Wed, Jan 19, 12:15 AM · Restricted Project

Tue, Jan 18

foad added inline comments to D117562: [AMDGPU] Sink immediate VGPR defs if high RP.
Tue, Jan 18, 8:22 AM · Restricted Project
foad accepted D117544: [AMDGPU] Fix missing waitcnt issue.
Tue, Jan 18, 5:23 AM · Restricted Project
foad updated subscribers of D86578: [TargetLowering] Combine known bits for icmp in SimplifySetCC (PR41182).
Tue, Jan 18, 2:28 AM · Restricted Project
foad accepted D117544: [AMDGPU] Fix missing waitcnt issue.

In our usual compile-time tests it shows 0.056% degradation on average, worst case 0.9%.

Tue, Jan 18, 1:44 AM · Restricted Project
foad added a comment to D117544: [AMDGPU] Fix missing waitcnt issue.

Did you notice any compile-time degradation from your fix?

Tue, Jan 18, 1:16 AM · Restricted Project
foad added a comment to D117544: [AMDGPU] Fix missing waitcnt issue.

As a bit of background, the OldOutOfOrder test was introduced by @nhaehnle in a large refactoring in D54231. I think it's just a performance optimisation: if the old state had events that completed out of order, then any use of the corresponding registers would have to be preceded by a "waitcnt 0", so there's no need to reprocess that block because the waitcnts are already as strict as they can be. But this goes wrong in the case where the merge introduces a wait on a particular register that had no wait in the old state (so no waitcnt would have been generated for it the last time the block was processed).
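
The failure mode described above can be sketched with a toy model in C++. This is not the logic of the actual pass or of the fix in D117544. WaitState, mergeNeedsReprocess, the useShortcut flag and the register names are hypothetical, and the state is reduced to a set of registers with pending events plus one out-of-order flag.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy block-entry state: which registers have a pending event, and
// whether pending events may complete out of order.
struct WaitState {
  std::set<std::string> pending;
  bool outOfOrder = false;
};

// Merge `incoming` into `old` and report whether the block should be
// processed again. `useShortcut` models the optimisation from the
// comment: skip reprocessing if the old state was already out of order.
bool mergeNeedsReprocess(WaitState &old, const WaitState &incoming,
                         bool useShortcut) {
  const bool hadOutOfOrder = old.outOfOrder;
  bool newlyPending = false;
  for (const std::string &reg : incoming.pending)
    if (old.pending.insert(reg).second)
      newlyPending = true; // this register had no wait in the old state
  old.outOfOrder = old.outOfOrder || incoming.outOfOrder;
  if (useShortcut && hadOutOfOrder)
    return false; // assumes every needed wait was already emitted
  return newlyPending;
}

// The scenario from the comment: the old state is out of order and the
// merge adds a register that previously had no pending event.
void scenarioFromComment() {
  WaitState oldState;
  oldState.pending = {"v0"};
  oldState.outOfOrder = true;
  WaitState incoming;
  incoming.pending = {"v1"};

  WaitState withShortcut = oldState, withoutShortcut = oldState;
  bool a = mergeNeedsReprocess(withShortcut, incoming, true);
  bool b = mergeNeedsReprocess(withoutShortcut, incoming, false);
  assert(!a); // shortcut skips the block although v1 is newly pending
  assert(b);  // without the shortcut the block is revisited
}
```

In this model the shortcut reports "no reprocess" even though v1 gained a pending event, so no wait would ever be generated for it.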

Tue, Jan 18, 1:15 AM · Restricted Project
foad added inline comments to D117544: [AMDGPU] Fix missing waitcnt issue.
Tue, Jan 18, 12:29 AM · Restricted Project
foad added inline comments to D117544: [AMDGPU] Fix missing waitcnt issue.
Tue, Jan 18, 12:27 AM · Restricted Project

Mon, Jan 17

foad added inline comments to D117482: AMDGPU: Don't clobber source register for V_SET_INACTIVE_*.
Mon, Jan 17, 7:21 AM · Restricted Project
foad accepted D117487: AMDGPU: Remove llvm.amdgcn.alignbit and handle bitcode upgrade to fshr.

LGTM, thanks!

Mon, Jan 17, 7:06 AM · Restricted Project
foad added a comment to D117484: AMDGPU/GlobalISel: Handle legacy grid ID intrinsics.

I don't understand. Why do we need to handle llvm.r600.* intrinsics on amdgcn subtargets? Why does globalisel need to handle them, when it doesn't support r600?

Mon, Jan 17, 6:51 AM · Restricted Project
foad accepted D117481: AMDGPU/GlobalISel: Fix legalization failure for s65 shifts.

LGTM. I always assumed it was intentional that legalization rules had to be ordered in just the right way.

Mon, Jan 17, 6:47 AM · Restricted Project

Fri, Jan 14

foad added a comment to D116116: [AMDGPU] Remove lz and nomip combine from codegen.

It looks like for some tests you manually optimized the IR, but for other tests you updated the expected ISA. Any reason for the difference?

Fri, Jan 14, 3:29 AM · Restricted Project
foad added inline comments to D115675: AMDGPU: Fix assert on function argument as loop condition.
Fri, Jan 14, 3:12 AM · Restricted Project
foad added a comment to D117298: [CodeGen] Remove unneeded regex escaping in FileCheck patterns. NFC..

In case anyone needs to make equivalent changes downstream, this patch was created with: sed -i 's/{{\\\[}}/[/g;s/{{\\\]}}/]/g' $(grep -lr '{{\\[][]}}' test)
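
For readers who prefer not to decode the sed escaping, the same two substitutions can be sketched in C++. This is only an illustration of what the command rewrites, not part of the patch. replaceAll and simplifyBracketEscapes are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of `from` in `text` with `to`.
std::string replaceAll(std::string text, const std::string &from,
                       const std::string &to) {
  assert(!from.empty() && "pattern must be non-empty");
  std::size_t pos = 0;
  while ((pos = text.find(from, pos)) != std::string::npos) {
    text.replace(pos, from.size(), to);
    pos += to.size(); // continue after the inserted text
  }
  return text;
}

// Mirror the two sed substitutions: {{\[}} becomes [ and {{\]}} becomes ].
std::string simplifyBracketEscapes(const std::string &line) {
  std::string out = replaceAll(line, R"({{\[}})", "[");
  return replaceAll(out, R"({{\]}})", "]");
}
```

Other regex blocks such as {{[0-9]+}} are left alone, because only the exact escaped-bracket spellings are matched.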

Fri, Jan 14, 3:10 AM · Restricted Project
foad added reviewers for D117298: [CodeGen] Remove unneeded regex escaping in FileCheck patterns. NFC.: jholewinski, nigelp-xmos, tstellar, t.p.northover, arsenm.
Fri, Jan 14, 3:07 AM · Restricted Project
foad requested review of D117298: [CodeGen] Remove unneeded regex escaping in FileCheck patterns. NFC..
Fri, Jan 14, 3:05 AM · Restricted Project
foad committed rG350bc5683da5: [llvm-dwp] Simplify FileCheck patterns. NFC. (authored by foad).
[llvm-dwp] Simplify FileCheck patterns. NFC.
Fri, Jan 14, 2:55 AM
foad committed rG013116cd7077: Use {LITERAL} instead of regex escaping in some lit tests. NFC. (authored by foad).
Use {LITERAL} instead of regex escaping in some lit tests. NFC.
Fri, Jan 14, 2:55 AM
foad added a comment to D110579: [AMDGPU] Add two new intrinsics to control fp_trunc rounding mode.

It seems like people are mostly happy with the design now, so I am being a bit more picky with my review comments!

Fri, Jan 14, 1:51 AM · Restricted Project, Restricted Project
foad added a comment to D116469: [AMDGPU] Correct the known bits calculation for MUL_I24..

LGTM, thanks!

Fri, Jan 14, 1:00 AM · Restricted Project
foad accepted D117280: [AMDGPU] Pre-commit test for D116469. NFC.

Looks obviously fine, thanks!

Fri, Jan 14, 1:00 AM · Restricted Project

Thu, Jan 13

foad committed rG821dd3b0e5b7: [FileCheck] Allow literal '['s before "[[var...]]" (authored by foad).
[FileCheck] Allow literal '['s before "[[var...]]"
Thu, Jan 13, 1:52 AM
foad closed D117117: [FileCheck] Allow literal '['s before "[[var...]]".
Thu, Jan 13, 1:52 AM · Restricted Project

Wed, Jan 12

foad added a reviewer for D117117: [FileCheck] Allow literal '['s before "[[var...]]": eliben.
Wed, Jan 12, 8:00 AM · Restricted Project
foad updated the diff for D117117: [FileCheck] Allow literal '['s before "[[var...]]".

Add a FileCheck test.

Wed, Jan 12, 8:00 AM · Restricted Project
foad added inline comments to D117117: [FileCheck] Allow literal '['s before "[[var...]]".
Wed, Jan 12, 7:55 AM · Restricted Project
foad added reviewers for D117117: [FileCheck] Allow literal '['s before "[[var...]]": fhahn, thopre, jdenny, jpienaar.
Wed, Jan 12, 7:24 AM · Restricted Project
foad requested review of D117117: [FileCheck] Allow literal '['s before "[[var...]]".
Wed, Jan 12, 7:22 AM · Restricted Project
foad added a comment to D116469: [AMDGPU] Correct the known bits calculation for MUL_I24..

@foad were you able to come up with a test for this?

Wed, Jan 12, 3:13 AM · Restricted Project
foad added inline comments to D110579: [AMDGPU] Add two new intrinsics to control fp_trunc rounding mode.
Wed, Jan 12, 1:19 AM · Restricted Project, Restricted Project

Tue, Jan 11

foad added inline comments to D117042: GlobalISel: Fix insert point in localizer.
Tue, Jan 11, 11:24 AM · Restricted Project
foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

Removing this guard leads to an infinite loop that transforms the pattern back and forth between SITargetLowering::reassociateScalarOps and DAGCombiner::ReassociateOps().
The former transforms (xor (xor uniform, divergent), -1) to (xor (xor uniform, -1), divergent), but the latter transforms it back by applying this rule:

if (N0.hasOneUse()) {
  // Reassociate: (op (op x, c1), y) -> (op (op x, y), c1)
  //              iff (op x, c1) has one use
  if (SDValue OpNode = DAG.getNode(Opc, SDLoc(N0), VT, N00, N1))
    return DAG.getNode(Opc, DL, VT, OpNode, N01);
  return SDValue();
}
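
The back-and-forth between the two rewrites quoted above can be reproduced with a small C++ sketch. It is a toy model of a three-operand xor chain, not DAGCombiner code. XorChain, targetReassociate, genericReassociate and runCombines are hypothetical stand-ins for the two combines.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

enum class Kind { Uniform, Divergent, Constant };

struct Leaf {
  Kind kind;
  uint32_t value;
};

// Represents (xor (xor inner0, inner1), outer).
struct XorChain {
  Leaf inner0, inner1, outer;
};

uint32_t eval(const XorChain &e) {
  return (e.inner0.value ^ e.inner1.value) ^ e.outer.value;
}

// Stand-in for the target combine:
// (xor (xor uniform, divergent), c) -> (xor (xor uniform, c), divergent)
bool targetReassociate(XorChain &e) {
  if (e.inner0.kind == Kind::Uniform && e.inner1.kind == Kind::Divergent &&
      e.outer.kind == Kind::Constant) {
    std::swap(e.inner1, e.outer);
    return true;
  }
  return false;
}

// Stand-in for the generic rule:
// (op (op x, c1), y) -> (op (op x, y), c1)
bool genericReassociate(XorChain &e) {
  if (e.inner1.kind == Kind::Constant && e.outer.kind != Kind::Constant) {
    std::swap(e.inner1, e.outer);
    return true;
  }
  return false;
}

// Apply the combines until neither fires or the step budget runs out.
// Returns the number of rewrites performed.
int runCombines(XorChain &e, int maxSteps) {
  assert(maxSteps >= 0);
  int fired = 0;
  while (fired < maxSteps &&
         (targetReassociate(e) || genericReassociate(e)))
    ++fired;
  return fired;
}
```

Starting from (xor (xor uniform, divergent), -1), every step of the budget is consumed because each rewrite undoes the other, while eval returns the same value throughout.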
Tue, Jan 11, 9:37 AM · Restricted Project
foad added a comment to D117026: GlobalISel: Fix CSEMIRBuilder mishandling constant folds of vectors.

Alternatively we could pass DstOps to the constant fold functions.

Tue, Jan 11, 9:15 AM · Restricted Project
foad added inline comments to D117014: AMDGPU: Use removeAllRegUnitsForPhysReg().
Tue, Jan 11, 5:31 AM · Restricted Project
foad added inline comments to D116870: [SelectionDAG] Add FP_TO_UINT_SAT/FP_TO_SINT_SAT to computeKnownBits/computeNumSignBits..
Tue, Jan 11, 12:22 AM · Restricted Project

Mon, Jan 10

foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

Once again, in my case BOTH nodes (not,xor) are divergent!

 %s.load = load i32, i32 addrspace(4)* %s.kernarg.offset.cast, align 4, !invariant.load !0
DIVERGENT:       %v = call i32 @llvm.amdgcn.workitem.id.x(), !range !1
DIVERGENT:       %xor = xor i32 %v, %s.load
DIVERGENT:       %d = xor i32 %xor, -1
DIVERGENT:       store i32 %d, i32 addrspace(1)* %out.load, align 4
Mon, Jan 10, 8:04 AM · Restricted Project
foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

Now:

We select the divergent NOT to V_NOT_B32_e32 and the divergent XOR to V_XOR_B32_e64. The selection is correct, but we miss the opportunity to exploit the fact that even a divergent NOT may be selected to S_NOT_B32 without loss of correctness.
Mon, Jan 10, 7:45 AM · Restricted Project
foad added a reviewer for D116943: AMDGPU/GlobalISel: Explicitly track d16 for image legalization: sebastian-ne.
Mon, Jan 10, 7:28 AM · Restricted Project
foad added a reviewer for D116674: [Docs] Fix IR and TableGen grammar inconsistencies: pcc.

+ @pcc for the partition stuff. But it all LGTM.

Mon, Jan 10, 3:34 AM · Restricted Project
foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

This looks like a regression in xnor.ll:

	s_not_b32 s0, s0                        	v_not_b32_e32 v0, v0
	v_xor_b32_e32 v0, s0, v0                        v_xor_b32_e32 v0, s4, v0

but it is not really. All the nodes in the example are divergent, and the divergent (xor x, -1) is selected to V_NOT_B32 now that https://reviews.llvm.org/D115884 has been committed.
S_NOT_B32 appears on the left because of the custom optimization that converts S_XNOR_B32 back to NOT (XOR) for targets that have no V_XNOR. This optimization relies on the NOT operand being an SGPR and on V_XOR_B32_e32 accepting an SGPR as its first source operand.
I am not sure if it is always safe. The execution of VALU instructions is controlled by the EXEC mask, but that of SALU instructions is not.

Mon, Jan 10, 2:20 AM · Restricted Project
foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

SITargetLowering::reassociateScalarOps exists to fix the instruction selection that is done in a wrong way.

Mon, Jan 10, 1:13 AM · Restricted Project
foad added inline comments to D116870: [SelectionDAG] Add FP_TO_UINT_SAT/FP_TO_SINT_SAT to computeKnownBits/computeNumSignBits..
Mon, Jan 10, 1:03 AM · Restricted Project

Sun, Jan 9

foad added inline comments to D116870: [SelectionDAG] Add FP_TO_UINT_SAT/FP_TO_SINT_SAT to computeKnownBits/computeNumSignBits..
Sun, Jan 9, 1:03 AM · Restricted Project

Sat, Jan 8

foad committed rG50fb44eebb03: [GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine (authored by foad).
[GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine
Sat, Jan 8, 1:33 AM
foad committed rGff971873b3fc: [GlobalISel] Fix legality checks for G_UBFX combines (authored by foad).
[GlobalISel] Fix legality checks for G_UBFX combines
Sat, Jan 8, 1:33 AM
foad closed D116803: [GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine.
Sat, Jan 8, 1:32 AM · Restricted Project
foad closed D116802: [GlobalISel] Fix legality checks for G_UBFX combines.
Sat, Jan 8, 1:32 AM · Restricted Project

Fri, Jan 7

foad added inline comments to D116270: [AMDGPU] Enable divergence-driven XNOR selection.
Fri, Jan 7, 8:16 AM · Restricted Project
foad added a comment to D116807: [GlobalISel] Remove TargetLowering::isConstantUnsignedBitfieldExtractLegal.

Legality rules can't be context dependent on the specific operands used, but IIRC here it only wants to use it if the operands are constants

Fri, Jan 7, 8:05 AM · Restricted Project
foad added a comment to D116807: [GlobalISel] Remove TargetLowering::isConstantUnsignedBitfieldExtractLegal.

TBH I don't understand why D99283 introduced this target lowering hook in the first place. Comments on the review say "The annoying part (for AArch64) is that the legality checks don't work with custom legalization". Did isLegalOrCustom not exist back then? Or does it not do the right thing? (I don't see any tests failing with this patch.)

Fri, Jan 7, 6:31 AM · Restricted Project
foad requested review of D116807: [GlobalISel] Remove TargetLowering::isConstantUnsignedBitfieldExtractLegal.
Fri, Jan 7, 6:29 AM · Restricted Project
foad added a reviewer for D116038: [AMDGPU] Fix LOD bias in A16 combine: dstuttard.
Fri, Jan 7, 5:57 AM · Restricted Project
foad requested review of D116803: [GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine.
Fri, Jan 7, 5:03 AM · Restricted Project
foad added a comment to D116802: [GlobalISel] Fix legality checks for G_UBFX combines.

Further cleanups are possible:

  1. Change CombinerHelper::matchBitfieldExtractFromShrAnd to use getPreferredShiftAmountTy for the shift-amount-like operands of G_UBFX, like all the other G_*BFX combines do. Change AArch64's G_*BFX legality rules to match.
  2. Remove isConstantUnsignedBitfieldExtractLegal since it doesn't seem to do any more than a standard legality check.
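
As a reminder of what such a combine produces, here is a minimal C++ sketch of an unsigned bitfield extract next to the shift-and-mask form it replaces. It assumes 32-bit values and a mask that is a contiguous run of low ones. ubfx and shrAnd are hypothetical helper names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned bitfield extract: take `width` bits of `x` starting at bit `lsb`.
uint32_t ubfx(uint32_t x, unsigned lsb, unsigned width) {
  assert(width >= 1 && lsb < 32 && lsb + width <= 32 &&
         "field must fit in 32 bits");
  uint32_t mask = width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
  return (x >> lsb) & mask;
}

// The shift-then-mask form that a combine of this kind would start from.
uint32_t shrAnd(uint32_t x, unsigned lsb, uint32_t mask) {
  assert(lsb < 32 && "shift amount must be in range");
  return (x >> lsb) & mask;
}
```

Both the lsb and the width operands behave like shift amounts here, which is the sense in which the first item above calls them shift-amount-like.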
Fri, Jan 7, 4:11 AM · Restricted Project
foad requested review of D116802: [GlobalISel] Fix legality checks for G_UBFX combines.
Fri, Jan 7, 4:07 AM · Restricted Project
foad committed rGbd934dad5280: [AMDGPU] Regenerate MIR checks for G_[SU]BFX (authored by foad).
[AMDGPU] Regenerate MIR checks for G_[SU]BFX
Fri, Jan 7, 4:07 AM
foad committed rG7a66c980f58b: [AMDGPU] Regenerate G_[SU]BFX checks using some common prefixes (authored by foad).
[AMDGPU] Regenerate G_[SU]BFX checks using some common prefixes
Fri, Jan 7, 4:07 AM
foad committed rG3f3fe4a5cfa1: [GlobalISel] Fix typo Extact to Extract in function name. NFC. (authored by foad).
[GlobalISel] Fix typo Extact to Extract in function name. NFC.
Fri, Jan 7, 3:14 AM
foad added a comment to D115269: [SystemZ][z/OS] Add entry point marker to PPA.

The test added in this change fails on build bots that do not build the s390x target. Can you please take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/15573

******************** TEST 'LLVM :: MC/GOFF/ppa1.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/llc -mtriple s390x-ibm-zos < /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/MC/GOFF/ppa1.ll | /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/MC/GOFF/ppa1.ll
--
Exit Code: 2
Command Output (stderr):
--
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/llc: error: : error: unable to get target for 's390x-ibm-zos', see --version and --triple.
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/MC/GOFF/ppa1.ll
--
Fri, Jan 7, 1:40 AM · Restricted Project
foad committed rG080f372ad364: [SystemZ][z/OS] Fix test failure when SystemZ target is not built (authored by foad).
[SystemZ][z/OS] Fix test failure when SystemZ target is not built
Fri, Jan 7, 1:39 AM
foad added a comment to D116270: [AMDGPU] Enable divergence-driven XNOR selection.

SIInstrInfo::lowerScalarXnor() is dead after your patch

Fri, Jan 7, 1:14 AM · Restricted Project

Thu, Jan 6

foad accepted D116284: [AMDGPU] Enable divergence-driven 'ctpop' selection.

LGTM.

Thu, Jan 6, 5:49 AM · Restricted Project
foad added a comment to D116231: [InstCombine] (~a & ~b & c) | (~a & ~c & b) --> (b ^ c) & ~a.

I don't really have an opinion on the patch, but I'm curious.
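
The fold named in the review title can be checked exhaustively with a short C++ sketch. original, folded and checkFold are hypothetical names. Because both sides are purely bitwise, trying every combination of all-zeros and all-ones inputs covers every bit position.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: (~a & ~b & c) | (~a & ~c & b)
uint32_t original(uint32_t a, uint32_t b, uint32_t c) {
  return (~a & ~b & c) | (~a & ~c & b);
}

// Right-hand side of the fold: (b ^ c) & ~a
uint32_t folded(uint32_t a, uint32_t b, uint32_t c) {
  return (b ^ c) & ~a;
}

// Check all 8 assignments of all-zeros or all-ones to a, b and c.
void checkFold() {
  for (unsigned i = 0; i < 8; ++i) {
    uint32_t a = (i & 1u) ? 0xFFFFFFFFu : 0u;
    uint32_t b = (i & 2u) ? 0xFFFFFFFFu : 0u;
    uint32_t c = (i & 4u) ? 0xFFFFFFFFu : 0u;
    assert(original(a, b, c) == folded(a, b, c));
  }
}
```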

Thu, Jan 6, 1:17 AM · Restricted Project

Wed, Jan 5

foad added a comment to D116616: [InstSimplify] use knownbits to fold more udiv/urem.

It seems like you're using knownbits information to derive range information. It would be good to do this more universally, and in both directions.

Yes - we do have a combination analysis called "computeConstantRangeIncludingKnownBits", so that should be more powerful.
But that is currently a static helper function in ValueTracking, so it would have to be made visible, and it would be good to have more tests to show where the extra logic gives us a better result. Ok to make that a TODO item?
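
To make the known-bits-to-range direction concrete, here is a minimal C++ sketch. It uses a toy 32-bit record rather than LLVM's KnownBits class. Known, minValue, maxValue and dividendKnownSmaller are hypothetical names. The fold it checks (x udiv y is 0 and x urem y is x when x is provably smaller than y) is one example of a fold this kind of reasoning enables, not necessarily the exact logic of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Toy known-bits record: a bit set in `zero` is known to be 0,
// a bit set in `one` is known to be 1, all other bits are unknown.
struct Known {
  uint32_t zero;
  uint32_t one;
};

// Smallest unsigned value consistent with the known bits
// (every unknown bit taken as 0).
uint32_t minValue(Known k) { return k.one; }

// Largest unsigned value consistent with the known bits
// (every unknown bit taken as 1).
uint32_t maxValue(Known k) { return ~k.zero; }

// True if every possible x is smaller than every possible y, in which
// case x udiv y is 0 and x urem y is x.
bool dividendKnownSmaller(Known x, Known y) {
  assert((x.zero & x.one) == 0 && (y.zero & y.one) == 0 &&
         "a bit cannot be known 0 and known 1");
  return maxValue(x) < minValue(y);
}
```

For example, if the top 28 bits of x are known zero and bit 4 of y is known one, then x is at most 15 and y is at least 16, so the check succeeds.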

Wed, Jan 5, 6:18 AM · Restricted Project
foad added a comment to D116616: [InstSimplify] use knownbits to fold more udiv/urem.

It seems like you're using knownbits information to derive range information. It would be good to do this more universally, and in both directions.

Wed, Jan 5, 3:12 AM · Restricted Project
foad added a comment to D116469: [AMDGPU] Correct the known bits calculation for MUL_I24..

This patch seems to change codegen in test/CodeGen/AMDGPU/lshl64-to-32.ll but I haven't managed to understand whether the change is good, bad or indifferent.

I'm not seeing any failure for that test in my local runs. The test appears to be generated by update_llc_test_checks.py and rerunning the script produced no changes. Am I missing something?

Wed, Jan 5, 1:54 AM · Restricted Project

Tue, Jan 4

foad added inline comments to D116529: [GlobalISel] Fold or of shifts with constant amount to funnel shift..
Tue, Jan 4, 2:16 AM · Restricted Project
foad added a comment to D116469: [AMDGPU] Correct the known bits calculation for MUL_I24..

This patch seems to change codegen in test/CodeGen/AMDGPU/lshl64-to-32.ll but I haven't managed to understand whether the change is good, bad or indifferent.

Tue, Jan 4, 2:09 AM · Restricted Project
foad added inline comments to D116500: [Support] Add KnownBits::countMaxSignedBits(). Make KnownBits::countMinSignBits() always return at least 1..
Tue, Jan 4, 1:26 AM · Restricted Project
foad added a comment to D116522: [ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC.

Rename APInt::getMinSignedBits->getSignificantBits

Tue, Jan 4, 1:19 AM · Restricted Project

Sun, Jan 2

foad added a comment to D116469: [AMDGPU] Correct the known bits calculation for MUL_I24..

Both changes look good to me but it's not clear how they are related, and it seems odd to include the generic change in a patch which claims to be AMDGPU-specific. Would it make sense to commit the generic change separately first?

Sun, Jan 2, 1:20 AM · Restricted Project

Fri, Dec 31

foad added a reviewer for D116270: [AMDGPU] Enable divergence-driven XNOR selection: foad.
Fri, Dec 31, 3:57 AM · Restricted Project
foad added a comment to D116273: [AMDGPU] Iterate LoweredEndCf in the reverse order.

What was the effect of inserting multiple branch instructions? Did it fail MIR verification?

Fri, Dec 31, 3:43 AM · Restricted Project
foad added inline comments to D116284: [AMDGPU] Enable divergence-driven 'ctpop' selection.
Fri, Dec 31, 3:40 AM · Restricted Project
foad committed rG866b195cb9d7: [AMDGPU] Regenerate checks for waitcnt-overflow.mir (authored by foad).
[AMDGPU] Regenerate checks for waitcnt-overflow.mir
Fri, Dec 31, 3:32 AM
foad accepted D116423: [SelectionDAG] Use KnownBits::countMinSignBits() to simplify the end of ComputeNumSignBits..
Fri, Dec 31, 2:27 AM · Restricted Project

Thu, Dec 23

foad committed rG74ce7ff5dc5b: [AMDGPU] Remove a TODO that was done by D98081 (authored by foad).
[AMDGPU] Remove a TODO that was done by D98081
Thu, Dec 23, 2:20 AM
foad added inline comments to D116187: [AMDGPU] Select build_vector DAG nodes according to the divergence.
Thu, Dec 23, 2:13 AM · Restricted Project

Dec 21 2021

foad added inline comments to D114198: [GlobalISel] Rework more/fewer elements for vectors.
Dec 21 2021, 6:41 AM · Restricted Project
foad added a comment to D116042: [AMDGPU][InstCombine] Remove zero LOD bias.

I like it.

Dec 21 2021, 6:13 AM · Restricted Project