jrbyrnes (Jeffrey Byrnes)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 27 2022, 6:44 PM (82 w, 5 d)

Recent Activity

Tue, Sep 26

jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Fix signedness handling of any_extend

Tue, Sep 26, 11:37 AM · Restricted Project, Restricted Project

Fri, Sep 22

jrbyrnes added a comment to D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support.

@jrbyrnes shouldn't we land the AMDGPU changes separately first? Or would it all be dead code without the rest of this patch?
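
For readers unfamiliar with the fold named in the revision title, the following is a minimal sketch in plain C++ integers, not SelectionDAG code. It assumes a 32-bit value truncated to 8 bits and zero-extended back; the helper names `zextTrunc8`, `upperBitsZero`, and `foldDemo` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// zext(trunc x) for i32 -> i8 -> i32, modeled with plain integer casts.
static uint32_t zextTrunc8(uint32_t x) {
  return static_cast<uint32_t>(static_cast<uint8_t>(x));
}

// The fold to plain x is only valid when bits 8..31 of x are zero.
static bool upperBitsZero(uint32_t x) { return (x & 0xFFFFFF00u) == 0; }

static void foldDemo() {
  uint32_t v = 0xAB123456u;

  // A logical shift right by 24 leaves the upper 24 bits known zero,
  // so zext(trunc(v >> 24)) can be replaced by (v >> 24).
  uint32_t shifted = v >> 24;
  assert(upperBitsZero(shifted));
  assert(zextTrunc8(shifted) == shifted);

  // Without that guarantee the fold would change the value.
  assert(!upperBitsZero(v));
  assert(zextTrunc8(v) != v);
}
```

The shift case is the kind of input the "add SRL support" part of the title refers to, under the assumption that the shift amount is a known constant.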

Fri, Sep 22, 5:02 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Fix dereference issue + nits (reorganize logic + comments)

Fri, Sep 22, 2:00 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Extract signedness checking

Fri, Sep 22, 1:12 PM · Restricted Project, Restricted Project

Thu, Sep 21

jrbyrnes added a comment to D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support.

Thanks -- the AMDGPUISD::PERM side LGTM

Thu, Sep 21, 8:59 AM · Restricted Project, Restricted Project

Wed, Sep 20

jrbyrnes added inline comments to D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support.
Wed, Sep 20, 5:41 PM · Restricted Project, Restricted Project
jrbyrnes added a comment to D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support.

Can you please bring in tests from https://github.com/llvm/llvm-project/pull/66965

Wed, Sep 20, 5:36 PM · Restricted Project, Restricted Project
jrbyrnes added a comment to D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support.

Disregard previous version of comment -- I did not see the most recent changes.

Wed, Sep 20, 3:24 PM · Restricted Project, Restricted Project
jrbyrnes added a comment to D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support.

What is the best way to generalize the performOrCombine handling to support ISD::FSHR as well?

Wed, Sep 20, 12:19 PM · Restricted Project, Restricted Project

Tue, Sep 19

jrbyrnes added a reviewer for D155995: [AMDGPU]: Allow combining into v_dot4: foad.
Tue, Sep 19, 4:32 PM · Restricted Project, Restricted Project

Fri, Sep 15

jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

IsSigned tracks whether to produce an instruction with signed behavior. In some cases we can determine this from the semantics of the top-level instruction; in others we need more information and must look at the tree itself.
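
To illustrate why the signedness choice matters, here is a small reference model of a four-lane byte dot product with accumulate, written in standard C++. It is a sketch of the arithmetic only, not the patch's code or an instruction definition; `signedByte`, `unsignedByte`, `dot4`, and `dot4Demo` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Extract byte lane i of a packed 32-bit word, interpreted as signed.
static int32_t signedByte(uint32_t v, unsigned i) {
  return static_cast<int8_t>((v >> (8 * i)) & 0xFF);
}

// Extract byte lane i of a packed 32-bit word, interpreted as unsigned.
static int32_t unsignedByte(uint32_t v, unsigned i) {
  return static_cast<int32_t>((v >> (8 * i)) & 0xFF);
}

// acc + sum over four lanes of a[i] * b[i], with the lanes read as
// signed or unsigned bytes depending on isSigned.
static int32_t dot4(uint32_t a, uint32_t b, int32_t acc, bool isSigned) {
  for (unsigned i = 0; i < 4; ++i) {
    int32_t x = isSigned ? signedByte(a, i) : unsignedByte(a, i);
    int32_t y = isSigned ? signedByte(b, i) : unsignedByte(b, i);
    acc += x * y;
  }
  return acc;
}

static void dot4Demo() {
  // Lane 0 of a is 0xFF: -1 when read as signed, 255 when read as unsigned.
  assert(dot4(0x000000FFu, 0x00000002u, 0, true) == -2);
  assert(dot4(0x000000FFu, 0x00000002u, 0, false) == 510);
}
```

The same packed inputs give different results under the two interpretations, which is why the combine has to settle signedness before choosing an instruction.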

Fri, Sep 15, 1:19 PM · Restricted Project, Restricted Project

Wed, Sep 13

jrbyrnes added inline comments to D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy..
Wed, Sep 13, 5:11 PM · Restricted Project, Restricted Project
jrbyrnes committed rG372115fadddc: [AMDGPU] Precommit test for i8 vector CopyToReg handling patch (authored by jrbyrnes).
[AMDGPU] Precommit test for i8 vector CopyToReg handling patch
Wed, Sep 13, 11:28 AM · Restricted Project, Restricted Project
jrbyrnes closed D159303: [AMDGPU] Precommit test for i8 vector CopyToReg handling patch.
Wed, Sep 13, 11:27 AM · Restricted Project, Restricted Project

Tue, Sep 12

jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Bring in https://github.com/llvm/llvm-project/pull/65995

Tue, Sep 12, 5:00 PM · Restricted Project, Restricted Project
jrbyrnes reopened D155995: [AMDGPU]: Allow combining into v_dot4.

Reopen for review as it has been reverted, now includes https://github.com/llvm/llvm-project/pull/65995

Tue, Sep 12, 4:58 PM · Restricted Project, Restricted Project

Thu, Sep 7

jrbyrnes abandoned D155868: [AMDGPU] Add patterns for v_dot*_IU for GFX11.

Abandon in favor of https://reviews.llvm.org/D155995

Thu, Sep 7, 1:07 PM · Restricted Project, Restricted Project
jrbyrnes committed rG7fda1b74be4a: [AMDGPU]: Allow combining into v_dot4 (authored by jrbyrnes).
[AMDGPU]: Allow combining into v_dot4
Thu, Sep 7, 1:06 PM · Restricted Project, Restricted Project
jrbyrnes closed D155995: [AMDGPU]: Allow combining into v_dot4.
Thu, Sep 7, 1:06 PM · Restricted Project, Restricted Project

Wed, Sep 6

jrbyrnes abandoned D133731: [AMDGPU] Add Lower Bound to PipelineSolver.

Abandoning since rules provide a framework for closely fitting pipelines to user specifications.

Wed, Sep 6, 3:47 PM · Restricted Project, Restricted Project
jrbyrnes reopened D133731: [AMDGPU] Add Lower Bound to PipelineSolver.

Reopened due to commit reversion

Wed, Sep 6, 3:46 PM · Restricted Project, Restricted Project

Tue, Sep 5

jrbyrnes updated the diff for D159303: [AMDGPU] Precommit test for i8 vector CopyToReg handling patch.

Address comments + Add tests & change test structure of vni8-across-blocks

Tue, Sep 5, 3:03 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D155995: [AMDGPU]: Allow combining into v_dot4.
Tue, Sep 5, 2:34 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Address comments

Tue, Sep 5, 2:34 PM · Restricted Project, Restricted Project
jrbyrnes added a comment to D155995: [AMDGPU]: Allow combining into v_dot4.

ping

Tue, Sep 5, 7:22 AM · Restricted Project, Restricted Project

Thu, Aug 31

jrbyrnes requested review of D159303: [AMDGPU] Precommit test for i8 vector CopyToReg handling patch.
Thu, Aug 31, 11:31 AM · Restricted Project, Restricted Project

Aug 30 2023

jrbyrnes updated the diff for D159036: [AMDGPU] Accept arbitrary sized sources in CalculateByteProvider.

Account for max scalar size of i256. Factor out common code.

Aug 30 2023, 12:56 PM · Restricted Project, Restricted Project

Aug 28 2023

jrbyrnes requested review of D159036: [AMDGPU] Accept arbitrary sized sources in CalculateByteProvider.
Aug 28 2023, 4:38 PM · Restricted Project, Restricted Project

Aug 25 2023

jrbyrnes added inline comments to D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy..
Aug 25 2023, 12:59 PM · Restricted Project, Restricted Project
jrbyrnes committed rG3ba8dabbf31b: [AMDGPU] Add sdot4 / sdot8 intrinsics for gfx11 (authored by jrbyrnes).
[AMDGPU] Add sdot4 / sdot8 intrinsics for gfx11
Aug 25 2023, 11:47 AM · Restricted Project, Restricted Project
jrbyrnes closed D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.
Aug 25 2023, 11:46 AM · Restricted Project, Restricted Project
jrbyrnes accepted D158845: [NFC][AMDGPU] assert we've found a value before use.

Thanks, LGTM

Aug 25 2023, 8:01 AM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D158845: [NFC][AMDGPU] assert we've found a value before use.
Aug 25 2023, 7:00 AM · Restricted Project, Restricted Project

Aug 24 2023

jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Address Comments

Aug 24 2023, 11:57 AM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.

Include comments about dot*c, remove unintended changes to test

Aug 24 2023, 10:50 AM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.

Add note for intrinsics

Aug 24 2023, 10:35 AM · Restricted Project, Restricted Project
jrbyrnes added a comment to D155868: [AMDGPU] Add patterns for v_dot*_IU for GFX11.

This is better than the combiner - if it doesn't completely blow up compile time. It's probably easier to avoid the compile time problems with the combine though

Aug 24 2023, 9:43 AM · Restricted Project, Restricted Project

Aug 23 2023

jrbyrnes added inline comments to D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy..
Aug 23 2023, 6:09 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.
Aug 23 2023, 9:48 AM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.
Aug 23 2023, 9:45 AM · Restricted Project, Restricted Project

Aug 22 2023

jrbyrnes added inline comments to D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy..
Aug 22 2023, 4:28 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D155995: [AMDGPU]: Allow combining into v_dot4.
Aug 22 2023, 12:45 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Rebase (for https://reviews.llvm.org/D158468) and lower with intrinsics.

Aug 22 2023, 12:45 PM · Restricted Project, Restricted Project
jrbyrnes added a reviewer for D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11: Joe_Nash.
Aug 22 2023, 12:20 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.
Aug 22 2023, 12:18 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.

Properly handle neg modifier

Aug 22 2023, 12:17 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.
Aug 22 2023, 11:23 AM · Restricted Project, Restricted Project
jrbyrnes retitled D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11 from [AMDGPU] Add sdot4 / sdot8 intrinsics for gfx11 to [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.
Aug 22 2023, 11:09 AM · Restricted Project, Restricted Project

Aug 21 2023

jrbyrnes requested review of D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11.
Aug 21 2023, 5:21 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy..
Aug 21 2023, 5:01 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy..
Aug 21 2023, 9:40 AM · Restricted Project, Restricted Project

Aug 17 2023

jrbyrnes abandoned D127994: [AMDGPU] Expose CLI controls for IGroup ordering.

We have since taken the UI for this framework in a different direction. Most of what can be achieved through this UI can be achieved with already existing controls / interface (e.g. SGB). I don't see any need to pursue this further; will reopen if there are opposing opinions.

Aug 17 2023, 4:11 PM · Restricted Project, Restricted Project
jrbyrnes committed rGd26a06728da8: [DAG] NFC: Add getBitcastedExtOrTrunc (authored by jrbyrnes).
[DAG] NFC: Add getBitcastedExtOrTrunc
Aug 17 2023, 2:30 PM · Restricted Project, Restricted Project
jrbyrnes closed D157733: [DAG] NFC: Add getBitcasedExtOrTrunc.
Aug 17 2023, 2:30 PM · Restricted Project, Restricted Project
jrbyrnes retitled D157733: [DAG] NFC: Add getBitcasedExtOrTrunc from [DAG] NFC: Add getScalarizeExtOrTrunc to [DAG] NFC: Add getBitcasedExtOrTrunc.
Aug 17 2023, 2:28 PM · Restricted Project, Restricted Project

Aug 16 2023

jrbyrnes updated the diff for D157733: [DAG] NFC: Add getBitcasedExtOrTrunc.

Clean up getBitcasted* implementations

Aug 16 2023, 3:40 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Fix non-determinism caused by the iteration order of DenseMap. Use SmallVector instead (worst-case lookup cost is a non-factor given the small size).
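
The trade-off described here can be sketched in standard C++ as follows. This is an illustration under assumptions, not the patch's code: `std::vector` stands in for `SmallVector`, and `Entry`, `lookup`, `insertUnique`, and `orderDemo` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A small key -> value table kept in insertion order.
using Entry = std::pair<int, std::string>;

// Linear lookup: O(n), acceptable when n is only a handful of elements.
static std::string *lookup(std::vector<Entry> &table, int key) {
  auto it = std::find_if(table.begin(), table.end(),
                         [key](const Entry &e) { return e.first == key; });
  return it == table.end() ? nullptr : &it->second;
}

// Insert only if the key is new, so iteration order is first-insertion order.
static void insertUnique(std::vector<Entry> &table, int key,
                         std::string value) {
  if (!lookup(table, key))
    table.emplace_back(key, std::move(value));
}

static void orderDemo() {
  std::vector<Entry> table;
  insertUnique(table, 42, "a");
  insertUnique(table, 7, "b");
  insertUnique(table, 42, "ignored"); // duplicate key, no effect
  assert(table.size() == 2);
  assert(table[0].first == 42 && table[1].first == 7);
  assert(*lookup(table, 7) == "b");
}
```

Iterating the vector always visits entries in the order they were first inserted, so any output derived from that walk is reproducible across runs.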

Aug 16 2023, 11:07 AM · Restricted Project, Restricted Project

Aug 11 2023

jrbyrnes closed D157564: [MCP] Invalidate copy for super register in copy source.

f76ffc1f406ecdb5a8329e8a10a56f0ce2f6220c

Aug 11 2023, 1:56 PM · Restricted Project, Restricted Project
jrbyrnes updated the summary of D155995: [AMDGPU]: Allow combining into v_dot4.
Aug 11 2023, 1:50 PM · Restricted Project, Restricted Project
jrbyrnes retitled D155995: [AMDGPU]: Allow combining into v_dot4 from [AMDGPU] WIP: Allow matching into v_dot4 to [AMDGPU]: Allow combining into v_dot4.
Aug 11 2023, 1:49 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Rebase + clean up code. Still running tests but no longer a WIP.

Aug 11 2023, 1:49 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D157733: [DAG] NFC: Add getBitcasedExtOrTrunc.

Naming + function API

Aug 11 2023, 12:54 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D157733: [DAG] NFC: Add getBitcasedExtOrTrunc.

Add getExtOrTrunc for Any/Z/S.

Aug 11 2023, 12:09 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D157733: [DAG] NFC: Add getBitcasedExtOrTrunc.
Aug 11 2023, 10:18 AM · Restricted Project, Restricted Project
jrbyrnes requested review of D157733: [DAG] NFC: Add getBitcasedExtOrTrunc.
Aug 11 2023, 9:29 AM · Restricted Project, Restricted Project
jrbyrnes committed rGf76ffc1f406e: [MCP] Invalidate copy for super register in copy source (authored by jrbyrnes).
[MCP] Invalidate copy for super register in copy source
Aug 11 2023, 9:04 AM · Restricted Project, Restricted Project
jrbyrnes committed rGd0e54e377b57: [AMDGPU] Extend CalculateByteProvider to capture vectors and signed (authored by jrbyrnes).
[AMDGPU] Extend CalculateByteProvider to capture vectors and signed
Aug 11 2023, 8:59 AM · Restricted Project, Restricted Project
jrbyrnes closed D157133: [AMDGPU] Extend CalculateByteProvider to capture vectors and signed.
Aug 11 2023, 8:59 AM · Restricted Project, Restricted Project

Aug 10 2023

jrbyrnes updated the diff for D157564: [MCP] Invalidate copy for super register in copy source.

RegUnitsToInvalidate

Aug 10 2023, 6:44 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D157564: [MCP] Invalidate copy for super register in copy source.

Convert RegsToInvalidate back to MCRegister as well.

Aug 10 2023, 6:35 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D157564: [MCP] Invalidate copy for super register in copy source.

Delete partial conversion to regunits

Aug 10 2023, 6:31 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D157564: [MCP] Invalidate copy for super register in copy source.

Always try to invalidate both the LastSeenUseInCopy and CopyInfo->MI.

Aug 10 2023, 6:24 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D157564: [MCP] Invalidate copy for super register in copy source.
Aug 10 2023, 1:52 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D157564: [MCP] Invalidate copy for super register in copy source.

Use LastSeenUseInCopy + use regunits in more places.

Aug 10 2023, 1:52 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D157564: [MCP] Invalidate copy for super register in copy source.
Aug 10 2023, 10:25 AM · Restricted Project, Restricted Project

Aug 9 2023

jrbyrnes added a comment to D157564: [MCP] Invalidate copy for super register in copy source.
5: USE r6
6: r1:4 = COPY r6:9

Hi @jrbyrnes, do you know why MCP fails to invalidate r7:r9 at label 5? In my view, when we invalidate r6 at label 5, we should also find the copy that defines or uses r6 (label 6) and invalidate the regunits of both its source operand (r6:r9) and its destination operand (r1:r4), which is what invalidateRegister is doing.

          RegsToInvalidate.insert(
              CopyOperands->Destination->getReg().asMCReg());
          RegsToInvalidate.insert(CopyOperands->Source->getReg().asMCReg());
...
    for (MCRegister InvalidReg : RegsToInvalidate)
      for (MCRegUnit Unit : TRI.regunits(InvalidReg))
        Copies.erase(Unit);
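
The behavior being asked about can be modeled with a toy tracker in standard C++. This is a sketch of the idea only and does not reproduce MachineCopyPropagation; registers are modeled as lists of unit ids, and `Unit`, `Reg`, `Copy`, and `Tracker` are hypothetical types.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Unit = int;
using Reg = std::vector<Unit>; // a register is modeled as the units it covers

struct Copy {
  Reg dst, src;
};

struct Tracker {
  std::vector<Copy> copies;
  std::map<Unit, int> unitToCopy; // unit -> index into copies

  // Record a copy under every unit of its destination and source.
  void track(const Copy &c) {
    int idx = static_cast<int>(copies.size());
    copies.push_back(c);
    for (Unit u : c.dst) unitToCopy[u] = idx;
    for (Unit u : c.src) unitToCopy[u] = idx;
  }

  // Invalidate r, plus every unit of any copy that touches a unit of r.
  void invalidate(const Reg &r) {
    std::set<Unit> toErase(r.begin(), r.end());
    for (Unit u : r) {
      auto it = unitToCopy.find(u);
      if (it == unitToCopy.end()) continue;
      const Copy &c = copies[it->second];
      toErase.insert(c.dst.begin(), c.dst.end());
      toErase.insert(c.src.begin(), c.src.end());
    }
    for (Unit u : toErase) unitToCopy.erase(u);
  }

  bool isTracked(Unit u) const { return unitToCopy.count(u) != 0; }
};
```

With a copy from units 6..9 into units 1..4 tracked, calling `invalidate` on unit 6 alone also drops units 7..9 and 1..4 in this model, which matches the expectation stated in the question.
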
Aug 9 2023, 7:36 PM · Restricted Project, Restricted Project
jrbyrnes updated the summary of D157564: [MCP] Invalidate copy for super register in copy source.
Aug 9 2023, 4:44 PM · Restricted Project, Restricted Project
jrbyrnes requested review of D157564: [MCP] Invalidate copy for super register in copy source.
Aug 9 2023, 4:40 PM · Restricted Project, Restricted Project

Aug 8 2023

jrbyrnes updated the diff for D157133: [AMDGPU] Extend CalculateByteProvider to capture vectors and signed.

Fix Extract element id handling

Aug 8 2023, 4:24 PM · Restricted Project, Restricted Project

Aug 4 2023

jrbyrnes updated the diff for D157133: [AMDGPU] Extend CalculateByteProvider to capture vectors and signed.

Remove remaining redundant code.

Aug 4 2023, 3:44 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D157133: [AMDGPU] Extend CalculateByteProvider to capture vectors and signed.
Aug 4 2023, 3:39 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D157133: [AMDGPU] Extend CalculateByteProvider to capture vectors and signed.

Address comments + remove redundant ValueSize checks and handling

Aug 4 2023, 3:39 PM · Restricted Project, Restricted Project
jrbyrnes added a reviewer for D155995: [AMDGPU]: Allow combining into v_dot4: gandhi21299.
Aug 4 2023, 1:28 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Rebase + Extended algorithm for more complete coverage of potential trees.

Aug 4 2023, 1:27 PM · Restricted Project, Restricted Project
jrbyrnes added a comment to D155995: [AMDGPU]: Allow combining into v_dot4.

SLP vectorization should be tuned, but that seems like a separate issue. Trees corresponding to v_dot4 often contain s/zext because the final destination is 32-bit while the arithmetic operations involve 8-bit operands. Introducing s/zext into the tree confuses the SLP vectorization cost model, which then thinks it is vectorizing 32-bit operands. The main issue is that the cost model looks at only one node of the vectorizable tree at a time when calculating cost, instead of also considering the sequence as a whole. If we were to vectorize, codegen may be significantly less complex for these.

Aug 4 2023, 1:25 PM · Restricted Project, Restricted Project
jrbyrnes requested review of D157133: [AMDGPU] Extend CalculateByteProvider to capture vectors and signed.
Aug 4 2023, 1:25 PM · Restricted Project, Restricted Project

Jul 28 2023

jrbyrnes committed rG391249d1afe4: [AMDGPU] Allow 8,16 bit sources in calculateSrcByte (authored by jrbyrnes).
[AMDGPU] Allow 8,16 bit sources in calculateSrcByte
Jul 28 2023, 9:51 AM · Restricted Project, Restricted Project
jrbyrnes closed D155864: [AMDGPU] Allow 8,16 bit sources in calculateSrcByte.
Jul 28 2023, 9:51 AM · Restricted Project, Restricted Project

Jul 26 2023

jrbyrnes planned changes to D155995: [AMDGPU]: Allow combining into v_dot4.

Nothing necessarily planned at the moment, just want to block the review for now.

Jul 26 2023, 4:53 PM · Restricted Project, Restricted Project

Jul 24 2023

jrbyrnes added a comment to D155995: [AMDGPU]: Allow combining into v_dot4.

passes psdb

Jul 24 2023, 1:38 PM · Restricted Project, Restricted Project
jrbyrnes updated the summary of D155864: [AMDGPU] Allow 8,16 bit sources in calculateSrcByte.
Jul 24 2023, 1:38 PM · Restricted Project, Restricted Project
jrbyrnes updated the diff for D155864: [AMDGPU] Allow 8,16 bit sources in calculateSrcByte.

Address comments + rework "hasEightBitAccesses".

Jul 24 2023, 1:37 PM · Restricted Project, Restricted Project

Jul 21 2023

jrbyrnes updated the diff for D155995: [AMDGPU]: Allow combining into v_dot4.

Fix some errors.

Jul 21 2023, 5:16 PM · Restricted Project, Restricted Project
jrbyrnes added a comment to D155868: [AMDGPU] Add patterns for v_dot*_IU for GFX11.

Will abandon if https://reviews.llvm.org/D155995 supersedes selection of these instructions.

Jul 21 2023, 2:23 PM · Restricted Project, Restricted Project
jrbyrnes added inline comments to D155995: [AMDGPU]: Allow combining into v_dot4.
Jul 21 2023, 2:22 PM · Restricted Project, Restricted Project
jrbyrnes requested review of D155995: [AMDGPU]: Allow combining into v_dot4.
Jul 21 2023, 2:21 PM · Restricted Project, Restricted Project

Jul 20 2023

jrbyrnes added a comment to D155868: [AMDGPU] Add patterns for v_dot*_IU for GFX11.

We used to pattern match all the dot operations, but stopped because of a ridiculous blow up in compile time. Have you tried measuring that?

Also look at the generated selection tables. This shouldn't be one of the first patterns tried.

Jul 20 2023, 11:13 AM · Restricted Project, Restricted Project
jrbyrnes requested review of D155868: [AMDGPU] Add patterns for v_dot*_IU for GFX11.
Jul 20 2023, 10:42 AM · Restricted Project, Restricted Project
jrbyrnes requested review of D155864: [AMDGPU] Allow 8,16 bit sources in calculateSrcByte.
Jul 20 2023, 10:20 AM · Restricted Project, Restricted Project

Jul 13 2023

jrbyrnes committed rG6b7805fcb182: [AMDGPU][IGLP] Add iglp_opt(1) strategy for single wave gemms (authored by jrbyrnes).
[AMDGPU][IGLP] Add iglp_opt(1) strategy for single wave gemms
Jul 13 2023, 12:04 PM · Restricted Project, Restricted Project