
foad (Jay Foad)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 29 2014, 9:58 AM (273 w, 6 d)

Recent Activity

Today

foad added a comment to D73509: [MachineScheduler] relax successor chain on clustering.

Typo in summary: successfor -> successor?

Tue, Jan 28, 3:37 AM · Restricted Project
foad added a comment to D73509: [MachineScheduler] relax successor chain on clustering.

This doesn't fix the problem that inspired D71717. Consider the first test case in memory_clause.ll. With baseline llvm I get:

$ bin/llc -march=amdgcn -mcpu=gfx902 -verify-machineinstrs -amdgpu-enable-global-sgpr-addr -o /dev/null ~/git/llvm-project/llvm/test/CodeGen/AMDGPU/memory_clause.ll -debug-only=machine-scheduler |& egrep "^Cluster|Machine code for function"
# Machine code for function vector_clause: NoPHIs, TracksLiveness
Cluster ld/st SU(2) - SU(3)
Cluster ld/st SU(6) - SU(7)
Cluster ld/st SU(8) - SU(9)
Cluster ld/st SU(11) - SU(13)
# Machine code for function vector_clause: NoPHIs, TracksLiveness
[...]

Your patch doesn't change this, but with D71717 I get:

$ bin/llc -march=amdgcn -mcpu=gfx902 -verify-machineinstrs -amdgpu-enable-global-sgpr-addr -o /dev/null ~/git/llvm-project/llvm/test/CodeGen/AMDGPU/memory_clause.ll -debug-only=machine-scheduler |& egrep "^Cluster|Machine code for function"
# Machine code for function vector_clause: NoPHIs, TracksLiveness
Cluster ld/st SU(2) - SU(3)
Cluster ld/st SU(6) - SU(7)
Cluster ld/st SU(8) - SU(9)
Cluster ld/st SU(10) - SU(11)
Cluster ld/st SU(12) - SU(13)
# Machine code for function vector_clause: NoPHIs, TracksLiveness
[...]

So now it is considering all of SU(10) .. SU(13) for clustering, instead of just SU(11) and SU(13). The relevant SUs are:

SU(6):   %12:vreg_128 = GLOBAL_LOAD_DWORDX4_SADDR %26:vreg_64, %4:sreg_64_xexec, 0, 0, 0, 0, implicit $exec, implicit $exec :: (load 16 from %ir.tmp3, addrspace 1)
SU(7):   %15:vreg_128 = GLOBAL_LOAD_DWORDX4_SADDR %26:vreg_64, %4:sreg_64_xexec, 16, 0, 0, 0, implicit $exec, implicit $exec :: (load 16 from %ir.tmp72, addrspace 1)
SU(8):   %17:vreg_128 = GLOBAL_LOAD_DWORDX4_SADDR %26:vreg_64, %4:sreg_64_xexec, 32, 0, 0, 0, implicit $exec, implicit $exec :: (load 16 from %ir.tmp116, addrspace 1)
SU(9):   %19:vreg_128 = GLOBAL_LOAD_DWORDX4_SADDR %26:vreg_64, %4:sreg_64_xexec, 48, 0, 0, 0, implicit $exec, implicit $exec :: (load 16 from %ir.tmp1510, addrspace 1)
SU(10):   GLOBAL_STORE_DWORDX4_SADDR %26:vreg_64, %12:vreg_128, %5:sreg_64_xexec, 0, 0, 0, 0, implicit $exec, implicit $exec :: (store 16 into %ir.tmp5, addrspace 1)
SU(11):   GLOBAL_STORE_DWORDX4_SADDR %26:vreg_64, %15:vreg_128, %5:sreg_64_xexec, 16, 0, 0, 0, implicit $exec, implicit $exec :: (store 16 into %ir.tmp94, addrspace 1)
SU(12):   GLOBAL_STORE_DWORDX4_SADDR %26:vreg_64, %17:vreg_128, %5:sreg_64_xexec, 32, 0, 0, 0, implicit $exec, implicit $exec :: (store 16 into %ir.tmp138, addrspace 1)
SU(13):   GLOBAL_STORE_DWORDX4_SADDR %26:vreg_64, %19:vreg_128, %5:sreg_64_xexec, 48, 0, 0, 0, implicit $exec, implicit $exec :: (store 16 into %ir.tmp1712, addrspace 1)
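
The two runs above can also be compared mechanically by parsing the "Cluster ld/st" lines into SU pairs and diffing the sets. This is only a minimal sketch: it assumes the debug output format shown above, and the names parse_clusters, BASELINE and WITH_D71717 are made up for illustration and are not part of LLVM.

```python
import re

# Matches lines such as "Cluster ld/st SU(11) - SU(13)" from the debug output above.
CLUSTER_RE = re.compile(r"^Cluster ld/st SU\((\d+)\) - SU\((\d+)\)$")


def parse_clusters(log_text):
    """Return the set of (a, b) SU number pairs reported as clustered."""
    pairs = set()
    for line in log_text.splitlines():
        match = CLUSTER_RE.match(line.strip())
        if match:
            pairs.add((int(match.group(1)), int(match.group(2))))
    return pairs


# Cluster lines copied from the baseline run above.
BASELINE = """\
Cluster ld/st SU(2) - SU(3)
Cluster ld/st SU(6) - SU(7)
Cluster ld/st SU(8) - SU(9)
Cluster ld/st SU(11) - SU(13)
"""

# Cluster lines copied from the run with D71717 applied.
WITH_D71717 = """\
Cluster ld/st SU(2) - SU(3)
Cluster ld/st SU(6) - SU(7)
Cluster ld/st SU(8) - SU(9)
Cluster ld/st SU(10) - SU(11)
Cluster ld/st SU(12) - SU(13)
"""

baseline = parse_clusters(BASELINE)
patched = parse_clusters(WITH_D71717)

# Pairs that appear in only one of the two runs.
only_baseline = sorted(baseline - patched)
only_patched = sorted(patched - baseline)

print("only in baseline:", only_baseline)
print("only with D71717:", only_patched)
```

The difference is confined to the stores SU(10) .. SU(13); the load clusters are identical in both runs.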
Tue, Jan 28, 3:27 AM · Restricted Project
foad committed rG4a331beadc3a: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi (authored by foad).
Tue, Jan 28, 3:00 AM
foad closed D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi.
Tue, Jan 28, 3:00 AM · Restricted Project
foad added inline comments to D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi.
Tue, Jan 28, 3:00 AM · Restricted Project
foad added inline comments to D73485: [AMDGPU] Simplify DS and SM cases in getMemOperandsWithOffset.
Tue, Jan 28, 2:05 AM · Restricted Project
foad updated the diff for D73485: [AMDGPU] Simplify DS and SM cases in getMemOperandsWithOffset.

Add a TODO comment about M0.

Tue, Jan 28, 1:56 AM · Restricted Project

Yesterday

foad committed rGcbbbd5b5f617: [GlobalISel] Make use of KnownBits::computeForAddSub (authored by foad).
Mon, Jan 27, 2:27 PM
foad closed D73431: [GlobalISel] Make use of KnownBits::computeForAddSub.
Mon, Jan 27, 2:27 PM · Restricted Project
foad updated the diff for D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi.

Use CHECK-NEXT instead of CHECK-NOT.

Mon, Jan 27, 2:25 PM · Restricted Project
foad committed rGe37997cc0de0: [AMDGPU] Simplify test and extend to gfx9 and gfx10 (authored by foad).
Mon, Jan 27, 9:04 AM
foad closed D70708: [AMDGPU] Simplify test and extend to gfx9 and gfx10.
Mon, Jan 27, 9:04 AM · Restricted Project
foad added a comment to D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi.

Ping!

Mon, Jan 27, 9:03 AM · Restricted Project
foad added a comment to rG1f950ced5046: GlobalISel: Define G_READCYCLECOUNTER.

Can you add G_READCYCLECOUNTER to docs/GlobalISel/GenericOpcode.rst?

Mon, Jan 27, 8:45 AM
foad added a comment to D72002: GlobalISel: Handle llvm.read_register.

Can you add G_READ_REGISTER and G_WRITE_REGISTER to docs/GlobalISel/GenericOpcode.rst?

Mon, Jan 27, 8:45 AM · Restricted Project
foad added a comment to rG0ea3c7291fb8: GlobalISel: Handle llvm.read_register.

Can you add G_READ_REGISTER and G_WRITE_REGISTER to docs/GlobalISel/GenericOpcode.rst?

Mon, Jan 27, 8:44 AM
foad updated the diff for D70708: [AMDGPU] Simplify test and extend to gfx9 and gfx10.

Track liveness.

Mon, Jan 27, 8:21 AM · Restricted Project
foad created D73485: [AMDGPU] Simplify DS and SM cases in getMemOperandsWithOffset.
Mon, Jan 27, 8:05 AM · Restricted Project
foad committed rG6461eadf8fff: [AMDGPU] Handle multiple base operands in shouldClusterMemOps (authored by foad).
Mon, Jan 27, 6:49 AM
foad committed rG1bf00219fc80: [AMDGPU] Handle multiple base operands in areMemAccessesTriviallyDisjoint (authored by foad).
Mon, Jan 27, 6:49 AM
foad committed rGfcf5254fa792: [AMDGPU] Handle frame index base operands in memOpsHaveSameBasePtr (authored by foad).
Mon, Jan 27, 6:49 AM
foad closed D73455: [AMDGPU] Handle multiple base operands in shouldClusterMemOps.
Mon, Jan 27, 6:49 AM · Restricted Project
foad closed D73456: [AMDGPU] Handle multiple base operands in areMemAccessesTriviallyDisjoint.
Mon, Jan 27, 6:49 AM · Restricted Project
foad closed D73454: [AMDGPU] Handle frame index base operands in memOpsHaveSameBasePtr.
Mon, Jan 27, 6:49 AM · Restricted Project
foad created D73456: [AMDGPU] Handle multiple base operands in areMemAccessesTriviallyDisjoint.
Mon, Jan 27, 3:02 AM · Restricted Project
foad created D73455: [AMDGPU] Handle multiple base operands in shouldClusterMemOps.
Mon, Jan 27, 3:02 AM · Restricted Project
foad added a child revision for D73455: [AMDGPU] Handle multiple base operands in shouldClusterMemOps: D73456: [AMDGPU] Handle multiple base operands in areMemAccessesTriviallyDisjoint.
Mon, Jan 27, 3:02 AM · Restricted Project
foad created D73454: [AMDGPU] Handle frame index base operands in memOpsHaveSameBasePtr.
Mon, Jan 27, 3:02 AM · Restricted Project
foad added a child revision for D73454: [AMDGPU] Handle frame index base operands in memOpsHaveSameBasePtr: D73455: [AMDGPU] Handle multiple base operands in shouldClusterMemOps.
Mon, Jan 27, 3:02 AM · Restricted Project

Sun, Jan 26

foad created D73431: [GlobalISel] Make use of KnownBits::computeForAddSub.
Sun, Jan 26, 6:43 AM · Restricted Project

Fri, Jan 24

foad accepted D73292: [AMDGPU] Correct NumLoads in clustering.

LGTM! Perhaps add a comment in TargetInstrInfo.h documenting the argument? E.g. "the number of loads that will be in the cluster if this hook returns true"...?

Fri, Jan 24, 12:27 PM · Restricted Project
foad added inline comments to D72737: [AMDGPU] Bundle loads before post-RA scheduler.
Fri, Jan 24, 3:58 AM · Restricted Project
foad added a comment to D73292: [AMDGPU] Correct NumLoads in clustering.

I just cannot deal with any single performance regression in all of the targets, including out of tree targets.

Fri, Jan 24, 3:49 AM · Restricted Project
foad added a comment to D73292: [AMDGPU] Correct NumLoads in clustering.

I tried something similar in D72325.

Comments there argue about how much we should cluster, but regardless, I do not think we should use wrong data. If we want more clustering we need to increase the thresholds, but still rely on correct input.

Fri, Jan 24, 1:15 AM · Restricted Project

Thu, Jan 23

foad added a comment to D73292: [AMDGPU] Correct NumLoads in clustering.

I tried something similar in D72325.

Thu, Jan 23, 2:20 PM · Restricted Project
foad committed rGb482e1bfe29d: [CodeGen] Make use of MachineInstrBuilder::getReg (authored by foad).
Thu, Jan 23, 5:45 AM
foad closed D73262: [CodeGen] Make use of MachineInstrBuilder::getReg.
Thu, Jan 23, 5:44 AM · Restricted Project
foad added inline comments to D73051: [GlobalISel][AMDGPU] Saturating add/subtract.
Thu, Jan 23, 4:41 AM · Restricted Project
foad added reviewers for D73051: [GlobalISel][AMDGPU] Saturating add/subtract: arsenm, Petar.Avramovic, aemerson, aditya_nandakumar, dsanders, volkan, bogner, rovka, paquette.
Thu, Jan 23, 4:23 AM · Restricted Project
foad updated the diff for D73051: [GlobalISel][AMDGPU] Saturating add/subtract.

Rebase and address some review comments.

Thu, Jan 23, 4:23 AM · Restricted Project
foad created D73262: [CodeGen] Make use of MachineInstrBuilder::getReg.
Thu, Jan 23, 3:55 AM · Restricted Project

Wed, Jan 22

foad committed rGe0f0d0e55cc7: [MachineScheduler] Allow clustering mem ops with complex addresses (authored by foad).
Wed, Jan 22, 6:34 AM
foad closed D71655: [MachineScheduler] Allow clustering mem ops with complex addresses.
Wed, Jan 22, 6:34 AM · Restricted Project

Mon, Jan 20

foad updated the diff for D73051: [GlobalISel][AMDGPU] Saturating add/subtract.

Add legalize tests for signed 16-bit operations.

Mon, Jan 20, 9:02 AM · Restricted Project
foad created D73051: [GlobalISel][AMDGPU] Saturating add/subtract.
Mon, Jan 20, 8:42 AM · Restricted Project

Fri, Jan 17

foad added a comment to D71655: [MachineScheduler] Allow clustering mem ops with complex addresses.

Ping!

Fri, Jan 17, 1:47 AM · Restricted Project

Thu, Jan 16

foad committed rG63f73545dd89: [GlobalISel] Pass MachineOperands into MachineIRBuilder helper methods (authored by foad).
Thu, Jan 16, 8:19 AM
foad committed rG885260d5d805: [GlobalISel] Don't arbitrarily limit a mask to 64 bits (authored by foad).
Thu, Jan 16, 8:19 AM
foad closed D72853: [GlobalISel] Don't arbitrarily limit a mask to 64 bits.
Thu, Jan 16, 8:19 AM · Restricted Project
foad closed D72849: [GlobalISel] Pass MachineOperands into MachineIRBuilder helper methods.
Thu, Jan 16, 8:19 AM · Restricted Project
foad created D72853: [GlobalISel] Don't arbitrarily limit a mask to 64 bits.
Thu, Jan 16, 7:59 AM · Restricted Project
foad added inline comments to D72833: [GlobalISel] Use more MachineIRBuilder helper methods.
Thu, Jan 16, 7:59 AM · Restricted Project
foad created D72849: [GlobalISel] Pass MachineOperands into MachineIRBuilder helper methods.
Thu, Jan 16, 7:50 AM · Restricted Project
foad committed rG28bb43bdf808: [GlobalISel] Use more MachineIRBuilder helper methods (authored by foad).
Thu, Jan 16, 7:40 AM
foad closed D72833: [GlobalISel] Use more MachineIRBuilder helper methods.
Thu, Jan 16, 7:40 AM · Restricted Project
foad updated the diff for D72842: [GlobalISel] Tweak lowering of G_SMULO/G_UMULO.

Use MIRBuilder.getTII().

Thu, Jan 16, 7:11 AM · Restricted Project
foad updated the diff for D72833: [GlobalISel] Use more MachineIRBuilder helper methods.

Remove G_SMULO/G_UMULO changes which have been split out into D72842.

Thu, Jan 16, 6:53 AM · Restricted Project
foad created D72842: [GlobalISel] Tweak lowering of G_SMULO/G_UMULO.
Thu, Jan 16, 6:51 AM · Restricted Project
foad added a comment to D72833: [GlobalISel] Use more MachineIRBuilder helper methods.

Sorry @arsenm, I updated the patch just before I saw your approval.

Thu, Jan 16, 5:45 AM · Restricted Project
foad updated the diff for D72833: [GlobalISel] Use more MachineIRBuilder helper methods.

Added similar fixes in IRTranslator.cpp.

Thu, Jan 16, 5:42 AM · Restricted Project
foad created D72833: [GlobalISel] Use more MachineIRBuilder helper methods.
Thu, Jan 16, 4:18 AM · Restricted Project
foad added a comment to D72800: [MachineScheduler] Don't swap when we can't cluster.

This patch would only change behaviour if the target's shouldClusterMemOps(SUa, SUb) might return a different answer from shouldClusterMemOps(SUb, SUa).
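
The symmetry condition in this comment can be checked with a small model. This is a hedged sketch, not the LLVM hook: memory operations are modelled as (base, offset) tuples, and same_base and ascending_only are hypothetical stand-ins for what a target's shouldClusterMemOps might do.

```python
from itertools import combinations


def is_symmetric(predicate, items):
    """Return True if predicate(a, b) == predicate(b, a) for every pair of items."""
    return all(predicate(a, b) == predicate(b, a)
               for a, b in combinations(items, 2))


# Hypothetical predicate: cluster whenever the base registers match.
# The answer does not depend on argument order.
def same_base(a, b):
    return a[0] == b[0]


# Hypothetical predicate: cluster only if the first op has the lower offset.
# Swapping the arguments changes the answer.
def ascending_only(a, b):
    return a[0] == b[0] and a[1] < b[1]


# Memory ops modelled as (base register, byte offset); values are illustrative.
ops = [("%26", 0), ("%26", 16), ("%26", 32)]

symmetric_result = is_symmetric(same_base, ops)
asymmetric_result = is_symmetric(ascending_only, ops)

print(symmetric_result, asymmetric_result)
```

Under this model, swapping the arguments could only matter for a predicate like ascending_only; for an order-independent predicate like same_base it makes no difference.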

Thu, Jan 16, 1:22 AM · Restricted Project

Wed, Jan 15

foad added a comment to D72737: [AMDGPU] Bundle loads before post-RA scheduler.

We have moved uses of loaded values further from their loads, which is good. As far as I understand, these changes are induced by the removal of the artificial edges that were created by MemOpClusterMutation. These edges linked the successors of any load to all the nodes in a cluster and restricted the scheduling.
In sign_extend.ll the change is due to store clustering: we have moved the v_ashrrev_i32_e32 producing v2 past the v_ashrrev_i32_e32 producing v3 because the store cluster uses them in this order. Before, this was harder to do because of the artificial edges linking all predecessors to all stores.

Wed, Jan 15, 9:17 AM · Restricted Project
foad updated the diff for D72392: [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics.

Rebase.

Wed, Jan 15, 4:01 AM · Restricted Project
foad added a reviewer for D72392: [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics: fhahn.
Wed, Jan 15, 4:01 AM · Restricted Project
foad added a comment to D72737: [AMDGPU] Bundle loads before post-RA scheduler.

There are nice changes in a bunch of tests, where we're preserving clusters instead of breaking them apart.

Wed, Jan 15, 1:42 AM · Restricted Project

Tue, Jan 14

foad committed rGb777e551f044: [MachineScheduler] Reduce reordering due to mem op clustering (authored by foad).
Tue, Jan 14, 11:23 AM
foad closed D72706: [MachineScheduler] Reduce reordering due to mem op clustering.
Tue, Jan 14, 11:23 AM · Restricted Project
foad created D72706: [MachineScheduler] Reduce reordering due to mem op clustering.
Tue, Jan 14, 7:52 AM · Restricted Project
foad accepted D72669: [AMDGPU] Model distance to instruction in bundle.

LGTM

Tue, Jan 14, 1:13 AM · Restricted Project
foad committed rG440ce5164f52: [AMDGPU] Remove duplicate gfx10 assembler and disassembler tests (authored by foad).
Tue, Jan 14, 12:27 AM
foad closed D72616: [AMDGPU] Remove duplicate gfx10 assembler and disassembler tests.
Tue, Jan 14, 12:27 AM · Restricted Project
foad committed rG0950de264e37: [AMDGPU] Improve error checking in gfx10 assembler tests (authored by foad).
Tue, Jan 14, 12:27 AM
foad committed rG63c3691f7917: [AMDGPU] Add gfx9 assembler and disassembler test cases (authored by foad).
Tue, Jan 14, 12:27 AM
foad closed D72592: [AMDGPU] Add gfx9 assembler and disassembler test cases.
Tue, Jan 14, 12:27 AM · Restricted Project
foad closed D72611: [AMDGPU] Improve error checking in gfx10 assembler tests.
Tue, Jan 14, 12:27 AM · Restricted Project

Mon, Jan 13

foad added a comment to D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

Any update on this?

Mon, Jan 13, 11:50 AM · Restricted Project
foad added a comment to D72616: [AMDGPU] Remove duplicate gfx10 assembler and disassembler tests.

Here's the disassembler diff with a bit more context:

diff --git a/llvm/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt b/llvm/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt
index c1ec51ee7ad..3e040460988 100644
--- a/llvm/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt
+++ b/llvm/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt
@@ -5773,17 +5773,14 @@
Mon, Jan 13, 11:41 AM · Restricted Project
foad created D72616: [AMDGPU] Remove duplicate gfx10 assembler and disassembler tests.
Mon, Jan 13, 6:26 AM · Restricted Project
foad added a child revision for D72611: [AMDGPU] Improve error checking in gfx10 assembler tests: D72616: [AMDGPU] Remove duplicate gfx10 assembler and disassembler tests.
Mon, Jan 13, 6:26 AM · Restricted Project
foad created D72611: [AMDGPU] Improve error checking in gfx10 assembler tests.
Mon, Jan 13, 5:48 AM · Restricted Project
foad created D72592: [AMDGPU] Add gfx9 assembler and disassembler test cases.
Mon, Jan 13, 2:21 AM · Restricted Project

Sun, Jan 12

foad committed rG241f330d6bab: [AMDGPU] Add gfx8 assembler and disassembler test cases (authored by foad).
Sun, Jan 12, 1:20 PM
foad closed D72561: [AMDGPU] Add gfx8 assembler and disassembler test cases.
Sun, Jan 12, 1:20 PM · Restricted Project

Sat, Jan 11

foad created D72561: [AMDGPU] Add gfx8 assembler and disassembler test cases.
Sat, Jan 11, 1:26 PM · Restricted Project

Fri, Jan 10

foad accepted D72535: Let targets to adjust operand latency of bundles.

I'd prefer to call adjustSchedDependency unconditionally, but we can do that later.

Fri, Jan 10, 2:06 PM · Restricted Project
foad updated the diff for D72392: [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics.

New tests for this specific fix:
test/CodeGen/PowerPC/botheightreduce.mir
test/CodeGen/PowerPC/topdepthreduce-postra.mir
test/CodeGen/X86/topdepthreduce-postra.mir

Fri, Jan 10, 8:02 AM · Restricted Project
foad added a comment to D72487: [AMDGPU] Fix bundle scheduling.

This is only relevant for post-RA scheduling because we don't have any bundles when we do pre-RA scheduling, right?

Fri, Jan 10, 3:34 AM · Restricted Project

Thu, Jan 9

foad added a comment to D72325: [AMDGPU] Fix cluster size threshold calculation.

Don't we *want* clusters that large, and even larger?

Thu, Jan 9, 7:05 AM · Restricted Project
foad added a comment to D72392: [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics.

Also, do you have any performance/codesize numbers for impacted targets?

Thu, Jan 9, 4:14 AM · Restricted Project
foad added a comment to D72392: [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics.

The diff in llvm/test/CodeGen/X86/testb-je-fusion.ll is:

        movl    %edi, %eax
-       addl    $-512, %eax             # imm = 0xFE00
        movb    $1, (%rsi)
+       addl    $-512, %eax             # imm = 0xFE00
        je      .LBB2_2
...
        movl    %edi, %eax
-       decl    %eax
        movb    $1, (%rsi)
+       decl    %eax
        je      .LBB3_2

The scheduler prefers not to put the addl/decl immediately after the first movl because of the register dependency on eax with latency 1.

Thu, Jan 9, 4:04 AM · Restricted Project
foad added a comment to D72392: [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics.

The diff in llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll is:

        adrp    x8, .LCPI0_0
-       ldr     h0, [x8, :lo12:.LCPI0_0]
        movi    v1.2d, #0000000000000000
        movi    v2.2d, #0000000000000000
+       ldr     h0, [x8, :lo12:.LCPI0_0]
        movi    v3.2d, #0000000000000000

The scheduler prefers not to put the ldr immediately after the adrp because of the register dependency on x8 with latency 1.

Thu, Jan 9, 3:59 AM · Restricted Project

Wed, Jan 8

foad added a reviewer for D71655: [MachineScheduler] Allow clustering mem ops with complex addresses: MatzeB.
Wed, Jan 8, 9:05 AM · Restricted Project
foad added a comment to D71655: [MachineScheduler] Allow clustering mem ops with complex addresses.

Mostly LGTM, except this diff seems to have absorbed a number of recent changes.

Wed, Jan 8, 9:05 AM · Restricted Project
foad updated the summary of D71655: [MachineScheduler] Allow clustering mem ops with complex addresses.
Wed, Jan 8, 9:05 AM · Restricted Project
foad added reviewers for D71655: [MachineScheduler] Allow clustering mem ops with complex addresses: t.p.northover, kparzysz, jpienaar, tstellar, craig.topper.

Adding target owners to review the (mostly mechanical) changes to affected targets.

Wed, Jan 8, 7:59 AM · Restricted Project
foad updated the diff for D71655: [MachineScheduler] Allow clustering mem ops with complex addresses.

New simpler approach. This version only introduces one new target hook,
getMemOperandsWithOffset, and leaves the old getMemOperandWithOffset as
a convenience function. It also leaves the immediate Offset as an
int64_t value in bytes, instead of trying to represent it as one or more
MachineOperands.
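
The relationship between the two hooks described above can be modelled roughly as follows. This is a Python sketch of the idea only, not the C++ interface: instructions are plain dictionaries, the function names are snake_case stand-ins, and the rule that the convenience function gives up unless there is exactly one base operand is an assumption made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def get_mem_operands_with_offset(instr: Dict) -> Optional[Tuple[List[str], int]]:
    """Model of the new hook: all base operands plus one immediate byte offset."""
    if "bases" not in instr:
        return None  # not a memory operation this model can describe
    return list(instr["bases"]), int(instr["offset"])


def get_mem_operand_with_offset(instr: Dict) -> Optional[Tuple[str, int]]:
    """Model of the old hook, kept as a convenience wrapper around the new one."""
    result = get_mem_operands_with_offset(instr)
    if result is None:
        return None
    bases, offset = result
    # Assumption: the single-operand form only succeeds with exactly one base.
    if len(bases) != 1:
        return None
    return bases[0], offset


# Illustrative instructions; the operand names are hypothetical.
single = {"bases": ["vaddr"], "offset": 16}
double = {"bases": ["vaddr", "saddr"], "offset": 16}

print(get_mem_operand_with_offset(single))
print(get_mem_operand_with_offset(double))
print(get_mem_operands_with_offset(double))
```

In this model the offset stays a plain integer in bytes throughout, mirroring the choice described above to keep it as an int64_t rather than as one or more MachineOperands.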

Wed, Jan 8, 7:59 AM · Restricted Project
foad created D72392: [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics.
Wed, Jan 8, 3:50 AM · Restricted Project
foad created D72389: [lit] Run lit -s -vv by default instead of lit -sv.
Wed, Jan 8, 2:02 AM · Restricted Project

Tue, Jan 7

foad created D72325: [AMDGPU] Fix cluster size threshold calculation.
Tue, Jan 7, 5:22 AM · Restricted Project