Yesterday

rampitec added inline comments to D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.
Thu, Feb 20, 5:56 PM · Restricted Project
rampitec added inline comments to D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.
Thu, Feb 20, 5:49 PM · Restricted Project
rampitec added a comment to D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.

I don't think copies of these should ever be produced (at least for the high half) since the high half is not really addressable, and only appears that way to some instructions. Where are copies coming from?

First, hi16 registers are used by load_hi instructions; they are the destination. And then RA can happily copy anything to anything. For sanity we need to know how to copy any register.

The high result isn't what's encoded though, so they really are writing the 32-bit register. They only read the low 16 bits. I think the correct way to model this is a 32-bit write but only a 16-bit read.

The low 16 bits are preserved, and if we say we write 32 bits then we cannot model that.

I think declaring the high 16 bits as the output register is still wrong and not how it's encoded. Having only the 16-bit read is still an improvement.

Thu, Feb 20, 5:47 PM · Restricted Project
rampitec added a comment to D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.

I don't think copies of these should ever be produced (at least for the high half) since the high half is not really addressable, and only appears that way to some instructions. Where are copies coming from?

First, hi16 registers are used by load_hi instructions; they are the destination. And then RA can happily copy anything to anything. For sanity we need to know how to copy any register.

The high result isn't what's encoded though, so they really are writing the 32-bit register. They only read the low 16 bits. I think the correct way to model this is a 32-bit write but only a 16-bit read.

Thu, Feb 20, 5:29 PM · Restricted Project
rampitec added inline comments to D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.
Thu, Feb 20, 5:23 PM · Restricted Project
rampitec added a comment to D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.

I don't think copies of these should ever be produced (at least for the high half) since the high half is not really addressable, and only appears that way to some instructions. Where are copies coming from?

Thu, Feb 20, 5:11 PM · Restricted Project
rampitec added a parent revision for D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs: D74873: [AMDGPU] Define 16 bit VGPR subregs.
Thu, Feb 20, 4:51 PM · Restricted Project
rampitec added a child revision for D74873: [AMDGPU] Define 16 bit VGPR subregs: D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.
Thu, Feb 20, 4:51 PM · Restricted Project
rampitec created D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs.
Thu, Feb 20, 4:51 PM · Restricted Project

Wed, Feb 19

rampitec created D74873: [AMDGPU] Define 16 bit VGPR subregs.
Wed, Feb 19, 2:55 PM · Restricted Project
rampitec committed rG03954a12aecb: [AMDGPU] Fix DS_WRITE_B32 patterns (authored by rampitec).
[AMDGPU] Fix DS_WRITE_B32 patterns
Wed, Feb 19, 1:50 PM
rampitec closed D74868: [AMDGPU] Fix DS_WRITE_B32 patterns.
Wed, Feb 19, 1:49 PM · Restricted Project
rampitec created D74868: [AMDGPU] Fix DS_WRITE_B32 patterns.
Wed, Feb 19, 1:29 PM · Restricted Project
rampitec accepted D74848: AMDGPU: Move dot intrinsic patterns to instruction def.
Wed, Feb 19, 10:28 AM · Restricted Project
rampitec accepted D74832: AMDGPU: Use default operand for VOP3P clamp.
Wed, Feb 19, 9:31 AM · Restricted Project
rampitec added a comment to D74805: [AMDGPU] Fix assumption about LaneBitmask content.

Test?

Wed, Feb 19, 9:21 AM · Restricted Project
rampitec committed rGada205e91eb0: [AMDGPU] Fix assumption about LaneBitmask content (authored by rampitec).
[AMDGPU] Fix assumption about LaneBitmask content
Wed, Feb 19, 9:13 AM
rampitec closed D74805: [AMDGPU] Fix assumption about LaneBitmask content.
Wed, Feb 19, 9:13 AM · Restricted Project

Tue, Feb 18

rampitec created D74805: [AMDGPU] Fix assumption about LaneBitmask content.
Tue, Feb 18, 4:22 PM · Restricted Project
rampitec accepted D74804: AMDGPU: Use undef_tied_input on VOP3P instructions.
Tue, Feb 18, 4:12 PM · Restricted Project
rampitec added a comment to D74744: [TBLGEN] Inhibit generation of unneeded psets.

@rampitec, since you've been looking at register pressure sets, do you have any suggestions on how to prevent VK16 and VK16WM from being merged on X86? Most of our instructions that use these classes use the VK16WM class, which contains one less register than VK16. All of the registers in both classes are allocatable. So tracking VK16WM pressure correctly is important to make sure we don't overcount by 1 in situations where everything is constrained to VK16WM.

Tue, Feb 18, 12:44 PM · Restricted Project
rampitec committed rGdd4766451eca: [AMDGPU] Use generated RegisterPressureSets enum (authored by rampitec).
[AMDGPU] Use generated RegisterPressureSets enum
Tue, Feb 18, 10:35 AM
rampitec closed D74671: [AMDGPU] Use generated RegisterPressureSets enum.
Tue, Feb 18, 10:35 AM · Restricted Project
rampitec committed rGb2a958a01385: [TBLGEN] Emit register pressure set enum (authored by rampitec).
[TBLGEN] Emit register pressure set enum
Tue, Feb 18, 10:26 AM
rampitec closed D74649: [TBLGEN] Emit register pressure set enum.
Tue, Feb 18, 10:25 AM · Restricted Project
rampitec added a comment to D72031: [Scheduling] Create the missing dependency edges for store cluster.

LGTM, but please wait for other responses.

Tue, Feb 18, 12:24 AM · Restricted Project

Mon, Feb 17

rampitec added a comment to D72031: [Scheduling] Create the missing dependency edges for store cluster.

The test changes look neutral to me now, and the logic seems plausible. How come only AMDGPU tests are affected?

Mon, Feb 17, 11:38 PM · Restricted Project
rampitec added reviewers for D74649: [TBLGEN] Emit register pressure set enum: kerbowa, vpykhtin.
Mon, Feb 17, 5:27 PM · Restricted Project
rampitec committed rG8e760e1018d1: [TBLGEN] Inhibit generation of unneeded psets (authored by rampitec).
[TBLGEN] Inhibit generation of unneeded psets
Mon, Feb 17, 3:50 PM
rampitec closed D74744: [TBLGEN] Inhibit generation of unneeded psets.
Mon, Feb 17, 3:49 PM · Restricted Project
rampitec added a comment to D74649: [TBLGEN] Emit register pressure set enum.

This is still needed even after D74744. See D74671 for why.

Mon, Feb 17, 3:49 PM · Restricted Project
rampitec accepted D74745: AMDGPU: Enable integer division bypass.

LGTM

Mon, Feb 17, 3:49 PM · Restricted Project
rampitec added a comment to D74744: [TBLGEN] Inhibit generation of unneeded psets.

How much does this help compile time?

Mon, Feb 17, 3:40 PM · Restricted Project
rampitec added a comment to D74744: [TBLGEN] Inhibit generation of unneeded psets.

FYI, these are all the PSets generated for AMDGPU with this patch, instead of 255 of them:

Mon, Feb 17, 3:40 PM · Restricted Project
rampitec added a reviewer for D74649: [TBLGEN] Emit register pressure set enum: nhaehnle.
Mon, Feb 17, 3:30 PM · Restricted Project
rampitec created D74744: [TBLGEN] Inhibit generation of unneeded psets.
Mon, Feb 17, 3:30 PM · Restricted Project
rampitec accepted D74711: AMDGPU/GlobalISel: Allow arbitrary global values.

LGTM

Mon, Feb 17, 11:26 AM · Restricted Project
rampitec accepted D74459: AMDGPU/GlobalISel: Custom lower 32-bit G_SDIV/G_SREM.
Mon, Feb 17, 11:26 AM · Restricted Project
rampitec accepted D74446: AMDGPU/GlobalISel: Custom lower 32-bit G_UDIV/G_UREM.

LGTM

Mon, Feb 17, 10:31 AM · Restricted Project
rampitec added inline comments to D74446: AMDGPU/GlobalISel: Custom lower 32-bit G_UDIV/G_UREM.
Mon, Feb 17, 9:54 AM · Restricted Project
rampitec added inline comments to D74711: AMDGPU/GlobalISel: Allow arbitrary global values.
Mon, Feb 17, 9:54 AM · Restricted Project

Sun, Feb 16

rampitec updated the diff for D74649: [TBLGEN] Emit register pressure set enum.

Produce legal pset names from start.

Sun, Feb 16, 10:18 AM · Restricted Project
rampitec added inline comments to D74649: [TBLGEN] Emit register pressure set enum.
Sun, Feb 16, 9:24 AM · Restricted Project
rampitec added a comment to D74671: [AMDGPU] Use generated RegisterPressureSets enum.

LGTM. I agree with Matt that it would be great not to create those other pressure sets in the first place, possibly by adding some .td file construct to explicitly list the ones that are required. That said, this change is clearly a step in the right direction.

Sun, Feb 16, 9:15 AM · Restricted Project

Sat, Feb 15

rampitec added a parent revision for D74671: [AMDGPU] Use generated RegisterPressureSets enum: D74649: [TBLGEN] Emit register pressure set enum.
Sat, Feb 15, 10:40 AM · Restricted Project
rampitec added a child revision for D74649: [TBLGEN] Emit register pressure set enum: D74671: [AMDGPU] Use generated RegisterPressureSets enum.
Sat, Feb 15, 10:40 AM · Restricted Project
rampitec created D74671: [AMDGPU] Use generated RegisterPressureSets enum.
Sat, Feb 15, 10:40 AM · Restricted Project
rampitec updated the diff for D74649: [TBLGEN] Emit register pressure set enum.

Moved to the enum section.

Sat, Feb 15, 10:27 AM · Restricted Project

Fri, Feb 14

rampitec added a comment to D74649: [TBLGEN] Emit register pressure set enum.

Can we just stop emitting all of the pressure sets we don't use instead?

Fri, Feb 14, 4:07 PM · Restricted Project
rampitec abandoned D73509: [MachineScheduler] relax successor chain on clustering.
Fri, Feb 14, 4:03 PM · Restricted Project
rampitec committed rG922197d664d3: [TBLGEN] Allow to override RC weight (authored by rampitec).
[TBLGEN] Allow to override RC weight
Fri, Feb 14, 3:53 PM
rampitec closed D74509: [TBLGEN] Allow to override RC weight.
Fri, Feb 14, 3:52 PM · Restricted Project
rampitec updated the diff for D74509: [TBLGEN] Allow to override RC weight.

Replaced double with single quotes in char output.

Fri, Feb 14, 3:43 PM · Restricted Project
rampitec created D74649: [TBLGEN] Emit register pressure set enum.
Fri, Feb 14, 2:02 PM · Restricted Project
rampitec accepted D74594: [AMDGPU] Fix some tests that did not specify -mcpu.

LGTM

Fri, Feb 14, 12:03 PM · Restricted Project
rampitec accepted D74629: AMDGPU/GlobalISel: Improve 16-bit bswap.

LGTM

Fri, Feb 14, 11:53 AM · Restricted Project
rampitec accepted D74630: [AMDGPU] Always enable XNACK feature when support is explicitly requested.
Fri, Feb 14, 11:44 AM · Restricted Project

Thu, Feb 13

rampitec updated the diff for D74509: [TBLGEN] Allow to override RC weight.

Updated comment.

Thu, Feb 13, 5:07 PM · Restricted Project
rampitec added inline comments to D74509: [TBLGEN] Allow to override RC weight.
Thu, Feb 13, 4:57 PM · Restricted Project
rampitec accepted D74563: AMDGPU: Use v_perm_b32 to implement bswap.

LGTM

Thu, Feb 13, 8:51 AM · Restricted Project

Wed, Feb 12

rampitec added inline comments to D74524: [Scheduling] Improve memory ops cluster preparation.
Wed, Feb 12, 11:11 PM · Restricted Project
rampitec added a reviewer for D74524: [Scheduling] Improve memory ops cluster preparation: foad.
Wed, Feb 12, 10:53 PM · Restricted Project
rampitec created D74509: [TBLGEN] Allow to override RC weight.
Wed, Feb 12, 2:59 PM · Restricted Project
rampitec committed rGf8d044bbcfdc: [TBLGEN] Fix subreg value overflow in DAGISelMatcher (authored by rampitec).
[TBLGEN] Fix subreg value overflow in DAGISelMatcher
Wed, Feb 12, 1:38 PM
rampitec closed D74368: [TBLGEN] Fix subreg value overflow in DAGISelMatcher.
Wed, Feb 12, 1:38 PM · Restricted Project
rampitec added inline comments to D74368: [TBLGEN] Fix subreg value overflow in DAGISelMatcher.
Wed, Feb 12, 11:01 AM · Restricted Project
rampitec updated the diff for D74368: [TBLGEN] Fix subreg value overflow in DAGISelMatcher.

Addressed review comments.

Wed, Feb 12, 9:57 AM · Restricted Project

Tue, Feb 11

rampitec accepted D74442: AMDGPU: Add option to disable CGP division expansion.
Tue, Feb 11, 2:59 PM · Restricted Project
rampitec committed rGd538dc05f3b5: [AMDGPU] Fixed subreg use in sdwa-scalar-ops.mir. NFC (authored by rampitec).
[AMDGPU] Fixed subreg use in sdwa-scalar-ops.mir. NFC
Tue, Feb 11, 2:31 PM
rampitec accepted D73073: AMDGPU: Add option to expand 64-bit integer division in IR.
Tue, Feb 11, 2:13 PM · Restricted Project
rampitec accepted D74437: AMDGPU: Use conditions directly in division expansion.
Tue, Feb 11, 1:10 PM · Restricted Project
rampitec accepted D74434: AMDGPU: Don't expand more special div cases in IR.

LGTM, but don't you want to handle it in the IR right away instead of leaving a TODO?

Tue, Feb 11, 1:00 PM · Restricted Project
rampitec accepted D74435: AMDGPU: Add baseline tests for CGP div expansion.
Tue, Feb 11, 12:51 PM · Restricted Project
rampitec accepted D74432: AMDGPU: Fix crash on v3i15 kernel arguments.

LGTM

Tue, Feb 11, 12:15 PM · Restricted Project
rampitec committed rG453a8f3af781: [AMDGPU] Remove AMDGPURegisterInfo (authored by rampitec).
[AMDGPU] Remove AMDGPURegisterInfo
Tue, Feb 11, 11:19 AM
rampitec closed D74426: [AMDGPU] Remove AMDGPURegisterInfo.
Tue, Feb 11, 11:19 AM · Restricted Project
rampitec created D74426: [AMDGPU] Remove AMDGPURegisterInfo.
Tue, Feb 11, 11:01 AM · Restricted Project
rampitec accepted D74408: AMDGPU: Don't create potentially dead rcp declarations.
Tue, Feb 11, 11:01 AM · Restricted Project
rampitec accepted D74410: AMDGPU: Directly use rcp intrinsic in idiv expansions.

LGTM

Tue, Feb 11, 11:01 AM · Restricted Project

Mon, Feb 10

rampitec updated the diff for D74368: [TBLGEN] Fix subreg value overflow in DAGISelMatcher.

Addressed review comments.

Mon, Feb 10, 11:24 PM · Restricted Project
rampitec updated the diff for D74368: [TBLGEN] Fix subreg value overflow in DAGISelMatcher.

Moved check inside test.

Mon, Feb 10, 5:06 PM · Restricted Project
rampitec created D74368: [TBLGEN] Fix subreg value overflow in DAGISelMatcher.
Mon, Feb 10, 4:22 PM · Restricted Project
rampitec accepted D74345: AMDGPU: Don't report 2-byte alignment as fast.

LGTM after some HW clarification.

Mon, Feb 10, 12:27 PM · Restricted Project
rampitec added inline comments to D74345: AMDGPU: Don't report 2-byte alignment as fast.
Mon, Feb 10, 12:00 PM · Restricted Project
rampitec accepted D74323: AMDGPU: Move R600 test compatability hack.

LGTM

Mon, Feb 10, 9:01 AM · Restricted Project
rampitec accepted D74317: [AMDGPU] Fix non-deterministic iteration order.

LGTM

Mon, Feb 10, 9:01 AM · Restricted Project
rampitec committed rGed3527c64896: [AMDGPU] Split R600 and GCN subregs (authored by rampitec).
[AMDGPU] Split R600 and GCN subregs
Mon, Feb 10, 8:34 AM
rampitec closed D74248: [AMDGPU] Split R600 and GCN subregs.
Mon, Feb 10, 8:34 AM · Restricted Project

Fri, Feb 7

rampitec created D74248: [AMDGPU] Split R600 and GCN subregs.
Fri, Feb 7, 12:17 PM · Restricted Project
rampitec accepted D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Fri, Feb 7, 11:03 AM · Restricted Project
rampitec accepted D74227: [AMDGPU] Use @LINE for error checking in gfx10 assembler tests.

LGTM

Fri, Feb 7, 10:17 AM · Restricted Project

Thu, Feb 6

rampitec added a comment to D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Where are these failing LIT tests located? I did LIT tests before posting for review and also before integrating.
Maybe my check is incomplete. Thanks.

Thu, Feb 6, 9:26 PM · Restricted Project
rampitec requested changes to D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

Please make sure it passes llvm check tests.

Thu, Feb 6, 6:03 PM · Restricted Project
rampitec reopened D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .
Thu, Feb 6, 6:03 PM · Restricted Project
rampitec committed rGcacc3b7a557a: [AMDGPU] Cleanup assumptions about generated subregs (authored by rampitec).
[AMDGPU] Cleanup assumptions about generated subregs
Thu, Feb 6, 5:45 PM
rampitec committed rG2863c269683b: Revert "AMDGPU: Limit the search in finding the instruction pattern for v_swap… (authored by rampitec).
Revert "AMDGPU: Limit the search in finding the instruction pattern for v_swap…
Thu, Feb 6, 5:45 PM
rampitec closed D74177: [AMDGPU] Cleanup assumptions about generated subregs.
Thu, Feb 6, 5:45 PM · Restricted Project
rampitec added a reverting change for rG982780648124: AMDGPU: Limit the search in finding the instruction pattern for v_swap…: rG2863c269683b: Revert "AMDGPU: Limit the search in finding the instruction pattern for v_swap….
Thu, Feb 6, 5:45 PM
rampitec added a comment to D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

I have reverted it. It resulted in 96 lit test failures. Please run make check before posting a review.

Thu, Feb 6, 5:44 PM · Restricted Project
rampitec accepted D74180: AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. .

LGTM

Thu, Feb 6, 4:32 PM · Restricted Project