This is an archive of the discontinued LLVM Phabricator instance.

[NFC] Tests for future commit in DAGCombiner
ClosedPublic

Authored by kmitropoulou on Jun 21 2023, 3:55 PM.

Diff Detail

Event Timeline

kmitropoulou created this revision.Jun 21 2023, 3:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2023, 3:55 PM
kmitropoulou requested review of this revision.Jun 21 2023, 3:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2023, 3:55 PM
arsenm added a subscriber: arsenm.Jun 21 2023, 4:07 PM
arsenm added inline comments.
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

Why use mir for this?

24

Should also do some SGPR tests

Updating D153479: [NFC] Add tests for future commit in DAGCombiner

Updating D153479: [NFC] Add tests for future commit in DAGCombiner

kmitropoulou marked 2 inline comments as done.Jun 21 2023, 6:18 PM
kmitropoulou added inline comments.
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

CSE changes my optimization. Therefore, I need to do the checking earlier.

For example, the following test:

define i1 @test1(i32 %arg1, i32 %arg2) #0 {

%cmp1 = icmp slt i32 %arg1, 1000
%cmp2 = icmp slt i32 %arg2, 1000
%or  = or i1 %cmp1, %cmp2
ret i1 %or

}

will be optimized as follows with my optimization:

bb.0 (%ir-block.0):
  liveins: $vgpr0, $vgpr1
  %1:vgpr_32 = COPY $vgpr1
  %0:vgpr_32 = COPY $vgpr0
  %2:vgpr_32 = V_MIN_I32_e64 %0, %1, implicit $exec
  %3:sreg_32 = S_MOV_B32 1000
  %4:sreg_32_xm0_xexec = V_CMP_LT_I32_e64 killed %2, killed %3, implicit $exec
  %5:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, killed %4, implicit $exec
  $vgpr0 = COPY %5
  SI_RETURN implicit $vgpr0

This is the output after the instruction selection. After CSE, the predicate of the compare instruction changes:

; %bb.0:

s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
s_waitcnt_vscnt null, 0x0
v_min_i32_e32 v0, v0, v1
s_delay_alu instid0(VALU_DEP_1)
v_cmp_gt_i32_e32 vcc_lo, 0x3e8, v0
v_cndmask_b32_e64 v0, 0, 1, vcc_lo
s_setpc_b64 s[30:31]
kmitropoulou retitled this revision from [NFC] Add tests for future commit in DAGCombiner to [NFC] Tests for future commit in DAGCombiner.Jun 21 2023, 6:21 PM
kmitropoulou marked an inline comment as done.
kmitropoulou added a subscriber: Restricted Project.
arsenm added inline comments.Jun 21 2023, 6:45 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

I don't understand. I assume you mean MachineCSE? Is your patch not actually a DAG combine as the description states?

Can you stop somewhere after SIFixSGPRCopies instead?

Updating D153479: [NFC] Tests for future commit in DAGCombiner

kmitropoulou added inline comments.Jun 21 2023, 7:48 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

I am sorry I meant MachineCSE. The patch will upload implements is in DAGCombiner.
The new checks are generated after amdgpu-isel .

kmitropoulou added inline comments.Jun 21 2023, 7:50 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

*The patch that I will upload implements the optimization in DAGCombiner.

Updating D153479: [NFC] Tests for future commit in DAGCombiner

kmitropoulou added inline comments.Jun 21 2023, 9:43 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

I am sorry I did not understand your comment earlier :) I update the test.

Updating D153479: [NFC] Tests for future commit in DAGCombiner

arsenm added inline comments.Jun 22 2023, 6:41 AM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

My original point still stands. Why can't you test the end ISA? In general optimization patches are better of testing end to end unless you specifically need to check some intermediate state

Updating D153479: [NFC] Tests for future commit in DAGCombiner

kmitropoulou added inline comments.Jun 22 2023, 10:01 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

Lets' say we have the following test:

define i1 @test1(i32 %arg1, i32 %arg2) #0 {

%cmp1 = icmp slt i32 %arg1, 1000
%cmp2 = icmp slt i32 %arg2, 1000
%or  = or i1 %cmp1, %cmp2
ret i1 %or

}

The dump after SI Fix SGPR copies is:

bb.0 (%ir-block.0):
  liveins: $vgpr0, $vgpr1
  %1:vgpr_32 = COPY $vgpr1
  %0:vgpr_32 = COPY $vgpr0
  %2:vgpr_32 = V_MIN_I32_e64 %0, %1, implicit $exec
  %3:sreg_32 = S_MOV_B32 1000
  %4:sreg_32_xm0_xexec = V_CMP_LT_I32_e64 killed %2, killed %3, implicit $exec
  %5:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, killed %4, implicit $exec
  $vgpr0 = COPY %5
  SI_RETURN implicit $vgpr0

The final output is:

%bb.0:

s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
s_waitcnt_vscnt null, 0x0
v_min_i32_e32 v0, v0, v1
s_delay_alu instid0(VALU_DEP_1)
v_cmp_gt_i32_e32 vcc_lo, 0x3e8, v0
v_cndmask_b32_e64 v0, 0, 1, vcc_lo
s_setpc_b64 s[30:31]

So, it is easier to check the correctness of the optimization after SI Fix SGPR copies.

If you do not like it, then I can change it.

Updating D153479: [NFC] Tests for future commit in DAGCombiner

Updating D153479: [NFC] Tests for future commit in DAGCombiner

arsenm added inline comments.Jun 26 2023, 9:47 AM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

You don't actually care about the MIR here, so just go to ISA. You can disable the s_delay_salu insertion for gfx11 with the flag

950

amdgpu_gfx with inreg is less noisy

kmitropoulou added inline comments.Jun 26 2023, 10:28 AM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

Which flag?

foad added inline comments.Jun 26 2023, 10:47 AM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
3

-amdgpu-enable-delay-alu=0

Updating D153479: [NFC] Tests for future commit in DAGCombiner

kmitropoulou marked 3 inline comments as done.Jun 26 2023, 1:47 PM

Thank you all :)

Updating D153479: [NFC] Tests for future commit in DAGCombiner

arsenm accepted this revision.Jun 27 2023, 11:27 AM
This revision is now accepted and ready to land.Jun 27 2023, 11:27 AM

Updating D153479: [NFC] Tests for future commit in DAGCombiner

arsenm added inline comments.Jul 6 2023, 2:57 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
19–20

Do you need tests where these use different constants?

275–276

Some tests where the compare types don't match?

1101

Don't need these attributes

kmitropoulou marked 3 inline comments as done.

Updating D153479: [NFC] Tests for future commit in DAGCombiner

llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
275–276

In test62, the predicates are different and in test63, the compare types are different.

kmitropoulou added inline comments.Jul 6 2023, 3:32 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
19–20

I added test65.

Updating D153479: [NFC] Tests for future commit in DAGCombiner

Updating D153479: [NFC] Tests for future commit in DAGCombiner

arsenm added inline comments.Jul 14 2023, 5:15 AM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
1875

Also test f16 and v2f16

Updating D153479: [NFC] Tests for future commit in DAGCombiner

kmitropoulou marked an inline comment as done.Jul 14 2023, 4:08 PM
kmitropoulou added inline comments.Jul 14 2023, 4:11 PM
llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
1875

I added tests: test84, test85, test86, test87, test88, test89, tes90 and test91.

Updating D153479: [NFC] Tests for future commit in DAGCombiner

This revision was landed with ongoing or failed builds.Jul 17 2023, 5:09 PM
This revision was automatically updated to reflect the committed changes.