This is an archive of the discontinued LLVM Phabricator instance.

[MachineCopyPropagation] BackwardPropagatableCopy: add check for hasOverlappingMultipleDef
ClosedPublic

Authored by simonwallis2 on Jun 26 2020, 3:31 AM.

Details

Summary

In MachineCopyPropagation::BackwardPropagatableCopy(),
a check is added for multiple destination registers.

The copy propagation is avoided if the copied destination register
is the same register as another destination on the same instruction.

A new test is added. This used to fail on ARM like this:
error: unpredictable instruction, RdHi and RdLo must be different

umull   r9, r9, lr, r0

Diff Detail

Event Timeline

simonwallis2 created this revision.Jun 26 2020, 3:31 AM

Satisfy clang-format. NFC.

If the definition is earlyclobber, do we also need to check for overlapping uses?

llvm/lib/CodeGen/MachineCopyPropagation.cpp
468

Is the getNumDefs() check necessary?

470

is the "isDef()" check necessary?

llvm/test/CodeGen/ARM/mcp-dest-regs-no-dup.ll
2

Please make an MIR testcase (http://llvm.org/docs/MIRLangRef.html); this is overly fragile.

simonwallis2 marked 2 inline comments as done.

If the definition is earlyclobber, will it ever reach hasOverlappingMultipleDef?

The getNumDefs() check is not necessary.
It has been removed.

The isDef() check is not necessary.
It has been removed.

I removed the fragile .ll testcase.
I have added an MIR testcase.

simonwallis2 marked an inline comment as done.Jun 30 2020, 3:56 AM

Please review the recent changes made since the last review comments.

lkail added a subscriber: lkail.Jul 8 2020, 7:30 AM
lkail added a comment.Jul 8 2020, 7:42 AM

Hi @simonwallis2 , you can create an MIR test by

llc -simplify-mir -verify-machineinstrs -mtriple=arm-eabi -stop-before=machine-cp foo.ll -o foo.mir

and then add RUN line for it

RUN: llc -simplify-mir -verify-machineinstrs -mtriple=arm-eabi -run-pass=machine-cp %s -o - | FileCheck %s

Hi @lkail

Yes, I used llc -stop-before=machine-cp mcp-dest-regs-no-dup.ll
to create the MIR test mcp-dest-regs-no-dup.mir

I note that the RUN line you suggested produces this error:
llc: warning: run-pass is for .mir file only.

lkail added a comment.EditedJul 9 2020, 4:06 AM

I used your previous mcp-dest-regs-no-dup.ll and run

llc -simplify-mir -verify-machineinstrs -mtriple=arm-eabi -stop-before=machine-cp mcp-dest-regs-no-dup.ll -o mcp-dest-regs-no-dup.mir

It produces

--- |
  ; ModuleID = 'mcp-dest-regs-no-dup.ll'
  source_filename = "mcp-dest-regs-no-dup.ll"
  target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
  target triple = "arm-unknown-unknown-eabi"
  
  @a = hidden local_unnamed_addr global i32 0, align 4
  @b = hidden local_unnamed_addr global i32 0, align 4
  @f = hidden local_unnamed_addr global i32 0, align 4
  @c = hidden local_unnamed_addr global i32 0, align 4
  @d = hidden local_unnamed_addr global i32 0, align 4
  @e = hidden local_unnamed_addr global i32 0, align 4
  @m = hidden local_unnamed_addr global i32 0, align 4
  @g = hidden local_unnamed_addr global i32* null, align 4
  
  define hidden void @h() local_unnamed_addr #0 {
    call void asm sideeffect "", ""() #1
    br label %11
  
  1:                                                ; preds = %13
    call void asm sideeffect "", ""() #1
    %2 = load i32, i32* @a, align 4
    %3 = udiv i32 %2, 45
    %4 = load i32, i32* @b, align 4
    %5 = load i32, i32* @f, align 4
    %6 = icmp ult i32 %4, %5
    %7 = zext i1 %6 to i32
    %8 = icmp ule i32 %3, %7
    %9 = zext i1 %8 to i32
    %10 = icmp ult i32 %14, %9
    br i1 %10, label %18, label %11
  
  11:                                               ; preds = %1, %0
    store i32 2, i32* @f, align 4
    %12 = load i32*, i32** @g, align 4
    br label %13
  
  13:                                               ; preds = %13, %11
    %14 = load i32, i32* @c, align 4
    store i32 11, i32* @d, align 4
    store i32 0, i32* @e, align 4
    store i32 1, i32* @b, align 4
    %15 = load i32, i32* @m, align 4
    store i32 %15, i32* %12, align 4
    %16 = load i32, i32* @f, align 4
    %17 = icmp eq i32 %16, 0
    br i1 %17, label %1, label %13
  
  18:                                               ; preds = %1
    ret void
  }
  
  ; Function Attrs: nounwind
  declare void @llvm.stackprotector(i8*, i8**) #1
  
  attributes #0 = { "target-features"="+armv8-a,-fpregs" }
  attributes #1 = { nounwind }

...
---
name:            h
alignment:       4
tracksRegLiveness: true
frameInfo:
  maxAlignment:    1
  maxCallFrameSize: 0
machineFunctionInfo: {}
body:             |
  bb.0 (%ir-block.0):
    INLINEASM &"", 1 /* sideeffect attdialect */
    renamable $r10 = MOVi32imm @f
    renamable $r8 = MOVi32imm @d
    renamable $r11 = MOVi 11, 14 /* CC::al */, $noreg, $noreg
    renamable $r2 = MOVi32imm @e
    renamable $r4 = MOVi 0, 14 /* CC::al */, $noreg, $noreg
    renamable $r5 = MOVi32imm @b
    renamable $r6 = MOVi 1, 14 /* CC::al */, $noreg, $noreg
    renamable $r7 = MOVi32imm @c
    renamable $r3 = MOVi32imm @m
    B %bb.2
  
  bb.1 (%ir-block.1):
    successors: %bb.4(0x04000000), %bb.2(0x7c000000)
    liveins: $r1, $r2, $r3, $r4, $r5, $r6, $r7, $r8, $r10, $r11
  
    INLINEASM &"", 1 /* sideeffect attdialect */
    renamable $r12 = LDRi12 renamable $r10, 0, 14 /* CC::al */, $noreg :: (dereferenceable load 4 from @f)
    renamable $lr = LDRi12 renamable $r5, 0, 14 /* CC::al */, $noreg :: (dereferenceable load 4 from @b)
    CMPrr killed renamable $lr, killed renamable $r12, 14 /* CC::al */, $noreg, implicit-def $cpsr
    renamable $r12 = MOVi 0, 14 /* CC::al */, $noreg, $noreg
    renamable $r12 = MOVCCi16 killed renamable $r12, 1, 3 /* CC::lo */, killed $cpsr
    renamable $r0 = MOVi32imm @a
    renamable $lr = LDRi12 killed renamable $r0, 0, 14 /* CC::al */, $noreg :: (dereferenceable load 4 from @a)
    renamable $r0 = MOVi32imm 1813430637
    dead renamable $r9, renamable $r0 = UMULL renamable $lr, killed renamable $r0, 14 /* CC::al */, $noreg, $noreg
    renamable $r9 = COPY killed renamable $r0
    renamable $r0 = SUBrr killed renamable $lr, renamable $r9, 14 /* CC::al */, $noreg, $noreg
    renamable $r0 = ADDrsi killed renamable $r9, killed renamable $r0, 11, 14 /* CC::al */, $noreg, $noreg
    CMPrsi killed renamable $r12, killed renamable $r0, 43, 14 /* CC::al */, $noreg, implicit-def $cpsr
    renamable $r0 = MOVi 0, 14 /* CC::al */, $noreg, $noreg
    renamable $r0 = MOVCCi16 killed renamable $r0, 1, 2 /* CC::hs */, killed $cpsr
    CMPrr killed renamable $r1, killed renamable $r0, 14 /* CC::al */, $noreg, implicit-def $cpsr
    Bcc %bb.4, 3 /* CC::lo */, killed $cpsr
    B %bb.2
  
  bb.2 (%ir-block.11):
    liveins: $r2, $r3, $r4, $r5, $r6, $r7, $r8, $r10, $r11
  
    renamable $r0 = MOVi 2, 14 /* CC::al */, $noreg, $noreg
    STRi12 killed renamable $r0, renamable $r10, 0, 14 /* CC::al */, $noreg :: (store 4 into @f)
    renamable $r0 = MOVi32imm @g
    renamable $r12 = LDRi12 killed renamable $r0, 0, 14 /* CC::al */, $noreg :: (dereferenceable load 4 from @g)
  
  bb.3 (%ir-block.13):
    successors: %bb.1(0x04000000), %bb.3(0x7c000000)
    liveins: $r2, $r3, $r4, $r5, $r6, $r7, $r8, $r10, $r11, $r12
  
    STRi12 renamable $r11, renamable $r8, 0, 14 /* CC::al */, $noreg :: (store 4 into @d)
    STRi12 renamable $r4, renamable $r2, 0, 14 /* CC::al */, $noreg :: (store 4 into @e)
    STRi12 renamable $r6, renamable $r5, 0, 14 /* CC::al */, $noreg :: (store 4 into @b)
    renamable $r1 = LDRi12 renamable $r7, 0, 14 /* CC::al */, $noreg :: (dereferenceable load 4 from @c)
    renamable $r0 = LDRi12 renamable $r3, 0, 14 /* CC::al */, $noreg :: (dereferenceable load 4 from @m)
    STRi12 killed renamable $r0, renamable $r12, 0, 14 /* CC::al */, $noreg :: (store 4 into %ir.12)
    renamable $r0 = LDRi12 renamable $r10, 0, 14 /* CC::al */, $noreg :: (dereferenceable load 4 from @f)
    CMPri killed renamable $r0, 0, 14 /* CC::al */, $noreg, implicit-def $cpsr
    Bcc %bb.1, 0 /* CC::eq */, killed $cpsr
    B %bb.3
  
  bb.4 (%ir-block.18):
    BX_RET 14 /* CC::al */, $noreg
...

Can you double confirm it?

MIR test added, after useful feedback.

Please review the recent changes made since the last review comments.

lkail added inline comments.Jul 28 2020, 8:13 PM
llvm/test/CodeGen/ARM/mcp-dest-regs-no-dup.mir
91 ↗(On Diff #279546)

Could you make the MIR test smaller to include minimal instructions trigger the error, so that reviewers can review it easier?

lkail added inline comments.Jul 28 2020, 8:19 PM
llvm/lib/CodeGen/MachineCopyPropagation.cpp
466

Could you add comments to describe this method?

lkail added a reviewer: lkail.Jul 28 2020, 8:41 PM
lkail removed a subscriber: lkail.
simonwallis2 marked 2 inline comments as done.

• Could you make the MIR test smaller to include minimal instructions trigger the error, so that reviewers can review it easier?
Test was 130 lines. Now 25 lines.

• Could you add comments to describe this method?
Comments added.

lkail added inline comments.Jul 29 2020, 7:22 AM
llvm/test/CodeGen/ARM/mcp-dest-regs-no-dup.mir
1 ↗(On Diff #281554)

Prefer adding -verify-machineinstrs, -simplify-mir and keep the line under 80 columns.

1 ↗(On Diff #281554)

Any reason to redirect stderr?

13 ↗(On Diff #281554)

LLVM IR from L2-L12 can be removed.

16 ↗(On Diff #281554)

(%ir-block.0) can also be removed.

20 ↗(On Diff #281554)

These CHECK directives are confusing to me. IIUC, dead renamable $r9, renamable $r0 = UMULL renamable $lr, killed renamable $r0, 14 /* CC::al */, $noreg, $noreg should keep what it is.

simonwallis2 marked 4 inline comments as done.

MIR test further reduced and cleaned up after latest feedback.

llvm/test/CodeGen/ARM/mcp-dest-regs-no-dup.mir
1 ↗(On Diff #281554)

Reason is that I saw it in the test I based this on.
I have removed the redirection.

lkail accepted this revision.Jul 29 2020, 8:10 AM

LGTM. Thanks for fixing it.

This revision is now accepted and ready to land.Jul 29 2020, 8:10 AM
This revision was landed with ongoing or failed builds.Jul 29 2020, 8:21 AM
This revision was automatically updated to reflect the committed changes.
lkail added a comment.Jul 29 2020, 8:27 AM

If the definition is earlyclobber, do we also need to check for overlapping uses?

Prior propagateDefs, there is an invalidation of earlyclobber registers, so in propagateDefs, if MODef is earlyclobber, there should not be a corresponding COPY to propagate from.