This is an archive of the discontinued LLVM Phabricator instance.

[Codegen] skip debug instr to avoid code change
ClosedPublic

Authored by yechunliang on Aug 20 2019, 3:54 AM.

Details

Summary

Skip debugging instruction and CFI_INSTRUCTION to avoid different codegen with/without -g by branch-folder pass. Most of ComputeCommonTailLength uses the "countsAsInstruction" helper to skip over debug instructions, but the last two loops don't.
Add branch-folder-with-debug.mir test

the bug is reported in
https://bugs.llvm.org/show_bug.cgi?id=42138

Diff Detail

Event Timeline

yechunliang created this revision.Aug 20 2019, 3:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 20 2019, 3:54 AM

Indeed, this'll need a MIR test for the BranchFolding pass to check for future regressions, and to demonstrate that this patch fixes the bug report. There are some MIR test examples in the llvm/test/CodeGen/MIR/X86 directory.

You should also take a quick look at the history of the file to try and find other reviewers. While Greg and I know some debug things, ideally someone who knows more about the branch folding pass should look at this too, in case there's a more general problem we're not aware of. People who've recently made significant changes to the file are good candidates (or they might know who else should review this).

yechunliang added a comment.EditedAug 21 2019, 2:09 AM

thanks for review and comments. While adding test code, found another error about "unnamed alloca" before branch-folder by llc, I'll investigate the issue before submit the test code.

Please allow me to add Hsiangkai and Petar as reviewer, as you had contributed related patches about debug and CFI instruction skipping. Thanks.

add branch-folder-with-debug.mir test

@yechunliang Thanks for the test case. Have you tried to simplify it?

llvm/test/CodeGen/MIR/X86/branch-folder-with-debug.mir
157

I think we can delete the attributes.

238

We can attach more instructions to the same !DILocation and by doing that we are simplifying the test case.

275

I think we don't need most of the MIR Function attributes, so we can delete some of them.

dstenb added a subscriber: dstenb.Aug 26 2019, 4:58 AM

It might be worthwhile to try to reduce the reproducer with Bugpoint.

Perhaps there exists some neat wrapper/template for bugpointing these types of bugs, but I personally just pass a custom script to Bugpoint of the following form:

opt $1 -S | llc [...] -o with_debug.o
opt $1 -S -strip-debug | llc [...] -o without_debug.o
[exit 1 if text sections differ]
exit 0

If you have not used Bugpoint with custom scripts before, then I think the following blog post can be helpful: http://blog.llvm.org/2015/11/reduce-your-testcases-with-bugpoint-and.html

Simplify MIR test case

@djtodoro thanks for review, followed your comments and guide from https://llvm.org/docs/MIRLangRef.html#id14, I have simplified mir test code by removing all unnecessary code for FileCheck.
"llc -o - test.mir -mtriple=x86_64-- -run-pass=branch-folder | FileCheck test.mir"

@dstenb thanks for guidelines, I know bugpoint but have not used yet, this should be a useful tool and I'll try it.

jmorse added a comment.Sep 2 2019, 8:56 AM

Looking good; a couple of extra nits, then it's ready to go.

I think the CHECK lines need strengthening a little, by changing for example:

; CHECK: bb.5:
; CHECK: successors: %bb.6(0x80000000)

to

; CHECK: bb.5:
; CHECK-NEXT: successors: %bb.6(0x80000000)

Otherwise, FileCheck could in theory accept any number of lines between "bb.5:" and the successor list, including other blocks. Due to the way that the test is written that probably wouldn't be a problem right now, but tests get updated all the time, and explicitly connecting pairs of CHECK lines with -NEXT will make it harder for a future developer to make a mistake when editing this test.

IMO it'd be best to replace the "0x80000000" ... etc constants in the successors list with a regex (i.e. "{{0x[0-9]+}}") to avoid testing too much. These are branch frequency weights that aren't important for this test.

Ideally there would also be a comment in the test file explaining what should happen -- i.e. "Branch folder should ignore the DBG_VALUE between [...] and merge these blocks", or whatever it's supposed to be doing.

llvm/test/CodeGen/MIR/X86/branch-folder-with-debug.mir
45

NB, I believe the convention is to put the RUN line at the very top of the file -- I don't think it's rule, but it's best to avoid surprise.

57

If this successor list is going to be tested for, the block it's in should be tested too. Also on line 64.

yechunliang edited the summary of this revision. (Show Details)

update branch-folder-with-debug.mir follow the review comments

  • update CHECK block lines before successors check
  • Add "CHECK-NEXT" for successors after block
  • Simplify address after successors block
  • move RUN line to top and add more descriptions.
jmorse accepted this revision.Sep 3 2019, 2:06 AM

LGTM, thanks for digging into this!

This revision is now accepted and ready to land.Sep 3 2019, 2:06 AM
bjope added a subscriber: bjope.Oct 28 2019, 5:54 AM

Looks like this one was accepted a long time ago. But I can't see that it has been committed yet. So what is the status?

Current status is "accepted and ready to land" for a long time.
I'm new on LLVM, Is there anything I need to do to make this PR be merged?

Current status is "accepted and ready to land" for a long time.
I'm new on LLVM, Is there anything I need to do to make this PR be merged?

If you have commit access, this is the signal for you to commit the patch to LLVM. If you don't, it's best to ask one of the reviewers to commit the patch for you. LLVM doesn't have pull requests in the GitHub sense.

yechunliang marked 5 inline comments as done.Oct 28 2019, 6:14 PM

@aprantl @jmorse @bjope thanks for review, this patch is ready to land, but I have no submit permission, would you please help to submit it, many thanks.

Sure, I'll drop this in. Thanks for the patch!

This revision was automatically updated to reflect the committed changes.
bjope added a comment.Nov 7 2019, 7:19 AM

Unfortunately I see a lot more test cases failing due to codegen not being debug invariant after having merged this patch, compared to the past.

Main reason is that the loops after SkipTopCFIAndReturn: now undo the work done just before the SkipTopCFIAndReturn label. In the past we have moved the iterators backwards, past any DBG_VALUE instructions. Now we do that, but directly afterward we move the iterators forward again, past both DBG_VALUE and CFI directives. Notice that main purpose with ComputeCommonTailLength is to count number of instructions that are common in the tails, and that number is used by ProfitableToMerge to answer the question if we should attempt to do a tail merge or not.
The answer given by ProfitableToMerge depends on the result of comparing the iterators with MBB->begin(). To make those checks debug invariant we need to move the iterators backward past any debug instructions! (or we would need to modify the checks in some way)

While investigating this, and when using some more or less hand-crafted MIR-tests, it turns out that this is a rather complicated problem. And afaict BranchFolding isn't that good at being debug invariant. And when it is, it isn't that smart when it comes to dealing with any debug instructions or CFI directives that isn't really part of the tail.

One thing that I'm not 100% sure about is if CFI-directives should be allowed to impact codegen or not. It seems like CFI directives, at least sometimes, are generated even without -g. If CFI directives shouldn't impact codegen, then what should the rules be for passes like BranchFolding when merging tails with different CFI directives in the involved basic blocks. CFI directive could both follow an earlier instruction, or it can be put in the beginning of a BB.

I'm afraid that if we simply move the iterators backward until just before a "real" instruction that isn't common (or until we reach beginning of BB). Then at least the transform done by BranchFolding is likely to be debug invariant. But the cost might be that we currently drop lots of DBG/CFI instructions, which at least when it comes to CFI instructions could be bad.

Maybe BranchFolder should execute before PrologEpilogInserter (assuming CFI directives are first introduced by PEI?). Or is BranchFolder typicallly used to merge epilogues, so that is why it is executed after PEI.

bjope added a comment.Nov 7 2019, 7:27 AM

Here is an test case that currently fail. The test1, test2 and test3 functions only differ by DBG_VALUE and CFI_INSTRUCTION instructions. So I guess we want to see the same real instructions being generated for all three functions.

# RUN: llc -mtriple=x86_64-- -run-pass branch-folder -O3 -o - %s | FileCheck %s

---
name:            test1
body:             |
  bb.0:
    TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
    JCC_1 %bb.2, 5, implicit killed $eflags

  bb.1:
    MOV8mi $r12, 1, $noreg, 0, $noreg, 0
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    RET 0

  bb.2:
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    RET 0
...

# CHECK-LABEL: name:            test1
# CHECK: bb.0:
# CHECK:   TEST8rr
# CHECK:   JCC_1
# CHECK: bb.1:
# CHECK:   successors: %bb.2
# CHECK:   MOV8mi $r12
# CHECK: bb.2:
# CHECK:   MOV8mi $r13
# CHECK:   RET 0

---
name:            test2
body:             |
  bb.0:
    TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
    JCC_1 %bb.2, 5, implicit killed $eflags

  bb.1:
    MOV8mi $r12, 1, $noreg, 0, $noreg, 0
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    CFI_INSTRUCTION def_cfa_offset 8
    RET 0

  bb.2:
    DBG_VALUE
    DBG_VALUE
    CFI_INSTRUCTION def_cfa_offset 8
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    RET 0
...

# CHECK-LABEL: name:            test2
# CHECK: bb.0:
# CHECK:   TEST8rr
# CHECK:   JCC_1
# CHECK: bb.1:
# CHECK:   successors: %bb.2
# CHECK:   MOV8mi $r12
# CHECK: bb.2:
# CHECK:   MOV8mi $r13
# CHECK:   RET 0

---
name:            test3
body:             |
  bb.0:
    TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
    JCC_1 %bb.1, 5, implicit killed $eflags

  bb.1:
    MOV8mi $r12, 1, $noreg, 0, $noreg, 0
    DBG_VALUE
    CFI_INSTRUCTION def_cfa_offset 8
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    CFI_INSTRUCTION offset $r13, -123
    RET 0

  bb.2:
    DBG_VALUE
    CFI_INSTRUCTION def_cfa_offset 8
    DBG_VALUE
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    DBG_VALUE
    CFI_INSTRUCTION offset $r13, -123
    DBG_VALUE
    RET 0
...

# CHECK-LABEL: name:            test3
# CHECK: bb.0:
# CHECK:   TEST8rr
# CHECK:   JCC_1
# CHECK: bb.1:
# CHECK:   successors: %bb.2
# CHECK:   MOV8mi $r12
# CHECK: bb.2:
# CHECK:   MOV8mi $r13
# CHECK:   RET 0

But we now get these three different functions:

name:            test1
  bb.0:
    successors: %bb.2(0x40000000), %bb.1(0x40000000)
    TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
    JCC_1 %bb.2, 5, implicit $eflags
  bb.1:
    successors: %bb.2(0x80000000)
    MOV8mi $r12, 1, $noreg, 0, $noreg, 0
  bb.2:
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    RET 0


name:            test2
  bb.0:
    successors: %bb.2(0x40000000), %bb.1(0x40000000)
    TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
    JCC_1 %bb.2, 5, implicit killed $eflags
  bb.1:
    MOV8mi $r12, 1, $noreg, 0, $noreg, 0
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    CFI_INSTRUCTION def_cfa_offset 8
    RET 0
  bb.2:
    DBG_VALUE
    DBG_VALUE
    CFI_INSTRUCTION def_cfa_offset 8
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    RET 0


name:            test3
  bb.0:
    TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
    MOV8mi $r12, 1, $noreg, 0, $noreg, 0
    DBG_VALUE
    CFI_INSTRUCTION def_cfa_offset 8
    MOV8mi $r13, 1, $noreg, 0, $noreg, 0
    CFI_INSTRUCTION offset $r13, -123
    RET 0

@bjope thanks for the information. Without this patch, I have seem some check issues already there, I suggest to fix the issue first before apply this patch, than verify this patch. I don't know whether the CHECK in test case is correct or not, would you please help to provide test case without debug, help me find the normal behavior that could pass llc check without error, so that we could compare the result and diagnose why debug or cfi instructions impact the branch-folder pass. As this PR already merged, It's better for us to file a new bug and fix with another commit.

name: test3
body: |
bb.0:

TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
JCC_1 %bb.1, 5, implicit killed $eflags

from the test case, the JCC should be bb.2, as below, right?

JCC_1 %bb.2, 5, implicit killed $eflags

bjope added a comment.Nov 11 2019, 9:13 AM

name: test3
body: |
bb.0:

TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
JCC_1 %bb.1, 5, implicit killed $eflags

from the test case, the JCC should be bb.2, as below, right?

JCC_1 %bb.2, 5, implicit killed $eflags

Ah yes, I made a mistake in test3.

Anyway, I wrote a new PR here for my problem (simply skipped to include test3). It can be found here: https://bugs.llvm.org/show_bug.cgi?id=43964
(I guess any future discussions about this should take place in that PR)