Page MenuHomePhabricator

[Scalarizer] Do not insert instructions between PHI nodes and debug intrinsics.
ClosedPublic

Authored by vettoreldaniele on Oct 25 2021, 10:53 AM.

Details

Summary

The scalarizer pass seems to be inserting instructions in-between PHI nodes or debug intrinsics that end up staying at the end of the pass, resulting in malformed IR and violating assumptions.

This patch adds a check to make sure the extractelement instructions that it adds are correctly placed after all PHI nodes and debug intrinsics.

Diff Detail

Unit TestsFailed

TimeTest
1,720 msx64 debian > AddressSanitizer-x86_64-linux-dynamic.TestCases::exitcode.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -shared-libasan -g -Wno-deprecated-declarations /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/asan/TestCases/exitcode.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/asan/X86_64LinuxDynamicConfig/TestCases/Output/exitcode.cpp.tmp
2,280 msx64 debian > AddressSanitizer-x86_64-linux.TestCases::exitcode.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -g -Wno-deprecated-declarations /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/asan/TestCases/exitcode.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/exitcode.cpp.tmp

Event Timeline

vettoreldaniele requested review of this revision.Oct 25 2021, 10:53 AM

I think it would be more foolproof to do that in the constructor of the class.

I think it would be more foolproof to do that in the constructor of the class.

I feel like it would be confusing to have some logic around the insertion point in the constructor when the majority of the logic around that is in scatter(). I'd prefer to have the constructor require the insertion point to be valid rather than fixing it if passed an invalid one.

kuhar added a subscriber: kuhar.Oct 26 2021, 11:48 AM
bjope added a comment.Nov 1 2021, 10:01 AM

Hi @vettoreldaniele

I noticed that we have two downstream fixes (that for some reason haven't been upstreamed yet, I figure we did not have any reproducers identifying this as an upstream problem) that is a bit related to your patch.
Our patch looks like this:

--- a/llvm/lib/Transforms/Scalar/Scalarizer.cpp
+++ b/llvm/lib/Transforms/Scalar/Scalarizer.cpp
@@ -66,6 +66,15 @@ static cl::opt<bool>
 
 namespace {
 
+static BasicBlock::iterator skipPastPhiNodesAndDbg(BasicBlock::iterator Itr) {
+  BasicBlock *BB = Itr->getParent();
+  if (isa<PHINode>(Itr))
+    Itr = BB->getFirstInsertionPt();
+  if (Itr != BB->end())
+    Itr = skipDebugIntrinsics(Itr);
+  return Itr;
+}
+
+}

@@ -373,7 +407,7 @@ Scatterer ScalarizerVisitor::scatter(Instruction *Point, Value *V) {
     // Put the scattered form of an instruction directly after the
     // instruction.
     BasicBlock *BB = VOp->getParent();
-    return Scatterer(BB, std::next(BasicBlock::iterator(VOp)),
+    return Scatterer(BB, skipPastPhiNodesAndDbg(std::next(VOp->getIterator())),
                      V, &Scattered[V]);
   }
   // In the fallback case, just put the scattered before Point and

So there are actually two fixes involved. One is to skip past PHI nodes (same as your problem), and the other is to solve a debug-invariance problem by also skipping past dbg instrinsics.

Our test case for the latter ( test/Transforms/Scalarizer/dbg-invariant.ll ) looks like this:

; RUN: opt -strip-debug -passes=scalarizer -S < %s | FileCheck %s
; RUN: opt -passes=scalarizer -S < %s | FileCheck %s

; CHECK: %0 = load <8 x i16>

; CHECK: %.i0 = extractelement <8 x i16> %0, i32 0
; CHECK-NEXT: %.i01 = add i16 %.i0, 28690
; CHECK: %.i1 = extractelement <8 x i16> %0, i32 1
; CHECK-NEXT: %.i12 = add i16 %.i1, 28690
; CHECK: %.i2 = extractelement <8 x i16> %0, i32 2
; CHECK-NEXT: %.i23 = add i16 %.i2, 28690
; CHECK: %.i3 = extractelement <8 x i16> %0, i32 3
; CHECK-NEXT: %.i34 = add i16 %.i3, 28690
; CHECK: %.i4 = extractelement <8 x i16> %0, i32 4
; CHECK-NEXT: %.i45 = add i16 %.i4, 28690
; CHECK: %.i5 = extractelement <8 x i16> %0, i32 5
; CHECK-NEXT: %.i56 = add i16 %.i5, 28690
; CHECK: %.i6 = extractelement <8 x i16> %0, i32 6
; CHECK-NEXT: %.i67 = add i16 %.i6, 28690
; CHECK: %.i7 = extractelement <8 x i16> %0, i32 7
; CHECK-NEXT: = add i16 %.i7, 28690

@d = external global [8 x i16], align 1
@e = external global i16, align 1

; Function Attrs: nofree norecurse nounwind
define dso_local void @foo() local_unnamed_addr #0 !dbg !7 {
entry:
  %0 = load <8 x i16>, <8 x i16>* bitcast ([8 x i16]* @d to <8 x i16>*), align 1
  call void @llvm.dbg.value(metadata i16 0, metadata !11, metadata !DIExpression()), !dbg !13
  %1 = add <8 x i16> %0, <i16 28690, i16 28690, i16 28690, i16 28690, i16 28690, i16 28690, i16 28690, i16 28690>, !dbg !13
  store <8 x i16> %1, <8 x i16>* bitcast ([8 x i16]* @d to <8 x i16>*), align 1, !dbg !13
  %2 = extractelement <8 x i16> %1, i32 7, !dbg !13
  store i16 %2, i16* @e, align 1, !dbg !13
  ret void
}

; Function Attrs: nounwind readnone speculatable willreturn
declare void @llvm.dbg.value(metadata, metadata, metadata) #1

attributes #0 = { nofree norecurse nounwind }
attributes #1 = { nounwind readnone speculatable willreturn }

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5}
!llvm.ident = !{!6}

!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 11.0.0 ", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
!1 = !DIFile(filename: "foo.c", directory: "/")
!2 = !{}
!3 = !{i32 7, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 1}
!6 = !{!"clang version 11.0.0 "}
!7 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 4, type: !8, scopeLine: 4, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !10)
!8 = !DISubroutineType(types: !9)
!9 = !{null}
!10 = !{!11}
!11 = !DILocalVariable(name: "i", scope: !7, file: !1, line: 5, type: !12)
!12 = !DIBasicType(name: "int", size: 16, encoding: DW_ATE_signed)
!13 = !DILocation(line: 0, scope: !7)

And the motivation in the commit message was:

[Scalarizer] Fix a debug invariance problem

This fixes a debug info invariance problem for IR of the following form:

  %0 = load <8 x i16>, [...]
  <potentially llvm.dbg.value intrinsics here>
  %1 = add <8 x i16> %0, <[...]>

When scalarizing that, the extractvalue instructions were inserted
before the instruction succeeding the load instruction, and the
scalarized add instructions were inserted before the vectorized add
instruction. Without debug intrinsics, those two insertion points were
the same, resulting in IR of the following form:

  %0 = load <8 x i16>, [...]
  %.i0 = extractelement <8 x i16> %0, i32 0
  %.i01 = add i16 %.i0, [...]
  %.i1 = extractelement <8 x i16> %0, i32 1
  %.i12 = add i16 %.i1, [...]
  [...]

but of the following form with debug intrinsics present:

  %0 = load <8 x i16>, [...]
  %.i0 = extractelement <8 x i16> %0, i32 0
  %.i1 = extractelement <8 x i16> %0, i32 1
  [...]
  <llvm.dbg.value intrinsics>
  %.i01 = add i16 %.i0, [...]
  %.i12 = add i16 %.i1, [...]
  [...]

This patch fixes this debug info variance by skipping past any debug
intrinsics when determining the insertion point for the extractvalue
instructions.

So I wouldn't mind if we try to solve both problems here (by using some kind of helper such as skipPastPhiNodesAndDbg from the downstream patch I refer to above).

Let me know if you for example want me to pre-commit the test case for the debug-invariance problem.
Or if you want to focus on the phi node problem at first, then I can follow up with a patch that also skips past the dbg intrinsics.

So I wouldn't mind if we try to solve both problems here (by using some kind of helper such as skipPastPhiNodesAndDbg from the downstream patch I refer to above).

Let me know if you for example want me to pre-commit the test case for the debug-invariance problem.
Or if you want to focus on the phi node problem at first, then I can follow up with a patch that also skips past the dbg intrinsics.

Hi Bjorn,

Thanks for letting me know about this. I'm happy to fix both problems with this patch and add the test case you provided for debug instructions as well.

I'll update this patch soon.

vettoreldaniele retitled this revision from [Scalarizer] Do not insert instructions between PHI nodes. to [Scalarizer] Do not insert instructions between PHI nodes and debug intrinsics..
vettoreldaniele edited the summary of this revision. (Show Details)

Added handling of debug intrinsics as well.

kuhar added inline comments.Nov 1 2021, 11:21 AM
llvm/lib/Transforms/Scalar/Scalarizer.cpp
69

nit: static is redundant in anonymous namespaces

Removed redundant static.

vettoreldaniele marked an inline comment as done.Nov 1 2021, 11:25 AM
bjope accepted this revision.Nov 2 2021, 1:51 AM

LGTM!

This revision is now accepted and ready to land.Nov 2 2021, 1:51 AM

Thanks! I don't have commit access, can somebody commit this?

This revision was landed with ongoing or failed builds.Nov 2 2021, 6:55 AM
This revision was automatically updated to reflect the committed changes.