This is an archive of the discontinued LLVM Phabricator instance.

Differential D142787

[GVN] Don't count debug instructions when limit the number of checked instructions
ClosedPublic

Authored by Carrot on Jan 27 2023, 3:29 PM.

Details

Reviewers

mkazantsev
nikic
uabelho

Commits

rGf494b366ff8a: [GVN] Don't count debug instructions when limit the number of checked…

Summary

Don't count debug instructions when limiting the number of checked instructions. Otherwise debug information may affect optimization, as the test case shows.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Carrot created this revision. Jan 27 2023, 3:29 PM

Herald added a project: Restricted Project. Jan 27 2023, 3:29 PM

Herald added subscribers: StephenFan, hiraditya.

Carrot requested review of this revision. Jan 27 2023, 3:29 PM

Herald added a project: Restricted Project. Jan 27 2023, 3:29 PM

Herald added a subscriber: llvm-commits.

dblaikie added a subscriber: dblaikie. Jan 27 2023, 3:41 PM

Herald added a subscriber: ormris. Jan 27 2023, 3:41 PM

dblaikie added a project: debug-info. Jan 27 2023, 3:42 PM

Harbormaster completed remote builds in B210488: Diff 492920. Jan 27 2023, 4:53 PM

nikic added inline comments. Jan 28 2023, 1:14 AM

llvm/test/Transforms/GVN/PRE/pre-load-dbg.ll
2	Pass `-gvn-max-num-insns` with a low value and reduce test accordingly. No need to test the actual 100 instruction limit.

If I understand this limitation correctly, its only goal is to save compile time. What are you trying to achieve? Good performance in debug mode, or something else?

There are other limits like that where we don't care about it:

// Find non-clobbered value for Loc memory location in extended basic block
// (chain of basic blocks with single predecessors) starting From instruction.
static Value *findDominatingValue(const MemoryLocation &Loc, Type *LoadTy,
                                  Instruction *From, AAResults *AA) {
  uint32_t NumVisitedInsts = 0;
  BasicBlock *FromBB = From->getParent();
  BatchAAResults BatchAA(*AA);
  for (BasicBlock *BB = FromBB; BB; BB = BB->getSinglePredecessor())
    for (auto I = BB == FromBB ? From->getReverseIterator() : BB->rbegin(),
              E = BB->rend();
         I != E; ++I) {
      // Stop the search if limit is reached.
      if (++NumVisitedInsts > MaxNumVisitedInsts)
        return nullptr;
      Instruction *Inst = &*I;
      if (isModSet(BatchAA.getModRefInfo(Inst, Loc)))
        return nullptr;
      if (auto *LI = dyn_cast<LoadInst>(Inst))
        if (LI->getPointerOperand() == Loc.Ptr && LI->getType() == LoadTy)
          return LI;
    }
  return nullptr;
}

Why is it important to get GVN to work here despite the giant compile-time (CT) cost?

From a debug-info perspective this LGTM, with the test improvement to reduce the size -- I'm not familiar with all of GVN though. The objective of ensuring that debug-info can't affect the code generated is an important one.

You might consider using the BasicBlock::instructionsWithoutDebug range to iterate over instructions without any explicit logic to deal with debug-info and pseudo instructions.
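The shape of that suggestion can be illustrated without LLVM headers. The following is a minimal sketch on toy types: `ToyInst`, `withoutDebug`, and `countCharged` are hypothetical names invented for illustration and are not LLVM API. The helper only mimics the idea of a range that never yields debug instructions, so the counting loop itself carries no debug-specific logic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for llvm::Instruction (hypothetical, not LLVM API).
struct ToyInst {
  std::string text;
  bool isDebug; // models Instruction::isDebugOrPseudoInst()
};

using InstRefs = std::vector<std::reference_wrapper<const ToyInst>>;

// Hypothetical analogue of a "without debug" range: references to the
// non-debug instructions of the block, in block order.
InstRefs withoutDebug(const std::vector<ToyInst> &block) {
  InstRefs out;
  for (const ToyInst &inst : block)
    if (!inst.isDebug)
      out.push_back(std::cref(inst));
  return out;
}

// Counts the instructions a limit check would be charged for. The loop body
// contains no explicit debug handling; the range has already filtered.
std::size_t countCharged(const std::vector<ToyInst> &block) {
  std::size_t charged = 0;
  for (const ToyInst &inst : withoutDebug(block)) {
    assert(!inst.isDebug && "filtered range must not yield debug instructions");
    ++charged;
  }
  return charged;
}
```

In this toy model a block with two debug entries and two real instructions is charged for two instructions, regardless of where the debug entries sit.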

@mkazantsev I believe it's just the general policy that debuginfo should not affect optimization outcome. We do generally exclude debug intrinsics for these kinds of instruction counting cutoffs for that reason.

Well, if so, then LG with Nikita's comment regarding the test. It needs to be reasonably small for a change like this.

Can we follow up by updating the place I've pointed out above?

In D142787#4090389, @mkazantsev wrote:

Can we follow up by updating the place I've pointed out above?

I think it should also be fixed with a test case.
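A follow-up of the kind discussed here might take roughly the following shape. This is a sketch on toy types, not the actual change: `ToyInst`, `findDominatingLoad`, and `maxVisited` are hypothetical, a matching store stands in for the alias-analysis query, and the walk over single-predecessor blocks is omitted. It only shows a backward scan whose visit limit ignores debug instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for llvm::Instruction (hypothetical, not LLVM API).
struct ToyInst {
  enum Kind { Debug, Load, Store, Other } kind;
  std::string ptr; // pointer operand name for loads and stores, else empty
};

// Scans the instructions strictly before index `from`, from last to first,
// looking for a load of `ptr`. Debug instructions are skipped before the
// visit counter is incremented, so they cannot exhaust the limit.
const ToyInst *findDominatingLoad(const std::vector<ToyInst> &block,
                                  std::size_t from, const std::string &ptr,
                                  std::uint32_t maxVisited) {
  assert(from <= block.size() && "start index out of range");
  std::uint32_t visited = 0;
  for (std::size_t i = from; i-- > 0;) {
    const ToyInst &inst = block[i];
    if (inst.kind == ToyInst::Debug)
      continue; // not charged against the limit
    if (++visited > maxVisited)
      return nullptr; // stop the search once the limit is reached
    if (inst.kind == ToyInst::Store && inst.ptr == ptr)
      return nullptr; // toy stand-in for "the location is modified"
    if (inst.kind == ToyInst::Load && inst.ptr == ptr)
      return &inst;
  }
  return nullptr;
}
```

With this ordering, inserting any number of `Debug` entries between the starting point and the load does not change which load is found for a given `maxVisited`.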

Harbormaster completed remote builds in B210871: Diff 493434. Jan 30 2023, 4:19 PM

mkazantsev accepted this revision. Jan 30 2023, 8:49 PM

This revision is now accepted and ready to land. Jan 30 2023, 8:49 PM

Closed by commit rGf494b366ff8a: [GVN] Don't count debug instructions when limit the number of checked… (authored by Carrot). Jan 31 2023, 1:04 PM

This revision was automatically updated to reflect the committed changes.

Carrot added a commit: rGf494b366ff8a: [GVN] Don't count debug instructions when limit the number of checked….

Carrot added a reverting change: rG3a5777f6308b: Revert "[GVN] Don't count debug instructions when limit the number of checked…. Feb 1 2023, 2:49 PM

Revision Contents

Path                                            Size
llvm/lib/Transforms/Scalar/GVN.cpp              2 lines
llvm/test/Transforms/GVN/PRE/pre-load-dbg.ll    126 lines

Diff 493714

llvm/lib/Transforms/Scalar/GVN.cpp

  LoadInst *GVNPass::findLoadToHoistIntoPred(BasicBlock *Pred, BasicBlock *LoadBB,
    auto *SuccBB = Term->getSuccessor(0);
    if (SuccBB == LoadBB)
      SuccBB = Term->getSuccessor(1);
    if (!SuccBB->getSinglePredecessor())
      return nullptr;

    unsigned int NumInsts = MaxNumInsnsPerBlock;
    for (Instruction &Inst : *SuccBB) {
+     if (Inst.isDebugOrPseudoInst())
+       continue;
      if (--NumInsts == 0)
        return nullptr;

      if (!Inst.isIdenticalTo(Load))
        continue;

      MemDepResult Dep = MD->getDependency(&Inst);
      // If an identical load doesn't depends on any local instructions, it can
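To see why the added `isDebugOrPseudoInst` check in the hunk above matters, the loop can be modeled on toy types. This is a minimal sketch under stated assumptions: `ToyInst`, `findWithinBudget`, and the `skipDebug` switch are hypothetical names for illustration, and a string comparison stands in for `isIdenticalTo`. The loop keeps the same order as the hunk: skip debug instructions, charge the budget, then compare.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for llvm::Instruction (hypothetical, not LLVM API).
struct ToyInst {
  std::string text;
  bool isDebug; // models Instruction::isDebugOrPseudoInst()
};

// Forward scan over a block. Returns the index of the first instruction
// whose text equals `wanted`, or -1 if the budget runs out or nothing
// matches. `skipDebug` toggles the behavior the patch adds.
int findWithinBudget(const std::vector<ToyInst> &block,
                     const std::string &wanted, unsigned budget,
                     bool skipDebug) {
  assert(budget > 0 && "a zero budget would wrap around on decrement");
  unsigned remaining = budget;
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (skipDebug && block[i].isDebug)
      continue; // not charged against the budget
    if (--remaining == 0)
      return -1; // budget exhausted
    if (block[i].text == wanted)
      return static_cast<int>(i);
  }
  return -1;
}
```

For a block of two debug entries followed by a store and the wanted load, a budget of 3 finds the load at index 3 when debug entries are skipped and gives up when they are counted. That divergence between otherwise identical code is what the `@withdbg` and `@lessdbg` functions in the test are meant to rule out.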

llvm/test/Transforms/GVN/PRE/pre-load-dbg.ll

This file was added.

				; RUN: opt < %s -passes=gvn -gvn-max-num-insns=22 -S | FileCheck %s

				; Debug information should not impact gvn. The following two functions have same
				; code except debug information. They should generate same optimized
				; instructions.

				%struct.a = type { i16 }

				@f = local_unnamed_addr global i16 0, align 1
				@m = local_unnamed_addr global ptr null, align 1
				@h = global %struct.a zeroinitializer, align 1

				define void @withdbg() {
				; CHECK-LABEL: @withdbg
				; CHECK: [[PRE_PRE1:%.*]] = load i16, ptr @f, align 1
				; CHECK-NEXT: [[PRE_PRE2:%.*]] = load ptr, ptr @m, align 1
				; CHECK-NEXT: br i1 true, label %[[BLOCK1:.*]], label %[[BLOCK2:.*]]
				; CHECK: [[BLOCK1]]:
				; CHECK-NEXT: [[CONV:%.*]] = sext i16 [[PRE_PRE1]] to i32
				; CHECK-NEXT: store i32 [[CONV]], ptr [[PRE_PRE2]], align 1

				entry:
				%agg.tmp.ensured.sroa.0.i = alloca i16, align 1
				br i1 icmp ne (ptr @withdbg, ptr null), label %lor.end, label %lor.rhs

				lor.rhs: ; preds = %entry
				call void @llvm.dbg.declare(metadata ptr undef, metadata !46, metadata !DIExpression()), !dbg !40
				call void @llvm.dbg.declare(metadata ptr undef, metadata !47, metadata !DIExpression()), !dbg !40
				call void @llvm.dbg.declare(metadata ptr undef, metadata !48, metadata !DIExpression()), !dbg !40
				call void @llvm.dbg.declare(metadata ptr undef, metadata !49, metadata !DIExpression()), !dbg !40
				call void @llvm.dbg.declare(metadata ptr undef, metadata !50, metadata !DIExpression()), !dbg !40
				%agg.tmp.ensured.sroa.0.0.copyload.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.1.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.1.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.2.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.2.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.3.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.3.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.4.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.4.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.5.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.5.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.6.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.6.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.7.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.7.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.8.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.8.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%fvalue = load i16, ptr @f, align 1
				%mvalue = load ptr, ptr @m, align 1
				br label %lor.end

				lor.end: ; preds = %lor.rhs, %entry
				%tmp11 = load i16, ptr @f, align 1
				%conv.i.i6 = sext i16 %tmp11 to i32
				%tmp12 = load ptr, ptr @m, align 1
				store i32 %conv.i.i6, ptr %tmp12, align 1
				ret void
				}

				define void @lessdbg() {
				; CHECK-LABEL: @lessdbg
				; CHECK: [[PRE_PRE1:%.*]] = load i16, ptr @f, align 1
				; CHECK-NEXT: [[PRE_PRE2:%.*]] = load ptr, ptr @m, align 1
				; CHECK-NEXT: br i1 true, label %[[BLOCK1:.*]], label %[[BLOCK2:.*]]
				; CHECK: [[BLOCK1]]:
				; CHECK-NEXT: [[CONV:%.*]] = sext i16 [[PRE_PRE1]] to i32
				; CHECK-NEXT: store i32 [[CONV]], ptr [[PRE_PRE2]], align 1

				entry:
				%agg.tmp.ensured.sroa.0.i = alloca i16, align 1
				br i1 icmp ne (ptr @lessdbg, ptr null), label %lor.end, label %lor.rhs

				lor.rhs: ; preds = %entry
				%agg.tmp.ensured.sroa.0.0.copyload.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.1.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.1.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.2.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.2.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.3.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.3.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.4.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.4.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.5.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.5.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.6.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.6.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.7.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.7.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%agg.tmp.ensured.sroa.0.0.copyload.8.i = load volatile i16, ptr @h, align 1
				store volatile i16 %agg.tmp.ensured.sroa.0.0.copyload.8.i, ptr %agg.tmp.ensured.sroa.0.i, align 1
				%fvalue = load i16, ptr @f, align 1
				%mvalue = load ptr, ptr @m, align 1
				br label %lor.end

				lor.end: ; preds = %lor.rhs, %entry
				%tmp11 = load i16, ptr @f, align 1
				%conv.i.i6 = sext i16 %tmp11 to i32
				%tmp12 = load ptr, ptr @m, align 1
				store i32 %conv.i.i6, ptr %tmp12, align 1
				ret void
				}

				declare void @llvm.dbg.declare(metadata, metadata, metadata)

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!35, !36}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 17.0.0.prerel", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "bbi-78272.c", directory: "/tmp")
				!5 = !DIBasicType(name: "int", size: 16, encoding: DW_ATE_signed)

				!35 = !{i32 7, !"Dwarf Version", i32 4}
				!36 = !{i32 2, !"Debug Info Version", i32 3}
				!40 = !DILocation(line: 15, column: 7, scope: !41)
				!41 = distinct !DISubprogram(name: "x", scope: !1, file: !1, line: 14, type: !42, scopeLine: 14, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !45)
				!42 = !DISubroutineType(types: !43)
				!43 = !{!5}
				!45 = !{!46, !47, !48, !49, !50}
				!46 = !DILocalVariable(name: "t", scope: !41, file: !1, line: 15, type: !5)
				!47 = !DILocalVariable(name: "c", scope: !41, file: !1, line: 15, type: !5)
				!48 = !DILocalVariable(name: "v", scope: !41, file: !1, line: 15, type: !5)
				!49 = !DILocalVariable(name: "d", scope: !41, file: !1, line: 15, type: !5)
				!50 = !DILocalVariable(name: "u", scope: !41, file: !1, line: 16, type: !5)