This is an archive of the discontinued LLVM Phabricator instance.

Fix undefined behavior in Dwarf.
ClosedPublic

Authored by linzj on Jun 10 2020, 11:24 PM.

Details

Summary

Caused by field PrevCU is not initialized.
llvm::DwarfCompileUnit::addRange () at ../lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:276
llvm::DwarfDebug::endFunctionImpl () at ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:1586
llvm::DebugHandlerBase::endFunction () at ../lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp:319
llvm::AsmPrinter::EmitFunctionBody () at ../lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1230
llvm::ARMAsmPrinter::runOnMachineFunction () at ../lib/Target/ARM/ARMAsmPrinter.cpp:161

Diff Detail

Event Timeline

linzj created this revision.Jun 10 2020, 11:24 PM
aprantl accepted this revision.Jun 12 2020, 10:42 AM
This revision is now accepted and ready to land.Jun 12 2020, 10:42 AM
linzj added a comment.Jun 15 2020, 2:18 AM

I am not a commiter, please commit this patch for me.

Is this UB codepath reached by existing test cases running in check-llvm? Or was it found on some other testing?

linzj added a comment.Jun 15 2020, 6:28 PM

Is this UB codepath reached by existing test cases running in check-llvm? Or was it found on some other testing?

No. I ran my compiler which is based on LLVM with valgrind. Then this warning was popping everywhere.

Is this UB codepath reached by existing test cases running in check-llvm? Or was it found on some other testing?

No. I ran my compiler which is based on LLVM with valgrind. Then this warning was popping everywhere.

Please ensure/add a test case to the LLVM regression tests that demonstrates the codepath to ensure it has some coverage/regression testing. (if this doesn't reproduce with Clang's IR/LLVM's regression tests, it's possible your compiler is producing some novel IR that we're not testing/haven't thought about)

linzj updated this revision to Diff 271593.Jun 18 2020, 12:02 AM

Add testcase.

Could you include a comment in the test case explaining how this IR differs from all the IR clang usually generates? (in general what's interesting about this IR)

Also - remove the "valgrind" usage from that llc invocation (test machines might not have valgrind, etc - does UBSan/ASan/MSan catch this? There are buildbots that'll run the tests with sanitizers like those enabled & it'd be good if the test could catch a regression at least on those buildbots) - and include some testing of the behavior of llc (ie: what behavior do we get instead of UB? Presumably there's some specific behavior we're expecting beyond "anything other than crashing")

linzj added a comment.Jun 18 2020, 5:15 PM

Could you include a comment in the test case explaining how this IR differs from all the IR clang usually generates? (in general what's interesting about this IR)

Also - remove the "valgrind" usage from that llc invocation (test machines might not have valgrind, etc - does UBSan/ASan/MSan catch this? There are buildbots that'll run the tests with sanitizers like those enabled & it'd be good if the test could catch a regression at least on those buildbots) - and include some testing of the behavior of llc (ie: what behavior do we get instead of UB? Presumably there's some specific behavior we're expecting beyond "anything other than crashing")

It just a simple function complied by clang with -g. No modification.

I don't known how to write a test case that tell the try not to build the LLC with ubsan, could you give me an example?

linzj added a comment.Jun 18 2020, 5:18 PM

Could you include a comment in the test case explaining how this IR differs from all the IR clang usually generates? (in general what's interesting about this IR)

Also - remove the "valgrind" usage from that llc invocation (test machines might not have valgrind, etc - does UBSan/ASan/MSan catch this? There are buildbots that'll run the tests with sanitizers like those enabled & it'd be good if the test could catch a regression at least on those buildbots) - and include some testing of the behavior of llc (ie: what behavior do we get instead of UB? Presumably there's some specific behavior we're expecting beyond "anything other than crashing")

This simple function is int add(int a, int b) { return a + b ;}

How can I build a LLC with ubsan? I want to test in my local machine.

Could you include a comment in the test case explaining how this IR differs from all the IR clang usually generates? (in general what's interesting about this IR)

Also - remove the "valgrind" usage from that llc invocation (test machines might not have valgrind, etc - does UBSan/ASan/MSan catch this? There are buildbots that'll run the tests with sanitizers like those enabled & it'd be good if the test could catch a regression at least on those buildbots) - and include some testing of the behavior of llc (ie: what behavior do we get instead of UB? Presumably there's some specific behavior we're expecting beyond "anything other than crashing")

This simple function is int add(int a, int b) { return a + b ;}

Oh - hmm. I don't seem to reproduce this myself with "clang++ ^.cc -g -c -O3" running under valgrind. Is there something else I should be doing to reproduce/validate this?

How can I build a LLC with ubsan? I want to test in my local machine.

I believe you set the CMake variable: LLVM_USE_SANITIZER=Undefined

linzj added a comment.Jun 18 2020, 6:07 PM

Oh - hmm. I don't seem to reproduce this myself with "clang++ ^.cc -g -c -O3" running under valgrind. Is there something else I should be doing to reproduce/validate this?

It should add -target armv7-linux-android. Or you should simply run llc with my test case.

linzj added a comment.Jun 18 2020, 9:15 PM

I can't reproduce the bug with ubsan! Even in the latest version.
But valgrind is able to reproduce it.

I can't reproduce the bug with ubsan! Even in the latest version.

Memory sanitizer might be the better tool for this particular issue - if it's a read of an uninitialized value.

But valgrind is able to reproduce it.

Hmm - I'm not having much luck with that myself:

$ cat test.ll
target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7-unknown-linux-android"

; Function Attrs: norecurse nounwind readnone
define hidden i32 @add(i32 %0, i32 %1) local_unnamed_addr #0 !dbg !9 {
  call void @llvm.dbg.value(metadata i32 %0, metadata !14, metadata !DIExpression()), !dbg !16
  call void @llvm.dbg.value(metadata i32 %1, metadata !15, metadata !DIExpression()), !dbg !16
  %3 = add nsw i32 %1, %0, !dbg !17
  ret i32 %3, !dbg !18
}

; Function Attrs: nounwind readnone speculatable willreturn
declare void @llvm.dbg.value(metadata, metadata, metadata) #1

attributes #0 = { norecurse nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+armv7-a,+d32,+dsp,+fp64,+neon,+vfp2,+vfp2sp,+vfp3,+vfp3d16,+vfp3d16sp,+vfp3sp,-crypto,-fp-armv8,-fp-armv8d16,-fp-armv8d16sp,-fp-armv8sp,-fp16,-fp16fml,-fullfp16,-thumb-mode,-vfp4,-vfp4d16,-vfp4d16sp,-vfp4sp" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind readnone speculatable willreturn }

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5, !6, !7}
!llvm.ident = !{!8}

!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 11.0.0 (https://github.com/llvm/llvm-project.git 6dd738e2f0609f7d3313b574a1d471263d2d3ba1)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
!1 = !DIFile(filename: "1.c", directory: "/tmp")
!2 = !{}
!3 = !{i32 7, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 4}
!6 = !{i32 1, !"min_enum_size", i32 4}
!7 = !{i32 7, !"PIC Level", i32 2}
!8 = !{!"clang version 11.0.0 (https://github.com/llvm/llvm-project.git 6dd738e2f0609f7d3313b574a1d471263d2d3ba1)"}
!9 = distinct !DISubprogram(name: "add", scope: !1, file: !1, line: 1, type: !10, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !13)
!10 = !DISubroutineType(types: !11)
!11 = !{!12, !12, !12}
!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!13 = !{!14, !15}
!14 = !DILocalVariable(name: "a", arg: 1, scope: !9, file: !1, line: 1, type: !12)
!15 = !DILocalVariable(name: "b", arg: 2, scope: !9, file: !1, line: 1, type: !12)
!16 = !DILocation(line: 0, scope: !9)
!17 = !DILocation(line: 2, column: 12, scope: !9)
!18 = !DILocation(line: 2, column: 3, scope: !9)$ cat test.ll
target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7-unknown-linux-android"

; Function Attrs: norecurse nounwind readnone
define hidden i32 @add(i32 %0, i32 %1) local_unnamed_addr #0 !dbg !9 {
  call void @llvm.dbg.value(metadata i32 %0, metadata !14, metadata !DIExpression()), !dbg !16
  call void @llvm.dbg.value(metadata i32 %1, metadata !15, metadata !DIExpression()), !dbg !16
  %3 = add nsw i32 %1, %0, !dbg !17
  ret i32 %3, !dbg !18
}

; Function Attrs: nounwind readnone speculatable willreturn
declare void @llvm.dbg.value(metadata, metadata, metadata) #1

attributes #0 = { norecurse nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+armv7-a,+d32,+dsp,+fp64,+neon,+vfp2,+vfp2sp,+vfp3,+vfp3d16,+vfp3d16sp,+vfp3sp,-crypto,-fp-armv8,-fp-armv8d16,-fp-armv8d16sp,-fp-armv8sp,-fp16,-fp16fml,-fullfp16,-thumb-mode,-vfp4,-vfp4d16,-vfp4d16sp,-vfp4sp" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind readnone speculatable willreturn }

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5, !6, !7}
!llvm.ident = !{!8}

!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 11.0.0 (https://github.com/llvm/llvm-project.git 6dd738e2f0609f7d3313b574a1d471263d2d3ba1)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
!1 = !DIFile(filename: "1.c", directory: "/tmp")
!2 = !{}
!3 = !{i32 7, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 4}
!6 = !{i32 1, !"min_enum_size", i32 4}
!7 = !{i32 7, !"PIC Level", i32 2}
!8 = !{!"clang version 11.0.0 (https://github.com/llvm/llvm-project.git 6dd738e2f0609f7d3313b574a1d471263d2d3ba1)"}
!9 = distinct !DISubprogram(name: "add", scope: !1, file: !1, line: 1, type: !10, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !13)
!10 = !DISubroutineType(types: !11)
!11 = !{!12, !12, !12}
!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!13 = !{!14, !15}
!14 = !DILocalVariable(name: "a", arg: 1, scope: !9, file: !1, line: 1, type: !12)
!15 = !DILocalVariable(name: "b", arg: 2, scope: !9, file: !1, line: 1, type: !12)
!16 = !DILocation(line: 0, scope: !9)
!17 = !DILocation(line: 2, column: 12, scope: !9)
!18 = !DILocation(line: 2, column: 3, scope: !9)

Any ideas?

I think I have used to wrong sanitizer. Should be "Memory", not "Undefined".

Then the memory sanitizer finds another bug when compiling:

2397==WARNING: MemorySanitizer: use-of-uninitialized-value

#0 0xb210e7 in llvm::StringRef::find(char, unsigned long) const /home/linzj/ssdsrc/llvm-project2/llvm/building-release/../include/llvm/ADT/StringRef.h:322:29
#1 0xb210e7 in llvm::StringRef::split(llvm::SmallVectorImpl<llvm::StringRef>&, char, int, bool) const /home/linzj/ssdsrc/llvm-project2/llvm/building-release/../lib/Support/StringRef.cpp:344:20
#2 0xb2b6d4 in llvm::Triple::Triple(llvm::Twine const&) /home/linzj/ssdsrc/llvm-project2/llvm/building-release/../lib/Support/Triple.cpp:739:19
#3 0xb63b3c in llvm::sys::getProcessTriple[abi:cxx11]() /home/linzj/ssdsrc/llvm-project2/llvm/building-release/../lib/Support/Host.cpp:1615:10
#4 0xadecab in (anonymous namespace)::CommandLineParser::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*, bool) /home/linzj/ssdsrc/llvm-project2/llvm/building-release/../lib/Support/CommandLine.cpp:1347:17
#5 0xadecab in llvm::cl::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*, char const*, bool) /home/linzj/ssdsrc/llvm-project2/llvm/building-release/../lib/Support/CommandLine.cpp:1322:24
#6 0xa90283 in main /home/linzj/ssdsrc/llvm-project2/llvm/building-release/../utils/TableGen/TableGen.cpp:265:3
#7 0x7fc23a9d4b96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
#8 0x426c69 in _start (/media/linzj/ff6fd1aa-94f2-4f61-9aba-67c6c99a0a59/home/linzj/src/llvm-project2/llvm/building-release/bin/llvm-tblgen+0x426c69)

The bug from libstdc++, switching to libc++.

use libc++ but the same error emitted!

linzj added a comment.Jun 19 2020, 1:46 AM

I think memory sanitizer is buggy...

I think memory sanitizer is buggy...

Oh, sorry - I forgot to mention (& Had forgotten it entirely) that Memory Sanitizer requires a sanitized standard library - so it's a bit more involved/harder to get it working :/ I don't know all the details of how to get that setup and working.

Any ideas why my attempt to reproduce the issue with valgrind might not have worked as it has for you?

linzj added a comment.Jun 20 2020, 7:32 PM

Any ideas why my attempt to reproduce the issue with valgrind might not have worked as it has for you?

The simplest way to reproduce is my test case. Use the same command in the first line.

Ah, I see - only reproduced in an optimized build. Sorry for the delays/noise.

I'd perhaps feel /marginally/ better about fixing this issue this way:

diff --git llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 296c380ae55..8be0c19bbc4 100644
--- llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -333,13 +333,13 @@ DIE *DwarfCompileUnit::getOrCreateCommonBlock(
 void DwarfCompileUnit::addRange(RangeSpan Range) {
   DD->insertSectionLabel(Range.Begin);
 
-  bool SameAsPrevCU = this == DD->getPrevCU();
+  const DwarfCompileUnit *PrevCU = DD->getPrevCU();
   DD->setPrevCU(this);
   // If we have no current ranges just add the range and return, otherwise,
   // check the current section and CU against the previous section and CU we
   // emitted into and the subprogram was contained within. If these are the
   // same then extend our current range, otherwise add this as a new range.
-  if (CURanges.empty() || !SameAsPrevCU ||
+  if (CURanges.empty() || this != PrevCU ||
       (&CURanges.back().End->getSection() !=
        &Range.End->getSection())) {
     CURanges.push_back(Range);

At least I /think/ that should work & probably better reflect when this value should be used.

linzj added a comment.Jul 9 2020, 4:23 AM
  • bool SameAsPrevCU = this == DD->getPrevCU();

+ const DwarfCompileUnit *PrevCU = DD->getPrevCU();

Err, once you invoke getPrevCU, then a load-to-undefined instruction will be emitted. So valgrind will still report this problem. valgrind doesn't like undefined value's load.

dblaikie accepted this revision.Jul 11 2020, 11:53 AM
  • bool SameAsPrevCU = this == DD->getPrevCU();

+ const DwarfCompileUnit *PrevCU = DD->getPrevCU();

Err, once you invoke getPrevCU, then a load-to-undefined instruction will be emitted. So valgrind will still report this problem. valgrind doesn't like undefined value's load.

Right you are! Don't mind me then :) thanks for walking me through it all!

linzj added a comment.Jul 13 2020, 1:57 AM

I think I should not attach this test case then.
You should not accept this patch. Instead you should accept the previous one.
Valgrind won't work in the Windows environment.

I think I should not attach this test case then.
You should not accept this patch. Instead you should accept the previous one.
Valgrind won't work in the Windows environment.

Yep - sure, the test case should not be included. (I don't think I can actually approve a specific prior revision of the patch in Phab) - are you able to commit the initialization or would you like me to?

linzj added a comment.Jul 13 2020, 6:27 PM

Yep - sure, the test case should not be included. (I don't think I can actually approve a specific prior revision of the patch in Phab) - are you able to commit the initialization or would you like me to?

Yes. Thanks. I am not a commiter.

This revision was automatically updated to reflect the committed changes.