This is an archive of the discontinued LLVM Phabricator instance.

[SafeStack] Insert the deref after the offset
ClosedPublic

Authored by phosek on Jul 18 2019, 11:17 PM.

Details

Summary

While debugging code that uses SafeStack, we've noticed that LLVM
produces an invalid DWARF. Concretely, in the following example:

int main(int argc, char* argv[]) {
  std::string value = "";
  printf("%s\n", value.c_str());
  return 0;
}

DWARF would describe the value variable as being located at:

DW_OP_breg14 R14+0, DW_OP_deref, DW_OP_constu 0x20, DW_OP_minus

The assembly to get this variable is:

leaq    -32(%r14), %rbx

The order of operations in the DWARF symbols is incorrect in this case.
Specifically, the deref is incorrect; this appears to be incorrectly
re-inserted in repalceOneDbgValueForAlloca.

With this change which inserts the deref after the offset instead of
before it, LLVM produces correct DWARF:

DW_OP_breg14 R14-32

Diff Detail

Event Timeline

phosek created this revision.Jul 18 2019, 11:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2019, 11:17 PM
vsk added a subscriber: vsk.

Thanks for the patch.

I’m not sure why you don’t still need the deref after applying the correct offset. IIUC the post-patch DIExpression used to describe “value” actually describes its address. Is that intentional / were you able to check that a debugger prints the right thing post-patch?

jmorse added a subscriber: jmorse.Jul 19 2019, 9:49 AM

This sounds similar to PR41675 -- sometimes the DWARF expression builder interprets a DW_OP_deref as an extra dereference, other times it interprets it as confirmation that the expression being built is a memory location.

I think this fix might not work with expressions that already has a DW_OP_stack_value in them. The change relies on the expression eventually being a DWARF memory location (and thus always involving a dereference); but if the rest of the original DIExpr has a stack_value in it, the expression will be an implicit location, and the desired dereference won't happen.

Another fix might be to keep the dereference, but always try to add a stack_value to the expression, to make it explicitly implicit? [Ooof].

vsk added a comment.Jul 19 2019, 10:08 AM

This sounds similar to PR41675 -- sometimes the DWARF expression builder interprets a DW_OP_deref as an extra dereference, other times it interprets it as confirmation that the expression being built is a memory location.

I think this fix might not work with expressions that already has a DW_OP_stack_value in them. The change relies on the expression eventually being a DWARF memory location (and thus always involving a dereference); but if the rest of the original DIExpr has a stack_value in it, the expression will be an implicit location, and the desired dereference won't happen.

Another fix might be to keep the dereference, but always try to add a stack_value to the expression, to make it explicitly implicit? [Ooof].

Adding a stack_value doesn't seem necessary in the motivating example, though? And it would unnecessarily disallow modifying the string value in a debug session. AFAICT the deref is always needed here: my understanding is that dbg.value has to describe the actual value of a variable, not its address (https://www.llvm.org/docs/LangRef.html#diexpression). I admit that keeping the deref may break the dwarf builder, but istm that indicates a bug in the dwarf builder.

In D64971#1593796, @vsk wrote:

AFAICT the deref is always needed here: my understanding is that dbg.value has to describe the actual value of a variable, not its address (https://www.llvm.org/docs/LangRef.html#diexpression). I admit that keeping the deref may break the dwarf builder, but istm that indicates a bug in the dwarf builder.

So the deref ought to go after the arithmetic, and not before. Then someplace that knows the difference between values and locations can remove the deref so the resulting expression is ultimately correct in the DWARF?

(I though Reid was going to invent dbg.addr and solve all this nonsense for us...)

vsk added a comment.Jul 19 2019, 10:31 AM
In D64971#1593796, @vsk wrote:

AFAICT the deref is always needed here: my understanding is that dbg.value has to describe the actual value of a variable, not its address (https://www.llvm.org/docs/LangRef.html#diexpression). I admit that keeping the deref may break the dwarf builder, but istm that indicates a bug in the dwarf builder.

So the deref ought to go after the arithmetic, and not before. Then someplace that knows the difference between values and locations can remove the deref so the resulting expression is ultimately correct in the DWARF?

Thanks, that's exactly what I mean.

rnk added a subscriber: rnk.Jul 19 2019, 11:23 AM

(I though Reid was going to invent dbg.addr and solve all this nonsense for us...)

Yeah, that was the plan, but I don't see myself finding time to work on it. :(

I think at this point Jeremy is more familiar with the current state of things. From the way I understand things, we need a deref in the IR DIExpression somewhere to indicate that the variable value is not the pointer itself, but is the value in memory addressed by the pointer. Looking at things now, I would expect the deref to go *after* the pointer arithmetic, but I'm not confident if that will flow through the rest of LLVM correctly.

In any case, I don't think this patch as written is the right fix.

phosek updated this revision to Diff 210877.Jul 19 2019, 12:09 PM
phosek retitled this revision from [SafeStack] Don't re-insert derefs for allocas in debug info to [SafeStack] Insert the deref after the offset.
phosek edited the summary of this revision. (Show Details)

I've moved the deref after the offset calculation, tested this against our build and it's producing the expected DWARF output.

vsk accepted this revision as: vsk.Jul 19 2019, 1:05 PM

I've moved the deref after the offset calculation, tested this against our build and it's producing the expected DWARF output.

Thanks, LGTM. @jmorse any outstanding concerns?

This revision is now accepted and ready to land.Jul 19 2019, 1:05 PM
jmorse accepted this revision.Jul 22 2019, 8:11 AM

So the deref ought to go after the arithmetic, and not before. Then someplace that knows the difference between values and locations can remove the deref so the resulting expression is ultimately correct in the DWARF?

Ahhhhhh,

Vedant wrote:

Thanks, LGTM. @jmorse any outstanding concerns?

I think this'll work fine with stack_value too, LGTM, thanks for the patch @phosek .

(On the broader issue, I think the DWARF expression code does some guessing about what it's dealing with because it's not especially well defined what context a DIExpression works in, and it can change. Definitely something for further discussion somewhere else).

This revision was automatically updated to reflect the committed changes.

It seems like this change has introduced a crash in our build. I obtained the following stack trace:

invalid expression
!DIExpression(16, 537, 28, 4096, 456, 8, 6)
clang: /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/MachineInstr.cpp:2034: llvm::MachineInstrBuilder llvm::BuildMI(llvm::MachineFunction &, const llvm::DebugLoc &, const llvm::MCInstrDesc &, bool, llvm::MachineOperand &, const llvm::MDNode *, const llvm::MDNode *): Assertion `cast<DIExpression>(Expr)->isValid() && "not an expression"' failed.
Stack dump:
0.      Program arguments: ./bin/clang -cc1 -triple x86_64-fuchsia -emit-obj --mrelax-relocations -disable-free -main-file-name nir_opt_if.c -mrelocation-model pic -pic-level 2 -pic-is-pie -mthread-model posix -mframe-pointer=none -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -target-feature +cx16 -dwarf-column-info -debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb -ffunction-sections -fdata-sections -coverage-notes-file /b/s/w/ir/k/out/default/obj/third_party/mesa/src/compiler/nir/nir.nir_opt_if.gcno -sys-header-deps -D _LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -D NDEBUG=1 -D _LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -D STDC_HEADERS=1 -D HAVE_SYS_TYPES_H=1 -D HAVE_SYS_STAT_H=1 -D HAVE_STDLIB_H=1 -D HAVE_STRING_H=1 -D HAVE_MEMORY_H=1 -D HAVE_STRINGS_H=1 -D HAVE_INTTYPES_H=1 -D HAVE_STDINT_H=1 -D HAVE_TIMESPEC_GET=1 -D HAVE_UNISTD_H=1 -D HAVE_DLFCN_H=1 -D YYTEXT_POINTER=1 -D HAVE___BUILTIN_BSWAP32=1 -D HAVE___BUILTIN_BSWAP64=1 -D HAVE___BUILTIN_CLZ=1 -D HAVE___BUILTIN_CLZLL=1 -D HAVE___BUILTIN_CTZ=1 -D HAVE___BUILTIN_EXPECT=1 -D HAVE___BUILTIN_FFS=1 -D HAVE___BUILTIN_FFSLL=1 -D HAVE___BUILTIN_POPCOUNT=1 -D HAVE___BUILTIN_POPCOUNTLL=1 -D HAVE___BUILTIN_UNREACHABLE=1 -D HAVE_FUNC_ATTRIBUTE_CONST=1 -D HAVE_FUNC_ATTRIBUTE_FLATTEN=1 -D HAVE_FUNC_ATTRIBUTE_FORMAT=1 -D HAVE_FUNC_ATTRIBUTE_MALLOC=1 -D HAVE_FUNC_ATTRIBUTE_PACKED=1 -D HAVE_FUNC_ATTRIBUTE_PURE=1 -D HAVE_FUNC_ATTRIBUTE_UNUSED=1 -D HAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1 -D HAVE_FUNC_ATTRIBUTE_NORETURN=1 -D HAVE_DLADDR=1 -D HAVE_DL_ITERATE_PHDR=1 -D HAVE_CLOCK_GETTIME=1 -D PACKAGE_NAME="Mesa"  -D PACKAGE_TARNAME="mesa"  -D PACKAGE_VERSION="19.1.0"  -D PACKAGE_STRING="Mesa 19.1.0"  -D PACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi\?product=Mesa"  -D PACKAGE_URL=""  -D PACKAGE="mesa"  -D VERSION="19.1.0" -D _DEFAULT_SOURCE -D _GNU_SOURCE=1 -D _DEFAULT_SOURCE -D HAVE_PTHREAD=1 -O3 -Wall -Wextra -Wnewline-eof -Wno-unused-parameter -Werror -Wno-error=deprecated-declarations -Wall -Wno-missing-field-initializers -Wno-initializer-overrides -Wno-sign-compare -Wno-overloaded-virtual -Wno-absolute-value -Wno-missing-braces -Wno-unused-function -Wno-mismatched-tags -Wno-enum-conversion -Wno-unused-variable -Wno-unused-private-field -Wno-sometimes-uninitialized -Wno-incompatible-pointer-types-discards-qualifiers -Wno-unused-label -Wno-switch -Wno-gnu-variable-sized-type-not-at-end -Wno-extra-semi -Wno-newline-eof -std=c11 -fdebug-prefix-map=/b/s/w/ir/k/out/default=. -fdebug-prefix-map=/b/s/w/ir/k/out=.. -fdebug-prefix-map=/b/s/w/ir/k=../.. -ferror-limit 19 -fmessage-length 0 -fvisibility hidden -fsanitize=safe-stack -stack-protector 2 -ftrivial-auto-var-init=pattern -fobjc-runtime=gcc -fno-common -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -x c nir_opt_if-63c469.c
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'nir_opt_if-63c469.c'.
4.      Running pass 'Virtual Register Rewriter' on function '@opt_if_cf_list'
 #0 0x00000000073a0627 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:533:11
 #1 0x00000000073a0719 PrintStackTraceSignalHandler(void*) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:594:1
 #2 0x000000000739f15b llvm::sys::RunSignalHandlers() /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/Support/Signals.cpp:67:5
 #3 0x00000000073a0d88 SignalHandler(int) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:385:1
 #4 0x00007f38a40fd3a0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x123a0)
 #5 0x00007f38a318bcfb raise (/lib/x86_64-linux-gnu/libc.so.6+0x36cfb)
 #6 0x00007f38a31768ad abort (/lib/x86_64-linux-gnu/libc.so.6+0x218ad)
 #7 0x00007f38a317677f (/lib/x86_64-linux-gnu/libc.so.6+0x2177f)
 #8 0x00007f38a3184542 (/lib/x86_64-linux-gnu/libc.so.6+0x2f542)
 #9 0x000000000650cee9 llvm::BuildMI(llvm::MachineFunction&, llvm::DebugLoc const&, llvm::MCInstrDesc const&, bool, llvm::MachineOperand&, llvm::MDNode const*, llvm::MDNode const*) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/MachineInstr.cpp:2035:3
#10 0x000000000650d1f6 llvm::BuildMI(llvm::MachineBasicBlock&, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::DebugLoc const&, llvm::MCInstrDesc const&, bool, llvm::MachineOperand&, llvm::MDNode const*, llvm::MDNode const*) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/MachineInstr.cpp:2065:22
#11 0x00000000067910bd (anonymous namespace)::UserValue::insertDebugValue(llvm::MachineBasicBlock*, llvm::SlotIndex, llvm::SlotIndex, DbgValueLocation, bool, unsigned int, llvm::LiveIntervals&, llvm::TargetInstrInfo const&, llvm::TargetRegisterInfo const&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/LiveDebugVariables.cpp:1318:5
#12 0x00000000067907b7 (anonymous namespace)::UserValue::emitDebugValues(llvm::VirtRegMap*, llvm::LiveIntervals&, llvm::TargetInstrInfo const&, llvm::TargetRegisterInfo const&, llvm::DenseMap<unsigned int, unsigned int, llvm::DenseMapInfo<unsigned int>, llvm::detail::DenseMapPair<unsigned int, unsigned int> > const&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/LiveDebugVariables.cpp:1360:19
#13 0x00000000067884bf (anonymous namespace)::LDVImpl::emitDebugValues(llvm::VirtRegMap*) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/LiveDebugVariables.cpp:1394:24
#14 0x0000000006788317 llvm::LiveDebugVariables::emitDebugValues(llvm::VirtRegMap*) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/LiveDebugVariables.cpp:1410:1
#15 0x0000000006812b44 (anonymous namespace)::VirtRegRewriter::runOnMachineFunction(llvm::MachineFunction&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/VirtRegMap.cpp:258:37
#16 0x00000000064fe61f llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:8
#17 0x00000000069e173c llvm::FPPassManager::runOnFunction(llvm::Function&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:23
#18 0x00000000069e1b9f llvm::FPPassManager::runOnModule(llvm::Module&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16
#19 0x00000000069e2334 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:23
#20 0x00000000069e1e58 llvm::legacy::PassManagerImpl::run(llvm::Module&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863:16
#21 0x00000000069e28d1 llvm::legacy::PassManager::run(llvm::Module&) /usr/local/google/home/phosek/clang-llvm/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1894:3
#22 0x00000000076e3c0c (anonymous namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction, std::__2::unique_ptr<llvm::raw_pwrite_stream, std::__2::default_delete<llvm::raw_pwrite_stream> >) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:903:3
#23 0x00000000076e031c clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::__2::unique_ptr<llvm::raw_pwrite_stream, std::__2::default_delete<llvm::raw_pwrite_stream> >) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1502:5
#24 0x0000000008198a02 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:303:7
#25 0x000000000a0deb9e clang::ParseAST(clang::Sema&, bool, bool) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/Parse/ParseAST.cpp:178:12
#26 0x0000000008004652 clang::ASTFrontendAction::ExecuteAction() /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1043:1
#27 0x0000000008195fec clang::CodeGenAction::ExecuteAction() /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:1060:1
#28 0x0000000008004028 clang::FrontendAction::Execute() /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/Frontend/FrontendAction.cpp:938:7
#29 0x0000000007f3936f clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:944:23
#30 0x0000000008185b36 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:291:8
#31 0x000000000471e69f cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/tools/driver/cc1_main.cpp:249:13
#32 0x000000000471146f ExecuteCC1Tool(llvm::ArrayRef<char const*>, llvm::StringRef) /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/tools/driver/driver.cpp:309:5
#33 0x0000000004710814 main /usr/local/google/home/phosek/clang-llvm/llvm-project/clang/tools/driver/driver.cpp:381:5
#34 0x00007f38a317852b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2352b)
#35 0x000000000471002a _start (./bin/clang+0x471002a)

I'm still debugging this, but if you have any ideas what's the problem, I'd appreciate any pointers.

vsk added a comment.Jul 23 2019, 3:27 PM

Hm. If you step through DIExpression::isValid and look up opcodes in include/llvm/BinaryFormat/Dwarf.def, the issue should become clear.

So far I've put together: !DIExpression(16, 537, 28, 4096, 456, 8, 6) -> DW_OP_constu, 537, DW_OP_minus, 4096, 456, DW_OP_const1u, DW_OP_deref

I'm not sure what the "4096, 456" part is.

vsk added a comment.Jul 23 2019, 3:33 PM

It might help to 'llvm-extract' the problematic function, then use 'llc' and 'print-after-all' to find the responsible pass.

vsk added a comment.Jul 23 2019, 3:42 PM
include/llvm/BinaryFormat/Dwarf.h:  DW_OP_LLVM_fragment = 0x1000,   ///< Only used in LLVM metadata.

 856     case dwarf::DW_OP_LLVM_fragment:                                                                                                                                         
 857       // A fragment operator must appear at the end.                                                                                                                         
 858       return I->get() + I->getSize() == E->get();

So the deref needs to happen before the fragment. Something like: DIExpression::prepend(Expr, DerefAfter, Offset)?