Page MenuHomePhabricator

Add the ability to extract the unwind rows from DWARF Call Frame Information.
AcceptedPublic

Authored by clayborg on Oct 20 2020, 8:17 PM.

Details

Summary

This patch adds the ability to evaluate the state machine for CIE and FDE unwind objects and produce a UnwindTable with all UnwindRow objects needed to unwind registers. It will also dump the UnwindTable for each CIE and FDE when dumping DWARF .debug_frame or .eh_frame sections in llvm-dwarfdump or llvm-objdump. This allows users to see what the unwind rows actually look like for a given CIE or FDE instead of just seeing a list of opcodes.

This patch adds new classes: UnwindLocation, RegisterLocations, UnwindRow, and UnwindTable.

UnwindLocation is a class that describes how to unwind a register or Call Frame Address (CFA).

RegisterLocations is a class that tracks registers and their UnwindLocations. It gets populated when parsing the DWARF call frame instruction opcodes for a unwind row. The registers are mapped from their register numbers to the UnwindLocation in a map.

UnwindRow contains the result of evaluating a row of DWARF call frame instructions for the CIE, or a row from a FDE. The CIE can produce a set of initial instructions that each FDE that points to that CIE will use as the seed for the state machine when parsing FDE opcodes. A UnwindRow for a CIE will not have a valid address, whille a UnwindRow for a FDE will have a valid address.

The UnwindTable is a class that contains a sorted (by address) vector of UnwindRow objects and is the result of parsing all opcodes in a CIE, or FDE. Parsing a CIE should produce a UnwindTable with a single row. Parsing a FDE will produce a UnwindTable with one or more UnwindRow objects where all UnwindRow objects have valid addresses. The rows in the UnwindTable will be sorted from lowest Address to highest after parsing the state machine, or an error will be returned if the table isn't sorted. To parse a UnwindTable clients can use the following methods:

static Expected<UnwindTable> UnwindTable::create(const CIE *Cie);
static Expected<UnwindTable> UnwindTable::create(const FDE *Fde);

A valid table will be returned if the DWARF call frame instruction opcodes have no encoding errors. There are a few things that can go wrong during the evaluation of the state machine and these create functions will catch and return them.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

One week ping.

clayborg updated this revision to Diff 310045.Dec 7 2020, 4:04 PM

Update to be based off of the "main" branch.

wallace accepted this revision.Dec 8 2020, 3:32 PM

Looks good to me. Let's see if anyone has any objections before landing

This revision is now accepted and ready to land.Dec 8 2020, 3:32 PM

There's at least one pre-merge bot failure that looks like its failing due to something in this change.

I've not reviewed the testing yet (but am happy to do so if I get a chance in the coming days), but I've done a mostly cosmetic review of the rest. I don't have enough experience in this area to review the detail though.

llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h
41

This comment is a little confusing to me. The register is in itself? Should it be "Value is unchanged..."? Note: I don't know much about this stuff, so accept that might not make sense.

88

Should this be register rather than registers?

97

I think you spill onto the stack, not into it, right?

163

(or "maps are")

195
200
228
273

Unnecessary assert?

305

Maybe. I say "an eff-dee-ee" in my head for this.

343
353
llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
207

@MaskRay, I thought that too, but I've been unable to find where the description comes from. Could you point me at it, please?

367

There's inconsistency here and above. Earlier you use errc::invalid_argument without the std. We should be consistent!

368

These messages should all use PRIu32 to go with the uint32_t type.

Also, the style for error messages according to the style guide nowadays is no capitalization on first letter (for regular words at least) and no trailing full stop.

385

This and the next case are almost identical, except for a single string. I wonder if we could factor them out int a helper function or similar?

409

These next two are identical again. Fold them together?

428

Same comment as above re. error message style.

432

Some of these cases are identical. Can we combine them? Same goes above in getOperandAsUnsigned.

498

Maybe not use auto here (what is the type of Inst?)

508

This probably shouldn't be auto. What is the size of the return value? Same goes below for Offset, RegNum etc.

612

Not sure whether this is a clang-tidy issue or something not matching coding standards here. If the former, perhaps worth reporting it? If the latter, please fix!

643

Any particular reason to do the str() here rather than just doing .str().c_str() below?

687

This seems to be identical to the below case? Could they be folded together.

919–920

This sounds like it should be reported through an error handler, rather than printed into the main stream?

948

Same comment as above.

clayborg updated this revision to Diff 310731.Dec 9 2020, 5:35 PM

Fixed issue found by jhenderson.

jhenderson added inline comments.Dec 10 2020, 12:54 AM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h
443
llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
368

Also, the style for error messages according to the style guide nowadays is no capitalization on first letter (for regular words at least) and no trailing full stop.

Looks like you missed the second half of my previous comment?

432

This comment hasn't been addressed? If you add a function that maps an operand type to a string (or use an existing one if there is one), you could fold several of these cases together by using that function.

518

I might be wrong, but I think LLVM_FALLTHROUGH wasn't needed if there were multiple labels with identical behaviour. I think it is for the case where you have:

case 1:
  something();
  // don't break
  LLVM_FALLTHROUGH;
case 2:
  somethingThatCase1AlsoNeedsToDo();
  break;
560
570
clayborg updated this revision to Diff 311089.Dec 10 2020, 7:03 PM
clayborg marked 2 inline comments as done.

Update to fix more of jhenderson's issues.

MaskRay added inline comments.Dec 10 2020, 9:39 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
207

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html EILSEQ "Illegal byte sequence.". This is brief but in other pages in POSIX you can find descriptions like "The data obtained from the input stream does not form a valid character." "The wide-character code wc does not correspond to a valid character."

jhenderson added inline comments.Dec 11 2020, 6:11 AM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h
487
llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
207

Thanks.

448
455

Looks like this could be merged with OT_Address?

478

This and the above case look like they have the operand type in the message. Aside from that, I think they're identical, so can they be merged?

clayborg updated this revision to Diff 311317.Dec 11 2020, 2:13 PM

Fixed more jhenderson comments. Combined many common cases that were possible after making OperandType to string function.

jhenderson added inline comments.Dec 14 2020, 12:41 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
448

Looks like you missed this comment?

919–920

Ping this?

948

Ping this?

clayborg updated this revision to Diff 311803.Dec 14 2020, 11:34 PM
clayborg marked 21 inline comments as done.

Fix one last missed comment fix.

Looks fine aside from the two outstanding comments re. error messages.

clayborg added inline comments.Dec 15 2020, 11:12 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
368

Yes I did miss the error message string part. I will fix!

432

They have different error messages as they mention the "Type" by the enumeration name.

432

Will do

920

This function dumps the CFIProgram contents to the stream that is supplied. I don't believe that returning an error from a dumping function is a good idea as I like to see all errors in a file, not just the first error and have the tool stop dumping.

948

This function dumps the FDE to the stream that is supplied. Same argument as before. If this was a parsing function, then yes, an error should be returned.

Looks fine aside from the two outstanding comments re. error messages.

I commented that these are in dump functions that were already created before this patch. They are in the process of dumping the contents to the stream, so dumping the error messages to the stream seems most important. We don't want to stop dumping the information that people asked for due to running into a single error in a file as people will probably want see all of the errors in a file.

Looks fine aside from the two outstanding comments re. error messages.

I commented that these are in dump functions that were already created before this patch. They are in the process of dumping the contents to the stream, so dumping the error messages to the stream seems most important. We don't want to stop dumping the information that people asked for due to running into a single error in a file as people will probably want see all of the errors in a file.

Continuing rather than stopping makes sense to me for dumpers. This is the approach we already follow in llvm-readelf and in some other parts of llvm-dwarfdump. In the debug line dumper (which happens to effectively run in parallel to the parser), we report format issues via a callback, which by default prints a warning to stderr, but still continue parsing when we can (see the recoverable error handler). This allows library consumers to handle the problem differently, e.g. if they have warnings as errors enabled. If I follow it correctly, your code here prints it into stdout, without textual colouring, and with the "error:" prefix. This would mean that someone redirecting stderr to a file will not see the errors in that file. I don't think that's what a user would expect.

Essentially, I'm suggesting adding an additional parameter to the dump functions e.g. function_ref<void(Error)> ErrorCallback which the caller can pass. Somewhere up the stack, this would for llvm-dwarfdump at least be defaultWarningHandler. This would allow other clients to do something different, should they choose to. See my LLVMDev 2018 lightning talk on "Error Handling in Libraries" for an example of what I mean.

llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
433

Looks like this message hasn't been reformatted with lower-case first letter and no full stop.

clayborg updated this revision to Diff 312313.Dec 16 2020, 3:29 PM

Fixed remaining error message and used error handling build into the dump options.

clayborg marked 3 inline comments as done.Dec 16 2020, 3:30 PM
clayborg added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp
433

You are correct. I will fix.

920

I found that DumpOpts had error handlers built in, so I used them. They are hooked up and caused a few tests that needed to be updated, so the error handling worked.

948

I found that DumpOpts had error handlers built in, so I used them. They are hooked up and caused a few tests that needed to be updated, so the error handling worked.

Mostly looks fine, although there's a unit test failure, so not formally LGTMing. As I think I mentioned earlier, I'm not really familiar with the functional aspects of this change, so don't rely on me alone anyway. Tomorrow will be my last day in the office for over two weeks, so feel free to land if you think others have reviewed sufficiently after the remaining test fix.

clayborg updated this revision to Diff 314715.Tue, Jan 5, 1:53 PM

Update to main branch from today.

clayborg updated this revision to Diff 314796.Tue, Jan 5, 9:29 PM

Found the test suite issue on

https://reviews.llvm.org/harbormaster/unit/view/247190/

DWARFExpression wasn't initializing the "bool Error;" instance variable.

Phabricator is still saying there are unit tests failing. Are these up-to-date reports, or is Phabricator stale?

Phabricator is still saying there are unit tests failing. Are these up-to-date reports, or is Phabricator stale?

They are up to date. I am trying to figure out what is going on without much luck. I thought I had found the issue. Really hard when only one buildbot fails and you have no access to figure out what is going wrong.

The really confusing thing is the error message for x64 debian > LLVM-Unit.DebugInfo/DWARF/_/DebugInfoDWARFTests::DWARFDebugFrame.UnwindTable_DW_CFA_expression:

/mnt/disks/ssd0/agent/llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugFrameTest.cpp:1318: Failure
      Expected: Rows[0].getRegisterLocations()
      Which is: reg13=[<decoding error> 00]
To be equal to: VerifyLocs
      Which is: reg13=[DW_OP_reg10]

The FDE tries to parse:

parseCFI(TestFDE, {dwarf::DW_CFA_expression, Reg, 1, dwarf::DW_OP_reg12}),

Where this clearly encodes DW_CFA_expression (first byte) with a register value or "Reg" which is 13 (second byte) and the a DW_FORM_block where the third byte 1 is the length of the expression for reg13 and the only byte in the expression is DW_OP_reg12 (4th byte).

This somehow fails to decode only on debian??

VerifyLocs is created using:

dwarf::RegisterLocations VerifyLocs;
DataExtractor ExprData({dwarf::DW_OP_reg12}, true, AddrSize);
DWARFExpression Expr(ExprData, AddrSize);
VerifyLocs.setRegisterLocation(
    Reg, dwarf::UnwindLocation::createAtDWARFExpression(Expr));

We can easily see the expression data contains _only_ DW_OP_reg12 but in the printout above it says:

To be equal to: VerifyLocs
      Which is: reg13=[DW_OP_reg10]

No idea how this is getting modified???

So I setup a CentOS 8 linux server today and ran the failing tests and they succeed. I am out of options here. I would like to submit this as I need it to get other work submitted. Is there any way to see how the debian x86 harbormaster build is setup? Maybe it is enabling some extra things?

MaskRay added inline comments.Mon, Jan 11, 8:28 PM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h
85

The two lines can be dropped.

171

This can be dropped.

250

Delete Address(None)

Running the unit test with ASAN gives me the following. The stack trace isn't the most helpful, unfortunately, but it should hopefully be something to go on.

==16512==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fff6bf46fc3 at pc 0x0000034713cf bp 0x7fff6bf46900 sp 0x7fff6bf460b0
READ of size 1 at 0x7fff6bf46fc3 thread T0
    #0 0x34713ce in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) (/data/users/smeenai/llvm-project/build/ASanUBSan/unittests/DebugInfo/DWARF/DebugInfoDWARFTests+0x34713ce)
    #1 0x3471a08 in bcmp (/data/users/smeenai/llvm-project/build/ASanUBSan/unittests/DebugInfo/DWARF/DebugInfoDWARFTests+0x3471a08)
    #2 0x35c6165 in compareMemory /home/smeenai/llvm-project/llvm/include/llvm/ADT/StringRef.h:76:14
    #3 0x35c6165 in llvm::StringRef::equals(llvm::StringRef) const /home/smeenai/llvm-project/llvm/include/llvm/ADT/StringRef.h:192:15
    #4 0x4eeb444 in operator== /home/smeenai/llvm-project/llvm/include/llvm/ADT/StringRef.h:901:16
    #5 0x4eeb444 in llvm::DWARFExpression::operator==(llvm::DWARFExpression const&) const /home/smeenai/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp:507:25
    #6 0x4e7b95e in llvm::dwarf::UnwindLocation::operator==(llvm::dwarf::UnwindLocation const&) const /home/smeenai/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp:142:18
    #7 0x35e5d08 in operator==<const unsigned int, llvm::dwarf::UnwindLocation> /opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/bits/stl_pair.h:444:51
    #8 0x35e5d08 in bool std::__equal<false>::equal<std::_Rb_tree_const_iterator<std::pair<unsigned int const, llvm::dwarf::UnwindLocation> >, std::_Rb_tree_const_iterator<std::pair<unsigned int const, llvm::dwarf::UnwindLocation> > >(std::_Rb_tree_const_iterator<std::pair<unsigned int const, llvm::dwarf::UnwindLocation> >, std::_Rb_tree_const_iterator<std::pair<unsigned int const, llvm::dwarf::UnwindLocation> >, std::_Rb_tree_const_iterator<std::pair<unsigned int const, llvm::dwarf::UnwindLocation> >) /opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/bits/stl_algobase.h:800:22
    #9 0x35fac2b in operator==<unsigned int, llvm::dwarf::UnwindLocation, std::less<unsigned int>, std::allocator<std::pair<const unsigned int, llvm::dwarf::UnwindLocation> > > /opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/bits/stl_map.h:1437:23
    #10 0x35fac2b in operator== /home/smeenai/llvm-project/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h:220:22
    #11 0x35fac2b in CmpHelperEQ<llvm::dwarf::RegisterLocations, llvm::dwarf::RegisterLocations> /home/smeenai/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:1392:11
    #12 0x35fac2b in Compare<llvm::dwarf::RegisterLocations, llvm::dwarf::RegisterLocations> /home/smeenai/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:1421:12
    #13 0x35fac2b in (anonymous namespace)::DWARFDebugFrame_UnwindTable_DW_CFA_expression_Test::TestBody() /home/smeenai/llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugFrameTest.cpp:1318:3
    #14 0x5797598 in testing::Test::Run() /home/smeenai/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2474:5
    #15 0x579a084 in testing::TestInfo::Run() /home/smeenai/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2656:11
    #16 0x579bce3 in testing::TestCase::Run() /home/smeenai/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2774:28
    #17 0x57ace73 in testing::internal::UnitTestImpl::RunAllTests() /home/smeenai/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:4649:43
    #18 0x57ac32e in testing::UnitTest::Run() /home/smeenai/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:4257:10
    #19 0x5786874 in RUN_ALL_TESTS /home/smeenai/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46
    #20 0x5786874 in main /home/smeenai/llvm-project/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50:10
    #21 0x7f30c6215554 in __libc_start_main (/lib64/libc.so.6+0x22554)
    #22 0x33f5028 in _start (/data/users/smeenai/llvm-project/build/ASanUBSan/unittests/DebugInfo/DWARF/DebugInfoDWARFTests+0x33f5028)

Address 0x7fff6bf46fc3 is located in stack of thread T0 at offset 1155 in frame
    #0 0x35f983f in (anonymous namespace)::DWARFDebugFrame_UnwindTable_DW_CFA_expression_Test::TestBody() /home/smeenai/llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugFrameTest.cpp:1271

  This frame has 47 object(s):
    [32, 36) 'RegNum.addr.i'
    [48, 120) 'ref.tmp.i'
    [160, 200) 'Data.i267' (line 134)
    [240, 248) 'Offset.i' (line 136)
    [272, 488) 'TestCIE' (line 1274)
    [560, 688) 'TestFDE' (line 1278)
    [720, 736) 'agg.tmp'
    [752, 768) 'gtest_ar' (line 1290)
    [784, 785) 'ref.tmp' (line 1290)
    [800, 824) 'ref.tmp10' (line 1290)
    [864, 872) 'agg.tmp11'
    [896, 899) 'ref.tmp16' (line 1290)
    [912, 928) 'agg.tmp22'
    [944, 952) 'ref.tmp29' (line 1290)
    [976, 984) 'ref.tmp34' (line 1290)
    [1008, 1024) 'gtest_ar42' (line 1299)
    [1040, 1041) 'ref.tmp43' (line 1299)
    [1056, 1080) 'ref.tmp49' (line 1299)
    [1120, 1128) 'agg.tmp50'
    [1152, 1156) 'ref.tmp55' (line 1299) <== Memory access at offset 1155 is inside this variable
    [1168, 1176) 'ref.tmp74' (line 1299)
    [1200, 1208) 'ref.tmp79' (line 1299)
    [1232, 1280) 'VerifyLocs' (line 1305)
    [1312, 1336) 'ExprData' (line 1307)
    [1376, 1377) 'ref.tmp92' (line 1307)
    [1392, 1424) 'Expr' (line 1308)
    [1456, 1520) 'ref.tmp111' (line 1309)
    [1552, 1600) 'RowsOrErr' (line 1313)
    [1632, 1648) 'gtest_ar115' (line 1314)
    [1664, 1665) 'ref.tmp116' (line 1314)
    [1680, 1704) 'ref.tmp122' (line 1314)
    [1744, 1752) 'agg.tmp123'
    [1776, 1784) 'ref.tmp133' (line 1314)
    [1808, 1816) 'ref.tmp138' (line 1314)
    [1840, 1856) 'gtest_ar152' (line 1316)
    [1872, 1880) 'ref.tmp153' (line 1316)
    [1904, 1908) 'ref.tmp159' (line 1316)
    [1920, 1928) 'ref.tmp167' (line 1316)
    [1952, 1960) 'ref.tmp172' (line 1316)
    [1984, 2000) 'gtest_ar181' (line 1317)
    [2016, 2024) 'ref.tmp182' (line 1317)
    [2048, 2052) 'ref.tmp191' (line 1317)
    [2064, 2072) 'ref.tmp199' (line 1317)
    [2096, 2104) 'ref.tmp204' (line 1317)
    [2128, 2144) 'gtest_ar213' (line 1318)
    [2160, 2168) 'ref.tmp229' (line 1318)
    [2192, 2200) 'ref.tmp234' (line 1318)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope (/data/users/smeenai/llvm-project/build/ASanUBSan/unittests/DebugInfo/DWARF/DebugInfoDWARFTests+0x34713ce) in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long)
Shadow bytes around the buggy address:
  0x10006d7e0da0: 00 00 00 00 00 f2 f2 f2 f2 f2 f2 f2 f2 f2 00 00
  0x10006d7e0db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f2 f2
  0x10006d7e0dc0: f2 f2 00 00 f2 f2 f8 f8 f2 f2 f8 f2 f8 f8 f8 f2
  0x10006d7e0dd0: f2 f2 f2 f2 00 f2 f2 f2 f8 f2 00 00 f2 f2 f8 f2
  0x10006d7e0de0: f2 f2 f8 f2 f2 f2 f8 f8 f2 f2 f8 f2 f8 f8 f8 f2
=>0x10006d7e0df0: f2 f2 f2 f2 00 f2 f2 f2[f8]f2 f8 f2 f2 f2 f8 f2
  0x10006d7e0e00: f2 f2 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 f2
  0x10006d7e0e10: f2 f2 f2 f2 f8 f2 00 00 00 00 f2 f2 f2 f2 f8 f8
  0x10006d7e0e20: f8 f8 f8 f8 f8 f8 f2 f2 f2 f2 00 00 00 00 00 00
  0x10006d7e0e30: f2 f2 f2 f2 f8 f8 f2 f2 f8 f2 f8 f8 f8 f2 f2 f2
  0x10006d7e0e40: f2 f2 00 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2 f8 f8
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
clayborg updated this revision to Diff 316574.Wed, Jan 13, 11:29 PM

Fixed inline issues found by MaskRay and fixed the issue found by ASAN that was making the debian buildbot fail. Thanks Shoaib for the ASAN suggestion.

Looks like harbormaster is finally clean. Any more issues or is this good to go?