Page MenuHomePhabricator

[DebugInfo] add SectionedAddress to DebugInfo interfaces.
ClosedPublic

Authored by avl on Feb 13 2019, 10:14 AM.

Details

Summary

That patch is the fix for https://bugs.llvm.org/show_bug.cgi?id=40703 "wrong line number info for obj file compiled with -ffunction-sections" bug. The problem happened with only .o files. If object file contains several .text sections then line number information showed incorrectly. The reason for this is that DwarfLineTable could not detect section which corresponds to specified address(because address is the local to the section). And as the result it could not select proper sequence in the line table. The fix is to pass SectionIndex with the address. So that it would be possible to differentiate addresses from various sections. With this fix llvm-objdump shows correct line numbers for disassembled code.

Diff Detail

Event Timeline

avl created this revision.Feb 13 2019, 10:14 AM
Herald added a project: Restricted Project. · View Herald Transcript

This has some relatively extensive changes to some of the llvm tools (like sancov, for instance) that aren't tested - perhaps it'd be better to leave those tools usage unmodified (adding the undef section value as a fix to the API change, but nothing more) with fixits and/or followup patches that implement the improved functionality possible with the new API, along with tests?

llvm/include/llvm/DebugInfo/DWARF/DWARFDebugLine.h
142–144

Might be simpler to implement as:

return std::tie(LHS.SectionIndex, LHS.Address) < std::tie(RHS.SectionIndex, RHS.Address);
148

Maybe "presented" could be replaced with "is present" here and below?
(and "index of /the/ section")

llvm/include/llvm/Object/ObjectFile.h
139

Maybe use non-static data member initializers rather than constructors?
(& then drop the other two ctors too - let this be a plain struct)

145–149

If this can be changed to a plain struct, might be nice to move this operator out of the class definition (& generally/at least it should be a friend rather than a member, so that LHS and RHS get equal treatment for conversions, etc)

& implement this with std::tie similar to the previous suggestion, maybe?

329

Cleanups like this would ideally go in separate NFC (No Functional Change) commits - shouldn't require any review, feel free to just commit them if you have commit rights.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
273

Is this llvm:: qualifier to disambiguate different things called SectionedAddress? Perhaps we could avoid having different things with that name and use only one of them - maybe move the one that's already in libDebugInfo into libObject? (perhaps as an incremental patch before the rest of this change)

975

Presumably this is going to have ambiguity problems if it doesn't have the section too? Which CU does it return if there are two CUs both with address zero in them (because they each describe functions in different sections), for instance?

979–980

Similarly here

1056–1059

Yeah, seems like it'd be best to have one SectionedAddress type rather than mapping between these equivalent but different types.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
867–868

Perhaps Row should contain a SectionedAddress rather than a separate Address and SectionIndex field (again, could be a preliminary or after-this-patch refactoring to clean it up if it'd otherwise introduce a lot of churn to this already fairly large patch)

889–892

Under what conditions is this code needed? (it looks like this would only fail if the section index was incorrect? Which makes me think/feel like this is maybe working around a bug of some kind?)

905

Maybe this should be "lookupAddressImpl" if it's the underlying implementation that is used by the more general lookupAddress entry point?

982–984

Maybe this should be {Address.SectionIndex, EndAddr - 1} (though admittedly it's a bit subtle having to know if section index or address comes first) & passed in directly to findRowInSeq - no need to split out into a named variable, etc.

1064–1066

Looks like unrelated refactoring? (better in a separate commit)

llvm/test/tools/llvm-objdump/Inputs/function-sections-line-numbers.cpp
1–25 ↗(On Diff #186686)

Seems like a rather long/complicated test case. Is it sufficient to compile this with -ffunction-sections:

void f1() { }
void f2() { }

and then symbolize address 0?

llvm/tools/llvm-cfi-verify/lib/FileAnalysis.cpp
529–530

Could use the ctor:

object::SectionedAddress Address(SectionIndex, VMAddress);

or braced init:

IndirectInstructions.insert({SectionIndex, VMAddress});

(I think the latter's probably fine & worth doing across this patch - but I do appreciate the concern that without names one could easily mix up the section/address order - so perhaps you/other reviewers feel differentlny to me)

llvm/tools/llvm-cfi-verify/lib/GraphBuilder.h
105

Might as well pass a SectionedAddress here?

llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp
404–405

Perhaps this should be changed to produce more than one result? Maybe just leaving a FIXME (given that it's no worse with this change than it was before)

llvm/tools/llvm-rtdyld/llvm-rtdyld.cpp
392–393

You can roll the variable into the condition here:

if (auto SecOrErr = Sym.getSection())

& LLVM style tends not to put {} on single-line statements like this

avl added a comment.Feb 13 2019, 1:18 PM

David, thank you for the review. I would address all points. two major points ;

  1. If it is OK to left tools with UNDEF value and TODO comment - then I am also OK with this and would remove changes for sancov.
  2. As to having two SectionedAddress structures. I was thinking that existing SectionedAddress structure is local for Dwarf sources. So that Dwarf sources could be independend from another sources. But if it is not the case - then It would be better to leave only one SectionedAddress structure defined at Object/ObjectFile.h So if it is OK I will just remove SectionedAddress from DWARFSection.h

My comments about LLD side are below.

Question is: can LLD part have a test?
I also wonder could this patch just pass object::SectionedAddress::UndefSection,
would behavior be the same? If so, then probably would be better to split LLD change
to a different patch with the test case.

lld/ELF/InputFiles.cpp
208

detect -> Detect

(comments should start from uppercase)

210

I would like to hear what Rui thinks about this,
but my opinion this TODO should be removed because I do not believe we will
add a section index to InputSectionBase. This method is used for error reporting,
so it is OK for it to be reasonably slow. But adding bits to InputSectionBase might increase
memory consumption.

And FTR I do not know a better way to find the section index here, seems
using the loop is fine.

213

We usually use prefix increments I think:

++CurIndex
avl added a comment.Feb 14 2019, 2:55 AM

Hi George, Thank you for comments. I will address them. According to the questions ;

  1. yes, I would create a test for LLD.
  2. Using object::SectionedAddress::UndefSection as section index makes following tests fail :

    lld :: ELF/conflict.s lld :: ELF/debug-line-str.s lld :: ELF/multiple-cu.s lld :: ELF/undef.s lld :: ELF/x86-64-reloc-range-debug-loc.s
avl marked 23 inline comments as done.Feb 14 2019, 7:03 AM
avl added inline comments.
llvm/include/llvm/Object/ObjectFile.h
139

I changed it to non-static data member initializers. But it looks like I could not drop following constructors. Since if I add non-static data member initializers then it stopped to compile inplace initializers = {Addr, Index}. To compile inplace initializer - it require constructor SectionedAddress (Addr, Index). It also require default constructor for default instantiation.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
273

moved SectionedAddress from libDebugInfo into linObject.

975

Agreed. I left it as is to reduce size of the fix. Since It is not required for 40703 bug. I would put TODO comment here.

979–980

would put TODO comment here.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
889–892

That is how DebugLineInfo organized. If it has absolute addresses - it does not care which section they relate. So if caller ask for information for an AbsoluteAddress from section X then there would not be possible to find such entry. All absolute addresses put in Undef section(in DebugLineInfo table). But if Line Info table has Relocation information then it would create entry for specific SectionIndex. Earlier there was not this differrentiation because addresses were not marked with SectionIndex. Now addresses which have relocations marked with proper section index and absolute addresses marked with undex section index. Thus the algorithm is following :

find address in relocations for specified section. then if not found check it for Undef section which contains info about all absolute addresses.

another alternative would be to set proper sectionIndex for absolute addresses while DebugLineInfo table create it`s map. Then there would not be neccessary to search twice, but there would be neccessary to create more complex map. I think this is just another way to implement the same.

llvm/test/tools/llvm-objdump/Inputs/function-sections-line-numbers.cpp
1–25 ↗(On Diff #186686)

I took above short test case and check how llvm-objdump shows line numbers. for llvm-symbolizer it would probably be necessary to extend command line to tell to which section specified address related.

llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp
404–405

added TODO comment for this.

avl updated this revision to Diff 186845.Feb 14 2019, 7:51 AM
avl marked an inline comment as done.

Addressed most of the comments. test for LLD is in progress yet.

avl updated this revision to Diff 187074.Feb 15 2019, 1:41 PM

added test for LLD. did all what was requested in comments.

avl added a comment.Feb 15 2019, 1:48 PM

George, I added test for LLD. But.... It already pass for LLD without this fix. The reason for that is following quickfix :

lld/ELF/DWARF.cpp:90

// FIXME: We should be consistent about always adding the file
// offset or not.
if (DR->Section->Flags & ELF::SHF_ALLOC)
  Val += cast<InputSection>(DR->Section)->getOffsetInFile();

lld/ELF/InputFiles.cpp:208

// Use fake address calcuated by adding section file offset and offset in
// section. See comments for ObjectInfo class.
DILineInfo Info;
for (const llvm::DWARFDebugLine::LineTable *LT : LineTables)
  if (LT->getFileLineInfoForAddress(
          S->getOffsetInFile() + Offset, nullptr,
          DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath, Info))
    return Info;

FileOffset added instead of section offset(because section offset could be 0). I think that quickfix should not prevent making interfaces full. i.e. I still think that adding SectionIndex for getFileLineInfoForAddress is a right thing. Having this already done - the fix with getOffsetInFile could probably be refactored out.

Ok. I debugged this a little and think that:

  1. LLD code change LGTM.
  2. I find the LLD test case potentially useful because I did not find any test checking exactly this situation. Given that LLD has mentioned "TODOs" in the code that we might want to remove, it would be good to have such a test for the future possible changes probably.
  3. It feels that the test can be reduced and might need some cleanup (I left a few inline comments).
  4. Since without assigning the section index, we would have few existent LLD tests failures (and also since this test pass without any code changes atm), it should be fine to go without any additional test for LLD in this patch.

To summarize: I suggest to proceed with this patch without the LLD test and post it on a different review as an independent improvement.

lld/test/ELF/debug-line-obj.s
5 ↗(On Diff #187074)

I do not think you need to mention "crash" here.
Seems you used debug-line-str.s test case as a base and the comment is coming from there.
That test was committed with the fix that actually fixed a crash and that is why it is was mentioned.

I would just say here that this test checks that we report locations correctly when we have the line number
information in .debug_line that describes lines from different sections created because of using of -ffunction-sections flag.
Or something like that :)

238 ↗(On Diff #187074)

You should not need .debug_macinfo section.

251 ↗(On Diff #187074)

I think you can remove .ident and the .note.GNU-stack section below.

llvm/test/tools/llvm-objdump/function-sections-line-numbers.test
1 ↗(On Diff #186686)

I did not find any other llvm-objdump test that would use clang.
You should probably use llvm-mc I think.

avl updated this revision to Diff 187235.Feb 18 2019, 6:16 AM

deleted LLD test as it was requested. Modified llvm-objdump test to not to use clang.

dblaikie added inline comments.Feb 25 2019, 2:45 PM
llvm/include/llvm/Object/ObjectFile.h
139

ah, seems that rule doesn't exist in C++14 anymore - so maybe a FIXME to remove the ctors once we've adopted C++14?

149

These shouldn't be static (means they'd have internal linkage, and they'd be duplicated (& not deduplicated by the linker) and lead to ODR violations (any non-static inline function that calls these and is used in multiple translation units would be in violation of the ODR because that inline function would differ (it'd be calling different functions) between fileS))

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
935–946

Probably flip this around to reduce indentation:

if (lookupAddressRangeImpl(...))
  return true;

if (Address.SectionIndex == Undef)
  return false;

Address.SectionIndex = Undef;
return lookupAddressRangeImpl(...);
982–984

In some places you've rolled the SectionedAddress creation into the call site, and in others you have a named variable like this. I think it probably makes sense to not name these variables in general (unless they're needed more than once)? But not a big deal.

1064–1066

Still looks like some unrelated refactoring here?

llvm/tools/llvm-rtdyld/llvm-rtdyld.cpp
391–392

Could you use "else if" here?

llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
206

can't say I've seen this written this way - I think it's probably more common to write it as "if (!BinaryOrErr)"?

209

Rather than x.get().y you could write this as "x->y"?

avl marked 8 inline comments as done.Feb 26 2019, 5:18 AM
avl added inline comments.
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
206

There are more usages of error() in this file. so I did it exactly like in other places. Though if you think it would be better to change it into "if (!BinaryOrErr)" then I will change it.

avl updated this revision to Diff 188386.Feb 26 2019, 8:44 AM

addressed latest comments.

dblaikie accepted this revision.Feb 26 2019, 4:20 PM

Looks good - thanks!

This revision is now accepted and ready to land.Feb 26 2019, 4:20 PM
avl added a comment.Feb 27 2019, 6:00 AM

Thank you.

delcypher added inline comments.
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
204

@avl This patch has broken sanitizer tests on Darwin (rdar://problem/48474543). In particular this now fails.

FILECHECK_DUMP_INPUT_ON_FAILURE=1 bin/llvm-lit -v projects/compiler-rt/test/asan/X86_64DarwinConfig/TestCases/Darwin/haswell-symbolication.cc

I've tracked down the issue to this commit and this function (getModuleSectionIndexForAddress()). The problem is that this code assumes the module name is the same as the path to the binary. This doesn't seem to be true on Darwin because while running the sanitizer test llvm-symbolizer tries to open...

/path/to/llvm_build/projects/compiler-rt/test/asan/X86_64DarwinConfig/TestCases/Darwin/Output/haswell-symbolication.cc.tmp.thin.x86_64:x86_64

Note the :x86_64 at the end of the module name. That's the slice name inside the file and is not part of the file name itself. We need to land a fix for this or we'll need to revert this patch.

I suspect this issue must be handled elsewhere in the codebase because Darwin specific logic probably doesn't belong here.

@JDevlieghere Any ideas if there's an API that can be used here to get the binary file path from the module name?

avl marked an inline comment as done.Feb 28 2019, 9:07 AM
avl added inline comments.
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
204

I will research how to fix it. Make me known if there is proper solution for this, please.

avl marked an inline comment as done.Feb 28 2019, 2:24 PM
avl added inline comments.
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
204

I see that situation already handled in LLVMSymbolizer::getOrCreateModuleInfo(). I am going to handle this in similar manner :

size_t ColonPos = ModuleName.find_last_of(':');

// Verify that substring after colon form a valid arch name.
delcypher added inline comments.Feb 28 2019, 2:46 PM
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
204

@avl Thanks for looking into this. That sounds sensible. It might nice to factor out that little bit of code too to avoid duplication.

avl marked 2 inline comments as done.Mar 1 2019, 2:32 AM
avl added inline comments.
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
204

@delcypher I commit a fix for it. But now i think it would not work. I will prepare review for better version asap.

delcypher added inline comments.Mar 1 2019, 5:28 AM
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
204

@avl Thanks. r355192 seems to have fixed the ASan Darwin/haswell-symbolication.cc test.

avl closed this revision.Sat, Mar 23, 3:02 AM
ormris removed a subscriber: ormris.Mon, Mar 25, 9:02 AM