This is an archive of the discontinued LLVM Phabricator instance.

[Windows] Use information from the PE32 exceptions directory to construct unwind plans
ClosedPublic

Authored by aleksandr.urakov on Sep 9 2019, 4:11 AM.

Details

Summary

This patch adds an implementation of unwinding using PE EH info. It allows to get almost ideal call stacks on 64-bit Windows systems (except some epilogue cases, but I believe that they can be fixed with unwind plan disassembly augmentation in the future).

To achieve the goal the CallFrameInfo abstraction was made. It is based on the DWARFCallFrameInfo class interface with a few changes to make it less DWARF-specific.

To implement the new interface for PECOFF object files the class PECallFrameInfo was written. It uses the next helper classes:

  • UnwindCodesIterator helps to iterate through UnwindCode structures (and processes chained infos transparently);
  • EHProgramBuilder with the use of UnwindCodesIterator constructs EHProgram;
  • EHProgram is, by fact, a vector of EHInstructions. It creates an abstraction over the low-level unwind codes and simplifies work with them. It contains only the information that is relevant to unwinding in the unified form. Also the required unwind codes are read from the object file only once with it;
  • EHProgramRange allows to take a range of EHProgram and to build an unwind row for it.

So, PECallFrameInfo builds the EHProgram with EHProgramBuilder, takes the ranges corresponding to every offset in prologue and builds the rows of the resulted unwind plan. The resulted plan covers the whole range of the function except the epilogue.

Diff Detail

Event Timeline

Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
aleksandr.urakov retitled this revision from [Windows] Use EH info from the PE32 exceptions directory to construct unwind plans to [Windows] Use information from the PE32 exceptions directory to construct unwind plans.Sep 9 2019, 4:45 AM
aleksandr.urakov edited the summary of this revision. (Show Details)
aleksandr.urakov removed a project: Restricted Project.
aleksandr.urakov removed a subscriber: JDevlieghere.
Herald added a project: Restricted Project. · View Herald Transcript

At the object file level I would love to see much less of this specific unwind info making it into the ObjectFile interface. Why can't we just ask the ObjectFile to provide an UnwindPlan for a given address. There is so much complexity within the UnwindPlan where it is managing all these specific unwind plans. IMHO we should just ask the ObjectFile for an UnwindPlan and let the ObjectFile decide which one is best. Right now UnwindPlan manages a ton of different kinds of unwind alternatives and UnwindPlan has all sorts of knowledge about each specific unwind plan type (ARM compact unwind, Apple compact unwind, EH frame, .debug_frame, arch default, and more). We seem to be adding to this complexity here by making all ObjectFile instances having to know how to create each different type when UnwindPlan already has the all the abstraction we need in UnwindPlan::Row.

It's a good point, thank you! I had the same thoughts when I done it, but I'm not completely sure. The problem is that an object file can't be completely responsible for choosing an unwind plan, because some plans are produced by symbol files (and an object file knows nothing about that, if I understand right). So we can move only a part of the plan-choosing-functionality from FuncUnwinders to ObjectFiles, but will it be better? In that case both FuncUnwinders and ObjectFiles will be responsible for managing plans, won't it add an additional complexity to the solution?

It's a good point, thank you! I had the same thoughts when I done it, but I'm not completely sure. The problem is that an object file can't be completely responsible for choosing an unwind plan, because some plans are produced by symbol files (and an object file knows nothing about that, if I understand right). So we can move only a part of the plan-choosing-functionality from FuncUnwinders to ObjectFiles, but will it be better? In that case both FuncUnwinders and ObjectFiles will be responsible for managing plans, won't it add an additional complexity to the solution?

I spent a lot of time thinking about this when introducing the "breakpad" unwind plans. My conclusion was too, that making ObjectFile solely responsible for this is not a good idea, for the reasons that you already mention (and others), but I haven't really found a solution that would be fully satisfactory. Having the plans be managed by the symbol file is not that great either, because a lot of unwind plans (instruction emulation) really don't have anything to do with symbol files.

The solution that sounded most promising to me back then was making the unwind plan providers pluggable. For instance, the Module (or probably the UnwindTable contained within it) would contain a list of callbacks, which it would invoke when it wanted to create an unwind plan for a given function. Then, anybody could register itself as an unwind info provider. This format seems to be naturally tied to the COFF object files, so the related code could live inside ObjectFilePECOFF. The breakpad unwind info could be provided by SymbolFileBreakpad. Things that are truly independent of any object or symbol file format (like instruction emulation again) could be handled within the UnwindTable by just pre-populating the list of callbacks.

This sounds pretty nice in theory, but what makes it harder in practice is that there isn't just a single ultimate unwind plan for one function. There's the synchronous/asynchronous distinction for one. And then, there are various places throughout the code which reach for a specific unwind method to do some special thing. Doing this is hard if the unwind method is abstracted behind an abstract callback, and it's the reason I gave up on this idea, originally

However, maybe it would be good to resurrect it. The case for a callback solution is stronger if we have two methods that would make use of it (this stuff, and the breakpad format) than it was when we had just one. So, we could port these two methods to the callback mechanism, and others could follow at their own pace. I'd be happy to help with the breakpad side of things if needed.

All that said, I do think there is some degree of elegance in how you've done things in this patch (though I am not a fan of the windows ISomething convention). I've been thinking a lot about what will we do once we get to implementing win64 unwinding, and I haven't though of this. Though, like I already said, none of these solutions seems truly *right*, so I'd like to hear what others think about this too.

lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.cpp
16

It doesn't look like this will do the right thing in if host endianness differs from the target. You should use GetMaxU64 instead. (or given that PE should be always little-endian (right?), you could also ditch the DataExtractor and read stuff by casting to llvm::support::ulittleXX_t. I believe that's how most of PECOFF parsing in llvm works.)

179

llvm::array_lengthof

lldb/source/Symbol/DWARFCallFrameInfo.cpp
454 ↗(On Diff #219314)

Too much clang format. Please make sure you only format the lines you modify.

lldb/unittests/Symbol/CMakeLists.txt
17 ↗(On Diff #219314)

As the code you're testing lives in ObjectFilePECOFF, it would probably be better to move this stuff to unittests/ObjectFile/PECOFF (new folder).

I've been trying to figure out how to implement the same functionality you've got here, so I'm very interested in helping you land this in some form.

I think Clayborg and Pavel makes some good points about where the right layer of abstraction is for the unwind plans. I've copied your patch locally and will try out some different ideas and report back.

This patch is pretty large. It might be easier to break it up into a series of small steps to get there. The bullet points in the CL description might be a good roadmap for such a break-up. But let's first figure out the right abstraction points.

Update diff due to Pavel's requests.

aleksandr.urakov marked 4 inline comments as done.Sep 12 2019, 1:10 AM

Thanks all for taking a look!

This patch is pretty large. It might be easier to break it up into a series of small steps to get there. The bullet points in the CL description might be a good roadmap for such a break-up. But let's first figure out the right abstraction points.

I'm open to this, just let me know.

lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.cpp
16

I believe that this code is ok: it actually doesn't read a pointer to data, it gets the data read from the file and interprets it like a structure of type T (GetData returns just a pointer to the data buffer). I think it's safe because T can be RuntimeFunction, UnwindInfo or UnwindCode, but all these types are already using ulittleXX_t in their definitions.

lldb/unittests/Symbol/CMakeLists.txt
17 ↗(On Diff #219314)

My bad, fixed, thank you!

My opinion is we need to be able to ask both the SymbolFile and ObjectFile for unwind plans in the API, but we can always just ask the SymbolFile for the unwind plan and the default SymbolFile virtual function can just defer to the ObjectFile. We also need to say "I need an async unwind plan" or "I need an sync unwind plan". Since we always have a symbol file, we fall back to SymbolFileSymtab if we don't have DWARF or PDB.

If we look at FuncUnwinders we see just how many different ways we currently have:

class FuncUnwinders {
  ...
  lldb::UnwindPlanSP m_unwind_plan_assembly_sp;
  lldb::UnwindPlanSP m_unwind_plan_eh_frame_sp;
  lldb::UnwindPlanSP m_unwind_plan_debug_frame_sp;
  // augmented by assembly inspection so it's valid everywhere
  lldb::UnwindPlanSP m_unwind_plan_eh_frame_augmented_sp;
  lldb::UnwindPlanSP m_unwind_plan_debug_frame_augmented_sp;
  std::vector<lldb::UnwindPlanSP> m_unwind_plan_compact_unwind;
  lldb::UnwindPlanSP m_unwind_plan_arm_unwind_sp;
  lldb::UnwindPlanSP m_unwind_plan_symbol_file_sp;
  lldb::UnwindPlanSP m_unwind_plan_fast_sp;
  lldb::UnwindPlanSP m_unwind_plan_arch_default_sp;
  lldb::UnwindPlanSP m_unwind_plan_arch_default_at_func_entry_sp;
};

So it seems we need to be able to ask:

  • SymbolFile for unwind plan which would be able to get us:
    • m_unwind_plan_debug_frame_sp
    • m_unwind_plan_debug_frame_augmented_sp
    • m_unwind_plan_symbol_file_sp
  • ObjectFile for unwind plan which would get us:
    • m_unwind_plan_eh_frame_sp
    • m_unwind_plan_eh_frame_augmented_sp
    • m_unwind_plan_arm_unwind_sp
    • would assembly unwind go here?
  • Achitecture plug-ins for unwind plan
    • m_unwind_plan_arch_default_sp
    • m_unwind_plan_arch_default_at_func_entry_sp
    • would assembly unwind go here?

It would be great to clean this up with a function that SymbolFile, ObjectFile and Architecture implement like:

virtual UnwindPlanSP GetUnwindPlan(bool async);

The architecture plug-ins could drop the "bool async" part. But in general this would really clean up the code in FuncUnwinders.

Added Jason Molenda who owns the unwind stuff so he might be able to comment and help us with this patch.

Hi Aleksandr, nice job getting this working. I read over the patch and had a couple of initial questions.

I don't understand the creation of the ICallFrameInfo base class. All of the unwind info providers (things which parse their own format and output an UnwindPlan) are independent. We could create a base class for these to give them an interface they must implement if that's helpful to something. I don't understand how DWARFCallFrameInfo and the PE EH format are related; why is it closer than the CompactUnwindInfo or ArmUnwindInfo. I don't understand what the "I" at the beginning of the base class name is indicating, but that's a minor point.

Ideally all details of the input sources are hidden behind the UnwindTable and FuncUnwinders objects in the Module. We obviously have places in generic code that look for a specific unwind source for something unusual. One spot is in a section your patch modifies, in ObjectFileMachO when we don't have an authoritative source of information for function start addresses, we see if we have eh_frame information which gives us the start addresses for every function with eh information.

Most of the unwind formats are not wedded to a specific ObjectFile or SymbolFile format, so having them be separate from the OF/SF class is a good separation IMO. For instance, CompactUnwindInfo is only used, today, on Darwin systems. But it's a pretty clever format and I could see it being adopted on an ELF system. Putting compact unwind parsed in ObjectFileMachO would prevent that kind of flexibility.

I agree with the general point that the FuncUnwinders class is messy, but that mess is mostly contained behind a simple API. I'm not opposed to rethinking this -- plugins is an interesting idea -- but it doesn't bother me a lot given how little this code is typically worked on.

I probably read through the patch too quickly to understand it, but why is this different from other unwind info providers? From the UnwindTable we have access to the ObjectFile and SymbolFile directly - I don't understand the motivation of that part of the change. I don't understand the move of the methods into the ObjectFile class - I don't object to it, but I'm missing the motivation for moving things there from the UnwindTable.

I guess it's best to start out with, the most basic: why isn't this a sources/Symbol/PEUnwindInfo.cpp standalone that UnwindTable instantiates and FuncUnwinders asks for UnwindPlans from.

I have a couple more questions and some renaming requests.

lldb/include/lldb/Symbol/DWARFCallFrameInfo.h
74 ↗(On Diff #219853)

I find the name ForEachEntries confusing. I know this is a leftover from the original code that you're modifying, but I wonder if it can get a better name. In particular, I don't know why it's plural, so I'd expect ForEachEntry, but even that is pretty vague. I realize FDE is DWARF-specific, but at least it gives a clue as to what type of entries are involved.

lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.cpp
542

Isn't it possible for more than one RuntimeFunction to intersect with an address range? In normal use, it might not happen because it's being called with constrained ranges, so maybe it's nothing to worry about. I suppose if the range were too large and it encompassed several functions, returning any one of them is acceptable.

lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.h
47

nit: missing n: FindRuntimeFunctionIntersectsWithRange

51

m_exception_data is vague. In the constructor, it's referred to as the exception directory, so perhaps m_exception_dir would be a bit more descriptive.

lldb/source/Symbol/ObjectFile.cpp
684

It seems a bit weird for DWARF-specific code to be here, when there are ObjectFile plugins for PECOFF, ELF, and Mach-O. Obviously the PECOFFCallFrameInfo is instantiated in the PECOFF plugin. The ELF and Mach-O versions instantiate DWARFCallFrameInfos. Does the generic ObjectFile need to do the same?

aleksandr.urakov edited the summary of this revision. (Show Details)

Update due to the requests.

aleksandr.urakov marked 4 inline comments as done.Sep 13 2019, 12:16 PM

Hi Jason, thanks for the review!

Initially, the reason of these changes was to reuse the code that works with eh_frame, because on Windows we have a very similar compiler generated info designed to help with unwinding stacks during exception handling. That's why I have chosen DWARFCallFrameInfo and have extracted its interface (I in ICallFrameInfo stands for Interface, but I've renamed it to CallFrameInfo in the last patch, because this notation doesn't look common in LLVM).

But there is one more reason that makes it very difficult to just create PECallFrameInfo in UnwindTable like we do with DWARFCallFrameInfo: PECallFrameInfo is coupled very tight with ObjectFilePECOFF and can't work over plain ObjectFile. First of all, it uses several functions for work with relative virtual addresses (RVA), and they are implemented with the use of ObjectFilePECOFF's private variable m_image_base (however, may be it is possible to implement these methods over the ObjectFile interface, I'm not sure about that). The second, it requires information about exception directory location, but it is an inner mechanics of ObjectFilePECOFF. So its relations with ObjectFilePECOFF are not the same as between DWARFCallFrameInfo and ELF or Mach-O.

To resolve the situation we can:

  • use something like a dynamic_cast to ObjectFilePECOFF in UnwindTable and create PECallFrameInfo with it. But in this case lldbSymbol becomes dependent on lldbPluginObjectFilePECOFF (and we get a circular dependency);
  • extend ObjectFile's interface to make it possible for PECallFrameInfo to work with required PE abstractions. In this case we can just create PECallFrameInfo in UnwindTable like we do with DWARFCallFrameInfo, but it adds weird operations to ObjectFile's interface, which are not related to other object file formats;
  • allow ObjectFile to create its own unwind infos by himself.

I've found the last idea good, and decided to redo things in this way (because DWARFCallFrameInfo etc. are working over object files too). But you are right, it also has a negative impact on ObjectFile: it becomes knowing of the things it shouldn't. So in the last patch I have left only a most abstract method, which only allows to get some unwind info from an object file, and introduced corresponding entities in UnwindTable and FuncUnwinders. I think it can be a good start for moving in the direction that Greg suggests.

What do you think of this?

lldb/include/lldb/Symbol/DWARFCallFrameInfo.h
74 ↗(On Diff #219853)

I just have removed it from the new version.

lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.cpp
542

Yes, it is possible. The problem here is that FuncUnwinders requests the plan with an AddressRange, not with an Address, and the information about the original requested address is lost to that moment. The required AddressRange is calculated in UnwindTable::GetAddressRange, that's why the order of querying unwind infos is important in that function. I think that this logic requires some rework, but it seems that it is not a good idea to do it in this patch.

lldb/source/Symbol/ObjectFile.cpp
684

I just have made a default implementation to avoid duplication of the code in ELF and Mach-O plugins, but I agree, it looks weird here. It was removed in the new implementation.

Sorry for the delay, I was OOO. I haven't yet read through all of the discussion, but I wanted to plug into this part.

But there is one more reason that makes it very difficult to just create PECallFrameInfo in UnwindTable like we do with DWARFCallFrameInfo: PECallFrameInfo is coupled very tight with ObjectFilePECOFF and can't work over plain ObjectFile. First of all, it uses several functions for work with relative virtual addresses (RVA), and they are implemented with the use of ObjectFilePECOFF's private variable m_image_base (however, may be it is possible to implement these methods over the ObjectFile interface, I'm not sure about that). The second, it requires information about exception directory location, but it is an inner mechanics of ObjectFilePECOFF. So its relations with ObjectFilePECOFF are not the same as between DWARFCallFrameInfo and ELF or Mach-O.

PE m_image_base is already available through generic object file interfaces. You can access it as obj_file->GetBaseAddress().GetFileAddress(). This is the same way as the Breakpad symbol file plugin does it. As for the location of the exception directory, it sounds like it would not be too horrendous to have the PE plugin parse that and expose this bit as a special section (with it's own special section type etc.). Then, we could (hopefully) write the PE unwind parser in a way which does not depend on object file internals, similar to how the eh_frame parser does it.

That said, there are two reasons I am not 100% in favour of that idea:

  • the current location of the unwind parsers (source/Symbol) seems pretty weird to me. The code in this folder already does a lot of stuff, and most of it has nothing to do with unwinding. So, shoving more unwinding code there seems suboptimal. If we go down that path, I would propose we create a new folder (source/Unwind ?) and put the unwind parsers there.
  • even with a separate "unwind" folder, it seems slightly weird to have formats which are clearly specific to one {symbol|object}file be in a generic place. I am here mostly thinking of more "exotic" formats like breakpad. While it would be possible (and I am happy to do that) to refactor the breakpad parser to work in this way, in does not seem completely optimal to me. That's why I would still be in favour of some "plugin" mechanism, where unwind formats which are definitely tied to some kind of a file format be implemented there. Then the more generic formats could live in the generic place, and things like breakpad (and possibly PECOFF) could live inside their plugins.

(Overall, don't take my comments as a hard objection to anything. It's just an opinion that I am happy to be overridden on.)

lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.cpp
16

Ah sorry. I missed that part. This seems fine then.

Gentle ping! What do you think of this?

I'm afraid I don't have much to add beyond what I've already said. I think @jasonmolenda should make the call as to how to move forward here. If the chosen direction requires any changes in how the breakpad unwind info handled, I'm happy to give you a hand there.

I'm just back from vacation. I agree with Pavel that we need to hear more from Jason at this point. I'm still very interested in helping this land in some form.

jasonmolenda accepted this revision.Oct 10 2019, 10:16 PM

Thanks for all of the work on this patch Aleksandr, I really appreciate the work and this will be a nice addition to lldb. Apologies again for not looking over the revised patch earlier -- this looks really good to me, please do commit it when you have a chance.

This revision is now accepted and ready to land.Oct 10 2019, 10:16 PM

Thanks a lot for the review!

This revision was automatically updated to reflect the committed changes.

Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding?

Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding?

Hi Martin!

Do I understand the question correctly: you mean x64 Windows binaries with an additional DWARF unwind info inside, right? In that case the info from PE32+ directory should be used, it has a higher priority than the debug info. I think it should have a higher priority because it is used by the system during an unwind and is always presented in x64 binaries, so we have stronger guarantees with it.

In all other cases, when the info in PE32+ directory is not presented, all things should work as usual (then, the DWARF info will be used).

Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding?

Do you mean debug_frame or eh_frame? debug_frame should be completely unaffected by this. the interaction between eh_frame and SEH is more tricky, but I don't know if that's ever used/emitted on windows (since presumably the system libraries don't know how to read it)...

Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding?

Do you mean debug_frame or eh_frame? debug_frame should be completely unaffected by this. the interaction between eh_frame and SEH is more tricky, but I don't know if that's ever used/emitted on windows (since presumably the system libraries don't know how to read it)...

I meant debug_frame. Ok, good if that's unaffected.

In MinGW setups, you can have a number of different combinations of both unwind and debug info. The debug info normally is DWARF (i.e. debug_frame), but it can also (in pure clang/lld based environments, not with GCC/binutils) optionally use codeview/PDB.

For unwind info, on x64, SEH is normally used, but it can also optionally use SjLj or DWARF (i.e. eh_frame). For i686 (and armv7), DWARF is the default.

And for the cases where it does use SEH for C++ exception unwinding, it doesn't do it exactly like MSVC does, but it uses a special gcc personality function which unwinds one step at a time with RtlVirtualUnwind. So it still does use the system unwinder facility, but slightly differently.

Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding?

Do I understand the question correctly: you mean x64 Windows binaries with an additional DWARF unwind info inside, right? In that case the info from PE32+ directory should be used, it has a higher priority than the debug info. I think it should have a higher priority because it is used by the system during an unwind and is always presented in x64 binaries, so we have stronger guarantees with it.

No, I meant DWARF debug info.

In all other cases, when the info in PE32+ directory is not presented, all things should work as usual (then, the DWARF info will be used).

Ok, that's good.

If I build an environment for x86_64 that uses DWARF for exception handling, ExceptionTableRVA/ExceptionTableSize are zero in the data directory, so I would presume this would be skipped then, and use the DWARF info instead.

It looks like this changed fixed at least one of the XFAILed tests on Windows:

http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/9751

So now the test results would be red because of the unexpectedly passing test (if there wasn't another failure). Could you have a look at whether any of the other tests that were XFAILed for the same bug are also now passing and remove the expected failure tags as appropriate?

It looks like this changed fixed at least one of the XFAILed tests on Windows:

http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/9751

So now the test results would be red because of the unexpectedly passing test (if there wasn't another failure). Could you have a look at whether any of the other tests that were XFAILed for the same bug are also now passing and remove the expected failure tags as appropriate?

This is still "failing" on the windows bot. Please fix the issue or revert the change.

@stella.stamenova I've fixed it with r374888, thanks again for pointing to that. The fix of unwind can fix some stepping issues, because stepping logic uses call stack information actively.