This is an archive of the discontinued LLVM Phabricator instance.

[lldb][COFF] Match symbols from COFF symbol table to export symbols
ClosedPublic

Authored by alvinhochun on Sep 22 2022, 4:26 AM.

Details

Summary

If a symbol is the same as an export symbol, mark it as 'Additional' to
prevent the duplicated symbol from being repeated in some commands (e.g.
disas -n func). If the RVA is the same but exported with a different
name, only synchronize the symbol types.

Depends on D134265

Event Timeline

alvinhochun created this revision.Sep 22 2022, 4:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 4:26 AM
Herald added a subscriber: mgrang. · View Herald Transcript
alvinhochun published this revision for review.Sep 22 2022, 5:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 5:12 AM

This looks mostly reasonable to me, a couple discussion points only.

lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
817

As a curious question - since D134265 we get better quality for the symbol types set from the get go - what are the other cases you foresee where this will make a difference? I don't mind if there isn't any difference though, syncing the types from the symbol table which is more expressible probably is good anyway. Just wondering about the actual utility of it.

828

This condition had me puzzled for a moment, until I realized that you're synchronizing symbol type in the other direction here. Is it worth clarifying this in the comment?

(Also, I presume the test case doesn't really trigger this case?)

labath added inline comments.Sep 23 2022, 4:26 AM
lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
892

Can you have multiple symbols pointing to the same address? Make this should use stable_sort instead?

alvinhochun added inline comments.Sep 23 2022, 6:49 AM
lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
817

Given what limited info the existing implementations (LLD and GNU ld) write to the symbol table, we probably won't get any better type info from just the symbol table alone. Though I was thinking if we can use a bit of additional heuristics we can assign more specific symbol types.

For example, perhaps if we can guess that a particular code symbol may be an import thunk from the jump instruction it contains, then we can set the symbol as 'eSymbolTypeTrampoline'. It seems the breakpoint command will skip trampoline symbols, so if you, say, set a breakpoint on memcpy it will only break on the real function but not the import thunk. Not sure how feasible it is to implement though.

828

Hmm, I can try to improve the comment.

The aliasInt symbol should trigger this case. Perhaps it will be clearer if I update the previous patch to include these symbols, so the difference in output can be seen in this patch.

892

Yes, it can happen. The ordering shouldn't affect what AppendFromCOFFSymbolTable does but I suppose stable_sort does make it more deterministic to avoid future issues down the line.

Updated test based on parent and addressed review comments.

No further objections/comments from me on this!

lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
828

Oh, ok, I had missed that MapSymbolType returns code or invalid - I had expected it to return code or data. I guess this is fine then. (Changing that function, or changing the logic here to look at section types is out of scope here anyway, and I don’t know if the difference between data and invalid matters?)

alvinhochun added inline comments.Sep 23 2022, 10:00 AM
lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
828

Data vs Invalid does seem to have an effect on whether a symbol can be printed as data or have its address taken. I can't print or dump any static/globals when the type is Invalid. I think the DWARF debug info should have at least some information about these symbols, but I have no idea how that works.

I wonder if I should change MapSymbolType to return Data for anything that is not code?

mstorsjo added inline comments.Sep 23 2022, 12:13 PM
lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
828

I think that might be a good idea - but that's clearly a separate change from this one too. Thanks!

labath accepted this revision.Sep 27 2022, 8:51 AM

Seems fine to me from a general perspective. I think others have already checked the windows parts.

lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
892

Yeah, it's better to avoid having this kind of nondeterministic behavior that can differ from run to run. LLDB is not so string about it as clang/llvm, but it's still a good idea.

This revision is now accepted and ready to land.Sep 27 2022, 8:51 AM