This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
59	I think this sometimes returns true incorrectly; it seems to happen when there are multiple ChangeCodeLength annotations. Maybe I'm interpreting those incorrectly?

This looks pretty good.

Are the sources for the binary test inputs around somewhere so they could be recreated if necessary someday?
The test turns a small stack into file+line numbers. Is that sufficient or should there also be one to get function name+offsets?
Most of the formatting nits listed by lint are legit and should be fixed before committing.

I'll be curious to hear what you figure out about the ASAN tests.

[It's so weird how methods like getSymbolById return std::unique_ptrs. That's not a bug in this patch, just an unintuitive interface design that we probably inherited from the original DIA wrappers. Every time I see it, though, I end up chasing down calls to make sure it's correct.]

llvm/include/llvm/DebugInfo/PDB/Native/NativeSession.h
134	I don't see anything explicit in LLVM's style guidelines, but I think it's common to segregate the data members and methods whenever possible. So perhaps `parseSectionContribs()` should be moved up.
llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
59	I don't know much about these annotations. Can these be nested? For example if `foo` runs from f0 to f1 and the annotations also describe another subrange inside of f0..f1, then I would think `Found` would have to be a counter rather than a bool. If it's 0, you're not inside any range. If it's 2, you're inside two nested ranges. If they don't nest, then I don't see a problem.
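
The counter idea in the comment above can be sketched as follows. This is only an illustration under assumptions: `CodeRange` and `nestingDepth` are hypothetical names, not LLVM types or functions, and the ranges are assumed to be half-open `[Begin, End)`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [Begin, End); not an LLVM type.
struct CodeRange {
  uint64_t Begin;
  uint64_t End;
};

// Counts how many ranges contain Addr. A result of 0 means the address is
// outside every range; 2 means it is inside two nested or overlapping ranges.
int nestingDepth(const std::vector<CodeRange> &Ranges, uint64_t Addr) {
  int Depth = 0;
  for (const CodeRange &R : Ranges)
    if (R.Begin <= Addr && Addr < R.End)
      ++Depth;
  return Depth;
}
```

If the ranges never nest, the depth is always 0 or 1 and a bool carries the same information.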

-Add cpp source for the test binaries
-Fixed some of the lint errors (mostly adding consts), but some of the formatting ones conflict with git clang-format.

In D88988#2320393, @amccarth wrote:

The test turns a small stack into file+line numbers. Is that sufficient or should there also be one to get function name+offsets?

Doesn't it also have function names? The asan tests will also be providing some test coverage for this.

In D88988#2320393, @amccarth wrote:

I'll be curious to hear what you figure out about the ASAN tests.

I think most of them are probably windows tests that don't expect inline stack frames, but I haven't actually fixed any of them yet.

There are also some non windows-specific tests that are failing because the windows inline stack frames are apparently different from the linux ones (e.g. compiler-rt/test/asan/TestCases/calloc-overflow.cpp). Seems like the first line on the linux stack trace is a macro, and in the windows stack trace the first line is a function that was in the macro.

Ok, as far as I can tell, all of the asan tests are failing for the same reason-- the symbolizer now outputs an extra line for __sanitizer::BufferedStackTrace::Unwind.

#0 0x7ff6f9fa7e64 in __sanitizer::BufferedStackTrace::Unwind C:\src\llvm-project\compiler-rt\lib\sanitizer_common\sanitizer_stacktrace.h:124
#1 0x7ff6f9fa7e64 in malloc C:\src\llvm-project\compiler-rt\lib\asan\asan_malloc_win.cpp:98

On the failing asan tests: I assume the symbolizer is doing the correct thing (since the native and DIA implementations do the same thing), so I think I'll just update the tests to accept the extra line. I was wondering why we don't get this line on Linux, though, so I looked at where the address is in the calloc function in assembly, and it seems to be in a different part of the function on Windows and Linux.

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
59	I figured out why I was getting different results with DIA, and it turned out to be unrelated to the inline code. (I wasn't specifying PDB_SymType::Function when searching for the parent function, and I guess my implementation for findSymbolByAddress is different.)

Change findSymbolByAddress call

In D88988#2322424, @akhuang wrote:
Ok, as far as I can tell, all of the asan tests are failing for the same reason-- the symbolizer now outputs an extra line for __sanitizer::BufferedStackTrace::Unwind.
#0 0x7ff6f9fa7e64 in __sanitizer::BufferedStackTrace::Unwind C:\src\llvm-project\compiler-rt\lib\sanitizer_common\sanitizer_stacktrace.h:124
#1 0x7ff6f9fa7e64 in malloc C:\src\llvm-project\compiler-rt\lib\asan\asan_malloc_win.cpp:98

I guess I misspoke when we chatted; I think we need to avoid these extra frames. Here's the line of code that I think is executing:
https://github.com/llvm/llvm-project/blob/d784f7406911c4fb6bc559320f7f9ff134be7ff5/compiler-rt/lib/asan/asan_stack.h#L45

stack.Unwind(StackTrace::GetCurrentPc(),                     \
             GET_CURRENT_FRAME(), nullptr, fast, max_size);  \

Here's what I think might be happening: in the MS C++ ABI, argument evaluation has to be right-to-left. On Linux, it is left to right. So, on Linux, we get assembly that looks like this:

callq GetCurrentPc
movq $0, %rdi # set up other args
...
# inlined callsite for Unwind

But on Windows, that's rearranged like so:

movq $0, %rcx # set up other args
callq GetCurrentPc # set up rightmost arg last
# inlined call site for Unwind

I guess to sort it all out, the thing to do is to get the annotated assembly produced by clang-cl, and look at the assembly stream. It will have the human-readable .cv_loc directives to help us work out where to apply the fix. We could, for example, subtract one from the result of GetCurrentPc().
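
As a side illustration of the argument-order point above: the C++ standard does not fix the order in which function arguments are evaluated, so the same source can legitimately produce different call sequences on different targets. The sketch below uses hypothetical helper names (`first`, `second`, `sum`, `observeArgumentOrder`) that have nothing to do with the asan sources; it only records which argument expression ran first.

```cpp
#include <string>
#include <vector>

// Hypothetical helpers: each one logs its name when it is evaluated.
static std::vector<std::string> EvalLog;

static int first() {
  EvalLog.push_back("first");
  return 1;
}

static int second() {
  EvalLog.push_back("second");
  return 2;
}

static int sum(int A, int B) { return A + B; }

// Returns the order in which the two arguments of sum() were evaluated.
// Both orders are valid C++, so callers must not rely on either one.
std::vector<std::string> observeArgumentOrder() {
  EvalLog.clear();
  (void)sum(first(), second());
  return EvalLog;
}
```

Running this under different compilers or ABIs may report either order, which is why the position of the GetCurrentPc call relative to the inlined Unwind call site can differ between platforms.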

Hm, ok. Here's the assembly around GetCurrentPc. I haven't looked too far into it, but I guess I'll put it here for now.

On Linux:

.LBB4_9:
    callq   _ZN11__sanitizer10StackTrace12GetCurrentPcEv@PLT
.Ltmp72:
    movq    %rax, %r12
    movq    _ZN11__sanitizer21common_flags_dont_useE@GOTPCREL(%rip), %rax
    movb    34(%rax), %bl 
    callq   _ZN6__asan20GetMallocContextSizeEv@PLT
.Ltmp73:
    .loc    2 0 3 is_stmt 0                 # llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:0:3
    xorl    %ecx, %ecx
.Ltmp74:
    .loc    3 116 31 is_stmt 1              # llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_stacktrace.h:116:31

On Windows:

.LBB9_4:                                # %if.else
  #DEBUG_VALUE: calloc:size <- $r15
  #DEBUG_VALUE: calloc:nmemb <- $rdi
  .cv_loc 16 2 114 0                      # C:/src/llvm-project/compiler-rt/lib/asan/asan_malloc_win.cpp:114:0
  mov sil, byte ptr [rip + "?common_flags_dont_use@__sanitizer@@3UCommonFlags@1@A"+34]
  lea r14, [rbp - 128]
  call  "?GetCurrentPc@StackTrace@__sanitizer@@SA_KXZ"
  xor ecx, ecx
.Ltmp70:
  #DEBUG_VALUE: Unwind:this <- [DW_OP_constu 80, DW_OP_minus, DW_OP_stack_value] $rbp
  #DEBUG_VALUE: Unwind:context <- 0
  #DEBUG_VALUE: Unwind:request_fast <- [DW_OP_LLVM_convert 1 7, DW_OP_LLVM_convert 8 7, DW_OP_stack_value] undef
  #DEBUG_VALUE: Unwind:max_depth <- $ebx
  #DEBUG_VALUE: Unwind:pc <- $rax
  #DEBUG_VALUE: Unwind:bp <- $r14
  .cv_inline_site_id 19 within 16 inlined_at 2 114 0
  .cv_loc 19 3 116 0                      # C:/src/llvm-project/compiler-rt/lib\sanitizer_common/sanitizer_stacktrace.h:116:0

It looks like the first frame on Windows doesn't even point to GetCurrentPc; it points to the call to UnwindImpl.

In D88988#2343350, @akhuang wrote:

It looks like the first frame on Windows doesn't even point to GetCurrentPc; it points to the call to UnwindImpl.

Well, from reading the code, I would expect PC to be the return address of GetCurrentPc, which points to the xor %ecx, %ecx instruction, which is not in the inline call site. The .cv_inline_site_id directive happens at the next instruction. Maybe there is a bug in the way this code compares against inline site begin/end markers.

It's also possible that LLVM's own inline line annotations aren't correct. One thing you could try to debug this is to load up windbg, run an asan test, set a breakpoint on GetCurrentPc, take a stack trace from inside it, and see what windbg says. If it lists the UnwindImpl frame, then it must be a bug in LLVM's inline line table annotations, not your code.

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
70	I wonder if this should be `>` instead of `>=`. Consider the case of:
  callq somewhere
  .cv_loc ... inline location
  nop # anything
Otherwise, this seems like it should work to me.

The result of GetCurrentPc does point to the GetCurrentPc return address. If I give that address to llvm-symbolizer it correctly gets calloc as the function. (And the call stack from the VS debugger looks correct as well). So as far as I can tell, LLVM's inline line annotations are correct and the symbolizer is doing the right thing.

However the first address in the stack trace is not that address.

I looked a bit more in the debugger yesterday at asan's stack trace code, and stack traces are being created here. I guess it searches for the result of GetCurrentPc in the stack trace, but for some reason it's not there and the closest thing is the UnwindImpl return address?

Wow, that is some interesting code. :) That explains what is going on: we get the current PC, then we do a stack unwind later (inside the inline call frame), then we take the PC, and truncate the stack trace with it. The return address in the stack trace actually *is* inside the inlined call site, so this code never would've worked with inline line info.

I think this can be fixed by modifying the stack trace in place, after popping off the extra stack frames. After this line:
https://github.com/llvm/llvm-project/blob/1882568fcb08ed8af689f13826cc7e84c3c84e33/compiler-rt/lib/sanitizer_common/sanitizer_unwind_win.cpp#L39
... overwrite the first PC in the stack trace with pc. It could work. :)
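
The in-place fix suggested above can be sketched roughly like this. It is a simplified model under assumptions, not the compiler-rt code: `trimAndPinTrace` is a hypothetical name, the trace is a plain vector of addresses, and "closest frame" is modeled as the entry with the smallest absolute distance to `Pc`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the unwinder's post-processing step; none of
// these names come from compiler-rt.
void trimAndPinTrace(std::vector<uint64_t> &Trace, uint64_t Pc) {
  if (Trace.empty())
    return;

  // Find the frame whose address is closest to the PC captured earlier.
  size_t Best = 0;
  uint64_t BestDist = UINT64_MAX;
  for (size_t I = 0; I < Trace.size(); ++I) {
    uint64_t Dist = Trace[I] > Pc ? Trace[I] - Pc : Pc - Trace[I];
    if (Dist < BestDist) {
      BestDist = Dist;
      Best = I;
    }
  }

  // Pop the frames above the match (the runtime's own frames).
  Trace.erase(Trace.begin(), Trace.begin() + static_cast<std::ptrdiff_t>(Best));

  // Overwrite the first remaining entry with the captured PC, so the
  // symbolizer sees an address outside the inlined call site.
  Trace[0] = Pc;
}
```

The last step is the one proposed in the comment: without it, the first entry stays a return address inside the inlined call site and produces the extra frame.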

In D88988#2348560, @rnk wrote:

Wow, that is some interesting code. :) That explains what is going on: we get the current PC, then we do a stack unwind later (inside the inline call frame), then we take the PC, and truncate the stack trace with it. The return address in the stack trace actually *is* inside the inlined call site, so this code never would've worked with inline line info.

I think this can be fixed by modifying the stack trace in place, after popping off the extra stack frames. After this line:
https://github.com/llvm/llvm-project/blob/1882568fcb08ed8af689f13826cc7e84c3c84e33/compiler-rt/lib/sanitizer_common/sanitizer_unwind_win.cpp#L39
... overwrite the first PC in the stack trace with pc. It could work. :)

Oh yeah, I guess that would work. Anyway, seems like all the tests still pass.

akhuang mentioned this in D89996: [Asan][Windows] Fix asan stack traces on Windows. Oct 22 2020, 5:24 PM

I'd like to be able to test this code more thoroughly. Take a look at llvm/test/DebugInfo/symbolize-paths.s. Can we construct a similar test with an assembly file? What would it take to get that set up? That would allow us to write tests with really fine grained control over the .cv_loc positioning, so we could test all the edge cases for what happens when you symbolize a PC that falls in a gap in an inline call site.

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
70	Any thoughts on this?

akhuang added inline comments. Nov 4 2020, 2:26 PM

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
70	Hm, the things I've seen are something like
  func:
  .cv_inline_site_id 2 ...
  .cv_loc 2 ...
  movl ... # inlined code
  addl ... # inlined code
  .cv_loc 1
  nop # some other instructions
and the address of the `movl` would be the starting offset of the S_INLINESITE. So the `<` would be non-inclusive but not the `>=`.

rnk added inline comments. Nov 5 2020, 9:33 AM

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
70	Got it, makes sense. I think I was thinking about return addresses, where the debugger is stopped after the executing instruction, and it wants to report the line info for the previous instruction.
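
The outcome of this exchange can be summarized in a small sketch: the inline site covers a half-open range, so the start comparison is inclusive and the end comparison is not, while a return address is looked up one byte earlier so that it maps back to the call instruction. `InlineSiteRange`, `containsAddress`, and `containsReturnAddress` are hypothetical names used only for illustration; they are not part of the patch.

```cpp
#include <cstdint>

// Hypothetical description of the code covered by one inline site.
struct InlineSiteRange {
  uint64_t Start;  // address of the first inlined instruction
  uint64_t Length; // number of bytes covered
};

// Half-open test: Start is inside the site, Start + Length is not.
bool containsAddress(const InlineSiteRange &Site, uint64_t Addr) {
  return Addr >= Site.Start && Addr < Site.Start + Site.Length;
}

// A return address points just past the call instruction, so look up the
// byte before it to land inside the call itself.
bool containsReturnAddress(const InlineSiteRange &Site, uint64_t RetAddr) {
  return RetAddr != 0 && containsAddress(Site, RetAddr - 1);
}
```

With this split, a PC that is the address of the first inlined instruction is attributed to the inline site, whereas a return address equal to that same value is attributed to the call that precedes it.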

Add an assembly test for symbolizer

I added an assembly test and removed the other test in llvm/test/tools, since it pretty much does the same thing.

I'm not sure how much there really is to test in terms of edge cases; I just put in one nested inline site and some line and file changes.

akhuang added inline comments. Nov 5 2020, 4:03 PM

llvm/lib/DebugInfo/PDB/Native/SymbolCache.cpp
566	FYI, I removed this bit of code since it was buggy. It basically searches for line numbers in the next compile unit if we haven't exceeded the length. I was just trying to match DIA's behavior here, but I don't think we ever use this.

lgtm, but please address the commented out code before landing.

Sorry for the delay, I wrote all these comments, and then didn't hit send.

llvm/include/llvm/DebugInfo/PDB/Native/NativeSession.h
133	You know, we shouldn't actually need to create this intermediate data structure. The section contributions in the PDB are sorted by ascending RVA. It should be possible to binary search them directly with `std::lower_bound` without building a new interval map data structure. But, I see you are just moving this code around, so this can be a follow-up optimization.
llvm/include/llvm/DebugInfo/PDB/Native/SymbolCache.h
40	All the const/mutable changes make sense, the cache is "notionally" const. Looking things up fills the cache, but doesn't affect results.
llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp
113–114	Getting here requires doing `O(log(#inputsections))` work, which is pretty quick. This loop here is `O(#symbols in function)`. Maybe one day the lookup will need to be faster or be cached, but I don't see any obvious way to do that today.
llvm/lib/DebugInfo/PDB/Native/SymbolCache.cpp
74	Please remove the commented out part.
566	Makes sense.
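
The follow-up optimization suggested for NativeSession.h above (searching the sorted section contributions directly instead of building an interval map) could look roughly like the sketch below. It makes several assumptions: `Contribution` is a simplified, hypothetical record rather than the real PDB layout, the entries are sorted by ascending RVA and do not overlap, and it uses `std::upper_bound`, a close relative of the `std::lower_bound` mentioned in the comment, because that makes the "last entry starting at or before the address" step slightly simpler.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified section contribution; not the LLVM record layout.
struct Contribution {
  uint32_t RVA;    // start address relative to the image base
  uint32_t Size;   // number of bytes covered
  uint16_t Module; // index of the owning module
};

// Returns the module index whose contribution contains RVA, or -1 if none
// does. Assumes Contribs is sorted by ascending RVA without overlaps.
int findModule(const std::vector<Contribution> &Contribs, uint32_t RVA) {
  // Find the first contribution that starts strictly after RVA.
  auto It = std::upper_bound(
      Contribs.begin(), Contribs.end(), RVA,
      [](uint32_t Addr, const Contribution &C) { return Addr < C.RVA; });
  if (It == Contribs.begin())
    return -1; // RVA lies before the first contribution.
  --It;        // Last contribution starting at or before RVA.
  if (RVA - It->RVA < It->Size)
    return It->Module;
  return -1; // RVA falls in a gap between contributions.
}
```

The lookup stays logarithmic in the number of contributions, as with the interval map, but needs no extra data structure.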

This revision is now accepted and ready to land. Nov 16 2020, 2:37 PM

Remove comment and print statement

akhuang added inline comments. Nov 16 2020, 6:51 PM

llvm/include/llvm/DebugInfo/PDB/Native/NativeSession.h
133	Hm, yeah, I'll look at it later.

more cleanup

This revision was landed with ongoing or failed builds. Nov 17 2020, 1:19 PM

Closed by commit rGbc9803404042: [llvm-symbolizer] Add inline stack traces for Windows. (authored by akhuang).

This revision was automatically updated to reflect the committed changes.

akhuang added a commit: rGbc9803404042: [llvm-symbolizer] Add inline stack traces for Windows.

dblaikie added a subscriber: dblaikie. Nov 23 2020, 8:52 PM

dblaikie added inline comments.

lld/test/COFF/symbolizer-inline.s
3–5	After updating the dependencies (adding a dependency on llvm-symbolizer from lld's tests so this test could run, 27e73816d6f9a7e627db73c445c4329db2ecfeaf), that got me looking at this and thinking: if this is the first use of llvm-symbolizer in lld, maybe it's out of place here? Indeed, this patch made no changes to lld, so it seems unsuitable that tests be added to lld; changes to llvm should be tested from within llvm. For ELF, at least, we're leaning towards writing hand-crafted assembly, assembling that (with llvm-mc), and running llvm-symbolizer on the assembled file. If that model would work for COFF that'd be great, but otherwise it is acceptable to include source and repro steps in a file, and check in a binary file for running llvm-symbolizer over.

rnk added inline comments. Nov 24 2020, 11:43 AM

lld/test/COFF/symbolizer-inline.s
3–5	IMO it is important for test readability that we start with assembly, not a checked in binary file. It allows us to come up with creative .cv_loc transitions from one instruction to the next, and validate that we get the right source location at each instruction boundary. There is prior art for using llvm-mc to produce an object file in llvm and then using llvm-symbolizer on that object file, but it's impossible to do the same for COFF. llvm-symbolizer expects to operate on a PDB file. The only tool capable of making a PDB from an object right now is LLD. While it's unfortunate that the test lives in the wrong repo, the great increase in testability makes it worth it to me. Debug info is historically undertested or only tested via interactive debugger integration tests. I think there is a huge amount of value to this level (medium size integration?) of testing.

dblaikie added inline comments. Nov 24 2020, 12:01 PM

lld/test/COFF/symbolizer-inline.s
3–5	> IMO it is important for test readability that we start with assembly, not a checked in binary file.

I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.

> It allows us to come up with creative .cv_loc transitions from one instruction to the next, and validate that we get the right source location at each instruction boundary.

Not sure I follow here - a checked in binary, built from source including creative .cv_loc transitions, etc, could have the same test coverage, right?

> There is prior art for using llvm-mc to produce an object file in llvm and then using llvm-symbolizer on that object file,

Yeah, I still have sort of mixed feelings about that - but generally accepted the tradeoff (of increasing the size of the code underneath the test - including all of llvm-mc, rather than only the dumping/symbolizing code itself) in test maintainability is worthwhile, given the fairly limited complexity of functionality in llvm-mc that's required for most of these tests. (mostly don't care about eh_frame, so the line table generation is about the worst part of that - the rest of the DWARF is generated using basic assembly directives so the probability of bugs in llvm-mc messing with the tests is relatively low)

> but it's impossible to do the same for COFF. llvm-symbolizer expects to operate on a PDB file. The only tool capable of making a PDB from an object right now is LLD.

Yeah, I thought that might be the case hence the caveat that maybe a checked in binary would be needed.

> While it's unfortunate that the test lives in the wrong repo, the great increase in testability makes it worth it to me.

I'm not sure it's any less testable, though - generating and checking in a binary doesn't seem like a huge hurdle to writing tests here. It was done in the past & while moving to assembly based tests has some value add, I don't think the checked in binaries testing libDebugInfo have been a major burden on LLVM development/maintenance.

> Debug info is historically undertested

I don't think this test is adding significantly more/different testing than I think would be reasonably requested otherwise. An LLVM test with a checked in binary and an lld test using assembly and a debug info dump of some kind (if that isn't already covered - I'd have assumed it was already covered when the relevant functionality was added to lld in other/previous patches?).

> or only tested via interactive debugger integration tests. I think there is a huge amount of value to this level (medium size integration?) of testing.

I'm not sure I see this particular test providing a lot more value over testing the components separately. And even then, I think it's important to have the isolated testing in addition to any end-to-end testing. There's the debuginfo-tests directory for more end-to-end testing, if that's desired (not sure if/how well that fits/is usable for COFF/PDB testing, though - could be worth expanding to support if it's not already testable there). Could have a full end-to-end test there, source code down to PDBs and symbolized. (might even be able to be portable between DWARF and PDB?)

akhuang added inline comments. Nov 25 2020, 11:41 AM

lld/test/COFF/symbolizer-inline.s
3–5	> I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.

We could also add a binary for testing in the llvm-symbolizer tests, where the other tests are. I didn't consider the fact that the code is untested in LLVM. I agree that the assembly test doesn't provide any extra coverage that a checked in binary can't; I think the main upside of including the assembly test in lld is so that the test is easier to understand/modify. But maybe it's not too difficult to just include the assembly file in the llvm test inputs directory and re-build the binary whenever the test needs to be changed.

dblaikie added inline comments. Nov 25 2020, 4:18 PM

lld/test/COFF/symbolizer-inline.s
3–5	>> I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.

> We could also add a binary for testing in the llvm-symbolizer tests, where the other tests are. I didn't consider the fact that the code is untested in LLVM. I agree that the assembly test doesn't provide any extra coverage that a checked in binary can't; I think the main upside of including the assembly test in lld is so that the test is easier to understand/modify. But maybe it's not too difficult to just include the assembly file in the llvm test inputs directory and re-build the binary whenever the test needs to be changed.

Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

(I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)

Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?

rnk added inline comments. Nov 25 2020, 5:15 PM

lld/test/COFF/symbolizer-inline.s
3–5	> Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.

> (I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)

I think debuginfo-tests aren't really the right place for this. They mostly focus on testing integration with interactive debuggers. To my knowledge, we aren't running debuginfo-tests continuously on Windows yet. The setup and configuration was sufficiently difficult that it fell off the top of the stack.

> Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?

Nope, they are not COFF files. PDBs are a kind of "multistream file" (MSF). We do have the means to make PDB files from yaml, though, so we could test things that way. However, the inline line tables, the thing under test in this case, are not represented symbolically, I believe they are just strings of hex bytes.

We could make more testing tools to make PDBs, but at a certain point you're going to apply COFF relocations, and now you have yourself a second linker.

I think if we want to make llvm-symbolizer more testable from LLVM, the way to go is to:
- Generalize the debug info parsing over COFF files as well as PDBs (the contents are the same, just unrelocated). This is similar to how llvm-symbolizer works on relocatable ELF object files.
- Keep coverage of the PDB codepaths with yaml-ified PDB files. They are not good for fine grained inline line table testing, but they'd give us coverage.

jhenderson added a subscriber: Higuoxing. Nov 26 2020, 12:37 AM

jhenderson added inline comments.

lld/test/COFF/symbolizer-inline.s
3–5	>> Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

> We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.

Just chiming in that @Higuoxing spent this year's GSOC period significantly improving DWARF support in yaml2obj and to a lesser extent obj2yaml. I don't think it's covered every case yet, but it's certainly possible to write many more tests using YAML than it was before (see llvm\test\tools\yaml2obj\ELF\DWARF for a number of tests that demonstrate the behaviour that has been added), and with greater ease, as some of the information at least is now automatically derived by yaml2obj without needing to be explicitly specified in the YAML itself. Unfortunately, that doesn't help with COFF/PDB of course.

dblaikie added inline comments. Nov 30 2020, 3:09 PM

lld/test/COFF/symbolizer-inline.s
3–5	>> Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

> We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.

I think we can implement assembly extensions without necessarily mandating that we only accept ones that gas accepts - especially if they're more for our own test usage (though I think yaml2obj is somewhat nicer/keeps these internal test utilities separate - since we'd have no real intent of using these assembly directives in the wild).

>> (I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)

> I think debuginfo-tests aren't really the right place for this. They mostly focus on testing integration with interactive debuggers. To my knowledge, we aren't running debuginfo-tests continuously on Windows yet. The setup and configuration was sufficiently difficult that it fell off the top of the stack.

Perhaps - I'd be happy to widen the use case of debuginfo-tests for other things like symbolizing. And if the existing interactive debugger parts weren't Windows suitable/of interest, I'd be OK with it being possible to run those symbolizer bits without the interactive debugger bits. But would provide a space for a full clang+lld+llvm-symbolizer test cases, which currently can't be done in either the clang or lld repository, for instance.

>> Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?

> Nope, they are not COFF files. PDBs are a kind of "multistream file" (MSF). We do have the means to make PDB files from yaml, though, so we could test things that way. However, the inline line tables, the thing under test in this case, are not represented symbolically, I believe they are just strings of hex bytes.

Any chance of improving the yaml support so those can be written symbolically? (not sure where that sits compared to other things being suggested/considered)

> We could make more testing tools to make PDBs, but at a certain point you're going to apply COFF relocations, and now you have yourself a second linker.

Yeah, not sure that'd be the best bang-for-buck.

> I think if we want to make llvm-symbolizer more testable from LLVM, the way to go is to: Generalize the debug info parsing over COFF files as well as PDBs (the contents are the same, just unrelocated). This is similar to how llvm-symbolizer works on relocatable ELF object files.

Yeah, +1 for that direction from me, at least.

> Keep coverage of the PDB codepaths with yaml-ified PDB files. They are not good for fine grained inline line table testing, but they'd give us coverage.

(Taking out-of-line since the comment thread has rather diverged from the original thing)

Perhaps - I'd be happy to widen the use case of debuginfo-tests for other things like symbolizing. And if the existing interactive debugger parts weren't Windows suitable/of interest, I'd be OK with it being possible to run those symbolizer bits without the interactive debugger bits. But would provide a space for a full clang+lld+llvm-symbolizer test cases, which currently can't be done in either the clang or lld repository, for instance.

We've internally been considering a need for an "integration" lit-based testsuite that is able to test the interactions in tools that would otherwise not really fit in one or the other location. We have our own downstream non-lit based test repository, but for some tests this is too heavyweight, and it would be nice to test things more locally. Examples include showing, for example, that tools can consume debug data produced by clang (note that because defaults might change as to what version of DWARF is emitted, an llvm-mc assembly or YAML based test here wouldn't be appropriate), or similarly for relocations being handled by the downstream tools. The intent here is not to replace targeted lit tests, but rather to show that end-to-end behaviour is still sensible.

Maybe this idea could be useful upstream? It would need the ability to create a lit-based testsuite, probably as a separate top-level directory in the llvm-project tree, which can use components from all projects. We'd want REQUIRES support that could be used to say REQUIRES: clang or similar so that users could still run the subset of tests that work with the projects they build.

Yeah - I'd probably first suggest that such tests could go into debuginfo-tests, though if it's meant to extend to things other than debug info - yeah, another top level repo might be suitable. Maybe even the test-suite itself, though that's a bit of a dark place and maybe a distinct place with narrower, pure lit based testing would be beneficial.

Integration-style testing for llvm-symbolizer (with both clang and lld allowed) sounds good to me. The current llvm-symbolizer tests for ELF mostly focus on relocatable objects. There are very few tests that target shared objects or executables.

I've made a note on our internal issue tracker item for our integration suite idea, to keep an eye on this. I don't know if or when we'll get around to it, so if somebody else wants to get the ball rolling, just let me know what people have done and I can update our tracker accordingly, to make sure we don't duplicate effort. I guess the first stage would be to propose an RFC.

A couple of notes in case anybody picks this up: 1) @probinson mentioned on our internal issue tracker that there was an end-to-end round table discussion that touched on this topic at the 2019 developers meeting, so there's probably some wider interest outside those involved in this discussion. 2) I think it needs to be integrated with the check-all target, so that people and bots will run it properly.
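
As a rough illustration of the `REQUIRES: clang` idea discussed above, here is a minimal sketch of how such a suite's lit configuration might decide which features to advertise. The helper name `detect_features` and the choice to probe for tools with `shutil.which` are assumptions made for this example; nothing in this review specifies how the suite would actually find its tools.

```python
import shutil


def detect_features(tools, path=None):
    """Return the names in `tools` that resolve to an executable.

    `path` is an optional search path string (same format as $PATH);
    when it is None, the process environment's PATH is used.
    """
    return {tool for tool in tools if shutil.which(tool, path=path)}


# Hypothetical use inside a lit.cfg.py, where lit supplies `config`:
#
#   config.available_features.update(
#       detect_features(["clang", "lld-link", "llvm-symbolizer"]))
#
# A test could then start with "REQUIRES: clang, lld-link" and would be
# reported as unsupported when one of those tools was not built.
```

Gating on the tools that are actually present would let users run whichever subset of the tests matches the projects they build, which is the behaviour asked for above.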

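Revision Contents

Path	Size
compiler-rt/test/asan/TestCases/suppressions-function.cpp	3 lines
lld/test/COFF/symbolizer-inline.s	302 lines
llvm/include/llvm/DebugInfo/PDB/Native/NativeEnumSymbols.h	41 lines
llvm/include/llvm/DebugInfo/PDB/Native/NativeFunctionSymbol.h	5 lines
llvm/include/llvm/DebugInfo/PDB/Native/NativeInlineSiteSymbol.h	46 lines
llvm/include/llvm/DebugInfo/PDB/Native/NativeSession.h	11 lines
llvm/include/llvm/DebugInfo/PDB/Native/SymbolCache.h	48 lines
llvm/include/llvm/DebugInfo/PDB/PDBSymbol.h	7 lines
llvm/lib/DebugInfo/PDB/CMakeLists.txt	2 lines
llvm/lib/DebugInfo/PDB/Native/NativeEnumSymbols.cpp	41 lines
llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp	98 lines
llvm/lib/DebugInfo/PDB/Native/NativeInlineSiteSymbol.cpp	177 lines
(path not preserved)	77 lines
(path not preserved)	131 lines
(path not preserved)	39 lines
(path not preserved)	17 lines
llvm/utils/gn/secondary/llvm/lib/DebugInfo/PDB/BUILD.gn	2 lines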

Diff 305892

compiler-rt/test/asan/TestCases/suppressions-function.cpp

	// Check that without suppressions, we catch the issue.
	// RUN: %clangxx_asan -O0 %s -o %t
	// RUN: not %run %t 2>&1 | FileCheck --check-prefix=CHECK-CRASH %s

	// RUN: echo "interceptor_via_fun:crash_function" > %t.supp
	// RUN: %clangxx_asan -O0 %s -o %t && %env_asan_opts=suppressions='"%t.supp"' %run %t 2>&1 | FileCheck --check-prefix=CHECK-IGNORE %s
	// RUN: %clangxx_asan -O3 %s -o %t && %env_asan_opts=suppressions='"%t.supp"' %run %t 2>&1 | FileCheck --check-prefix=CHECK-IGNORE %s

-	// FIXME: Windows symbolizer needs work to make this pass.
-	// XFAIL: android,windows-msvc
+	// XFAIL: android
	// UNSUPPORTED: ios

	// FIXME: atos does not work for inlined functions, yet llvm-symbolizer
	// does not always work with debug info on Darwin.
	// UNSUPPORTED: darwin

	#include <stdio.h>
	#include <stdlib.h>

lld/test/COFF/symbolizer-inline.s

This file was added.

				# REQUIRES: x86
				# RUN: llvm-mc -filetype=obj %s -o %t.obj -triple x86_64-windows-msvc
				# RUN: lld-link -entry:main -nodefaultlib %t.obj -out:%t.exe -pdb:%t.pdb -debug
				# RUN: llvm-symbolizer --obj=%t.exe --use-native-pdb-reader --relative-address \
				# RUN: 0x1014 0x1018 0x101c 0x1023 0x1024 | FileCheck %s

dblaikie (inline comment):
After updating the dependencies (adding a dependency on llvm-symbolizer to lld's tests so this test could run, 27e73816d6f9a7e627db73c445c4329db2ecfeaf), I got to looking at this and thinking: if this is the first use of llvm-symbolizer in lld, maybe it's out of place here?

Indeed this patch made no changes to lld, so it seems unsuitable that tests be added to lld - changes to llvm should be tested from within llvm.

For ELF, at least, we're leaning towards writing hand-crafted assembly, then assembling that (with llvm-mc) and running llvm-symbolizer on the assembled file. If that model would work for COFF that'd be great - but otherwise it is acceptable to include source and repro steps in a file, and check in a binary file for running llvm-symbolizer over.

rnk (inline comment):
IMO it is important for test readability that we start with assembly, not a checked in binary file. It allows us to come up with creative .cv_loc transitions from one instruction to the next, and validate that we get the right source location at each instruction boundary.

There is prior art for using llvm-mc to produce an object file in llvm and then using llvm-symbolizer on that object file, but it's impossible to do the same for COFF. llvm-symbolizer expects to operate on a PDB file. The only tool capable of making a PDB from an object right now is LLD.

While it's unfortunate that the test lives in the wrong repo, the great increase in testability makes it worth it to me. Debug info is historically undertested or only tested via interactive debugger integration tests. I think there is a huge amount of value to this level (medium size integration?) of testing.

dblaikie (inline comment):
> IMO it is important for test readability that we start with assembly, not a checked in binary file.

I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.

> It allows us to come up with creative .cv_loc transitions from one instruction to the next, and validate that we get the right source location at each instruction boundary.

Not sure I follow here - a checked in binary, built from source including creative .cv_loc transitions, etc, could have the same test coverage, right?

> There is prior art for using llvm-mc to produce an object file in llvm and then using llvm-symbolizer on that object file,

Yeah, I still have sort of mixed feelings about that - but generally accepted the tradeoff (of increasing the size of the code underneath the test - including all of llvm-mc, rather than only the dumping/symbolizing code itself) in test maintainability is worthwhile, given the fairly limited complexity of functionality in llvm-mc that's required for most of these tests. (mostly don't care about eh_frame, so the line table generation is about the worst part of that - the rest of the DWARF is generated using basic assembly directives so the probability of bugs in llvm-mc messing with the tests is relatively low)

> but it's impossible to do the same for COFF. llvm-symbolizer expects to operate on a PDB file. The only tool capable of making a PDB from an object right now is LLD.

Yeah, I thought that might be the case hence the caveat that maybe a checked in binary would be needed.

> While it's unfortunate that the test lives in the wrong repo, the great increase in testability makes it worth it to me.

I'm not sure it's any less testable, though - generating and checking in a binary doesn't seem like a huge hurdle to writing tests here. It was done in the past & while moving to assembly based tests has some value add, I don't think the checked in binaries testing libDebugInfo have been a major burden on LLVM development/maintenance.

> Debug info is historically undertested

I don't think this test is adding significantly more/different testing than I think would be reasonably requested otherwise. An LLVM test with a checked in binary and an lld test using assembly and a debug info dump of some kind (if that isn't already covered - I'd have assumed it was already covered when the relevant functionality was added to lld in other/previous patches?).

> or only tested via interactive debugger integration tests. I think there is a huge amount of value to this level (medium size integration?) of testing.

I'm not sure I see this particular test providing a lot more value over testing the components separately. And even then, I think it's important to have the isolated testing in addition to any end-to-end testing. There's the debuginfo-tests directory for more end-to-end testing, if that's desired (not sure if/how well that fits/is usable for COFF/PDB testing, though - could be worth expanding to support if it's not already testable there). Could have a full end-to-end test there, source code down to PDBs and symbolized. (might even be able to be portable between DWARF and PDB?)

akhuang (author, inline comment):
> I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.

We could also add a binary for testing in the llvm-symbolizer tests, where the other tests are. I didn't consider the fact that the code is untested in LLVM.

I agree that the assembly test doesn't provide any extra coverage that a checked in binary can't; I think the main upside of including the assembly test in lld is so that the test is easier to understand/modify. But maybe it's not too difficult to just include the assembly file in the llvm test inputs directory and re-build the binary whenever the test needs to be changed.

dblaikie (inline comment):
>> I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.

> We could also add a binary for testing in the llvm-symbolizer tests, where the other tests are. I didn't consider the fact that the code is untested in LLVM.
>
> I agree that the assembly test doesn't provide any extra coverage that a checked in binary can't; I think the main upside of including the assembly test in lld is so that the test is easier to understand/modify. But maybe it's not too difficult to just include the assembly file in the llvm test inputs directory and re-build the binary whenever the test needs to be changed.

Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

(I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)

Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?

rnk (inline comment):
> Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.

> (I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)

I think debuginfo-tests aren't really the right place for this. They mostly focus on testing integration with interactive debuggers. To my knowledge, we aren't running debuginfo-tests continuously on Windows yet. The setup and configuration was sufficiently difficult that it fell off the top of the stack.

> Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?

Nope, they are not COFF files. PDBs are a kind of "multistream file" (MSF). We do have the means to make PDB files from yaml, though, so we could test things that way. However, the inline line tables, the thing under test in this case, are not represented symbolically, I believe they are just strings of hex bytes.

We could make more testing tools to make PDBs, but at a certain point you're going to apply COFF relocations, and now you have yourself a second linker.

I think if we want to make llvm-symbolizer more testable from LLVM, the way to go is to:
- Generalize the debug info parsing over COFF files as well as PDBs (the contents are the same, just unrelocated). This is similar to how llvm-symbolizer works on relocatable ELF object files.
- Keep coverage of the PDB codepaths with yaml-ified PDB files. They are not good for fine grained inline line table testing, but they'd give us coverage.

jhenderson (inline comment):
>> Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

> We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.

Just chiming in that @Higuoxing spent this year's GSOC period significantly improving DWARF support in yaml2obj and to a lesser extent obj2yaml. I don't think it's covered every case yet, but it's certainly possible to write many more tests using YAML than it was before (see llvm\test\tools\yaml2obj\ELF\DWARF for a number of tests that demonstrate the behaviour that has been added), and with greater ease, as some of the information at least is now automatically derived by yaml2obj without needing to be explicitly specified in the YAML itself.

Unfortunately, that doesn't help with COFF/PDB of course.

dblaikie (inline comment):
>> Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.

> We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.

I think we can implement assembly extensions without necessarily mandating that we only accept ones that gas accepts - especially if they're more for our own test usage (though I think yaml2obj is somewhat nicer/keeps these internal test utilities separate - since we'd have no real intent of using these assembly directives in the wild).

>> (I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)

> I think debuginfo-tests aren't really the right place for this. They mostly focus on testing integration with interactive debuggers. To my knowledge, we aren't running debuginfo-tests continuously on Windows yet. The setup and configuration was sufficiently difficult that it fell off the top of the stack.

Perhaps - I'd be happy to widen the use case of debuginfo-tests for other things like symbolizing. And if the existing interactive debugger parts weren't Windows suitable/of interest, I'd be OK with it being possible to run those symbolizer bits without the interactive debugger bits. But it would provide a space for full clang+lld+llvm-symbolizer test cases, which currently can't be done in either the clang or lld repository, for instance.

>> Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?

> Nope, they are not COFF files. PDBs are a kind of "multistream file" (MSF). We do have the means to make PDB files from yaml, though, so we could test things that way. However, the inline line tables, the thing under test in this case, are not represented symbolically, I believe they are just strings of hex bytes.

Any chance of improving the yaml support so those can be written symbolically? (not sure where that sits compared to other things being suggested/considered)

> We could make more testing tools to make PDBs, but at a certain point you're going to apply COFF relocations, and now you have yourself a second linker.

Yeah, not sure that'd be the best bang-for-buck.

> I think if we want to make llvm-symbolizer more testable from LLVM, the way to go is to:
> - Generalize the debug info parsing over COFF files as well as PDBs (the contents are the same, just unrelocated). This is similar to how llvm-symbolizer works on relocatable ELF object files.

Yeah, +1 for that direction from me, at least.

> - Keep coverage of the PDB codepaths with yaml-ified PDB files. They are not good for fine grained inline line table testing, but they'd give us coverage.
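
For readers unfamiliar with the "state machine opcodes" mentioned in the comments above, the following is a minimal sketch of how the byte string inside an S_INLINESITE record might be split into (opcode, operand) pairs. It is written from memory of the CodeView binary annotation encoding, not taken from this patch: the opcode numbers, the compressed-integer layout, and the sign convention are all assumptions that should be checked against LLVM's CodeView headers. Only four single-operand opcodes are handled, and the sketch does not interpret the annotations into address ranges or line numbers.

```python
# Assumed opcode numbering (believed to match LLVM's BinaryAnnotationsOpCode).
OPCODES = {
    3: "ChangeCodeOffset",
    4: "ChangeCodeLength",
    5: "ChangeFile",
    6: "ChangeLineOffset",
}
SIGNED_OPERANDS = {"ChangeLineOffset"}


def read_compressed(data, pos):
    """Read one compressed unsigned integer; return (value, next_pos)."""
    b0 = data[pos]
    if (b0 & 0x80) == 0x00:  # 1-byte form: 0xxxxxxx
        return b0, pos + 1
    if (b0 & 0xC0) == 0x80:  # 2-byte form: 10xxxxxx + 1 byte
        return ((b0 & 0x3F) << 8) | data[pos + 1], pos + 2
    if (b0 & 0xE0) == 0xC0:  # 4-byte form: 110xxxxx + 3 bytes
        value = ((b0 & 0x1F) << 24) | (data[pos + 1] << 16)
        value |= (data[pos + 2] << 8) | data[pos + 3]
        return value, pos + 4
    raise ValueError("invalid compressed integer")


def decode_signed(operand):
    """Assumed sign convention: low bit is the sign, the rest the magnitude."""
    return -(operand >> 1) if operand & 1 else operand >> 1


def decode(data):
    """Split an annotation byte string into (opcode name, operand) pairs."""
    out, pos = [], 0
    while pos < len(data):
        op, pos = read_compressed(data, pos)
        if op == 0:  # zero bytes are treated as trailing padding here
            break
        name = OPCODES.get(op)
        if name is None:
            raise ValueError("opcode %d not handled by this sketch" % op)
        operand, pos = read_compressed(data, pos)
        if name in SIGNED_OPERANDS:
            operand = decode_signed(operand)
        out.append((name, operand))
    return out


# Hand-written input: line +1, code offset +4, code length 3, then padding.
print(decode(bytes([6, 2, 3, 4, 4, 3, 0, 0])))
```

Under these assumptions the example prints `[('ChangeLineOffset', 1), ('ChangeCodeOffset', 4), ('ChangeCodeLength', 3)]`. Seeing the stream in this form makes it easier to appreciate why writing these bytes by hand in a YAML PDB is unattractive compared with letting `.cv_inline_linetable` generate them.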

				# Compiled from this cpp code, with modifications to add extra inline line and
				# file changes:
				# clang -cc1 -triple x86_64-windows-msvc -gcodeview -S test.cpp
				#
				# __attribute__((always_inline)) int inlinee_2(int x) {
				# return x + 1;
				# }
				# __attribute__((always_inline)) int inlinee_1(int x) {
				# return inlinee_2(x) + 1;
				# }
				# int main() {
				# return inlinee_1(33);
				# }


				# CHECK: inlinee_1
				# CHECK-NEXT: C:\src\test.cpp:9:0
				# CHECK-NEXT: main
				# CHECK-NEXT: C:\src\test.cpp:13:10

				# CHECK: inlinee_1
				# CHECK-NEXT: C:\src\test.cpp:10:0
				# CHECK-NEXT: main
				# CHECK-NEXT: C:\src\test.cpp:13:10

				# CHECK: inlinee_2
				# CHECK-NEXT: C:\src\test.cpp:5:0
				# CHECK-NEXT: inlinee_1
				# CHECK-NEXT: C:\src\test.cpp:9:0
				# CHECK-NEXT: main
				# CHECK-NEXT: C:\src\test.cpp:13:10

				# CHECK: inlinee_2
				# CHECK-NEXT: C:\src\file.cpp:5:0
				# CHECK-NEXT: inlinee_1
				# CHECK-NEXT: C:\src\test.cpp:9:0
				# CHECK-NEXT: main
				# CHECK-NEXT: C:\src\test.cpp:13:10

				# CHECK: inlinee_1
				# CHECK-NEXT: C:\src\test.cpp:9:0
				# CHECK-NEXT: main
				# CHECK-NEXT: C:\src\test.cpp:13:10

				.text
				.def @feat.00;
				.scl 3;
				.type 0;
				.endef
				.globl @feat.00
				.set @feat.00, 0
				.file "test.cpp"
				.def main;
				.scl 2;
				.type 32;
				.endef
				.globl main # -- Begin function main
				.p2align 4, 0x90
				main: # @main
				.Lfunc_begin0:
				.cv_func_id 0
				.cv_file 1 "C:\\src\\test.cpp" "4BECA437CFE062C7D0B74B1851B65988" 1
				.cv_file 2 "C:\\src\\file.cpp" "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF" 1
				.cv_loc 0 1 12 0 # test.cpp:12:0
				# %bb.0: # %entry
				subq $16, %rsp
				movl $0, 4(%rsp)
				movl $33, 8(%rsp)
				.Ltmp0:
				.cv_inline_site_id 1 within 0 inlined_at 1 13 10
				.cv_loc 1 1 9 20 # test.cpp:9:20
				movl 8(%rsp), %eax
				.cv_loc 1 1 10 0 # test.cpp:10:0
				movl %eax, 12(%rsp)
				.Ltmp1:
				.cv_inline_site_id 2 within 1 inlined_at 1 9 10
				.cv_loc 2 1 5 10 # test.cpp:5:10
				movl 12(%rsp), %eax
				.cv_loc 2 1 5 12 # test.cpp:5:12
				addl $1, %eax
				.cv_loc 2 2 5 13 # file.cpp:5:13
				nop
				.Ltmp2:
				.cv_loc 1 1 9 23 # test.cpp:9:23
				addl $1, %eax
				.Ltmp3:
				.cv_loc 0 1 13 3 # test.cpp:13:3
				addq $16, %rsp
				retq
				.Ltmp4:
				.Lfunc_end0:
				# -- End function
				.section .debug$S,"dr"
				.p2align 2
				.long 4 # Debug section magic
				.long 241
				.long .Ltmp6-.Ltmp5 # Subsection size
				.Ltmp5:
				.short .Ltmp8-.Ltmp7 # Record length
				.Ltmp7:
				.short 4412 # Record kind: S_COMPILE3
				.long 1 # Flags and language
				.short 208 # CPUType
				.short 12 # Frontend version
				.short 0
				.short 0
				.short 0
				.short 12000 # Backend version
				.short 0
				.short 0
				.short 0
				.asciz "clang version 12.0.0 (https://github.com/llvm/llvm-project.git 6a4850e9c1cc74cc67f99f1f81a8fe060a7088d2)" # Null-terminated compiler version string
				.p2align 2
				.Ltmp8:
				.Ltmp6:
				.p2align 2
				.long 246 # Inlinee lines subsection
				.long .Ltmp10-.Ltmp9 # Subsection size
				.Ltmp9:
				.long 0 # Inlinee lines signature

				# Inlined function inlinee_1 starts at test.cpp:8
				.long 4098 # Type index of inlined function
				.cv_filechecksumoffset 1 # Offset into filechecksum table
				.long 8 # Starting line number

				# Inlined function inlinee_2 starts at test.cpp:4
				.long 4099 # Type index of inlined function
				.cv_filechecksumoffset 1 # Offset into filechecksum table
				.long 4 # Starting line number
				.Ltmp10:
				.p2align 2
				.long 241 # Symbol subsection for main
				.long .Ltmp12-.Ltmp11 # Subsection size
				.Ltmp11:
				.short .Ltmp14-.Ltmp13 # Record length
				.Ltmp13:
				.short 4423 # Record kind: S_GPROC32_ID
				.long 0 # PtrParent
				.long 0 # PtrEnd
				.long 0 # PtrNext
				.long .Lfunc_end0-main # Code size
				.long 0 # Offset after prologue
				.long 0 # Offset before epilogue
				.long 4102 # Function type index
				.secrel32 main # Function section relative address
				.secidx main # Function section index
				.byte 0 # Flags
				.asciz "main" # Function name
				.p2align 2
				.Ltmp14:
				.short .Ltmp16-.Ltmp15 # Record length
				.Ltmp15:
				.short 4114 # Record kind: S_FRAMEPROC
				.long 16 # FrameSize
				.long 0 # Padding
				.long 0 # Offset of padding
				.long 0 # Bytes of callee saved registers
				.long 0 # Exception handler offset
				.short 0 # Exception handler section
				.long 81920 # Flags (defines frame register)
				.p2align 2
				.Ltmp16:
				.short .Ltmp18-.Ltmp17 # Record length
				.Ltmp17:
				.short 4429 # Record kind: S_INLINESITE
				.long 0 # PtrParent
				.long 0 # PtrEnd
				.long 4098 # Inlinee type index
				.cv_inline_linetable 1 1 8 .Lfunc_begin0 .Lfunc_end0
				.p2align 2
				.Ltmp18:
				.short .Ltmp20-.Ltmp19 # Record length
				.Ltmp19:
				.short 4414 # Record kind: S_LOCAL
				.long 116 # TypeIndex
				.short 1 # Flags
				.asciz "x"
				.p2align 2
				.Ltmp20:
				.cv_def_range .Ltmp0 .Ltmp3, frame_ptr_rel, 8
				.short .Ltmp22-.Ltmp21 # Record length
				.Ltmp21:
				.short 4429 # Record kind: S_INLINESITE
				.long 0 # PtrParent
				.long 0 # PtrEnd
				.long 4099 # Inlinee type index
				.cv_inline_linetable 2 1 4 .Lfunc_begin0 .Lfunc_end0
				.p2align 2
				.Ltmp22:
				.short .Ltmp24-.Ltmp23 # Record length
				.Ltmp23:
				.short 4414 # Record kind: S_LOCAL
				.long 116 # TypeIndex
				.short 1 # Flags
				.asciz "x"
				.p2align 2
				.Ltmp24:
				.cv_def_range .Ltmp1 .Ltmp2, frame_ptr_rel, 12
				.short 2 # Record length
				.short 4430 # Record kind: S_INLINESITE_END
				.short 2 # Record length
				.short 4430 # Record kind: S_INLINESITE_END
				.short 2 # Record length
				.short 4431 # Record kind: S_PROC_ID_END
				.Ltmp12:
				.p2align 2
				.cv_linetable 0, main, .Lfunc_end0
				.cv_filechecksums # File index to string table offset subsection
				.cv_stringtable # String table
				.long 241
				.long .Ltmp26-.Ltmp25 # Subsection size
				.Ltmp25:
				.short .Ltmp28-.Ltmp27 # Record length
				.Ltmp27:
				.short 4428 # Record kind: S_BUILDINFO
				.long 4105 # LF_BUILDINFO index
				.p2align 2
				.Ltmp28:
				.Ltmp26:
				.p2align 2
				.section .debug$T,"dr"
				.p2align 2
				.long 4 # Debug section magic
				# ArgList (0x1000)
				.short 0xa # Record length
				.short 0x1201 # Record kind: LF_ARGLIST
				.long 0x1 # NumArgs
				.long 0x74 # Argument: int
				# Procedure (0x1001)
				.short 0xe # Record length
				.short 0x1008 # Record kind: LF_PROCEDURE
				.long 0x74 # ReturnType: int
				.byte 0x0 # CallingConvention: NearC
				.byte 0x0 # FunctionOptions
				.short 0x1 # NumParameters
				.long 0x1000 # ArgListType: (int)
				# FuncId (0x1002)
				.short 0x16 # Record length
				.short 0x1601 # Record kind: LF_FUNC_ID
				.long 0x0 # ParentScope
				.long 0x1001 # FunctionType: int (int)
				.asciz "inlinee_1" # Name
				.byte 242
				.byte 241
				# FuncId (0x1003)
				.short 0x16 # Record length
				.short 0x1601 # Record kind: LF_FUNC_ID
				.long 0x0 # ParentScope
				.long 0x1001 # FunctionType: int (int)
				.asciz "inlinee_2" # Name
				.byte 242
				.byte 241
				# ArgList (0x1004)
				.short 0x6 # Record length
				.short 0x1201 # Record kind: LF_ARGLIST
				.long 0x0 # NumArgs
				# Procedure (0x1005)
				.short 0xe # Record length
				.short 0x1008 # Record kind: LF_PROCEDURE
				.long 0x74 # ReturnType: int
				.byte 0x0 # CallingConvention: NearC
				.byte 0x0 # FunctionOptions
				.short 0x0 # NumParameters
				.long 0x1004 # ArgListType: ()
				# FuncId (0x1006)
				.short 0x12 # Record length
				.short 0x1601 # Record kind: LF_FUNC_ID
				.long 0x0 # ParentScope
				.long 0x1005 # FunctionType: int ()
				.asciz "main" # Name
				.byte 243
				.byte 242
				.byte 241
				# StringId (0x1007)
				.short 0xe # Record length
				.short 0x1605 # Record kind: LF_STRING_ID
				.long 0x0 # Id
				.asciz "C:\\src" # StringData
				.byte 241
				# StringId (0x1008)
				.short 0xe # Record length
				.short 0x1605 # Record kind: LF_STRING_ID
				.long 0x0 # Id
				.asciz "<stdin>" # StringData
				# BuildInfo (0x1009)
				.short 0x1a # Record length
				.short 0x1603 # Record kind: LF_BUILDINFO
				.short 0x5 # NumArgs
				.long 0x1007 # Argument: C:\src
				.long 0x0 # Argument
				.long 0x1008 # Argument: <stdin>
				.long 0x0 # Argument
				.long 0x0 # Argument
				.byte 242
				.byte 241
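
The "Record length" values in the `.debug$T` listing above can be checked by hand. The sketch below is a hypothetical helper, not part of LLVM. It assumes the length field counts everything after itself and that each record is padded so the next one starts on a 4-byte boundary, which is consistent with the trailing `.byte 241`..`.byte 243` padding bytes in the listing.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not an LLVM API): value expected in the
// "Record length" field of an LF_FUNC_ID record with the given name.
// Assumption: the 2-byte length field itself is not counted.
inline uint16_t funcIdRecordLength(const std::string &Name) {
  uint32_t Payload = 2                                        // record kind
                     + 4                                      // ParentScope
                     + 4                                      // FunctionType
                     + static_cast<uint32_t>(Name.size()) + 1; // name + NUL
  // Round (length field + payload) up to a multiple of 4 bytes.
  uint32_t Total = (2 + Payload + 3) & ~3u;
  return static_cast<uint16_t>(Total - 2);
}
```

Under these assumptions, `"inlinee_1"` gives 0x16 with two padding bytes and `"main"` gives 0x12 with three, matching the records above.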

llvm/include/llvm/DebugInfo/PDB/Native/NativeEnumSymbols.h

This file was added.

				//==- NativeEnumSymbols.h - Native Symbols Enumerator impl -------- C++ --==//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_DEBUGINFO_PDB_NATIVE_NATIVEENUMSYMBOLS_H
				#define LLVM_DEBUGINFO_PDB_NATIVE_NATIVEENUMSYMBOLS_H

				#include "llvm/DebugInfo/CodeView/TypeRecord.h"
				#include "llvm/DebugInfo/PDB/IPDBEnumChildren.h"
				#include "llvm/DebugInfo/PDB/PDBSymbol.h"

				#include <vector>

				namespace llvm {
				namespace pdb {

				class NativeSession;

				class NativeEnumSymbols : public IPDBEnumChildren<PDBSymbol> {
				public:
				NativeEnumSymbols(NativeSession &Session, std::vector<SymIndexId> Symbols);

				uint32_t getChildCount() const override;
				std::unique_ptr<PDBSymbol> getChildAtIndex(uint32_t Index) const override;
				std::unique_ptr<PDBSymbol> getNext() override;
				void reset() override;

				private:
				std::vector<SymIndexId> Symbols;
				uint32_t Index;
				NativeSession &Session;
				};

				} // namespace pdb
				} // namespace llvm

				#endif

llvm/include/llvm/DebugInfo/PDB/Native/NativeFunctionSymbol.h

	[14 lines not shown]
	#include "llvm/DebugInfo/PDB/Native/NativeSession.h"			#include "llvm/DebugInfo/PDB/Native/NativeSession.h"

	namespace llvm {			namespace llvm {
	namespace pdb {			namespace pdb {

	class NativeFunctionSymbol : public NativeRawSymbol {			class NativeFunctionSymbol : public NativeRawSymbol {
	public:			public:
	NativeFunctionSymbol(NativeSession &Session, SymIndexId Id,			NativeFunctionSymbol(NativeSession &Session, SymIndexId Id,
	const codeview::ProcSym &Sym);			const codeview::ProcSym &Sym, uint32_t RecordOffset);

	~NativeFunctionSymbol() override;			~NativeFunctionSymbol() override;

	void dump(raw_ostream &OS, int Indent, PdbSymbolIdField ShowIdFields,			void dump(raw_ostream &OS, int Indent, PdbSymbolIdField ShowIdFields,
	PdbSymbolIdField RecurseIdFields) const override;			PdbSymbolIdField RecurseIdFields) const override;

	uint32_t getAddressOffset() const override;			uint32_t getAddressOffset() const override;
	uint32_t getAddressSection() const override;			uint32_t getAddressSection() const override;
	std::string getName() const override;			std::string getName() const override;
	uint64_t getLength() const override;			uint64_t getLength() const override;
	uint32_t getRelativeVirtualAddress() const override;			uint32_t getRelativeVirtualAddress() const override;
	uint64_t getVirtualAddress() const override;			uint64_t getVirtualAddress() const override;
				std::unique_ptr<IPDBEnumSymbols>
				findInlineFramesByVA(uint64_t VA) const override;

	protected:			protected:
	const codeview::ProcSym Sym;			const codeview::ProcSym Sym;
				uint32_t RecordOffset = 0;
	};			};

	} // namespace pdb			} // namespace pdb
	} // namespace llvm			} // namespace llvm

	#endif // LLVM_DEBUGINFO_PDB_NATIVE_NATIVEFUNCTIONSYMBOL_H			#endif // LLVM_DEBUGINFO_PDB_NATIVE_NATIVEFUNCTIONSYMBOL_H

llvm/include/llvm/DebugInfo/PDB/Native/NativeInlineSiteSymbol.h

This file was added.

				//===- NativeInlineSiteSymbol.h - info about inline sites -------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_DEBUGINFO_PDB_NATIVE_NATIVEINLINESITESYMBOL_H
				#define LLVM_DEBUGINFO_PDB_NATIVE_NATIVEINLINESITESYMBOL_H

				#include "llvm/DebugInfo/CodeView/CodeView.h"
				#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
				#include "llvm/DebugInfo/PDB/Native/NativeRawSymbol.h"
				#include "llvm/DebugInfo/PDB/Native/NativeSession.h"

				namespace llvm {
				namespace pdb {

				class NativeInlineSiteSymbol : public NativeRawSymbol {
				public:
				NativeInlineSiteSymbol(NativeSession &Session, SymIndexId Id,
				const codeview::InlineSiteSym &Sym,
				uint64_t ParentAddr);

				~NativeInlineSiteSymbol() override;

				void dump(raw_ostream &OS, int Indent, PdbSymbolIdField ShowIdFields,
				PdbSymbolIdField RecurseIdFields) const override;

				std::string getName() const override;
				std::unique_ptr<IPDBEnumLineNumbers>
				findInlineeLinesByVA(uint64_t VA, uint32_t Length) const override;

				private:
				const codeview::InlineSiteSym Sym;
				uint64_t ParentAddr;

				void getLineOffset(uint32_t OffsetInFunc, uint32_t &LineOffset,
				uint32_t &FileOffset) const;
				};

				} // namespace pdb
				} // namespace llvm

				#endif // LLVM_DEBUGINFO_PDB_NATIVE_NATIVEINLINESITESYMBOL_H

llvm/include/llvm/DebugInfo/PDB/Native/NativeSession.h

[104 lines not shown]	public:
PDBFile &getPDBFile() { return *Pdb; }		PDBFile &getPDBFile() { return *Pdb; }
const PDBFile &getPDBFile() const { return *Pdb; }		const PDBFile &getPDBFile() const { return *Pdb; }

NativeExeSymbol &getNativeGlobalScope() const;		NativeExeSymbol &getNativeGlobalScope() const;
SymbolCache &getSymbolCache() { return Cache; }		SymbolCache &getSymbolCache() { return Cache; }
const SymbolCache &getSymbolCache() const { return Cache; }		const SymbolCache &getSymbolCache() const { return Cache; }
uint32_t getRVAFromSectOffset(uint32_t Section, uint32_t Offset) const;		uint32_t getRVAFromSectOffset(uint32_t Section, uint32_t Offset) const;
uint64_t getVAFromSectOffset(uint32_t Section, uint32_t Offset) const;		uint64_t getVAFromSectOffset(uint32_t Section, uint32_t Offset) const;
		bool moduleIndexForVA(uint64_t VA, uint16_t &ModuleIndex) const;
		bool moduleIndexForSectOffset(uint32_t Sect, uint32_t Offset,
		uint16_t &ModuleIndex) const;
		Expected<ModuleDebugStreamRef> getModuleDebugStream(uint32_t Index) const;

private:		private:
void initializeExeSymbol();		void initializeExeSymbol();
		void parseSectionContribs();

std::unique_ptr<PDBFile> Pdb;		std::unique_ptr<PDBFile> Pdb;
std::unique_ptr<BumpPtrAllocator> Allocator;		std::unique_ptr<BumpPtrAllocator> Allocator;

SymbolCache Cache;		SymbolCache Cache;
SymIndexId ExeSymbol = 0;		SymIndexId ExeSymbol = 0;
uint64_t LoadAddress = 0;		uint64_t LoadAddress = 0;

		/// Map from virtual address to module index.
		using IMap =
		IntervalMap<uint64_t, uint16_t, 8, IntervalMapHalfOpenInfo<uint64_t>>;
		IMap::Allocator IMapAllocator;
		IMap AddrToModuleIndex;
		rnk: You know, we shouldn't actually need to create this intermediate data structure. The section contributions in the PDB are sorted by ascending RVA. It should be possible to binary search them directly with `std::lower_bound` without building a new interval map data structure. But, I see you are just moving this code around, so this can be a follow-up optimization.
		akhuang (author): Hm, yeah, I'll look at it later.
};		};
		amccarth: I don't see anything explicit in LLVM's style guidelines, but I think it's common to segregate the data members and methods whenever possible. So perhaps `parseSectionContribs()` should be moved up.
} // namespace pdb		} // namespace pdb
} // namespace llvm		} // namespace llvm

#endif		#endif
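
rnk's follow-up suggestion above (searching the sorted section contributions directly instead of building an `IntervalMap`) could look roughly like the sketch below. The `Contribution` struct and `findModuleByBinarySearch` are hypothetical stand-ins rather than LLVM types, and the sketch assumes the contributions are sorted by start address and do not overlap.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a section contribution: the half-open
// address range [Begin, End) belongs to module `Module`.
struct Contribution {
  uint64_t Begin;
  uint64_t End;
  uint16_t Module;
};

// Assumes `Sorted` is ordered by ascending Begin with no overlaps.
// Returns true and sets ModuleIndex if some contribution contains VA.
inline bool findModuleByBinarySearch(const std::vector<Contribution> &Sorted,
                                     uint64_t VA, uint16_t &ModuleIndex) {
  // First contribution whose end lies strictly above VA.
  auto It = std::lower_bound(
      Sorted.begin(), Sorted.end(), VA,
      [](const Contribution &C, uint64_t Addr) { return C.End <= Addr; });
  if (It == Sorted.end() || VA < It->Begin)
    return false; // VA falls in a gap or past the last contribution.
  ModuleIndex = It->Module;
  return true;
}
```

The lookup is logarithmic in the number of contributions, like the interval map, but needs no extra allocation.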

llvm/include/llvm/DebugInfo/PDB/Native/SymbolCache.h

[31 lines not shown]	class SymbolCache {
NativeSession &Session;		NativeSession &Session;
DbiStream *Dbi = nullptr;		DbiStream *Dbi = nullptr;

/// Cache of all stable symbols, indexed by SymIndexId. Just because a		/// Cache of all stable symbols, indexed by SymIndexId. Just because a
/// symbol has been parsed does not imply that it will be stable and have		/// symbol has been parsed does not imply that it will be stable and have
/// an Id. Id allocation is an implementation, with the only guarantee		/// an Id. Id allocation is an implementation, with the only guarantee
/// being that once an Id is allocated, the symbol can be assumed to be		/// being that once an Id is allocated, the symbol can be assumed to be
/// cached.		/// cached.
std::vector<std::unique_ptr<NativeRawSymbol>> Cache;		mutable std::vector<std::unique_ptr<NativeRawSymbol>> Cache;
		rnk: All the const/mutable changes make sense, the cache is "notionally" const. Looking things up fills the cache, but doesn't affect results.

/// For type records from the TPI stream which have been paresd and cached,		/// For type records from the TPI stream which have been paresd and cached,
/// stores a mapping to SymIndexId of the cached symbol.		/// stores a mapping to SymIndexId of the cached symbol.
DenseMap<codeview::TypeIndex, SymIndexId> TypeIndexToSymbolId;		mutable DenseMap<codeview::TypeIndex, SymIndexId> TypeIndexToSymbolId;

/// For field list members which have been parsed and cached, stores a mapping		/// For field list members which have been parsed and cached, stores a mapping
/// from (IndexOfClass, MemberIndex) to the corresponding SymIndexId of the		/// from (IndexOfClass, MemberIndex) to the corresponding SymIndexId of the
/// cached symbol.		/// cached symbol.
DenseMap<std::pair<codeview::TypeIndex, uint32_t>, SymIndexId>		mutable DenseMap<std::pair<codeview::TypeIndex, uint32_t>, SymIndexId>
FieldListMembersToSymbolId;		FieldListMembersToSymbolId;

/// List of SymIndexIds for each compiland, indexed by compiland index as they		/// List of SymIndexIds for each compiland, indexed by compiland index as they
/// appear in the PDB file.		/// appear in the PDB file.
std::vector<SymIndexId> Compilands;		mutable std::vector<SymIndexId> Compilands;

/// List of source files, indexed by unique source file index.		/// List of source files, indexed by unique source file index.
mutable std::vector<std::unique_ptr<NativeSourceFile>> SourceFiles;		mutable std::vector<std::unique_ptr<NativeSourceFile>> SourceFiles;

		/// Map from string table offset to source file Id.
mutable DenseMap<uint32_t, SymIndexId> FileNameOffsetToId;		mutable DenseMap<uint32_t, SymIndexId> FileNameOffsetToId;

/// Map from global symbol offset to SymIndexId.		/// Map from global symbol offset to SymIndexId.
DenseMap<uint32_t, SymIndexId> GlobalOffsetToSymbolId;		mutable DenseMap<uint32_t, SymIndexId> GlobalOffsetToSymbolId;

/// Map from segment and code offset to SymIndexId.
DenseMap<std::pair<uint32_t, uint32_t>, SymIndexId> AddressToSymbolId;
DenseMap<std::pair<uint32_t, uint32_t>, SymIndexId> AddressToPublicSymId;

/// Map from virtual address to module index.
using IMap =
IntervalMap<uint64_t, uint16_t, 8, IntervalMapHalfOpenInfo<uint64_t>>;
IMap::Allocator IMapAllocator;
IMap AddrToModuleIndex;

Expected<ModuleDebugStreamRef> getModuleDebugStream(uint32_t Index) const;		/// Map from segment and code offset to function symbols.
		mutable DenseMap<std::pair<uint32_t, uint32_t>, SymIndexId> AddressToSymbolId;
		/// Map from segment and code offset to public symbols.
		mutable DenseMap<std::pair<uint32_t, uint32_t>, SymIndexId>
		AddressToPublicSymId;

		/// Map from module index and symbol table offset to SymIndexId.
		mutable DenseMap<std::pair<uint16_t, uint32_t>, SymIndexId>
		SymTabOffsetToSymbolId;

struct LineTableEntry {		struct LineTableEntry {
uint64_t Addr;		uint64_t Addr;
codeview::LineInfo Line;		codeview::LineInfo Line;
uint32_t ColumnNumber;		uint32_t ColumnNumber;
uint32_t FileNameIndex;		uint32_t FileNameIndex;
bool IsTerminalEntry;		bool IsTerminalEntry;
};		};

std::vector<LineTableEntry> findLineTable(uint16_t Modi) const;		std::vector<LineTableEntry> findLineTable(uint16_t Modi) const;
mutable DenseMap<uint16_t, std::vector<LineTableEntry>> LineTable;		mutable DenseMap<uint16_t, std::vector<LineTableEntry>> LineTable;

SymIndexId createSymbolPlaceholder() {		SymIndexId createSymbolPlaceholder() const {
SymIndexId Id = Cache.size();		SymIndexId Id = Cache.size();
Cache.push_back(nullptr);		Cache.push_back(nullptr);
return Id;		return Id;
}		}

template <typename ConcreteSymbolT, typename CVRecordT, typename... Args>		template <typename ConcreteSymbolT, typename CVRecordT, typename... Args>
SymIndexId createSymbolForType(codeview::TypeIndex TI, codeview::CVType CVT,		SymIndexId createSymbolForType(codeview::TypeIndex TI, codeview::CVType CVT,
Args &&... ConstructorArgs) {		Args &&...ConstructorArgs) const {
CVRecordT Record;		CVRecordT Record;
if (auto EC =		if (auto EC =
codeview::TypeDeserializer::deserializeAs<CVRecordT>(CVT, Record)) {		codeview::TypeDeserializer::deserializeAs<CVRecordT>(CVT, Record)) {
consumeError(std::move(EC));		consumeError(std::move(EC));
return 0;		return 0;
}		}

return createSymbol<ConcreteSymbolT>(		return createSymbol<ConcreteSymbolT>(
TI, std::move(Record), std::forward<Args>(ConstructorArgs)...);		TI, std::move(Record), std::forward<Args>(ConstructorArgs)...);
}		}

SymIndexId createSymbolForModifiedType(codeview::TypeIndex ModifierTI,		SymIndexId createSymbolForModifiedType(codeview::TypeIndex ModifierTI,
codeview::CVType CVT);		codeview::CVType CVT) const;

SymIndexId createSimpleType(codeview::TypeIndex TI,		SymIndexId createSimpleType(codeview::TypeIndex TI,
codeview::ModifierOptions Mods);		codeview::ModifierOptions Mods) const;

std::unique_ptr<PDBSymbol> findFunctionSymbolBySectOffset(uint32_t Sect,		std::unique_ptr<PDBSymbol> findFunctionSymbolBySectOffset(uint32_t Sect,
uint32_t Offset);		uint32_t Offset);
std::unique_ptr<PDBSymbol> findPublicSymbolBySectOffset(uint32_t Sect,		std::unique_ptr<PDBSymbol> findPublicSymbolBySectOffset(uint32_t Sect,
uint32_t Offset);		uint32_t Offset);

public:		public:
SymbolCache(NativeSession &Session, DbiStream *Dbi);		SymbolCache(NativeSession &Session, DbiStream *Dbi);

template <typename ConcreteSymbolT, typename... Args>		template <typename ConcreteSymbolT, typename... Args>
SymIndexId createSymbol(Args &&... ConstructorArgs) {		SymIndexId createSymbol(Args &&...ConstructorArgs) const {
SymIndexId Id = Cache.size();		SymIndexId Id = Cache.size();

// Initial construction must not access the cache, since it must be done		// Initial construction must not access the cache, since it must be done
// atomically.		// atomically.
auto Result = std::make_unique<ConcreteSymbolT>(		auto Result = std::make_unique<ConcreteSymbolT>(
Session, Id, std::forward<Args>(ConstructorArgs)...);		Session, Id, std::forward<Args>(ConstructorArgs)...);
Result->SymbolId = Id;		Result->SymbolId = Id;

[10 lines not shown]	public:
createTypeEnumerator(codeview::TypeLeafKind Kind);		createTypeEnumerator(codeview::TypeLeafKind Kind);

std::unique_ptr<IPDBEnumSymbols>		std::unique_ptr<IPDBEnumSymbols>
createTypeEnumerator(std::vector<codeview::TypeLeafKind> Kinds);		createTypeEnumerator(std::vector<codeview::TypeLeafKind> Kinds);

std::unique_ptr<IPDBEnumSymbols>		std::unique_ptr<IPDBEnumSymbols>
createGlobalsEnumerator(codeview::SymbolKind Kind);		createGlobalsEnumerator(codeview::SymbolKind Kind);

SymIndexId findSymbolByTypeIndex(codeview::TypeIndex TI);		SymIndexId findSymbolByTypeIndex(codeview::TypeIndex TI) const;

template <typename ConcreteSymbolT, typename... Args>		template <typename ConcreteSymbolT, typename... Args>
SymIndexId getOrCreateFieldListMember(codeview::TypeIndex FieldListTI,		SymIndexId getOrCreateFieldListMember(codeview::TypeIndex FieldListTI,
uint32_t Index,		uint32_t Index,
Args &&... ConstructorArgs) {		Args &&... ConstructorArgs) {
SymIndexId SymId = Cache.size();		SymIndexId SymId = Cache.size();
std::pair<codeview::TypeIndex, uint32_t> Key{FieldListTI, Index};		std::pair<codeview::TypeIndex, uint32_t> Key{FieldListTI, Index};
auto Result = FieldListMembersToSymbolId.try_emplace(Key, SymId);		auto Result = FieldListMembersToSymbolId.try_emplace(Key, SymId);
if (Result.second)		if (Result.second)
SymId =		SymId =
createSymbol<ConcreteSymbolT>(std::forward<Args>(ConstructorArgs)...);		createSymbol<ConcreteSymbolT>(std::forward<Args>(ConstructorArgs)...);
else		else
SymId = Result.first->second;		SymId = Result.first->second;
return SymId;		return SymId;
}		}

SymIndexId getOrCreateGlobalSymbolByOffset(uint32_t Offset);		SymIndexId getOrCreateGlobalSymbolByOffset(uint32_t Offset);
		SymIndexId getOrCreateInlineSymbol(codeview::InlineSiteSym Sym,
		uint64_t ParentAddr, uint16_t Modi,
		uint32_t RecordOffset) const;

std::unique_ptr<PDBSymbol>		std::unique_ptr<PDBSymbol>
findSymbolBySectOffset(uint32_t Sect, uint32_t Offset, PDB_SymType Type);		findSymbolBySectOffset(uint32_t Sect, uint32_t Offset, PDB_SymType Type);

std::unique_ptr<IPDBEnumLineNumbers>		std::unique_ptr<IPDBEnumLineNumbers>
findLineNumbersByVA(uint64_t VA, uint32_t Length) const;		findLineNumbersByVA(uint64_t VA, uint32_t Length) const;

std::unique_ptr<PDBSymbolCompiland> getOrCreateCompiland(uint32_t Index);		std::unique_ptr<PDBSymbolCompiland> getOrCreateCompiland(uint32_t Index);
uint32_t getNumCompilands() const;		uint32_t getNumCompilands() const;

std::unique_ptr<PDBSymbol> getSymbolById(SymIndexId SymbolId) const;		std::unique_ptr<PDBSymbol> getSymbolById(SymIndexId SymbolId) const;

NativeRawSymbol &getNativeSymbolById(SymIndexId SymbolId) const;		NativeRawSymbol &getNativeSymbolById(SymIndexId SymbolId) const;

template <typename ConcreteT>		template <typename ConcreteT>
ConcreteT &getNativeSymbolById(SymIndexId SymbolId) const {		ConcreteT &getNativeSymbolById(SymIndexId SymbolId) const {
return static_cast<ConcreteT &>(getNativeSymbolById(SymbolId));		return static_cast<ConcreteT &>(getNativeSymbolById(SymbolId));
}		}

std::unique_ptr<IPDBSourceFile> getSourceFileById(SymIndexId FileId) const;		std::unique_ptr<IPDBSourceFile> getSourceFileById(SymIndexId FileId) const;
SymIndexId		SymIndexId
getOrCreateSourceFile(const codeview::FileChecksumEntry &Checksum) const;		getOrCreateSourceFile(const codeview::FileChecksumEntry &Checksum) const;

void parseSectionContribs();
Optional<uint16_t> getModuleIndexForAddr(uint64_t Addr) const;
};		};

} // namespace pdb		} // namespace pdb
} // namespace llvm		} // namespace llvm

#endif		#endif
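
rnk's remark that the cache is "notionally" const can be illustrated with a small standalone sketch. `SquareCache` is a made-up class, not LLVM code: the lookup is `const` because callers always see the same answer, while the `mutable` members record work already done.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical example: a const lookup that fills a mutable cache.
class SquareCache {
public:
  uint64_t lookup(uint32_t Key) const {
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second; // Cache hit: no recomputation.
    ++Misses;
    uint64_t Value = static_cast<uint64_t>(Key) * Key;
    Cache.emplace(Key, Value);
    return Value;
  }

  unsigned misses() const { return Misses; }

private:
  // Mutated from const methods; the observable results never change.
  mutable std::unordered_map<uint32_t, uint64_t> Cache;
  mutable unsigned Misses = 0;
};
```

Calling `lookup(7)` twice through a const reference returns 49 both times and computes the value only once.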

llvm/include/llvm/DebugInfo/PDB/PDBSymbol.h

[134 lines not shown]	public:

std::unique_ptr<IPDBEnumSymbols>		std::unique_ptr<IPDBEnumSymbols>
findChildren(PDB_SymType Type, StringRef Name,		findChildren(PDB_SymType Type, StringRef Name,
PDB_NameSearchFlags Flags) const;		PDB_NameSearchFlags Flags) const;
std::unique_ptr<IPDBEnumSymbols> findChildrenByRVA(PDB_SymType Type,		std::unique_ptr<IPDBEnumSymbols> findChildrenByRVA(PDB_SymType Type,
StringRef Name,		StringRef Name,
PDB_NameSearchFlags Flags,		PDB_NameSearchFlags Flags,
uint32_t RVA) const;		uint32_t RVA) const;
		std::unique_ptr<IPDBEnumSymbols> findInlineFramesByVA(uint64_t VA) const;
std::unique_ptr<IPDBEnumSymbols> findInlineFramesByRVA(uint32_t RVA) const;		std::unique_ptr<IPDBEnumSymbols> findInlineFramesByRVA(uint32_t RVA) const;
		std::unique_ptr<IPDBEnumLineNumbers>
		findInlineeLinesByVA(uint64_t VA, uint32_t Length) const;
		std::unique_ptr<IPDBEnumLineNumbers>
		findInlineeLinesByRVA(uint32_t RVA, uint32_t Length) const;

		std::string getName() const;

const IPDBRawSymbol &getRawSymbol() const { return *RawSymbol; }		const IPDBRawSymbol &getRawSymbol() const { return *RawSymbol; }
IPDBRawSymbol &getRawSymbol() { return *RawSymbol; }		IPDBRawSymbol &getRawSymbol() { return *RawSymbol; }

const IPDBSession &getSession() const { return Session; }		const IPDBSession &getSession() const { return Session; }

std::unique_ptr<IPDBEnumSymbols> getChildStats(TagStats &Stats) const;		std::unique_ptr<IPDBEnumSymbols> getChildStats(TagStats &Stats) const;

[17 lines not shown]

llvm/lib/DebugInfo/PDB/CMakeLists.txt

[54 lines not shown]	add_pdb_impl_folder(Native
Native/InjectedSourceStream.cpp		Native/InjectedSourceStream.cpp
Native/ModuleDebugStream.cpp		Native/ModuleDebugStream.cpp
Native/NativeCompilandSymbol.cpp		Native/NativeCompilandSymbol.cpp
Native/NativeEnumGlobals.cpp		Native/NativeEnumGlobals.cpp
Native/NativeEnumInjectedSources.cpp		Native/NativeEnumInjectedSources.cpp
Native/NativeEnumLineNumbers.cpp		Native/NativeEnumLineNumbers.cpp
Native/NativeEnumModules.cpp		Native/NativeEnumModules.cpp
Native/NativeEnumTypes.cpp		Native/NativeEnumTypes.cpp
		Native/NativeEnumSymbols.cpp
Native/NativeExeSymbol.cpp		Native/NativeExeSymbol.cpp
Native/NativeFunctionSymbol.cpp		Native/NativeFunctionSymbol.cpp
		Native/NativeInlineSiteSymbol.cpp
Native/NativeLineNumber.cpp		Native/NativeLineNumber.cpp
Native/NativePublicSymbol.cpp		Native/NativePublicSymbol.cpp
Native/NativeRawSymbol.cpp		Native/NativeRawSymbol.cpp
Native/NativeSourceFile.cpp		Native/NativeSourceFile.cpp
Native/NativeSymbolEnumerator.cpp		Native/NativeSymbolEnumerator.cpp
Native/NativeTypeArray.cpp		Native/NativeTypeArray.cpp
Native/NativeTypeBuiltin.cpp		Native/NativeTypeBuiltin.cpp
Native/NativeTypeEnum.cpp		Native/NativeTypeEnum.cpp
[79 lines not shown]

llvm/lib/DebugInfo/PDB/Native/NativeEnumSymbols.cpp

This file was added.

				//==- NativeEnumSymbols.cpp - Native Symbol Enumerator impl ------- C++ --==//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/DebugInfo/PDB/Native/NativeEnumSymbols.h"

				#include "llvm/DebugInfo/PDB/IPDBEnumChildren.h"
				#include "llvm/DebugInfo/PDB/Native/NativeSession.h"
				#include "llvm/DebugInfo/PDB/Native/NativeTypeEnum.h"
				#include "llvm/DebugInfo/PDB/PDBSymbol.h"
				#include "llvm/DebugInfo/PDB/PDBSymbolTypeEnum.h"

				using namespace llvm;
				using namespace llvm::codeview;
				using namespace llvm::pdb;

				NativeEnumSymbols::NativeEnumSymbols(NativeSession &PDBSession,
				std::vector<SymIndexId> Symbols)
				: Symbols(std::move(Symbols)), Index(0), Session(PDBSession) {}

				uint32_t NativeEnumSymbols::getChildCount() const {
				return static_cast<uint32_t>(Symbols.size());
				}

				std::unique_ptr<PDBSymbol>
				NativeEnumSymbols::getChildAtIndex(uint32_t N) const {
				if (N < Symbols.size()) {
				return Session.getSymbolCache().getSymbolById(Symbols[N]);
				}
				return nullptr;
				}

				std::unique_ptr<PDBSymbol> NativeEnumSymbols::getNext() {
				return getChildAtIndex(Index++);
				}

				void NativeEnumSymbols::reset() { Index = 0; }
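
The enumerator above follows the usual index-based pattern: `getNext()` returns the child at the current index, advances, and yields `nullptr` once the list is exhausted. The sketch below is a simplified standalone model of that pattern. `IdEnumerator` and `sumAll` are hypothetical and hand out plain integers instead of `PDBSymbol` objects.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical model of an index-based enumerator.
class IdEnumerator {
public:
  explicit IdEnumerator(std::vector<uint32_t> Ids) : Ids(std::move(Ids)) {}

  uint32_t getChildCount() const { return static_cast<uint32_t>(Ids.size()); }

  // Returns nullptr for an out-of-range index.
  std::unique_ptr<uint32_t> getChildAtIndex(uint32_t N) const {
    if (N < Ids.size())
      return std::make_unique<uint32_t>(Ids[N]);
    return nullptr;
  }

  std::unique_ptr<uint32_t> getNext() { return getChildAtIndex(Index++); }
  void reset() { Index = 0; }

private:
  std::vector<uint32_t> Ids;
  uint32_t Index = 0;
};

// Typical caller loop: iterate until getNext() returns nullptr.
inline uint32_t sumAll(IdEnumerator &E) {
  uint32_t Sum = 0;
  while (auto Child = E.getNext())
    Sum += *Child;
  return Sum;
}
```

After a full pass the enumerator stays exhausted until `reset()` is called.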

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp

	//===- NativeFunctionSymbol.cpp - info about function symbols----- C++ --===//			//===- NativeFunctionSymbol.cpp - info about function symbols----- C++ --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "llvm/DebugInfo/PDB/Native/NativeFunctionSymbol.h"			#include "llvm/DebugInfo/PDB/Native/NativeFunctionSymbol.h"

				#include "llvm/DebugInfo/CodeView/SymbolDeserializer.h"
	#include "llvm/DebugInfo/CodeView/SymbolRecord.h"			#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
				#include "llvm/DebugInfo/PDB/Native/NativeEnumSymbols.h"
	#include "llvm/DebugInfo/PDB/Native/NativeTypeBuiltin.h"			#include "llvm/DebugInfo/PDB/Native/NativeTypeBuiltin.h"
	#include "llvm/DebugInfo/PDB/Native/NativeTypeEnum.h"			#include "llvm/DebugInfo/PDB/Native/NativeTypeEnum.h"

	using namespace llvm;			using namespace llvm;
	using namespace llvm::codeview;			using namespace llvm::codeview;
	using namespace llvm::pdb;			using namespace llvm::pdb;

	NativeFunctionSymbol::NativeFunctionSymbol(NativeSession &Session,			NativeFunctionSymbol::NativeFunctionSymbol(NativeSession &Session,
	SymIndexId Id,			SymIndexId Id,
	const codeview::ProcSym &Sym)			const codeview::ProcSym &Sym,
	: NativeRawSymbol(Session, PDB_SymType::Function, Id), Sym(Sym) {}			uint32_t Offset)
				: NativeRawSymbol(Session, PDB_SymType::Function, Id), Sym(Sym),
				RecordOffset(Offset) {}

	NativeFunctionSymbol::~NativeFunctionSymbol() {}			NativeFunctionSymbol::~NativeFunctionSymbol() {}

	void NativeFunctionSymbol::dump(raw_ostream &OS, int Indent,			void NativeFunctionSymbol::dump(raw_ostream &OS, int Indent,
	PdbSymbolIdField ShowIdFields,			PdbSymbolIdField ShowIdFields,
	PdbSymbolIdField RecurseIdFields) const {			PdbSymbolIdField RecurseIdFields) const {
	NativeRawSymbol::dump(OS, Indent, ShowIdFields, RecurseIdFields);			NativeRawSymbol::dump(OS, Indent, ShowIdFields, RecurseIdFields);
	dumpSymbolField(OS, "name", getName(), Indent);			dumpSymbolField(OS, "name", getName(), Indent);
	[15 lines not shown]

	uint32_t NativeFunctionSymbol::getRelativeVirtualAddress() const {			uint32_t NativeFunctionSymbol::getRelativeVirtualAddress() const {
	return Session.getRVAFromSectOffset(Sym.Segment, Sym.CodeOffset);			return Session.getRVAFromSectOffset(Sym.Segment, Sym.CodeOffset);
	}			}

	uint64_t NativeFunctionSymbol::getVirtualAddress() const {			uint64_t NativeFunctionSymbol::getVirtualAddress() const {
	return Session.getVAFromSectOffset(Sym.Segment, Sym.CodeOffset);			return Session.getVAFromSectOffset(Sym.Segment, Sym.CodeOffset);
	}			}

				static bool inlineSiteContainsAddress(InlineSiteSym &IS,
				akhuang (author): I think this sometimes returns true incorrectly; it seems to happen when there are multiple ChangeCodeLength annotations. Maybe I'm interpreting those incorrectly?
				amccarth: I don't know much about these annotations. Can these be nested? For example, if `foo` runs from f0 to f1 and the annotations also describe another subrange inside of f0..f1, then I would think `Found` would have to be a counter rather than a bool. If it's 0, you're not inside any range. If it's 2, you're inside two nested ranges. If they don't nest, then I don't see a problem.
				akhuang (author): I figured out why I was getting different results with DIA, and it turned out to be unrelated to the inline code. (I wasn't specifying PDB_SymType::Function when searching for the parent function, and I guess my implementation for findSymbolByAddress is different.)
				uint32_t OffsetInFunc) {
				// Returns true if inline site contains the offset.
				bool Found = false;
				uint32_t CodeOffset = 0;
				for (auto &Annot : IS.annotations()) {
				switch (Annot.OpCode) {
				case BinaryAnnotationsOpCode::CodeOffset:
				case BinaryAnnotationsOpCode::ChangeCodeOffset:
				case BinaryAnnotationsOpCode::ChangeCodeOffsetAndLineOffset:
				CodeOffset += Annot.U1;
				if (OffsetInFunc >= CodeOffset)
				rnk: I wonder if this should be `>` instead of `>=`. Consider the case of:
				    callq somewhere
				    .cv_loc ... inline location
				    nop # anything
				Otherwise, this seems like it should work to me.
				rnk: Any thoughts on this?
				akhuang (author): Hm, the things I've seen are something like
				    func:
				    .cv_inline_site_id 2 ...
				    .cv_loc 2 ...
				    movl ... # inlined code
				    addl ... # inlined code
				    .cv_loc 1
				    nop # some other instructions
				and the address of the `movl` would be the starting offset of the S_INLINESITE. So the `<` would be non-inclusive but not the `>=`.
				rnk: Got it, makes sense. I think I was thinking about return addresses, where the debugger is stopped after the executing instruction, and it wants to report the line info for the previous instruction.
				Found = true;
				break;
				case BinaryAnnotationsOpCode::ChangeCodeLength:
				CodeOffset += Annot.U1;
				if (Found && OffsetInFunc < CodeOffset)
				return true;
				Found = false;
				break;
				case BinaryAnnotationsOpCode::ChangeCodeLengthAndCodeOffset:
				CodeOffset += Annot.U2;
				if (OffsetInFunc >= CodeOffset)
				Found = true;
				CodeOffset += Annot.U1;
				if (Found && OffsetInFunc < CodeOffset)
				return true;
				Found = false;
				break;
				default:
				break;
				}
				}
				return false;
				}
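
The thread above settles on treating each annotated range as half-open: the start offset is inclusive and the end offset is exclusive. A minimal standalone sketch of that check follows. The `Op` and `Annotation` types and the `containsOffset` name are hypothetical simplifications for illustration, not the CodeView types used in the patch, and only two of the opcodes are modeled.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for CodeView binary annotations.
enum class Op { ChangeCodeOffset, ChangeCodeLength };

struct Annotation {
  Op Code;
  uint32_t Value; // offset delta for ChangeCodeOffset, length for ChangeCodeLength
};

// Returns true if Offset lies in a half-open range [Start, Start + Length)
// described by the annotation stream.
bool containsOffset(const std::vector<Annotation> &Annots, uint32_t Offset) {
  bool InRange = false;
  uint32_t CodeOffset = 0;
  for (const Annotation &A : Annots) {
    switch (A.Code) {
    case Op::ChangeCodeOffset:
      CodeOffset += A.Value;
      if (Offset >= CodeOffset) // range start is inclusive
        InRange = true;
      break;
    case Op::ChangeCodeLength:
      CodeOffset += A.Value;
      if (InRange && Offset < CodeOffset) // range end is exclusive
        return true;
      InRange = false;
      break;
    }
  }
  return false;
}
```

With a stream that describes the range [0x10, 0x18), the first byte of the inlined code matches and the first byte after it does not, which is the behavior akhuang describes for the `movl` and the trailing `nop`.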

				std::unique_ptr<IPDBEnumSymbols>
				NativeFunctionSymbol::findInlineFramesByVA(uint64_t VA) const {
				uint16_t Modi;
				if (!Session.moduleIndexForVA(VA, Modi))
				return nullptr;

				Expected<ModuleDebugStreamRef> ModS = Session.getModuleDebugStream(Modi);
				if (!ModS) {
				consumeError(ModS.takeError());
				return nullptr;
				}
				CVSymbolArray Syms = ModS->getSymbolArray();

				// Search for inline sites. There should be one matching top level inline
				// site. Then search in its nested inline sites.
				std::vector<SymIndexId> Frames;
				uint32_t CodeOffset = VA - getVirtualAddress();
				auto Start = Syms.at(RecordOffset);
				auto End = Syms.at(Sym.End);
				while (Start != End) {
				rnk: Getting here requires doing `O(log(#inputsections))` work, which is pretty quick. This loop here is `O(#symbols in function)`. Maybe one day the lookup will need to be faster or be cached, but I don't see any obvious way to do that today.
				bool Found = false;
				// Find matching inline site within Start and End.
				for (; Start != End; ++Start) {
				if (Start->kind() != S_INLINESITE)
				continue;

				InlineSiteSym IS =
				cantFail(SymbolDeserializer::deserializeAs<InlineSiteSym>(*Start));
				if (inlineSiteContainsAddress(IS, CodeOffset)) {
				fprintf(stderr, "inline: %d\n", Start.offset());
				// Insert frames in reverse order.
				SymIndexId Id = Session.getSymbolCache().getOrCreateInlineSymbol(
				IS, getVirtualAddress(), Modi, Start.offset());
				Frames.insert(Frames.begin(), Id);

				// Update offsets to search within this inline site.
				++Start;
				End = Syms.at(IS.End);
				Found = true;
				break;
				}

				Start = Syms.at(IS.End);
				if (Start == End)
				break;
				}

				if (!Found)
				break;
				}

				return std::make_unique<NativeEnumSymbols>(Session, std::move(Frames));
				}
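
The search in findInlineFramesByVA narrows its window each time it finds a matching inline site and skips over whole non-matching sites. It can be modeled on a flat record list as below. This is a sketch under assumptions: `Record`, its fields, and `findInlineFrames` are hypothetical, and each site is given a single precomputed code range instead of decoding annotations.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flattened symbol record. Inline sites carry one code range and
// the index of the record that closes their scope; other records are skipped.
struct Record {
  bool IsInlineSite;
  uint32_t Begin, End; // half-open code range [Begin, End)
  size_t ScopeEnd;     // index of the matching scope-end record
};

// Returns the indices of the inline sites containing Offset, innermost first.
std::vector<size_t> findInlineFrames(const std::vector<Record> &Recs,
                                     size_t Start, size_t End,
                                     uint32_t Offset) {
  std::vector<size_t> Frames;
  while (Start < End) {
    const Record &R = Recs[Start];
    if (!R.IsInlineSite) {
      ++Start;
      continue;
    }
    if (Offset >= R.Begin && Offset < R.End) {
      Frames.insert(Frames.begin(), Start); // innermost frame ends up first
      End = R.ScopeEnd;                     // keep searching inside this site
      ++Start;
    } else {
      Start = R.ScopeEnd; // skip the whole non-matching site
    }
  }
  return Frames;
}
```

In the worst case this still visits every record in the function, which matches rnk's `O(#symbols in function)` estimate. Skipping to `ScopeEnd` only avoids descending into sites that cannot contain the offset.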

llvm/lib/DebugInfo/PDB/Native/NativeInlineSiteSymbol.cpp

This file was added.

				//===- NativeInlineSiteSymbol.cpp - info about inline sites -----*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/DebugInfo/PDB/Native/NativeInlineSiteSymbol.h"

				#include "llvm/DebugInfo/CodeView/DebugInlineeLinesSubsection.h"
				#include "llvm/DebugInfo/CodeView/LazyRandomTypeCollection.h"
				#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
				#include "llvm/DebugInfo/CodeView/TypeDeserializer.h"
				#include "llvm/DebugInfo/PDB/Native/NativeEnumLineNumbers.h"
				#include "llvm/DebugInfo/PDB/Native/TpiStream.h"

				using namespace llvm;
				using namespace llvm::codeview;
				using namespace llvm::pdb;

				NativeInlineSiteSymbol::NativeInlineSiteSymbol(
				NativeSession &Session, SymIndexId Id, const codeview::InlineSiteSym &Sym,
				uint64_t ParentAddr)
				: NativeRawSymbol(Session, PDB_SymType::InlineSite, Id), Sym(Sym),
				ParentAddr(ParentAddr) {}

				NativeInlineSiteSymbol::~NativeInlineSiteSymbol() {}

				void NativeInlineSiteSymbol::dump(raw_ostream &OS, int Indent,
				PdbSymbolIdField ShowIdFields,
				PdbSymbolIdField RecurseIdFields) const {
				NativeRawSymbol::dump(OS, Indent, ShowIdFields, RecurseIdFields);
				dumpSymbolField(OS, "name", getName(), Indent);
				}

				static Optional<InlineeSourceLine>
				findInlineeByTypeIndex(TypeIndex Id, ModuleDebugStreamRef &ModS) {
				for (const auto &SS : ModS.getSubsectionsArray()) {
				if (SS.kind() != DebugSubsectionKind::InlineeLines)
				continue;

				DebugInlineeLinesSubsectionRef InlineeLines;
				BinaryStreamReader Reader(SS.getRecordData());
				if (auto EC = InlineeLines.initialize(Reader)) {
				consumeError(std::move(EC));
				continue;
				}

				for (const InlineeSourceLine &Line : InlineeLines)
				if (Line.Header->Inlinee == Id)
				return Line;
				}
				return None;
				}

				std::string NativeInlineSiteSymbol::getName() const {
				auto Tpi = Session.getPDBFile().getPDBTpiStream();
				if (!Tpi) {
				consumeError(Tpi.takeError());
				return "";
				}
				auto Ipi = Session.getPDBFile().getPDBIpiStream();
				if (!Ipi) {
				consumeError(Ipi.takeError());
				return "";
				}

				LazyRandomTypeCollection &Types = Tpi->typeCollection();
				LazyRandomTypeCollection &Ids = Ipi->typeCollection();
				CVType InlineeType = Ids.getType(Sym.Inlinee);
				std::string QualifiedName;
				if (InlineeType.kind() == LF_MFUNC_ID) {
				MemberFuncIdRecord MFRecord;
				cantFail(TypeDeserializer::deserializeAs<MemberFuncIdRecord>(InlineeType,
				MFRecord));
				TypeIndex ClassTy = MFRecord.getClassType();
				QualifiedName.append(std::string(Types.getTypeName(ClassTy)));
				QualifiedName.append("::");
				} else if (InlineeType.kind() == LF_FUNC_ID) {
				FuncIdRecord FRecord;
				cantFail(
				TypeDeserializer::deserializeAs<FuncIdRecord>(InlineeType, FRecord));
				TypeIndex ParentScope = FRecord.getParentScope();
				if (!ParentScope.isNoneType()) {
				QualifiedName.append(std::string(Ids.getTypeName(ParentScope)));
				QualifiedName.append("::");
				}
				}

				QualifiedName.append(std::string(Ids.getTypeName(Sym.Inlinee)));
				return QualifiedName;
				}

				void NativeInlineSiteSymbol::getLineOffset(uint32_t OffsetInFunc,
				uint32_t &LineOffset,
				uint32_t &FileOffset) const {
				LineOffset = 0;
				FileOffset = 0;
				uint32_t CodeOffset = 0;
				for (const auto &Annot : Sym.annotations()) {
				switch (Annot.OpCode) {
				case BinaryAnnotationsOpCode::CodeOffset:
				case BinaryAnnotationsOpCode::ChangeCodeOffset:
				case BinaryAnnotationsOpCode::ChangeCodeLength:
				CodeOffset += Annot.U1;
				break;
				case BinaryAnnotationsOpCode::ChangeCodeLengthAndCodeOffset:
				CodeOffset += Annot.U2;
				break;
				case BinaryAnnotationsOpCode::ChangeLineOffset:
				case BinaryAnnotationsOpCode::ChangeCodeOffsetAndLineOffset:
				CodeOffset += Annot.U1;
				LineOffset += Annot.S1;
				break;
				case BinaryAnnotationsOpCode::ChangeFile:
				FileOffset = Annot.U1;
				break;
				default:
				break;
				}

				if (CodeOffset >= OffsetInFunc)
				return;
				}
				}

				std::unique_ptr<IPDBEnumLineNumbers>
				NativeInlineSiteSymbol::findInlineeLinesByVA(uint64_t VA,
				uint32_t Length) const {
				uint16_t Modi;
				if (!Session.moduleIndexForVA(VA, Modi))
				return nullptr;

				Expected<ModuleDebugStreamRef> ModS = Session.getModuleDebugStream(Modi);
				if (!ModS) {
				consumeError(ModS.takeError());
				return nullptr;
				}

				Expected<DebugChecksumsSubsectionRef> Checksums =
				ModS->findChecksumsSubsection();
				if (!Checksums) {
				consumeError(Checksums.takeError());
				return nullptr;
				}

				// Get the line number offset and source file offset.
				uint32_t SrcLineOffset;
				uint32_t SrcFileOffset;
				getLineOffset(VA - ParentAddr, SrcLineOffset, SrcFileOffset);

				// Get line info from inlinee line table.
				Optional<InlineeSourceLine> Inlinee =
				findInlineeByTypeIndex(Sym.Inlinee, ModS.get());

				if (!Inlinee)
				return nullptr;

				uint32_t SrcLine = Inlinee->Header->SourceLineNum + SrcLineOffset;
				uint32_t SrcCol = 0; // Inline sites don't seem to have column info.
				uint32_t FileChecksumOffset =
				(SrcFileOffset == 0) ? Inlinee->Header->FileID : SrcFileOffset;

				auto ChecksumIter = Checksums->getArray().at(FileChecksumOffset);
				uint32_t SrcFileId =
				Session.getSymbolCache().getOrCreateSourceFile(*ChecksumIter);

				uint32_t LineSect, LineOff;
				Session.addressForVA(VA, LineSect, LineOff);
				NativeLineNumber LineNum(Session, SrcLine, SrcCol, LineSect, LineOff, Length,
				SrcFileId, Modi);
				auto SrcFile = Session.getSymbolCache().getSourceFileById(SrcFileId);
				std::vector<NativeLineNumber> Lines{LineNum};

				return std::make_unique<NativeEnumLineNumbers>(std::move(Lines));
				}

llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp

//===- NativeSession.cpp - Native implementation of IPDBSession -*- C++ -*-===//		//===- NativeSession.cpp - Native implementation of IPDBSession -*- C++ -*-===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/DebugInfo/PDB/Native/NativeSession.h"		#include "llvm/DebugInfo/PDB/Native/NativeSession.h"

#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/DebugInfo/CodeView/TypeIndex.h"		#include "llvm/DebugInfo/CodeView/TypeIndex.h"
#include "llvm/DebugInfo/PDB/IPDBEnumChildren.h"		#include "llvm/DebugInfo/PDB/IPDBEnumChildren.h"
#include "llvm/DebugInfo/PDB/IPDBSourceFile.h"		#include "llvm/DebugInfo/PDB/IPDBSourceFile.h"
#include "llvm/DebugInfo/PDB/Native/DbiStream.h"		#include "llvm/DebugInfo/PDB/Native/DbiStream.h"
		#include "llvm/DebugInfo/PDB/Native/ISectionContribVisitor.h"
#include "llvm/DebugInfo/PDB/Native/NativeCompilandSymbol.h"		#include "llvm/DebugInfo/PDB/Native/NativeCompilandSymbol.h"
#include "llvm/DebugInfo/PDB/Native/NativeEnumInjectedSources.h"		#include "llvm/DebugInfo/PDB/Native/NativeEnumInjectedSources.h"
#include "llvm/DebugInfo/PDB/Native/NativeEnumTypes.h"		#include "llvm/DebugInfo/PDB/Native/NativeEnumTypes.h"
#include "llvm/DebugInfo/PDB/Native/NativeExeSymbol.h"		#include "llvm/DebugInfo/PDB/Native/NativeExeSymbol.h"
#include "llvm/DebugInfo/PDB/Native/NativeTypeBuiltin.h"		#include "llvm/DebugInfo/PDB/Native/NativeTypeBuiltin.h"
#include "llvm/DebugInfo/PDB/Native/NativeTypeEnum.h"		#include "llvm/DebugInfo/PDB/Native/NativeTypeEnum.h"
#include "llvm/DebugInfo/PDB/Native/PDBFile.h"		#include "llvm/DebugInfo/PDB/Native/PDBFile.h"
#include "llvm/DebugInfo/PDB/Native/RawError.h"		#include "llvm/DebugInfo/PDB/Native/RawError.h"
// ... 27 lines not shown ...	static DbiStream *getDbiStreamPtr(PDBFile &File) {

consumeError(DbiS.takeError());		consumeError(DbiS.takeError());
return nullptr;		return nullptr;
}		}

NativeSession::NativeSession(std::unique_ptr<PDBFile> PdbFile,		NativeSession::NativeSession(std::unique_ptr<PDBFile> PdbFile,
std::unique_ptr<BumpPtrAllocator> Allocator)		std::unique_ptr<BumpPtrAllocator> Allocator)
: Pdb(std::move(PdbFile)), Allocator(std::move(Allocator)),		: Pdb(std::move(PdbFile)), Allocator(std::move(Allocator)),
Cache(this, getDbiStreamPtr(Pdb)) {}		Cache(this, getDbiStreamPtr(Pdb)), AddrToModuleIndex(IMapAllocator) {}

NativeSession::~NativeSession() = default;		NativeSession::~NativeSession() = default;

Error NativeSession::createFromPdb(std::unique_ptr<MemoryBuffer> Buffer,		Error NativeSession::createFromPdb(std::unique_ptr<MemoryBuffer> Buffer,
std::unique_ptr<IPDBSession> &Session) {		std::unique_ptr<IPDBSession> &Session) {
StringRef Path = Buffer->getBufferIdentifier();		StringRef Path = Buffer->getBufferIdentifier();
auto Stream = std::make_unique<MemoryBufferByteStream>(		auto Stream = std::make_unique<MemoryBufferByteStream>(
std::move(Buffer), llvm::support::little);		std::move(Buffer), llvm::support::little);
// ... 182 lines not shown ...	std::unique_ptr<PDBSymbol> NativeSession::findSymbolByRVA(uint32_t RVA,
uint32_t Offset;		uint32_t Offset;
addressForRVA(RVA, Section, Offset);		addressForRVA(RVA, Section, Offset);
return findSymbolBySectOffset(Section, Offset, Type);		return findSymbolBySectOffset(Section, Offset, Type);
}		}

std::unique_ptr<PDBSymbol>		std::unique_ptr<PDBSymbol>
NativeSession::findSymbolBySectOffset(uint32_t Sect, uint32_t Offset,		NativeSession::findSymbolBySectOffset(uint32_t Sect, uint32_t Offset,
PDB_SymType Type) {		PDB_SymType Type) {
		if (AddrToModuleIndex.empty())
		parseSectionContribs();

return Cache.findSymbolBySectOffset(Sect, Offset, Type);		return Cache.findSymbolBySectOffset(Sect, Offset, Type);
}		}

std::unique_ptr<IPDBEnumLineNumbers>		std::unique_ptr<IPDBEnumLineNumbers>
NativeSession::findLineNumbers(const PDBSymbolCompiland &Compiland,		NativeSession::findLineNumbers(const PDBSymbolCompiland &Compiland,
const IPDBSourceFile &File) const {		const IPDBSourceFile &File) const {
return nullptr;		return nullptr;
}		}
// ... 115 lines not shown ...	uint32_t NativeSession::getRVAFromSectOffset(uint32_t Section,
auto &Sec = Dbi->getSectionHeaders()[Section - 1];		auto &Sec = Dbi->getSectionHeaders()[Section - 1];
return Sec.VirtualAddress + Offset;		return Sec.VirtualAddress + Offset;
}		}

uint64_t NativeSession::getVAFromSectOffset(uint32_t Section,		uint64_t NativeSession::getVAFromSectOffset(uint32_t Section,
uint32_t Offset) const {		uint32_t Offset) const {
return LoadAddress + getRVAFromSectOffset(Section, Offset);		return LoadAddress + getRVAFromSectOffset(Section, Offset);
}		}

		bool NativeSession::moduleIndexForVA(uint64_t VA, uint16_t &ModuleIndex) const {
		ModuleIndex = 0;
		auto Iter = AddrToModuleIndex.find(VA);
		if (Iter == AddrToModuleIndex.end())
		return false;
		ModuleIndex = Iter.value();
		return true;
		}

		bool NativeSession::moduleIndexForSectOffset(uint32_t Sect, uint32_t Offset,
		uint16_t &ModuleIndex) const {
		ModuleIndex = 0;
		auto Iter = AddrToModuleIndex.find(getVAFromSectOffset(Sect, Offset));
		if (Iter == AddrToModuleIndex.end())
		return false;
		ModuleIndex = Iter.value();
		return true;
		}

		void NativeSession::parseSectionContribs() {
		auto Dbi = Pdb->getPDBDbiStream();
		if (!Dbi)
		return;

		class Visitor : public ISectionContribVisitor {
		NativeSession &Session;
		IMap &AddrMap;

		public:
		Visitor(NativeSession &Session, IMap &AddrMap)
		: Session(Session), AddrMap(AddrMap) {}
		void visit(const SectionContrib &C) override {
		if (C.Size == 0)
		return;

		uint64_t VA = Session.getVAFromSectOffset(C.ISect, C.Off);
		uint64_t End = VA + C.Size;

		// Ignore overlapping sections based on the assumption that a valid
		// PDB file should not have overlaps.
		if (!AddrMap.overlaps(VA, End))
		AddrMap.insert(VA, End, C.Imod);
		}
		void visit(const SectionContrib2 &C) override { visit(C.Base); }
		};

		Visitor V(*this, AddrToModuleIndex);
		Dbi->visitSectionContributions(V);
		}
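
The visitor above records one address range per section contribution and drops any range that overlaps one already stored, so a later lookup maps an address to at most one module. A rough standalone sketch of that idea follows. It swaps in `std::map` for the interval map used in the patch, treats ranges as half-open, and requires C++17 for `std::optional`. `AddrToModuleMap`, `insert`, and `lookup` are hypothetical names.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Maps non-overlapping half-open address ranges [Start, End) to a module index.
class AddrToModuleMap {
  struct Entry {
    uint64_t End;
    uint16_t Module;
  };
  std::map<uint64_t, Entry> Ranges; // keyed by range start

public:
  // Inserts [Start, End) unless it is empty or overlaps an existing range.
  bool insert(uint64_t Start, uint64_t End, uint16_t Module) {
    if (Start >= End)
      return false;
    auto Next = Ranges.lower_bound(Start); // first range starting at or after Start
    if (Next != Ranges.end() && Next->first < End)
      return false; // the following range begins inside the new one
    if (Next != Ranges.begin() && std::prev(Next)->second.End > Start)
      return false; // the preceding range extends into the new one
    Ranges.emplace(Start, Entry{End, Module});
    return true;
  }

  // Returns the module whose range contains Addr, if any.
  std::optional<uint16_t> lookup(uint64_t Addr) const {
    auto It = Ranges.upper_bound(Addr); // first range starting after Addr
    if (It == Ranges.begin())
      return std::nullopt;
    --It;
    if (Addr < It->second.End)
      return It->second.Module;
    return std::nullopt;
  }
};
```

Both operations are logarithmic in the number of stored ranges, in line with the `O(log(#inputsections))` cost rnk mentions for reaching the per-function loop.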

		Expected<ModuleDebugStreamRef>
		NativeSession::getModuleDebugStream(uint32_t Index) const {
		auto Dbi = getDbiStreamPtr(Pdb);
		assert(Dbi && "Dbi stream not present");

		DbiModuleDescriptor Modi = Dbi->modules().getModuleDescriptor(Index);

		uint16_t ModiStream = Modi.getModuleStreamIndex();
		if (ModiStream == kInvalidStreamIndex)
		return make_error<RawError>("Module stream not present");

		std::unique_ptr<msf::MappedBlockStream> ModStreamData =
		Pdb->createIndexedStream(ModiStream);

		ModuleDebugStreamRef ModS(Modi, std::move(ModStreamData));
		if (auto EC = ModS.reload())
		return std::move(EC);

		return std::move(ModS);
		}

llvm/lib/DebugInfo/PDB/Native/SymbolCache.cpp

#include "llvm/DebugInfo/PDB/Native/SymbolCache.h"		#include "llvm/DebugInfo/PDB/Native/SymbolCache.h"

		#include "llvm/DebugInfo/CodeView/DebugInlineeLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugLinesSubsection.h"		#include "llvm/DebugInfo/CodeView/DebugLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/SymbolDeserializer.h"		#include "llvm/DebugInfo/CodeView/SymbolDeserializer.h"
#include "llvm/DebugInfo/CodeView/TypeDeserializer.h"		#include "llvm/DebugInfo/CodeView/TypeDeserializer.h"
#include "llvm/DebugInfo/CodeView/TypeRecordHelpers.h"		#include "llvm/DebugInfo/CodeView/TypeRecordHelpers.h"
#include "llvm/DebugInfo/PDB/Native/DbiStream.h"		#include "llvm/DebugInfo/PDB/Native/DbiStream.h"
#include "llvm/DebugInfo/PDB/Native/GlobalsStream.h"		#include "llvm/DebugInfo/PDB/Native/GlobalsStream.h"
#include "llvm/DebugInfo/PDB/Native/ISectionContribVisitor.h"		#include "llvm/DebugInfo/PDB/Native/ISectionContribVisitor.h"
#include "llvm/DebugInfo/PDB/Native/NativeCompilandSymbol.h"		#include "llvm/DebugInfo/PDB/Native/NativeCompilandSymbol.h"
#include "llvm/DebugInfo/PDB/Native/NativeEnumGlobals.h"		#include "llvm/DebugInfo/PDB/Native/NativeEnumGlobals.h"
#include "llvm/DebugInfo/PDB/Native/NativeEnumLineNumbers.h"		#include "llvm/DebugInfo/PDB/Native/NativeEnumLineNumbers.h"
		#include "llvm/DebugInfo/PDB/Native/NativeEnumSymbols.h"
#include "llvm/DebugInfo/PDB/Native/NativeEnumTypes.h"		#include "llvm/DebugInfo/PDB/Native/NativeEnumTypes.h"
#include "llvm/DebugInfo/PDB/Native/NativeFunctionSymbol.h"		#include "llvm/DebugInfo/PDB/Native/NativeFunctionSymbol.h"
		#include "llvm/DebugInfo/PDB/Native/NativeInlineSiteSymbol.h"
#include "llvm/DebugInfo/PDB/Native/NativePublicSymbol.h"		#include "llvm/DebugInfo/PDB/Native/NativePublicSymbol.h"
#include "llvm/DebugInfo/PDB/Native/NativeRawSymbol.h"		#include "llvm/DebugInfo/PDB/Native/NativeRawSymbol.h"
#include "llvm/DebugInfo/PDB/Native/NativeSession.h"		#include "llvm/DebugInfo/PDB/Native/NativeSession.h"
#include "llvm/DebugInfo/PDB/Native/NativeTypeArray.h"		#include "llvm/DebugInfo/PDB/Native/NativeTypeArray.h"
#include "llvm/DebugInfo/PDB/Native/NativeTypeBuiltin.h"		#include "llvm/DebugInfo/PDB/Native/NativeTypeBuiltin.h"
#include "llvm/DebugInfo/PDB/Native/NativeTypeEnum.h"		#include "llvm/DebugInfo/PDB/Native/NativeTypeEnum.h"
#include "llvm/DebugInfo/PDB/Native/NativeTypeFunctionSig.h"		#include "llvm/DebugInfo/PDB/Native/NativeTypeFunctionSig.h"
#include "llvm/DebugInfo/PDB/Native/NativeTypePointer.h"		#include "llvm/DebugInfo/PDB/Native/NativeTypePointer.h"
// ... 40 lines not shown ...	uint32_t Size;
{codeview::SimpleTypeKind::Float64, PDB_BuiltinType::Float, 8},		{codeview::SimpleTypeKind::Float64, PDB_BuiltinType::Float, 8},
{codeview::SimpleTypeKind::Float80, PDB_BuiltinType::Float, 10},		{codeview::SimpleTypeKind::Float80, PDB_BuiltinType::Float, 10},
{codeview::SimpleTypeKind::Boolean8, PDB_BuiltinType::Bool, 1},		{codeview::SimpleTypeKind::Boolean8, PDB_BuiltinType::Bool, 1},
// This table can be grown as necessary, but these are the only types we've		// This table can be grown as necessary, but these are the only types we've
// needed so far.		// needed so far.
};		};

SymbolCache::SymbolCache(NativeSession &Session, DbiStream *Dbi)		SymbolCache::SymbolCache(NativeSession &Session, DbiStream *Dbi)
: Session(Session), Dbi(Dbi), AddrToModuleIndex(IMapAllocator) {		: Session(Session), Dbi(Dbi) {
		rnk: Please remove the commented out part.
// Id 0 is reserved for the invalid symbol.		// Id 0 is reserved for the invalid symbol.
Cache.push_back(nullptr);		Cache.push_back(nullptr);
SourceFiles.push_back(nullptr);		SourceFiles.push_back(nullptr);

if (Dbi)		if (Dbi)
Compilands.resize(Dbi->modules().getModuleCount());		Compilands.resize(Dbi->modules().getModuleCount());
}		}

// ... 16 lines not shown ...

std::unique_ptr<IPDBEnumSymbols>		std::unique_ptr<IPDBEnumSymbols>
SymbolCache::createGlobalsEnumerator(codeview::SymbolKind Kind) {		SymbolCache::createGlobalsEnumerator(codeview::SymbolKind Kind) {
return std::unique_ptr<IPDBEnumSymbols>(		return std::unique_ptr<IPDBEnumSymbols>(
new NativeEnumGlobals(Session, {Kind}));		new NativeEnumGlobals(Session, {Kind}));
}		}

SymIndexId SymbolCache::createSimpleType(TypeIndex Index,		SymIndexId SymbolCache::createSimpleType(TypeIndex Index,
ModifierOptions Mods) {		ModifierOptions Mods) const {
if (Index.getSimpleMode() != codeview::SimpleTypeMode::Direct)		if (Index.getSimpleMode() != codeview::SimpleTypeMode::Direct)
return createSymbol<NativeTypePointer>(Index);		return createSymbol<NativeTypePointer>(Index);

const auto Kind = Index.getSimpleKind();		const auto Kind = Index.getSimpleKind();
const auto It = std::find_if(		const auto It = std::find_if(
std::begin(BuiltinTypes), std::end(BuiltinTypes),		std::begin(BuiltinTypes), std::end(BuiltinTypes),
[Kind](const BuiltinTypeEntry &Builtin) { return Builtin.Kind == Kind; });		[Kind](const BuiltinTypeEntry &Builtin) { return Builtin.Kind == Kind; });
if (It == std::end(BuiltinTypes))		if (It == std::end(BuiltinTypes))
return 0;		return 0;
return createSymbol<NativeTypeBuiltin>(Mods, It->Type, It->Size);		return createSymbol<NativeTypeBuiltin>(Mods, It->Type, It->Size);
}		}

SymIndexId		SymIndexId
SymbolCache::createSymbolForModifiedType(codeview::TypeIndex ModifierTI,		SymbolCache::createSymbolForModifiedType(codeview::TypeIndex ModifierTI,
codeview::CVType CVT) {		codeview::CVType CVT) const {
ModifierRecord Record;		ModifierRecord Record;
if (auto EC = TypeDeserializer::deserializeAs<ModifierRecord>(CVT, Record)) {		if (auto EC = TypeDeserializer::deserializeAs<ModifierRecord>(CVT, Record)) {
consumeError(std::move(EC));		consumeError(std::move(EC));
return 0;		return 0;
}		}

if (Record.ModifiedType.isSimple())		if (Record.ModifiedType.isSimple())
return createSimpleType(Record.ModifiedType, Record.Modifiers);		return createSimpleType(Record.ModifiedType, Record.Modifiers);
// ... 13 lines not shown ...	default:
// No other types can be modified. (LF_POINTER, for example, records		// No other types can be modified. (LF_POINTER, for example, records
// its modifiers a different way.		// its modifiers a different way.
assert(false && "Invalid LF_MODIFIER record");		assert(false && "Invalid LF_MODIFIER record");
break;		break;
}		}
return 0;		return 0;
}		}

SymIndexId SymbolCache::findSymbolByTypeIndex(codeview::TypeIndex Index) {		SymIndexId SymbolCache::findSymbolByTypeIndex(codeview::TypeIndex Index) const {
// First see if it's already in our cache.		// First see if it's already in our cache.
const auto Entry = TypeIndexToSymbolId.find(Index);		const auto Entry = TypeIndexToSymbolId.find(Index);
if (Entry != TypeIndexToSymbolId.end())		if (Entry != TypeIndexToSymbolId.end())
return Entry->second;		return Entry->second;

// Symbols for built-in types are created on the fly.		// Symbols for built-in types are created on the fly.
if (Index.isSimple()) {		if (Index.isSimple()) {
SymIndexId Result = createSimpleType(Index, ModifierOptions::None);		SymIndexId Result = createSimpleType(Index, ModifierOptions::None);
// ... 125 lines not shown ...	SymIndexId SymbolCache::getOrCreateGlobalSymbolByOffset(uint32_t Offset) {
if (Id != 0) {		if (Id != 0) {
assert(GlobalOffsetToSymbolId.count(Offset) == 0);		assert(GlobalOffsetToSymbolId.count(Offset) == 0);
GlobalOffsetToSymbolId[Offset] = Id;		GlobalOffsetToSymbolId[Offset] = Id;
}		}

return Id;		return Id;
}		}

Expected<ModuleDebugStreamRef>		SymIndexId SymbolCache::getOrCreateInlineSymbol(InlineSiteSym Sym,
SymbolCache::getModuleDebugStream(uint32_t Index) const {		uint64_t ParentAddr,
assert(Dbi && "Dbi stream not present");		uint16_t Modi,
		uint32_t RecordOffset) const {
DbiModuleDescriptor Modi = Dbi->modules().getModuleDescriptor(Index);		auto Iter = SymTabOffsetToSymbolId.find({Modi, RecordOffset});
		if (Iter != SymTabOffsetToSymbolId.end())
uint16_t ModiStream = Modi.getModuleStreamIndex();		return Iter->second;
if (ModiStream == kInvalidStreamIndex)
return make_error<RawError>("Module stream not present");

std::unique_ptr<msf::MappedBlockStream> ModStreamData =
Session.getPDBFile().createIndexedStream(ModiStream);

ModuleDebugStreamRef ModS(Modi, std::move(ModStreamData));
if (auto EC = ModS.reload())
return std::move(EC);

return std::move(ModS);		SymIndexId Id = createSymbol<NativeInlineSiteSymbol>(Sym, ParentAddr);
		SymTabOffsetToSymbolId.insert({{Modi, RecordOffset}, Id});
		return Id;
}		}

std::unique_ptr<PDBSymbol>		std::unique_ptr<PDBSymbol>
SymbolCache::findSymbolBySectOffset(uint32_t Sect, uint32_t Offset,		SymbolCache::findSymbolBySectOffset(uint32_t Sect, uint32_t Offset,
PDB_SymType Type) {		PDB_SymType Type) {
if (AddrToModuleIndex.empty())
parseSectionContribs();

switch (Type) {		switch (Type) {
case PDB_SymType::Function:		case PDB_SymType::Function:
return findFunctionSymbolBySectOffset(Sect, Offset);		return findFunctionSymbolBySectOffset(Sect, Offset);
case PDB_SymType::PublicSymbol:		case PDB_SymType::PublicSymbol:
return findPublicSymbolBySectOffset(Sect, Offset);		return findPublicSymbolBySectOffset(Sect, Offset);
case PDB_SymType::Compiland: {		case PDB_SymType::Compiland: {
Optional<uint16_t> Modi =		uint16_t Modi;
getModuleIndexForAddr(Session.getVAFromSectOffset(Sect, Offset));		if (!Session.moduleIndexForSectOffset(Sect, Offset, Modi))
if (!Modi)
return nullptr;		return nullptr;
return getOrCreateCompiland(*Modi);		return getOrCreateCompiland(Modi);
}		}
case PDB_SymType::None: {		case PDB_SymType::None: {
// FIXME: Implement for PDB_SymType::Data. The symbolizer calls this but		// FIXME: Implement for PDB_SymType::Data. The symbolizer calls this but
// only uses it to find the symbol length.		// only uses it to find the symbol length.
if (auto Sym = findFunctionSymbolBySectOffset(Sect, Offset))		if (auto Sym = findFunctionSymbolBySectOffset(Sect, Offset))
return Sym;		return Sym;
return nullptr;		return nullptr;
}		}
default:		default:
return nullptr;		return nullptr;
}		}
}		}

std::unique_ptr<PDBSymbol>		std::unique_ptr<PDBSymbol>
SymbolCache::findFunctionSymbolBySectOffset(uint32_t Sect, uint32_t Offset) {		SymbolCache::findFunctionSymbolBySectOffset(uint32_t Sect, uint32_t Offset) {
auto Iter = AddressToSymbolId.find({Sect, Offset});		auto Iter = AddressToSymbolId.find({Sect, Offset});
if (Iter != AddressToSymbolId.end())		if (Iter != AddressToSymbolId.end())
return getSymbolById(Iter->second);		return getSymbolById(Iter->second);

if (!Dbi)		if (!Dbi)
return nullptr;		return nullptr;

auto Modi = getModuleIndexForAddr(Session.getVAFromSectOffset(Sect, Offset));		uint16_t Modi;
if (!Modi)		if (!Session.moduleIndexForSectOffset(Sect, Offset, Modi))
return nullptr;		return nullptr;

auto ExpectedModS = getModuleDebugStream(*Modi);		Expected<ModuleDebugStreamRef> ExpectedModS =
		Session.getModuleDebugStream(Modi);
if (!ExpectedModS) {		if (!ExpectedModS) {
consumeError(ExpectedModS.takeError());		consumeError(ExpectedModS.takeError());
return nullptr;		return nullptr;
}		}
CVSymbolArray Syms = ExpectedModS->getSymbolArray();		CVSymbolArray Syms = ExpectedModS->getSymbolArray();

// Search for the symbol in this module.		// Search for the symbol in this module.
for (auto I = Syms.begin(), E = Syms.end(); I != E; ++I) {		for (auto I = Syms.begin(), E = Syms.end(); I != E; ++I) {
if (I->kind() != S_LPROC32 && I->kind() != S_GPROC32)		if (I->kind() != S_LPROC32 && I->kind() != S_GPROC32)
continue;		continue;
auto PS = cantFail(SymbolDeserializer::deserializeAs<ProcSym>(*I));		auto PS = cantFail(SymbolDeserializer::deserializeAs<ProcSym>(*I));
if (Sect == PS.Segment && Offset >= PS.CodeOffset &&		if (Sect == PS.Segment && Offset >= PS.CodeOffset &&
Offset < PS.CodeOffset + PS.CodeSize) {		Offset < PS.CodeOffset + PS.CodeSize) {
// Check if the symbol is already cached.		// Check if the symbol is already cached.
auto Found = AddressToSymbolId.find({PS.Segment, PS.CodeOffset});		auto Found = AddressToSymbolId.find({PS.Segment, PS.CodeOffset});
if (Found != AddressToSymbolId.end())		if (Found != AddressToSymbolId.end())
return getSymbolById(Found->second);		return getSymbolById(Found->second);

// Otherwise, create a new symbol.		// Otherwise, create a new symbol.
SymIndexId Id = createSymbol<NativeFunctionSymbol>(PS);		SymIndexId Id = createSymbol<NativeFunctionSymbol>(PS, I.offset());
AddressToSymbolId.insert({{PS.Segment, PS.CodeOffset}, Id});		AddressToSymbolId.insert({{PS.Segment, PS.CodeOffset}, Id});
return getSymbolById(Id);		return getSymbolById(Id);
}		}

// Jump to the end of this ProcSym.		// Jump to the end of this ProcSym.
I = Syms.at(PS.End);		I = Syms.at(PS.End);
}		}
return nullptr;		return nullptr;
// ... 68 lines not shown ...	SymbolCache::findLineTable(uint16_t Modi) const {
auto LineTableIter = LineTable.find(Modi);		auto LineTableIter = LineTable.find(Modi);
if (LineTableIter != LineTable.end())		if (LineTableIter != LineTable.end())
return LineTableIter->second;		return LineTableIter->second;

std::vector<LineTableEntry> &ModuleLineTable = LineTable[Modi];		std::vector<LineTableEntry> &ModuleLineTable = LineTable[Modi];

// If there is an error or there are no lines, just return the		// If there is an error or there are no lines, just return the
// empty vector.		// empty vector.
Expected<ModuleDebugStreamRef> ExpectedModS = getModuleDebugStream(Modi);		Expected<ModuleDebugStreamRef> ExpectedModS =
		Session.getModuleDebugStream(Modi);
if (!ExpectedModS) {		if (!ExpectedModS) {
consumeError(ExpectedModS.takeError());		consumeError(ExpectedModS.takeError());
return ModuleLineTable;		return ModuleLineTable;
}		}

std::vector<std::vector<LineTableEntry>> EntryList;		std::vector<std::vector<LineTableEntry>> EntryList;
for (const auto &SS : ExpectedModS->getSubsectionsArray()) {		for (const auto &SS : ExpectedModS->getSubsectionsArray()) {
if (SS.kind() != DebugSubsectionKind::Lines)		if (SS.kind() != DebugSubsectionKind::Lines)
// ... 54 lines not shown ...	for (size_t I = 0; I < EntryList.size(); ++I)
ModuleLineTable.insert(ModuleLineTable.end(), EntryList[I].begin(),		ModuleLineTable.insert(ModuleLineTable.end(), EntryList[I].begin(),
EntryList[I].end());		EntryList[I].end());

return ModuleLineTable;		return ModuleLineTable;
}		}

std::unique_ptr<IPDBEnumLineNumbers>		std::unique_ptr<IPDBEnumLineNumbers>
SymbolCache::findLineNumbersByVA(uint64_t VA, uint32_t Length) const {		SymbolCache::findLineNumbersByVA(uint64_t VA, uint32_t Length) const {
Optional<uint16_t> MaybeModi = getModuleIndexForAddr(VA);		uint16_t Modi;
if (!MaybeModi)		if (!Session.moduleIndexForVA(VA, Modi))
return nullptr;		return nullptr;
uint16_t Modi = *MaybeModi;

std::vector<LineTableEntry> Lines = findLineTable(Modi);		std::vector<LineTableEntry> Lines = findLineTable(Modi);
if (Lines.empty())		if (Lines.empty())
return nullptr;		return nullptr;

// Find the first line in the line table whose address is not greater than		// Find the first line in the line table whose address is not greater than
// the one we are searching for.		// the one we are searching for.
auto LineIter = llvm::partition_point(Lines, [&](const LineTableEntry &E) {		auto LineIter = llvm::partition_point(Lines, [&](const LineTableEntry &E) {
return (E.Addr < VA || (E.Addr == VA && E.IsTerminalEntry));		return (E.Addr < VA || (E.Addr == VA && E.IsTerminalEntry));
});		});

// Try to back up if we've gone too far.		// Try to back up if we've gone too far.
if (LineIter == Lines.end() || LineIter->Addr > VA) {		if (LineIter == Lines.end() || LineIter->Addr > VA) {
if (LineIter == Lines.begin() || std::prev(LineIter)->IsTerminalEntry)		if (LineIter == Lines.begin() || std::prev(LineIter)->IsTerminalEntry)
return nullptr;		return nullptr;
--LineIter;		--LineIter;
}		}

Expected<ModuleDebugStreamRef> ExpectedModS = getModuleDebugStream(Modi);		Expected<ModuleDebugStreamRef> ExpectedModS =
		Session.getModuleDebugStream(Modi);
if (!ExpectedModS) {		if (!ExpectedModS) {
consumeError(ExpectedModS.takeError());		consumeError(ExpectedModS.takeError());
return nullptr;		return nullptr;
}		}
Expected<DebugChecksumsSubsectionRef> ExpectedChecksums =		Expected<DebugChecksumsSubsectionRef> ExpectedChecksums =
ExpectedModS->findChecksumsSubsection();		ExpectedModS->findChecksumsSubsection();
if (!ExpectedChecksums) {		if (!ExpectedChecksums) {
consumeError(ExpectedChecksums.takeError());		consumeError(ExpectedChecksums.takeError());
return nullptr;		return nullptr;
}		}

// Populate a vector of NativeLineNumbers that have addresses in the given		// Populate a vector of NativeLineNumbers that have addresses in the given
// address range.		// address range.
Optional<uint16_t> EndModi = getModuleIndexForAddr(VA + Length);
akhuang (author): FYI, I removed this bit of code since it was buggy. It searched for line numbers in the next compile unit if we hadn't yet exceeded the length. I was just trying to match DIA's behavior here, but I don't think we ever use this.
rnk: Makes sense.
if (!EndModi)
return nullptr;
std::vector<NativeLineNumber> LineNumbers;		std::vector<NativeLineNumber> LineNumbers;
while (Modi <= *EndModi) {		while (LineIter != Lines.end()) {
// If we reached the end of the current module, increment Modi and get the
// new line table and checksums array.
if (LineIter == Lines.end()) {
++Modi;

ExpectedModS = getModuleDebugStream(Modi);
if (!ExpectedModS) {
consumeError(ExpectedModS.takeError());
break;
}
ExpectedChecksums = ExpectedModS->findChecksumsSubsection();
if (!ExpectedChecksums) {
consumeError(ExpectedChecksums.takeError());
break;
}

Lines = findLineTable(Modi);
LineIter = Lines.begin();

if (Lines.empty())
continue;
}

if (LineIter->IsTerminalEntry) {		if (LineIter->IsTerminalEntry) {
++LineIter;		++LineIter;
continue;		continue;
}		}

// If the line is still within the address range, create a NativeLineNumber		// If the line is still within the address range, create a NativeLineNumber
// and add to the list.		// and add to the list.
if (LineIter->Addr > VA + Length)		if (LineIter->Addr > VA + Length)
[...]
SymbolCache::getOrCreateSourceFile(const FileChecksumEntry &Checksums) const {
[...]

SymIndexId Id = SourceFiles.size();		SymIndexId Id = SourceFiles.size();
auto SrcFile = std::make_unique<NativeSourceFile>(Session, Id, Checksums);		auto SrcFile = std::make_unique<NativeSourceFile>(Session, Id, Checksums);
SourceFiles.push_back(std::move(SrcFile));		SourceFiles.push_back(std::move(SrcFile));
FileNameOffsetToId[Checksums.FileNameOffset] = Id;		FileNameOffsetToId[Checksums.FileNameOffset] = Id;
return Id;		return Id;
}		}
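
The lookup in `findLineNumbersByVA` above combines `llvm::partition_point` with a one-step backup. Below is a minimal standalone sketch of that pattern. It uses `std::partition_point` and a hypothetical `LineEntry` struct in place of `LineTableEntry`; the helper name `findCoveringLine` and the sort-order assumption stated in the comments are illustrative and not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical stand-in for the LineTableEntry used in the diff.
struct LineEntry {
  uint64_t Addr;
  uint32_t Line;
  bool IsTerminalEntry; // marks the end of a contiguous block of lines
};

// Returns the index of the entry that covers VA, or nullopt if VA falls
// before the table, after it, or in a gap that follows a terminal entry.
// Assumption: Lines is sorted by Addr, and a terminal entry sorts before a
// regular entry that has the same address.
std::optional<std::size_t>
findCoveringLine(const std::vector<LineEntry> &Lines, uint64_t VA) {
  assert(std::is_sorted(Lines.begin(), Lines.end(),
                        [](const LineEntry &A, const LineEntry &B) {
                          return A.Addr < B.Addr;
                        }) &&
         "line table must be sorted by address");

  // First entry that is neither below VA nor a terminal entry at VA.
  auto It = std::partition_point(
      Lines.begin(), Lines.end(), [&](const LineEntry &E) {
        return E.Addr < VA || (E.Addr == VA && E.IsTerminalEntry);
      });

  // Back up one entry if we overshot, unless nothing precedes us or the
  // preceding entry closes its block.
  if (It == Lines.end() || It->Addr > VA) {
    if (It == Lines.begin() || std::prev(It)->IsTerminalEntry)
      return std::nullopt;
    --It;
  }
  return static_cast<std::size_t>(It - Lines.begin());
}
```

An address that sits between a terminal entry and the start of the next block yields no result, which corresponds to the `return nullptr` path in the diff.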

void SymbolCache::parseSectionContribs() {
if (!Dbi)
return;

class Visitor : public ISectionContribVisitor {
NativeSession &Session;
IMap &AddrMap;

public:
Visitor(NativeSession &Session, IMap &AddrMap)
: Session(Session), AddrMap(AddrMap) {}
void visit(const SectionContrib &C) override {
if (C.Size == 0)
return;

uint64_t VA = Session.getVAFromSectOffset(C.ISect, C.Off);
uint64_t End = VA + C.Size;

// Ignore overlapping sections based on the assumption that a valid
// PDB file should not have overlaps.
if (!AddrMap.overlaps(VA, End))
AddrMap.insert(VA, End, C.Imod);
}
void visit(const SectionContrib2 &C) override { visit(C.Base); }
};

Visitor V(Session, AddrToModuleIndex);
Dbi->visitSectionContributions(V);
}

Optional<uint16_t> SymbolCache::getModuleIndexForAddr(uint64_t Addr) const {
auto Iter = AddrToModuleIndex.find(Addr);
if (Iter == AddrToModuleIndex.end())
return None;
return Iter.value();
}
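
The removed `parseSectionContribs` / `getModuleIndexForAddr` pair above maps an address to the module whose section contribution contains it, skipping contributions that overlap an existing entry. A rough standalone sketch of the same idea follows. It assumes a `std::map` keyed by range start with half-open ranges `[Start, End)` rather than the `IMap` type used in the patch, and the class and method names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Hypothetical substitute for the IMap in the diff. Each key is the start of
// a half-open range [Start, End); the value holds End and the module index.
class AddrToModuleMap {
  struct Range {
    uint64_t End;
    uint16_t Modi;
  };
  std::map<uint64_t, Range> Ranges;

public:
  // Inserts [Start, End) unless it overlaps an existing range.
  // Returns true if the range was inserted.
  bool insertIfNoOverlap(uint64_t Start, uint64_t End, uint16_t Modi) {
    assert(Start < End && "callers should skip empty contributions");
    auto Next = Ranges.lower_bound(Start); // first range at or after Start
    if (Next != Ranges.end() && Next->first < End)
      return false; // the following range begins inside the new one
    if (Next != Ranges.begin() && std::prev(Next)->second.End > Start)
      return false; // the preceding range extends into the new one
    Ranges.emplace(Start, Range{End, Modi});
    return true;
  }

  // Returns the module index of the range containing Addr, if any.
  std::optional<uint16_t> lookup(uint64_t Addr) const {
    auto It = Ranges.upper_bound(Addr); // first range starting after Addr
    if (It == Ranges.begin())
      return std::nullopt;
    --It;
    if (Addr >= It->second.End)
      return std::nullopt;
    return It->second.Modi;
  }
};
```

As in the visitor above, an overlapping contribution is dropped silently, on the assumption that a valid PDB file should not contain overlaps.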

llvm/lib/DebugInfo/PDB/PDBContext.cpp

[...]
PDBContext::getLineInfoForAddressRange(object::SectionedAddress Address,
[...]
}		}
return Table;		return Table;
}		}

DIInliningInfo		DIInliningInfo
PDBContext::getInliningInfoForAddress(object::SectionedAddress Address,		PDBContext::getInliningInfoForAddress(object::SectionedAddress Address,
DILineInfoSpecifier Specifier) {		DILineInfoSpecifier Specifier) {
DIInliningInfo InlineInfo;		DIInliningInfo InlineInfo;
DILineInfo Frame = getLineInfoForAddress(Address, Specifier);		DILineInfo CurrentLine = getLineInfoForAddress(Address, Specifier);
InlineInfo.addFrame(Frame);
		// Find the function at this address.
		std::unique_ptr<PDBSymbol> ParentFunc =
		Session->findSymbolByAddress(Address.Address, PDB_SymType::Function);
		if (!ParentFunc) {
		InlineInfo.addFrame(CurrentLine);
		return InlineInfo;
		}

		auto Frames = ParentFunc->findInlineFramesByVA(Address.Address);
		if (!Frames || Frames->getChildCount() == 0) {
		InlineInfo.addFrame(CurrentLine);
		return InlineInfo;
		}

		while (auto Frame = Frames->getNext()) {
		uint32_t Length = 1;
		auto LineNumbers = Frame->findInlineeLinesByVA(Address.Address, Length);
		if (!LineNumbers || LineNumbers->getChildCount() == 0)
		break;

		std::unique_ptr<IPDBLineNumber> Line = LineNumbers->getNext();
		assert(Line);

		DILineInfo LineInfo;
		LineInfo.FunctionName = Frame->getName();
		auto SourceFile = Session->getSourceFileById(Line->getSourceFileId());
		if (SourceFile &&
		Specifier.FLIKind != DILineInfoSpecifier::FileLineInfoKind::None)
		LineInfo.FileName = SourceFile->getFileName();
		LineInfo.Line = Line->getLineNumber();
		LineInfo.Column = Line->getColumnNumber();
		InlineInfo.addFrame(LineInfo);
		}

		InlineInfo.addFrame(CurrentLine);
return InlineInfo;		return InlineInfo;
}		}
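
The new `getInliningInfoForAddress` builds its result in a fixed order: one frame per inline site that has line info, stopping at the first site without any, and then the frame for the containing function. The sketch below models only that ordering with plain structs. `InlineSite`, `FrameInfo`, and `buildInlineStack` are hypothetical names, and the innermost-first order of the input is an assumption about what `findInlineFramesByVA` returns.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for DILineInfo.
struct FrameInfo {
  std::string FunctionName;
  unsigned Line;
};

// Hypothetical inline site: Line is empty when no inlinee lines were found.
struct InlineSite {
  std::string Name;
  std::optional<unsigned> Line;
};

// Sites are assumed to be ordered innermost first. Containing describes the
// enclosing, non-inlined function at the queried address.
std::vector<FrameInfo> buildInlineStack(const std::vector<InlineSite> &Sites,
                                        const FrameInfo &Containing) {
  std::vector<FrameInfo> Stack;
  for (const InlineSite &S : Sites) {
    if (!S.Line)
      break; // mirrors the early break when no inlinee lines are found
    Stack.push_back({S.Name, *S.Line});
  }
  // The containing function's frame always comes last.
  Stack.push_back(Containing);
  return Stack;
}
```

With no inline sites at all, the result is just the containing frame, which matches the two early-return paths in the diff.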

std::vector<DILocal>		std::vector<DILocal>
PDBContext::getLocalsForAddress(object::SectionedAddress Address) {		PDBContext::getLocalsForAddress(object::SectionedAddress Address) {
return std::vector<DILocal>();		return std::vector<DILocal>();
}		}

[...]

llvm/lib/DebugInfo/PDB/PDBSymbol.cpp

[...]

	std::unique_ptr<IPDBEnumSymbols>			std::unique_ptr<IPDBEnumSymbols>
	PDBSymbol::findChildrenByRVA(PDB_SymType Type, StringRef Name,			PDBSymbol::findChildrenByRVA(PDB_SymType Type, StringRef Name,
	PDB_NameSearchFlags Flags, uint32_t RVA) const {			PDB_NameSearchFlags Flags, uint32_t RVA) const {
	return RawSymbol->findChildrenByRVA(Type, Name, Flags, RVA);			return RawSymbol->findChildrenByRVA(Type, Name, Flags, RVA);
	}			}

	std::unique_ptr<IPDBEnumSymbols>			std::unique_ptr<IPDBEnumSymbols>
				PDBSymbol::findInlineFramesByVA(uint64_t VA) const {
				return RawSymbol->findInlineFramesByVA(VA);
				}

				std::unique_ptr<IPDBEnumSymbols>
	PDBSymbol::findInlineFramesByRVA(uint32_t RVA) const {			PDBSymbol::findInlineFramesByRVA(uint32_t RVA) const {
	return RawSymbol->findInlineFramesByRVA(RVA);			return RawSymbol->findInlineFramesByRVA(RVA);
	}			}

				std::unique_ptr<IPDBEnumLineNumbers>
				PDBSymbol::findInlineeLinesByVA(uint64_t VA, uint32_t Length) const {
				return RawSymbol->findInlineeLinesByVA(VA, Length);
				}

				std::unique_ptr<IPDBEnumLineNumbers>
				PDBSymbol::findInlineeLinesByRVA(uint32_t RVA, uint32_t Length) const {
				return RawSymbol->findInlineeLinesByRVA(RVA, Length);
				}

				std::string PDBSymbol::getName() const { return RawSymbol->getName(); }

	std::unique_ptr<IPDBEnumSymbols>			std::unique_ptr<IPDBEnumSymbols>
	PDBSymbol::getChildStats(TagStats &Stats) const {			PDBSymbol::getChildStats(TagStats &Stats) const {
	std::unique_ptr<IPDBEnumSymbols> Result(findAllChildren());			std::unique_ptr<IPDBEnumSymbols> Result(findAllChildren());
	if (!Result)			if (!Result)
	return nullptr;			return nullptr;
	Stats.clear();			Stats.clear();
	while (auto Child = Result->getNext()) {			while (auto Child = Result->getNext()) {
	++Stats[Child->getSymTag()];			++Stats[Child->getSymTag()];
[...]

llvm/utils/gn/secondary/llvm/lib/DebugInfo/PDB/BUILD.gn

[...]
sources = [
[...]
"Native/ModuleDebugStream.cpp",		"Native/ModuleDebugStream.cpp",
"Native/NamedStreamMap.cpp",		"Native/NamedStreamMap.cpp",
"Native/NativeCompilandSymbol.cpp",		"Native/NativeCompilandSymbol.cpp",
"Native/NativeEnumGlobals.cpp",		"Native/NativeEnumGlobals.cpp",
"Native/NativeEnumInjectedSources.cpp",		"Native/NativeEnumInjectedSources.cpp",
"Native/NativeEnumLineNumbers.cpp",		"Native/NativeEnumLineNumbers.cpp",
"Native/NativeEnumModules.cpp",		"Native/NativeEnumModules.cpp",
"Native/NativeEnumTypes.cpp",		"Native/NativeEnumTypes.cpp",
		"Native/NativeEnumSymbols.cpp",
"Native/NativeExeSymbol.cpp",		"Native/NativeExeSymbol.cpp",
		"Native/NativeInlineSiteSymbol.cpp",
"Native/NativeFunctionSymbol.cpp",		"Native/NativeFunctionSymbol.cpp",
"Native/NativeLineNumber.cpp",		"Native/NativeLineNumber.cpp",
"Native/NativePublicSymbol.cpp",		"Native/NativePublicSymbol.cpp",
"Native/NativeRawSymbol.cpp",		"Native/NativeRawSymbol.cpp",
"Native/NativeSession.cpp",		"Native/NativeSession.cpp",
"Native/NativeSourceFile.cpp",		"Native/NativeSourceFile.cpp",
"Native/NativeSymbolEnumerator.cpp",		"Native/NativeSymbolEnumerator.cpp",
"Native/NativeTypeArray.cpp",		"Native/NativeTypeArray.cpp",
[...]

[llvm-symbolizer] Add inline stack traces for Windows.
Closed, Public

Revision Contents

Diff 305892

compiler-rt/test/asan/TestCases/suppressions-function.cpp

lld/test/COFF/symbolizer-inline.s

llvm/include/llvm/DebugInfo/PDB/Native/NativeEnumSymbols.h

llvm/include/llvm/DebugInfo/PDB/Native/NativeFunctionSymbol.h

llvm/include/llvm/DebugInfo/PDB/Native/NativeInlineSiteSymbol.h

llvm/include/llvm/DebugInfo/PDB/Native/NativeSession.h

llvm/include/llvm/DebugInfo/PDB/Native/SymbolCache.h

llvm/include/llvm/DebugInfo/PDB/PDBSymbol.h

llvm/lib/DebugInfo/PDB/CMakeLists.txt

llvm/lib/DebugInfo/PDB/Native/NativeEnumSymbols.cpp

llvm/lib/DebugInfo/PDB/Native/NativeFunctionSymbol.cpp

llvm/lib/DebugInfo/PDB/Native/NativeInlineSiteSymbol.cpp

llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp

llvm/lib/DebugInfo/PDB/Native/SymbolCache.cpp

llvm/lib/DebugInfo/PDB/PDBContext.cpp

llvm/lib/DebugInfo/PDB/PDBSymbol.cpp

llvm/utils/gn/secondary/llvm/lib/DebugInfo/PDB/BUILD.gn