This patch adds support for parsing STACK CFI records from breakpad
files. For the expressions specifying the values of registers, only a
basic parse is performed -- the record is split into a bunch of key-value
pairs, where both keys and values are strings. The idea is that these
will be handed off to the postfix expression -> dwarf compiler, once
it is extracted from the internals of the NativePDB plugin
Details
Diff Detail
- Build Status
Buildable 30094 Build 30093: arc lint + arc unit
Event Timeline
This looks good. I have a few questions inline, but none of those are major concerns.
source/Plugins/ObjectFile/Breakpad/BreakpadRecords.h | ||
---|---|---|
173 | I'm not a fan of deep class hierarchies, but given that StackCFIRecord is a subset of StackCFIInitRecord and given that the naming suggests StackCFIInitRecord is a concrete type of StackCFIRecord, should StackCFIInitRecord derive from StackCFIRecord? (Or perhaps contain one?) If not, perhaps a comment to explain why not. | |
unittests/ObjectFile/Breakpad/BreakpadRecordsTest.cpp | ||
122 | All of the negative parsing tests seem to be incomplete (e.g., truncated or a missing INIT). Should there be others, like having the wrong token type? Or an invalid token? Or one that has extra tokens at the end of an otherwise valid record? I see you're following the pattern for the other types here, but is that enough? | |
131 | This test relies on the parsing test for the StackCFIInitRecord having covered most of the cases. That works because the parsing implementation is shared, so I'm OK with it. Will others be concerned that this test isn't as independent as it could be? |
source/Plugins/ObjectFile/Breakpad/BreakpadRecords.h | ||
---|---|---|
173 | Good question. I was doing the same thing as I did with FUNC and PUBLIC records, which are also very similar, but I wouln't be comfortable saying one is a special case of the other. However, the case here is not as clear, as one could consider the "init" record to be a special kind of a "stack cfi" record. OTOH, I also don't like deep hierarchies and particularly having a concrete class (init record) inherit from another concrete class (stack cfi record). Since this code is likely to have only a single consumer (the thing which converts these into UnwindPlans), adding an AbstractCFIRecord class would seem like an overkill. Instead, how about we just collapse things even more and use just a single class to represent both records (with an Optional<addr_t> for the size)? I don't think this should make it the job of the consumer much harder, but it will allow us to:
| |
unittests/ObjectFile/Breakpad/BreakpadRecordsTest.cpp | ||
122 | Following the pattern isn't much of an excuse for me, as I was the one who established that pattern. :) The reason the pattern is like it is was that I was trying to cover all of the paths through the parsing code. I believe I have succeeded at that, but I haven't used a profiler or anything to verify that. Knowing the implementation, it's obvious that a test with an invalid record type is not going to be interesting, as we will bail out straight after the first token. However, I agree that from a black-box testing perspective, these are interesting cases to check, so I've added a couple more that I could think of. This is going to be the most extensively tested piece of code in lldb :P. | |
131 | Another advantage of merging the two record types is that it makes this comment void. :) |
source/Plugins/ObjectFile/Breakpad/BreakpadRecords.cpp | ||
---|---|---|
393 | Do we really need to pull the content apart into separate strings for each register? Seems like a lot of work and 99% of these we will never accessed. Maybe just store the entire string for all registers and be done? | |
394 | Does the format specify no space between the register name and the colon? |
source/Plugins/ObjectFile/Breakpad/BreakpadRecords.cpp | ||
---|---|---|
393 | You can add an iterator method to the StackCFIRecord record maybe for when you do want to parse each register? |
source/Plugins/ObjectFile/Breakpad/BreakpadRecords.cpp | ||
---|---|---|
393 | That's a good point. I'll remove the register parsing code from this patch and just store the entire expression as one string(ref). I'll see what's the best api for splitting these up when I get to connecting this with the postfix expression parser. | |
394 | That is how I interpret the format "spec" https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/symbol_files.md Each registeri is the name of a register or pseudoregister. Each expression is a Breakpad postfix expression, which may contain spaces, but never ends with a colon. That's not a very good definition because there are many ways one can split up these strings such they "don't end with a colon", but I chose this interpretation, as that's what made most sense to me (and it works for the example files I have lying around). We can always change that if we find a producer doing things slightly differently. |
I'm not a fan of deep class hierarchies, but given that StackCFIRecord is a subset of StackCFIInitRecord and given that the naming suggests StackCFIInitRecord is a concrete type of StackCFIRecord, should StackCFIInitRecord derive from StackCFIRecord? (Or perhaps contain one?)
If not, perhaps a comment to explain why not.