This connects the Symbolizer to the markup filter and enables the first
working end-to-end flow using the filter.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp | ||
---|---|---|
470–471 | This should be moved as well so we can use debuginfod with the symbolizer filter. | |
471–477 | This is not something that needs to be addressed in this change, but I think it would be valuable to support different output styles with the symbolizer filter as well, ideally by extending and using the existing printers. The JSON output in particular would be useful for consumption from other tools and scripts. |
Address review comments.
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp | ||
---|---|---|
470–471 | Ah, good catch. Done. | |
471–477 | The use case makes sense, but it would be a somewhat odd way of interpreting the symbolizer markup. For the output to make sense as (e.g.) JSON, you'd need to have the filter ignore any text and SGR nodes present in the log, or you'd need to expect that the input is only markup and whitespace. Maybe we could support an auxiliary output for this. When the log symbolizer encounters a presentation element, it could also tee the results to the auxiliary in one of these existing formats. This would keep the semantics of the main log symbolization pipeline simple and consistent, while still allowing machine processing of the referenced symbols, perhaps in concert with the filtered markup output. |
One small comment on the test, I'll be out on vacation next week so may be slow to respond. I've not got any objections so please don't wait for me if others are happy to approve.
llvm/test/DebugInfo/symbolize-filter-markup-data.test | ||
---|---|---|
18 | Possibly worth a test with too many fields? | |
33 | I can't see a label l in the test. Was the intention to set the size of long to 4? If so I think you'll need .size long, 4. The same applies to .size b, 1 below. Checking the object file the sizes of long and byte are 0. |
Correct size directive.
llvm/test/DebugInfo/symbolize-filter-markup-data.test | ||
---|---|---|
18 | Eh, for that test to fail and this one to pass, there would have to be a discontinuity in how the implementation handles inputs with too many fields vs too few fields. The implementation currently uses "==", and that's rather unlikely to accidentally change. A possible class of errors would be to swap checkNumFields with checkNumFieldsAtLeast or similar, but these each come with different messages. | |
33 | Yep, these didn't get updated when I renamed the variables. Done. |
Apologies for the delay in responding, have been on vacation. I've got one suggestion on the output of the module line having looked at the test. Otherwise looks good to me.
llvm/test/DebugInfo/symbolize-filter-markup-data.test | ||
---|---|---|
13 | I don't know how much scope there is to change the output format of the mmap information in the module line given that this is based on a pre-existing markup. When I saw 0x0(r) - 0x10(r) I wasn't sure whether this was one contiguous range [0x0, 0x10) or two separate ranges, not necessarily contiguous with one starting at 0x0 and one at 0x10. I guessed at the latter given the {{{mmap}}} elements but it would have been harder to understand if I'd just seen the output and not the input. Perhaps use a comma instead of a dash to separate the mmaps? As an alternative I guess the size could be incorporated in some way. No worries if this isn't possible. |
Address review comments.
llvm/test/DebugInfo/symbolize-filter-markup-data.test | ||
---|---|---|
13 | Generally agree; there shouldn't be anything that hard-depends on the format, as this is intended to be human readable, not machine readable. I think we can mostly escape Hyrum's law too, since usage of this markup format is pretty limited at present. Spent a bit of time thinking about this, and ended up with [0x0-0x9](r),[0x10-0x11](r). Using brackets agrees with the mathematical notation for inclusive ranges. Using inclusive ranges and dashes inside makes this also readable to those unfamiliar with that notation. |
I don't know how much scope there is to change the output format of the mmap information in the module line given that this is based on a pre-existing markup. When I saw 0x0(r) - 0x10(r) I wasn't sure whether this was one contiguous range [0x0, 0x10) or two separate ranges, not necessarily contiguous with one starting at 0x0 and one at 0x10. I guessed at the latter given the {{{mmap}}} elements but it would have been harder to understand if I'd just seen the output and not the input.
Perhaps use a comma instead of a dash to separate the mmaps? As an alternative I guess the size could be incorporated in some way.
No worries if this isn't possible.