Page MenuHomePhabricator

Add support for JSON output style to llvm-symbolizer

Authored by aorlov on Feb 17 2021, 10:58 AM.



This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
aorlov added inline comments.Apr 22 2021, 3:13 PM

This behavior is a part of llvm-symbolizer and does not depend on the output styles and does not belong to the printer (look at decideHowToPrintFunctions()).
Not sure if I understand what you are saying. Changing the behavior is out of the scope of this patch.
I agree that it does not make much sense in testing that, but one of the reviewers requested these tests.

41 ↗(On Diff #330008)

Intention was to break a dependency on DWARF generation in clang. But I have changed it build from the C source to keep it simple for reading.

aorlov updated this revision to Diff 339949.Apr 23 2021, 2:01 AM

Hi @aorlov,

The community norm is for a week between updates and a ping/between consecutive pings. I haven't had a chance to come back to this just yet due to my workload. I hope to get to it in the next day or two.

Out of time for today. Will come back to this another time.


Make this all one paragraph, and reflow to 80 character limit. I'd actually rephrase it slightly too:

"If addresses are supplied via stdin, the output JSON will be a series of individual objects. Otherwise, all results will be contained in a single array."


Sorry, use -p, not --pretty-print (I just realised that's what's used in the previous example).


I'd rephrase as in the inline comment (please make sure to reflow if necessary).


Please reflow this comment to 80-character limit.


As before, do we need both --no-inlines and --inlines cases?


It seems currently we have no binaries in llvm/test/tools/llvm-symbolizer/Inputs with a valid Source info.

Consider generating one at test time using assembly or yaml2obj.

You can simplify

No, it does not work because JSON has own rules for escaping special symbols in paths.

Okay - I missed the double backslash.


Ah, I think we've hit on a fundamental question regarding JSON output - what should the options that change what information is printed do to JSON output? I'm thinking here specifically both --functions and --addresses, but it may apply to others too.

aorlov updated this revision to Diff 340882.Apr 27 2021, 9:40 AM
aorlov marked 7 inline comments as done.Apr 27 2021, 9:49 AM
aorlov added inline comments.

But note the library name is symbolize, not symbolizer.


I have added output-style-json-code-source.c.

aorlov updated this revision to Diff 340908.Apr 27 2021, 10:49 AM
aorlov marked an inline comment as done.

Used --print-source-context-lines to control Source printout in JSON.

aorlov updated this revision to Diff 341070.Apr 27 2021, 9:50 PM
aorlov updated this revision to Diff 342279.May 2 2021, 2:34 PM
jhenderson added inline comments.May 5 2021, 2:04 AM

Is this clang-formatted properly? Slightly surprised there's no space after override.


ErrorCode is unused now. Also, no need for the const on ErrorMsg.


Usually we use StringRef.str() to convert to a std::string. Same applies below in the error message line.


Don't abbeviate these names unnecessarily like this. Array and FrameCount would both be acceptable, for example. In general, avoid single letter variable names except for loop counters. Same goes throughout this patch. See for details:

We cannot stress enough how important it is to use descriptive names. Pick names that match the semantics and role of the underlying entities, within reason. Avoid abbreviations unless they are well known.

I think you can inline N into the initialising part of the for loop? I don't see it being used outside the loop.


Object maybe?


I think adding source context can be a later patch. Let's try to avoid adding more than we have to in this one patch.

3 ↗(On Diff #342279)

This won't work. clang isn't available in llvm-symbolizer tests, as the latter are part of the LLVM layer. Clang depends on the LLVM layer, but not vice-versa. You might be able to use llvm-mc -g to compile some assembly to be able to test this though.


Fair point. How about simply ## Show how library errors are reported in the output.


Same comment as the code test.


Same comment as above.

24 ↗(On Diff #342279)

As above. This won't work. You should use llvm-mc to build from assembly, as in other test cases.

aorlov updated this revision to Diff 343189.May 5 2021, 2:15 PM
aorlov marked 7 inline comments as done.May 5 2021, 2:31 PM

I think adding source context can be a later patch. Let's try to avoid adding more than we have to in this one patch.

Note you asked for a new test for Source a week ago.
Ok, I have removed Source from the JSON output and removed the new test for Source.

I have restored output-style-json-frame.test instead of output-style-json-frame.c
Note output-style-json-frame.test is huge because it is machine generated and you asked for a non zero and non empty TagOffset.
I don't like an idea to clean up or optimize it anyway.


It is generated by clang-format and validated by clang-tidy.


It is not necessary anymore because Source has been removed.

dblaikie added inline comments.May 5 2021, 2:33 PM

The extraneous trailing semicolon is probably confusing clang-format, so maybe remove those and see how it gets formatted?

aorlov updated this revision to Diff 343427.May 6 2021, 8:56 AM
aorlov marked an inline comment as done.
jhenderson added inline comments.May 7 2021, 12:38 AM

Ah, good spot!


There's still an unfortunate single-letter variable name here. I'd suggest Json or Object for the variable name.


Same comment as above. How about InliningInfo?


LI is a similar abbreviation which doesn't clearly indicate its name. What's wrong with LineInfo?


J -> Json or Object


Object or Json.




Maybe use FrameObject here to disambiguate from the toJson retiurn below.


Json or RequestJson or similar.


Json or Object


If I follow this correctly, this test case is showing that llvm-addr2line with -f results in the function name being printed? Assuming that's correct, I think you need to show that llvm-addr2line without -f results in the function name not being included in the JSON output too.

38 ↗(On Diff #343427)

Please include the version of clang explicitly here, not just in the ident string below. The reason is that someone might come along and trim off the superfluous parts of the below assembly to minimise the test case, but it would still be helpful to know how to generate the unmodified part. I would also use a final release version of clang, so that a future user can easily get the exact version of clang well into the future.

aorlov updated this revision to Diff 343649.May 7 2021, 5:14 AM
aorlov marked 11 inline comments as done.May 7 2021, 5:18 AM
dblaikie added inline comments.May 7 2021, 12:55 PM
38 ↗(On Diff #343649)

This seems like a fairly non-trivial build - might be worth an explanation about what's interesting about this case that's not exercised by simpler cases (such as ones without sanitizers, maybe also x86 plain (not a necessity, but curious if it's something ARM specific here), etc)

aorlov updated this revision to Diff 343923.May 9 2021, 11:42 AM

I have replaced output-style-json-frame.test with output-style-json-frame.ll, which is much friendlier but still contains non-zero TagOffset.

jhenderson added inline comments.May 10 2021, 12:25 AM

Does this need a REQUIRES: aarch64-registered-target or equivalent?


Perhaps highlight that you are testing both 0 and non-zero frame offsets with a comment.


This code could probably do with some comments highlighting what the key elements are, so that future changes don't lose them.

aorlov updated this revision to Diff 344128.May 10 2021, 11:33 AM
aorlov added inline comments.May 10 2021, 11:35 AM

It does not require any ARM target.
Note this test is passed on x64 Windows and x64 Debian.


I have updated output-style-json-frame.ll.
Now it contains 0, non-zero and empty (missing) TagOffset to cover all possible cases.
I added a comment too.


There are no any key elements. I just declared 3 variables with different type, size and all possible TagOffset per your requests.
I personally think it is overkill, and does not really belong to the JSON patch, as there is nothing JSON specific there.
It seems a test for the symbolize library and the symbolizer itself.

aorlov updated this revision to Diff 344183.May 10 2021, 1:29 PM

I have added the aarch64 requirements, and a few comments in that test for target offsets. Hope this should address it.

jhenderson accepted this revision.May 11 2021, 12:48 AM

LGTM, with one nit.


Nit: similar to the ## in the other tests, use ;; for comments in this test to help the comments stand out from lit and FileCheck directives.

This revision is now accepted and ready to land.May 11 2021, 12:48 AM
This revision was landed with ongoing or failed builds.May 11 2021, 2:11 AM
This revision was automatically updated to reflect the committed changes.

Please make another commit to fix the unnecessary introduction of semi-colons in the places I've highlighted.


Why have these superfluous semi-colons reappeared?



MaskRay added inline comments.May 11 2021, 12:36 PM

virtual ~DIPrinter(){}; => virtual ~DIPrinter() {}

Does clang-format complain on virtual ~DIPrinter(){}; ?

dblaikie added inline comments.May 11 2021, 1:04 PM

clang-format doesn't have warnings/errors really - it does it's best to guess at what's going on and format it.

The extra ';' confuse clang-format and so it formats poorly.

@jhenderson also asked @aorlov to remove the unnecessary semicolons earlier:

Please make another commit to fix the unnecessary introduction of semi-colons in the places I've highlighted.

@aorlov - could you take a look at this & ensure the semicolons have been removed?

Please make another commit to fix the unnecessary introduction of semi-colons in the places I've highlighted.

Sorry for that. Something went wrong when switching branches.