This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] SARIF: Add EOF newline; replace diff_sarif
ClosedPublic

Authored by hubert.reinterpretcast on Jun 6 2019, 7:19 AM.

Details

Summary

This patch applies a change similar to rC363069, but for SARIF files.

The %diff_sarif lit substitution invokes diff with a non-portable -I option. The intended effect can be achieved by normalizing the inputs to diff beforehand. Such normalization can be done with grep -Ev, which is also used by other tests.

Additionally, this patch updates the SARIF output to have a newline at the end of the file. This makes it so that the SARIF file qualifies as a POSIX text file, which increases the consumability of the generated file in relation to various tools.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 6 2019, 7:19 AM

Added Aaron as he wrote the this diagnostic output type :)

@aaron.ballman, for similar cases in the plist output, it has been proposed

  • that the reference expected file be committed into the tree pre-normalized, and
  • that tool be modified such that the output file has a newline at the end of the file.

Does that sound good for this format?

@aaron.ballman, for similar cases in the plist output, it has been proposed

  • that the reference expected file be committed into the tree pre-normalized, and
  • that tool be modified such that the output file has a newline at the end of the file.

Does that sound good for this format?

In general, that seems reasonable, but I would prefer to take care of more of the work in lit.local.cfg than have to deal with that atrocious RUN line in every test case. Is there a way to retain a similarly succinct solution as diff_sarif?

In general, that seems reasonable, but I would prefer to take care of more of the work in lit.local.cfg than have to deal with that atrocious RUN line in every test case. Is there a way to retain a similarly succinct solution as diff_sarif?

There'd be no atrocious RUN line if we went with modifying the expected files beforehand and having the tool output a newline.

The unchanged:

// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.security.taint,debug.TaintTest %s -verify -analyzer-output=sarif -o - | %diff_sarif %S/Inputs/expected-sarif/sarif-diagnostics-taint-test.c.sarif -

becomes:

// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.security.taint,debug.TaintTest %s -verify -analyzer-output=sarif -o - | %normalize_sarif | diff -U1 -b %S/Inputs/expected-sarif/sarif-diagnostics-taint-test.c.sarif -

As in, %diff_sarif gets replaced with %normalize_sarif | diff -U1 -b and that's it.

In general, that seems reasonable, but I would prefer to take care of more of the work in lit.local.cfg than have to deal with that atrocious RUN line in every test case. Is there a way to retain a similarly succinct solution as diff_sarif?

There'd be no atrocious RUN line if we went with modifying the expected files beforehand and having the tool output a newline.

The unchanged:

// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.security.taint,debug.TaintTest %s -verify -analyzer-output=sarif -o - | %diff_sarif %S/Inputs/expected-sarif/sarif-diagnostics-taint-test.c.sarif -

becomes:

// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.security.taint,debug.TaintTest %s -verify -analyzer-output=sarif -o - | %normalize_sarif | diff -U1 -b %S/Inputs/expected-sarif/sarif-diagnostics-taint-test.c.sarif -

As in, %diff_sarif gets replaced with %normalize_sarif | diff -U1 -b and that's it.

But is there a reason to not keep %diff_sarif and define it in terms of %normalize_sarif | diff -U1 -b within lit.local.cfg? I guess I don't see the benefit to exposing the call to diff (I don't anticipate anyone needing to change the options passed to diff).

But is there a reason to not keep %diff_sarif and define it in terms of %normalize_sarif | diff -U1 -b within lit.local.cfg? I guess I don't see the benefit to exposing the call to diff (I don't anticipate anyone needing to change the options passed to diff).

Yes: The normalization runs on stdin. %diff_sarif would then only make sense if stdin is one of the inputs, and I think the only input we can hardcode to be stdin is the one conventionally considered to be the reference input (which is not what we want).

But is there a reason to not keep %diff_sarif and define it in terms of %normalize_sarif | diff -U1 -b within lit.local.cfg? I guess I don't see the benefit to exposing the call to diff (I don't anticipate anyone needing to change the options passed to diff).

Yes: The normalization runs on stdin. %diff_sarif would then only make sense if stdin is one of the inputs, and I think the only input we can hardcode to be stdin is the one conventionally considered to be the reference input (which is not what we want).

Ah drat, I think you're right. Oh well! I think your approach makes sense to me then. Thank you for working on this!

Update based on review comments from D62949 as confirmed in D62952

Normalized versions of the reference expected SARIF output files were
checked in under r363788. The patch has been updated with that revision
as the base. Made the SARIF output generation produce a newline at the
end of the file, and modified the RUN lines in the manner discussed.

hubert.reinterpretcast retitled this revision from [analyzer][tests] Use normalize_sarif in place of diff_sarif to [analyzer] SARIF: Add EOF newline; replace diff_sarif.Jun 19 2019, 7:45 AM
hubert.reinterpretcast edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.Jun 19 2019, 8:07 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 19 2019, 8:24 AM