Page MenuHomePhabricator

[analyzer] A Python script to prettify the ExplodedGraph dumps.
ClosedPublic

Authored by NoQ on May 29 2019, 6:21 PM.

Details

Summary

That's my attempt on D62553 that i (just) promised.

Here's roughly how it looks:

Full dump of some random test file:

The rough idea behind this script is that it deserializes JSON dumps into familiar python objects. Serializing them back into a pretty graph dump is only one of the possible use-cases; we can manipulate the deserialized graph by, say, hiding nodes that we aren't interested in.

This initial commit only supports Environment and Store and silently drops unsupported information; i'll hopefully follow up with more this week, most importantly ranges. As of today it's only a dot-to-dot converter, so @Charusso's SVG cleanup is not supported yet. Diffs are not supported yet; tomorrow i'll think a bit more how exactly do i want to implement them.

I wrote some tests but i'm not really sure they're worth it. I guess some corner-cases would be nice to document this way.

@Charusso, could you see if you can:

  • Bring back stable IDs for nodes into JSON (not only pointers). They're very useful for debugging.
  • See if your trick for reducing SVG size with text joining is even applicable to the dumps that i produce.
  • Understand my code here :)

Diff Detail

Repository
rL LLVM

Event Timeline

NoQ created this revision.May 29 2019, 6:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 29 2019, 6:21 PM
NoQ updated this revision to Diff 202091.May 29 2019, 6:43 PM

Abstract away raw prints.

NoQ updated this revision to Diff 202092.May 29 2019, 6:44 PM

Remove outdated comment.

NoQ retitled this revision from [analyzer][WIP] A Python script to prettify the ExplodedGraph dumps. to [analyzer] A Python script to prettify the ExplodedGraph dumps..May 29 2019, 6:45 PM
NoQ updated this revision to Diff 202094.May 29 2019, 6:48 PM

Use os.path.join for discovering the utility in tests.

NoQ added a comment.May 29 2019, 6:54 PM

I wrote some tests but i'm not really sure they're worth it.

Mmm, on second thought, they probably won't work out of the box, because they might require installing python modules in order to work. I'm actually not sure if all machines have python3. I'll try but it'll most likely fail. I guess i could try excluding them from make check-all.

Sorry for that much rewrite request but in a script language you have to name your stuff properly or you will be completely lost. I have not written a single Python class yet, but trust me.

I really like that DotDumpVisitor semantic, but it would be a lot better at SVG level. At the moment whatever style/table-data you insert it adds an extra text tag per styling which is quite inefficient.

We could use "foreignObject" (somehow) and apply pure HTML tables or I could use my <tspan> idea instead (https://developer.mozilla.org/en-US/docs/Web/SVG/Element/tspan). I recommend the latter as it is better than a fixed table.

With the current representation it would be difficult to apply every styling to tspans. I think it also would be difficult to use your Explorer idea as we are talking about modificating the SVG dynamically.

In D62638#1522357, @NoQ wrote:

I wrote some tests but i'm not really sure they're worth it.

Mmm, on second thought, they probably won't work out of the box, because they might require installing python modules in order to work. I'm actually not sure if all machines have python3. I'll try but it'll most likely fail. I guess i could try excluding them from make check-all.

It would be cool to introduce something like check-scripts.

clang/utils/analyzer/exploded-graph-rewriter.py
5 ↗(On Diff #202094)

I think it is not that cool variable name. I would use the appropiate name of the object instead of the generic json name.

12 ↗(On Diff #202094)

json -> loc

22 ↗(On Diff #202094)

json -> program_point and so on...

33 ↗(On Diff #202094)

What about just loc?

124 ↗(On Diff #202094)

What about predecessors and successors? It is the proper name in the Static Analyzer world.

126 ↗(On Diff #202094)
  • set_json() -> set() the given node
  • json -> node
151 ↗(On Diff #202094)

nodes -> graph

184 ↗(On Diff #202094)

raw_json -> node_string which will be a raw JSON after loads().

191 ↗(On Diff #202094)

I have removed the trailing , from the end yesterday so rstrip(',') is not needed.

It would be cool to name the result[], like: pred_id = result[1] and succ_id = result[2] or current/previous ID, or something like that.

I also would put the regexes into a function and you could write:
pred_id, succ_id = parse_edge() or something more simpler.

What is result[0] btw? That confused me a little-bit.

233 ↗(On Diff #202094)
237 ↗(On Diff #202094)

I think tables are left aligned by default, so you could remove your left alignments.

326 ↗(On Diff #202094)

I would create a table-builder class to reduce the repetition and for better readability what is going on. Also you could avoid balign typo.

Then may each class could have its own pretty-print method like LLVM does.

NoQ marked an inline comment as done.May 30 2019, 1:58 PM
NoQ added inline comments.
clang/utils/analyzer/exploded-graph-rewriter.py
326 ↗(On Diff #202094)

balign is a thing in the graphviz dialect of html (https://www.graphviz.org/doc/info/shapes.html)

(not sure i still need this in the current incarnation of the script, given that we've removed those whole-decl pretty-prints in D62495)

NoQ updated this revision to Diff 202502.May 31 2019, 3:59 PM
NoQ marked 12 inline comments as done.

Fxd!

clang/utils/analyzer/exploded-graph-rewriter.py
151 ↗(On Diff #202094)

Mm, why would i have a field called "graph" in an object called "graph".

191 ↗(On Diff #202094)

pred_id, succ_id = parse_edge()

Not really, i'd also have to return whether the match was successful, which kinda makes it as bad as before.

What is result[0] btw? That confused me a little-bit.

That's the entire match. https://docs.python.org/3/library/re.html#match-objects

233 ↗(On Diff #202094)

I don't want to list the same color many times for each key that maps into it.

237 ↗(On Diff #202094)

Unfortunately in GraphViz they're centered by default.

NoQ added a comment.May 31 2019, 3:59 PM

Also switched to python2 because it seems to be the default. The differences are minimal.

Charusso added inline comments.May 31 2019, 4:15 PM
clang/utils/analyzer/exploded-graph-rewriter.py
151 ↗(On Diff #202094)

Because we are constructing the graph object and it would be still much better than LLVM's fancy Graph TheGraph; logic. I am okay with a bunch of nodes, just those are the graph itself.

191 ↗(On Diff #202094)

I see, just I am that newbie. Thanks!

237 ↗(On Diff #202094)

Well, that is unfortunate.

Charusso added a comment.EditedMay 31 2019, 4:15 PM
In D62638#1525823, @NoQ wrote:

Also switched to python2 because it seems to be the default. The differences are minimal.

What about the state of the patch itself? Planned stuff WIP, could I take it to SVGify, or?

NoQ added a comment.May 31 2019, 4:39 PM
In D62638#1525823, @NoQ wrote:

Also switched to python2 because it seems to be the default. The differences are minimal.

What about the state of the patch itself? Planned stuff WIP, could I take it to SVGify, or?

The format would look roughly the same from now on, but i'll add more prints for other state traits and may slightly rearrange fields around, also diffs which would be simply extra +/- fields in the tables that otherwise look the same. I guess you should wait until i commit this revision and we at least figure out what's the situation with buildbots and tests.

NoQ marked an inline comment as done.May 31 2019, 4:44 PM
NoQ added inline comments.
clang/utils/analyzer/exploded-graph-rewriter.py
191 ↗(On Diff #202094)

I have removed the trailing , from the end yesterday so rstrip(',') is not needed.

It still seems to be there for all nodes except the last node.

Charusso added inline comments.May 31 2019, 4:46 PM
clang/utils/analyzer/exploded-graph-rewriter.py
191 ↗(On Diff #202094)

Hm, that is why your testing idea is so cool, I am nearly sure I have removed it.

NoQ marked an inline comment as done.May 31 2019, 4:48 PM
NoQ added inline comments.
clang/utils/analyzer/exploded-graph-rewriter.py
191 ↗(On Diff #202094)

*politely refuses to take credit for inventing test-driven development*

NoQ updated this revision to Diff 202511.May 31 2019, 4:50 PM

Rebase.

Charusso accepted this revision.May 31 2019, 5:20 PM

Let us see those tests!

This revision is now accepted and ready to land.May 31 2019, 5:20 PM
Closed by commit rL362340: [analyzer] exploded-graph-rewriter: Initial commit. (authored by dergachev, committed by ). · Explain WhyJun 2 2019, 2:38 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2019, 2:38 PM
NoQ added a comment.Jun 3 2019, 1:45 PM

I disabled tests on Windows for now (rC362343 -> rC362347), but other than that the tests are working well, there was nothing to worry about.

This commit breaks the NetBSD buildbot node.

http://lab.llvm.org:8011/builders/netbsd-amd64/builds/20359

The problem is that this patch hardcodes python specific binary name, while it has to be detected dynamically or ideally passed from CMake. Afair lit has a knowledge of too.

​#!/usr/bin/env python # <- unportable form

On NetBSD with pkgsrc this can be python2.7, but the name is not fixed.

Plase correct this.

Charusso added inline comments.Jun 4 2019, 10:29 AM
cfe/trunk/utils/analyzer/exploded-graph-rewriter.py
1
# something else then
#
#==- exploded-graph-rewriter.py - Simplifies the ExplodedGraph -*- python -*-==#
#
# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
#===------------------------------------------------------------------------===#

Whoops!

NoQ added a comment.Jun 4 2019, 11:02 AM

Thanks!! Will fix as soon as i get to my desktop.

NoQ marked an inline comment as done.Jun 4 2019, 7:08 PM

This commit breaks the NetBSD buildbot node.

rC362574.

cfe/trunk/utils/analyzer/exploded-graph-rewriter.py
1