This is an archive of the discontinued LLVM Phabricator instance.

[IR] Add SturcturalHash printer pass
ClosedPublic

Authored by aidengrossman on Aug 18 2023, 2:52 PM.

Details

Summary

This patch adds in a StructuralHash printer pass that prints out the
hexadeicmal representation of the hash of a module and all of the
functions within it.

Diff Detail

Event Timeline

aidengrossman created this revision.Aug 18 2023, 2:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2023, 2:52 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
aidengrossman requested review of this revision.Aug 18 2023, 2:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2023, 2:52 PM
aeubanks added inline comments.Aug 21 2023, 9:52 AM
llvm/include/llvm/Analysis/StructuralHash.h
1
9

this comment doesn't seem very useful, I'd remove it

llvm/lib/Analysis/StructuralHash.cpp
20

this should be a textual pass parameter, see FUNCTION_PASS_WITH_PARAMS. typically parseSinglePassOption expects foo<bar>, and this pass name is print<foo<bar>>, so you'll need to strip off the print<> before passing the name to parseSinglePassOption

llvm/test/Analysis/StructuralHash/structural-hash-printer.ll
14

this and DETAILED-HASH are the same CHECK lines. did you mean to check that DETAILED-HASH gives two different hashes?

aidengrossman marked 3 inline comments as done.

Address reviewer feedback

  • Fix top comment typo
  • Remove mostly useless header comment
  • Switch to using pass options fron the cl::opt in printer pass
  • Update test
aidengrossman marked an inline comment as done.Aug 21 2023, 7:36 PM
aidengrossman added inline comments.
llvm/lib/Analysis/StructuralHash.cpp
20

Updated. I ended up doing it similar to how it's done for print<memoryssa> where the syntax is print<pass-name><option> which seems to already have support within parseSinglePassOptions and is at least somewhat canonical.

llvm/test/Analysis/StructuralHash/structural-hash-printer.ll
14

The original intention here was just to make sure that they were printing anything at all rather than be flaky in response to a specific hash, but that's a good point. I've adjusted it to specific hashes as that should be rather consistent (other than in response to changes in StructuralHash).

nikic added inline comments.Aug 23 2023, 5:29 AM
llvm/test/Analysis/StructuralHash/structural-hash-printer.ll
21

Isn't this going to fail on 32-bit, at least when combined with your other patch?

aidengrossman marked an inline comment as done.

Address reviewer feedback

  • Update test hashes based on most recent revision of earlier patch in stack
  • Make test require 64 bit architecture to prevent failures on 32 bit arches.
aidengrossman marked an inline comment as done.

Split lit test

aidengrossman added inline comments.Aug 23 2023, 11:31 AM
llvm/test/Analysis/StructuralHash/structural-hash-printer.ll
21

Yes. I meant to fix this but hadn't gotten around to it yet. I've split the test into two files, one marked as REQUIRE; llvm-64-bits. I check both hashes on both platforms (but regex for the detailed hash on the non-64-bit-only test) and check for an exact hash on the 64 bit platform. I think that should provide a decent amount of test coverage.

Update test that I forgot to save before committing.

nikic added inline comments.Aug 23 2023, 11:44 AM
llvm/lib/Passes/PassBuilder.cpp
1098

For pass options, just detailed would be a more typical name.

It would probably make sense to move your existing unit test coverage over to the printer pass.

It would probably make sense to move your existing unit test coverage over to the printer pass.

I think it would be difficult to express a lot of the StructuralHash test cases as lit style tests. A lot of the functionality that is tested is whether or not hashes are different for two slightly different functions rather than their exact values which is very feasible to do in a unit test and possible to do in a lit test but a little bit less elegant (more test churn if the StructuralHash implementation changes, dealing with 32/64-bit differences) I would think. Maybe I'm missing something though?

It would probably make sense to move your existing unit test coverage over to the printer pass.

I think it would be difficult to express a lot of the StructuralHash test cases as lit style tests. A lot of the functionality that is tested is whether or not hashes are different for two slightly different functions rather than their exact values which is very feasible to do in a unit test and possible to do in a lit test but a little bit less elegant (more test churn if the StructuralHash implementation changes, dealing with 32/64-bit differences) I would think. Maybe I'm missing something though?

That's a good point, but ... what's the point of this pass if we can't use it for testing?

Address reviewer feedback

  • Switch pass option name to detailed rather than the much more verbose enable-detailed-structural-hash

Address reviewer feedback

  • Switch pass option name to detailed rather than the much more verbose enable-detailed-structural-hash

I'm interested in having a consistent textual interface to expose IR hashes for some downstream analysis and figured upstreaming it would add in a little bit of test coverage (although not be the bulk of it) and might be useful if someone else wants similar data.

Bump on this patch when reviewers have a chance. Thanks!

Skip printing hashes of function declarations.

aeubanks added inline comments.Aug 28 2023, 11:03 AM
llvm/test/Analysis/StructuralHash/structural-hash-printer.ll
18

seems unfortunate to hardcode specific hashes, can we instead do something like

CHECK-NEXT: Function f1 Hash: [[H:([a-z0-9]{16})]]
CHECK-NEXT: Function f2 Hash: [[H:([a-z0-9]{16})]]

DETAILED-NEXT: Function f1 Hash: [[H:([a-z0-9]{16})]]
DETAILED-NOT: Function f2 Hash: [[H]]
DETAILED-NEXT: Function f2 Hash: {{([a-z0-9]{16})}}
aidengrossman marked an inline comment as done.

Address reviewer feedback

  • Switch main test to use regular expressions and lit variables for

verification that hashes are correct.

llvm/test/Analysis/StructuralHash/structural-hash-printer.ll
18

That is definitely an elegant way of doing that. I hadn't seen substitution blocks before. Thanks for pointing them out! I've left the exact detailed hash values in the other test file as I still think it's important to test that the values are stable across runs/machines matching the criteria to avoid accidentally hashing a pointer. If a different from of testing for that is desired, I can change it up though.

aeubanks accepted this revision.Aug 29 2023, 3:04 PM

lg with one comment

if the test does turn out to be annoying to constantly update we can revisit

llvm/test/Analysis/StructuralHash/structural-hash-detailed.ll
11 ↗(On Diff #554039)

add a comment about expecting the number to be stable across runs

This revision is now accepted and ready to land.Aug 29 2023, 3:04 PM
This revision was landed with ongoing or failed builds.Aug 29 2023, 7:01 PM
This revision was automatically updated to reflect the committed changes.
aeubanks added inline comments.Aug 29 2023, 7:03 PM
llvm/test/Analysis/StructuralHash/structural-hash-detailed.ll
11 ↗(On Diff #554039)

actually I just had the idea that you can run it twice and diff the outputs to make sure they're the same, that's way more resistant to random changes

aidengrossman marked an inline comment as done.Aug 29 2023, 7:06 PM
aidengrossman added inline comments.
llvm/test/Analysis/StructuralHash/structural-hash-detailed.ll
11 ↗(On Diff #554039)

That's definitely a better way of doing it. Thanks for the suggestion! I'll push a commit up with that change. It won't show up locally on some of the machines I work on because they have ASLR disabled, but all (or at least most) of the buildbots should have that enabled which is what matters more for coverage.