Page MenuHomePhabricator

Passes for Logging Stackdepth Info
Needs ReviewPublic

Authored by anniecherk on Jul 20 2018, 4:06 PM.

Details

Summary

These are two backend passes which log information that can be used to compute stack depth.

Diff Detail

Event Timeline

anniecherk created this revision.Jul 20 2018, 4:06 PM

These are currently a work-in-progress; not ready for review yet.

anniecherk updated this revision to Diff 158659.Aug 1 2018, 4:30 PM
anniecherk edited the summary of this revision. (Show Details)

This version includes two passes which run at the very back of the backend, just before codegen:

  1. X86LogStackInfo: For every function, this pass logs the function name, stack size, function signature, and the direct & indirect calls made from this function.
  2. X86LogStackInfoClasses: This module pass logs any identified struct types in the IR; these include class inheritance data, which can be used to track indirect function calls by function signature.

This version is implemented in the X86 backend. We plan to also support the AArch64 backend. The code in the two passes should run on other backends without modification, except for scheduling the passes to run in other backends. I'm not sure what the most correct way to implement this is, with the least amount of code duplication-- for instance, where in the source tree is the best place for the LogStackInfo passes? I'd appreciate hearing any opinions on this.

I thought the plan was to produce JSON, not YAML.

llvm/lib/Target/X86/X86LogStackInfo.cpp
60 ↗(On Diff #158659)

This looks like a really wrong way to generate the file name. String manipulation here is a big red flag.
Why is "." magical??
The file name you need to generate here is something based on the -o argument.

63 ↗(On Diff #158659)

Opening the file afresh for each function seems really wrong. The file should be opened exactly one and the stream shared across all runs of the passes in a single translation unit. Besides the cleanliness and performance problems with reopening the file, this approach looks like it will always append to an existing file from a previous compilation, which is completely wrong.

llvm/lib/Target/X86/X86LogStackInfoClasses.cpp
38 ↗(On Diff #158659)

The file-opening logic being repeated in two places is another big red flag.

phosek added a comment.Aug 1 2018, 6:31 PM

Two high level comments:

  1. All new files should have LLVM copyright header.
  2. All files should be formatted according to the LLVM style guide (easiest way to achieve that is to run them through clang-format).
  • Moved from Target/X86 to Codegen. These passes will now run on all backends.
  • Fixed a logical bug to check whether a structType has a name before calling getStructName().
  • Ran clang-format, clang-check & clang-tidy
    • Question: clang-format reformatted many lines of code that I did not write in a file that I modified. Should I revert these changes?
  • Added LLVM copyright headers to new files
  • Switched from manual string mangling to llvm's file extension updating function

Updated logging approach to log directly to the DiagnosticsOutputFile contained in the LLVMContext.

The main advantage of this approach is the creation, opening, and closing of the output file is already taken care of.
The disadvantage of this approach is other passes do not interact with the DiagnosticOutputFile directly, but instead write to to it using the DiagnosticInfo framework. The DiagnosticInfo framework currently does not support logging any data other than a single integer from Module passes, and does not support logging any data other than strings from Function passes. As an alternative implementation, I could create new DiagnosticInfo subclasses, but it's not clear whether that's the correct approach, and if it is, the details of the best way to do this.

If anyone has opinions on the correct way to do this, please voice them.

Minor update: the function pass now also logs whether that function has both default visibility and is externally visible. A function with those properties can be a root of the call graph.

Updated logic to not log callsites which are llvm intrinsics.