This patch extends the InlineFeatureAnalysis pass adds a printer pass to it. Planning to add the ability extract more code features including #loops, loop depth, types of instructions, etc.
Details
Diff Detail
Event Timeline
@mtrofin We want to "rename" InlineFeatureAnalysis to a more generic name and extend it. This patch does the former, basically, and adds a printer pass. Extensions will follow soon. Is that generally OK with you?
@tarinduj I left some comments. We also need to replace the old InlineFeaturesAnalsysis with the CodeFeature one everywhere, assuming @mtrofin doesn't have any concerns. Generally we minimize duplication ;)
llvm/include/llvm/Analysis/ML/CodeFeaturesAnalysis.h | ||
---|---|---|
1 | Please use the same file comment style we have elsewhere. | |
llvm/lib/Analysis/ML/CodeFeaturesAnalysis.cpp | ||
1 | File comment missing too. | |
49 | We need a lit test for the printer. |
Some feature calculations are computationally intensive. At least for inlining, we plan in a next step to look at regions of the call graph around a call site. I wouldn't want to burden other consumers with this extra cost.
As a first step, if the planned new features would add space or computational overhead, could the work requiring the extra features have its own analysis and consume both analysis results? As we have examples of more such analyses and understand the usecases and performance tradeoffs, we can very easily refactor / rename.
If the new features are simple and small, SGTM - the only issue I have is the name, it's too generic - I worry it invites grabbag dumping of features. Can we find something more specific? What would the short/medium term consumers be (that could help with a name)
That is fair, see below.
As a first step, if the planned new features would add space or computational overhead, could the work requiring the extra features have its own analysis and consume both analysis results? As we have examples of more such analyses and understand the usecases and performance tradeoffs, we can very easily refactor / rename.
While we can have more analysis if we want, I somehow doubt that will make it better. I was envisioning a lazy approach in which the analysis will not do anything until instructed to. Users can "ask" for what they want while reuse is still possible and code is kept at a single place. As an example, if users ask to findCFGstructureFeatures they will not get any instruction level results. Querying those could assert w/o extra cost. I don't see why this would cost us extra and I would prefer it very much over a multitude of passes. WDYT?
If the new features are simple and small, SGTM - the only issue I have is the name, it's too generic - I worry it invites grabbag dumping of features. Can we find something more specific? What would the short/medium term consumers be (that could help with a name)
Short term consumers on our site are experiments. We hope to eventually set up heuristics or models that use more features.
IIUC, that means the one analysis memoizes results. I'm speculating invalidation would be total - i.e. even if only some results were needed (and, thus, computed), invalidation clears everything. Getting results would be semantically equivalent to having many analyses. Invalidating would be semantically equivalent to going piecemeal through the analyses that should be cleared and doing that. I'm not sure if that'd be always desirable, I suppose we'll learn when we hit a specific problem.
How about this:
- we rename InlineFeaturesAnalysis to FunctionFeaturesAnalysis (just because "Code" is too general - do we use that term elsewhere in LLVM?)
- as long as the features are trivially computable , let's just eagerly compute them
- if this causes perf/space problems, or when we hit more complicated features, we figure out at that point what the best course of action may be.
wdyt?
If the new features are simple and small, SGTM - the only issue I have is the name, it's too generic - I worry it invites grabbag dumping of features. Can we find something more specific? What would the short/medium term consumers be (that could help with a name)
Short term consumers on our site are experiments. We hope to eventually set up heuristics or models that use more features.
Is there a better name than features? I initially read this as having to do with subtarget features.
I think that is fine. @tarinduj, WDYT?
- as long as the features are trivially computable , let's just eagerly compute them
Sure.
- if this causes perf/space problems, or when we hit more complicated features, we figure out at that point what the best course of action may be.
Sounds good to me.
Hm, properties?
Yeah, looks good.
- as long as the features are trivially computable , let's just eagerly compute them
Sure.
- if this causes perf/space problems, or when we hit more complicated features, we figure out at that point what the best course of action may be.
Sounds good to me.
Hm, properties?
So FunctionPropertiesAnalysis?
My comment was formatted badly. The test is needed for sure, wanted to know if you are fine with the rename ;)
Ah, yes - lgtm
One overall nit: could we split this into the stand-alone rename, followed by the addition of the new code? Thanks!
Please use the same file comment style we have elsewhere.