This is an archive of the discontinued LLVM Phabricator instance.

Clang: Add a new flag Wmisnoinline for printing hot noinline functions
Needs ReviewPublic

Authored by iamarchit123 on Aug 18 2022, 4:02 PM.

Details

Summary

Following the change D115907, which introduced the new warning Wmisnoexpect, I introduce a new warning, Wmisnoinline, to flag potentially incorrect uses of the noinline attribute on functions. Per the profile summary, noinline functions should ideally not be hot. If they are, we can print a warning asking the user to reconsider the annotation: functions a user once believed to be cold or slow may become hot over time, and the compiler may benefit from inlining them.

Wmisnoinline only works when profile data is attached; otherwise it emits a diagnostic to that effect and has no other effect. It warns when both of the following conditions are met:

  1. The noinline function is hotter than fdiagnostics-misnoinline-percentile-threshold (99% by default) according to PSI analysis.
  2. The CGSCC inliner pass could have inlined the function (based on its cost-threshold calculation).

The change was tested on HHVM, where we found 14 extremely hot noinline functions within the 50th hotness percentile and ~110 mildly hot noinline functions within the 99th percentile, based on PSI analysis.

Test Plan: ninja check-all
llvm-lit --show-all MisNoInline.cpp
llvm-lit --show-all MisNoInline_LowThreshold.cpp
llvm-lit --show-all MisNoInline.ll

Diff Detail

Event Timeline

iamarchit123 created this revision.Aug 18 2022, 4:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 4:02 PM
iamarchit123 requested review of this revision.Aug 18 2022, 4:02 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 18 2022, 4:02 PM

update commit

iamarchit123 retitled this revision from Clang add front flags for Wmisnoinline to Clang add a new front flags for Wmisnoinline.Aug 18 2022, 5:31 PM
iamarchit123 edited the summary of this revision. (Show Details)
iamarchit123 retitled this revision from Clang add a new front flags for Wmisnoinline to Clang: Add a new flag Wmisnoinline for printing hot noinline functions.Aug 18 2022, 5:35 PM
iamarchit123 edited the summary of this revision. (Show Details)

Hi, thanks for taking a look at this. Before we start an in-depth review, can you describe the deficiencies w/ the existing diagnostics, and why they don't meet your needs?

Primarily, I'm a little skeptical if taking the same approach as MisExpect is the correct approach.

  1. Unlike llvm.expect, the noinline attribute is often used for correctness. I'm not sure it makes sense to warn about it in the same way as a performance optimization. My experience may differ from the code bases you work in, but I cannot recall seeing a function annotated noinline for any kind of performance reason. The one exception I can think of is code marked as cold for outlining purposes, but those annotations are usually inferred from profiles or added due to other attributes. Do people do this for better i-cache performance or something?
  2. MisExpect diagnostics need to run at specific points during compilation to check the weights added by the llvm.expect intrinsic against the profile, so it can't be a separate pass, since e.g. LowerExpectIntrinsicPass and the PGO instrumentation/sampling passes replace/remove that information. From what I can see, this could be its own analysis pass, since you at most need to consult the function entry count.
  3. Optimization remarks already exist for missed inlining opportunities. I'm unsure of the value in using a warning over the existing diagnostic in this case. In the case of llvm.expect intrinsics, it may be the result of an incorrect annotation, or a mis-annotated branch (i.e., marking a branch w/ LIKELY instead of UNLIKELY). In these cases, we'd like to signal to users a problem w/ the source code/annotation. I'm not sure that the same is true for noinline attributes. Is this something you want to use to fail builds? That was something we wanted to achieve for Fuchsia's CI, which is why -Wmisexpect exists as more than an optimization remark.

Hi, thanks for taking a look at this. Before we start an in-depth review, can you describe the deficiencies w/ the existing diagnostics, and why they don't meet your needs?

Primarily, I'm a little skeptical if taking the same approach as MisExpect is the correct approach.

Unlike llvm.expect, the noinline attribute is often used for correctness. I'm not sure it makes sense to warn about it in the same way as a performance optimization. My experience may differ from the code bases you work in, but I cannot recall seeing a function annotated noinline for any kind of performance reason. The one exception I can think of is code marked as cold for outlining purposes, but those annotations are usually inferred from profiles or added due to other attributes. Do people do this for better i-cache performance or something?

I was under the impression that it was used for performance reasons and that hot functions should ideally not be marked noinline. @modimo, could you also pitch in on this? If not the function entry count, what else could indicate a false usage of noinline (if not hotness)?

MisExpect diagnostics need to run at specific points during compilation to check the weights added by the llvm.expect intrinsic against the profile, so it can't be a separate pass, since e.g. LowerExpectIntrinsicPass and the PGO instrumentation/sampling passes replace/remove that information. From what I can see, this could be its own analysis pass, since you at most need to consult the function entry count.

This could be a separate pass, but since I am using cost analysis to check the cost of noinline functions in addition to the function entry count, it felt natural to integrate the noinline warning into inline cost analysis.

Optimization remarks already exist for missed inlining opportunities. I'm unsure of the value in using a warning over the existing diagnostic in this case. In the case of llvm.expect intrinsics, it may be the result of an incorrect annotation, or a mis-annotated branch (i.e., marking a branch w/ LIKELY instead of UNLIKELY). In these cases, we'd like to signal to users a problem w/ the source code/annotation. I'm not sure that the same is true for noinline attributes. Is this something you want to use to fail builds? That was something we wanted to achieve for Fuchsia's CI, which is why -Wmisexpect exists as more than an optimization remark.

I think no optimization remark is emitted for missed inlining opportunities on noinline functions, as cost analysis is skipped for functions marked with the noinline attribute. No, we don't want any builds to fail with this. Our only aim with this change is to emit a warning for functions which a user may have accidentally marked noinline, where removing the attribute may give some performance benefit.

@modimo feel free to pitch in as well on the concerns raised by @paulkirth and any changes I could make based on the comments.

To be clear, I'm not morally opposed to your patch, I just wanted to understand the context more completely and why this is the best approach. And like I said, I can't recall encountering a place where noinline was used for performance reasons. Code size and correctness are the two reasons I've commonly seen, though.

I was under the impression that it was used for performance reasons and hot functions should not ideally be marked noinline. @modimo could you also pitch in on this? If not function entry count what other things could possibly indicate a false usage of noinline(if not the hotness)

Using the entry count isn't the issue; it's that people don't normally mark their hot code noinline, certainly not w/o a reason. I'm open to the idea that there might be a performance reason to do so, but I'm aware of other reasons that are more common in my experience. While you've gotten good use from the diagnostic, I'm unsure how well that generalizes.

This could be a separate pass, but since I am using cost analysis to check the cost of noinline functions in addition to the function entry count, it felt natural to integrate the noinline warning into inline cost analysis.

If the goal is to disclose that the annotation may be wrong, then shouldn't you want to report that regardless of the cost analysis? The inliner's decision is orthogonal to whether the attribute is beneficial or potentially incorrect, right?

I think no optimization remark is emitted for missed inlining opportunities on noinline functions, as cost analysis is skipped for functions marked with the noinline attribute.

Right, but your new diagnostic seems to work exactly like those remarks, which is why I brought them up. You are issuing a diagnostic about a missed inlining opportunity, after all, so I don't think it's strange to suggest reporting these through the same mechanism. If there's a good reason not to, that's fine, but then I would expect some rationale.

No, we don't want any builds to fail with this. Our only aim with this change is to emit a warning for functions which a user may have accidentally marked noinline, where removing the attribute may give some performance benefit.

Then using a remark seems to be the better choice. There are lots of places that compile w/ -Werror, and making this a warning ensures those builds fail. That was a goal for Fuchsia w/ MisExpect, so if it's not one here, you may want to consider using only remarks as an alternative, since they are always purely informational.

Like I said before, I'm not opposed to this, but I'd like to understand why the current infrastructure and diagnostics are insufficient, or why they shouldn't just be updated to also report this case. My other concern is that there are significantly more cases where noinline is used for correctness than for performance, which may dilute the usefulness of the diagnostic.

@paulkirth this change was made on the intuition that marking hot functions noinline may hurt performance. The change hasn't yet been tested for performance improvements, and I was unaware of the point you raise, that functions are often marked noinline for correctness rather than hotness/performance. Since I currently don't have access to big services and can't test this change, I may be unable to defend it. @modimo, feel free to weigh in if you think folks can come back and test this change for effectiveness, or completely redo it if it is still insufficient.

I agree that until a serious performance win is shown/proven, this change may not be useful. Thanks for the review.

@paulkirth this change was made on the intuition that marking hot functions noinline may hurt performance. The change hasn't yet been tested for performance improvements, and I was unaware of the point you raise, that functions are often marked noinline for correctness rather than hotness/performance. Since I currently don't have access to big services and can't test this change, I may be unable to defend it. @modimo, feel free to weigh in if you think folks can come back and test this change for effectiveness, or completely redo it if it is still insufficient.

I agree that until a serious performance win is shown/proven, this change may not be useful. Thanks for the review.

I have seen a few cases where noinline was used for performance, in addition to other cases like avoiding too much stack growth. I've also seen it used without any comment whatsoever. So I think it would be good to make it easier to identify cases where we are blocked from inlining at hot callsites because of the attribute.

It is a little different than misexpect though in that the expect hints are pretty much only for performance, so it is more useful to be able to issue a strong warning that can be turned into an error if they are wrong. And also there was no way to report the misuse of expects earlier, unlike inlining where we already had the remarks plumbing.

I haven't looked through the patch in detail, but is it possible to use your changes to emit a better missed-opt remark from the inliner for these cases? (I assume we already emit a -Rpass-missed=inline remark for the noinline attribute case, just without highlighting that the function is hot and would otherwise have been inlined for performance reasons.) I suppose one main reason for adding a warning is that the missed-inline remarks can be really noisy and not very useful to the user, vs. a compiler engineer doing inliner/compiler tuning, so a warning would make it easier to turn on more widely as user feedback that can/should be addressed in user code.

I have seen a few cases where noinline was used for performance, in addition to other cases like avoiding too much stack growth.

Well, I stand corrected. I'm curious about what these cases are, but in any case if there are cases where its done, then I agree that a diagnostic would be helpful.

I've also seen it used without any comment whatsoever. So I think it would be good to make it easier to identify cases where we are blocked from inlining at hot callsites because of the attribute.

I wonder if there is some analysis or heuristic we could use to distinguish those cases? Nothing really comes to mind, but it would be nice if we had one.

It is a little different than misexpect though in that the expect hints are pretty much only for performance, so it is more useful to be able to issue a strong warning that can be turned into an error if they are wrong. And also there was no way to report the misuse of expects earlier, unlike inlining where we already had the remarks plumbing.

I haven't looked through the patch in detail, but is it possible to use your changes to emit a better missed opt remark from the inliner for these cases (I assume we will already emit a -Rpass-missed=inline for the noinline attribute case, just not highlighting that it is hot and would have been inlined for performance reasons otherwise)? I suppose one main reason for adding a warning is that the missed inline remarks can be really noisy and not really useful to the user vs a compiler optimization engineer doing inliner/compiler tuning, and therefore a warning would make it easier to turn on more widely as user feedback that can/should be addressed in user code.

Yeah, I was thinking we could emit a new remark type for this to differentiate, but it seems simpler and more user friendly to emit a clear diagnostic directly.

I think we’re starting to accumulate a few of these diagnostics that try to flag potential performance deficiencies based on profiling information. Originally we prototyped a libtooling-based tool for misexpect that ran over the build using the compile commands DB and reported everything it found. I wonder if reviving that would be useful when you want to look for performance issues like this one, misexpect, and other cases? Making ORE diagnostic output queryable through a tool may also be a good option, but I'm not too familiar with what already exists in that area.

Thanks for taking a look!

I have seen a few cases where noinline was used for performance, in addition to other cases like avoiding too much stack growth.

Well, I stand corrected. I'm curious about what these cases are, but in any case if there are cases where its done, then I agree that a diagnostic would be helpful.

Same. The instances I've seen are in older codebases where compiler optimizations were not as powerful, and/or were purposefully written by engineers who didn't trust the compiler to do the right thing.

It is a little different than misexpect though in that the expect hints are pretty much only for performance, so it is more useful to be able to issue a strong warning that can be turned into an error if they are wrong. And also there was no way to report the misuse of expects earlier, unlike inlining where we already had the remarks plumbing.

I haven't looked through the patch in detail, but is it possible to use your changes to emit a better missed opt remark from the inliner for these cases (I assume we will already emit a -Rpass-missed=inline for the noinline attribute case, just not highlighting that it is hot and would have been inlined for performance reasons otherwise)? I suppose one main reason for adding a warning is that the missed inline remarks can be really noisy and not really useful to the user vs a compiler optimization engineer doing inliner/compiler tuning, and therefore a warning would make it easier to turn on more widely as user feedback that can/should be addressed in user code.

Yeah, I was thinking we could emit a new remark type for this to differentiate, but it seems simpler and more user friendly to emit a clear diagnostic directly.

I think we’re starting to accumulate a few of these diagnostics that try to flag potential performance deficiencies based on profiling information. Originally we prototyped a libtooling-based tool for misexpect that ran over the build using the compile commands DB and reported everything it found. I wonder if reviving that would be useful when you want to look for performance issues like this one, misexpect, and other cases? Making ORE diagnostic output queryable through a tool may also be a good option, but I'm not too familiar with what already exists in that area.

Currently a new ORE remark (-pass-remarks=misnoinline) is generated, which misnoexpect also does. Agreed that a warning is more familiar and friendlier for users, so I lean toward that approach. For additional tooling, I think the first step will be to trial this on more real programs to see which cases are interesting. @iamarchit123 just finished his internship with us, so I'll be evaluating these changes on HHVM to see if they can swing the performance needle.

Same. The instances I've seen are in older codebases where compiler optimizations were not as powerful, and/or were purposefully written by engineers who didn't trust the compiler to do the right thing.

Thanks for pointing that out. I had failed to consider those scenarios. I do recall having discussions w/ hardware/embedded engineers a long time ago regarding their mistrust of the compiler, so I should have thought about these types of situations.

Currently a new ORE remark (-pass-remarks=misnoinline) is generated, which misnoexpect also does. Agreed that a warning is more familiar and friendlier for users, so I lean toward that approach. For additional tooling, I think the first step will be to trial this on more real programs to see which cases are interesting. @iamarchit123 just finished his internship with us, so I'll be evaluating these changes on HHVM to see if they can swing the performance needle.

@iamarchit123, thanks for your contribution, and I hope your summer went well! You may want to see if you can present your work at the dev meeting in November. Even a lightning talk is a great item for a resume. There's even a student travel grant to help out w/ costs: https://discourse.llvm.org/t/llvm-foundation-student-travel-grants-available-sept-8-deadline/64794. Also, universities often have their own travel grants for students presenting at a conference, so I would encourage you to see if you're eligible for one of those from your own university too.

@modimo Keep us posted w/ your findings from HHVM, it will be interesting to see what kind of improvements can be gained.

llvm/docs/MisNoInline.rst
2
9–39

This is almost a verbatim copy of the MisExpect.rst. If these diagnostics are so similar, and use the exact same methodology, then maybe we should unify them as something like Profile-based Mis-annotation Analysis? We can describe the basic approach and then in subsections describe how individual diagnostics work/differ. What do you think?

llvm/include/llvm/Analysis/InlineCost.h
174

Do you need this variable? You store it here in the InlineCost, but the only place it's used is in canEmitNoInlineWarning. It seems easier to just check for Attribute::NoInline on the Callee variable directly in canEmitNoInlineWarning and avoid plumbing all this data around.

llvm/lib/Analysis/InlineAdvisor.cpp
69–71

The verbiage here makes noinline sound like it's always bad/wrong. Maybe "... noinline function attribute on hot functions, which may degrade performance"?

148–150

nit: Maybe sink these until they're needed?

163

Why not check for NoInline on either CB or Callee directly?

166–169

Seems like a helper function would make this a little cleaner/ergonomic.

233–242

Can we separate this out into its own function? Or maybe it makes more sense as part of getInlineCost? If it were there, you might avoid invoking isInlineViableFromCostAnalysis more than necessary, right?

Also, neither Callee nor CalleeTTI has changed from its definition above, so do we need to shadow them here?

wenlei added a subscriber: wenlei.Aug 29 2022, 2:50 PM
iamarchit123 added a comment.EditedSep 9 2022, 2:46 PM

Hi @modimo @paulkirth, are there any good open-source benchmarks where I can also test this? The problem is finding standard benchmarks that are profiled and have a good performance metric for measuring improvement.

@iamarchit123 I think the standard advice is to start w/ the llvm-test-suite and then explore other benchmarks as needed. Also, Clang itself is often a very good starting point.

As for profiles, it probably won't be representative, but you could collect the profile using your benchmark and then assess how often the mismatch w/ inlining happens. If you want to do it w/ Clang itself, then a common approach I've heard is to have Clang build your project and then use ninja trace or equivalent to find the 5-10 TUs w/ the longest compile times. Then stick them in the https://github.com/llvm/llvm-project/tree/main/clang/utils/perf-training directory, which will use them for PGO automatically. If you go that route, you may need to preprocess the source files.
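The Clang-as-benchmark workflow above can be sketched as shell steps. This is a rough outline under assumptions, not a tested recipe: the paths, the trace-conversion tool, and the cmake flags all depend on your setup.

```shell
# 1. Build your project with Clang; ninja records per-TU times in .ninja_log.
ninja -C build

# 2. Convert the log to a trace (a tool like ninjatracing is one option)
#    and pick the 5-10 TUs with the longest compile times.

# 3. Preprocess those TUs so they are self-contained single files
#    ($YOUR_FLAGS stands in for the TU's real compile flags).
clang++ -E $YOUR_FLAGS -o slow_tu.ii path/to/slow_tu.cpp

# 4. Drop them into Clang's PGO training directory; the two-stage PGO
#    build picks up files placed there automatically.
cp slow_tu.ii llvm-project/clang/utils/perf-training/

# 5. Rebuild Clang with PGO enabled and inspect the resulting warnings.
```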

@iamarchit123 I think the standard advice is to start w/ the llvm-test-suite and then explore other benchmarks as needed. Also, Clang itself is often a very good starting point.

As for profiles, it probably won't be representative, but you could collect the profile using your benchmark and then assess how often the mismatch w/ inlining happens. If you want to do it w/ Clang itself, then a common approach I've heard is to have Clang build your project and then use ninja trace or equivalent to find the 5-10 TUs w/ the longest compile times. Then stick them in the https://github.com/llvm/llvm-project/tree/main/clang/utils/perf-training directory, which will use them for PGO automatically. If you go that route, you may need to preprocess the source files.

+1 Clang is the best starting point. I've been busy recently so haven't had a chance to run the HHVM experiments, starting a run today. Paul left some good review comments that you can address without requiring performance runs--I would recommend getting the patch updated so when the results come back everything will be ready to commit.