This is an archive of the discontinued LLVM Phabricator instance.

Conditionally eliminate library calls where the result value is not used
ClosedPublic

Authored by xur on Sep 9 2016, 11:37 AM.

Details

Summary

This pass shrink-wraps a condition to some library calls where the call
result is not used. For example:

sqrt(val);

is transformed to

if (val < 0)
  sqrt(val);

Even if the result of library call is not being used, the compiler cannot
safely delete the call because the function can set errno on error
conditions.
Note in many functions, the error condition solely depends on the incoming
parameter. In this optimization, we can generate the condition can lead to
the errno to shrink-wrap the call. Since the chances of hitting the error
condition is low, the runtime call is effectively eliminated.

These partially dead calls are usually results of C++ abstraction penalty
exposed by inlining. This optimization hits 108 times in 19 C/C++ programs
in SPEC2006.

Diff Detail

Repository
rL LLVM

Event Timeline

xur updated this revision to Diff 70874.Sep 9 2016, 11:37 AM
xur retitled this revision from to Conditionally eliminate library calls where the result value is not used.
xur updated this object.
xur added reviewers: davidxl, hfinkel, mehdi_amini.
xur added subscribers: llvm-commits, xur.
davidxl edited edge metadata.Sep 9 2016, 11:45 AM

Have not looked in details -- but just a quick note: you should annotate the created branches with MD_prof meta data to indicate it is not likely to be taken

you should probably use 1/2000 which is the same value used in builtin_expect lowering. We can refactor the option for it later.

davidxl added inline comments.Sep 16 2016, 2:20 PM
lib/Transforms/Utils/CondDeadCallElimination.cpp
10 ↗(On Diff #70874)

shrink wraps a call to function .. if the result is not used. The call can set errno but is otherwise side effect free.

54 ↗(On Diff #70874)

The name of this option is quite confusing. Perhaps DoDeleteCallNoErrNo?

98 ↗(On Diff #70874)

CICandidate or CICand?

104 ↗(On Diff #70874)

-> checkCandidate

110 ↗(On Diff #70874)

Missing documentation for the method. Also name it "CreateOrCond".

I also think it is better to pass the Arg directly instead of the CallInst&

118 ↗(On Diff #70874)

Missing documentation; Pass Arg0 directly

127 ↗(On Diff #70874)

This method can refactored to share code with CreateCond methods that takes two compares.

143 ↗(On Diff #70874)

Fix documentation. This function does not 'find all ...'. It checks if CI is a candidate for shrinkwrapping and put it into work list if it is.

202 ↗(On Diff #70874)

Perhaps break the giant switch into smaller helper functions. For instance define a helper function called:

bool doesFuncSetErrNo(...) { ..

}
211 ↗(On Diff #70874)

DomeinError -> DomainError

552 ↗(On Diff #70874)

When OptimizeForSize is on, this pass should not be turned on.

xur marked 8 inline comments as done.Sep 21 2016, 2:30 PM

Will branch-weight to 1/2000.

lib/Transforms/Utils/CondDeadCallElimination.cpp
110 ↗(On Diff #70874)

I will change the name.
I'm not sure if it's better to pass Arg directly. I also need the instruction for the IRBuilder (either IRBuilder or CallInst).

202 ↗(On Diff #70874)

Will do.

xur updated this revision to Diff 72139.Sep 21 2016, 11:02 PM
xur edited edge metadata.

Addressed David's review comments. Also changed the coding structure so that
the transformaton in one place.

davidxl added inline comments.Sep 22 2016, 4:45 PM
lib/Transforms/Utils/CondDeadCallElimination.cpp
131 ↗(On Diff #72139)

Make this one also takes IRBuilder<>& as parameter. Introduce another overload of CreateCond that does not have IRBuilder<>&

149 ↗(On Diff #72139)

why not having a separate patch to improve FE?

161 ↗(On Diff #72139)

Why not using attribute marked by FE?

369 ↗(On Diff #72139)

If the runtime exposes API to invoke fast path no-errorno implementation, this pass can also be enhanced to handle calls that have uses. Perhaps add a comment here for future enhancement

471 ↗(On Diff #72139)

Add a comment that the condition can be more conservative than it actually is.

485 ↗(On Diff #72139)

// FIXME

595 ↗(On Diff #72139)

Should OptimizeForSize be passed in via pass builder (see LoopUnswitch) as well?

xur marked 4 inline comments as done.Sep 23 2016, 11:51 AM
xur added inline comments.
lib/Transforms/Utils/CondDeadCallElimination.cpp
149 ↗(On Diff #72139)

we could. I only tested c/c++ FE. I think the good thing doing here is this applies to all FE.

I'll leave it here for now. Will do post commit fix if needed.

161 ↗(On Diff #72139)

I haven't tested that. But why bother? we can safely delete them here.

595 ↗(On Diff #72139)

the parameter OptForSize is passed from pass builder.
Here we also check the function attribute that can overwrite the command line options.

xur updated this revision to Diff 72316.Sep 23 2016, 11:52 AM

Addressed David's review comments.

davidxl added inline comments.Sep 23 2016, 12:01 PM
lib/Transforms/Utils/CondDeadCallElimination.cpp
162 ↗(On Diff #72316)

Actually I think it is the wrong place to handle these no-errno functions here -- they should be handled by regular DCE pass.

In fact functions like ceil etc are already handled by DCE.

So my suggestion is to completely get rid of these no-side effect functions from this patch. Have a separate patch to fix CFE for the missing cases like asinh.

xur updated this revision to Diff 72335.Sep 23 2016, 1:10 PM

Removed the deleting for no-errno functionality suggested by David.
Will submit a separated patch in DCE.

davidxl added inline comments.Sep 26 2016, 5:27 PM
lib/Transforms/Utils/CondDeadCallElimination.cpp
107 ↗(On Diff #72335)

Remove this line.

304 ↗(On Diff #72335)

Move this check into ShrinkWrapCI

test/Transforms/Util/cond-call-dce-double.ll
24 ↗(On Diff #72335)

no need to redefine BRANCH_WEIGHT

34 ↗(On Diff #72335)

same here and other places.

xur updated this revision to Diff 72724.Sep 27 2016, 3:39 PM

Integrated David's review comments.

The plan is to have this pass provide a more general framework to do math function partial inlining as well (as followup). Once that is done, the existing SQRT partial inliine pass can be subsumed.

mehdi_amini edited edge metadata.Oct 7 2016, 10:52 AM

That's a smart transformation!
The patch looks good overall, I only have minor comments inline.

include/llvm/Transforms/Scalar.h
525 ↗(On Diff #72724)

I'm not super convinced by the "DCE" in the name, there does not seem to be "dead code" here, and no "code elimination" either.

Also you need to document the parameter OptimizeForSize, which I'm not convinced should be there at all. If I read correctly the code it totally disable this pass. So adding a pass to the pipeline while disabling it at the same time does not make sense to me.
Also, it handled the function attribute, isn't it enough?

lib/Transforms/Utils/CondDeadCallElimination.cpp
15 ↗(On Diff #72724)

Can you turn this into "if (val < 0) { errno = ... }" and eliminate the call?
Maybe we don't have guarantee that errno is available?

52 ↗(On Diff #72724)

I don't believe it is common to have such flags in pass implementations?

129 ↗(On Diff #72724)

What about FP16 here?

314 ↗(On Diff #72724)

That's a neat idea!

325 ↗(On Diff #72724)

Any example in mind?

Also there is no test for any of the bailout here. Can you add some?

518 ↗(On Diff #72724)

Usually we try to add a message in assertion, something like "checkCandidate shouldn't queue non-TLI func"

davidxl added inline comments.Oct 7 2016, 11:05 AM
lib/Transforms/Utils/CondDeadCallElimination.cpp
312 ↗(On Diff #72724)

used, if there is API .. ---> used. If there is an API ..

313 ↗(On Diff #72724)

errno). We ... --> errno, we ...

Also, the description should be :

.. can use the same framework to direct/wrap the call to the fast API in the error free path, and leave the original call in the slow path.

xur marked 3 inline comments as done.Oct 7 2016, 11:39 AM

Thanks for reviewing this patch. I'll update with your comments and upload it soon.

include/llvm/Transforms/Scalar.h
525 ↗(On Diff #72724)

The intention is to eliminate the call in the runtime. But I agree that there should be a better name for this. Let me think about this.

Thanks for finding OptimizeForSize issue. In the earlier implementation, there is a part of pass still doing some work with OptimizeForSize==true . Now that part is removed from the pass which makes the OptimizeForSize useless. I'll remove this part.

OptimizeForSize is passed from pass manager and it also reads the function attribute. I'm not sure I understand your last question. Can you explain more?

lib/Transforms/Utils/CondDeadCallElimination.cpp
15 ↗(On Diff #72724)

I can do this for some calls -- if I can use the exact condition for the call. For others, like pow(), the conditions we used here are the superset of the true conditions of generating errno . We can not set the errno directly.

Also the assumption is that the chance of getting to the err code path is low. We probably don't concern the performance of this cold path. I would prefer to keep as it is.

52 ↗(On Diff #72724)

You mean in PassManagerBuilder?

129 ↗(On Diff #72724)

Are there math lib calls using FP16 type?
I only know the following types for the math lib calls: double, float and long double.

325 ↗(On Diff #72724)

We probably do not need this check.
Let me try to remove this.

xur updated this revision to Diff 73992.Oct 7 2016, 3:09 PM
xur edited edge metadata.

Addressed Mehdi and David review comments. The bigger ones are:
(1) removed the use of the OptimizedForSize passed from passmanagerbuild.
(2) renamed the pass to LibCallsShrinkWrap.

mehdi_amini added inline comments.Oct 7 2016, 3:31 PM
include/llvm/Transforms/Scalar.h
525 ↗(On Diff #72724)

That's fine, removing OptimizeForSize addresses it.

lib/Transforms/Utils/CondDeadCallElimination.cpp
15 ↗(On Diff #72724)

OK!

52 ↗(On Diff #72724)

We have such flags in the pass manager builder, yes. But not here in within the pass itself I think.

129 ↗(On Diff #72724)

I don't know about what can we stuck into TLI? What would the Cuda target do for instance?

xur added inline comments.Oct 7 2016, 4:07 PM
lib/Transforms/Utils/CondDeadCallElimination.cpp
52 ↗(On Diff #72724)

OK. I'll move this "disable" option to pass manager build. I'll rename it to "disable-libcalls-shrinkwrap"

129 ↗(On Diff #72724)

the library calls I handled in this pass are white-listed. They should take either "float", "double", or "long double" as the argument. I don't think we will ever see a FP16 type here.

mehdi_amini added inline comments.Oct 7 2016, 5:13 PM
lib/Transforms/Utils/CondDeadCallElimination.cpp
129 ↗(On Diff #72724)

You're right!

xur updated this revision to Diff 74299.Oct 11 2016, 2:39 PM

Move internal option -disable-libcalls-shrinkwrap to PassBuilderManager per
Mehdi's suggestion. Also check the FP format. This is primarily for long double
type where the powerpc has narrower range than in x86. We only handle
x87DoubleExtended format for long double for now.

davidxl accepted this revision.Oct 17 2016, 9:55 AM
davidxl edited edge metadata.

lgtm

(the original patch also includes handling of math libs which are not properly annotated by Clang FE. Is there a patch submitted for that fix?)

lib/Transforms/Utils/LibCallsShrinkWrap.cpp
320 ↗(On Diff #74299)

Add a comment (todo) for handle long double in other formats (either with conservative range test or precise handling).

This revision is now accepted and ready to land.Oct 17 2016, 9:55 AM
xur updated this revision to Diff 74876.Oct 17 2016, 11:34 AM
xur edited edge metadata.

Add a TODO comment for long double suggested by David.

To david: I'll send another patch for mathlib functions that not properly annotated.

LGTM as well, thanks!

xur closed this revision.Oct 18 2016, 2:45 PM
This revision was automatically updated to reflect the committed changes.