This is an archive of the discontinued LLVM Phabricator instance.

[SDAG] avoid libcalls to fmin/fmax for soft-float targets
ClosedPublic

Authored by spatel on Mar 28 2022, 12:37 PM.

Details

Summary

This is an extension of D70965 to avoid creating a mathlib call where it did not exist in the original source. Also see D70852 for discussion about an alternative proposal that was abandoned.

In the motivating bug report:
https://github.com/llvm/llvm-project/issues/54554
...we also have a more general issue about handling "no-builtin" options. I'm not sure how that works yet.

Diff Detail

Event Timeline

spatel created this revision.Mar 28 2022, 12:37 PM
spatel requested review of this revision.Mar 28 2022, 12:37 PM
efriedma accepted this revision.Mar 28 2022, 12:59 PM

LGTM. (As a followup, we could consider querying TargetLibraryInfo to see if fmax is available, instead of assuming it isn't.)


The general question could probably use some thought... currently, the compiler is very constrained in terms of helper routines. We're basically constrained to calling memcpy/memset/memmove, plus whatever routines are exposed by libgcc, plus whatever TargetLibraryInfo tells us is available. This is pretty limiting: we're currently forced to inline routines we don't really want to inline due to codesize, like multiply-with-overflow, or minnum. Or we have to add weird checks along the lines of D70852. (This is made worse by the fact that SelectionDAG lowering can't really generate control flow.) We basically can't add new routines to compiler-rt builtins because a large fraction of users end up linking with libgcc.

It might be worth considering if we could come up with a mechanism to embed the routines we need into each object file; sort of like compiler-rt builtins, but you wouldn't have to actually link against compiler-rt. So each object file that needs a helper contains a hidden weak symbol, e.g. __llvm_mul_with_overflow_32_v1, and the linker picks one. Sort of wasteful in terms of disk space in object files and static libraries, but it would allow us to add helpers while staying compatible with build systems we don't control.

Just food for thought; I don't have a complete proposal at this point.

This revision is now accepted and ready to land.Mar 28 2022, 12:59 PM

The tests look good; I still need to see how it handles the usecase that originally prompted my bug report.

(As a followup, we could consider querying TargetLibraryInfo to see if fmax is available, instead of assuming it isn't.)

It might be nice to have a switch, possibly controlled by one of the -Os, that would prefer the fmax call.

It might be worth considering if we could come up with a mechanism to embed the routines we need into each object file; sort of like compiler-rt builtins, but you wouldn't have to actually link against compiler-rt. So each object file that needs a helper contains a hidden weak symbol, e.g. __llvm_mul_with_overflow_32_v1, and the linker picks one. Sort of wasteful in terms of disk space in object files and static libraries, but it would allow us to add helpers while staying compatible with build systems we don't control.

Regarding disk space, while it's worse than using a compiler-rt builtin, it's not generally worse than just embedding the code at every point that needs it, since it only appears once per file. Moreover, if this is done by adding intrinsics, the additional disk space would only be realized after machine code generation, and would not be an issue at all with LTO.

Just food for thought; I don't have a complete proposal at this point.

I like this idea, and not just for this particular issue, but more generally for things that are tightly integrated with the compiler, but require runtime support (e.g.: sanitizers).

This revision was landed with ongoing or failed builds.Mar 30 2022, 8:22 AM
This revision was automatically updated to reflect the committed changes.