Many codegen pass require this pass with useful triple info. Legacy pass manager need to
add a TargetLibraryInfo with the module info before run passes. Or the TargetLibraryInfo
will be initialized too conservative.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
CodeGenPasses is a legacy pass manager here. For now. llc.cpp has not switched to the new pass manager. llc.cpp does add this pass. And many passes in codegen need a right TargetLibraryInfo. If not adding it here with the module's triple info. TargetLibraryInfo will be initialized with a null string. That results in setting many library calls unavailable, as conservative as possible. I think it's not reasonable. Welcome discuss! Welcome adding more reviewers.
the summary needs more description, why is this change needed? is it fixing a bug? a test case would be nice
For example this patch can fix such performance drop on Linux:
$ cat foo.h double foo(double a); $ cat foo.c #include <math.h> #include "foo.h" double foo(double a){ return sqrt(a); } $ cat main.c #include <stdio.h> #include "foo.h" int main() { printf("%lf\n", foo(200));} $ cat foo.sh clang -c -o foo.o -Ofast -flto foo.c clang -c -o main.o -Ofast -flto main.c clang -Ofast -flto foo.o main.o -lm objdump -d a.out | less $ sh foo.sh
Without this patch, when enable -flto, it will generate call __sqrt_finite, when disable -flto , it will generate sqrtsd %xmm0,%xmm0. The later version is faster. I don't know how to add lto dependent lit test. Hi @aeubanks, do you know how?
Why would TLI be involved in that test. Isn't the sqrt call converted to llvm.sqrt by the frontend?
Why would TLI be involved in that test. Isn't the sqrt call converted to llvm.sqrt by the frontend?
Sorry for the late response. BTW, thanks for review! This maybe not a good example. I realized it depends on certain version of math.h. Here's the source code pre-processed on my system which can generate call __sqrt_finite: https://gcc.godbolt.org/z/947nhffv1
I'd expect most/all the TLI optimizations to have happened before codegen, although some IR passes can be added to the codegen pipeline, and having the right TLI for those does make sense.
it'd be nice to have an actual test for this in tree, but that might be hard to come up with. can you use --print-after-all to see which pass is doing something different when this patch is applied, just to convince me that this actually works?
Speculating a little. For the sqrt_finite example given previously, it's probably SelectionDAGBuilder that is doing something different. There is code in there to convert LibFunc_sqrt_finite to ISD::FSQRT. I guess we don't turn finite calls into intrinsics anywhere before that. TargetLibraryInfo needs the Triple in order to know if sqrt_finite is supported. It's marked unavailable in TargetLibraryInfo.cpp's initialize function unless the Triple has a Linux OS and a GNU environment.
in that case we should be able to put some IR with some target triple through the LTO pipeline and see that we call the proper function as a test
Thank you both! Your speculating is right! That's the original motivation for this patch.