This is an archive of the discontinued LLVM Phabricator instance.

[Flang] Change fir.divc to perform library call rather than generate inline operations.
ClosedPublic

Authored by SBallantyne on Mar 10 2023, 9:18 AM.

Details

Summary

Currently fir.divc is always lowered to a sequence of llvm operations to perform complex division, however this causes issues for extreme values when the calculations overflow. While this behaviour would be fine at -Ofast, this is currently the default at all levels.

This patch changes fir.divc to lower to a library call instead, except for when KIND=3 as there is no appropriate library call for this case.

Diff Detail

Event Timeline

SBallantyne created this revision.Mar 10 2023, 9:18 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 10 2023, 9:18 AM
SBallantyne requested review of this revision.Mar 10 2023, 9:18 AM

Add test to check for generating library call or generating inline code

vzakhari added inline comments.
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

I guess these names might not be available on all targets. I think we'd better call Fortran runtime library implementation here.

SBallantyne added inline comments.Mar 13 2023, 6:29 AM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

These functions are provided by compiler-rt, which as i understand is already a requirement for flang. I think it would be better to just reuse the implementation from there, rather than add it again in the fortran runtime library.

clementval added inline comments.
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

Where did you see it is a requirement?

kiranchandramohan added inline comments.
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

It is listed in the Readme (https://github.com/llvm/llvm-project/tree/main/flang#building-flang-in-tree). I believe @PeteSteinfeld discussed this before and added it.

vzakhari added inline comments.Mar 13 2023, 9:12 AM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

I think flang driver does not link clang_rt.builtins (under -flang-experimental-exec), and currently the dependencies on these functions are satisfied by libgcc on Linux. It seems the driver needs to be changed, if we want to rely on compiler-rt implementation. Did I miss something?

flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

That is a good point. We can discuss this in the Wednesday Call.

The build instructions were updated in https://reviews.llvm.org/D116566 by @PeteSteinfeld. I am assuming at that time Pete was using a custom driver. And we probably missed this while adding the flang-experimental-exec flag. Also, not clear whether libgcc covers this on Linux. Either way, it looks like we have to agree and/or update the driver code or the Readme.

vzakhari added inline comments.Mar 13 2023, 10:02 AM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

I think Pete's comment about the dependency on compiler-rt might be related to the compiler itself, e.g. you may see that -lrt in build/tools/flang/tools/flang-driver/CMakeFiles/flang-new.dir/link.txt

I wonder whether intermediate computations generated by ComplexToStandard conversion complex::DivOp honor the range of the element's data type. Maybe we can think of replacing the FIR operation with the one from the Complex dialect?

I wonder whether intermediate computations generated by ComplexToStandard conversion complex::DivOp honor the range of the element's data type. Maybe we can think of replacing the FIR operation with the one from the Complex dialect?

I think @SBallantyne tried using the mlir::Complex:DivOp first and it did not work.

flang/lib/Optimizer/CodeGen/CodeGen.cpp
3571

Possibly.

@jdoerfert Would using functions from compiler-rt cause issues with target offloading?
Context: We are thinking of using functions from compiler-rt for implementing the complex division operation. The existing implementations in Flang (an inline one) and the one on MLIR both have precision issues whereas the compiler-rt ones work fine.

@jdoerfert Would using functions from compiler-rt cause issues with target offloading?
Context: We are thinking of using functions from compiler-rt for implementing the complex division operation. The existing implementations in Flang (an inline one) and the one on MLIR both have precision issues whereas the compiler-rt ones work fine.

I think these functions exist for target offloading too, I've used complex numbers in omp regions in C/C++ before and it has worked, and afaict clang just calls these functions. At least for Nvidia there's clang/lib/Headers/__clang_cuda_complex_builtins.h which contains them.

@jdoerfert Would using functions from compiler-rt cause issues with target offloading?
Context: We are thinking of using functions from compiler-rt for implementing the complex division operation. The existing implementations in Flang (an inline one) and the one on MLIR both have precision issues whereas the compiler-rt ones work fine.

I think these functions exist for target offloading too, I've used complex numbers in omp regions in C/C++ before and it has worked, and afaict clang just calls these functions. At least for Nvidia there's clang/lib/Headers/__clang_cuda_complex_builtins.h which contains them.

Thank you, David. Since it is already the case with clang, I suppose, it should be okay for Flang too.

vzakhari accepted this revision.Mar 29 2023, 9:03 AM
This revision is now accepted and ready to land.Mar 29 2023, 9:03 AM

Update to move half float to inline instead of library call as compiler-rt doesn't currently support divhc3. This may be added at some point as this currently causes errors for clang when libgcc is not available, see this issue: https://github.com/llvm/llvm-project/issues/61914

@SBallantyne, this patch causes problems with COMPLEX division. See the comments I added to the commit -- https://reviews.llvm.org/rGa7bb8e273f433cceeb547e87d04114178573496a