This is an archive of the discontinued LLVM Phabricator instance.

Lower F08 bit-population count intrinsics
ClosedPublic

Authored by tarunprabhu on Jul 12 2022, 12:29 PM.

Details

Summary

This patch implements lowering for the Fortran 2008 bit-population count intrinsics POPCNT, POPPAR, LEADZ and TRAILZ.

POPCNT, LEADZ and TRAILZ are implemented using calls to the corresponding LLVM intrinsic.

POPVAR is implemented in terms of POPCNT.

Diff Detail

Event Timeline

tarunprabhu created this revision.Jul 12 2022, 12:29 PM
tarunprabhu requested review of this revision.Jul 12 2022, 12:29 PM
klausler added inline comments.Jul 12 2022, 12:37 PM
flang/lib/Lower/IntrinsicCall.cpp
3543

Most modern architectures support bit population count instructions; it would be better to use them where available.

Thanks. Hadn't realized that.

I will reimplement using LLVM intrinsics.

tarunprabhu edited the summary of this revision. (Show Details)

Use LLVM intrinsics instead of calculating POPCNT by hand.

Add implementation and tests for LEADZ and TRAILZ in addition to POPCNT and POPPAR since they are also related.

The math dialect has operations that model leadz, trailz, pop etc. There exists a conversion from math dialect to llvm (https://github.com/llvm/llvm-project/blob/4ae254e48850033f4e441a28fb28ae27d63ea458/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp#L87). Using these ops might be useful since the might have some transformations/folding, the existence of vector versions might help vectorisation in the future.
https://mlir.llvm.org/docs/Dialects/MathOps/#mathctpop-mlirmathctpopop
https://mlir.llvm.org/docs/Dialects/MathOps/#mathctlz-mlirmathcountleadingzerosop
https://mlir.llvm.org/docs/Dialects/MathOps/#mathcttz-mlirmathcounttrailingzerosop

flang/test/Evaluate/fold-popcnt.f90
1 ↗(On Diff #444080)

I think this can go in as a separate NFC patch since the patch is testing the implementation of existing code.

Updated patch to use mlir population operators for leadz, popcnt and trailz instead of calling the LLVM intrinsics.

Removed the folding tests from this patch.

tarunprabhu marked 2 inline comments as done.Jul 12 2022, 5:48 PM
kiranchandramohan added a subscriber: vzakhari.
kiranchandramohan added inline comments.
flang/lib/Lower/IntrinsicCall.cpp
3220

The genMathOp function seems to switch between the libm version and the Math dialect Op version based on whether the setting of mathRuntimeVersion is precise or not. I believe the mathRuntimeVersion is for floating point intrinsics and probably not applicable for integer intrinsics. Adding @vzakhari for an opinion on this.

flang/test/Lower/Intrinsics/leadz.f90
2

Nit: Could you add the flang driver also for tests.

vzakhari added inline comments.Jul 13 2022, 3:11 PM
flang/lib/Lower/IntrinsicCall.cpp
3220

Thanks for adding me @kiranchandramohan!

Yes, genMathOp cannot distinguish between FP and non-FP operations currntly, so it will treat the operation as FP and generate a call to ctlz under mathRuntimeVersion == precise. We'd better just create the math operation directly here. Moreover, I do not see which library provides ctlz API...

Create the call to the MLIR ctpop, leadz and trailz instructions directly instead of using genMathOp.

Use the flang driver in the tests.

tarunprabhu marked 3 inline comments as done.Jul 13 2022, 3:31 PM

LGTM. Thanks for all the changes. Please wait a day before submitting in case there are further comments from others.

flang/test/Lower/Intrinsics/leadz.f90
2

Apologies, I was requesting flang driver tests in addition. You may add bbc back when you submit.

This revision is now accepted and ready to land.Jul 13 2022, 3:41 PM
tarunprabhu added inline comments.Jul 13 2022, 3:44 PM
flang/test/Lower/Intrinsics/leadz.f90
2

Could you clarify what you mean by this?

flang/test/Lower/Intrinsics/leadz.f90
2

I meant having tests with the driver as well as bbc like the test https://github.com/llvm/llvm-project/blob/main/flang/test/Lower/Intrinsics/exp.f90.

! RUN: bbc -emit-fir %s -o - | FileCheck %s
! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s

The second part is applicable if you finally use git push to submit. Then you do not need to create another revision in phabricator for review. You can fetch this patch. Add the RUN like for bbc, rebase and then submit using git push

Added flang driver in addition to BBC in the tests.

tarunprabhu marked an inline comment as done.Jul 13 2022, 4:52 PM
This revision was automatically updated to reflect the committed changes.

@tarunprabhu , this implementation is failing NAG tests. Do you have access to them?

@tarunprabhu , this implementation is failing NAG tests. Do you have access to them?

@PeteSteinfeld, no I do not. If you can file a bug report, I can investigate the failure.

@tarunprabhu , this implementation is failing NAG tests. Do you have access to them?

@PeteSteinfeld, no I do not. If you can file a bug report, I can investigate the failure.

Thanks, @tarunprabhu . I'm all set for now.