This is an archive of the discontinued LLVM Phabricator instance.

Add support for math.ctlz in convert-math-to-funcs

Authored by j2kun on Mar 16 2023, 2:32 PM.



This change adds a software implementation of the math.ctlz operation
and includes it in --convert-math-to-funcs.

This is my first change to MLIR, so please bear with me as I'm still learning
the idioms of the codebase.

The context for this change is that I have some larger scale project in which
I'd like to lower from a mix of MLIR dialects to CIRCT, but many of the CIRCT
passes don't support the math dialect.

I noticed the content of convert-math-to-funcs was limited entirely to
the pow functions, but otherwise provided the needed structure to implement
this feature with minimal changes.

Highlight of the changes:

  • Add a dependence on the SCF dialect for this lower. I could have lowered directly to cf, following the pow lowerings in the same pass, but I felt it was not necessary given the existing support for lowering scf to cf.
  • Generalize the DenseMap storing op implementations: modify the callback function hashmap to be keyed by both OperationType (for me this effectively means the name of the op being implemented in software) and the type signature of the resulting function.
  • Implement the ctlz function as a loop. I had researched a variety of implementations that claimed to be more efficient (such as those based on a de Bruijn sequence), but it seems to me that the simplest approach would make it easier for later compiler optimizations to do a better (platform-aware) job optimizing this than I could do by hand.

Questions I had for the reviewer:

  • [edit: found mlir-cpu-runner and added two tests] What would I add to the filecheck invocation to actually run the resulting MLIR on a value and assert the output is correct? I have done this manually with the C implementation but I'm not confident my port to MLIR is correct.
  • Should I add a test for a vectorized version of this lowering? I followed suit with the VecOpToScalarOp but I admit I don't fully understand what it's doing.

Diff Detail

Event Timeline

j2kun created this revision.Mar 16 2023, 2:32 PM
j2kun edited the summary of this revision. (Show Details)Mar 16 2023, 2:46 PM
j2kun published this revision for review.Mar 17 2023, 9:47 AM
j2kun updated this revision to Diff 506128.Mar 17 2023, 10:20 AM

Added mlir-cpu-runner tests for ctlz

j2kun edited the summary of this revision. (Show Details)Mar 17 2023, 10:20 AM

Hi @j2kun, welcome to MLIR code base :)

The story behind only the power operations handled in this converter is that we would prefer to convert them to MLIR LLVM dialect (and then into LLVM IR), but there is no support for ipowi and not all flavors of fpowi are supported by LLVM IR. So I introduced this converter to transform these math operations into inline implementations.

I am not sure that extending it for math::CountLeadingZerosOp is the right direction. For example, in Flang pipeline this operation is converted into LLVM::CountLeadingZerosOp by MathToLLVM converter, and I think we want to keep it this way so that the LLVM backend handles it in a target-dependent fashion. I would prefer to keep Flang flow untouched, so at the very least the ctlz conversion has to be optional.

I am not aware about CIRCT specifics. Are there other options to resolve the problem, e.g. convert it to LLVM::CountLeadingZerosOp?

The backend we're ultimately planning to support does not have a ctlz builtin, so we have no choice but to emulate it in software. This is why lowering to llvm's builtin doesn't help us (and IIUC CIRCT also doesn't support llvm's ctlz)

Since this pass is aimed at software implementations of math functions in general, I would expect the fact that it is used in flang shouldn't have a blocking influence on supporting new ops (the pass is, after all, in MLIR and not just the Flang subproject). Perhaps a good middle ground would be to add a configuration option that turns off the lowering of specific ops, with all conversions enabled by default? I could have ctlz turned off by default in this diff and, in a later change, flang could turn it off explicitly and then the default could be changed to enable it. Or if everyone agrees that ctlz is an unusual case due to most backends having it built in, it could stay off by default forever.

Does that plan sound agreeable?

Yes, I think the option to control ctlz conversion will do. The default is up to you.

j2kun updated this revision to Diff 506702.Mar 20 2023, 1:43 PM

Added default-disabled convert-ctlz option

Thank you for the update!


What would be the result for i1 0 input?


I believe the way you encoded it in MLIR matches continue rather than break: if (x < 0) continue - is this intentional?


Please guard with LLVM_DEBUG and use llvm::dbgs() stream.


I would rather use llvm_unreachable.


nit: no need for braces.


nit: the braces are not needed.

j2kun updated this revision to Diff 509749.Mar 30 2023, 11:32 AM
j2kun marked 5 inline comments as done.

Address reviewer comments

  • for input zero, output integer bit width
  • update/augment tests for above
  • use dbgs/unreachable

Sorry for the delay, I was in Korea at a conference.


It is zero in this case, but I'm reading now that most architectures define this to be the bitwidth of the underlying type, so updated it to reflect that.

Added an extra if branch for the check, and an extra mlir-cpu-runner test to ensure this case is evaluated properly.


I think they are semantically the same, but I did this because I was unable to find the equivalent of an scf.break operation, just scf.yield.

Would you recommend I use instead here? I wasn't sure if it made more sense to keep the output entirely scf, do a mix of scf and cf, and/or let later optimization passes recognize this branch is effectively a break.

j2kun updated this revision to Diff 509755.Mar 30 2023, 11:38 AM

thenBuilder -> elseBuilder

Fixed a typo/bad variable name choice from the last update

Thank you for the update. Looks good to me in general.


Thank you.


I think just changing break to continue in the header comment is enough.


Please change break to continue here as well.

j2kun updated this revision to Diff 510917.Apr 4 2023, 1:46 PM
j2kun marked an inline comment as done.

break -> continue, plus move mlir-cpu-runner

vzakhari accepted this revision.Apr 4 2023, 5:38 PM

LGTM, but the windows pre-merge check failure looks suspicious.

This revision is now accepted and ready to land.Apr 4 2023, 5:38 PM
j2kun updated this revision to Diff 511611.Apr 6 2023, 10:14 PM

Try syncing with head

j2kun added a comment.Apr 6 2023, 11:07 PM

This time only Bazel failed, so ok to merge :)

j2kun added a comment.Apr 6 2023, 11:16 PM

(btw, I don't have commit access, so someone else will need to commit it for me)

(btw, I don't have commit access, so someone else will need to commit it for me)

I can merge it on Monday. Please ping me then.

j2kun added a comment.Apr 10 2023, 8:51 AM

@vzakhari ping :) thanks for your review and advice!

@vzakhari ping :) thanks for your review and advice!

Thanks for the ping! I will merge it today.

This revision was landed with ongoing or failed builds.Apr 10 2023, 10:02 AM
This revision was automatically updated to reflect the committed changes.

@j2kun, this update is causing problems. Here's a short program:

  integer :: i1, i2
  i2 = leadz(i1)

When I compile this program, I get the following output from the compiler:

error: loc("/local/home/psteinfeld/up/install/x.f90":2:3): failed to legalize operation 'math.ctlz' that was explicitly marked illegal
mlir did not succeed

I'm not able to get to my computer til Monday, but perhaps your issue is
that this feature is disabled by default (see comments above), and you need
to explicitly enable it. The test files in this patch show the invocation.

mehdi_amini added inline comments.Apr 15 2023, 12:12 PM

The issue is actually here. This line should be guarded by the option.

PeteSteinfeld added inline comments.Apr 16 2023, 10:40 AM

Thanks, @mehdi_amini. I tried making this change, and I did get different results. But I still get errors. Here's the output from invoking the compiler on the test program I previously provided:

error: loc("/local/home/psteinfeld/up/install/x.f90":0:0): failed to legalize operation 'scf.if'
error: Lowering to LLVM IR failed
error: loc("/local/home/psteinfeld/up/install/x.f90":1:3): cannot be converted to LLVM IR: missing `LLVMTranslationDialectInterface` registration for dialect for op: func.func
error: failed to create the LLVM module
PLEASE submit a bug report to and include the crash backtrace.
Stack dump:
0.      Program arguments: /local/home/psteinfeld/up/install/bin/flang-new -fc1 -triple x86_64-unknown-linux-gnu -emit-obj -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu x86-64 -o /tmp/x-f64cb2.o -x f95-cpp-input x.f90
 #0 0x00005565430284eb llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/local/home/psteinfeld/up/install/bin/flang-new+0x1a0d4eb)
 #1 0x0000556543025c44 SignalHandler(int) Signals.cpp:0:0
 #2 0x000014ece2f5bc20 __restore_rt sigaction.c:0:0
 #3 0x000055654377d6ed Fortran::frontend::CodeGenAction::executeAction() (/local/home/psteinfeld/up/install/bin/flang-new+0x21626ed)
 #4 0x000055654305ff8d Fortran::frontend::FrontendAction::execute() (/local/home/psteinfeld/up/install/bin/flang-new+0x1a44f8d)
 #5 0x000055654304f1bb Fortran::frontend::CompilerInstance::executeAction(Fortran::frontend::FrontendAction&) (/local/home/psteinfeld/up/install/bin/flang-new+0x1a341bb)
 #6 0x00005565430653de Fortran::frontend::executeCompilerInvocation(Fortran::frontend::CompilerInstance*) (/local/home/psteinfeld/up/install/bin/flang-new+0x1a4a3de)
 #7 0x0000556542b90058 fc1_main(llvm::ArrayRef<char const*>, char const*) (/local/home/psteinfeld/up/install/bin/flang-new+0x1575058)
 #8 0x0000556542ae05bc main (/local/home/psteinfeld/up/install/bin/flang-new+0x14c55bc)
 #9 0x000014ece1b7a493 __libc_start_main (/lib64/
#10 0x0000556542b8d70e _start (/local/home/psteinfeld/up/install/bin/flang-new+0x157270e)
flang-new: error: unable to execute command: Segmentation fault (core dumped)
flang-new: error: flang frontend command failed due to signal (use -v to see invocation)
flang-new version 17.0.0 ( 282d114c21bfe5292db659579a944403bec345a2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
failed to legalize operation 'scf.if'
error: Lowering to LLVM IR failed

I don’t think this is because of this patch, likely another problem. But the change I mentioned seems clearly necessary: @j2kun are you on it?

failed to legalize operation 'scf.if'
error: Lowering to LLVM IR failed

I don’t think this is because of this patch, likely another problem. But the change I mentioned seems clearly necessary: @j2kun are you on it?

@vzakhari : may want to revert in the meantime.

I will try to fix it or revert soon.

I can fix it Monday morning. A rollback is fine if it's urgent.

Monday's soon enough for me!