Page MenuHomePhabricator

[SimplifyLibCalls] Constant folding for fls, flsl, flsll
ClosedPublic

Authored by davide on Nov 11 2015, 2:39 PM.

Details

Summary

This enables constant folding for fls() [FreeBSD specific]. This should be quite straighforward (unless I got the math wrong, which I don't exclude).

Thanks!

Diff Detail

Repository
rL LLVM

Event Timeline

davide updated this revision to Diff 39977.Nov 11 2015, 2:39 PM
davide retitled this revision from to [SimplifyLibCalls] Constant folding for fls, flsl, flsll.
davide updated this object.
davide added reviewers: majnemer, echristo.
davide added a subscriber: llvm-commits.
davide abandoned this revision.Dec 14 2016, 6:27 PM

SimplifyLibCalls needs to know how to translate the pair (DataLayout, C99 type) -> LLVM IR Type in order to make this transformation. Until then, there's no reliable way to know how big a long is and that could cause a wrong constant folding of fls(). I haven't audited all the transformations here, but I wouldn't be surprised if other functions are incorrectly transformed assuming sizes of primitive types.

SimplifyLibCalls needs to know how to translate the pair (DataLayout, C99 type) -> LLVM IR Type in order to make this transformation. Until then, there's no reliable way to know how big a long is and that could cause a wrong constant folding of fls(). I haven't audited all the transformations here, but I wouldn't be surprised if other functions are incorrectly transformed assuming sizes of primitive types.

https://llvm.org/bugs/show_bug.cgi?id=31385

I'm not sure I follow the issue with long. If we see call i32 @flsl(i32 42) in the IR, "long" must be 32 bits; if we see call i32 @flsl(i64 42) in the IR, "long" must be 64 bits.

SimplifyLibCalls needs to know how to translate the pair (DataLayout, C99 type) -> LLVM IR Type in order to make this transformation. Until then, there's no reliable way to know how big a long is and that could cause a wrong constant folding of fls(). I haven't audited all the transformations here, but I wouldn't be surprised if other functions are incorrectly transformed assuming sizes of primitive types.

https://llvm.org/bugs/show_bug.cgi?id=31385

I'm not sure I follow the issue with long. If we see call i32 @flsl(i32 42) in the IR, "long" must be 32 bits; if we see call i32 @flsl(i64 42) in the IR, "long" must be 64 bits.

You're absolutely right, there's no such problem as we actually have the type. I was under the impression we needed the correspondence with the C type but I was clearly wrong. Well, thanks for catching.
I updated the patch, which also revealed a bug in TargetLibraryInfo while checking the function prototype.
Can you please take a look?

davide updated this revision to Diff 81523.Dec 14 2016, 7:50 PM
  1. Fix TargetLibraryInfo prototype checking
  2. make the transform correct with flsl() and flsll()
  3. Remove now unneeded call to checkIntUnaryReturnAndParam()
davide added a reviewer: efriedma.
efriedma added inline comments.Dec 14 2016, 9:40 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1558 ↗(On Diff #81523)

Maybe we should just unconditionally lower fls() to llvm.ctlz()?

davide added inline comments.Dec 15 2016, 12:44 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1558 ↗(On Diff #81523)

Hmm, yeah. I decided to do what ffs already does.
Try to constant fold, and if the argument is not a constant, lower to
(x != 0) ? (i32)(sizeInBits(x) - llvm.ctlz()) : 0
Is this what you had in mind, or you wanted to remove the constant folding part completely?

davide updated this revision to Diff 81637.Dec 15 2016, 12:44 PM
efriedma added inline comments.Dec 15 2016, 1:06 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1551 ↗(On Diff #81637)

There isn't much point to keeping this around; the general-case code will constant-fold anyway.

1560 ↗(On Diff #81637)

"sizeInBits(x) - llvm.ctlz(x, false)" is equivalent, I think?

With llvm.ctlz(x, true) we end up producing slightly smaller code (at least on x86_64 FreeBSD)

i.e.

x != 0 ? (sizeInBits(x) - llvm.ctlz(x, true)) : 0

bsrq    %rdi, %rcx
xorl    $-64, %ecx
addl    $65, %ecx
xorl    %eax, %eax
testq   %rdi, %rdi
cmovnel %ecx, %eax
retq

x != 0 ? (sizeInBits(x) - llvm.ctlz(x, false)) : 0

# BB#0:
        testq   %rdi, %rdi
        je      .LBB0_1
# BB#2:                                 # %cond.false
        bsrq    %rdi, %rax
        xorq    $63, %rax
        jmp     .LBB0_3
.LBB0_1:
        movl    $64, %eax
.LBB0_3:                                # %cond.end
        movl    $64, %ecx
        subl    %eax, %ecx
        xorl    %eax, %eax
        testq   %rdi, %rdi
        cmovnel %ecx, %eax
        retq

sizeInBits(x) - llvm.ctlz(x, false)

# BB#0:
        testq   %rdi, %rdi
        je      .LBB0_1
# BB#2:                                 # %cond.false
        bsrq    %rdi, %rcx
        xorq    $63, %rcx
        jmp     .LBB0_3
.LBB0_1:
        movl    $64, %ecx
.LBB0_3:                                # %cond.end
        movl    $64, %eax
        subl    %ecx, %eax
        retq

Which one do you prefer, 1) or 3) ?

lib/Transforms/Utils/SimplifyLibCalls.cpp
1551 ↗(On Diff #81637)

Done.

efriedma edited edge metadata.Dec 15 2016, 2:04 PM

That's weird... I would have guessed instcombine would kick in an canonicalize one way or the other. Anyway, the difference between 1 and 3 doesn't really matter.

That's weird... I would have guessed instcombine would kick in an canonicalize one way or the other. Anyway, the difference between 1 and 3 doesn't really matter.

I'll go for 3, then.
Slightly related, we seem to produce slightly worse code than gcc-4.2 for ffs(x), see https://godbolt.org/g/CruJtS
I don't think it matters much, but still worth mentioning.

davide updated this revision to Diff 81662.Dec 15 2016, 2:19 PM
davide edited edge metadata.

Simplified after discussion with Eli

efriedma accepted this revision.Dec 15 2016, 3:27 PM
efriedma edited edge metadata.

LGTM with typo fixed.

lib/Transforms/Utils/SimplifyLibCalls.cpp
1550 ↗(On Diff #81662)

Typo: fls, not ffs.

This revision is now accepted and ready to land.Dec 15 2016, 3:27 PM
majnemer added inline comments.Dec 15 2016, 3:31 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1558 ↗(On Diff #81662)

Can't we replace B.getInt32Ty with CI->getType()?

This revision was automatically updated to reflect the committed changes.
davide added inline comments.Dec 15 2016, 3:59 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1558 ↗(On Diff #81662)

Done before committing. Thanks for your comment!