This is an archive of the discontinued LLVM Phabricator instance.

Differential D145299

[InstCombine] Generate better code for std::bit_ceil
ClosedPublic

Authored by kazu on Mar 4 2023, 12:25 AM.

Details

Reviewers

RKSimon
spatel
goldstein.w.n
nikic
pengfei

Commits

rG231fa2743510: [InstCombine] Generate better code for std::bit_ceil

Summary

Without this patch, std::bit_ceil<uint32_t> is compiled as:

%dec = add i32 %x, -1
%lz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
%sub = sub i32 32, %lz
%res = shl i32 1, %sub
%ugt = icmp ugt i32 %x, 1
%sel = select i1 %ugt, i32 %res, i32 1

With this patch, we generate:

%dec = add i32 %x, -1
%ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
%sub = sub nsw i32 0, %ctlz
%and = and i32 %sub, 31
%sel = shl nuw i32 1, %and
ret i32 %sel

https://alive2.llvm.org/ce/z/pwezvF
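
To make the two IR shapes above concrete, here is a minimal C++ sketch written for illustration only. The helper names `ctlz32`, `bitCeilSelect`, and `bitCeilMasked` are hypothetical and do not appear in the patch; `ctlz32` is a portable loop standing in for `@llvm.ctlz.i32` with `i1 false`, so it returns 32 for a zero input.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros (hypothetical helper); returns 32 for v == 0,
// matching @llvm.ctlz.i32(v, i1 false).
static unsigned ctlz32(uint32_t v) {
  unsigned n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Shape of the IR before the patch: the shift is guarded by a select.
// Only call this with x <= 0x80000000; larger inputs would shift by 32,
// which is undefined in C++ just as it is poison in LLVM IR.
static uint32_t bitCeilSelect(uint32_t x) {
  unsigned lz = ctlz32(x - 1);
  return x > 1 ? (uint32_t{1} << (32 - lz)) : uint32_t{1};
}

// Shape of the IR after the patch: negate the count, mask with 31, shift.
static uint32_t bitCeilMasked(uint32_t x) {
  unsigned lz = ctlz32(x - 1);
  return uint32_t{1} << ((0u - lz) & 31u);
}
```

For inputs 0 and 1 the count is 0 or 32, so the masked shift amount is 0 and the result is 1 without any select. For every other input up to 0x80000000 the count lies in 1..31 and `(-lz) & 31` equals `32 - lz`.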

This patch recognizes the specific pattern and drops the conditional
move. Specifically, it recognizes patterns from std::bit_ceil in
libc++ and libstdc++. In addition to the LLVM IR generated for
std::bit_ceil(X), this patch recognizes variants like:

std::bit_ceil(X - 1)
std::bit_ceil(X + 1)
std::bit_ceil(X + 2)
std::bit_ceil(-X)
std::bit_ceil(~X)

This patch fixes:

https://github.com/llvm/llvm-project/issues/60802

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

kazu created this revision.Mar 4 2023, 12:25 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2023, 12:25 AM

Herald added subscribers: pengfei, hiraditya. · View Herald Transcript

kazu requested review of this revision.Mar 4 2023, 12:25 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2023, 12:25 AM

Herald added a subscriber: llvm-commits. · View Herald Transcript

kazu edited the summary of this revision. (Show Details)Mar 4 2023, 12:28 AM

kazu added reviewers: RKSimon, spatel, goldstein.w.n.

Herald added a subscriber: StephenFan. · View Herald TranscriptMar 4 2023, 12:28 AM

Harbormaster completed remote builds in B217355: Diff 502358.Mar 4 2023, 1:52 AM

goldstein.w.n added inline comments.Mar 4 2023, 10:23 AM

llvm/test/CodeGen/X86/bit_ceil.ll
2 ↗	(On Diff #502358)	Can you add tests without `bmi` and without `lzcnt` enabled?
36 ↗	(On Diff #502358)	Is the select at the end just unneeded? If so, this would probably be better addressed in the IR or `DAGCombiner`, as it's pretty generic. Maybe `DAGCombiner::VisitSETCC`?

goldstein.w.n added inline comments.Mar 4 2023, 10:23 AM

llvm/lib/Target/X86/X86ISelLowering.cpp
47340 ↗	(On Diff #502358)	This all seems a bit brittle. For example, if someone does `std::bit_ceil(x + 1)`, we won't have the `add` on `ctlz`:

%2 = tail call i32 @llvm.ctlz.i32(i32 %0, i1 false), !range !5
%3 = sub nuw nsw i32 32, %2
%4 = shl nuw i32 1, %3
%5 = add i32 %0, -1
%6 = icmp ult i32 %5, -2
%7 = select i1 %6, i32 %4, i32 1
ret i32 %7

Is there a way we could more generically detect that the `select` is unneeded?

kazu marked an inline comment as done.Mar 4 2023, 12:37 PM

kazu added inline comments.

llvm/lib/Target/X86/X86ISelLowering.cpp
47340 ↗	(On Diff #502358)	Yes, brittleness is a real concern. I could calculate the difference between the `ctlz` argument and `icmp` argument.
llvm/test/CodeGen/X86/bit_ceil.ll
2 ↗	(On Diff #502358)	Sure. I just added: https://github.com/llvm/llvm-project/commit/ccc849e0b1a4acb5fb16b0d0ddb63744fb0faec9
36 ↗	(On Diff #502358)	Correct, we do not need the `select` at the end. We do rely on a subtle property that `shlx` masks off the shift count with 31 or 63, depending on the operand size. I generate `ISD::AND` with 31 (or 63), expecting that the hardware shift instruction absorbs it. At the LLVM IR level, the result of the shift instruction is undefined because the final `select` picks 1 for inputs < 2. Yes, we could do this in the IR or DAG combiner somewhere. Let me look into it.

goldstein.w.n added inline comments.Mar 4 2023, 1:48 PM

llvm/test/CodeGen/X86/bit_ceil.ll
36 ↗	(On Diff #502358)	Quoting kazu: "Correct, we do not need the `select` at the end. We do rely on a subtle property that `shlx` masks off the shift count with 31 or 63, depending on the operand size. I generate `ISD::AND` with 31 (or 63), expecting that the hardware shift instruction absorbs it." You can still generate the `ISD::AND` in DAGCombiner; it will be removed in later lowering.

As noted in previous comments, it's hard to reliably pattern-match a sequence that is this long. We are missing several potential canonicalizations in IR, and I see at least one possible variant that is shorter in IR:
https://alive2.llvm.org/ce/z/CkQ433

We probably want to form the umax variant in IR based on it being one less instruction. If a target has a umax instruction (and ctlz), then that's probably going to be the best in codegen.

So this patch should wait until the IR questions are resolved, but at first glance, it seems like we will need D144451, more codegen conversions, and several IR patches.

What is preventing us from performing this in InstCombine? I don't think this pattern will emerge in SelectionDAG

In D145299#4169702, @RKSimon wrote:

What is preventing us from performing this in InstCombine? I don't think this pattern will emerge in SelectionDAG

I haven't found a way to avoid a poison shift in IR without doing a cmp+select or umax yet. I think we're relying on the x86-specific behavior of masking the shift amount to make that part of the logic disappear in this patch.

In D145299#4169706, @spatel wrote:

I haven't found a way to avoid a poison shift in IR without doing a cmp+select or umax yet. I think we're relying on the x86-specific behavior of masking the shift amount to make that part of the logic disappear in this patch.

The IR is:

%2 = add i32 %0, -1
%3 = tail call i32 @llvm.ctlz.i32(i32 %2, i1 false), !range !5
%4 = sub nuw nsw i32 32, %3
%5 = shl nuw i32 1, %4
%6 = icmp ugt i32 %0, 1
%7 = select i1 %6, i32 %5, i32 1
ret i32 %7

The poison shift is if %3 is zero?

In D145299#4169799, @goldstein.w.n wrote:
The IR is:
%2 = add i32 %0, -1
%3 = tail call i32 @llvm.ctlz.i32(i32 %2, i1 false), !range !5
%4 = sub nuw nsw i32 32, %3
%5 = shl nuw i32 1, %4
%6 = icmp ugt i32 %0, 1
%7 = select i1 %6, i32 %5, i32 1
ret i32 %7
The poison shift is if %3 is zero?

Yes - if we shift by the bitwidth, that's poison in IR.

IIUC in this example, we don't have to care about any input "ugt 0x8000_0000" ( https://en.cppreference.com/w/cpp/numeric/bit_ceil ), so we'd need the front-end to provide that info somehow. But 0x8000_0000 is still a valid input, so does that knowledge actually help?

In D145299#4169852, @spatel wrote:
Yes - if we shift by the bitwidth, that's poison in IR.

IIUC in this example, we don't have to care about any input "ugt 0x8000_0000" ( https://en.cppreference.com/w/cpp/numeric/bit_ceil ), so we'd need the front-end to provide that info somehow. But 0x8000_0000 is still a valid input, so does that knowledge actually help?

Yeah, the implementation would either need to change `width - clz(X)` to `-clz(X) % width` or add an assume. I guess it makes sense to keep this in X86ISel, but it should happen sooner in the pipeline rather than as a very brittle match on cmov. We might be able to find an easier pattern in CombineSELECT, although I have doubts about whether it would stand up in real code.
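
The rewrite from `width - clz(X)` to `-clz(X) % width` mentioned in this comment can be checked with a few lines of C++. This is a small sketch for a width of 32; the function names `shiftAmountSub`, `shiftAmountMasked`, and `formsAgreeInRange` are made up for illustration and are not part of any LLVM API.

```cpp
#include <cassert>

// Shift amount used by the original pattern: BitWidth - ctlz.
static unsigned shiftAmountSub(unsigned lz) { return 32u - lz; }

// Shift amount after the rewrite: (-ctlz) mod BitWidth, written as a mask.
static unsigned shiftAmountMasked(unsigned lz) { return (0u - lz) & 31u; }

// Returns true if both forms agree for every ctlz result in 1..32.
static bool formsAgreeInRange() {
  for (unsigned lz = 1; lz <= 32; ++lz)
    if (shiftAmountSub(lz) != shiftAmountMasked(lz) % 32u &&
        shiftAmountSub(lz) % 32u != shiftAmountMasked(lz))
      return false;
  return true;
}
```

The two forms differ only where the count is 0 or 32: the subtraction yields 32 for a count of 0 (the out-of-range shift discussed above) and 0 for a count of 32, while the masked form yields 0 in both cases.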

In D145299#4169706, @spatel wrote:

I haven't found a way to avoid a poison shift in IR without doing a cmp+select or umax yet. I think we're relying on the x86-specific behavior of masking the shift amount to make that part of the logic disappear in this patch.

Well, we could also do what is being proposed here and mask the shift amount: https://alive2.llvm.org/ce/z/FD_9Sh On x86, that happens to be free. I'm not sure this makes sense as an IR canonicalization though.

In D145299#4169855, @nikic wrote:

Well, we could also do what is being proposed here and mask the shift amount: https://alive2.llvm.org/ce/z/FD_9Sh On x86, that happens to be free. I'm not sure this makes sense as an IR canonicalization though.

Oops - yes, I was overlooking that direct translation. It's shorter IR with the mask, and the backend should already account for the mask if it can be removed, so yes, that seems like a good improvement.

I think we still have several potential canonicalizations to deal with if we want a robust solution (see my earlier Alive2 link with variations with umax and shift-right).

Alive2: https://alive2.llvm.org/ce/z/pwezvF

This should be handled in InstCombine

This revision now requires changes to proceed.Mar 8 2023, 3:44 AM

Ported to InstCombine.

kazu retitled this revision from [X86] Generate better code for std::bit_ceil to [InstCombine] Generate better code for std::bit_ceil.Mar 12 2023, 1:04 PM

kazu edited the summary of this revision. (Show Details)

In D145299#4177666, @RKSimon wrote:

This should be handled in InstCombine

I just ported my patch to InstCombine. Please take a look. Thanks!

Harbormaster completed remote builds in B218900: Diff 504466.Mar 12 2023, 1:54 PM

RKSimon added a reviewer: nikic.Mar 13 2023, 2:36 AM

goldstein.w.n added inline comments.Mar 16 2023, 11:15 AM

llvm/test/Transforms/InstCombine/bit_ceil.ll
15	Can you add some tests where there are prior transformations on `%x` (some binops like add/sub/shift/and/xor/or) to test that this implementation is reasonably robust in finding the pattern?

kazu marked an inline comment as done.Mar 19 2023, 1:47 PM

kazu added inline comments.

llvm/test/Transforms/InstCombine/bit_ceil.ll
15	I added a few more tests: https://github.com/llvm/llvm-project/commit/ef860cf150f0d7e0f635026c55182bc1f65ac8f9 I'll update this patch shortly

Handle more cases.

Please take a look. Thanks!

kazu added a reviewer: pengfei.Mar 19 2023, 2:11 PM

Harbormaster completed remote builds in B220317: Diff 506421.Mar 19 2023, 2:46 PM

I haven't looked in detail what you're doing here, but you're clearly looking for the ConstantRange class. makeExactICmpRegion(), add(), sub(), binaryNot(), etc.

This revision now requires changes to proceed.Mar 19 2023, 2:53 PM

Started using ConstantRange.

In D145299#4205118, @nikic wrote:

I haven't looked in detail what you're doing here, but you're clearly looking for the ConstantRange class. makeExactICmpRegion(), add(), sub(), binaryNot(), etc.

Thank you for the pointer. I started using ConstantRange in this patch.
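
The range-based reasoning that `ConstantRange` enables here can be illustrated with a toy model. The sketch below is not LLVM's `ConstantRange`; it represents a range as an explicit set of 8-bit values (`ValueSet`, a hypothetical name) and provides set versions of add, subtract-from-constant, and bitwise not. The worked case is an 8-bit analogue of the `std::bit_ceil(x - 1)` test, where the select condition is `(x - 3) u< -2` and the `ctlz` operand is `x - 2`.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Toy stand-in for a value range: the set of 8-bit values a variable may take.
using ValueSet = std::bitset<256>;

// { (v + c) mod 256 : v in s }
static ValueSet addConst(const ValueSet &s, uint8_t c) {
  ValueSet r;
  for (unsigned v = 0; v < 256; ++v)
    if (s[v]) r.set((v + c) & 0xFF);
  return r;
}

// { (c - v) mod 256 : v in s }
static ValueSet subFromConst(uint8_t c, const ValueSet &s) {
  ValueSet r;
  for (unsigned v = 0; v < 256; ++v)
    if (s[v]) r.set((c - v) & 0xFF);
  return r;
}

// { ~v : v in s }
static ValueSet bitwiseNot(const ValueSet &s) {
  ValueSet r;
  for (unsigned v = 0; v < 256; ++v)
    if (s[v]) r.set(~v & 0xFF);
  return r;
}

// True if every member is 0 or has the sign bit set.
static bool allZeroOrNegative(const ValueSet &s) {
  for (unsigned v = 1; v < 128; ++v)
    if (s[v]) return false;
  return true;
}

// 8-bit analogue of bit_ceil(x - 1): the compare is (x - 3) u< 254 and the
// ctlz operand is x - 2.
static bool bitCeilMinus1SelectIsRedundant() {
  ValueSet cond0;  // values of (x - 3) for which the compare is false
  cond0.set(254);
  cond0.set(255);
  ValueSet x = addConst(cond0, 3);        // undo "add x, -3": {1, 2}
  ValueSet ctlzOp = addConst(x, 254);     // apply "add x, -2": {255, 0}
  return allZeroOrNegative(ctlzOp);
}
```

Walking backward from the compare operand to `x` and then forward to the `ctlz` operand mirrors the description in the patch: start from the values for which the select would pick 1, then transform that set step by step.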

Harbormaster completed remote builds in B220341: Diff 506451.Mar 19 2023, 6:11 PM

RKSimon added inline comments.Mar 20 2023, 6:28 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3193	empty clause?

kazu added inline comments.Mar 20 2023, 8:58 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3193	I would like to find `X` here and leave it `nullptr` if no match occurs. I don't have any actions when the `if` condition is true. Should I say something like this?

(void) match(CtlzOp, m_Add(m_Value(X), m_ConstantInt())) ||
    match(CtlzOp, m_Sub(m_ConstantInt(), m_Value(X))) ||
    match(CtlzOp, m_Not(m_Value(X)));

nikic added inline comments.Mar 20 2023, 12:40 PM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3276	All these m_ConstantInt should be m_APInt.

Use m_APInt instead of m_ConstantInt.

kazu marked an inline comment as done.Mar 20 2023, 1:00 PM

Harbormaster completed remote builds in B220520: Diff 506687.Mar 20 2023, 3:03 PM

please can you add vector test coverage?

nikic added inline comments.Mar 21 2023, 7:53 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3193	Something that I don't really get is why we have to do this separately. Wouldn't something like this be sufficient?

Value *RangeOp = Cond0;
ConstantRange CR = ConstantRange::makeExactICmpRegion(
    CmpInst::getInversePredicate(Pred), *Cond1);

if (CtlzOp != RangeOp) {
  if (match(CtlzOp, m_Add(m_Specific(RangeOp), m_APInt(C)))) {
    CR = CR.add(*C);
  }
  // ...
}

3225	These RangeOp assignments are all dead?
3263	`IRBuilderBase &` is preferred.
3270	Drop nullptr init.
3292	`CreateNeg`

Removed the empty clause.
Removed RangeOp.
Switched to CreateNeg.
Made the matching a little more straightforward.

Please take a look. Thanks!

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3193	I've revamped the matching logic. No more `RangeOp`.

Harbormaster completed remote builds in B220831: Diff 507117.Mar 21 2023, 3:33 PM

Implementation looks fine, but this needs more test coverage.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3197	Check `CommonAncestor == CtlzOp` here and then drop the first `if` below? So it's one MatchForward plus one MatchForward after peeking through add?
llvm/test/Transforms/InstCombine/bit_ceil.ll
153	Some missing tests: Commuted select operands. Select constant not 1. Select condition does not imply the needed range. Multi-use test. Vector test. Wrong constant in sub.

This revision now requires changes to proceed.Mar 22 2023, 2:32 AM

Updated tests, including a vector one.

I've updated the patch. Please take a look. Thanks!

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3197	I am not sure if I understand your suggestion here. Could you elaborate? We have three possibilities:

Peek through an `add`
`MatchForward`
Peek through an `add` and `MatchForward`

I cannot just drop the first `if` below. I need to check for a specific operand `m_Add(m_Specific(CtlzOp), ...)`.

Harbormaster completed remote builds in B221228: Diff 507617.Mar 22 2023, 11:45 PM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3193–3227	To clarify, this is what I meant.
llvm/test/Transforms/InstCombine/bit_ceil.ll
154–155	The order gets canonicalized here. You can use something like `icmp slt %x, 0` to avoid.

unblocking this - thanks @kazu

This revision is now accepted and ready to land.Mar 23 2023, 5:22 AM

Adjusted MatchForward.
Adjusted the condition for select.

kazu marked 2 inline comments as done.Mar 23 2023, 7:08 PM

kazu added inline comments.

llvm/test/Transforms/InstCombine/bit_ceil.ll
154–155	I settled on `icmp eq %dec, 0`. The order of the select operands seems to stick that way.

This revision was landed with ongoing or failed builds.Mar 23 2023, 7:27 PM

Closed by commit rG231fa2743510: [InstCombine] Generate better code for std::bit_ceil (authored by kazu). · Explain Why

This revision was automatically updated to reflect the committed changes.

kazu marked an inline comment as done.

kazu added a commit: rG231fa2743510: [InstCombine] Generate better code for std::bit_ceil.

Harbormaster completed remote builds in B221467: Diff 507939.Mar 23 2023, 8:22 PM

Revision Contents

Path	Size
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp	131 lines
llvm/test/Transforms/InstCombine/bit_ceil.ll	52 lines

Diff 507117

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

... (3,157 unchanged lines not shown) ...

    C = Builder.CreateFreeze(C);
    return SelectInst::Create(C, B, A);
  }

  return nullptr;
}

// Return true if we can safely remove the select instruction for std::bit_ceil
// pattern.
static bool isSafeToRemoveBitCeilSelect(ICmpInst::Predicate Pred, Value *Cond0,
                                        const APInt *Cond1, Value *CtlzOp,
                                        unsigned BitWidth) {
  // The challenge in recognizing std::bit_ceil(X) is that the operand is used
  // for the CTLZ proper and select condition, each possibly with some
  // operation like add and sub.
  // Our aim is to make sure that -ctlz & (BitWidth - 1) == 0 even when the
  // select instruction would select 1, which allows us to get rid of the select
  // instruction.
  // To see if we can do so, we do some symbolic execution with ConstantRange.
  // Specifically, we compute the range of values that Cond0 could take when
  // Cond == false. Then we successively transform the range until we obtain
  // the range of values that CtlzOp could take.
  // Conceptually, we follow the def-use chain backward from Cond0 while
  // transforming the range for Cond0 until we meet the common ancestor of Cond0
  // and CtlzOp. Then we follow the def-use chain forward until we obtain the
  // range for CtlzOp. That said, we only follow at most one ancestor from
  // Cond0. Likewise, we only follow at most one ancestor from CtlzOp.
  ConstantRange CR = ConstantRange::makeExactICmpRegion(
      CmpInst::getInversePredicate(Pred), *Cond1);

  // Match the operation that's used to compute CtlzOp from CommonAncestor. If
  // a match is found, execute the operation on CR, update CR, and return true.
  // Otherwise, return false.
  auto MatchForward = [&](Value *CommonAncestor) {
    const APInt *C = nullptr;
    if (match(CtlzOp, m_Add(m_Specific(CommonAncestor), m_APInt(C)))) {
      CR = CR.add(*C);
      return true;
    }
    if (match(CtlzOp, m_Sub(m_APInt(C), m_Specific(CommonAncestor)))) {
      CR = ConstantRange(*C).sub(CR);
      return true;
    }
    if (match(CtlzOp, m_Not(m_Specific(CommonAncestor)))) {
      CR = CR.binaryNot();
      return true;
    }
    return false;
  };

  const APInt *C = nullptr;
  Value *CommonAncestor;
  if (match(Cond0, m_Add(m_Specific(CtlzOp), m_APInt(C)))) {
    // We have Cond0's parent == CtlzOp.
    CR = CR.sub(*C);
  } else if (MatchForward(Cond0)) {
    // We have Cond0 == CtlzOp's parent. CR has been updated.
  } else if (match(Cond0, m_Add(m_Value(CommonAncestor), m_APInt(C)))) {
    CR = CR.sub(*C);
    if (!MatchForward(CommonAncestor))
      return false;
    // We have Cond0's parent == CtlzOp's parent. CR has been updated.
  } else {
    return false;
  }

  // Return true if all the values in the range are either 0 or negative (if
  // treated as signed). We do so by evaluating:
  // CR - 1 u>= (1 << (BitWidth - 1)) - 1.
  APInt IntMax = APInt::getSignMask(BitWidth) - 1;
  CR = CR.sub(APInt(BitWidth, 1));
  return CR.icmp(ICmpInst::ICMP_UGE, IntMax);
}

nikic (inline comment on lines 3193–3227): To clarify, this is what I meant.

        CmpInst::getInversePredicate(Pred), *Cond1);
    // Match the operation that's used to compute CtlzOp from CommonAncestor. If
    // a match is found, execute the operation on CR, update CR, and return true.
    // Otherwise, return false.
    auto MatchForward = [&](Value *CommonAncestor) {
+     if (CtlzOp == CommonAncestor)
+       return true;
      const APInt *C = nullptr;
      if (match(CtlzOp, m_Add(m_Specific(CommonAncestor), m_APInt(C)))) {
        CR = CR.add(*C);
        return true;
      }
      if (match(CtlzOp, m_Sub(m_APInt(C), m_Specific(CommonAncestor)))) {
        CR = ConstantRange(*C).sub(CR);
        return true;
      }
      if (match(CtlzOp, m_Not(m_Specific(CommonAncestor)))) {
        CR = CR.binaryNot();
        return true;
      }
      return false;
    };
    const APInt *C = nullptr;
    Value *CommonAncestor;
-   if (match(Cond0, m_Add(m_Specific(CtlzOp), m_APInt(C)))) {
-     // We have Cond0's parent == CtlzOp.
-     CR = CR.sub(*C);
-   } else if (MatchForward(Cond0)) {
+   if (MatchForward(Cond0)) {
      // We have Cond0 == CtlzOp's parent. CR has been updated.
    } else if (match(Cond0, m_Add(m_Value(CommonAncestor), m_APInt(C)))) {
      CR = CR.sub(*C);
      if (!MatchForward(CommonAncestor))
        return false;
      // We have Cond0's parent == CtlzOp's parent. CR has been updated.
    } else {
      return false;
    }
    // Return true if all the values in the range are either 0 or negative (if
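
The closing check of `isSafeToRemoveBitCeilSelect` tests "0 or negative" with a single unsigned comparison. The following sketch verifies that trick exhaustively at 8 bits and ties it back to the stated goal that `-ctlz & (BitWidth - 1)` is 0 for such values. All names here (`zeroOrNegative`, `zeroOrNegativeViaCompare`, `ctlz8`, `checkAll8Bit`) are hypothetical; the code does not use LLVM's `APInt` or `ConstantRange`.

```cpp
#include <cassert>
#include <cstdint>

// Direct statement of the property at 8 bits: v is 0, or negative if signed.
static bool zeroOrNegative(uint8_t v) { return v == 0 || (v & 0x80) != 0; }

// The same property as one unsigned compare: (v - 1) u>= 0x7F, where 0x7F is
// the 8-bit signed maximum, i.e. the sign mask minus 1.
static bool zeroOrNegativeViaCompare(uint8_t v) {
  return static_cast<uint8_t>(v - 1) >= 0x7F;
}

// Portable 8-bit count-leading-zeros; returns 8 for v == 0.
static unsigned ctlz8(uint8_t v) {
  unsigned n = 0;
  for (unsigned bit = 0x80u; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Returns true if both formulations agree on all 256 values and every value
// that satisfies them produces a masked shift amount of 0.
static bool checkAll8Bit() {
  for (unsigned i = 0; i < 256; ++i) {
    uint8_t v = static_cast<uint8_t>(i);
    if (zeroOrNegative(v) != zeroOrNegativeViaCompare(v))
      return false;
    if (zeroOrNegative(v) && ((0u - ctlz8(v)) & 7u) != 0)
      return false;
  }
  return true;
}
```

Subtracting 1 maps 0 to the all-ones value and leaves every negative value at or above the signed maximum, so one `u>=` covers both cases.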

// Transform the std::bit_ceil(X) pattern like:
//   %dec = add i32 %x, -1
//   %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
//   %sub = sub i32 32, %ctlz
//   %shl = shl i32 1, %sub
//   %ugt = icmp ugt i32 %x, 1
//   %sel = select i1 %ugt, i32 %shl, i32 1
// into:
//   %dec = add i32 %x, -1
//   %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
//   %neg = sub i32 0, %ctlz
//   %masked = and i32 %neg, 31
//   %shl = shl i32 1, %masked
// Note that the select is optimized away while the shift count is masked with
// 31. We handle some variations of the input operand like std::bit_ceil(X +
// 1).
static Instruction *foldBitCeil(SelectInst &SI, IRBuilderBase &Builder) {
  Type *SelType = SI.getType();
  unsigned BitWidth = SelType->getScalarSizeInBits();

  Value *FalseVal = SI.getFalseValue();
  Value *TrueVal = SI.getTrueValue();
  ICmpInst::Predicate Pred;
  const APInt *Cond1;
  Value *Cond0, *Ctlz, *CtlzOp;
  if (!match(SI.getCondition(), m_ICmp(Pred, m_Value(Cond0), m_APInt(Cond1))))
    return nullptr;

  if (match(TrueVal, m_One())) {
    std::swap(FalseVal, TrueVal);
    Pred = CmpInst::getInversePredicate(Pred);
  }

  if (!match(FalseVal, m_One()) ||
      !match(TrueVal,
             m_OneUse(m_Shl(m_One(), m_OneUse(m_Sub(m_SpecificInt(BitWidth),
                                                    m_Value(Ctlz)))))) ||
      !match(Ctlz, m_Intrinsic<Intrinsic::ctlz>(m_Value(CtlzOp), m_Zero())) ||
      !isSafeToRemoveBitCeilSelect(Pred, Cond0, Cond1, CtlzOp, BitWidth))
    return nullptr;

  // Build 1 << (-CTLZ & (BitWidth-1)). The negation likely corresponds to a
  // single hardware instruction as opposed to BitWidth - CTLZ, where BitWidth
  // is an integer constant. Masking with BitWidth-1 comes free on some
  // hardware as part of the shift instruction.
  Value *Neg = Builder.CreateNeg(Ctlz);
  Value *Masked =
      Builder.CreateAnd(Neg, ConstantInt::get(SelType, BitWidth - 1));
  return BinaryOperator::Create(Instruction::Shl, ConstantInt::get(SelType, 1),
                                Masked);
}
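
`foldBitCeil` derives the mask from `getScalarSizeInBits()`, so the same `1 << (-CTLZ & (BitWidth-1))` form applies at other widths. The sketch below generalizes the idea over unsigned C++ types and compares it against a simple doubling loop at 16 bits. The templates `ctlz`, `bitCeilMasked`, and `bitCeilLoop` and the function `agreesOn16Bit` are hypothetical names for illustration, not LLVM or standard-library APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Count leading zeros for an unsigned type T; returns the bit width for 0.
template <typename T> static unsigned ctlz(T v) {
  const unsigned width = std::numeric_limits<T>::digits;
  unsigned n = 0;
  for (unsigned i = width; i-- > 0 && ((v >> i) & 1) == 0;)
    ++n;
  return n;
}

// 1 << (-ctlz(x - 1) & (BitWidth - 1)), the form built by the fold.
template <typename T> static T bitCeilMasked(T x) {
  const unsigned width = std::numeric_limits<T>::digits;
  unsigned lz = ctlz<T>(static_cast<T>(x - 1));
  return static_cast<T>(T{1} << ((0u - lz) & (width - 1)));
}

// Reference: double until reaching x. Only valid for x <= 2^(BitWidth-1).
template <typename T> static T bitCeilLoop(T x) {
  T p = 1;
  while (p < x)
    p = static_cast<T>(p << 1);
  return p;
}

// Returns true if both agree on every 16-bit input up to 0x8000.
static bool agreesOn16Bit() {
  for (uint32_t i = 0; i <= 0x8000u; ++i) {
    uint16_t x = static_cast<uint16_t>(i);
    if (bitCeilMasked<uint16_t>(x) != bitCeilLoop<uint16_t>(x))
      return false;
  }
  return true;
}
```

The reference loop is restricted to inputs whose result is representable, which matches the precondition of `std::bit_ceil`.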

  Instruction *InstCombinerImpl::visitSelectInst(SelectInst &SI) {
    Value *CondVal = SI.getCondition();
    Value *TrueVal = SI.getTrueValue();
    Value *FalseVal = SI.getFalseValue();
    Type *SelType = SI.getType();
    if (Value *V = simplifySelectInst(CondVal, TrueVal, FalseVal,
                                      SQ.getWithInstruction(&SI)))

... (411 unchanged lines not shown) ...

    // Match logical variants of the pattern,
    // and transform them iff that gets rid of inversions.
    // (~x) | y --> ~(x & (~y))
    // (~x) & y --> ~(x | (~y))
    if (sinkNotIntoOtherHandOfLogicalOp(SI))
      return &SI;
+   if (Instruction *I = foldBitCeil(SI, Builder))
+     return I;
    return nullptr;
  }

llvm/test/Transforms/InstCombine/bit_ceil.ll

  ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
  ; RUN: opt < %s -passes=instcombine -S | FileCheck %s

  ; std::bit_ceil<uint32_t>(x)
  define i32 @bit_ceil_32(i32 %x) {
  ; CHECK-LABEL: @bit_ceil_32(
  ; CHECK-NEXT: [[DEC:%.*]] = add i32 [[X:%.*]], -1
  ; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[DEC]], i1 false), !range [[RNG0:![0-9]+]]
- ; CHECK-NEXT: [[SUB:%.*]] = sub nuw nsw i32 32, [[CTLZ]]
- ; CHECK-NEXT: [[SHL:%.*]] = shl nuw i32 1, [[SUB]]
- ; CHECK-NEXT: [[UGT:%.*]] = icmp ugt i32 [[X]], 1
- ; CHECK-NEXT: [[SEL:%.*]] = select i1 [[UGT]], i32 [[SHL]], i32 1
+ ; CHECK-NEXT: [[TMP1:%.*]] = sub nsw i32 0, [[CTLZ]]
+ ; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 31
+ ; CHECK-NEXT: [[SEL:%.*]] = shl nuw i32 1, [[TMP2]]
  ; CHECK-NEXT: ret i32 [[SEL]]
  ;
  %dec = add i32 %x, -1
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
  %sub = sub i32 32, %ctlz
  %shl = shl i32 1, %sub
  %ugt = icmp ugt i32 %x, 1
  %sel = select i1 %ugt, i32 %shl, i32 1
  ret i32 %sel
  }

  ; std::bit_ceil<uint64_t>(x)
  define i64 @bit_ceil_64(i64 %x) {
  ; CHECK-LABEL: @bit_ceil_64(
  ; CHECK-NEXT: [[DEC:%.*]] = add i64 [[X:%.*]], -1
  ; CHECK-NEXT: [[CTLZ:%.*]] = tail call i64 @llvm.ctlz.i64(i64 [[DEC]], i1 false), !range [[RNG1:![0-9]+]]
- ; CHECK-NEXT: [[SUB:%.*]] = sub nuw nsw i64 64, [[CTLZ]]
- ; CHECK-NEXT: [[SHL:%.*]] = shl nuw i64 1, [[SUB]]
- ; CHECK-NEXT: [[UGT:%.*]] = icmp ugt i64 [[X]], 1
- ; CHECK-NEXT: [[SEL:%.*]] = select i1 [[UGT]], i64 [[SHL]], i64 1
+ ; CHECK-NEXT: [[TMP1:%.*]] = sub nsw i64 0, [[CTLZ]]
+ ; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], 63
+ ; CHECK-NEXT: [[SEL:%.*]] = shl nuw i64 1, [[TMP2]]
  ; CHECK-NEXT: ret i64 [[SEL]]
  ;
  %dec = add i64 %x, -1
  %ctlz = tail call i64 @llvm.ctlz.i64(i64 %dec, i1 false)
  %sub = sub i64 64, %ctlz
  %shl = shl i64 1, %sub
  %ugt = icmp ugt i64 %x, 1
  %sel = select i1 %ugt, i64 %shl, i64 1
  ret i64 %sel
  }

  ; std::bit_ceil<uint32_t>(x - 1)
  define i32 @bit_ceil_32_minus_1(i32 %x) {
  ; CHECK-LABEL: @bit_ceil_32_minus_1(
  ; CHECK-NEXT: entry:
  ; CHECK-NEXT: [[SUB:%.*]] = add i32 [[X:%.*]], -2
  ; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[SUB]], i1 false), !range [[RNG0]]
- ; CHECK-NEXT: [[SUB2:%.*]] = sub nuw nsw i32 32, [[CTLZ]]
- ; CHECK-NEXT: [[SHL:%.*]] = shl nuw i32 1, [[SUB2]]
- ; CHECK-NEXT: [[ADD:%.*]] = add i32 [[X]], -3
- ; CHECK-NEXT: [[ULT:%.*]] = icmp ult i32 [[ADD]], -2
- ; CHECK-NEXT: [[SEL:%.*]] = select i1 [[ULT]], i32 [[SHL]], i32 1
+ ; CHECK-NEXT: [[TMP0:%.*]] = sub nsw i32 0, [[CTLZ]]
+ ; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[TMP0]], 31
+ ; CHECK-NEXT: [[SEL:%.*]] = shl nuw i32 1, [[TMP1]]
  ; CHECK-NEXT: ret i32 [[SEL]]
  ;
  entry:
  %sub = add i32 %x, -2
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %sub, i1 false)
  %sub2 = sub nuw nsw i32 32, %ctlz
  %shl = shl nuw i32 1, %sub2
  %add = add i32 %x, -3
  %ult = icmp ult i32 %add, -2
  %sel = select i1 %ult, i32 %shl, i32 1
  ret i32 %sel
  }

	; std::bit_ceil<uint32_t>(x + 1)			; std::bit_ceil<uint32_t>(x + 1)
	define i32 @bit_ceil_32_plus_1(i32 %x) {			define i32 @bit_ceil_32_plus_1(i32 %x) {
	; CHECK-LABEL: @bit_ceil_32_plus_1(			; CHECK-LABEL: @bit_ceil_32_plus_1(
	; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[X:%.*]], i1 false), !range [[RNG0]]			; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[X:%.*]], i1 false), !range [[RNG0]]
	; CHECK-NEXT: [[SUB:%.*]] = sub nuw nsw i32 32, [[CTLZ]]			; CHECK-NEXT: [[TMP1:%.*]] = sub nsw i32 0, [[CTLZ]]
	; CHECK-NEXT: [[SHL:%.*]] = shl nuw i32 1, [[SUB]]			; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 31
	; CHECK-NEXT: [[DEC:%.*]] = add i32 [[X]], -1			; CHECK-NEXT: [[SEL:%.*]] = shl nuw i32 1, [[TMP2]]
	; CHECK-NEXT: [[ULT:%.*]] = icmp ult i32 [[DEC]], -2
	; CHECK-NEXT: [[SEL:%.*]] = select i1 [[ULT]], i32 [[SHL]], i32 1
	; CHECK-NEXT: ret i32 [[SEL]]			; CHECK-NEXT: ret i32 [[SEL]]
	;			;
	%ctlz = tail call i32 @llvm.ctlz.i32(i32 %x, i1 false)			%ctlz = tail call i32 @llvm.ctlz.i32(i32 %x, i1 false)
	%sub = sub i32 32, %ctlz			%sub = sub i32 32, %ctlz
	%shl = shl i32 1, %sub			%shl = shl i32 1, %sub
	%dec = add i32 %x, -1			%dec = add i32 %x, -1
	%ult = icmp ult i32 %dec, -2			%ult = icmp ult i32 %dec, -2
	%sel = select i1 %ult, i32 %shl, i32 1			%sel = select i1 %ult, i32 %shl, i32 1
	ret i32 %sel			ret i32 %sel
	}			}

	define i32 @bit_ceil_plus_2(i32 %x) {			define i32 @bit_ceil_plus_2(i32 %x) {
	; CHECK-LABEL: @bit_ceil_plus_2(			; CHECK-LABEL: @bit_ceil_plus_2(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[SUB:%.*]] = add i32 [[X:%.*]], 1			; CHECK-NEXT: [[SUB:%.*]] = add i32 [[X:%.*]], 1
	; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[SUB]], i1 false), !range [[RNG0]]			; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[SUB]], i1 false), !range [[RNG0]]
	; CHECK-NEXT: [[SUB2:%.*]] = sub nuw nsw i32 32, [[CTLZ]]			; CHECK-NEXT: [[TMP0:%.*]] = sub nsw i32 0, [[CTLZ]]
	; CHECK-NEXT: [[SHL:%.*]] = shl nuw i32 1, [[SUB2]]			; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[TMP0]], 31
	; CHECK-NEXT: [[ULT:%.*]] = icmp ult i32 [[X]], -2			; CHECK-NEXT: [[SEL:%.*]] = shl nuw i32 1, [[TMP1]]
	; CHECK-NEXT: [[SEL:%.*]] = select i1 [[ULT]], i32 [[SHL]], i32 1
	; CHECK-NEXT: ret i32 [[SEL]]			; CHECK-NEXT: ret i32 [[SEL]]
	;			;
	entry:			entry:
	%sub = add i32 %x, 1			%sub = add i32 %x, 1
	%ctlz = tail call i32 @llvm.ctlz.i32(i32 %sub, i1 false)			%ctlz = tail call i32 @llvm.ctlz.i32(i32 %sub, i1 false)
	%sub2 = sub nuw nsw i32 32, %ctlz			%sub2 = sub nuw nsw i32 32, %ctlz
	%shl = shl nuw i32 1, %sub2			%shl = shl nuw i32 1, %sub2
	%ult = icmp ult i32 %x, -2			%ult = icmp ult i32 %x, -2
	%sel = select i1 %ult, i32 %shl, i32 1			%sel = select i1 %ult, i32 %shl, i32 1
	ret i32 %sel			ret i32 %sel
	}			}

	; std::bit_ceil<uint32_t>(-x)			; std::bit_ceil<uint32_t>(-x)
	define i32 @bit_ceil_32_neg(i32 %x) {			define i32 @bit_ceil_32_neg(i32 %x) {
	; CHECK-LABEL: @bit_ceil_32_neg(			; CHECK-LABEL: @bit_ceil_32_neg(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[SUB:%.*]] = xor i32 [[X:%.*]], -1			; CHECK-NEXT: [[SUB:%.*]] = xor i32 [[X:%.*]], -1
	; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[SUB]], i1 false), !range [[RNG0]]			; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[SUB]], i1 false), !range [[RNG0]]
	; CHECK-NEXT: [[SUB2:%.*]] = sub nuw nsw i32 32, [[CTLZ]]			; CHECK-NEXT: [[TMP0:%.*]] = sub nsw i32 0, [[CTLZ]]
	; CHECK-NEXT: [[SHL:%.*]] = shl nuw i32 1, [[SUB2]]			; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[TMP0]], 31
	; CHECK-NEXT: [[NOTSUB:%.*]] = add i32 [[X]], -1			; CHECK-NEXT: [[SEL:%.*]] = shl nuw i32 1, [[TMP1]]
	; CHECK-NEXT: [[ULT:%.*]] = icmp ult i32 [[NOTSUB]], -2
	; CHECK-NEXT: [[SEL:%.*]] = select i1 [[ULT]], i32 [[SHL]], i32 1
	; CHECK-NEXT: ret i32 [[SEL]]			; CHECK-NEXT: ret i32 [[SEL]]
	;			;
	entry:			entry:
	%sub = xor i32 %x, -1			%sub = xor i32 %x, -1
	%ctlz = tail call i32 @llvm.ctlz.i32(i32 %sub, i1 false)			%ctlz = tail call i32 @llvm.ctlz.i32(i32 %sub, i1 false)
	%sub2 = sub nuw nsw i32 32, %ctlz			%sub2 = sub nuw nsw i32 32, %ctlz
	%shl = shl nuw i32 1, %sub2			%shl = shl nuw i32 1, %sub2
	%notsub = add i32 %x, -1			%notsub = add i32 %x, -1
	%ult = icmp ult i32 %notsub, -2			%ult = icmp ult i32 %notsub, -2
	%sel = select i1 %ult, i32 %shl, i32 1			%sel = select i1 %ult, i32 %shl, i32 1
	ret i32 %sel			ret i32 %sel
	}			}

	; std::bit_ceil<uint32_t>(~x)			; std::bit_ceil<uint32_t>(~x)
	define i32 @bit_ceil_not(i32 %x) {			define i32 @bit_ceil_not(i32 %x) {
	; CHECK-LABEL: @bit_ceil_not(			; CHECK-LABEL: @bit_ceil_not(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[SUB:%.*]] = sub i32 -2, [[X:%.*]]			; CHECK-NEXT: [[SUB:%.*]] = sub i32 -2, [[X:%.*]]
	; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[SUB]], i1 false), !range [[RNG0]]			; CHECK-NEXT: [[CTLZ:%.*]] = tail call i32 @llvm.ctlz.i32(i32 [[SUB]], i1 false), !range [[RNG0]]
	; CHECK-NEXT: [[SUB2:%.*]] = sub nuw nsw i32 32, [[CTLZ]]			; CHECK-NEXT: [[TMP0:%.*]] = sub nsw i32 0, [[CTLZ]]
	; CHECK-NEXT: [[SHL:%.*]] = shl nuw i32 1, [[SUB2]]			; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[TMP0]], 31
	; CHECK-NEXT: [[ULT:%.*]] = icmp ult i32 [[X]], -2			; CHECK-NEXT: [[SEL:%.*]] = shl nuw i32 1, [[TMP1]]
	; CHECK-NEXT: [[SEL:%.*]] = select i1 [[ULT]], i32 [[SHL]], i32 1
	; CHECK-NEXT: ret i32 [[SEL]]			; CHECK-NEXT: ret i32 [[SEL]]
	;			;
	entry:			entry:
	%sub = sub i32 -2, %x			%sub = sub i32 -2, %x
	%ctlz = tail call i32 @llvm.ctlz.i32(i32 %sub, i1 false)			%ctlz = tail call i32 @llvm.ctlz.i32(i32 %sub, i1 false)
	%sub2 = sub nuw nsw i32 32, %ctlz			%sub2 = sub nuw nsw i32 32, %ctlz
	%shl = shl nuw i32 1, %sub2			%shl = shl nuw i32 1, %sub2
	%ult = icmp ult i32 %x, -2			%ult = icmp ult i32 %x, -2
	%sel = select i1 %ult, i32 %shl, i32 1			%sel = select i1 %ult, i32 %shl, i32 1
	ret i32 %sel			ret i32 %sel
	}			}
	nikic: Some missing tests:
	* Commuted select operands.
	* Select constant not 1.
	* Select condition does not imply the needed range.
	* Multi-use test.
	* Vector test.
	* Wrong constant in sub.
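
To illustrate why a negative test such as "wrong constant in sub" matters, here is a small C++ sketch. It is not taken from the patch: `ctlz32`, `select_form_k`, `masked_form`, and `agrees_up_to` are hypothetical helpers that model the base `std::bit_ceil(x)` pattern from the summary, with the constant in the `sub` exposed as a parameter `k`. The sketch assumes inputs no larger than 0x80000000 so that no shift amount reaches 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: leading-zero count that returns 32 for 0,
// modelling llvm.ctlz.i32 with its second argument set to false.
static unsigned ctlz32(uint32_t v) {
    unsigned n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// Select form of bit_ceil(x), with the sub constant passed in as k.
// Assumed precondition: x <= 0x80000000 and k is 31 or 32, which keeps
// every shift amount in the range [0, 31].
static uint32_t select_form_k(uint32_t x, unsigned k) {
    assert(x <= 0x80000000u && (k == 31u || k == 32u));
    if (x <= 1u)
        return 1u;                                       // icmp ugt %x, 1 is false
    unsigned c = ctlz32(static_cast<uint32_t>(x - 1u));  // in [1, 31] here
    return static_cast<uint32_t>(1u) << (k - c);
}

// Masked form produced by the fold: shl 1, (and (sub 0, ctlz), 31).
static uint32_t masked_form(uint32_t x) {
    unsigned c = ctlz32(static_cast<uint32_t>(x - 1u));
    return static_cast<uint32_t>(1u) << ((0u - c) & 31u);
}

// True if both forms agree for every x in [0, limit].
static bool agrees_up_to(uint32_t limit, unsigned k) {
    for (uint32_t x = 0; x <= limit; ++x)
        if (select_form_k(x, k) != masked_form(x))
            return false;
    return true;
}
```

With `k = 32` the two forms agree on every sampled input. With `k = 31` they already disagree at `x = 2` (1 versus 2), so a test with the wrong constant should leave the select in place.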

	declare i32 @llvm.ctlz.i32(i32, i1 immarg)			declare i32 @llvm.ctlz.i32(i32, i1 immarg)
	nikic: The order gets canonicalized here. You can use something like `icmp slt %x, 0` to avoid.
	kazu: I settled on `icmp eq %dec, 0`. The order of the select operands seems to stick that way.
	declare i64 @llvm.ctlz.i64(i64, i1 immarg)			declare i64 @llvm.ctlz.i64(i64, i1 immarg)

Revision Contents

Diff 507117

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

llvm/test/Transforms/InstCombine/bit_ceil.ll