This is an archive of the discontinued LLVM Phabricator instance.

Differential D140939

[X86] Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a single bit.
ClosedPublic

Authored by goldstein.w.n on Jan 3 2023, 6:41 PM.

Details

Reviewers

pengfei
RKSimon

Commits

rG0b74e34938ba: Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a…

Summary

This essentially expands on the optimizations added in D120199,
but applies the optimization to cases where the bit being changed /
tested is not an immediate (IMM) but is provably a power of 2.

The only case currently added is for patterns like:
__atomic_fetch_xor(p, 1 << c, __ATOMIC_RELAXED) & (1 << c)

which, instead of using a cmpxchg loop, can be done with btcl; setcc; shl.

There are still a variety of missed cases that could/should be
addressed in the future. This commit documents many of those
cases with TODOs.
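
As a rough illustration of the pattern in the summary, here is a minimal sketch assuming a GCC/Clang-style compiler that provides the `__atomic_fetch_xor` builtin. The function names `toggle_and_test` and `toggle_and_test_btc_shape` are made up for this example. The second function only spells out, in plain C++, the value that a btc; setcc; shl sequence would be expected to produce; it is not the code the patch generates.

```cpp
#include <cassert>
#include <cstdint>

// Reference form from the summary: atomically toggle bit `c` and return the
// previous state of that bit, still sitting in position `c`.
inline uint32_t toggle_and_test(uint32_t *p, unsigned c) {
  assert(c < 32 && "shift amount must be smaller than the bit width");
  uint32_t mask = uint32_t{1} << c;
  return __atomic_fetch_xor(p, mask, __ATOMIC_RELAXED) & mask;
}

// Same result, computed in the shape of btc; setcc; shl:
// take the old bit as 0 or 1, then shift it back into position `c`.
inline uint32_t toggle_and_test_btc_shape(uint32_t *p, unsigned c) {
  assert(c < 32 && "shift amount must be smaller than the bit width");
  uint32_t mask = uint32_t{1} << c;
  uint32_t old = __atomic_fetch_xor(p, mask, __ATOMIC_RELAXED);
  uint32_t was_set = (old >> c) & 1u; // models the flag captured by setcc
  return was_set << c;                // models the final shl
}
```

Whether a given compiler actually emits a bit-test instruction for the first form depends on its version and flags; the two functions are only meant to show why the rewritten sequence yields the same value.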

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

goldstein.w.n created this revision.Jan 3 2023, 6:41 PM

Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2023, 6:41 PM

Herald added subscribers: pengfei, hiraditya. · View Herald Transcript

goldstein.w.n requested review of this revision.Jan 3 2023, 6:41 PM

Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2023, 6:41 PM

Herald added a subscriber: llvm-commits. · View Herald Transcript

goldstein.w.n added a parent revision: D140938: [X86] Add tests for atomic bittest with register/memory operands.Jan 3 2023, 6:41 PM

Harbormaster completed remote builds in B205578: Diff 486139.Jan 3 2023, 6:42 PM

This is patch [2/2].

Note: This is a follow up to https://reviews.llvm.org/D140645 and is splitting up the reviews for the tests and code.

goldstein.w.n added reviewers: pengfei, RKSimon.Jan 3 2023, 6:44 PM

goldstein.w.n mentioned this in D140645: Add tests for atomic bittest with register/memory operands.

Propagate BMI test removal

craig.topper retitled this revision from Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a single bit. to [X86] Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a single bit..Jan 4 2023, 4:49 PM

Harbormaster completed remote builds in B205804: Diff 486432.Jan 4 2023, 6:13 PM

pengfei added inline comments.Jan 5 2023, 5:58 AM

llvm/lib/Target/X86/X86ISelLowering.cpp
5653	Should the `Size` be same with operand 1 rather than result?
31434	or
31435	No parentheses needed for single line scope.
31453	Should it be a canonical form that we can assume it is always `X ^ -1`?
31454	We can use `match(....)` for better readability?
31527	Is it enough to only check `BitChange.second == UndefBit`?
31559	Missing the last period `.`
31564	ditto.
llvm/test/CodeGen/X86/atomic-rm-bit-test.ll
4622–4628	The branch code doesn't look necessary. Can we necessary it?

Use match instead of hand written logic.
Fix some typos.
Fix some code style nits.

llvm/lib/Target/X86/X86ISelLowering.cpp
5653	Should the `Size` be same with operand 1 rather than result?
If I change it I run into assertion errors. With `I.getArgOperand(0)->getType()->getScalarSizeInBits()` the `Size` can be zero (bitwidth too small), and with `I.getArgOperand(1)->getType()->getScalarSizeInBits()` I run into:
llc: /home/noah/programs/opensource/llvm-dev/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:11108: llvm::MemSDNode::MemSDNode(unsigned int, unsigned int, const llvm::DebugLoc&, llvm::SDVTList, llvm::EVT, llvm::MachineMemOperand*): Assertion `memvt.getStoreSize().getKnownMinSize() <= MMO->getSize() && "Size mismatch!"' failed.
Maybe that means there is a bug in the patch elsewhere? Running the test, here are the sizes I see (each line repeated several times in a row):
Sizes: 8 / 0 / 16
Sizes: 16 / 0 / 8
Sizes: 8 / 0 / 32
Sizes: 32 / 0 / 8
Sizes: 8 / 0 / 64
Sizes: 64 / 0 / 8
31453	Should it be a canonical form that we can assume it is always `X ^ -1`?
Probably de facto fixed by using `match` (didn't know `match` existed when I wrote this patch :/ a lot to learn).
llvm/test/CodeGen/X86/atomic-rm-bit-test.ll
4622–4628	The branch code doesn't look necessary. Can we necessary it?
I think it is because we don't `cmovcc` loads. For
if.then:                                          ; preds = %entry
  %idxprom = zext i32 %c to i64
  %arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom
  %1 = load i32, ptr %arrayidx, align 4
  br label %return
and
return:                                           ; preds = %entry, %if.then
  %retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
  ret i32 %retval.0
a branch seems correct.

lebedev.ri edited the summary of this revision. (Show Details)Jan 5 2023, 9:38 AM

Harbormaster completed remote builds in B205937: Diff 486607.Jan 5 2023, 11:59 AM

pengfei added inline comments.Jan 5 2023, 6:36 PM

llvm/test/CodeGen/X86/atomic-rm-bit-test.ll
4622–4628	Sorry for the wrong words. I mean can we eliminate the branch by modifying the IR code, e.g.,
entry:
  %shl = shl nuw i32 1, %c
  %0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
  %and = and i32 %0, %shl
  %tobool.not = icmp eq i32 %and, 0
  %ret = zext i1 %tobool.not to i32
  ret i32 %ret
This will help to reduce the noise in reviewing the code and let us pay more attention to the change we expect.

pengfei added inline comments.Jan 5 2023, 6:39 PM

llvm/lib/Target/X86/X86ISelLowering.cpp
5653	I mean split `x86_atomic_bt*_rm` from it and then change it to `I.getArgOperand(1)->getType()->getScalarSizeInBits()` Other intrinsics are correct to use `I.getType()->getScalarSizeInBits()`.

goldstein.w.n added inline comments.Jan 5 2023, 10:12 PM

llvm/lib/Target/X86/X86ISelLowering.cpp
5653	What does "split `x86_atomic_bt_rm` from it" mean?
llvm/test/CodeGen/X86/atomic-rm-bit-test.ll
4622–4628	I see, so there are 6 different test types:
`_br` -> branch on value
`_brz` -> branch on !value
`_brnz` -> branch on !!value
`_val` -> return value
`_valz` -> return !value
`_valnz` -> return !!value
IMO they are all worth testing. For example, we have logic that searches the uses to see if the result is only used as a truth value, in which case we can optimize out the shift. IIRC, when writing the code there were some edge cases where `br` behaved differently than `setc`, so I think it's worth keeping.
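
To make the six test shapes concrete, here is a hedged C++ sketch of what source code for each suffix might look like, using an atomic OR as the example operation. This is an assumption about the intent of the suffixes, not the actual origin of the `.ll` tests: the helper `fetch_or_bit`, the `or_*` function names, and the separate `table` argument are all hypothetical, and `__atomic_fetch_or` is the GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically set bit `c`, return the old bit in place.
inline uint32_t fetch_or_bit(uint32_t *p, unsigned c) {
  assert(c < 32 && "shift amount must be smaller than the bit width");
  uint32_t mask = uint32_t{1} << c;
  return __atomic_fetch_or(p, mask, __ATOMIC_RELAXED) & mask;
}

// _val: the masked value itself is returned, so its bit position matters.
inline uint32_t or_val(uint32_t *p, unsigned c) { return fetch_or_bit(p, c); }
// _valz: only the negated truth value is returned.
inline uint32_t or_valz(uint32_t *p, unsigned c) { return !fetch_or_bit(p, c); }
// _valnz: only the truth value (0 or 1) is returned.
inline uint32_t or_valnz(uint32_t *p, unsigned c) { return !!fetch_or_bit(p, c); }

// _br / _brz / _brnz: the result only steers a branch; the taken side does a
// load, the other side returns the constant 123.
inline uint32_t or_br(uint32_t *p, unsigned c, const uint32_t *table) {
  if (fetch_or_bit(p, c)) return table[c];
  return 123;
}
inline uint32_t or_brz(uint32_t *p, unsigned c, const uint32_t *table) {
  if (!fetch_or_bit(p, c)) return table[c];
  return 123;
}
inline uint32_t or_brnz(uint32_t *p, unsigned c, const uint32_t *table) {
  if (!!fetch_or_bit(p, c)) return table[c];
  return 123;
}
```

Only `or_val` needs the old bit in its original position; the other five consume a truth value, which is the situation where the shift can in principle be dropped.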

pengfei added inline comments.Jan 5 2023, 10:15 PM

llvm/lib/Target/X86/X86ISelLowering.cpp

5653

case Intrinsic::x86_cmpccxadd32:
case Intrinsic::x86_cmpccxadd64:
case Intrinsic::x86_atomic_bts:
case Intrinsic::x86_atomic_btc:
case Intrinsic::x86_atomic_btr: {
  ... ...
}
case Intrinsic::x86_atomic_bts_rm:
case Intrinsic::x86_atomic_btc_rm:
case Intrinsic::x86_atomic_btr_rm: {
  ... ...
}

Propagate test changes.
Use proper size

goldstein.w.n marked an inline comment as done.Jan 5 2023, 11:18 PM

goldstein.w.n added inline comments.

llvm/lib/Target/X86/X86ISelLowering.cpp

5653

Done, for some reason I had thought they were in different cases.

Harbormaster completed remote builds in B206039: Diff 486755.Jan 6 2023, 12:28 AM

ping.

LGTM.

This revision is now accepted and ready to land.Jan 13 2023, 4:57 AM

This revision was landed with ongoing or failed builds.Jan 16 2023, 10:09 PM

Closed by commit rG0b74e34938ba: Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a… (authored by goldstein.w.n). · Explain Why

This revision was automatically updated to reflect the committed changes.

goldstein.w.n added a commit: rG0b74e34938ba: Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a….

Hey, it looks like this change triggered an assertion failure in the Fuchsia continuous build: https://luci-milo.appspot.com/ui/p/fuchsia/builders/ci/clang_toolchain.ci.core.x64-release/b8791734713064712625/overview

clang: llvm/lib/Target/X86/X86ISelLowering.cpp:31525: TargetLowering::AtomicExpansionKind llvm::X86TargetLowering::shouldExpandLogicAtomicRMWInIR(AtomicRMWInst *) const: Assertion `I->getOperand(0)...
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: ../../../recipe_cleanup/clangs6i4kjx7/bin/clang -MD -MF obj/src/connectivity/wlan/drivers/third_party/intel/iwlwifi/mvm/mvm.sta.c.o.d -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS ...
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '../../src/connectivity/wlan/drivers/third_party/intel/iw...

In reply to D140939#4067362 (@mysterymath):

The easiest thing to do now would be:

-  assert(I->getOperand(0) == AI);
+  if (I->getOperand(0) != AI)
+    return AtomicExpansionKind::CmpXChg;

or revert.

Although I think if that assertion is being hit the old code was buggy:

// The following instruction must be a AND single bit.
auto *C2 = dyn_cast<ConstantInt>(I->getOperand(1));
unsigned Bits = AI->getType()->getPrimitiveSizeInBits();
if (!C2 || Bits == 8 || !isPowerOf2_64(C2->getZExtValue()))
  return AtomicExpansionKind::CmpXChg;

if (AI->getOperation() == AtomicRMWInst::And)
  return ~C1->getValue() == C2->getValue()
             ? AtomicExpansionKind::BitTestIntrinsic
             : AtomicExpansionKind::CmpXChg;

return C1 == C2 ? AtomicExpansionKind::BitTestIntrinsic
                : AtomicExpansionKind::CmpXChg;

Was assuming the above assert AFAICT.
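
For readers who want to poke at the constant-operand check quoted above without building LLVM, here is a toy model of it in standalone C++. Plain integers stand in for the `ConstantInt` operands, and the names `RMWOp`, `Expansion`, `is_power_of_2`, and `classify` are invented for this sketch; it mirrors only the lines quoted in this comment, not the rest of `shouldExpandLogicAtomicRMWInIR`.

```cpp
#include <cassert>
#include <cstdint>

enum class RMWOp { Or, Xor, And };
enum class Expansion { CmpXChg, BitTestIntrinsic };

// True when exactly one bit of `v` is set.
inline bool is_power_of_2(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// `c1` is the constant operand of the atomicrmw, `c2` the constant operand of
// the following `and`, and `bits` the width of the operation.
inline Expansion classify(RMWOp op, uint64_t c1, uint64_t c2, unsigned bits) {
  assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
  uint64_t width_mask = bits == 64 ? ~uint64_t{0} : (uint64_t{1} << bits) - 1;
  c1 &= width_mask;
  c2 &= width_mask;
  // The following instruction must AND with a single bit; 8-bit is rejected.
  if (bits == 8 || !is_power_of_2(c2))
    return Expansion::CmpXChg;
  // For an atomic AND, the RMW mask must clear exactly the tested bit.
  if (op == RMWOp::And)
    return ((~c1) & width_mask) == c2 ? Expansion::BitTestIntrinsic
                                      : Expansion::CmpXChg;
  // For OR / XOR, the RMW constant must be the tested bit itself.
  return c1 == c2 ? Expansion::BitTestIntrinsic : Expansion::CmpXChg;
}
```

Note that the model takes for granted that the `and` really consumes the atomicrmw result, which is the assumption the assertion made explicit.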

In reply to D140939#4067362 (@mysterymath):

Created: https://reviews.llvm.org/D142166. It's a proper fix and tested. Otherwise a revert is probably best for now.

In reply to D140939#4067387 (@goldstein.w.n):

D142166 has been accepted with some nits. Going to wait ~24 hours for any remaining comments, then push.

Unfortunately, there's another problem, which doesn't get fixed by https://reviews.llvm.org/D142166 / 2e25204779e5b972d668bf66a0014c1325813b35:

assert.h assertion failed at llvm/lib/Target/X86/X86ISelLowering.cpp:31466 in std::pair<Value *, BitTestKind> FindSingleBitChange(Value *): I != nullptr
    @     0x56257abc9524  __assert_fail
    @     0x562578c7fd1c  FindSingleBitChange()
    @     0x562578c7f6dd  llvm::X86TargetLowering::shouldExpandLogicAtomicRMWInIR()
    @     0x562578c8098b  llvm::X86TargetLowering::shouldExpandAtomicRMWInIR()
    @     0x5625790a6099  (anonymous namespace)::AtomicExpand::tryExpandAtomicRMW()
    @     0x5625790a59dd  (anonymous namespace)::AtomicExpand::runOnFunction()
    @     0x56257a86e87d  llvm::FPPassManager::runOnFunction()
    @     0x56257a876104  llvm::FPPassManager::runOnModule()
    @     0x56257a86ef7c  llvm::legacy::PassManagerImpl::run()
    @     0x562575bad581  clang::EmitBackendOutput()
    @     0x562575baabe9  clang::BackendConsumer::HandleTranslationUnit()
    @     0x562576a85e5c  clang::ParseAST()
    @     0x5625767cbf23  clang::FrontendAction::Execute()
    @     0x562576741cad  clang::CompilerInstance::ExecuteAction()
    @     0x5625757871e8  clang::ExecuteCompilerInvocation()
    @     0x56257577ae41  cc1_main()
    @     0x562575776ec8  ExecuteCC1Tool()
    @     0x5625768ed317  llvm::function_ref<>::callback_fn<>()
    @     0x56257aa23d20  llvm::CrashRecoveryContext::RunSafely()
    @     0x5625768ecb63  clang::driver::CC1Command::Execute()
    @     0x5625768ab2e6  clang::driver::Compilation::ExecuteCommand()
    @     0x5625768ab60f  clang::driver::Compilation::ExecuteJobs()
    @     0x5625768caf70  clang::driver::Driver::ExecuteCompilation()
    @     0x562575776027  clang_main()
    @     0x7f8f04be3633  __libc_start_main
    @     0x562575772bea  _start

In reply to D140939#4072473 (@alexfh):

I'd suggest to revert while investigating. I'm working on an isolated test case.

In reply to D140939#4072474 (@alexfh):

$ cat reduced.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define weak_odr void @f() {
entry:
  br label %if.end19

cont11:                                           ; No predecessors!
  %not = xor i32 0, -1
  %0 = atomicrmw and ptr null, i32 %not monotonic, align 4
  %and13 = and i32 %0, 0
  br label %if.end19

if.end19:                                         ; preds = %cont11, %entry
  ret void
}
$ ./good-clang -c reduced.ll -o good.o
$ ./bad-clang -c reduced.ll -o bad.o
assert.h assertion failed at llvm/lib/Target/X86/X86ISelLowering.cpp:31466 in std::pair<Value *, BitTestKind> FindSingleBitChange(Value *): I != nullptr
    @     0x559d015c9524  __assert_fail
    @     0x559cff67fd1c  FindSingleBitChange()
    @     0x559cff67f6dd  llvm::X86TargetLowering::shouldExpandLogicAtomicRMWInIR()
    @     0x559cff68098b  llvm::X86TargetLowering::shouldExpandAtomicRMWInIR()
    @     0x559cffaa6099  (anonymous namespace)::AtomicExpand::tryExpandAtomicRMW()
    @     0x559cffaa59dd  (anonymous namespace)::AtomicExpand::runOnFunction()
    @     0x559d0126e87d  llvm::FPPassManager::runOnFunction()
    @     0x559d01276104  llvm::FPPassManager::runOnModule()
    @     0x559d0126ef7c  llvm::legacy::PassManagerImpl::run()
    @     0x559cfc5ad581  clang::EmitBackendOutput()
    @     0x559cfc5a94f8  clang::CodeGenAction::ExecuteAction()
    @     0x559cfd1cbf23  clang::FrontendAction::Execute()
    @     0x559cfd141cad  clang::CompilerInstance::ExecuteAction()
    @     0x559cfc1871e8  clang::ExecuteCompilerInvocation()
    @     0x559cfc17ae41  cc1_main()

In reply to D140939#4072488 (@alexfh):

Thank you for finding the case. Another one (and probably a more realistic one) is:

define weak_odr void @f(i32 %0) {
entry:
  br label %if.end19

cont11:                                           ; No predecessors!
  %not = xor i32 %0, -1
  %1 = atomicrmw and ptr null, i32 %not monotonic, align 4
  %and13 = and i32 %1, 0
  br label %if.end19

if.end19:                                         ; preds = %cont11, %entry
  ret void
}

So the fix is:

-      assert(I != nullptr);
+
+      // If constant it will fold and we can evaluate later. If its an argument
+      // or something of that nature, we can't analyze.
+      if (I == nullptr)
+        return {nullptr, UndefBit};

I'm happy to post that as a patch.

If you still want to revert, however, happy to do so as well and then repost the full version.

Whichever you think is more prudent.
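
The shape of that fix can be sketched outside LLVM with toy stand-ins. In the snippet below, `Value`, `Argument`, and `Instruction` are simplified placeholder classes (not the LLVM ones), `dynamic_cast` plays the role of `dyn_cast`, and `ShiftBit` plus the `is_shl_of_one` flag are hypothetical simplifications; only `UndefBit` and the `{nullptr, UndefBit}` return come from the patch text above.

```cpp
#include <cassert>
#include <utility>

// Toy stand-ins for IR values; these are not the LLVM classes.
struct Value {
  virtual ~Value() = default;
};
struct Argument : Value {};
struct Instruction : Value {
  bool is_shl_of_one = false; // hypothetical: "this is 1 << something"
};

enum BitTestKind { UndefBit, ShiftBit };

// Instead of asserting that the operand is an instruction, report
// "cannot analyze" when it is an argument or anything else.
inline std::pair<Value *, BitTestKind> find_single_bit_change(Value *v) {
  auto *inst = dynamic_cast<Instruction *>(v);
  if (inst == nullptr)
    return {nullptr, UndefBit};
  if (inst->is_shl_of_one)
    return {inst, ShiftBit};
  return {nullptr, UndefBit};
}
```

A caller that sees `UndefBit` would then fall back to the cmpxchg expansion rather than crash, which is the behavior the reduced test cases need.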

In reply to D140939#4072672 (@goldstein.w.n):

Created: https://reviews.llvm.org/D142339

But let me know if revert is still preferable (I say as much there).

In D140939#4072755, @goldstein.w.n wrote:

In D140939#4072672, @goldstein.w.n wrote:

In D140939#4072488, @alexfh wrote:

In D140939#4072474, @alexfh wrote:

In D140939#4072473, @alexfh wrote:

Unfortunately, there's another problem, which doesn't get fixed by https://reviews.llvm.org/D142166 / 2e25204779e5b972d668bf66a0014c1325813b35:

assert.h assertion failed at llvm/lib/Target/X86/X86ISelLowering.cpp:31466 in std::pair<Value *, BitTestKind> FindSingleBitChange(Value *): I != nullptr
    @     0x56257abc9524  __assert_fail
    @     0x562578c7fd1c  FindSingleBitChange()
    @     0x562578c7f6dd  llvm::X86TargetLowering::shouldExpandLogicAtomicRMWInIR()
    @     0x562578c8098b  llvm::X86TargetLowering::shouldExpandAtomicRMWInIR()
    @     0x5625790a6099  (anonymous namespace)::AtomicExpand::tryExpandAtomicRMW()
    @     0x5625790a59dd  (anonymous namespace)::AtomicExpand::runOnFunction()
    @     0x56257a86e87d  llvm::FPPassManager::runOnFunction()
    @     0x56257a876104  llvm::FPPassManager::runOnModule()
    @     0x56257a86ef7c  llvm::legacy::PassManagerImpl::run()
    @     0x562575bad581  clang::EmitBackendOutput()
    @     0x562575baabe9  clang::BackendConsumer::HandleTranslationUnit()
    @     0x562576a85e5c  clang::ParseAST()
    @     0x5625767cbf23  clang::FrontendAction::Execute()
    @     0x562576741cad  clang::CompilerInstance::ExecuteAction()
    @     0x5625757871e8  clang::ExecuteCompilerInvocation()
    @     0x56257577ae41  cc1_main()
    @     0x562575776ec8  ExecuteCC1Tool()
    @     0x5625768ed317  llvm::function_ref<>::callback_fn<>()
    @     0x56257aa23d20  llvm::CrashRecoveryContext::RunSafely()
    @     0x5625768ecb63  clang::driver::CC1Command::Execute()
    @     0x5625768ab2e6  clang::driver::Compilation::ExecuteCommand()
    @     0x5625768ab60f  clang::driver::Compilation::ExecuteJobs()
    @     0x5625768caf70  clang::driver::Driver::ExecuteCompilation()
    @     0x562575776027  clang_main()
    @     0x7f8f04be3633  __libc_start_main
    @     0x562575772bea  _start

I'd suggest to revert while investigating. I'm working on an isolated test case.

$ cat reduced.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define weak_odr void @f() {
entry:
  br label %if.end19

cont11:                                           ; No predecessors!
  %not = xor i32 0, -1
  %0 = atomicrmw and ptr null, i32 %not monotonic, align 4
  %and13 = and i32 %0, 0
  br label %if.end19

if.end19:                                         ; preds = %cont11, %entry
  ret void
}
$ ./good-clang -c reduced.ll -o good.o
$ ./bad-clang -c reduced.ll -o bad.o
assert.h assertion failed at llvm/lib/Target/X86/X86ISelLowering.cpp:31466 in std::pair<Value *, BitTestKind> FindSingleBitChange(Value *): I != nullptr
    @     0x559d015c9524  __assert_fail
    @     0x559cff67fd1c  FindSingleBitChange()
    @     0x559cff67f6dd  llvm::X86TargetLowering::shouldExpandLogicAtomicRMWInIR()
    @     0x559cff68098b  llvm::X86TargetLowering::shouldExpandAtomicRMWInIR()
    @     0x559cffaa6099  (anonymous namespace)::AtomicExpand::tryExpandAtomicRMW()
    @     0x559cffaa59dd  (anonymous namespace)::AtomicExpand::runOnFunction()
    @     0x559d0126e87d  llvm::FPPassManager::runOnFunction()
    @     0x559d01276104  llvm::FPPassManager::runOnModule()
    @     0x559d0126ef7c  llvm::legacy::PassManagerImpl::run()
    @     0x559cfc5ad581  clang::EmitBackendOutput()
    @     0x559cfc5a94f8  clang::CodeGenAction::ExecuteAction()
    @     0x559cfd1cbf23  clang::FrontendAction::Execute()
    @     0x559cfd141cad  clang::CompilerInstance::ExecuteAction()
    @     0x559cfc1871e8  clang::ExecuteCompilerInvocation()
    @     0x559cfc17ae41  cc1_main()

Thank you for finding the case. Another one (and probably a more realistic one) is:

define weak_odr void @f(i32 %0) {
entry:
  br label %if.end19

cont11:                                           ; No predecessors!
  %not = xor i32 %0, -1
  %1 = atomicrmw and ptr null, i32 %not monotonic, align 4
  %and13 = and i32 %1, 0
  br label %if.end19

if.end19:                                         ; preds = %cont11, %entry
  ret void
}

So the fix is:

-      assert(I != nullptr);
+
+      // If constant it will fold and we can evaluate later. If its an argument
+      // or something of that nature, we can't analyze.
+      if (I == nullptr)
+        return {nullptr, UndefBit};

I'm happy to post that as a patch.

Created: https://reviews.llvm.org/D142339

But let me know if revert is still preferable (I say as much there).

If you still want to revert, however, happy to do so as well and then repost the full version.

Whichever you think is more prudent.

We're seeing a ton of fallout due to this, and given that there are already two different patterns that cause crashes, I'd like to have a known good version in the tree before we play whack-a-mole ;). Thus, I'd prefer that you reverted this and then recommitted with your proposed fix.

Hi. As you probably saw, the change was pushed moments before this. The opinion in the other PR seemed to be that it's best to push the fix.
I don't see anything in the build logs this morning that looks related to the AtomicRMW code (they seem to be blocked by something else, though),
so my feeling is that if things are fine with the third fix, we leave it up; if we see ANY issue that may be related to the code, immediately reverting all
three is the way to go. Ping back here if you see any issue and I'll revert, or you can just do so.

Sorry for the hassle this has caused :(

n-omer added a subscriber: n-omer.Mar 12 2023, 3:41 AM

n-omer mentioned this in D145930: [X86] Fix encoding for ATOMIC_LOGIC_OP.Mar 13 2023, 6:05 AM

Revision Contents

Path	Size
llvm/include/llvm/IR/IntrinsicsX86.td	8 lines
llvm/lib/Target/X86/X86ISelLowering.h	3 lines
llvm/lib/Target/X86/X86ISelLowering.cpp	238 lines
llvm/lib/Target/X86/X86InstrCompiler.td	34 lines
llvm/test/CodeGen/X86/atomic-rm-bit-test.ll	1606 lines

Diff 486432

llvm/include/llvm/IR/IntrinsicsX86.td

	// Lock bit test.			// Lock bit test.
	let TargetPrefix = "x86" in {			let TargetPrefix = "x86" in {
	def int_x86_atomic_bts : Intrinsic<[llvm_anyint_ty], [llvm_ptr_ty, llvm_i8_ty],			def int_x86_atomic_bts : Intrinsic<[llvm_anyint_ty], [llvm_ptr_ty, llvm_i8_ty],
	[ImmArg<ArgIndex<1>>]>;			[ImmArg<ArgIndex<1>>]>;
	def int_x86_atomic_btc : Intrinsic<[llvm_anyint_ty], [llvm_ptr_ty, llvm_i8_ty],			def int_x86_atomic_btc : Intrinsic<[llvm_anyint_ty], [llvm_ptr_ty, llvm_i8_ty],
	[ImmArg<ArgIndex<1>>]>;			[ImmArg<ArgIndex<1>>]>;
	def int_x86_atomic_btr : Intrinsic<[llvm_anyint_ty], [llvm_ptr_ty, llvm_i8_ty],			def int_x86_atomic_btr : Intrinsic<[llvm_anyint_ty], [llvm_ptr_ty, llvm_i8_ty],
	[ImmArg<ArgIndex<1>>]>;			[ImmArg<ArgIndex<1>>]>;
				def int_x86_atomic_bts_rm : Intrinsic<[llvm_i8_ty], [llvm_ptr_ty, llvm_anyint_ty],
				[]>;
				def int_x86_atomic_btc_rm : Intrinsic<[llvm_i8_ty], [llvm_ptr_ty, llvm_anyint_ty],
				[]>;
				def int_x86_atomic_btr_rm : Intrinsic<[llvm_i8_ty], [llvm_ptr_ty, llvm_anyint_ty],
				[]>;


	}			}

	// Lock binary arith with CC.			// Lock binary arith with CC.
	let TargetPrefix = "x86" in {			let TargetPrefix = "x86" in {
	def int_x86_atomic_add_cc : Intrinsic<[llvm_i8_ty], [llvm_ptr_ty, llvm_anyint_ty, llvm_i32_ty],			def int_x86_atomic_add_cc : Intrinsic<[llvm_i8_ty], [llvm_ptr_ty, llvm_anyint_ty, llvm_i32_ty],
	[ImmArg<ArgIndex<2>>]>;			[ImmArg<ArgIndex<2>>]>;
	def int_x86_atomic_sub_cc : Intrinsic<[llvm_i8_ty], [llvm_ptr_ty, llvm_anyint_ty, llvm_i32_ty],			def int_x86_atomic_sub_cc : Intrinsic<[llvm_i8_ty], [llvm_ptr_ty, llvm_anyint_ty, llvm_i32_ty],
	[ImmArg<ArgIndex<2>>]>;			[ImmArg<ArgIndex<2>>]>;

llvm/lib/Target/X86/X86ISelLowering.h

enum NodeType : unsigned {
LADD,		LADD,
LSUB,		LSUB,
LOR,		LOR,
LXOR,		LXOR,
LAND,		LAND,
LBTS,		LBTS,
LBTC,		LBTC,
LBTR,		LBTR,
		LBTS_RM,
		LBTC_RM,
		LBTR_RM,

/// RAO arithmetic instructions.		/// RAO arithmetic instructions.
/// OUTCHAIN = AADD(INCHAIN, PTR, RHS)		/// OUTCHAIN = AADD(INCHAIN, PTR, RHS)
AADD,		AADD,
AOR,		AOR,
AXOR,		AXOR,
AAND,		AAND,


llvm/lib/Target/X86/X86ISelLowering.cpp

case Intrinsic::x86_aesdecwide256kl:
Info.opc = ISD::INTRINSIC_W_CHAIN;		Info.opc = ISD::INTRINSIC_W_CHAIN;
Info.ptrVal = I.getArgOperand(0);		Info.ptrVal = I.getArgOperand(0);
Info.memVT = EVT::getIntegerVT(I.getType()->getContext(), 64);		Info.memVT = EVT::getIntegerVT(I.getType()->getContext(), 64);
Info.align = Align(1);		Info.align = Align(1);
Info.flags |= MachineMemOperand::MOLoad;		Info.flags |= MachineMemOperand::MOLoad;
return true;		return true;
case Intrinsic::x86_cmpccxadd32:		case Intrinsic::x86_cmpccxadd32:
case Intrinsic::x86_cmpccxadd64:		case Intrinsic::x86_cmpccxadd64:
		case Intrinsic::x86_atomic_bts_rm:
		case Intrinsic::x86_atomic_btc_rm:
		case Intrinsic::x86_atomic_btr_rm:
case Intrinsic::x86_atomic_bts:		case Intrinsic::x86_atomic_bts:
case Intrinsic::x86_atomic_btc:		case Intrinsic::x86_atomic_btc:
case Intrinsic::x86_atomic_btr: {		case Intrinsic::x86_atomic_btr: {
Info.opc = ISD::INTRINSIC_W_CHAIN;		Info.opc = ISD::INTRINSIC_W_CHAIN;
Info.ptrVal = I.getArgOperand(0);		Info.ptrVal = I.getArgOperand(0);
unsigned Size = I.getType()->getScalarSizeInBits();		unsigned Size = I.getType()->getScalarSizeInBits();
		pengfei: Should the `Size` be the same as operand 1 rather than the result?
		goldstein.w.n: If I change it I run into assertion errors. With `I.getArgOperand(0)->getType()->getScalarSizeInBits()` the `Size` can be zero (bitwidth too small), and with `I.getArgOperand(1)->getType()->getScalarSizeInBits()` I run into:
		llc: /home/noah/programs/opensource/llvm-dev/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:11108: llvm::MemSDNode::MemSDNode(unsigned int, unsigned int, const llvm::DebugLoc&, llvm::SDVTList, llvm::EVT, llvm::MachineMemOperand*): Assertion `memvt.getStoreSize().getKnownMinSize() <= MMO->getSize() && "Size mismatch!"' failed.
		Maybe that means there is a bug in the patch elsewhere? Running the test, these are the sizes I see (each printed several times): 8 / 0 / 16, 16 / 0 / 8, 8 / 0 / 32, 32 / 0 / 8, 8 / 0 / 64, 64 / 0 / 8.
		pengfei: I mean split `x86_atomic_bt*_rm` from it and then change it to `I.getArgOperand(1)->getType()->getScalarSizeInBits()`. The other intrinsics are correct to use `I.getType()->getScalarSizeInBits()`.
		goldstein.w.n: What does "split `x86_atomic_bt*_rm` from it" mean?
		pengfei:
		case Intrinsic::x86_cmpccxadd32:
		case Intrinsic::x86_cmpccxadd64:
		case Intrinsic::x86_atomic_bts:
		case Intrinsic::x86_atomic_btc:
		case Intrinsic::x86_atomic_btr: {
		...
		...
		}
		case Intrinsic::x86_atomic_bts_rm:
		case Intrinsic::x86_atomic_btc_rm:
		case Intrinsic::x86_atomic_btr_rm: {
		...
		...
		}
		goldstein.w.n: Done, for some reason I had thought they were in different cases.
Info.memVT = EVT::getIntegerVT(I.getType()->getContext(), Size);		Info.memVT = EVT::getIntegerVT(I.getType()->getContext(), Size);
Info.align = Align(Size);		Info.align = Align(Size);
Info.flags |= MachineMemOperand::MOLoad | MachineMemOperand::MOStore |		Info.flags |= MachineMemOperand::MOLoad | MachineMemOperand::MOStore |
MachineMemOperand::MOVolatile;		MachineMemOperand::MOVolatile;
return true;		return true;
}		}

case Intrinsic::x86_aadd32:		case Intrinsic::x86_aadd32:
case Intrinsic::x86_aadd64:		case Intrinsic::x86_aadd64:
case Intrinsic::x86_aand32:		case Intrinsic::x86_aand32:
case Intrinsic::x86_aand64:		case Intrinsic::x86_aand64:
case Intrinsic::x86_aor32:		case Intrinsic::x86_aor32:
case Intrinsic::x86_aor64:		case Intrinsic::x86_aor64:
case Intrinsic::x86_axor32:		case Intrinsic::x86_axor32:
case Intrinsic::x86_axor64:		case Intrinsic::x86_axor64:

case Intrinsic::x86_testui: {
SDLoc dl(Op);		SDLoc dl(Op);
SDValue Chain = Op.getOperand(0);		SDValue Chain = Op.getOperand(0);
SDVTList VTs = DAG.getVTList(MVT::i32, MVT::Other);		SDVTList VTs = DAG.getVTList(MVT::i32, MVT::Other);
SDValue Operation = DAG.getNode(X86ISD::TESTUI, dl, VTs, Chain);		SDValue Operation = DAG.getNode(X86ISD::TESTUI, dl, VTs, Chain);
SDValue SetCC = getSETCC(X86::COND_B, Operation.getValue(0), dl, DAG);		SDValue SetCC = getSETCC(X86::COND_B, Operation.getValue(0), dl, DAG);
return DAG.getNode(ISD::MERGE_VALUES, dl, Op->getVTList(), SetCC,		return DAG.getNode(ISD::MERGE_VALUES, dl, Op->getVTList(), SetCC,
Operation.getValue(1));		Operation.getValue(1));
}		}
		case Intrinsic::x86_atomic_bts_rm:
		case Intrinsic::x86_atomic_btc_rm:
		case Intrinsic::x86_atomic_btr_rm: {
		SDLoc DL(Op);
		MVT VT = Op.getSimpleValueType();
		SDValue Chain = Op.getOperand(0);
		SDValue Op1 = Op.getOperand(2);
		SDValue Op2 = Op.getOperand(3);
		unsigned Opc = IntNo == Intrinsic::x86_atomic_bts_rm ? X86ISD::LBTS_RM
		: IntNo == Intrinsic::x86_atomic_btc_rm ? X86ISD::LBTC_RM
		: X86ISD::LBTR_RM;
		MachineMemOperand *MMO = cast<MemIntrinsicSDNode>(Op)->getMemOperand();
		SDValue Res =
		DAG.getMemIntrinsicNode(Opc, DL, DAG.getVTList(MVT::i32, MVT::Other),
		{Chain, Op1, Op2}, VT, MMO);
		Chain = Res.getValue(1);
		Res = DAG.getZExtOrTrunc(getSETCC(X86::COND_B, Res, DL, DAG), DL, VT);
		return DAG.getNode(ISD::MERGE_VALUES, DL, Op->getVTList(), Res, Chain);
		}
case Intrinsic::x86_atomic_bts:		case Intrinsic::x86_atomic_bts:
case Intrinsic::x86_atomic_btc:		case Intrinsic::x86_atomic_btc:
case Intrinsic::x86_atomic_btr: {		case Intrinsic::x86_atomic_btr: {
SDLoc DL(Op);		SDLoc DL(Op);
MVT VT = Op.getSimpleValueType();		MVT VT = Op.getSimpleValueType();
SDValue Chain = Op.getOperand(0);		SDValue Chain = Op.getOperand(0);
SDValue Op1 = Op.getOperand(2);		SDValue Op1 = Op.getOperand(2);
SDValue Op2 = Op.getOperand(3);		SDValue Op2 = Op.getOperand(3);
unsigned Opc = IntNo == Intrinsic::x86_atomic_bts ? X86ISD::LBTS		unsigned Opc = IntNo == Intrinsic::x86_atomic_bts ? X86ISD::LBTS
: IntNo == Intrinsic::x86_atomic_btc ? X86ISD::LBTC		: IntNo == Intrinsic::x86_atomic_btc ? X86ISD::LBTC
: X86ISD::LBTR;		: X86ISD::LBTR;

SDValue Size = DAG.getConstant(VT.getScalarSizeInBits(), DL, MVT::i32);		SDValue Size = DAG.getConstant(VT.getScalarSizeInBits(), DL, MVT::i32);
MachineMemOperand *MMO = cast<MemIntrinsicSDNode>(Op)->getMemOperand();		MachineMemOperand *MMO = cast<MemIntrinsicSDNode>(Op)->getMemOperand();
SDValue Res =		SDValue Res =
DAG.getMemIntrinsicNode(Opc, DL, DAG.getVTList(MVT::i32, MVT::Other),		DAG.getMemIntrinsicNode(Opc, DL, DAG.getVTList(MVT::i32, MVT::Other),
{Chain, Op1, Op2, Size}, VT, MMO);		{Chain, Op1, Op2, Size}, VT, MMO);
Chain = Res.getValue(1);		Chain = Res.getValue(1);
Res = DAG.getZExtOrTrunc(getSETCC(X86::COND_B, Res, DL, DAG), DL, VT);		Res = DAG.getZExtOrTrunc(getSETCC(X86::COND_B, Res, DL, DAG), DL, VT);
unsigned Imm = cast<ConstantSDNode>(Op2)->getZExtValue();		unsigned Imm = cast<ConstantSDNode>(Op2)->getZExtValue();

if (MemType->getPrimitiveSizeInBits() == 64 && !Subtarget.is64Bit() &&
!Subtarget.useSoftFloat() && !NoImplicitFloatOps &&		!Subtarget.useSoftFloat() && !NoImplicitFloatOps &&
(Subtarget.hasSSE1() || Subtarget.hasX87()))		(Subtarget.hasSSE1() || Subtarget.hasX87()))
return AtomicExpansionKind::None;		return AtomicExpansionKind::None;

return needsCmpXchgNb(MemType) ? AtomicExpansionKind::CmpXChg		return needsCmpXchgNb(MemType) ? AtomicExpansionKind::CmpXChg
: AtomicExpansionKind::None;		: AtomicExpansionKind::None;
}		}

		enum BitTestKind : unsigned {
		UndefBit,
		ConstantBit,
		NotConstantBit,
		ShiftBit,
		NotShiftBit
		};

		static std::pair<Value *, BitTestKind> FindSingleBitChange(Value *V) {
		BitTestKind BTK = UndefBit;
		auto *C = dyn_cast<ConstantInt>(V);
		if (C) {
		// Check if V is a power of 2 or or NOT power of 2.
		pengfei: or
		if (isPowerOf2_64(C->getZExtValue())) {
		pengfei: No parentheses needed for single line scope.
		BTK = ConstantBit;
		} else if (isPowerOf2_64((~C->getValue()).getZExtValue())) {
		BTK = NotConstantBit;
		}
		return {V, BTK};
		}

		// Check if V is some power of 2 pattern known to be non-zero
		auto *I = dyn_cast<Instruction>(V);
		if (I) {
		bool Not = false;
		// Check if we have a NOT
		if (I->getOpcode() == Instruction::Sub ||
		I->getOpcode() == Instruction::Xor) {

		auto *OpC0 = dyn_cast<ConstantInt>(I->getOperand(0));
		auto *OpC1 = dyn_cast<ConstantInt>(I->getOperand(1));
		// Check if this is a NOT instruction: -1 - X or X/-1 ^ -1/X
		pengfei: Should it be a canonical form that we can assume it is always `X ^ -1`?
		goldstein.w.n: Probably de facto fixed by using `match` (I didn't know `match` existed when I wrote this patch :/ a lot to learn).
		if (!OpC0 && (!OpC1 || I->getOpcode() == Instruction::Sub))
		pengfei: We can use `match(....)` for better readability?
		return {nullptr, UndefBit};

		auto *MaybeNeg1 = OpC0 ? OpC0 : OpC1;
		if (!MaybeNeg1->isMinusOne())
		return {nullptr, UndefBit};

		auto *OpI0 = dyn_cast<Instruction>(I->getOperand(0));
		auto *OpI1 = dyn_cast<Instruction>(I->getOperand(1));

		assert(OpI0 != nullptr || OpI1 != nullptr);
		assert(OpI0 == nullptr || OpI1 == nullptr);

		I = OpI0 ? OpI0 : OpI1;
		Not = true;
		}
		// We can only use 1 << X without more sophisticated analysis. C << X where
		// C is a power of 2 but not 1 can result in zero which cannot be translated
		// to bittest. Likewise any C >> X (either arith or logical) can be zero.
		if (I->getOpcode() == Instruction::Shl) {
		// Todo(1): The cmpxchg case is pretty costly so matching `BLSI(X)`, `X &
		// -X` and some other provable power of 2 patterns that we can use CTZ on
		// may be profitable.
		// Todo(2): It may be possible in some cases to prove that Shl(C, X) is
		// non-zero even where C != 1. Likewise LShr(C, X) and AShr(C, X) may also
		// be provably a non-zero power of 2.
		// Todo(3): ROTL and ROTR patterns on a power of 2 C should also be
		// transformable to bittest.
		auto *ShiftVal = dyn_cast<ConstantInt>(I->getOperand(0));
		if (!ShiftVal)
		return {nullptr, UndefBit};
		if (ShiftVal->equalsInt(1))
		BTK = Not ? NotShiftBit : ShiftBit;

		if (BTK == UndefBit)
		return {nullptr, UndefBit};

		Value *BitV = I->getOperand(1);
		if (auto *I1 = dyn_cast<Instruction>(BitV)) {
		// Read past a shiftmask instruction to find count
		if (I1->getOpcode() == Instruction::And) {
		auto *OpC0 = dyn_cast<ConstantInt>(I1->getOperand(0));
		auto *OpC1 = dyn_cast<ConstantInt>(I1->getOperand(1));
		if (OpC0 || OpC1) {
		assert(OpC0 == nullptr || OpC1 == nullptr);
		auto *C1 = OpC0 ? OpC0 : OpC1;
		if (C1->equalsInt(I->getType()->getPrimitiveSizeInBits() - 1))
		BitV = OpC0 ? I1->getOperand(1) : I1->getOperand(0);
		}
		}
		}

		return {BitV, BTK};
		}
		}
		return {nullptr, UndefBit};
		}
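
As a rough illustration of the constant-operand branch above, the following standalone sketch classifies a mask as a single set bit, the complement of a single set bit, or neither. It assumes plain `uint64_t` masks with an explicit bit width instead of LLVM's `ConstantInt`/`APInt`; `isSingleBit` and `classifyConstant` are hypothetical helper names, not part of the patch.

```cpp
#include <cstdint>

enum BitTestKind { UndefBit, ConstantBit, NotConstantBit };

// True when exactly one bit of X is set.
inline bool isSingleBit(uint64_t X) { return X != 0 && (X & (X - 1)) == 0; }

// Classify a constant mask of the given bit width (1..64).
inline BitTestKind classifyConstant(uint64_t Mask, unsigned Width) {
  const uint64_t WidthMask = Width >= 64 ? ~0ULL : ((1ULL << Width) - 1);
  Mask &= WidthMask;
  if (isSingleBit(Mask))
    return ConstantBit; // e.g. the operand of an atomic OR/XOR.
  if (isSingleBit(~Mask & WidthMask))
    return NotConstantBit; // e.g. the operand of an atomic AND.
  return UndefBit;
}
```

The explicit width matters for the complemented case: inverting a 32-bit mask must not set the upper 32 bits of the 64-bit value being tested.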

TargetLowering::AtomicExpansionKind		TargetLowering::AtomicExpansionKind
X86TargetLowering::shouldExpandLogicAtomicRMWInIR(AtomicRMWInst *AI) const {		X86TargetLowering::shouldExpandLogicAtomicRMWInIR(AtomicRMWInst *AI) const {
// If the atomicrmw's result isn't actually used, we can just add a "lock"		// If the atomicrmw's result isn't actually used, we can just add a "lock"
// prefix to a normal instruction for these operations.		// prefix to a normal instruction for these operations.
if (AI->use_empty())		if (AI->use_empty())
return AtomicExpansionKind::None;		return AtomicExpansionKind::None;

// If the atomicrmw's result is used by a single bit AND, we may use		// If the atomicrmw's result is used by a single bit AND, we may use
// bts/btr/btc instruction for these operations.		// bts/btr/btc instruction for these operations.
auto *C1 = dyn_cast<ConstantInt>(AI->getValOperand());		// Note: InstCombinePass can cause a de-optimization here. It replaces the
		// SETCC(And(AtomicRMW(P, power_of_2), power_of_2)) with LShr and Xor
		// (depending on CC). This pattern can only use bts/btr/btc but we don't
		// detect it.
Instruction *I = AI->user_back();		Instruction *I = AI->user_back();
if (!C1 || !AI->hasOneUse() || I->getOpcode() != Instruction::And ||		auto BitChange = FindSingleBitChange(AI->getValOperand());
		if (!BitChange.first || BitChange.second == UndefBit || !AI->hasOneUse() ||
		pengfei: Is it enough to only check `BitChange.second == UndefBit`?
		I->getOpcode() != Instruction::And ||
		AI->getType()->getPrimitiveSizeInBits() == 8 ||
AI->getParent() != I->getParent())		AI->getParent() != I->getParent())
return AtomicExpansionKind::CmpXChg;		return AtomicExpansionKind::CmpXChg;

		assert(I->getOperand(0) == AI);
// The following instruction must be a AND single bit.		// The following instruction must be a AND single bit.
		if (BitChange.second == ConstantBit || BitChange.second == NotConstantBit) {
		auto *C1 = dyn_cast<ConstantInt>(AI->getValOperand());
		assert(C1 != nullptr);
auto *C2 = dyn_cast<ConstantInt>(I->getOperand(1));		auto *C2 = dyn_cast<ConstantInt>(I->getOperand(1));
unsigned Bits = AI->getType()->getPrimitiveSizeInBits();		if (!C2 || !isPowerOf2_64(C2->getZExtValue())) {
if (!C2 || Bits == 8 || !isPowerOf2_64(C2->getZExtValue()))		return AtomicExpansionKind::CmpXChg;
		}
		if (AI->getOperation() == AtomicRMWInst::And) {
		return ~C1->getValue() == C2->getValue()
		? AtomicExpansionKind::BitTestIntrinsic
		: AtomicExpansionKind::CmpXChg;
		}
		return C1 == C2 ? AtomicExpansionKind::BitTestIntrinsic
		: AtomicExpansionKind::CmpXChg;
		}

		assert(BitChange.second == ShiftBit || BitChange.second == NotShiftBit);

		auto BitTested = FindSingleBitChange(I->getOperand(1));
		if (BitTested.second != ShiftBit && BitTested.second != NotShiftBit)
		return AtomicExpansionKind::CmpXChg;

		assert(BitTested.first != nullptr);

		// If shift amounts are not the same we can't use BitTestIntrinsic
		pengfei: Missing the last period `.`
		if (BitChange.first != BitTested.first)
return AtomicExpansionKind::CmpXChg;		return AtomicExpansionKind::CmpXChg;

		// If atomic AND need to be masking all be one bit and testing the one bit
		// unset in the mask
		pengfei: ditto.
if (AI->getOperation() == AtomicRMWInst::And)		if (AI->getOperation() == AtomicRMWInst::And)
return ~C1->getValue() == C2->getValue()		return (BitChange.second == NotShiftBit && BitTested.second == ShiftBit)
? AtomicExpansionKind::BitTestIntrinsic		? AtomicExpansionKind::BitTestIntrinsic
: AtomicExpansionKind::CmpXChg;		: AtomicExpansionKind::CmpXChg;

return C1 == C2 ? AtomicExpansionKind::BitTestIntrinsic		// If atomic XOR/OR need to be setting and testing the same bit.
		return (BitChange.second == ShiftBit && BitTested.second == ShiftBit)
		? AtomicExpansionKind::BitTestIntrinsic
: AtomicExpansionKind::CmpXChg;		: AtomicExpansionKind::CmpXChg;
}		}
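
The constant-mask decision above can be summarized in a small standalone sketch. It assumes 32-bit masks and uses hypothetical names (`RMWOp`, `canUseBitTest`); it only mirrors the rule that an atomic AND must clear exactly the tested bit, while an atomic OR/XOR must set or flip exactly the tested bit.

```cpp
#include <cstdint>

enum class RMWOp { And, Or, Xor };

// True when exactly one bit of X is set.
inline bool isSingleBit(uint32_t X) { return X != 0 && (X & (X - 1)) == 0; }

// RMWMask is the atomicrmw operand, TestMask is the operand of the AND that
// consumes the atomicrmw result.
inline bool canUseBitTest(RMWOp Op, uint32_t RMWMask, uint32_t TestMask) {
  if (!isSingleBit(TestMask))
    return false;
  if (Op == RMWOp::And)
    return static_cast<uint32_t>(~RMWMask) == TestMask; // and p, ~bit; test bit
  return RMWMask == TestMask;                           // or/xor p, bit; test bit
}
```

Any combination that fails this check would keep the cmpxchg expansion.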

void X86TargetLowering::emitBitTestAtomicRMWIntrinsic(AtomicRMWInst *AI) const {		void X86TargetLowering::emitBitTestAtomicRMWIntrinsic(AtomicRMWInst *AI) const {
IRBuilder<> Builder(AI);		IRBuilder<> Builder(AI);
Intrinsic::ID IID = Intrinsic::not_intrinsic;		Intrinsic::ID IID_C = Intrinsic::not_intrinsic;
		Intrinsic::ID IID_I = Intrinsic::not_intrinsic;
switch (AI->getOperation()) {		switch (AI->getOperation()) {
default:		default:
llvm_unreachable("Unknown atomic operation");		llvm_unreachable("Unknown atomic operation");
case AtomicRMWInst::Or:		case AtomicRMWInst::Or:
IID = Intrinsic::x86_atomic_bts;		IID_C = Intrinsic::x86_atomic_bts;
		IID_I = Intrinsic::x86_atomic_bts_rm;
break;		break;
case AtomicRMWInst::Xor:		case AtomicRMWInst::Xor:
IID = Intrinsic::x86_atomic_btc;		IID_C = Intrinsic::x86_atomic_btc;
		IID_I = Intrinsic::x86_atomic_btc_rm;
break;		break;
case AtomicRMWInst::And:		case AtomicRMWInst::And:
IID = Intrinsic::x86_atomic_btr;		IID_C = Intrinsic::x86_atomic_btr;
		IID_I = Intrinsic::x86_atomic_btr_rm;
break;		break;
}		}
Instruction *I = AI->user_back();		Instruction *I = AI->user_back();
LLVMContext &Ctx = AI->getContext();		LLVMContext &Ctx = AI->getContext();
unsigned Imm =
countTrailingZeros(cast<ConstantInt>(I->getOperand(1))->getZExtValue());
Function *BitTest =
Intrinsic::getDeclaration(AI->getModule(), IID, AI->getType());
Value *Addr = Builder.CreatePointerCast(AI->getPointerOperand(),		Value *Addr = Builder.CreatePointerCast(AI->getPointerOperand(),
Type::getInt8PtrTy(Ctx));		Type::getInt8PtrTy(Ctx));
Value *Result = Builder.CreateCall(BitTest, {Addr, Builder.getInt8(Imm)});		Function *BitTest = nullptr;
		Value *Result = nullptr;
		auto BitTested = FindSingleBitChange(AI->getValOperand());
		assert(BitTested.first != nullptr);
		if (BitTested.second == ConstantBit || BitTested.second == NotConstantBit) {
		auto *C = dyn_cast<ConstantInt>(I->getOperand(1));
		assert(C != nullptr);

		BitTest = Intrinsic::getDeclaration(AI->getModule(), IID_C, AI->getType());

		unsigned Imm = countTrailingZeros(C->getZExtValue());
		Result = Builder.CreateCall(BitTest, {Addr, Builder.getInt8(Imm)});
		} else {
		BitTest = Intrinsic::getDeclaration(AI->getModule(), IID_I, AI->getType());

		assert(BitTested.second == ShiftBit || BitTested.second == NotShiftBit);

		Value *SI = BitTested.first;
		assert(SI != nullptr);

		// BT{S|R|C} on memory operand don't modulo bit position so we need to
		// mask it.
		unsigned ShiftBits = SI->getType()->getPrimitiveSizeInBits();
		Value *BitPos =
		Builder.CreateAnd(SI, Builder.getIntN(ShiftBits, ShiftBits - 1));
		// Todo(1): In many cases it may be provable that SI is less than
		// ShiftBits in which case this mask is unnecessary
		// Todo(2): In the fairly idiomatic case of P[X / sizeof_bits(X)] OP 1
		// << (X % sizeof_bits(X)) we can drop the shift mask and AGEN in
		// favor of just a raw BT{S|R|C}.

		Result = Builder.CreateCall(BitTest, {Addr, BitPos});
		Result = Builder.CreateZExtOrTrunc(Result, AI->getType());

		// If the result is only used for zero/non-zero status then we don't need to
		// shift value back. Otherwise do so.
		for (auto It = I->user_begin(); It != I->user_end(); ++It) {
		if (auto ICmp = dyn_cast<ICmpInst>(It)) {
		if (ICmp->isEquality()) {
		auto *C0 = dyn_cast<ConstantInt>(ICmp->getOperand(0));
		auto *C1 = dyn_cast<ConstantInt>(ICmp->getOperand(1));
		if (C0 || C1) {
		assert(C0 == nullptr || C1 == nullptr);
		if ((C0 ? C0 : C1)->isZero())
		continue;
		}
		}
		}
		Result = Builder.CreateShl(Result, BitPos);
		break;
		}
		}

I->replaceAllUsesWith(Result);		I->replaceAllUsesWith(Result);
I->eraseFromParent();		I->eraseFromParent();
AI->eraseFromParent();		AI->eraseFromParent();
}		}
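
The non-constant path above can be modeled in portable C++ as a rough sanity check. The sketch below is not code from the patch: the function names are hypothetical, and `fetch_xor` merely stands in for `lock btc`. It compares the original "fetch-xor, then test the bit" pattern against the rewritten "toggle, capture the old bit as 0/1, shift it back" shape.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical names; a model of the rewrite, not code from the patch.

// Original pattern: fetch_xor with a single-bit mask, then test that bit.
inline uint32_t xor_then_test(std::atomic<uint32_t> &word, uint32_t c) {
  const uint32_t mask = 1u << (c & 31u);
  return word.fetch_xor(mask, std::memory_order_relaxed) & mask;
}

// Rewritten shape: toggle the bit, capture its old value as 0/1 (the role
// CF plays after `lock btc`), then shift it back to its original position.
inline uint32_t btc_setb_shl(std::atomic<uint32_t> &word, uint32_t c) {
  const uint32_t bit_pos = c & 31u; // mirrors the CreateAnd on the bit index
  const uint32_t old = word.fetch_xor(1u << bit_pos, std::memory_order_relaxed);
  const uint32_t carry = (old >> bit_pos) & 1u; // stands in for setb
  return carry << bit_pos;                      // stands in for shl
}

// True when both forms return the same value and leave the same word in memory.
inline bool forms_agree(uint32_t initial, uint32_t c) {
  std::atomic<uint32_t> a(initial), b(initial);
  const uint32_t ra = xor_then_test(a, c);
  const uint32_t rb = btc_setb_shl(b, c);
  return ra == rb && a.load() == b.load();
}
```

When every user of the result is an equality comparison against zero, the final shift is unnecessary, which is the case the loop over `I`'s users tries to detect.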

static bool shouldExpandCmpArithRMWInIR(AtomicRMWInst *AI) {		static bool shouldExpandCmpArithRMWInIR(AtomicRMWInst *AI) {
using namespace llvm::PatternMatch;		using namespace llvm::PatternMatch;
if (!AI->hasOneUse())		if (!AI->hasOneUse())
...	#define NODE_NAME_CASE(NODE) case X86ISD::NODE: return "X86ISD::" #NODE;
NODE_NAME_CASE(LADD)		NODE_NAME_CASE(LADD)
NODE_NAME_CASE(LSUB)		NODE_NAME_CASE(LSUB)
NODE_NAME_CASE(LOR)		NODE_NAME_CASE(LOR)
NODE_NAME_CASE(LXOR)		NODE_NAME_CASE(LXOR)
NODE_NAME_CASE(LAND)		NODE_NAME_CASE(LAND)
NODE_NAME_CASE(LBTS)		NODE_NAME_CASE(LBTS)
NODE_NAME_CASE(LBTC)		NODE_NAME_CASE(LBTC)
NODE_NAME_CASE(LBTR)		NODE_NAME_CASE(LBTR)
		NODE_NAME_CASE(LBTS_RM)
		NODE_NAME_CASE(LBTC_RM)
		NODE_NAME_CASE(LBTR_RM)
NODE_NAME_CASE(AADD)		NODE_NAME_CASE(AADD)
NODE_NAME_CASE(AOR)		NODE_NAME_CASE(AOR)
NODE_NAME_CASE(AXOR)		NODE_NAME_CASE(AXOR)
NODE_NAME_CASE(AAND)		NODE_NAME_CASE(AAND)
NODE_NAME_CASE(VZEXT_MOVL)		NODE_NAME_CASE(VZEXT_MOVL)
NODE_NAME_CASE(VZEXT_LOAD)		NODE_NAME_CASE(VZEXT_LOAD)
NODE_NAME_CASE(VEXTRACT_STORE)		NODE_NAME_CASE(VEXTRACT_STORE)
NODE_NAME_CASE(VTRUNC)		NODE_NAME_CASE(VTRUNC)
...

llvm/lib/Target/X86/X86InstrCompiler.td

...	def X86LBTest : SDTypeProfile<1, 3, [SDTCisVT<0, i32>, SDTCisPtrTy<1>,
SDTCisVT<2, i8>, SDTCisVT<3, i32>]>;		SDTCisVT<2, i8>, SDTCisVT<3, i32>]>;
def x86bts : SDNode<"X86ISD::LBTS", X86LBTest,		def x86bts : SDNode<"X86ISD::LBTS", X86LBTest,
[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;
def x86btc : SDNode<"X86ISD::LBTC", X86LBTest,		def x86btc : SDNode<"X86ISD::LBTC", X86LBTest,
[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;
def x86btr : SDNode<"X86ISD::LBTR", X86LBTest,		def x86btr : SDNode<"X86ISD::LBTR", X86LBTest,
[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;

		def X86LBTestRM : SDTypeProfile<1, 2, [SDTCisVT<0, i32>, SDTCisPtrTy<1>,
		SDTCisInt<2>]>;

		def x86_rm_bts : SDNode<"X86ISD::LBTS_RM", X86LBTestRM,
		[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;
		def x86_rm_btc : SDNode<"X86ISD::LBTC_RM", X86LBTestRM,
		[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;
		def x86_rm_btr : SDNode<"X86ISD::LBTR_RM", X86LBTestRM,
		[SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>;


multiclass ATOMIC_LOGIC_OP<Format Form, string s> {		multiclass ATOMIC_LOGIC_OP<Format Form, string s> {
let Defs = [EFLAGS], mayLoad = 1, mayStore = 1, isCodeGenOnly = 1,		let Defs = [EFLAGS], mayLoad = 1, mayStore = 1, isCodeGenOnly = 1,
SchedRW = [WriteBitTestSetRegRMW] in {		SchedRW = [WriteBitTestSetRegRMW] in {
def 16m : Ii8<0xBA, Form, (outs), (ins i16mem:$src1, i8imm:$src2),		def 16m : Ii8<0xBA, Form, (outs), (ins i16mem:$src1, i8imm:$src2),
!strconcat(s, "{w}\t{$src2, $src1|$src1, $src2}"),		!strconcat(s, "{w}\t{$src2, $src1|$src1, $src2}"),
[(set EFLAGS, (!cast<SDNode>("x86" # s) addr:$src1, timm:$src2, (i32 16)))]>,		[(set EFLAGS, (!cast<SDNode>("x86" # s) addr:$src1, timm:$src2, (i32 16)))]>,
OpSize16, TB, LOCK;		OpSize16, TB, LOCK;
def 32m : Ii8<0xBA, Form, (outs), (ins i32mem:$src1, i8imm:$src2),		def 32m : Ii8<0xBA, Form, (outs), (ins i32mem:$src1, i8imm:$src2),
!strconcat(s, "{l}\t{$src2, $src1|$src1, $src2}"),		!strconcat(s, "{l}\t{$src2, $src1|$src1, $src2}"),
[(set EFLAGS, (!cast<SDNode>("x86" # s) addr:$src1, timm:$src2, (i32 32)))]>,		[(set EFLAGS, (!cast<SDNode>("x86" # s) addr:$src1, timm:$src2, (i32 32)))]>,
OpSize32, TB, LOCK;		OpSize32, TB, LOCK;
def 64m : RIi8<0xBA, Form, (outs), (ins i64mem:$src1, i8imm:$src2),		def 64m : RIi8<0xBA, Form, (outs), (ins i64mem:$src1, i8imm:$src2),
!strconcat(s, "{q}\t{$src2, $src1|$src1, $src2}"),		!strconcat(s, "{q}\t{$src2, $src1|$src1, $src2}"),
[(set EFLAGS, (!cast<SDNode>("x86" # s) addr:$src1, timm:$src2, (i32 64)))]>,		[(set EFLAGS, (!cast<SDNode>("x86" # s) addr:$src1, timm:$src2, (i32 64)))]>,
TB, LOCK;		TB, LOCK;
}		}
}		}

		multiclass ATOMIC_LOGIC_OP_RM<bits<8> Opc8, string s> {
		let Defs = [EFLAGS], mayLoad = 1, mayStore = 1, isCodeGenOnly = 1,
		SchedRW = [WriteBitTestSetRegRMW] in {
		def 16rm : Ii8<Opc8, MRMDestMem, (outs), (ins i16mem:$src1, GR16:$src2),
		!strconcat(s, "{w}\t{$src2, $src1|$src1, $src2}"),
		[(set EFLAGS, (!cast<SDNode>("x86_rm_" # s) addr:$src1, GR16:$src2))]>,
		OpSize16, TB, LOCK;
		def 32rm : Ii8<Opc8, MRMDestMem, (outs), (ins i32mem:$src1, GR32:$src2),
		!strconcat(s, "{l}\t{$src2, $src1|$src1, $src2}"),
		[(set EFLAGS, (!cast<SDNode>("x86_rm_" # s) addr:$src1, GR32:$src2))]>,
		OpSize32, TB, LOCK;
		def 64rm : RIi8<Opc8, MRMDestMem, (outs), (ins i64mem:$src1, GR64:$src2),
		!strconcat(s, "{q}\t{$src2, $src1|$src1, $src2}"),
		[(set EFLAGS, (!cast<SDNode>("x86_rm_" # s) addr:$src1, GR64:$src2))]>,
		TB, LOCK;
		}
		}


defm LOCK_BTS : ATOMIC_LOGIC_OP<MRM5m, "bts">;		defm LOCK_BTS : ATOMIC_LOGIC_OP<MRM5m, "bts">;
defm LOCK_BTC : ATOMIC_LOGIC_OP<MRM7m, "btc">;		defm LOCK_BTC : ATOMIC_LOGIC_OP<MRM7m, "btc">;
defm LOCK_BTR : ATOMIC_LOGIC_OP<MRM6m, "btr">;		defm LOCK_BTR : ATOMIC_LOGIC_OP<MRM6m, "btr">;

		defm LOCK_BTS_RM : ATOMIC_LOGIC_OP_RM<0xAB, "bts">;
		defm LOCK_BTC_RM : ATOMIC_LOGIC_OP_RM<0xBB, "btc">;
		defm LOCK_BTR_RM : ATOMIC_LOGIC_OP_RM<0xB3, "btr">;
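
For context on why the register-operand forms need the bit index masked beforehand: with a memory operand and a register bit offset, the x86 bit-test instructions treat the register as a signed bit offset and can address a word other than the one at the base address. The following is a minimal C++ sketch of that addressing rule for the 32-bit form. The type and function names are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper types; a model of the addressing rule only.
struct BitLocation {
  int64_t word_index; // 32-bit words relative to the base address
  uint32_t bit;       // bit within that word
};

// Models `bt{s,r,c}l %reg, (mem)`: the register is a signed bit offset and
// the accessed dword is base + 4 * floor(offset / 32).
inline BitLocation bt_mem_location(int32_t bit_offset) {
  const uint32_t bit = static_cast<uint32_t>(bit_offset) & 31u;
  // bit_offset - bit is an exact multiple of 32, so this division is a floor.
  const int64_t word = (static_cast<int64_t>(bit_offset) - bit) / 32;
  return {word, bit};
}

// After the `and` inserted by the lowering, the offset is in [0, 31], so the
// access always stays in the word at the base address.
inline BitLocation masked_location(int32_t bit_offset) {
  const uint32_t masked = static_cast<uint32_t>(bit_offset) & 31u;
  return bt_mem_location(static_cast<int32_t>(masked));
}
```

An unmasked offset of 35, for example, would touch bit 3 of the next word, whereas the masked offset touches bit 3 of the word at the base address.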

// Atomic compare and swap.		// Atomic compare and swap.
multiclass LCMPXCHG_BinOp<bits<8> Opc8, bits<8> Opc, Format Form,		multiclass LCMPXCHG_BinOp<bits<8> Opc8, bits<8> Opc, Format Form,
string mnemonic, SDPatternOperator frag> {		string mnemonic, SDPatternOperator frag> {
let isCodeGenOnly = 1, SchedRW = [WriteCMPXCHGRMW] in {		let isCodeGenOnly = 1, SchedRW = [WriteCMPXCHGRMW] in {
let Defs = [AL, EFLAGS], Uses = [AL] in		let Defs = [AL, EFLAGS], Uses = [AL] in
def NAME#8 : I<Opc8, Form, (outs), (ins i8mem:$ptr, GR8:$swap),		def NAME#8 : I<Opc8, Form, (outs), (ins i8mem:$ptr, GR8:$swap),
!strconcat(mnemonic, "{b}\t{$swap, $ptr|$ptr, $swap}"),		!strconcat(mnemonic, "{b}\t{$swap, $ptr|$ptr, $swap}"),
[(frag addr:$ptr, GR8:$swap, 1)]>, TB, LOCK;		[(frag addr:$ptr, GR8:$swap, 1)]>, TB, LOCK;
...

llvm/test/CodeGen/X86/atomic-rm-bit-test.ll

...	entry:
%0 = atomicrmw xor ptr %v, i16 %conv1 monotonic, align 2		%0 = atomicrmw xor ptr %v, i16 %conv1 monotonic, align 2
%conv5 = and i16 %0, %conv1		%conv5 = and i16 %0, %conv1
ret i16 %conv5		ret i16 %conv5
}		}

define zeroext i16 @atomic_shl1_small_mask_xor_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {		define zeroext i16 @atomic_shl1_small_mask_xor_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {
; X86-LABEL: atomic_shl1_small_mask_xor_16_gpr_val:		; X86-LABEL: atomic_shl1_small_mask_xor_16_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: andb $7, %cl		; X86-NEXT: andl $7, %ecx
; X86-NEXT: movl $1, %esi		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btcw %cx, (%edx)
; X86-NEXT: movzwl (%edx), %eax		; X86-NEXT: setb %al
; X86-NEXT: movzwl %si, %ecx		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: shll %cl, %eax
; X86-NEXT: .LBB13_1: # %atomicrmw.start
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %esi
; X86-NEXT: xorl %ecx, %esi
; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: lock cmpxchgw %si, (%edx)
; X86-NEXT: # kill: def $ax killed $ax def $eax
; X86-NEXT: jne .LBB13_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: andl %ecx, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_xor_16_gpr_val:		; X64-LABEL: atomic_shl1_small_mask_xor_16_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: andb $7, %cl		; X64-NEXT: andl $7, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btcw %cx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: shll %cl, %eax
; X64-NEXT: movzwl (%rdi), %eax
; X64-NEXT: movzwl %dx, %ecx
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB13_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %edx
; X64-NEXT: xorl %ecx, %edx
; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: lock cmpxchgw %dx, (%rdi)
; X64-NEXT: # kill: def $ax killed $ax def $eax
; X64-NEXT: jne .LBB13_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %ecx, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i16 %c, 7		%0 = and i16 %c, 7
%shl = shl nuw nsw i16 1, %0		%shl = shl nuw nsw i16 1, %0
%1 = atomicrmw xor ptr %v, i16 %shl monotonic, align 2		%1 = atomicrmw xor ptr %v, i16 %shl monotonic, align 2
%and = and i16 %1, %shl		%and = and i16 %1, %shl
ret i16 %and		ret i16 %and
...	entry:
%shl4 = shl nuw i16 1, %1		%shl4 = shl nuw i16 1, %1
%and = and i16 %0, %shl4		%and = and i16 %0, %shl4
ret i16 %and		ret i16 %and
}		}

define zeroext i16 @atomic_shl1_mask01_xor_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {		define zeroext i16 @atomic_shl1_mask01_xor_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {
; X86-LABEL: atomic_shl1_mask01_xor_16_gpr_val:		; X86-LABEL: atomic_shl1_mask01_xor_16_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: andl $15, %ecx
; X86-NEXT: andb $15, %cl		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl $1, %edx		; X86-NEXT: lock btcw %cx, (%edx)
; X86-NEXT: shll %cl, %edx		; X86-NEXT: setb %al
; X86-NEXT: movzwl (%esi), %eax		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: shll %cl, %eax
; X86-NEXT: .LBB16_1: # %atomicrmw.start
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: xorl %edx, %ecx
; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: lock cmpxchgw %cx, (%esi)
; X86-NEXT: # kill: def $ax killed $ax def $eax
; X86-NEXT: jne .LBB16_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: andl %edx, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_xor_16_gpr_val:		; X64-LABEL: atomic_shl1_mask01_xor_16_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: andb $15, %cl		; X64-NEXT: andl $15, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btcw %cx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: shll %cl, %eax
; X64-NEXT: movzwl (%rdi), %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB16_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %ecx
; X64-NEXT: xorl %edx, %ecx
; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: lock cmpxchgw %cx, (%rdi)
; X64-NEXT: # kill: def $ax killed $ax def $eax
; X64-NEXT: jne .LBB16_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %edx, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i16 %c, 15		%0 = and i16 %c, 15
%shl = shl nuw i16 1, %0		%shl = shl nuw i16 1, %0
%1 = atomicrmw xor ptr %v, i16 %shl monotonic, align 2		%1 = atomicrmw xor ptr %v, i16 %shl monotonic, align 2
%conv7 = and i16 %1, %shl		%conv7 = and i16 %1, %shl
ret i16 %conv7		ret i16 %conv7
...	entry:
%1 = atomicrmw and ptr %v, i16 %conv1 monotonic, align 2		%1 = atomicrmw and ptr %v, i16 %conv1 monotonic, align 2
%conv5 = and i16 %1, %0		%conv5 = and i16 %1, %0
ret i16 %conv5		ret i16 %conv5
}		}

define zeroext i16 @atomic_shl1_small_mask_and_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {		define zeroext i16 @atomic_shl1_small_mask_and_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {
; X86-LABEL: atomic_shl1_small_mask_and_16_gpr_val:		; X86-LABEL: atomic_shl1_small_mask_and_16_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: andb $7, %cl		; X86-NEXT: andl $7, %ecx
; X86-NEXT: movl $1, %esi		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btrw %cx, (%edx)
; X86-NEXT: movw $-2, %di		; X86-NEXT: setb %al
; X86-NEXT: rolw %cl, %di		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: movzwl (%edx), %eax		; X86-NEXT: shll %cl, %eax
; X86-NEXT: .p2align 4, 0x90
; X86-NEXT: .LBB37_1: # %atomicrmw.start
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: andl %edi, %ecx
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: lock cmpxchgw %cx, (%edx)
; X86-NEXT: # kill: def $ax killed $ax def $eax
; X86-NEXT: jne .LBB37_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movzwl %si, %ecx
; X86-NEXT: andl %eax, %ecx
; X86-NEXT: movl %ecx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_and_16_gpr_val:		; X64-LABEL: atomic_shl1_small_mask_and_16_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: andb $7, %cl		; X64-NEXT: andl $7, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btrw %cx, (%rdi)
; X64-NEXT: movw $-2, %si		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: rolw %cl, %si		; X64-NEXT: shll %cl, %eax
; X64-NEXT: movzwl (%rdi), %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB37_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %ecx
; X64-NEXT: andl %esi, %ecx
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: lock cmpxchgw %cx, (%rdi)
; X64-NEXT: # kill: def $ax killed $ax def $eax
; X64-NEXT: jne .LBB37_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movzwl %dx, %ecx
; X64-NEXT: andl %eax, %ecx
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i16 %c, 7		%0 = and i16 %c, 7
%shl = shl nuw nsw i16 1, %0		%shl = shl nuw nsw i16 1, %0
%not = xor i16 %shl, -1		%not = xor i16 %shl, -1
%1 = atomicrmw and ptr %v, i16 %not monotonic, align 2		%1 = atomicrmw and ptr %v, i16 %not monotonic, align 2
%and = and i16 %1, %shl		%and = and i16 %1, %shl
ret i16 %and		ret i16 %and
...	entry:
%shl4 = shl nuw i16 1, %2		%shl4 = shl nuw i16 1, %2
%and = and i16 %1, %shl4		%and = and i16 %1, %shl4
ret i16 %and		ret i16 %and
}		}

define zeroext i16 @atomic_shl1_mask01_and_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {		define zeroext i16 @atomic_shl1_mask01_and_16_gpr_val(ptr %v, i16 zeroext %c) nounwind {
; X86-LABEL: atomic_shl1_mask01_and_16_gpr_val:		; X86-LABEL: atomic_shl1_mask01_and_16_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: pushl %esi		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: andl $15, %ecx
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %eax, %ecx		; X86-NEXT: lock btrw %cx, (%edx)
; X86-NEXT: andb $15, %cl		; X86-NEXT: setb %al
; X86-NEXT: movl $1, %edx		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: shll %cl, %edx		; X86-NEXT: shll %cl, %eax
; X86-NEXT: movw $-2, %di
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: rolw %cl, %di
; X86-NEXT: movzwl (%esi), %eax
; X86-NEXT: .p2align 4, 0x90
; X86-NEXT: .LBB40_1: # %atomicrmw.start
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: andl %edi, %ecx
; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: lock cmpxchgw %cx, (%esi)
; X86-NEXT: # kill: def $ax killed $ax def $eax
; X86-NEXT: jne .LBB40_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: andl %edx, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_and_16_gpr_val:		; X64-LABEL: atomic_shl1_mask01_and_16_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: andb $15, %cl		; X64-NEXT: andl $15, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btrw %cx, (%rdi)
; X64-NEXT: movw $-2, %r8w		; X64-NEXT: setb %al
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: rolw %cl, %r8w		; X64-NEXT: shll %cl, %eax
; X64-NEXT: movzwl (%rdi), %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB40_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %ecx
; X64-NEXT: andl %r8d, %ecx
; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: lock cmpxchgw %cx, (%rdi)
; X64-NEXT: # kill: def $ax killed $ax def $eax
; X64-NEXT: jne .LBB40_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %edx, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i16 %c, 15		%0 = and i16 %c, 15
%shl = shl nuw i16 1, %0		%shl = shl nuw i16 1, %0
%conv1 = xor i16 %shl, -1		%conv1 = xor i16 %shl, -1
%1 = atomicrmw and ptr %v, i16 %conv1 monotonic, align 2		%1 = atomicrmw and ptr %v, i16 %conv1 monotonic, align 2
%conv7 = and i16 %1, %shl		%conv7 = and i16 %1, %shl
...
return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i16 [ %2, %if.then ], [ 123, %entry ]		%retval.0 = phi i16 [ %2, %if.then ], [ 123, %entry ]
ret i16 %retval.0		ret i16 %retval.0
}		}

define i32 @atomic_shl1_or_32_gpr_val(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_or_32_gpr_val(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_or_32_gpr_val:		; X86-LABEL: atomic_shl1_or_32_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: andl $31, %ecx
; X86-NEXT: movl $1, %edx		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: shll %cl, %edx		; X86-NEXT: lock btsl %ecx, (%edx)
; X86-NEXT: movl (%esi), %eax		; X86-NEXT: setb %al
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: .LBB60_1: # %atomicrmw.start		; X86-NEXT: shll %cl, %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: orl %edx, %ecx
; X86-NEXT: lock cmpxchgl %ecx, (%esi)
; X86-NEXT: jne .LBB60_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: andl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_or_32_gpr_val:		; X64-LABEL: atomic_shl1_or_32_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %ecx
		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btsl %ecx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: shll %cl, %eax
; X64-NEXT: movl (%rdi), %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB60_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %ecx
; X64-NEXT: orl %edx, %ecx
; X64-NEXT: lock cmpxchgl %ecx, (%rdi)
; X64-NEXT: jne .LBB60_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %0, %shl		%and = and i32 %0, %shl
ret i32 %and		ret i32 %and
}		}

define i32 @atomic_shl1_small_mask_or_32_gpr_val(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_small_mask_or_32_gpr_val(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_val:		; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: andb $15, %cl		; X86-NEXT: andl $15, %ecx
; X86-NEXT: movl $1, %esi		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %ecx, (%edx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: setb %al
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: .LBB61_1: # %atomicrmw.start		; X86-NEXT: shll %cl, %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: orl %esi, %ecx
; X86-NEXT: lock cmpxchgl %ecx, (%edx)
; X86-NEXT: jne .LBB61_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: andl %esi, %eax
; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_val:		; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: andb $15, %cl		; X64-NEXT: andl $15, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btsl %ecx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: shll %cl, %eax
; X64-NEXT: movl (%rdi), %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB61_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %ecx
; X64-NEXT: orl %edx, %ecx
; X64-NEXT: lock cmpxchgl %ecx, (%rdi)
; X64-NEXT: jne .LBB61_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i32 %c, 15		%0 = and i32 %c, 15
%shl = shl nuw nsw i32 1, %0		%shl = shl nuw nsw i32 1, %0
%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %1, %shl		%and = and i32 %1, %shl
ret i32 %and		ret i32 %and
}		}

define i32 @atomic_shl1_mask0_or_32_gpr_val(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask0_or_32_gpr_val(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask0_or_32_gpr_val:		; X86-LABEL: atomic_shl1_mask0_or_32_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: andl $31, %ecx
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: lock btsl %ecx, (%edx)
; X86-NEXT: .LBB62_1: # %atomicrmw.start		; X86-NEXT: setb %al
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %edi
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB62_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movl $1, %edx
; X86-NEXT: # kill: def $cl killed $cl killed $ecx		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: shll %cl, %edx		; X86-NEXT: shll %cl, %eax
; X86-NEXT: andl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask0_or_32_gpr_val:		; X64-LABEL: atomic_shl1_mask0_or_32_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: lock btsl %ecx, (%rdi)
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: setb %al
; X64-NEXT: .LBB62_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB62_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $1, %edx
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: shll %cl, %eax
; X64-NEXT: andl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i32 %c, 31		%0 = and i32 %c, 31
%shl = shl nuw i32 1, %0		%shl = shl nuw i32 1, %0
%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%shl1 = shl nuw i32 1, %c		%shl1 = shl nuw i32 1, %c
%and = and i32 %1, %shl1		%and = and i32 %1, %shl1
ret i32 %and		ret i32 %and
}		}

define i32 @atomic_shl1_mask1_or_32_gpr_val(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask1_or_32_gpr_val(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask1_or_32_gpr_val:		; X86-LABEL: atomic_shl1_mask1_or_32_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: andl $31, %ecx
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: lock btsl %ecx, (%edx)
; X86-NEXT: .LBB63_1: # %atomicrmw.start		; X86-NEXT: setb %al
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %edi
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB63_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movl $1, %edx
; X86-NEXT: # kill: def $cl killed $cl killed $ecx		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: shll %cl, %edx		; X86-NEXT: shll %cl, %eax
; X86-NEXT: andl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask1_or_32_gpr_val:		; X64-LABEL: atomic_shl1_mask1_or_32_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: lock btsl %ecx, (%rdi)
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: setb %al
; X64-NEXT: .LBB63_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB63_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $1, %edx
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: shll %cl, %eax
; X64-NEXT: andl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%1 = and i32 %c, 31		%1 = and i32 %c, 31
%shl1 = shl nuw i32 1, %1		%shl1 = shl nuw i32 1, %1
%and = and i32 %0, %shl1		%and = and i32 %0, %shl1
ret i32 %and		ret i32 %and
}		}

define i32 @atomic_shl1_mask01_or_32_gpr_val(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask01_or_32_gpr_val(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask01_or_32_gpr_val:		; X86-LABEL: atomic_shl1_mask01_or_32_gpr_val:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: andl $31, %ecx
; X86-NEXT: movl $1, %edx		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: shll %cl, %edx		; X86-NEXT: lock btsl %ecx, (%edx)
; X86-NEXT: movl (%esi), %eax		; X86-NEXT: setb %al
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # kill: def $cl killed $cl killed $ecx
; X86-NEXT: .LBB64_1: # %atomicrmw.start		; X86-NEXT: shll %cl, %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: orl %edx, %ecx
; X86-NEXT: lock cmpxchgl %ecx, (%esi)
; X86-NEXT: jne .LBB64_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: andl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_or_32_gpr_val:		; X64-LABEL: atomic_shl1_mask01_or_32_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %ecx
		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btsl %ecx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NEXT: shll %cl, %edx		; X64-NEXT: shll %cl, %eax
; X64-NEXT: movl (%rdi), %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB64_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %ecx
; X64-NEXT: orl %edx, %ecx
; X64-NEXT: lock cmpxchgl %ecx, (%rdi)
; X64-NEXT: jne .LBB64_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i32 %c, 31		%0 = and i32 %c, 31
%shl = shl nuw i32 1, %0		%shl = shl nuw i32 1, %0
%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %1, %shl		%and = and i32 %1, %shl
ret i32 %and		ret i32 %and
}		}
[... 701 lines not shown ...]
%tobool = icmp ne i32 %and3, 0		%tobool = icmp ne i32 %and3, 0
%lnot.ext = zext i1 %tobool to i32		%lnot.ext = zext i1 %tobool to i32
ret i32 %lnot.ext		ret i32 %lnot.ext
}		}

define i32 @atomic_shl1_or_32_gpr_br(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_or_32_gpr_br(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_or_32_gpr_br:		; X86-LABEL: atomic_shl1_or_32_gpr_br:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB78_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB78_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB78_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB78_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: testl %esi, %eax
; X86-NEXT: je .LBB78_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB78_5
; X86-NEXT: .LBB78_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB78_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
pengfei (inline comment):

The branch code doesn't look necessary. Can we necessary it?

goldstein.w.n (author, inline comment):

> The branch code doesn't look necessary. Can we necessary it?

I think it is, because we don't `cmovcc` loads. For

    if.then: ; preds = %entry
      %idxprom = zext i32 %c to i64
      %arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom
      %1 = load i32, ptr %arrayidx, align 4
      br label %return

and

    return: ; preds = %entry, %if.then
      %retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
      ret i32 %retval.0

a branch seems correct.

pengfei (inline comment):

Sorry for the wrong words. I mean can we eliminate the branch by modifying the IR code, e.g.,

    entry:
      %shl = shl nuw i32 1, %c
      %0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
      %and = and i32 %0, %shl
      %tobool.not = icmp eq i32 %and, 0
      %ret = zext i1 %tobool.not to i32
      ret i32 %ret

This will help to reduce the noise in reviewing the code and let us pay more attention to the change we expect.

goldstein.w.n (author, inline comment):

> Sorry for the wrong words. I mean can we eliminate the branch by modifying the IR code, e.g. [...]

I see. There are six different test types:

`_br` -> branch on value
`_brz` -> branch on !value
`_brnz` -> branch on !!value
`_val` -> return value
`_valz` -> return !value
`_valnz` -> return !!value

IMO they are all worth testing. For example, we have logic that searches the uses to see whether the result is only used as a truth value, in which case we can optimize out the shift. IIRC, when writing the code there were some edge cases where `br` behaved differently than `setc`, so I think it's worth keeping.
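As a rough illustration of the test flavours listed above, the following C sketch shows source-level shapes that could plausibly lower to IR like the `_val`, `_valz`, `_br`, and `_brz` tests. The function names are hypothetical and are not taken from the patch. The bit index is masked with 31 so the shift stays defined in C, which makes these closest to the `mask01` variants.

```c
#include <stdint.h>

/* Hypothetical sources; names mirror the test suffixes but are not from the patch. */

/* _val: return the masked old value (either 0 or the single bit). */
uint32_t or_32_val(uint32_t *v, uint32_t c) {
    uint32_t mask = 1u << (c & 31);
    return __atomic_fetch_or(v, mask, __ATOMIC_RELAXED) & mask;
}

/* _valz: return !value, i.e. 1 if the bit was previously clear. */
uint32_t or_32_valz(uint32_t *v, uint32_t c) {
    uint32_t mask = 1u << (c & 31);
    return !(__atomic_fetch_or(v, mask, __ATOMIC_RELAXED) & mask);
}

/* _br: branch on value; load v[bit] only if the bit was previously set. */
uint32_t or_32_br(uint32_t *v, uint32_t c) {
    uint32_t bit = c & 31;
    if (__atomic_fetch_or(v, 1u << bit, __ATOMIC_RELAXED) & (1u << bit))
        return v[bit];
    return 123;
}

/* _brz: branch on !value; load v[bit] only if the bit was previously clear. */
uint32_t or_32_brz(uint32_t *v, uint32_t c) {
    uint32_t bit = c & 31;
    if (!(__atomic_fetch_or(v, 1u << bit, __ATOMIC_RELAXED) & (1u << bit)))
        return v[bit];
    return 123;
}
```

In the `_br` and `_brz` shapes the load is conditional, which is why a real branch remains in the generated code, while the `_val` shapes only need the carry flag from the bit-test instruction.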
;		;
; X64-LABEL: atomic_shl1_or_32_gpr_br:		; X64-LABEL: atomic_shl1_or_32_gpr_br:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB78_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB78_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB78_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testl %edx, %eax
; X64-NEXT: je .LBB78_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB78_3:		; X64-NEXT: .LBB78_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %0, %shl		%and = and i32 %0, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %return, label %if.then		br i1 %tobool.not, label %return, label %if.then

if.then: ; preds = %entry		if.then: ; preds = %entry
%idxprom = zext i32 %c to i64		%idxprom = zext i32 %c to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom
%1 = load i32, ptr %arrayidx, align 4		%1 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_small_mask_or_32_gpr_br(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_small_mask_or_32_gpr_br(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_br:		; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_br:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: andl $15, %ecx		; X86-NEXT: andl $15, %ecx
; X86-NEXT: movl $1, %esi		; X86-NEXT: lock btsl %ecx, (%eax)
; X86-NEXT: shll %cl, %esi		; X86-NEXT: jae .LBB79_1
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: movl (%eax,%ecx,4), %eax
; X86-NEXT: .LBB79_1: # %atomicrmw.start		; X86-NEXT: retl
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: .LBB79_1:
; X86-NEXT: movl %eax, %edi
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB79_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: testl %esi, %eax
; X86-NEXT: je .LBB79_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB79_5
; X86-NEXT: .LBB79_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB79_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_br:		; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_br:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: andl $15, %esi
; X64-NEXT: andl $15, %ecx		; X64-NEXT: lock btsl %esi, (%rdi)
; X64-NEXT: movl $1, %edx		; X64-NEXT: jae .LBB79_1
; X64-NEXT: shll %cl, %edx		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: movl %esi, %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB79_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB79_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testl %edx, %eax
; X64-NEXT: je .LBB79_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB79_3:		; X64-NEXT: .LBB79_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i32 %c, 15		%0 = and i32 %c, 15
%shl = shl nuw nsw i32 1, %0		%shl = shl nuw nsw i32 1, %0
%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %1, %shl		%and = and i32 %1, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %return, label %if.then		br i1 %tobool.not, label %return, label %if.then

if.then: ; preds = %entry		if.then: ; preds = %entry
%conv2 = zext i32 %0 to i64		%conv2 = zext i32 %0 to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv2		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv2
%2 = load i32, ptr %arrayidx, align 4		%2 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %2, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %2, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask0_or_32_gpr_br(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask0_or_32_gpr_br(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask0_or_32_gpr_br:		; X86-LABEL: atomic_shl1_mask0_or_32_gpr_br:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB80_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB80_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB80_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB80_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: btl %ecx, %eax
; X86-NEXT: jae .LBB80_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB80_5
; X86-NEXT: .LBB80_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB80_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask0_or_32_gpr_br:		; X64-LABEL: atomic_shl1_mask0_or_32_gpr_br:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB80_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB80_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB80_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: btl %ecx, %eax
; X64-NEXT: jae .LBB80_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB80_3:		; X64-NEXT: .LBB80_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl = shl nuw i32 1, %rem		%shl = shl nuw i32 1, %rem
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%shl1 = shl nuw i32 1, %c		%shl1 = shl nuw i32 1, %c
%and = and i32 %0, %shl1		%and = and i32 %0, %shl1
[... 9 lines not shown ...]
return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask1_or_32_gpr_br(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask1_or_32_gpr_br(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask1_or_32_gpr_br:		; X86-LABEL: atomic_shl1_mask1_or_32_gpr_br:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB81_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB81_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB81_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB81_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: btl %ecx, %eax
; X86-NEXT: jae .LBB81_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB81_5
; X86-NEXT: .LBB81_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB81_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask1_or_32_gpr_br:		; X64-LABEL: atomic_shl1_mask1_or_32_gpr_br:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB81_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB81_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB81_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: btl %ecx, %eax
; X64-NEXT: jae .LBB81_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB81_3:		; X64-NEXT: .LBB81_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl1 = shl nuw i32 1, %rem		%shl1 = shl nuw i32 1, %rem
%and = and i32 %0, %shl1		%and = and i32 %0, %shl1
[... 9 lines not shown ...]
return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask01_or_32_gpr_br(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask01_or_32_gpr_br(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask01_or_32_gpr_br:		; X86-LABEL: atomic_shl1_mask01_or_32_gpr_br:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB82_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB82_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB82_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB82_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: testl %esi, %eax
; X86-NEXT: je .LBB82_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB82_5
; X86-NEXT: .LBB82_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB82_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_or_32_gpr_br:		; X64-LABEL: atomic_shl1_mask01_or_32_gpr_br:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB82_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB82_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB82_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testl %edx, %eax
; X64-NEXT: je .LBB82_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB82_3:		; X64-NEXT: .LBB82_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl = shl nuw i32 1, %rem		%shl = shl nuw i32 1, %rem
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %0, %shl		%and = and i32 %0, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
[... 81 lines not shown ...]
return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_or_32_gpr_brz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_or_32_gpr_brz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_or_32_gpr_brz:		; X86-LABEL: atomic_shl1_or_32_gpr_brz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl $1, %edi		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: shll %cl, %edi		; X86-NEXT: andl $31, %eax
; X86-NEXT: movl (%esi), %eax		; X86-NEXT: lock btsl %eax, (%edx)
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB84_1: # %atomicrmw.start		; X86-NEXT: jae .LBB84_1
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: # %bb.2: # %return
; X86-NEXT: movl %eax, %edx		; X86-NEXT: retl
; X86-NEXT: orl %edi, %edx		; X86-NEXT: .LBB84_1: # %if.then
; X86-NEXT: lock cmpxchgl %edx, (%esi)		; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jne .LBB84_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movl $123, %edx
; X86-NEXT: testl %edi, %eax
; X86-NEXT: jne .LBB84_4
; X86-NEXT: # %bb.3: # %if.then
; X86-NEXT: movl (%esi,%ecx,4), %edx
; X86-NEXT: .LBB84_4: # %return
; X86-NEXT: movl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_or_32_gpr_brz:		; X64-LABEL: atomic_shl1_or_32_gpr_brz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %esi		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %esi		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: jae .LBB84_1
; X64-NEXT: .LBB84_1: # %atomicrmw.start		; X64-NEXT: # %bb.2: # %return
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %edx
; X64-NEXT: orl %esi, %edx
; X64-NEXT: lock cmpxchgl %edx, (%rdi)
; X64-NEXT: jne .LBB84_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $123, %edx
; X64-NEXT: testl %esi, %eax
; X64-NEXT: je .LBB84_3
; X64-NEXT: # %bb.4: # %return
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB84_3: # %if.then		; X64-NEXT: .LBB84_1: # %if.then
; X64-NEXT: movl %ecx, %eax		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl (%rdi,%rax,4), %edx		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %0, %shl		%and = and i32 %0, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %if.then, label %return		br i1 %tobool.not, label %if.then, label %return

if.then: ; preds = %entry		if.then: ; preds = %entry
%idxprom = zext i32 %c to i64		%idxprom = zext i32 %c to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom
%1 = load i32, ptr %arrayidx, align 4		%1 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_small_mask_or_32_gpr_brz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_small_mask_or_32_gpr_brz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_brz:		; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_brz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: andl $15, %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl $1, %edi		; X86-NEXT: andl $15, %edx
; X86-NEXT: shll %cl, %edi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%esi), %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: jae .LBB85_1
; X86-NEXT: .LBB85_1: # %atomicrmw.start		; X86-NEXT: # %bb.2: # %return
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edx		; X86-NEXT: .LBB85_1: # %if.then
; X86-NEXT: orl %edi, %edx		; X86-NEXT: movl (%ecx,%edx,4), %eax
; X86-NEXT: lock cmpxchgl %edx, (%esi)
; X86-NEXT: jne .LBB85_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movl $123, %edx
; X86-NEXT: testl %edi, %eax
; X86-NEXT: jne .LBB85_4
; X86-NEXT: # %bb.3: # %if.then
; X86-NEXT: movl (%esi,%ecx,4), %edx
; X86-NEXT: .LBB85_4: # %return
; X86-NEXT: movl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_brz:		; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_brz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: andl $15, %esi
; X64-NEXT: andl $15, %ecx		; X64-NEXT: lock btsl %esi, (%rdi)
; X64-NEXT: movl $1, %esi		; X64-NEXT: movl $123, %eax
; X64-NEXT: shll %cl, %esi		; X64-NEXT: jae .LBB85_1
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: # %bb.2: # %return
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB85_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %edx
; X64-NEXT: orl %esi, %edx
; X64-NEXT: lock cmpxchgl %edx, (%rdi)
; X64-NEXT: jne .LBB85_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $123, %edx
; X64-NEXT: testl %esi, %eax
; X64-NEXT: je .LBB85_3
; X64-NEXT: # %bb.4: # %return
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB85_3: # %if.then		; X64-NEXT: .LBB85_1: # %if.then
; X64-NEXT: movl %ecx, %eax		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl (%rdi,%rax,4), %edx		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i32 %c, 15		%0 = and i32 %c, 15
%shl = shl nuw nsw i32 1, %0		%shl = shl nuw nsw i32 1, %0
%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %1, %shl		%and = and i32 %1, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %if.then, label %return		br i1 %tobool.not, label %if.then, label %return

if.then: ; preds = %entry		if.then: ; preds = %entry
%conv2 = zext i32 %0 to i64		%conv2 = zext i32 %0 to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv2		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv2
%2 = load i32, ptr %arrayidx, align 4		%2 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %2, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %2, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask0_or_32_gpr_brz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask0_or_32_gpr_brz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask0_or_32_gpr_brz:		; X86-LABEL: atomic_shl1_mask0_or_32_gpr_brz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl $1, %edx		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: shll %cl, %edx		; X86-NEXT: andl $31, %eax
; X86-NEXT: movl (%esi), %eax		; X86-NEXT: lock btsl %eax, (%edx)
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB86_1: # %atomicrmw.start		; X86-NEXT: jae .LBB86_1
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: # %bb.2: # %return
; X86-NEXT: movl %eax, %edi		; X86-NEXT: retl
; X86-NEXT: orl %edx, %edi		; X86-NEXT: .LBB86_1: # %if.then
; X86-NEXT: lock cmpxchgl %edi, (%esi)		; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jne .LBB86_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movl $123, %edx
; X86-NEXT: btl %ecx, %eax
; X86-NEXT: jb .LBB86_4
; X86-NEXT: # %bb.3: # %if.then
; X86-NEXT: movl (%esi,%ecx,4), %edx
; X86-NEXT: .LBB86_4: # %return
; X86-NEXT: movl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask0_or_32_gpr_brz:		; X64-LABEL: atomic_shl1_mask0_or_32_gpr_brz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: jae .LBB86_1
; X64-NEXT: .LBB86_1: # %atomicrmw.start		; X64-NEXT: # %bb.2: # %return
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB86_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $123, %edx
; X64-NEXT: btl %ecx, %eax
; X64-NEXT: jae .LBB86_3
; X64-NEXT: # %bb.4: # %return
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB86_3: # %if.then		; X64-NEXT: .LBB86_1: # %if.then
; X64-NEXT: movl %ecx, %eax		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl (%rdi,%rax,4), %edx		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl = shl nuw i32 1, %rem		%shl = shl nuw i32 1, %rem
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%shl1 = shl nuw i32 1, %c		%shl1 = shl nuw i32 1, %c
%and = and i32 %0, %shl1		%and = and i32 %0, %shl1
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %if.then, label %return		br i1 %tobool.not, label %if.then, label %return

if.then: ; preds = %entry		if.then: ; preds = %entry
%conv = zext i32 %c to i64		%conv = zext i32 %c to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv
%1 = load i32, ptr %arrayidx, align 4		%1 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask1_or_32_gpr_brz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask1_or_32_gpr_brz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask1_or_32_gpr_brz:		; X86-LABEL: atomic_shl1_mask1_or_32_gpr_brz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl $1, %edx		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: shll %cl, %edx		; X86-NEXT: andl $31, %eax
; X86-NEXT: movl (%esi), %eax		; X86-NEXT: lock btsl %eax, (%edx)
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB87_1: # %atomicrmw.start		; X86-NEXT: jae .LBB87_1
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: # %bb.2: # %return
; X86-NEXT: movl %eax, %edi		; X86-NEXT: retl
; X86-NEXT: orl %edx, %edi		; X86-NEXT: .LBB87_1: # %if.then
; X86-NEXT: lock cmpxchgl %edi, (%esi)		; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jne .LBB87_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movl $123, %edx
; X86-NEXT: btl %ecx, %eax
; X86-NEXT: jb .LBB87_4
; X86-NEXT: # %bb.3: # %if.then
; X86-NEXT: movl (%esi,%ecx,4), %edx
; X86-NEXT: .LBB87_4: # %return
; X86-NEXT: movl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask1_or_32_gpr_brz:		; X64-LABEL: atomic_shl1_mask1_or_32_gpr_brz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: jae .LBB87_1
; X64-NEXT: .LBB87_1: # %atomicrmw.start		; X64-NEXT: # %bb.2: # %return
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB87_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $123, %edx
; X64-NEXT: btl %ecx, %eax
; X64-NEXT: jae .LBB87_3
; X64-NEXT: # %bb.4: # %return
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB87_3: # %if.then		; X64-NEXT: .LBB87_1: # %if.then
; X64-NEXT: movl %ecx, %eax		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl (%rdi,%rax,4), %edx		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl1 = shl nuw i32 1, %rem		%shl1 = shl nuw i32 1, %rem
%and = and i32 %0, %shl1		%and = and i32 %0, %shl1
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %if.then, label %return		br i1 %tobool.not, label %if.then, label %return

if.then: ; preds = %entry		if.then: ; preds = %entry
%conv = zext i32 %c to i64		%conv = zext i32 %c to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv
%1 = load i32, ptr %arrayidx, align 4		%1 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask01_or_32_gpr_brz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask01_or_32_gpr_brz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask01_or_32_gpr_brz:		; X86-LABEL: atomic_shl1_mask01_or_32_gpr_brz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl $1, %edi		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: shll %cl, %edi		; X86-NEXT: andl $31, %eax
; X86-NEXT: movl (%esi), %eax		; X86-NEXT: lock btsl %eax, (%edx)
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB88_1: # %atomicrmw.start		; X86-NEXT: jae .LBB88_1
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: # %bb.2: # %return
; X86-NEXT: movl %eax, %edx		; X86-NEXT: retl
; X86-NEXT: orl %edi, %edx		; X86-NEXT: .LBB88_1: # %if.then
; X86-NEXT: lock cmpxchgl %edx, (%esi)		; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jne .LBB88_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: movl $123, %edx
; X86-NEXT: testl %edi, %eax
; X86-NEXT: jne .LBB88_4
; X86-NEXT: # %bb.3: # %if.then
; X86-NEXT: movl (%esi,%ecx,4), %edx
; X86-NEXT: .LBB88_4: # %return
; X86-NEXT: movl %edx, %eax
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_or_32_gpr_brz:		; X64-LABEL: atomic_shl1_mask01_or_32_gpr_brz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %esi		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %esi		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: jae .LBB88_1
; X64-NEXT: .LBB88_1: # %atomicrmw.start		; X64-NEXT: # %bb.2: # %return
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %edx
; X64-NEXT: orl %esi, %edx
; X64-NEXT: lock cmpxchgl %edx, (%rdi)
; X64-NEXT: jne .LBB88_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $123, %edx
; X64-NEXT: testl %esi, %eax
; X64-NEXT: je .LBB88_3
; X64-NEXT: # %bb.4: # %return
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB88_3: # %if.then		; X64-NEXT: .LBB88_1: # %if.then
; X64-NEXT: movl %ecx, %eax		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl (%rdi,%rax,4), %edx		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: movl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl = shl nuw i32 1, %rem		%shl = shl nuw i32 1, %rem
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %0, %shl		%and = and i32 %0, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %if.then, label %return		br i1 %tobool.not, label %if.then, label %return
[... 81 lines not shown ...]
return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_or_32_gpr_brnz:		; X86-LABEL: atomic_shl1_or_32_gpr_brnz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB90_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB90_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB90_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB90_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: testl %esi, %eax
; X86-NEXT: je .LBB90_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB90_5
; X86-NEXT: .LBB90_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB90_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_or_32_gpr_brnz:		; X64-LABEL: atomic_shl1_or_32_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB90_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB90_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB90_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testl %edx, %eax
; X64-NEXT: je .LBB90_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB90_3:		; X64-NEXT: .LBB90_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %0, %shl		%and = and i32 %0, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %return, label %if.then		br i1 %tobool.not, label %return, label %if.then

if.then: ; preds = %entry		if.then: ; preds = %entry
%idxprom = zext i32 %c to i64		%idxprom = zext i32 %c to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom
%1 = load i32, ptr %arrayidx, align 4		%1 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}
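
For orientation, the IR of `atomic_shl1_or_32_gpr_brnz` above corresponds roughly to the C below. This is a minimal sketch, not the source the test was generated from: the function name `or_test_brnz` is hypothetical, and it assumes GCC/Clang `__atomic` builtins and a caller that guarantees `c < 32`. In the right-hand column of this diff the pattern is checked as a single `lock btsl` followed by a branch on the carry flag; what a given compiler emits for this sketch may differ.

```c
#include <stdint.h>

/* Hypothetical helper: set bit c of v[0] atomically, then branch on
 * whether that bit was already set (the "brnz" shape in the test). */
static uint32_t or_test_brnz(uint32_t *v, uint32_t c) {
    uint32_t mask = UINT32_C(1) << c;   /* assumption: c < 32 */
    /* __ATOMIC_RELAXED matches the `monotonic` ordering in the IR. */
    if (__atomic_fetch_or(v, mask, __ATOMIC_RELAXED) & mask)
        return v[c];                    /* bit was already set: %if.then */
    return 123;                         /* bit was clear: %return */
}
```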

define i32 @atomic_shl1_small_mask_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_small_mask_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_brnz:		; X86-LABEL: atomic_shl1_small_mask_or_32_gpr_brnz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: andl $15, %ecx		; X86-NEXT: andl $15, %ecx
; X86-NEXT: movl $1, %esi		; X86-NEXT: lock btsl %ecx, (%eax)
; X86-NEXT: shll %cl, %esi		; X86-NEXT: jae .LBB91_1
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: movl (%eax,%ecx,4), %eax
; X86-NEXT: .LBB91_1: # %atomicrmw.start		; X86-NEXT: retl
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: .LBB91_1:
; X86-NEXT: movl %eax, %edi
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB91_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: testl %esi, %eax
; X86-NEXT: je .LBB91_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB91_5
; X86-NEXT: .LBB91_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB91_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_brnz:		; X64-LABEL: atomic_shl1_small_mask_or_32_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: andl $15, %esi
; X64-NEXT: andl $15, %ecx		; X64-NEXT: lock btsl %esi, (%rdi)
; X64-NEXT: movl $1, %edx		; X64-NEXT: jae .LBB91_1
; X64-NEXT: shll %cl, %edx		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: movl %esi, %eax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB91_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB91_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testl %edx, %eax
; X64-NEXT: je .LBB91_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB91_3:		; X64-NEXT: .LBB91_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%0 = and i32 %c, 15		%0 = and i32 %c, 15
%shl = shl nuw nsw i32 1, %0		%shl = shl nuw nsw i32 1, %0
%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%1 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %1, %shl		%and = and i32 %1, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
br i1 %tobool.not, label %return, label %if.then		br i1 %tobool.not, label %return, label %if.then

if.then: ; preds = %entry		if.then: ; preds = %entry
%conv2 = zext i32 %0 to i64		%conv2 = zext i32 %0 to i64
%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv2		%arrayidx = getelementptr inbounds i32, ptr %v, i64 %conv2
%2 = load i32, ptr %arrayidx, align 4		%2 = load i32, ptr %arrayidx, align 4
br label %return		br label %return

return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %2, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %2, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask0_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask0_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask0_or_32_gpr_brnz:		; X86-LABEL: atomic_shl1_mask0_or_32_gpr_brnz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB92_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB92_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB92_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB92_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: btl %ecx, %eax
; X86-NEXT: jae .LBB92_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB92_5
; X86-NEXT: .LBB92_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB92_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask0_or_32_gpr_brnz:		; X64-LABEL: atomic_shl1_mask0_or_32_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB92_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB92_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB92_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: btl %ecx, %eax
; X64-NEXT: jae .LBB92_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB92_3:		; X64-NEXT: .LBB92_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl = shl nuw i32 1, %rem		%shl = shl nuw i32 1, %rem
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%shl1 = shl nuw i32 1, %c		%shl1 = shl nuw i32 1, %c
%and = and i32 %0, %shl1		%and = and i32 %0, %shl1
[... 9 lines not shown ...]
return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask1_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask1_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask1_or_32_gpr_brnz:		; X86-LABEL: atomic_shl1_mask1_or_32_gpr_brnz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB93_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB93_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB93_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB93_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: btl %ecx, %eax
; X86-NEXT: jae .LBB93_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB93_5
; X86-NEXT: .LBB93_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB93_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask1_or_32_gpr_brnz:		; X64-LABEL: atomic_shl1_mask1_or_32_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB93_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB93_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB93_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: btl %ecx, %eax
; X64-NEXT: jae .LBB93_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB93_3:		; X64-NEXT: .LBB93_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i32 1, %c		%shl = shl nuw i32 1, %c
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl1 = shl nuw i32 1, %rem		%shl1 = shl nuw i32 1, %rem
%and = and i32 %0, %shl1		%and = and i32 %0, %shl1
[... 9 lines not shown ...]
return: ; preds = %entry, %if.then		return: ; preds = %entry, %if.then
%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]		%retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define i32 @atomic_shl1_mask01_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {		define i32 @atomic_shl1_mask01_or_32_gpr_brnz(ptr %v, i32 %c) nounwind {
; X86-LABEL: atomic_shl1_mask01_or_32_gpr_brnz:		; X86-LABEL: atomic_shl1_mask01_or_32_gpr_brnz:
; X86: # %bb.0: # %entry		; X86: # %bb.0: # %entry
; X86-NEXT: pushl %edi		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: movl $1, %esi		; X86-NEXT: andl $31, %edx
; X86-NEXT: shll %cl, %esi		; X86-NEXT: lock btsl %edx, (%ecx)
; X86-NEXT: movl (%edx), %eax		; X86-NEXT: jae .LBB94_1
; X86-NEXT: .p2align 4, 0x90		; X86-NEXT: # %bb.2: # %if.then
; X86-NEXT: .LBB94_1: # %atomicrmw.start		; X86-NEXT: movl (%ecx,%eax,4), %eax
; X86-NEXT: # =>This Inner Loop Header: Depth=1		; X86-NEXT: retl
; X86-NEXT: movl %eax, %edi		; X86-NEXT: .LBB94_1:
; X86-NEXT: orl %esi, %edi
; X86-NEXT: lock cmpxchgl %edi, (%edx)
; X86-NEXT: jne .LBB94_1
; X86-NEXT: # %bb.2: # %atomicrmw.end
; X86-NEXT: testl %esi, %eax
; X86-NEXT: je .LBB94_3
; X86-NEXT: # %bb.4: # %if.then
; X86-NEXT: movl (%edx,%ecx,4), %eax
; X86-NEXT: jmp .LBB94_5
; X86-NEXT: .LBB94_3:
; X86-NEXT: movl $123, %eax		; X86-NEXT: movl $123, %eax
; X86-NEXT: .LBB94_5: # %return
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_or_32_gpr_brnz:		; X64-LABEL: atomic_shl1_mask01_or_32_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movl %esi, %ecx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $31, %eax
; X64-NEXT: shll %cl, %edx		; X64-NEXT: lock btsl %eax, (%rdi)
; X64-NEXT: movl (%rdi), %eax		; X64-NEXT: jae .LBB94_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB94_1: # %atomicrmw.start		; X64-NEXT: movl %esi, %eax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movl %eax, %esi
; X64-NEXT: orl %edx, %esi
; X64-NEXT: lock cmpxchgl %esi, (%rdi)
; X64-NEXT: jne .LBB94_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testl %edx, %eax
; X64-NEXT: je .LBB94_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: movl (%rdi,%rax,4), %eax		; X64-NEXT: movl (%rdi,%rax,4), %eax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB94_3:		; X64-NEXT: .LBB94_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i32 %c, 31		%rem = and i32 %c, 31
%shl = shl nuw i32 1, %rem		%shl = shl nuw i32 1, %rem
%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4		%0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
%and = and i32 %0, %shl		%and = and i32 %0, %shl
%tobool.not = icmp eq i32 %and, 0		%tobool.not = icmp eq i32 %and, 0
[... 1,390 lines not shown ...]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_xor_64_gpr_val:		; X64-LABEL: atomic_shl1_xor_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %ecx
		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btcq %rcx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: movq (%rdi), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB122_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rcx
; X64-NEXT: xorq %rdx, %rcx
; X64-NEXT: lock cmpxchgq %rcx, (%rdi)
; X64-NEXT: jne .LBB122_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i64 1, %c		%shl = shl nuw i64 1, %c
%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8		%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
ret i64 %and		ret i64 %and
}		}

[... 166 lines not shown ...]
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_xor_64_gpr_val:		; X64-LABEL: atomic_shl1_small_mask_xor_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: andb $31, %cl		; X64-NEXT: andl $31, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btcq %rcx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: movq (%rdi), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB125_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rcx
; X64-NEXT: xorq %rdx, %rcx
; X64-NEXT: lock cmpxchgq %rcx, (%rdi)
; X64-NEXT: jne .LBB125_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 31		%rem = and i64 %c, 31
%shl = shl nuw nsw i64 1, %rem		%shl = shl nuw nsw i64 1, %rem
%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8		%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
ret i64 %and		ret i64 %and
}		}
[... 46 lines not shown ...]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask0_xor_64_gpr_val:		; X64-LABEL: atomic_shl1_mask0_xor_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %ecx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: lock btcq %rcx, (%rdi)
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: setb %al
; X64-NEXT: .LBB126_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rsi
; X64-NEXT: xorq %rdx, %rsi
; X64-NEXT: lock cmpxchgq %rsi, (%rdi)
; X64-NEXT: jne .LBB126_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $1, %edx
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl = shl nuw i64 1, %rem		%shl = shl nuw i64 1, %rem
%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8		%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8
%shl1 = shl nuw i64 1, %c		%shl1 = shl nuw i64 1, %c
%and = and i64 %0, %shl1		%and = and i64 %0, %shl1
ret i64 %and		ret i64 %and
[... 47 lines not shown ...]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask1_xor_64_gpr_val:		; X64-LABEL: atomic_shl1_mask1_xor_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %ecx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: lock btcq %rcx, (%rdi)
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: setb %al
; X64-NEXT: .LBB127_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rsi
; X64-NEXT: xorq %rdx, %rsi
; X64-NEXT: lock cmpxchgq %rsi, (%rdi)
; X64-NEXT: jne .LBB127_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $1, %edx
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i64 1, %c		%shl = shl nuw i64 1, %c
%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8		%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl1 = shl nuw i64 1, %rem		%shl1 = shl nuw i64 1, %rem
%and = and i64 %0, %shl1		%and = and i64 %0, %shl1
ret i64 %and		ret i64 %and
[... 36 lines not shown ...]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_xor_64_gpr_val:		; X64-LABEL: atomic_shl1_mask01_xor_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %ecx
		; X64-NEXT: xorl %eax, %eax
		; X64-NEXT: lock btcq %rcx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: movq (%rdi), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB128_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rcx
; X64-NEXT: xorq %rdx, %rcx
; X64-NEXT: lock cmpxchgq %rcx, (%rdi)
; X64-NEXT: jne .LBB128_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl = shl nuw i64 1, %rem		%shl = shl nuw i64 1, %rem
%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8		%0 = atomicrmw xor ptr %v, i64 %shl monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
ret i64 %and		ret i64 %and
}		}
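
The `xor` cases follow the shape quoted in the revision summary, `__atomic_fetch_xor(p, 1 << c, __ATOMIC_RELAXED) & (1 << c)`. Below is a minimal C sketch of the `mask01` variant above, where the same masked shift amount feeds both the atomic operation and the test. The name `toggle_and_test_bit` is hypothetical and the GCC/Clang `__atomic` builtins are assumed. The right-hand column checks this IR as `lock btcq; setb; shlq` instead of a `cmpxchg` loop; the sketch only illustrates the source-level pattern and does not assert what any particular compiler version produces.

```c
#include <stdint.h>

/* Hypothetical helper: atomically flip bit (c & 63) of *v and return
 * the previous value of that bit, still in its original position. */
static uint64_t toggle_and_test_bit(uint64_t *v, uint64_t c) {
    uint64_t mask = UINT64_C(1) << (c & 63);   /* %rem, then %shl in the IR */
    /* fetch_xor returns the value held before the flip. */
    return __atomic_fetch_xor(v, mask, __ATOMIC_RELAXED) & mask;
}
```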
[... 1,307 lines not shown ...]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_and_64_gpr_val:		; X64-LABEL: atomic_shl1_and_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %ecx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movq $-2, %rsi		; X64-NEXT: lock btrq %rcx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: rolq %cl, %rsi		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: movq (%rdi), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB146_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rcx
; X64-NEXT: andq %rsi, %rcx
; X64-NEXT: lock cmpxchgq %rcx, (%rdi)
; X64-NEXT: jne .LBB146_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i64 1, %c		%shl = shl nuw i64 1, %c
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
ret i64 %and		ret i64 %and
}		}
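
The `and` cases clear a bit rather than toggling it: the atomic operand is the complement of the single-bit mask (`%not` in the IR), while the test still uses the mask itself. A minimal C sketch of `atomic_shl1_and_64_gpr_val` follows, under the assumptions that the caller guarantees `c < 64` and that GCC/Clang `__atomic` builtins are available; `clear_and_test_bit` is a hypothetical name. The right-hand column checks this as `lock btrq; setb; shlq`.

```c
#include <stdint.h>

/* Hypothetical helper: atomically clear bit c of *v and return the
 * previous value of that bit, still in its original position. */
static uint64_t clear_and_test_bit(uint64_t *v, uint64_t c) {
    uint64_t mask = UINT64_C(1) << c;          /* assumption: c < 64 */
    /* AND with ~mask clears only bit c; the result is the old value. */
    return __atomic_fetch_and(v, ~mask, __ATOMIC_RELAXED) & mask;
}
```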
[... 188 lines not shown ...]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_and_64_gpr_val:		; X64-LABEL: atomic_shl1_small_mask_and_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: andb $31, %cl		; X64-NEXT: andl $31, %ecx
; X64-NEXT: movl $1, %edx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: lock btrq %rcx, (%rdi)
; X64-NEXT: movq $-2, %rsi		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: rolq %cl, %rsi		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: movq (%rdi), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB149_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rcx
; X64-NEXT: andq %rsi, %rcx
; X64-NEXT: lock cmpxchgq %rcx, (%rdi)
; X64-NEXT: jne .LBB149_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andl %edx, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 31		%rem = and i64 %c, 31
%shl = shl nuw nsw i64 1, %rem		%shl = shl nuw nsw i64 1, %rem
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
ret i64 %and		ret i64 %and
[... 49 lines not shown ...]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask0_and_64_gpr_val:		; X64-LABEL: atomic_shl1_mask0_and_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movq $-2, %rdx		; X64-NEXT: andl $63, %ecx
; X64-NEXT: rolq %cl, %rdx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: lock btrq %rcx, (%rdi)
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: setb %al
; X64-NEXT: .LBB150_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rsi
; X64-NEXT: andq %rdx, %rsi
; X64-NEXT: lock cmpxchgq %rsi, (%rdi)
; X64-NEXT: jne .LBB150_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $1, %edx
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl = shl nuw i64 1, %rem		%shl = shl nuw i64 1, %rem
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%shl1 = shl nuw i64 1, %c		%shl1 = shl nuw i64 1, %c
%and = and i64 %0, %shl1		%and = and i64 %0, %shl1
[50 lines not shown]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask1_and_64_gpr_val:		; X64-LABEL: atomic_shl1_mask1_and_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movq $-2, %rdx		; X64-NEXT: andl $63, %ecx
; X64-NEXT: rolq %cl, %rdx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: lock btrq %rcx, (%rdi)
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: setb %al
; X64-NEXT: .LBB151_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rsi
; X64-NEXT: andq %rdx, %rsi
; X64-NEXT: lock cmpxchgq %rsi, (%rdi)
; X64-NEXT: jne .LBB151_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: movl $1, %edx
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i64 1, %c		%shl = shl nuw i64 1, %c
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl1 = shl nuw i64 1, %rem		%shl1 = shl nuw i64 1, %rem
%and = and i64 %0, %shl1		%and = and i64 %0, %shl1
[43 lines not shown]
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_and_64_gpr_val:		; X64-LABEL: atomic_shl1_mask01_and_64_gpr_val:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movq %rsi, %rcx
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %ecx
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: xorl %eax, %eax
; X64-NEXT: movq $-2, %rsi		; X64-NEXT: lock btrq %rcx, (%rdi)
		; X64-NEXT: setb %al
; X64-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NEXT: rolq %cl, %rsi		; X64-NEXT: shlq %cl, %rax
; X64-NEXT: movq (%rdi), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB152_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rcx
; X64-NEXT: andq %rsi, %rcx
; X64-NEXT: lock cmpxchgq %rcx, (%rdi)
; X64-NEXT: jne .LBB152_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: andq %rdx, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl = shl nuw i64 1, %rem		%shl = shl nuw i64 1, %rem
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
ret i64 %and		ret i64 %and
[758 lines not shown]
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_and_64_gpr_brnz:		; X64-LABEL: atomic_shl1_and_64_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %eax
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: lock btrq %rax, (%rdi)
; X64-NEXT: movq $-2, %rsi		; X64-NEXT: jae .LBB162_1
; X64-NEXT: rolq %cl, %rsi		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: movq (%rdi,%rsi,8), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB162_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %r8
; X64-NEXT: andq %rsi, %r8
; X64-NEXT: lock cmpxchgq %r8, (%rdi)
; X64-NEXT: jne .LBB162_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testq %rdx, %rax
; X64-NEXT: je .LBB162_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movq (%rdi,%rcx,8), %rax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB162_3:		; X64-NEXT: .LBB162_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i64 1, %c		%shl = shl nuw i64 1, %c
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
%tobool.not = icmp eq i64 %and, 0		%tobool.not = icmp eq i64 %and, 0
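
The rest of this function's IR is not shown here, but judging from the CHECK lines (the indexed load `movq (%rdi,%rsi,8), %rax` on the taken path and `movl $123, %eax` on the other), the `_brnz` tests branch on the tested bit. The C sketch below is an inferred approximation under that assumption; the function name `clear_bit_then_load` is hypothetical, and the `c < 64` precondition is added only so the shift is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (inferred from the CHECK lines, not from the patch):
 * atomically clear bit c of v[0]; if the bit was previously set, return
 * v[c], otherwise return the constant 123. */
uint64_t clear_bit_then_load(uint64_t *v, uint64_t c) {
    assert(v != 0 && c < 64);
    uint64_t mask = (uint64_t)1 << c;                          /* %shl */
    uint64_t old = __atomic_fetch_and(v, ~mask, __ATOMIC_RELAXED);
    if (old & mask)      /* %tobool.not is false: take the if.then path */
        return v[c];
    return 123;
}
```

In the right-hand column the bit test and the branch collapse into `lock btrq` followed by `jae`, so the old value never has to be materialized in a register.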
[263 lines not shown]
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_small_mask_and_64_gpr_brnz:		; X64-LABEL: atomic_shl1_small_mask_and_64_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: andl $31, %esi
; X64-NEXT: andl $31, %ecx		; X64-NEXT: lock btrq %rsi, (%rdi)
; X64-NEXT: movl $1, %edx		; X64-NEXT: jae .LBB165_1
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: movq $-2, %rsi		; X64-NEXT: movq (%rdi,%rsi,8), %rax
; X64-NEXT: rolq %cl, %rsi
; X64-NEXT: movq (%rdi), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB165_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %r8
; X64-NEXT: andq %rsi, %r8
; X64-NEXT: lock cmpxchgq %r8, (%rdi)
; X64-NEXT: jne .LBB165_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testl %edx, %eax
; X64-NEXT: je .LBB165_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movq (%rdi,%rcx,8), %rax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB165_3:		; X64-NEXT: .LBB165_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 31		%rem = and i64 %c, 31
%shl = shl nuw nsw i64 1, %rem		%shl = shl nuw nsw i64 1, %rem
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
[70 lines not shown]
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask0_and_64_gpr_brnz:		; X64-LABEL: atomic_shl1_mask0_and_64_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movq $-2, %rdx		; X64-NEXT: andl $63, %eax
; X64-NEXT: rolq %cl, %rdx		; X64-NEXT: lock btrq %rax, (%rdi)
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: jae .LBB166_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB166_1: # %atomicrmw.start		; X64-NEXT: movq (%rdi,%rsi,8), %rax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rsi
; X64-NEXT: andq %rdx, %rsi
; X64-NEXT: lock cmpxchgq %rsi, (%rdi)
; X64-NEXT: jne .LBB166_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: btq %rcx, %rax
; X64-NEXT: jae .LBB166_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movq (%rdi,%rcx,8), %rax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB166_3:		; X64-NEXT: .LBB166_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl = shl nuw i64 1, %rem		%shl = shl nuw i64 1, %rem
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%shl1 = shl nuw i64 1, %c		%shl1 = shl nuw i64 1, %c
[71 lines not shown]
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask1_and_64_gpr_brnz:		; X64-LABEL: atomic_shl1_mask1_and_64_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movq $-2, %rdx		; X64-NEXT: andl $63, %eax
; X64-NEXT: rolq %cl, %rdx		; X64-NEXT: lock btrq %rax, (%rdi)
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: jae .LBB167_1
; X64-NEXT: .p2align 4, 0x90		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: .LBB167_1: # %atomicrmw.start		; X64-NEXT: movq (%rdi,%rsi,8), %rax
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %rsi
; X64-NEXT: andq %rdx, %rsi
; X64-NEXT: lock cmpxchgq %rsi, (%rdi)
; X64-NEXT: jne .LBB167_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: btq %rcx, %rax
; X64-NEXT: jae .LBB167_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movq (%rdi,%rcx,8), %rax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB167_3:		; X64-NEXT: .LBB167_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%shl = shl nuw i64 1, %c		%shl = shl nuw i64 1, %c
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl1 = shl nuw i64 1, %rem		%shl1 = shl nuw i64 1, %rem
[64 lines not shown]
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: atomic_shl1_mask01_and_64_gpr_brnz:		; X64-LABEL: atomic_shl1_mask01_and_64_gpr_brnz:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rsi, %rcx		; X64-NEXT: movl %esi, %eax
; X64-NEXT: movl $1, %edx		; X64-NEXT: andl $63, %eax
; X64-NEXT: shlq %cl, %rdx		; X64-NEXT: lock btrq %rax, (%rdi)
; X64-NEXT: movq $-2, %rsi		; X64-NEXT: jae .LBB168_1
; X64-NEXT: rolq %cl, %rsi		; X64-NEXT: # %bb.2: # %if.then
; X64-NEXT: movq (%rdi), %rax		; X64-NEXT: movq (%rdi,%rsi,8), %rax
; X64-NEXT: .p2align 4, 0x90
; X64-NEXT: .LBB168_1: # %atomicrmw.start
; X64-NEXT: # =>This Inner Loop Header: Depth=1
; X64-NEXT: movq %rax, %r8
; X64-NEXT: andq %rsi, %r8
; X64-NEXT: lock cmpxchgq %r8, (%rdi)
; X64-NEXT: jne .LBB168_1
; X64-NEXT: # %bb.2: # %atomicrmw.end
; X64-NEXT: testq %rdx, %rax
; X64-NEXT: je .LBB168_3
; X64-NEXT: # %bb.4: # %if.then
; X64-NEXT: movq (%rdi,%rcx,8), %rax
; X64-NEXT: retq		; X64-NEXT: retq
; X64-NEXT: .LBB168_3:		; X64-NEXT: .LBB168_1:
; X64-NEXT: movl $123, %eax		; X64-NEXT: movl $123, %eax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = and i64 %c, 63		%rem = and i64 %c, 63
%shl = shl nuw i64 1, %rem		%shl = shl nuw i64 1, %rem
%not = xor i64 %shl, -1		%not = xor i64 %shl, -1
%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8		%0 = atomicrmw and ptr %v, i64 %not monotonic, align 8
%and = and i64 %0, %shl		%and = and i64 %0, %shl
[486 lines not shown]

Revision Contents

Diff 486432

llvm/include/llvm/IR/IntrinsicsX86.td

llvm/lib/Target/X86/X86ISelLowering.h

llvm/lib/Target/X86/X86ISelLowering.cpp

llvm/lib/Target/X86/X86InstrCompiler.td

llvm/test/CodeGen/X86/atomic-rm-bit-test.ll
