This opens up numerous possibilities for optimizations.
Diff Detail
Repository: rG LLVM Github Monorepo
I agree that we shouldn't have to do anything too fancy here. Ideally, DAGCombiner would take care of this generally, so we're not optimizing this one special-case pattern that includes memcmp while ignoring the general case.
Did you investigate what that would take? We get this in instcombine, so we do have some set of peepholes to potentially copy.
Alternatively, what do you think about making ExpandMemCmp a late IR optimization pass like the vectorizer passes?
When I was looking at memcmp optimizations initially, it was clear that in some larger examples we would benefit from running instcombine/CSE after the expansion. By moving this pass up, we wouldn't have to worry about adding optimizations (like this patch) to the backend because it will all be done for us in existing IR optimization passes.
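To make the benefit concrete, here is a minimal C++ sketch (hypothetical names, not the pass's actual output) of the kind of straight-line code ExpandMemCmp conceptually produces for a small constant-size equality check; once the libcall becomes plain loads and an integer compare like this, generic IR passes such as instcombine and EarlyCSE can optimize across it:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical illustration: memcmp(a, b, 2) == 0 expanded into a single
// 16-bit load of each side plus one integer compare. std::memcpy models the
// unaligned-safe load the expansion would emit.
static bool eq2_expanded(const unsigned char *a, const unsigned char *b) {
  std::uint16_t va, vb;
  std::memcpy(&va, a, sizeof va);
  std::memcpy(&vb, b, sizeof vb);
  return va == vb;
}
```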
Alternatively, what do you think about making ExpandMemCmp a late IR optimization pass like the vectorizer passes?
That would be ideal, but unfortunately ExpandMemCmp requires access to TargetLowering (for TargetLowering::MaxLoadsPerMemcmp, which is consistent with what happens for memcpy and memset). On the other hand, we already have some MemCmpExpansionOptions in TargetTransformInfo, which also knows about memcpy (e.g. getMemcpyLoopLoweringType(), getMemcpyLoopResidualLoweringType(), and getMemcpyCost()), so we could move it there and make ExpandMemCmp a late IR optimization pass. I'll create a patch with this approach.
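As a sketch of what the TargetTransformInfo-based knobs could look like, here is a simplified, hypothetical mirror of MemCmpExpansionOptions (the real struct lives in LLVM and has more fields); it captures the two decisions the expansion needs from the target: which load widths are legal, and how many loads are worth inlining before falling back to the libcall:

```cpp
#include <vector>

// Hypothetical simplified mirror of TTI's MemCmpExpansionOptions; field
// names follow the real struct, but this is a sketch, not the interface.
struct MemCmpExpansionOptions {
  unsigned MaxNumLoads = 0;         // budget, like TargetLowering::MaxLoadsPerMemcmp
  std::vector<unsigned> LoadSizes;  // legal load widths, largest first
};

// Greedy plan the expansion conceptually uses: cover Size bytes with the
// largest legal loads, then check the target's load budget.
static bool canExpand(const MemCmpExpansionOptions &O, unsigned Size,
                      unsigned &NumLoads) {
  NumLoads = 0;
  for (unsigned W : O.LoadSizes)
    while (Size >= W) {
      Size -= W;
      ++NumLoads;
    }
  return Size == 0 && NumLoads <= O.MaxNumLoads;
}
```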
For reference (and some potential unit and end-to-end test ideas to show wins):
https://bugs.llvm.org/show_bug.cgi?id=36421
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
There is still some test fixing to do for Power, but before I do that I'd like to get your opinion on the approach, in particular regarding the pass placement (I pretty much placed it randomly here).
The nice thing here is that this discovered many more optimization opportunities (e.g. the length2 tests). I attached some benchmark data + fixture to the patch.
There is only one regression for N==24 in the equality case (this can be seen in the length24_eq test).
I've added a test for this one, but note that it still requires -extra-vectorizer-passes because there is no EarlyCSE pass after the function simplification pipeline. I could leave it like this or add one unconditionally, WDYT?
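To make the length24_eq pattern concrete, here is a hedged C++ model of a typical zero-equality expansion for N==24: three 8-byte loads per side whose XOR differences are OR-combined so a single test decides equality (the names are illustrative, and the widths the pass actually picks are target-dependent):

```cpp
#include <cstdint>
#include <cstring>

// Illustrative model of an equality-only memcmp(a, b, 24) == 0 expansion:
// OR together the XOR differences of three 64-bit words, then do one
// zero test. Follow-on passes like EarlyCSE can then merge adjacent loads.
static bool eq24_expanded(const unsigned char *a, const unsigned char *b) {
  std::uint64_t d = 0;
  for (int off = 0; off < 24; off += 8) {
    std::uint64_t va, vb;
    std::memcpy(&va, a + off, 8);  // unaligned-safe 8-byte load
    std::memcpy(&vb, b + off, 8);
    d |= va ^ vb;
  }
  return d == 0;
}
```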
It's next to the memcpy optimization pass, so that seems reasonable to me. But I don't claim any expertise on pass management/placement, so let's add more potential reviewers.
Inline summary for those folks: we'd like to move memcmp expansion from codegen to late in the IR pipeline (still under the control of a target hook) because that can unlock follow-on optimizations for CSE and instcombine. Examples:
https://bugs.llvm.org/show_bug.cgi?id=36421
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
The nice thing here is that this discovered many more optimization opportunities (e.g. the length2 tests). I attached some benchmark data + fixture to the patch.
There is only one regression, for N==24 in the equality case (this can be seen in the length24_eq test). I've added a test for this one, but note that it still requires -extra-vectorizer-passes because there is no EarlyCSE pass after the function simplification pipeline. I could leave it like this or add one unconditionally, WDYT?
I haven't seen any complaints about the cost of EarlyCSE, so my initial guess is just add it within addMemcmpPasses(). If there's a way to predicate running it on successful memcmp expansion, that would be nice.
Finally (and this could be a follow-on change), we should make the corresponding change to the new pass manager too, so we don't lose these optimizations when we flip that setting.
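The gating idea could be modeled like this (an entirely hypothetical pass-driver sketch, not LLVM's actual pass-manager API): let the expansion report whether it changed anything, and schedule the extra EarlyCSE only when it did:

```cpp
#include <string>
#include <vector>

// Hypothetical model: runExpandMemCmp returns true iff it rewrote a call,
// and the driver adds the cleanup EarlyCSE run only in that case.
static bool runExpandMemCmp(int numMemCmpCalls,
                            std::vector<std::string> &log) {
  log.push_back("ExpandMemCmp");
  return numMemCmpCalls > 0;  // "changed" iff something was expanded
}

static std::vector<std::string> buildPipeline(int numMemCmpCalls) {
  std::vector<std::string> log;
  if (runExpandMemCmp(numMemCmpCalls, log))
    log.push_back("EarlyCSE");  // cleanup is conditional on a change
  return log;
}
```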
llvm/test/CodeGen/X86/memcmp.ll
Line 1: It's great to see the end-to-end improvements here in the review, but I think we should not include this in the final commit (assuming we proceed). We should have IR-only regression tests for the passes in question. We could also include 'opt -O2' phase-ordering tests within 'test/Transforms/PhaseOrdering/'. To make sure we have the complete opt+llc wins, we could add some small memcmp/bcmp benchmarks to test-suite. (I've never done that, so we can ask others for advice on how to do that.)
- Move all tests out of CodeGen.
- Update PowerPC tests.
- Add EarlyCSE to addMemcmpPasses().
The code changes look like what I expected, but I'd still like to hear from at least 1 other reviewer that we are not violating some high-level principle.
llvm/test/Transforms/ExpandMemCmp/PowerPC/memcmpIR.ll
Lines 69–71: Why are we generating bswap for a big-endian target?
llvm/test/Transforms/ExpandMemCmp/PowerPC/memcmpIR.ll
Lines 69–71: Thanks for the catch. We're not, but contrary to llc, opt does not seem to get the data layout from the target, so I'm now explicitly specifying the data layout on the RUN line.
The general approach of expanding memcmp in opt makes sense, I think. The placement seems a little on the early side, but that's probably okay given we don't really have interesting optimizations for memcmp calls besides expanding them.
Not sure about the extra EarlyCSE invocation; it's not free.
llvm/test/Other/opt-O2-pipeline.ll
Line 143: How hard would it be to preserve the domtree?
Thanks Eli.
llvm/test/Other/opt-O2-pipeline.ll
Line 143: I think it should not be too hard because we merely add blocks in a diamond in the middle of the graph, so the change is quite local. I'll have a look at that.
@courbet Please can you ensure you have EXPENSIVE_CHECKS enabled in all your builds before going any further - you've been breaking the buildbots for well over a week now and you still haven't fixed the underlying issue.
Thanks for the suggestion (I can't believe I lived without this for so long) - I could finally reproduce and I have a fix (D62193).
llvm/include/llvm/Analysis/TargetTransformInfo.h
Lines 622–623 (On Diff #201689): Can we take/use the Function's OptSize attribute as a preliminary step for this patch to reduce the number of diffs?
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll
Lines 750–755: Why/how are we checking x86 asm in an IR transform test file? I don't think there's a good way to do end-to-end testing now within the regression test dir. We would be better off creating real end-to-end (C source --> x86 asm) tests within test-suite. That way, we can be sure that no passes anywhere in the pipeline are interfering with our memcmp patterns.
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll
Lines 750–755: Right, I think I messed up updating the tests, sorry. The intent was to check for IR here. Will fix.
LGTM - see inline for a few more nits.
I encourage adding small memcmp tests to test-suite as a follow-up, so we know that things won't break going forward.
I haven't seen the DomTreeUpdater API before now, so I'm assuming the tests are verifying that we made the correct updates (and the earlier rL361239 has survived in trunk).
llvm/lib/Transforms/Scalar/ExpandMemCmp.cpp
Lines 308–314: Formatting is off here: a line that fits in 80 columns is split, and a line that doesn't fit is not split.
Line 435: Formatting: split line.
Line 509: Formatting: split line.
Line 652: Formatting: split line.
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll
Line 6: Update stale comment:
llvm/lib/Transforms/Scalar/ExpandMemCmp.cpp
Lines 252–253: Unused parameter?
I encourage adding small memcmp tests to test-suite as a follow-up, so we know that things won't break going forward.
I'm working on it.
I haven't seen the DomTreeUpdater API before now, so I'm assuming the tests are verifying that we made the correct updates (and the earlier rL361239 has survived in trunk).
That was my first go at it too. Tests do break when I remove the updates, so I'm guessing they do :D
For information, I had to roll this back as it breaks the sanitizers. Sanitizer passes insert nobuiltin attributes on all memcmp calls to prevent expansion while still being able to intercept them (maybeMarkSanitizerLibraryCallNoBuiltin), and these passes are added last in opt:
PMBuilder.addExtension(PassManagerBuilder::EP_OptimizerLast, addMemorySanitizerPass);
so they now run after ExpandMemCmp.
The fix would be to move addMemcmpPasses(MPM); after addExtensionsToPM(EP_ScalarOptimizerLate, MPM);. I'll do this after I get the test-suite benchmarks in, so that we can validate the interactions.
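The interaction can be modeled roughly as follows (a hypothetical toy, not the sanitizers' real code): the sanitizer marks each memcmp call nobuiltin so it can intercept the real library call, and the expander must respect that marker, which only works if the sanitizer passes run first:

```cpp
#include <vector>

// Toy model of a memcmp call site and the two passes whose ordering matters.
struct CallSite {
  bool noBuiltin = false;
  bool expanded = false;
};

static void sanitizerPass(std::vector<CallSite> &calls) {
  // Roughly what maybeMarkSanitizerLibraryCallNoBuiltin does: tag every
  // memcmp call so later passes leave the libcall in place.
  for (CallSite &c : calls)
    c.noBuiltin = true;
}

static void expandMemCmpPass(std::vector<CallSite> &calls) {
  // The expander must skip calls the sanitizer has already claimed.
  for (CallSite &c : calls)
    if (!c.noBuiltin)
      c.expanded = true;
}
```

Running expandMemCmpPass before sanitizerPass (the broken ordering) expands calls the sanitizer needed to intercept; the proposed fix is exactly the reverse order.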
test-suite benchmarks: https://reviews.llvm.org/D64082
Full benchmark results for reference (haswell, 10 runs):
The cells highlighted in blue are the statistically significant ones.
I rolled back due to some bot breakages that I was not able to see were related or not. I have not come back to it yet, will do.
- Move mem passes after sanitizer passes.
- Move codegen test llvm/test/CodeGen/AArch64/bcmp-inline-small.ll to Transforms/ now that AArch64 expands memcmps.
- Rebase on r371397.
Sorry, this change broke the buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/17395, I reverted it (r371507). Please ensure that you run ninja check-all before committing.
We also detected some compile time regressions in the CTMark subset of the test suite, with lencod regressing by around 4-5%. I haven't fully bisected but the commit list was short and this seemed to be the only suspicious change.
I'm going to abandon it for now. This interferes with the sanitizers in hard-to-fix ways, and I don't have the bandwidth to come back to it. Sorry.