This opens up numerous possibilities for optimizations.
Event Timeline
I agree that we shouldn't have to do anything too fancy here. Ideally, DAGCombiner would take care of this generally, so we're not optimizing this one special-case pattern that includes memcmp while ignoring the general case.
Did you investigate what that would take? We get this in instcombine, so we do have some set of peepholes to potentially copy.
Alternatively, what do you think about making ExpandMemCmp a late IR optimization pass like the vectorizer passes?
When I was looking at memcmp optimizations initially, it was clear that in some larger examples we would benefit from running instcombine/CSE after the expansion. By moving this pass up, we wouldn't have to worry about adding optimizations (like this patch) to the backend, because it would all be done for us by existing IR optimization passes.
Alternatively, what do you think about making ExpandMemCmp a late IR optimization pass like the vectorizer passes?
That would be ideal, but unfortunately ExpandMemCmp requires access to TargetLowering (for TargetLowering::MaxLoadsPerMemcmp, which is consistent with what happens for memcpy and memset). On the other hand, we already have some MemCmpExpansionOptions in TargetTransformInfo, which also knows about memcpy (e.g. getMemcpyLoopLoweringType(), getMemcpyLoopResidualLoweringType() and getMemcpyCost()), so we could move the limit there and turn ExpandMemCmp into a late IR optimization pass. I'll create a patch with this approach.
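For illustration, roughly what such a target-independent option bag could carry once the limit moves out of TargetLowering (the field names below are illustrative, not the exact TargetTransformInfo API):

```
#include "llvm/ADT/SmallVector.h"

// Sketch only: ExpandMemCmp would query something like this through TTI
// instead of reaching into TargetLowering.
struct MemCmpExpansionOptions {
  // Upper bound on the number of loads the expansion may emit, playing the
  // role of TargetLowering::MaxLoadsPerMemcmp (0 means "do not expand").
  unsigned MaxNumLoads = 0;
  // Load widths in bytes the target compares cheaply, largest first,
  // e.g. {8, 4, 2, 1} on a 64-bit target.
  llvm::SmallVector<unsigned, 4> LoadSizes;
};
```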
For reference (and some potential unit and end-to-end test ideas to show wins):
https://bugs.llvm.org/show_bug.cgi?id=36421
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
There is still some test fixing to do for Power, but before I do that I'd like to get your opinion on the approach, in particular regarding the pass placement (I pretty much placed it randomly here).
The nice thing here is that this discovered many more optimization opportunities (e.g. the length2 tests). I attached some benchmark data + fixture to the patch.
There is only one regression for N==24 in the equality case (this can be seen in the length24_eq test).
I've added a test for this one, but note that it still requires -extra-vectorizer-passes because there is no EarlyCSE pass after the function simplification pipeline. I could leave it like this or add one unconditionally, WDYT?
It's next to the memcpy optimization pass, so that seems reasonable to me. But I don't claim any expertise on pass management/placement, so let's add more potential reviewers.
Inline summary for those folks: we'd like to move memcmp expansion from codegen to late in the IR pipeline (still under the control of a target hook) because that can unlock follow-on optimizations for CSE and instcombine. Examples:
https://bugs.llvm.org/show_bug.cgi?id=36421
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
The nice thing here is that this discovered many more optimization opportunities (e.g. the length2 tests). I attached some benchmark data + fixture to the patch.
There is only one regression for N==24 in the equality case (this can be seen in the length24_eq test). I've added a test for this one, but note that it still requires -extra-vectorizer-passes because there is no EarlyCSE pass after the function simplification pipeline. I could leave it like this or add one unconditionally, WDYT?
I haven't seen any complaints about the cost of EarlyCSE, so my initial guess is just add it within addMemcmpPasses(). If there's a way to predicate running it on successful memcmp expansion, that would be nice.
Finally (and this could be a follow-on change), we should make the corresponding change to the new pass manager too, so we don't lose these optimizations when we flip that setting.
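A minimal sketch of what that helper could look like with the legacy PassManagerBuilder (the helper name comes from the discussion above; createExpandMemCmpPass() is assumed here to be the IR pass's creation function):

```
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/Transforms/Scalar.h"

namespace llvm {
class FunctionPass;
// Assumed creation function for the IR-level ExpandMemCmp pass; the real
// name depends on how the pass is moved out of CodeGen.
FunctionPass *createExpandMemCmpPass();
} // namespace llvm

// Sketch, not the committed code: expand memcmp/bcmp calls, then run
// EarlyCSE to clean up the redundant loads the expansion exposes.
static void addMemcmpPasses(llvm::legacy::PassManagerBase &MPM) {
  MPM.add(llvm::createExpandMemCmpPass());
  MPM.add(llvm::createEarlyCSEPass()); // existing EarlyCSE factory
}
```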
llvm/test/CodeGen/X86/memcmp.ll | ||
---|---|---|
0 | It's great to see the end-to-end improvements here in the review, but I think we should not include this in the final commit (assuming we proceed). We should have IR-only regression tests for the passes in question. We could also include 'opt -O2' phase ordering tests within 'test/Transforms/PhaseOrdering/'. To make sure we have the complete opt+llc wins, we could add some small memcmp/bcmp benchmarks to test-suite. (I've never done that, so we can ask others for advice on how to do that.) |
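For concreteness, a phase-ordering test of that kind would typically be driven by a RUN line like this (a sketch; the exact CHECK lines depend on what the -O2 pipeline actually produces):

```
; RUN: opt -O2 -S < %s | FileCheck %s
; A small fixed-length memcmp should be expanded into loads and integer
; compares by the end of the -O2 pipeline, so no call should remain.
; CHECK-NOT: call {{.*}} @memcmp
```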
- Move all tests out of CodeGen.
- Update PowerPC tests.
- Add EarlyCSE to addMemcmpPasses().
The code changes look like what I expected, but I'd still like to hear from at least one other reviewer that we are not violating some high-level principle.
llvm/test/Transforms/ExpandMemCmp/PowerPC/memcmpIR.ll | ||
---|---|---|
69–71 | Why are we generating bswap for a big-endian target? |
llvm/test/Transforms/ExpandMemCmp/PowerPC/memcmpIR.ll | ||
---|---|---|
69–71 | Thanks for the catch. We're not, but unlike llc, opt does not seem to pick up the data layout from the target, so I'm now explicitly specifying the data layout on the RUN line. |
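For reference, a sketch of such a RUN line (the pass flag, triple, and big-endian layout string are assumptions about the test setup, not copied from the patch):

```
; RUN: opt -S -expandmemcmp -mtriple=powerpc64-unknown-linux-gnu \
; RUN:   -data-layout="E-m:e-i64:64-n32:64" < %s | FileCheck %s
```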
The general approach of expanding memcmp in opt makes sense, I think. The placement seems a little on the early side, but that's probably okay given we don't really have interesting optimizations for memcmp calls besides expanding them.
Not sure about the extra EarlyCSE invocation; it's not free.
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
141 | How hard would it be to preserve the domtree? |
Thanks Eli.
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
141 | I think it should not be too hard because we merely add blocks in a diamond in the middle of the graph, so the change is quite local. I'll have a look at that. |
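As a sketch of the kind of update this involves (illustrative names, not the patch's exact code), recording the newly inserted diamond edges with DomTreeUpdater keeps the tree valid without a full recompute:

```
#include "llvm/Analysis/DomTreeUpdater.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Dominators.h"

// Illustrative only: after splitting the block and inserting the compare
// block(s) of the memcmp "diamond", tell the updater which CFG edges were
// added so the dominator tree stays consistent.
static void recordDiamondEdges(llvm::DominatorTree &DT,
                               llvm::BasicBlock *Entry,
                               llvm::BasicBlock *LoadCmp,
                               llvm::BasicBlock *Tail) {
  llvm::DomTreeUpdater DTU(DT, llvm::DomTreeUpdater::UpdateStrategy::Lazy);
  DTU.applyUpdates({{llvm::DominatorTree::Insert, Entry, LoadCmp},
                    {llvm::DominatorTree::Insert, Entry, Tail},
                    {llvm::DominatorTree::Insert, LoadCmp, Tail}});
  // Pending updates are flushed when DTU is destroyed (or via flush()).
}
```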
@courbet Please can you ensure you have EXPENSIVE_CHECKS enabled in all your builds before going any further - you've been breaking the buildbots for well over a week now and you still haven't fixed the underlying issue.
Thanks for the suggestion (I can't believe I lived without this for so long) - I could finally reproduce and I have a fix (D62193).
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
622–624 | Can we take/use the Function's OptSize attribute as a preliminary step for this patch to reduce the number of diffs? |
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll | ||
---|---|---|
750–755 | Why/how are we checking x86 asm in an IR transform test file? I don't think there's a good way to do end-to-end testing within the regression test dir right now. We would be better off creating real end-to-end (C source --> x86 asm) tests within test-suite. That way, we can be sure that no passes anywhere in the pipeline are interfering with our memcmp patterns. |
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll | ||
---|---|---|
750–755 | Right, I think I messed up updating the tests, sorry. The intent was to check for IR here. Will fix. |
LGTM - see inline for a few more nits.
I encourage adding small memcmp tests to test-suite as a follow-up, so we know that things won't break going forward.
I haven't seen the DomTreeUpdater API before now, so I'm assuming the tests are verifying that we made the correct updates (and the earlier rL361239 has survived in trunk).
llvm/lib/Transforms/Scalar/ExpandMemCmp.cpp | ||
---|---|---|
299–302 | Formatting is off here - a line that fits within 80 columns is split, and a line that doesn't fit is not split. | |
419 | Formatting - split line. | |
490 | Formatting - split line. | |
628 | Formatting - split line. | |
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll | ||
6 | Update stale comment: |
llvm/lib/Transforms/Scalar/ExpandMemCmp.cpp | ||
---|---|---|
245 | Unused parameter? |
I encourage adding small memcmp tests to test-suite as a follow-up, so we know that things won't break going forward.
I'm working on it.
I haven't seen the DomTreeUpdater API before now, so I'm assuming the tests are verifying that we made the correct updates (and the earlier rL361239 has survived in trunk).
That was my first go at it too. Tests do break when I remove the updates, so I'm guessing they do :D
For reference, I had to roll this back as it breaks the sanitizers. Sanitizer passes insert nobuiltin attributes on all memcmp calls to prevent expansion while still being able to intercept them (maybeMarkSanitizerLibraryCallNoBuiltin), and these passes are added last in opt:
PMBuilder.addExtension(PassManagerBuilder::EP_OptimizerLast, addMemorySanitizerPass);
so they now run after ExpandMemCmp.
The fix would be to move addMemcmpPasses(MPM) after addExtensionsToPM(EP_ScalarOptimizerLate, MPM). I'll do this after I get the test-suite benchmarks in, so that we can validate the interactions.
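As context for why the ordering matters, a minimal sketch of the guard the expansion relies on (an assumption about the check, not the actual ExpandMemCmp code):

```
#include "llvm/IR/InstrTypes.h"

// Sketch: a memcmp call carrying the nobuiltin attribute (as added by
// maybeMarkSanitizerLibraryCallNoBuiltin) must be left alone so the
// sanitizer runtime can still intercept it. If ExpandMemCmp runs before
// the sanitizer passes, the call is already gone by the time they mark it.
static bool mayExpandMemCmpCall(const llvm::CallBase &Call) {
  return !Call.isNoBuiltin();
}
```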
test-suite benchmarks: https://reviews.llvm.org/D64082
Full benchmark results for reference (haswell, 10 runs):
The cells highlighted in blue are the statistically significant ones.
I rolled back due to some bot breakages that I could not tell were related or not. I have not come back to it yet; I will.
- Move mem passes after sanitizer passes.
- Move codegen test llvm/test/CodeGen/AArch64/bcmp-inline-small.ll to Transforms/ now that AArch64 expands memcmps.
- Rebase on r371397.
Sorry, this change broke the buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/17395, I reverted it (r371507). Please ensure that you run ninja check-all before committing.
We also detected some compile time regressions in the CTMark subset of the test suite, with lencod regressing by around 4-5%. I haven't fully bisected but the commit list was short and this seemed to be the only suspicious change.
I'm going to abandon it for now. This interferes with the sanitizers in hard-to-fix ways, and I don't have the bandwidth to come back to it. Sorry.