This is an archive of the discontinued LLVM Phabricator instance.

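Paths

llvm/lib/Analysis/ValueTracking.cpp
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
llvm/test/Transforms/InstCombine/memchr-10.ll
llvm/test/Transforms/InstCombine/memchr-9.ll
llvm/test/Transforms/InstCombine/memcmp-7.ll
llvm/test/Transforms/InstCombine/memcmp-8.ll
llvm/test/Transforms/InstCombine/memrchr-7.ll
llvm/test/Transforms/InstCombine/str-int-3.ll
llvm/test/Transforms/InstCombine/strcall-no-nul.ll
llvm/test/Transforms/InstCombine/strlen-9.ll
llvm/test/Transforms/InstCombine/strnlen-1.ll
llvm/test/Transforms/InstCombine/wcslen-1.ll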

Differential D128364

[InstCombine] Look through more casts when folding memchr and memcmp
ClosedPublic

Authored by msebor on Jun 22 2022, 11:00 AM.


Details

Reviewers

nikic
xbolva00
efriedma
lattner

Commits

rGe263a7670e28: [InstCombine] Look through more casts when folding memchr and memcmp

Summary

The memchr and memcmp folders fail for the results of pointer addition involving fractional offsets into constants of types other than i8, such as in

const int a[] = { 0x01010101, 0x02020202 };

void* f (void)
{
  return memchr((const char*)a + 1, 1, 7);
}

and similar examples involving structs and unions. This is due to what seems like an overly restrictive check in the getConstantDataArrayInfo that keeps the function from "looking through [all] bitcast instructions and geps" despite the comment documenting this intent.
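For reference, the value such a fold would have to produce can be checked by hand. The following is a minimal sketch, assuming a 4-byte int; `match_offset` is a hypothetical helper added only for illustration, and `f` is the summary's function adjusted so that it compiles as C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static_assert(sizeof(int) == 4, "the sketch assumes a 4-byte int");

// The constant from the summary. Every byte of a[0] is 0x01 and every
// byte of a[1] is 0x02, so the byte layout does not depend on endianness.
static const int a[] = { 0x01010101, 0x02020202 };

// Same call as f() in the summary: search 7 bytes starting at byte 1.
const void *f() {
  return std::memchr(reinterpret_cast<const char *>(a) + 1, 1, 7);
}

// Hypothetical helper: byte offset of the match from the start of a,
// or -1 if nothing was found.
std::ptrdiff_t match_offset() {
  const void *p = f();
  if (!p)
    return -1;
  return static_cast<const char *>(p) - reinterpret_cast<const char *>(a);
}
```

The first byte searched already equals 1, so the call can only return the address one byte into `a`; that is the constant expression a folder would be expected to substitute for the call.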

In conjunction with the recent enhancement to let all libcall folders work with subobjects of constants of arbitrary types (D125114), this change removes the limitation above, bringing the memchr and memcmp folders up to par with GCC. Tested by running make check-all on x86_64-linux.

(The code in getConstantDataArrayInfo could stand to be simplified by letting the function take a DataLayout argument. I stopped short of making that change in this patch to minimize the extent of the changes.)

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

msebor created this revision. Jun 22 2022, 11:00 AM

Herald added a project: Restricted Project. Jun 22 2022, 11:00 AM

Herald added subscribers: jsji, pengfei, hiraditya.

msebor requested review of this revision. Jun 22 2022, 11:00 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B171378: Diff 439083. Jun 22 2022, 11:01 AM

xbolva00 added a reviewer: efriedma. Jun 22 2022, 11:02 AM

efriedma added inline comments. Jun 22 2022, 11:50 AM

llvm/lib/Analysis/ValueTracking.cpp
4196	Maybe it makes sense to just drop the `dyn_cast<GlobalVariable>` here?
4287	Not sure I understand this change; why does it matter if the slice is zero-length here?

msebor added inline comments. Jun 22 2022, 12:24 PM

llvm/lib/Analysis/ValueTracking.cpp
4196	This is what I meant by the simplification opportunity (I think we discussed it with @nikic in another review but I should have made that clear here). As far as I can tell the `GlobalVariable` cast is necessary to get at the `DataLayout` and to compute the offset below. (Or is there some other way to get at it?)
4287	The change prevents treating past-the-end accesses to `char` arrays the same way as those to empty strings, such as in `strlen("123" + 4)`. This is exercised by `str-int-3.ll` that I added in D125114 based on my understanding of the convention that library calls that provably result in an out-of-bounds access are best left for sanitizers to report. (I think an argument can be made that folding such calls to zero is safer than counting on sanitizers, so I don't mind removing the check and adjusting the test if it's decided that's preferable.)
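The two treatments of a past-the-end offset discussed here can be modeled in isolation. This is only a sketch: the function names are hypothetical, and the model keeps nothing but the element count of the constant and the offset into it. The two comparisons mirror the `Length <= Offset` and `Length < Offset ? 0 : Length - Offset` expressions in the ValueTracking.cpp diff of this revision.

```cpp
#include <cassert>
#include <cstdint>

// Policy 1 (bail): fail when the offset is at or past the end, so the
// library call stays in the program for a sanitizer to report.
bool remaining_or_bail(uint64_t length, uint64_t offset, uint64_t &remaining) {
  if (length <= offset)
    return false;
  remaining = length - offset;
  return true;
}

// Policy 2 (fold): clamp to an empty slice, so the caller can replace
// even an out-of-bounds call with a simple constant.
uint64_t remaining_or_empty(uint64_t length, uint64_t offset) {
  return length < offset ? 0 : length - offset;
}
```

In this model, `"123"` has four elements counting the terminating nul, so offset 4 is one past the end: the first policy reports failure, while the second yields an empty slice, which a string folder would then see as an empty string.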

nikic added inline comments. Jun 23 2022, 8:27 AM

llvm/lib/Analysis/ValueTracking.cpp
4196	If you really want to avoid the DataLayout argument, you can do something like `dyn_cast<GlobalVariable>(getUnderlyingObject(V))`, take the DataLayout from there, and then use that to feed the `stripAndAccumulateConstantOffsets()` call. Otherwise this is just shifting the problem to a `GEP + bitcast + GEP` sequence. (And it's worth noting that with opaque pointers the casts being skipped here don't exist anyway.)

efriedma added inline comments. Jun 23 2022, 11:11 AM

llvm/lib/Analysis/ValueTracking.cpp
4287	If we want getConstantStringInfo(TrimAtNul=true) to always fail if it doesn't trim a nul, we could do that, but only failing when we don't trim a nul and Slice.Array is null doesn't really make sense to me.

lattner resigned from this revision. Jun 23 2022, 1:41 PM

Have getConstantStringInfo fail for empty array slices when TrimAtNul is set.
Have GetStringLengthH fail for empty array slices.
Gracefully handle empty array slices in memchr and memrchr folders.
Add tests.

llvm/lib/Analysis/ValueTracking.cpp
4196	Thanks for the pointer to `getUnderlyingObject`. Unless there is a form of the function that also computes the aggregate GEP offset I don't think using it here would be appropriate since the rest of the function then has to (again) iterate over the same chain of casts and GEPs to compute the offset. (Would enhancing `getUnderlyingObject` to compute the offset be appropriate? FWIW, GCC has at least two utility functions that do both.) I'm not opposed to adding the `DataLayout` argument (quite the contrary), but as I explained, I avoided it in this patch to keep it from growing too big. If making this small incremental improvement first is a problem I can instead start by submitting a change to add the new argument. Let me know your preference. Opaque pointers obviously obviate the casts, so once typed pointers completely disappear the code can be simplified. The removal of the `FirstIdx` test will still be necessary to handle these use cases either way but that can be done then. Would you prefer to defer this whole patch until then?
4287	I've made the change you suggest in the updated patch. It has a fairly pervasive impact on the couple of dozen or so clients of the function. There are no tests for this case so I've developed and added a subset of them in the updated patch. It was a nontrivial effort whose goal is orthogonal to this enhancement and that I think would have been better handled separately and by someone who's less ambivalent about the benefits of doing this than me. (The test coverage would ideally be extended to all supported functions but I leave that for some other time.)

Harbormaster completed remote builds in B171897: Diff 439819. Jun 24 2022, 11:48 AM

nikic added inline comments. Jun 24 2022, 12:28 PM

llvm/lib/Analysis/ValueTracking.cpp
4196	stripAndAccumulateConstantOffsets() is the function that computes aggregate GEP offsets. getUnderlyingObject() is just a suggestion on how you can fetch the DataLayout without passing an argument. Basically the start of this function (down to line 4222) would become something like this:

  auto GV = dyn_cast<GlobalVariable>(getUnderlyingObject(V));
  if (!GV || !GV->isConstant() || !GV->hasDefinitiveInitializer())
    return false; // Not based on constant global.

  const DataLayout &DL = GV->getParent()->getDataLayout();
  APInt Offset(DL.getIndexTypeSizeInBits(V->getType()), 0);
  if (GV != V->stripAndAccumulateConstantOffsets(DL, Offset,
                                                 /*AllowNonInbounds*/ true))
    return false; // Could not determine offset from GV.

4287	Possibly I'm misreading @efriedma's comment, but I think the suggestion here was to just drop the change you originally did from the patch. This doesn't seem to have any direct relation to the bitcast stripping part. (For what it's worth, I think I've come around on the sanitizer issue from "we should accommodate sanitizers if it doesn't cost any effort" to "we should ignore sanitizers completely and optimize as aggressively as possible", because these little checks are not a principled solution anyway, but they always cause a lot of discussion about whether some particular case should be special-cased or not.)

Changes in revision 3 of the patch:

return a conservative result from getConstantDataArrayInfo for past-the-end offsets,
add comments to getConstantStringInfo and GetStringLengthH explaining the decision to return a conservative result for out-of-bounds offsets,
fold empty sequences in optimizeMemChr and optimizeMemRChr,
add new and adjust existing tests.

llvm/lib/Analysis/ValueTracking.cpp
4196	I'm pretty sure (I think) I understood what you meant. My point is that a call to `getUnderlyingObject` has a linear complexity in the number of GEPs/casts, and so would a fully general recursive call to `getConstantDataArrayInfo`. They both walk the IR to find the underlying declaration. Calling the former from the latter would make the latter do double the work, with quadratic complexity in the number of GEPs. The complexity could be kept linear by adding a `DataLayout` argument to `getConstantDataArrayInfo` and having `getUnderlyingObject` also accumulate the offset from each GEP it strips on the way to the declaration of the object (it too would need a `DataLayout` argument). Both changes seem useful to me independently of each other (and as I mentioned, a precedent for the `getUnderlyingObject` change is GCC's `get_ref_base_and_extent` utility function, among others). But until at least the first is done, this patch handles a single cast + GEP + cast chain in O(N). Alternatively, I can add the `DataLayout` argument to `getConstantDataArrayInfo` first, and handle in O(N) arbitrarily long chains of cast/GEP sequences. Having said all that, I have a feeling we might be talking past each other and this will not be the end of it.
4287	Okay, given that, I'm comfortable removing both the check and the `str-int-3.ll` test case that otherwise fails. I've also updated the new `strcall-no-nul.ll` test to reflect the current behavior on trunk, and in addition removed the other checks I added to verify the opposite strategy. (Either way, being more consistent about this than the initial patch leads to churn that's strictly outside the scope of this enhancement.) As an aside, it would be helpful to capture somewhere the decision to fold even in spite of the effect on sanitizers so that we can point to it when the question comes up again in the future.

nikic added inline comments. Jun 27 2022, 2:28 PM

llvm/lib/Analysis/ValueTracking.cpp
4196	There shouldn't be any quadratic complexity -- when using getUnderlyingObject/stripAndAccumulateConstantOffsets there is no need for getConstantDataArrayInfo itself to be recursive, all the offsets are accumulated within a single call. You are right that there is some redundant work going on in that getUnderlyingObject will first find the base object, and then stripAndAccumulateConstantOffsets will accumulate offsets down to it, but it's still a linear walk, repeated twice, so complexity is still O(n). (If we want to really get down to it, doing two walks will actually be faster in practice, because getUnderlyingObject is significantly faster than stripAndAccumulateConstantOffsets, which means that we can quickly discard any cases that aren't based on a constant GV. This is just a side note, we generally wouldn't specifically optimize for that.)
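The complexity argument can be illustrated with a toy model that does not use LLVM at all. Everything below is hypothetical: `Node` stands in for a pointer expression (a global, or a cast or constant-offset GEP applied to another node), and the two functions only imitate the roles that `getUnderlyingObject` and `stripAndAccumulateConstantOffsets` play in the discussion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an IR pointer expression.
struct Node {
  const Node *base;  // operand this node is applied to; nullptr for a global
  int64_t offset;    // constant byte offset added by this node (0 for casts)
};

// First walk: find the object at the bottom of the chain.
const Node *underlying_object(const Node *n, unsigned &steps) {
  while (n->base) {
    n = n->base;
    ++steps;
  }
  return n;
}

// Second walk: strip every cast/GEP and sum the constant offsets.
const Node *strip_and_accumulate(const Node *n, int64_t &total,
                                 unsigned &steps) {
  while (n->base) {
    total += n->offset;
    n = n->base;
    ++steps;
  }
  return n;
}

// Both walks back to back, without recursion: 2 * n steps for a chain
// of n casts/GEPs, which is still linear.
bool constant_offset_from_global(const Node *v, const Node *&global,
                                 int64_t &offset, unsigned &steps) {
  steps = 0;
  offset = 0;
  global = underlying_object(v, steps);
  return strip_and_accumulate(v, offset, steps) == global;
}
```

For a chain of three nodes on top of a global, the model visits six links in total and accumulates the offsets of all of them in a single call, which is the point made above about not needing recursion.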

Harbormaster completed remote builds in B172310: Diff 440373. Jun 27 2022, 4:13 PM

Changes in revision 4 of the patch:

use getUnderlyingObject and stripAndAccumulateConstantOffsets and avoid recursion in getConstantDataArrayInfo,
add tests to exercise long chain of GEPs.

llvm/lib/Analysis/ValueTracking.cpp
4196	Aah, that was my misunderstanding, sorry. I assumed `stripAndAccumulateConstantOffsets` and `accumulateConstantOffset` both operated on just a single GEP except that the former also stripped casts. It looks like the former actually iterates over all the GEPs and accumulates the offsets from all of them. So yes, that does work linearly.

Harbormaster completed remote builds in B172353: Diff 440431. Jun 27 2022, 6:29 PM

nikic added inline comments. Jun 28 2022, 1:21 AM

llvm/lib/Analysis/ValueTracking.cpp
4194	You can move this out of the `GEPOperator` block and also drop the stripPointerCasts() above (and the repeated checks of GV below it). All cases can be treated uniformly.
4239	I wouldn't claim that folding these is really "safer", it's just a (UB-based) optimization.

Changes in revision 5 of the patch:

Simplify getConstantDataArrayInfo some more.
Update comment.

llvm/lib/Analysis/ValueTracking.cpp
4194	Because `stripAndAccumulateConstantOffsets` is a member of the `Value` class and not one of `GEPOperator`. Nice, thanks!
4239	Many users would find replacing an undefined library call with a well-defined expression safer than letting the call take place and crash their program for example. I certainly do, but I see little point in arguing about it here.

Implementation LG, some notes on the tests.

llvm/test/Transforms/InstCombine/memchr-10.ll
22	Incomplete check line?
llvm/test/Transforms/InstCombine/memchr-9.ll
318	These check lines aren't correct and are probably just getting ignored. You need to pass something like `--check-prefixes=CHECK,BE` and `--check-prefixes=CHECK,LE` and regenerate check lines.
llvm/test/Transforms/InstCombine/strcall-no-nul.ll
143	This check line doesn't make a lot of sense ... the syntax is incorrect (and the incorrect syntax is repeated in the cases below) and the ret wouldn't make sense here anyway (because strlen returns an integer not a pointer). The TODO also seems outdated, as this already folds to 0.
llvm/test/Transforms/InstCombine/strnlen-1.ll
166	This TODO is probably not relevant anymore given the updated patch?

This revision is now accepted and ready to land. Jun 28 2022, 12:30 PM

Harbormaster completed remote builds in B172540: Diff 440696. Jun 28 2022, 12:59 PM

This revision was landed with ongoing or failed builds. Jun 28 2022, 3:00 PM

Closed by commit rGe263a7670e28: [InstCombine] Look through more casts when folding memchr and memcmp (authored by msebor).

This revision was automatically updated to reflect the committed changes.

msebor marked an inline comment as done.

msebor added a commit: rGe263a7670e28: [InstCombine] Look through more casts when folding memchr and memcmp.

Thanks for the careful review!

llvm/test/Transforms/InstCombine/memchr-10.ll
22	It's intentional and documented above: the return value doesn't matter (i.e., it's indeterminate here). I didn't want to encode an expectation one way or the other.
llvm/test/Transforms/InstCombine/memchr-9.ll
318	Right. You do have a good eye for detail!
llvm/test/Transforms/InstCombine/strcall-no-nul.ll
143	I've fixed the `strlen` typo. I hand-wrote all these lines and I realize not all of them are entirely correct. My main goal was to more prominently mark up the expected failures, but I was also hoping that `llvm-lit` would pick them up somehow and indicate when there are expected failures in the test. I've removed the rest of the `XFAIL-` lines from the final commit but, for future reference, is there a way to do what I want? (I couldn't find an example in the test suite that I didn't add myself.)

nikic added inline comments. Jun 29 2022, 12:36 AM

llvm/test/Transforms/InstCombine/strcall-no-nul.ll
143	It's possible to mark the entire test as XFAIL using `XFAIL: *`, but I don't think XFAIL interacts with FileCheck in any way. General convention is to check for current codegen and put a TODO/FIXME if it's currently incorrect/suboptimal.

It looks like Transforms/InstCombine/str-int-3.ll is failing on Darwin systems. The Darwin bots are currently down unfortunately though (https://green.lab.llvm.org/green/)

In D128364#3618130, @fhahn wrote:

It looks like Transforms/InstCombine/str-int-3.ll is failing on Darwin systems. The Darwin bots are currently down unfortunately though (https://green.lab.llvm.org/green/)

GreenDragon is back up, please see https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/29787/testReport/LLVM/Transforms_InstCombine/str_int_3_ll/ for more details about the failing test. Please revert the patch if the fix requires more time.

In D128364#3618948, @fhahn wrote:

GreenDragon is back up, please see https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/29787/testReport/LLVM/Transforms_InstCombine/str_int_3_ll/ for more details about the failing test. Please revert the patch if the fix requires more time.

Thanks for the heads up. I'm not having any luck reproducing the failure with a cross for either i386-apple-darwin or x86_64-apple-darwin. Can you please provide more detail? (The full test output and the target triple, or anything else that might help reproduce it on x86_64 Linux.)

I've managed to reproduce it by changing the test case. It seems as though the test input of

%pa_0_0_32 = getelementptr [2 x %struct.A], [2 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 32
%ia_0_0_32 = call i32 @atoi(i8* %pa_0_0_32)

is on Darwin transformed into

%ia_0_0_32 = call i32 @atoi(i8* getelementptr inbounds ([2 x %struct.A], [2 x %struct.A]* @a, i64 1, i64 0, i32 0, i64 0))

which the folder doesn't handle.

In D128364#3619404, @msebor wrote:

I've managed to reproduce it by changing the test case...

Sorry, I was too hasty, that works as expected. I still have not been able to reproduce it.

I suspect the Darwin strtol behaves differently than on Linux for an empty subject sequence (i.e., for ""). On Linux strtol succeeds and returns zero without changing errno. On Darwin it sets errno, which then causes the convertStrToNumber utility function to fail. This is permitted by POSIX but the test assumes the Linux behavior.

This also seems to be supported by the Apple open source implementation of the function in strtol.c.
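The host difference can be probed with a few lines of standard C++. This is a sketch only: `StrtolProbe` and `probe_strtol` are hypothetical names, and the code just records what the host's strtol reports rather than asserting any particular errno value.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical record of one strtol call.
struct StrtolProbe {
  long value;         // value returned by strtol
  int err;            // errno after the call (cleared before it)
  bool consumed_any;  // whether any characters were converted
};

StrtolProbe probe_strtol(const char *s) {
  char *end = nullptr;
  errno = 0;  // clear first so a stale error is not misattributed
  long v = std::strtol(s, &end, 10);
  return { v, errno, end != s };
}
```

For the empty string the returned value is 0 and no characters are consumed on any conforming implementation; only the `err` field is expected to differ between hosts, which is the part a folder that bails on a nonzero errno ends up depending on.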

I've relaxed the test in rGd515211a0ce1 to avoid exercising the target-dependent behavior. Please let me know if you see any other issues.

In D128364#3619587, @msebor wrote:

I've relaxed the test in rGd515211a0ce1 to avoid exercising the target-dependent behavior. Please let me know if you see any other issues.

That avoids the test failure, but we still have a host-platform difference that seems unnecessary.

IIUC, the problem is here:
https://github.com/llvm/llvm-project/blob/649439e7aeeb3b30f54297b3c6f7d6549fb2f85a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp#L87

On Linux, errno isn't set for an empty string:
https://godbolt.org/z/Tn5j1o4s6

But on a Mac (I tested locally - not sure if it's possible to do that on godbolt), I get this output:

Value of errno: 22
res = 0

Instead of return nullptr on errno, just remove that check? Or return a null constant since anything goes at that point?

I agree there is a detectable difference when folding calls to atoi (or strtol), either undefined calls with past-the-end pointers or even calls with the empty string. The former calls are undefined so the difference shouldn't matter, but the latter is well-defined. In the latter case the difference predates the patch, which is why I just "papered" over it by disabling the test, but I agree it should be avoided in the well-defined case. I've raised PR 56293 to keep track of it and will look into removing it.

Maybe reland with a better patch title, as this patch affects more libcalls than just these two?

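Revision Contents

Path                                                  Size
llvm/lib/Analysis/ValueTracking.cpp                   73 lines
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp        12 lines
llvm/test/Transforms/InstCombine/memchr-10.ll         84 lines
llvm/test/Transforms/InstCombine/memchr-9.ll          324 lines
llvm/test/Transforms/InstCombine/memcmp-7.ll          144 lines
llvm/test/Transforms/InstCombine/memcmp-8.ll          53 lines
llvm/test/Transforms/InstCombine/memrchr-7.ll         84 lines
llvm/test/Transforms/InstCombine/str-int-3.ll         13 lines
llvm/test/Transforms/InstCombine/strcall-no-nul.ll    319 lines
llvm/test/Transforms/InstCombine/strlen-9.ll          91 lines
llvm/test/Transforms/InstCombine/strnlen-1.ll         29 lines
llvm/test/Transforms/InstCombine/wcslen-1.ll          19 lines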

Diff 440782

llvm/lib/Analysis/ValueTracking.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 4,181 Lines • ▼ Show 20 Lines
// its initializer if the size of its elements equals ElementSize, or,		// its initializer if the size of its elements equals ElementSize, or,
// for ElementSize == 8, to its representation as an array of unsiged		// for ElementSize == 8, to its representation as an array of unsiged
// char. Return true on success.		// char. Return true on success.
bool llvm::getConstantDataArrayInfo(const Value *V,		bool llvm::getConstantDataArrayInfo(const Value *V,
ConstantDataArraySlice &Slice,		ConstantDataArraySlice &Slice,
unsigned ElementSize, uint64_t Offset) {		unsigned ElementSize, uint64_t Offset) {
assert(V);		assert(V);

// Look through bitcast instructions and geps.		// Drill down into the pointer expression V, ignoring any intervening
V = V->stripPointerCasts();		// casts, and determine the identity of the object it references along
		// with the cumulative byte offset into it.
// If the value is a GEP instruction or constant expression, treat it as an		const GlobalVariable *GV =
// offset.		dyn_cast<GlobalVariable>(getUnderlyingObject(V));
		nikicUnsubmitted Not Done Reply Inline Actions You can move this out of the `GEPOperator` block and also drop the stripPointerCasts() above (and the repeated checks of GV below it). All cases can be treated uniformly. nikic: You can move this out of the `GEPOperator` block and also drop the stripPointerCasts() above…
		mseborAuthorUnsubmitted Done Reply Inline Actions Because `stripAndAccumulateConstantOffsets` is a member of the `Value` class and not one of `GEPOperator`. Nice, thanks! msebor: Because `stripAndAccumulateConstantOffsets` is a member of the `Value` class and not one of…
if (const GEPOperator *GEP = dyn_cast<GEPOperator>(V)) {		if (!GV \|\| !GV->isConstant() \|\| !GV->hasDefinitiveInitializer())
// Fail if the first GEP operand is not a constant zero and we're		// Fail if V is not based on constant global object.
		efriedmaUnsubmitted Not Done Reply Inline Actions Maybe it makes sense to just drop the `dyn_cast<GlobalVariable>` here? efriedma: Maybe it makes sense to just drop the `dyn_cast<GlobalVariable>` here?
		mseborAuthorUnsubmitted Done Reply Inline Actions This is what I meant by the simplification opportunity (I think we discussed it with @nikic in another review but I should have made that clear here). As far as I can tell the `GlobalVariable` cast is necessary to get at the `DataLayout` and to compute the offset below. (Or is there some other way to get at it?) msebor: This is what I meant by the simplification opportunity (I think we discussed it with @nikic in…
		nikicUnsubmitted Not Done Reply Inline Actions If you really want to avoid the DataLayout argument, you can do something like `dyn_cast<GlobalVariable>(getUnderlyingObject(V))`, take the DataLayout from there, and then use that to feed the `stripAndAccumulateConstantOffsets()` call. Otherwise this is just shifting the problem to a `GEP + bitcast + GEP` sequence. (And it's worth noting that with opaque pointers the casts being skipped here don't exist anyway.) nikic: If you really want to avoid the DataLayout argument, you can do something like…
		mseborAuthorUnsubmitted Done Reply Inline Actions Thanks for the pointer to `getUnderlyingObj`. Unless there is a form of the function that also computes the aggregate GEP offset I don't think using it here would be appropriate since the rest of the function then has to (again) iterate over the same chain of casts and GEPs to compute the offset. (Would enhancing `getUnderlyingObj`to compute the offset be appropriate? FWIW, GCC has at least two utility functions that do both.) I'm not opposed to adding the `DataLayout` argument (quite the contrary), but as I explained, I avoided it in this patch to keep it from growing too big. If making this small incremental improvement first is a problem I can instead start by submitting a change to add the new argument. Let me know your preference. Opaque pointers obviously obviate the casts, so once typed pointers completely disappear the code can be simplified, The removal of the `FisrstIdx` test will still be necessary to handle these use cases either way but that can be done then. Would you prefer to defer this whole patch until then? msebor: Thanks for the pointer to `getUnderlyingObj`. Unless there is a form of the function that also…
		nikicUnsubmitted Not Done Reply Inline Actions stripAndAccumulateConstantOffsets() is the function that computes aggregate GEP offsets. getUnderlyingObject() is just a suggestion on how you can fetch the DataLayout without passing an argument. Basically the start of this function (down to line 4222) would become something like this: auto GV = dyn_cast<GlobalVariable>(getUnderlyingObject(V)); if (!GV \|\| !GV->isConstant() \|\| !GV->hasDefinitiveInitializer()) return false; // Not based on constant global. const DataLayout &DL = GV->getParent()->getDataLayout(); APInt Offset(DL.getIndexTypeSizeInBits(V->getType()), 0); if (GV != P->stripAndAccumulateConstantOffsets(DL, Offset, /AllowNonInbounds/ true)) return false; // Could not determine offset from GV. nikic:* stripAndAccumulateConstantOffsets() is the function that computes aggregate GEP offsets.
		mseborAuthorUnsubmitted Done Reply Inline Actions I'm pretty sure (I think) I understood what you meant. My point is that a call to `getUnderlyingObject` has a linear complexity in the number of GEPs/casts, and so would a fully general recursive call to `getConstantDataArrayInfo`. They both walk the IR to find the underlying declaration. Calling the former from the latter would make the latter do double the work, with quadratic complexity in the number of GEPs. The complexity could be kept linear by adding a `DataLayout` argument to `getConstantDataArrayInfo` having `getUnderlyingObject` also accumulate the offset from each GEP it strips on the way to the declaration of the object (it too would need a `DataLayout` argument). Both changes seem useful to me independently of each other (and as I mentioned, a precedent for the `getUnderlyingObject` change is GCC's `get_ref_base_and_extent` utility function, among others). But until at least the first is done, this patch handles a single cast + GEP + cast chain in O(N). Alternatively, I can add the `DataLayout` argument to `getConstantDataArrayInfo` first, and handle in O(N) arbitrarily long chains of cast/GEP sequences. Having said all that, I have a feeling we might be talking past each other and this will not be the end of it. msebor: I'm pretty sure (I think) I understood what you meant. My point is that a call to…
		nikicUnsubmitted Not Done Reply Inline Actions There shouldn't be any quadratic complexity -- when using getUnderlyingObject/stripAndAccumulateConstantOffsets there is no need for getConstantDataArrayInfo itself to be recursive, all the offsets are accumulated within a single call. You are right that there is some redundant work going on in that getUnderlyingObject will first find the base object, and then stripAndAccumulateConstantOffsets will accumulate offsets down to it, but it's still a linear walk, repeated twice, so complexity is still O(n). (If we want to really get down to it, doing two walks will actually be faster in practice, because getUnderlyingObject is significantly faster than stripAndAccumulateConstantOffsets, which means that we can quickly discard any cases that aren't based on a constant GV. This is just a side note, we generally wouldn't specifically optimize for that.) nikic: There shouldn't be any quadratic complexity -- when using…
		mseborAuthorUnsubmitted Done Reply Inline Actions Aah, that was my misunderstanding, sorry. I assumed `stripAndAccumulateConstantOffsets` and `accumulateConstantOffset` both operated on just a single GEP except that the former also stripped casts. It looks like the former actually iterates over all the GEPs and accumulates the offsets from all of them. So yes, that does work linearly. msebor: Aah, that was my misunderstanding, sorry. I assumed `stripAndAccumulateConstantOffsets` and…
// not indexing into the initializer.
const ConstantInt *FirstIdx = dyn_cast<ConstantInt>(GEP->getOperand(1));
if (!FirstIdx \|\| !FirstIdx->isZero())
return false;

Value *Op0 = GEP->getOperand(0);
const GlobalVariable *GV = dyn_cast<GlobalVariable>(Op0);
if (!GV)
return false;		return false;

// Fail if the offset into the initializer is not constant.
const DataLayout &DL = GV->getParent()->getDataLayout();		const DataLayout &DL = GV->getParent()->getDataLayout();
APInt Off(DL.getIndexSizeInBits(GEP->getPointerAddressSpace()), 0);		APInt Off(DL.getIndexTypeSizeInBits(V->getType()), 0);
if (!GEP->accumulateConstantOffset(DL, Off))
		if (GV != V->stripAndAccumulateConstantOffsets(DL, Off,
		/AllowNonInbounds/ true))
		// Fail if a constant offset could not be determined.
return false;		return false;

// Fail if the constant offset is excessive.
uint64_t StartIdx = Off.getLimitedValue();		uint64_t StartIdx = Off.getLimitedValue();
if (StartIdx == UINT64_MAX)		if (StartIdx == UINT64_MAX)
		// Fail if the constant offset is excessive.
return false;		return false;

return getConstantDataArrayInfo(Op0, Slice, ElementSize, StartIdx + Offset);		Offset += StartIdx;
}

// The GEP instruction, constant or instruction, must reference a global
// variable that is a constant and is initialized. The referenced constant
// initializer is the array that we'll use for optimization.
const GlobalVariable *GV = dyn_cast<GlobalVariable>(V);
if (!GV \|\| !GV->isConstant() \|\| !GV->hasDefinitiveInitializer())
return false;

const DataLayout &DL = GV->getParent()->getDataLayout();
ConstantDataArray *Array = nullptr;		ConstantDataArray *Array = nullptr;
ArrayType *ArrayTy = nullptr;		ArrayType *ArrayTy = nullptr;

if (GV->getInitializer()->isNullValue()) {		if (GV->getInitializer()->isNullValue()) {
Type *GVTy = GV->getValueType();		Type *GVTy = GV->getValueType();
uint64_t SizeInBytes = DL.getTypeStoreSize(GVTy).getFixedSize();		uint64_t SizeInBytes = DL.getTypeStoreSize(GVTy).getFixedSize();
uint64_t Length = SizeInBytes / (ElementSize / 8);		uint64_t Length = SizeInBytes / (ElementSize / 8);
if (Length <= Offset)
// Bail on undersized constants to let sanitizers detect library
// calls with them as arguments.
return false;

Slice.Array = nullptr;		Slice.Array = nullptr;
Slice.Offset = 0;		Slice.Offset = 0;
Slice.Length = Length - Offset;		// Return an empty Slice for undersized constants to let callers
		// transform even undefined library calls into simpler, well-defined
		// expressions. This is preferable to making the calls although it
		// prevents sanitizers from detecting such calls.
		Slice.Length = Length < Offset ? 0 : Length - Offset;
return true;		return true;
}		}

auto Init = const_cast<Constant >(GV->getInitializer());		auto Init = const_cast<Constant >(GV->getInitializer());
if (auto *ArrayInit = dyn_cast<ConstantDataArray>(Init)) {		if (auto *ArrayInit = dyn_cast<ConstantDataArray>(Init)) {
Type *InitElTy = ArrayInit->getElementType();		Type *InitElTy = ArrayInit->getElementType();
if (InitElTy->isIntegerTy(ElementSize)) {		if (InitElTy->isIntegerTy(ElementSize)) {
// If Init is an initializer for an array of the expected type		// If Init is an initializer for an array of the expected type
// and size, use it as is.		// and size, use it as is.
Array = ArrayInit;		Array = ArrayInit;
ArrayTy = ArrayInit->getType();		ArrayTy = ArrayInit->getType();
		nikicUnsubmitted Not Done Reply Inline Actions I wouldn't claim that folding these is really "safer", it's just a (UB-based) optimization. nikic: I wouldn't claim that folding these is really "safer", it's just a (UB-based) optimization.
		mseborAuthorUnsubmitted Done Reply Inline Actions Many users would find replacing an undefined library call with a well-defined expression safer than letting the call take place and crash their program for example. I certainly do, but I see little point in arguing about it here. msebor: Many users would find replacing an undefined library call with a well-defined expression safer…
}		}
}		}

if (!Array) {		if (!Array) {
if (ElementSize != 8)		if (ElementSize != 8)
// TODO: Handle conversions to larger integral types.		// TODO: Handle conversions to larger integral types.
return false;		return false;

Show All 24 Lines
bool llvm::getConstantStringInfo(const Value *V, StringRef &Str,		bool llvm::getConstantStringInfo(const Value *V, StringRef &Str,
uint64_t Offset, bool TrimAtNul) {		uint64_t Offset, bool TrimAtNul) {
ConstantDataArraySlice Slice;		ConstantDataArraySlice Slice;
if (!getConstantDataArrayInfo(V, Slice, 8, Offset))		if (!getConstantDataArrayInfo(V, Slice, 8, Offset))
return false;		return false;

if (Slice.Array == nullptr) {		if (Slice.Array == nullptr) {
if (TrimAtNul) {		if (TrimAtNul) {
		// Return a nul-terminated string even for an empty Slice. This is
		// safe because all existing SimplifyLibcalls callers require string
		// arguments and the behavior of the functions they fold is undefined
		// otherwise. Folding the calls this way is preferable to making
		// the undefined library calls, even though it prevents sanitizers
		// from reporting such calls.
Str = StringRef();		Str = StringRef();
return true;		return true;
		efriedma: Not sure I understand this change; why does it matter if the slice is zero-length here?
		msebor: The change prevents treating past-the-end accesses to `char` arrays the same way as those to empty strings, such as in `strlen("123" + 4)`. This is exercised by `str-int-3.ll`, which I added in D125114 based on my understanding of the convention that library calls that provably result in an out-of-bounds access are best left for sanitizers to report. (I think an argument can be made that folding such calls to zero is safer than counting on sanitizers, so I don't mind removing the check and adjusting the test if it's decided that's preferable.)
		efriedma: If we want getConstantStringInfo(TrimAtNul=true) to always fail if it doesn't trim a nul, we could do that, but only failing when we don't trim a nul and Slice.Array is null doesn't really make sense to me.
		msebor: I've made the change you suggest in the updated patch. It has a fairly pervasive impact on the couple of dozen or so clients of the function. There are no tests for this case, so I've developed and added a subset of them in the updated patch. It was a nontrivial effort whose goal is orthogonal to this enhancement and that I think would have been better handled separately and by someone who's less ambivalent about the benefits of doing this than me. (The test coverage would ideally be extended to all supported functions but I leave that for some other time.)
		nikic: Possibly I'm misreading @efriedma's comment, but I think the suggestion here was to just drop the change you originally did from the patch. This doesn't seem to have any direct relation to the bitcast stripping part. (For what it's worth, I think I've come around on the sanitizer issue from "we should accommodate sanitizers if it doesn't cost any effort" to "we should ignore sanitizers completely and optimize as aggressively as possible", because these little checks are not a principled solution anyway, but they always cause a lot of discussion about whether some particular case should be special-cased or not.)
		msebor: Okay, given that, I'm comfortable removing both the check and the `str-int-3.ll` test case that otherwise fails. I've also updated the new `strcall-no-nul.ll` test to reflect the current behavior on trunk, and in addition removed the other checks I added to verify the opposite strategy. (Either way, being more consistent about this than the initial patch leads to churn that's strictly outside the scope of this enhancement.) As an aside, it would be helpful to capture somewhere the decision to fold even in spite of the effect on sanitizers so that we can point to it when the question comes up again in the future.
}		}
if (Slice.Length == 1) {		if (Slice.Length == 1) {
Str = StringRef("", 1);		Str = StringRef("", 1);
return true;		return true;
}		}
// We cannot instantiate a StringRef as we do not have an appropriate string		// We cannot instantiate a StringRef as we do not have an appropriate string
// of 0s at hand.		// of 0s at hand.
return false;		return false;
[61 lines not shown]	static uint64_t GetStringLengthH(const Value *V,
}		}

// Otherwise, see if we can read the string.		// Otherwise, see if we can read the string.
ConstantDataArraySlice Slice;		ConstantDataArraySlice Slice;
if (!getConstantDataArrayInfo(V, Slice, CharSize))		if (!getConstantDataArrayInfo(V, Slice, CharSize))
return 0;		return 0;

if (Slice.Array == nullptr)		if (Slice.Array == nullptr)
		// Zeroinitializer (including an empty one).
return 1;		return 1;

// Search for nul characters		// Search for the first nul character. Return a conservative result even
		// when there is no nul. This is safe since otherwise the string function
		// being folded such as strlen is undefined, and can be preferable to
		// making the undefined library call.
unsigned NullIndex = 0;		unsigned NullIndex = 0;
for (unsigned E = Slice.Length; NullIndex < E; ++NullIndex) {		for (unsigned E = Slice.Length; NullIndex < E; ++NullIndex) {
if (Slice.Array->getElementAsInteger(Slice.Offset + NullIndex) == 0)		if (Slice.Array->getElementAsInteger(Slice.Offset + NullIndex) == 0)
break;		break;
}		}

return NullIndex + 1;		return NullIndex + 1;
}		}
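
For illustration only (not part of the patch): the two behaviors changed in ValueTracking.cpp above, clamping the slice length to zero when the offset lies past the end and returning a conservative length when no nul is found, can be modelled outside LLVM roughly as follows. `SliceModel`, `makeSlice`, and `stringLengthPlusOne` are hypothetical names invented for this sketch; they only mimic the arithmetic visible in the diff and are not the LLVM types or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the Offset/Length pair of a
// ConstantDataArraySlice; not the LLVM type itself.
struct SliceModel {
  std::uint64_t Offset;
  std::uint64_t Length;
};

// Clamp the length to zero when the offset is past the end, mirroring
// `Length < Offset ? 0 : Length - Offset` in the new code.
inline SliceModel makeSlice(std::uint64_t ArrayLength, std::uint64_t Offset) {
  return {Offset, ArrayLength < Offset ? 0 : ArrayLength - Offset};
}

// Return the index of the first nul within the slice plus one, or
// Slice.Length + 1 when the slice contains no nul (the conservative
// result described in the new comment).
inline std::uint64_t stringLengthPlusOne(const std::vector<std::uint8_t> &Bytes,
                                         const SliceModel &Slice) {
  // A non-empty slice must lie entirely within the byte array.
  assert(Slice.Length == 0 || Slice.Offset + Slice.Length <= Bytes.size());
  std::uint64_t NullIndex = 0;
  for (; NullIndex < Slice.Length; ++NullIndex)
    if (Bytes[Slice.Offset + NullIndex] == 0)
      break;
  return NullIndex + 1;
}
```

Under this model a three-byte array with no nul and offset 4 yields an empty slice, so the modelled result is 1 (a string length of zero), rather than a failure that leaves the call in place.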

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

[939 lines not shown]	if (LenC->isOne()) {
return B.CreateSelect(Cmp, SrcStr, NullPtr, "memrchr.sel");		return B.CreateSelect(Cmp, SrcStr, NullPtr, "memrchr.sel");
}		}
}		}

StringRef Str;		StringRef Str;
if (!getConstantStringInfo(SrcStr, Str, 0, /*TrimAtNul=*/false))		if (!getConstantStringInfo(SrcStr, Str, 0, /*TrimAtNul=*/false))
return nullptr;		return nullptr;

		if (Str.size() == 0)
		// If the array is empty fold memrchr(A, C, N) to null for any value
		// of C and N on the basis that the only valid value of N is zero
		// (otherwise the call is undefined).
		return NullPtr;

uint64_t EndOff = UINT64_MAX;		uint64_t EndOff = UINT64_MAX;
if (LenC) {		if (LenC) {
EndOff = LenC->getZExtValue();		EndOff = LenC->getZExtValue();
if (Str.size() < EndOff)		if (Str.size() < EndOff)
// Punt out-of-bounds accesses to sanitizers and/or libc.		// Punt out-of-bounds accesses to sanitizers and/or libc.
return nullptr;		return nullptr;
}		}

[85 lines not shown]	if (CharC) {
// position also fold the result to null.		// position also fold the result to null.
Value *Cmp = B.CreateICmpULE(Size, ConstantInt::get(Size->getType(), Pos),		Value *Cmp = B.CreateICmpULE(Size, ConstantInt::get(Size->getType(), Pos),
"memchr.cmp");		"memchr.cmp");
Value *SrcPlus =		Value *SrcPlus =
B.CreateGEP(B.getInt8Ty(), SrcStr, B.getInt64(Pos), "memchr.ptr");		B.CreateGEP(B.getInt8Ty(), SrcStr, B.getInt64(Pos), "memchr.ptr");
return B.CreateSelect(Cmp, NullPtr, SrcPlus);		return B.CreateSelect(Cmp, NullPtr, SrcPlus);
}		}

		if (Str.size() == 0)
		// If the array is empty fold memchr(A, C, N) to null for any value
		// of C and N on the basis that the only valid value of N is zero
		// (otherwise the call is undefined).
		return NullPtr;

if (LenC)		if (LenC)
Str = substr(Str, LenC->getZExtValue());		Str = substr(Str, LenC->getZExtValue());

size_t Pos = Str.find_first_not_of(Str[0]);		size_t Pos = Str.find_first_not_of(Str[0]);
if (Pos == StringRef::npos		if (Pos == StringRef::npos
|| Str.find_first_not_of(Str[Pos], Pos) == StringRef::npos) {		|| Str.find_first_not_of(Str[Pos], Pos) == StringRef::npos) {
// If the source array consists of at most two consecutive sequences		// If the source array consists of at most two consecutive sequences
// of the same characters, then for any C and N (whether in bounds or		// of the same characters, then for any C and N (whether in bounds or
▲ Show 20 Lines • Show All 2,727 Lines • Show Last 20 Lines
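
For illustration only (not part of the patch): the order of checks shown for memrchr in SimplifyLibCalls.cpp above, an empty array folding to null first and an out-of-bounds length left unfolded next, can be sketched as a small standalone model. `FoldKind`, `FoldResult`, and `foldMemRChrModel` are hypothetical names, and the final search step is an assumption based on the documented semantics of memrchr, since that part of the function is not shown in the diff.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Possible outcomes of the modelled fold.
enum class FoldKind { NotFolded, Null, Offset };

struct FoldResult {
  FoldKind Kind;
  std::size_t Pos; // meaningful only when Kind == FoldKind::Offset
};

// Model of folding memrchr(Str, C, N) for a constant byte sequence Str
// and constant C and N. Hypothetical helper, not the LLVM function.
inline FoldResult foldMemRChrModel(const std::string &Str, unsigned char C,
                                   std::uint64_t N) {
  // Empty array: the only valid N is zero, so fold to null for any C and N.
  if (Str.empty())
    return {FoldKind::Null, 0};
  // Out-of-bounds length: leave the call alone ("punt to sanitizers").
  if (Str.size() < N)
    return {FoldKind::NotFolded, 0};
  // Assumed search step: last occurrence of C within the first N bytes.
  std::size_t Pos =
      Str.substr(0, static_cast<std::size_t>(N)).rfind(static_cast<char>(C));
  if (Pos == std::string::npos)
    return {FoldKind::Null, 0};
  return {FoldKind::Offset, Pos};
}
```

The empty-array check has to come first in this model: otherwise any nonzero N would hit the out-of-bounds branch and the call would never be folded.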

llvm/test/Transforms/InstCombine/memchr-10.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -passes=instcombine -S | FileCheck %s
				;
				; Verify that the results of memchr calls with past-the-end pointers used
				; in equality expressions don't cause trouble and are either folded when
				; they might be valid or not folded when they're provably undefined.

				declare i8* @memchr(i8*, i32, i64)


				@a5 = constant [5 x i8] c"12345"


				; Fold memchr(a5 + 5, c, 1) == a5 + 5 to an arbitrary constant.
				; The call is transformed to a5[5] == c by the memchr simplifier, with
				; a5[5] being indeterminate. The equality is then folded with
				; an undefined/arbitrary result.

				define i1 @call_memchr_ap5_c_1_eq_a(i32 %c, i64 %n) {
				; CHECK-LABEL: @call_memchr_ap5_c_1_eq_a(
				; CHECK-NEXT: ret i1
				;
				nikic: Incomplete check line?
				msebor: It's intentional and documented above: the return value doesn't matter (i.e., it's indeterminate here). I didn't want to encode an expectation one way or the other.
				%pap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%qap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 1, i32 0
				%q = call i8* @memchr(i8* %pap5, i32 %c, i64 1)
				%cmp = icmp eq i8* %q, %qap5
				ret i1 %cmp
				}


				; Fold memchr(a5 + 5, c, 5) == a5 + 5 to an arbitrary constant.

				define i1 @call_memchr_ap5_c_5_eq_a(i32 %c, i64 %n) {
				; CHECK-LABEL: @call_memchr_ap5_c_5_eq_a(
				; CHECK-NEXT: ret i1
				;
				%pap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%qap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 1, i32 0
				%q = call i8* @memchr(i8* %pap5, i32 %c, i64 5)
				%cmp = icmp eq i8* %q, %qap5
				ret i1 %cmp
				}


				; Fold memchr(a5 + 5, c, n) == a5 to false.

				define i1 @fold_memchr_ap5_c_n_eq_a(i32 %c, i64 %n) {
				; CHECK-LABEL: @fold_memchr_ap5_c_n_eq_a(
				; CHECK-NEXT: ret i1 false
				;
				%pa = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%pap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%q = call i8* @memchr(i8* %pap5, i32 %c, i64 %n)
				%cmp = icmp eq i8* %q, %pa
				ret i1 %cmp
				}


				; Fold memchr(a5 + 5, c, n) == null to true on the basis that n must
				; be zero in order for the call to be valid.

				define i1 @fold_memchr_ap5_c_n_eqz(i32 %c, i64 %n) {
				; CHECK-LABEL: @fold_memchr_ap5_c_n_eqz(
				; CHECK-NEXT: ret i1 true
				;
				%p = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%q = call i8* @memchr(i8* %p, i32 %c, i64 %n)
				%cmp = icmp eq i8* %q, null
				ret i1 %cmp
				}


				; Fold memchr(a5 + 5, '\0', n) == null to true on the basis that n must
				; be zero in order for the call to be valid.

				define i1 @fold_memchr_a_nul_n_eqz(i64 %n) {
				; CHECK-LABEL: @fold_memchr_a_nul_n_eqz(
				; CHECK-NEXT: ret i1 true
				;
				%p = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%q = call i8* @memchr(i8* %p, i32 0, i64 %n)
				%cmp = icmp eq i8* %q, null
				ret i1 %cmp
				}

llvm/test/Transforms/InstCombine/memchr-9.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; Verify that calls to memchr with arrays of elements larger than char
				; are folded correctly.
				; RUN: opt < %s -passes=instcombine -S -data-layout="E" | FileCheck %s --check-prefixes=CHECK,BE-CHECK
				; RUN: opt < %s -passes=instcombine -S -data-layout="e" | FileCheck %s --check-prefixes=CHECK,LE-CHECK
				;
				; Exercise folding of memchr calls with addition expressions involving
				; pointers into constant arrays of types larger than char and fractional
				; offsets.

				declare i8* @memchr(i8*, i32, i64)

				%struct.A = type { [2 x i16], [2 x i16] }

				; Hex byte representation: 00 00 01 01 02 02 03 03
				@a = constant [1 x %struct.A] [%struct.A { [2 x i16] [i16 0, i16 257], [2 x i16] [i16 514, i16 771] }]


				define void @fold_memchr_A_pIb_cst_cst(i8** %pchr) {
				; CHECK-LABEL: @fold_memchr_A_pIb_cst_cst(
				; CHECK-NEXT: store i8* bitcast ([1 x %struct.A]* @a to i8*), i8** [[PCHR:%.*]], align 8
				; CHECK-NEXT: [[PST_0_1_1:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 1
				; CHECK-NEXT: store i8* null, i8** [[PST_0_1_1]], align 8
				; CHECK-NEXT: [[PST_0_4_4:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 2
				; CHECK-NEXT: store i8* null, i8** [[PST_0_4_4]], align 8
				; CHECK-NEXT: [[PST_1_0_1:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 3
				; CHECK-NEXT: store i8* getelementptr (i8, i8* bitcast ([1 x %struct.A]* @a to i8*), i64 1), i8** [[PST_1_0_1]], align 8
				; CHECK-NEXT: [[PST_1_0_3:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 4
				; CHECK-NEXT: store i8* getelementptr (i8, i8* bitcast ([1 x %struct.A]* @a to i8*), i64 1), i8** [[PST_1_0_3]], align 8
				; CHECK-NEXT: [[PST_1_1_1:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 5
				; CHECK-NEXT: store i8* null, i8** [[PST_1_1_1]], align 8
				; CHECK-NEXT: [[PST_1_1_2:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 6
				; CHECK-NEXT: store i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 1) to i8*), i8** [[PST_1_1_2]], align 8
				; CHECK-NEXT: [[PST_1_3_3:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 7
				; CHECK-NEXT: store i8* null, i8** [[PST_1_3_3]], align 8
				; CHECK-NEXT: [[PST_1_3_4:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 8
				; CHECK-NEXT: store i8* null, i8** [[PST_1_3_4]], align 8
				; CHECK-NEXT: [[PST_1_3_6:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 10
				; CHECK-NEXT: store i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 1, i64 1) to i8*), i8** [[PST_1_3_6]], align 8
				; CHECK-NEXT: ret void
				;
				%pa = getelementptr [1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0
				%pi8a = bitcast %struct.A* %pa to i8*

				%pi8ap0 = getelementptr i8, i8* %pi8a, i32 0

				; Fold memchr((char*)a + 0, '\0', 1) to a.
				%pst_0_0_1 = getelementptr i8*, i8** %pchr, i32 0
				%chr_0_0_1 = call i8* @memchr(i8* %pi8ap0, i32 0, i64 1)
				store i8* %chr_0_0_1, i8** %pst_0_0_1

				; Fold memchr((char*)a + 0, '\01', 1) to null.
				%pst_0_1_1 = getelementptr i8*, i8** %pchr, i32 1
				%chr_0_1_1 = call i8* @memchr(i8* %pi8ap0, i32 1, i64 1)
				store i8* %chr_0_1_1, i8** %pst_0_1_1

				; Fold memchr((char*)a + 0, '\04', 4) to null.
				%pst_0_4_4 = getelementptr i8*, i8** %pchr, i32 2
				%chr_0_4_4 = call i8* @memchr(i8* %pi8ap0, i32 4, i64 4)
				store i8* %chr_0_4_4, i8** %pst_0_4_4


				%pi8ap1 = getelementptr i8, i8* %pi8a, i32 1

				; Fold memchr((char*)a + 1, '\0', 1) to (char*)a + 1.
				%pst_1_0_1 = getelementptr i8*, i8** %pchr, i32 3
				%chr_1_0_1 = call i8* @memchr(i8* %pi8ap1, i32 0, i64 1)
				store i8* %chr_1_0_1, i8** %pst_1_0_1

				; Fold memchr((char*)a + 1, '\0', 3) to (char*)a + 1.
				%pst_1_0_3 = getelementptr i8*, i8** %pchr, i32 4
				%chr_1_0_3 = call i8* @memchr(i8* %pi8ap1, i32 0, i64 3)
				store i8* %chr_1_0_3, i8** %pst_1_0_3

				; Fold memchr((char*)a + 1, '\01', 1) to null.
				%pst_1_1_1 = getelementptr i8*, i8** %pchr, i32 5
				%chr_1_1_1 = call i8* @memchr(i8* %pi8ap1, i32 1, i64 1)
				store i8* %chr_1_1_1, i8** %pst_1_1_1

				; Fold memchr((char*)a + 1, '\01', 2) to (char*)a + 2.
				%pst_1_1_2 = getelementptr i8*, i8** %pchr, i32 6
				%chr_1_1_2 = call i8* @memchr(i8* %pi8ap1, i32 1, i64 2)
				store i8* %chr_1_1_2, i8** %pst_1_1_2

				; Fold memchr((char*)a + 1, '\03', 3) to null.
				%pst_1_3_3 = getelementptr i8*, i8** %pchr, i32 7
				%chr_1_3_3 = call i8* @memchr(i8* %pi8ap1, i32 3, i64 3)
				store i8* %chr_1_3_3, i8** %pst_1_3_3

				; Fold memchr((char*)a + 1, '\03', 4) to null.
				%pst_1_3_4 = getelementptr i8*, i8** %pchr, i32 8
				%chr_1_3_4 = call i8* @memchr(i8* %pi8ap1, i32 3, i64 4)
				store i8* %chr_1_3_4, i8** %pst_1_3_4

				; Fold memchr((char*)a + 1, '\03', 5) to null.
				%pst_1_3_5 = getelementptr i8*, i8** %pchr, i32 9
				%chr_1_3_5 = call i8* @memchr(i8* %pi8ap1, i32 3, i64 5)
				store i8* %chr_1_3_4, i8** %pst_1_3_4

				; Fold memchr((char*)a + 1, '\03', 6) to (char*)a + 6.
				%pst_1_3_6 = getelementptr i8*, i8** %pchr, i32 10
				%chr_1_3_6 = call i8* @memchr(i8* %pi8ap1, i32 3, i64 6)
				store i8* %chr_1_3_6, i8** %pst_1_3_6


				ret void
				}


				define void @fold_memchr_A_pIb_cst_N(i64 %N, i8** %pchr) {
				; CHECK-LABEL: @fold_memchr_A_pIb_cst_N(
				; CHECK-NEXT: [[MEMCHR_CMP:%.*]] = icmp eq i64 [[N:%.*]], 0
				; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[MEMCHR_CMP]], i8* null, i8* bitcast ([1 x %struct.A]* @a to i8*)
				; CHECK-NEXT: store i8* [[TMP1]], i8** [[PCHR:%.*]], align 8
				; CHECK-NEXT: [[PST_0_1_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 1
				; CHECK-NEXT: [[MEMCHR_CMP1:%.*]] = icmp ult i64 [[N]], 3
				; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[MEMCHR_CMP1]], i8* null, i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 1) to i8*)
				; CHECK-NEXT: store i8* [[TMP2]], i8** [[PST_0_1_N]], align 8
				; CHECK-NEXT: [[PST_0_4_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 2
				; CHECK-NEXT: store i8* null, i8** [[PST_0_4_N]], align 8
				; CHECK-NEXT: [[PST_1_0_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 3
				; CHECK-NEXT: [[MEMCHR_CMP2:%.*]] = icmp eq i64 [[N]], 0
				; CHECK-NEXT: [[TMP3:%.*]] = select i1 [[MEMCHR_CMP2]], i8* null, i8* getelementptr (i8, i8* bitcast ([1 x %struct.A]* @a to i8*), i64 1)
				; CHECK-NEXT: store i8* [[TMP3]], i8** [[PST_1_0_N]], align 8
				; CHECK-NEXT: [[PST_1_1_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 4
				; CHECK-NEXT: [[MEMCHR_CMP3:%.*]] = icmp ult i64 [[N]], 2
				; CHECK-NEXT: [[TMP4:%.*]] = select i1 [[MEMCHR_CMP3]], i8* null, i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 1) to i8*)
				; CHECK-NEXT: store i8* [[TMP4]], i8** [[PST_1_1_N]], align 8
				; CHECK-NEXT: [[PST_1_2_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 5
				; CHECK-NEXT: [[MEMCHR_CMP4:%.*]] = icmp ult i64 [[N]], 4
				; CHECK-NEXT: [[TMP5:%.*]] = select i1 [[MEMCHR_CMP4]], i8* null, i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 1, i64 0) to i8*)
				; CHECK-NEXT: store i8* [[TMP5]], i8** [[PST_1_2_N]], align 8
				; CHECK-NEXT: [[PST_1_3_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 6
				; CHECK-NEXT: [[MEMCHR_CMP5:%.*]] = icmp ult i64 [[N]], 6
				; CHECK-NEXT: [[TMP6:%.*]] = select i1 [[MEMCHR_CMP5]], i8* null, i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 1, i64 1) to i8*)
				; CHECK-NEXT: store i8* [[TMP6]], i8** [[PST_1_3_N]], align 8
				; CHECK-NEXT: [[PST_1_4_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 7
				; CHECK-NEXT: store i8* null, i8** [[PST_1_4_N]], align 8
				; CHECK-NEXT: [[PST_2_0_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 8
				; CHECK-NEXT: store i8* null, i8** [[PST_2_0_N]], align 8
				; CHECK-NEXT: [[PST_2_1_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 9
				; CHECK-NEXT: [[MEMCHR_CMP6:%.*]] = icmp eq i64 [[N]], 0
				; CHECK-NEXT: [[TMP7:%.*]] = select i1 [[MEMCHR_CMP6]], i8* null, i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 1) to i8*)
				; CHECK-NEXT: store i8* [[TMP7]], i8** [[PST_2_1_N]], align 8
				; CHECK-NEXT: [[PST_2_2_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 10
				; CHECK-NEXT: [[MEMCHR_CMP7:%.*]] = icmp ult i64 [[N]], 3
				; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[MEMCHR_CMP7]], i8* null, i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 1, i64 0) to i8*)
				; CHECK-NEXT: store i8* [[TMP8]], i8** [[PST_2_2_N]], align 8
				; CHECK-NEXT: [[PST_2_3_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 11
				; CHECK-NEXT: [[MEMCHR_CMP8:%.*]] = icmp ult i64 [[N]], 5
				; CHECK-NEXT: [[TMP9:%.*]] = select i1 [[MEMCHR_CMP8]], i8* null, i8* bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0, i32 1, i64 1) to i8*)
				; CHECK-NEXT: store i8* [[TMP9]], i8** [[PST_2_3_N]], align 8
				; CHECK-NEXT: [[PST_2_4_N:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 12
				; CHECK-NEXT: store i8* null, i8** [[PST_2_4_N]], align 8
				; CHECK-NEXT: ret void
				;
				%pa = getelementptr [1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0
				%pi8a = bitcast %struct.A* %pa to i8*

				%pi8ap0 = getelementptr i8, i8* %pi8a, i32 0

				; Fold memchr((char*)a + 0, '\0', N) to N ? a : null.
				%pst_0_0_n = getelementptr i8*, i8** %pchr, i32 0
				%chr_0_0_n = call i8* @memchr(i8* %pi8ap0, i32 0, i64 %N)
				store i8* %chr_0_0_n, i8** %pst_0_0_n

				; Fold memchr((char*)a + 0, '\01', N) to N < 3 ? null : (char*)a + 2.
				%pst_0_1_n = getelementptr i8*, i8** %pchr, i32 1
				%chr_0_1_n = call i8* @memchr(i8* %pi8ap0, i32 1, i64 %N)
				store i8* %chr_0_1_n, i8** %pst_0_1_n

				; Fold memchr((char*)a + 0, '\04', N) to null.
				%pst_0_4_n = getelementptr i8*, i8** %pchr, i32 2
				%chr_0_4_n = call i8* @memchr(i8* %pi8ap0, i32 4, i64 %N)
				store i8* %chr_0_4_n, i8** %pst_0_4_n


				%pi8ap1 = getelementptr i8, i8* %pi8a, i32 1

				; Fold memchr((char*)a + 1, '\0', N) to N ? (char*)a + 1 : null.
				%pst_1_0_n = getelementptr i8*, i8** %pchr, i32 3
				%chr_1_0_n = call i8* @memchr(i8* %pi8ap1, i32 0, i64 %N)
				store i8* %chr_1_0_n, i8** %pst_1_0_n

				; Fold memchr((char*)a + 1, '\01', N) to N < 2 ? null : (char*)a + 2.
				%pst_1_1_n = getelementptr i8*, i8** %pchr, i32 4
				%chr_1_1_n = call i8* @memchr(i8* %pi8ap1, i32 1, i64 %N)
				store i8* %chr_1_1_n, i8** %pst_1_1_n

				; Fold memchr((char*)a + 1, '\02', N) to N < 4 ? null : (char*)a + 4.
				%pst_1_2_n = getelementptr i8*, i8** %pchr, i32 5
				%chr_1_2_n = call i8* @memchr(i8* %pi8ap1, i32 2, i64 %N)
				store i8* %chr_1_2_n, i8** %pst_1_2_n

				; Fold memchr((char*)a + 1, '\03', N) to N < 6 ? null : (char*)a + 6.
				%pst_1_3_n = getelementptr i8*, i8** %pchr, i32 6
				%chr_1_3_n = call i8* @memchr(i8* %pi8ap1, i32 3, i64 %N)
				store i8* %chr_1_3_n, i8** %pst_1_3_n

				; Fold memchr((char*)a + 1, '\04', N) to null.
				%pst_1_4_n = getelementptr i8*, i8** %pchr, i32 7
				%chr_1_4_n = call i8* @memchr(i8* %pi8ap1, i32 4, i64 %N)
				store i8* %chr_1_4_n, i8** %pst_1_4_n


				%pi8ap2 = getelementptr i8, i8* %pi8a, i32 2

				; Fold memchr((char*)a + 2, '\0', N) to null.
				%pst_2_0_n = getelementptr i8*, i8** %pchr, i32 8
				%chr_2_0_n = call i8* @memchr(i8* %pi8ap2, i32 0, i64 %N)
				store i8* %chr_2_0_n, i8** %pst_2_0_n

				; Fold memchr((char*)a + 2, '\01', N) to N ? (char*)a + 2 : null.
				%pst_2_1_n = getelementptr i8*, i8** %pchr, i32 9
				%chr_2_1_n = call i8* @memchr(i8* %pi8ap2, i32 1, i64 %N)
				store i8* %chr_2_1_n, i8** %pst_2_1_n

				; Fold memchr((char*)a + 2, '\02', N) to N < 3 ? null : (char*)a + 4.
				%pst_2_2_n = getelementptr i8*, i8** %pchr, i32 10
				%chr_2_2_n = call i8* @memchr(i8* %pi8ap2, i32 2, i64 %N)
				store i8* %chr_2_2_n, i8** %pst_2_2_n

				; Fold memchr((char*)a + 2, '\03', N) to N < 5 ? null : (char*)a + 6.
				%pst_2_3_n = getelementptr i8*, i8** %pchr, i32 11
				%chr_2_3_n = call i8* @memchr(i8* %pi8ap2, i32 3, i64 %N)
				store i8* %chr_2_3_n, i8** %pst_2_3_n

				; Fold memchr((char*)a + 2, '\04', N) to null.
				%pst_2_4_n = getelementptr i8*, i8** %pchr, i32 12
				%chr_2_4_n = call i8* @memchr(i8* %pi8ap2, i32 4, i64 %N)
				store i8* %chr_2_4_n, i8** %pst_2_4_n

				ret void
				}


				; Verify that calls with out of bounds offsets are not folded.

				define void @call_memchr_A_pIb_xs_cst(i8** %pchr) {
				; CHECK-LABEL: @call_memchr_A_pIb_xs_cst(
				; CHECK-NEXT: [[CHR_1_0_0_2:%.*]] = call i8* @memchr(i8* noundef nonnull dereferenceable(1) bitcast (%struct.A* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 1, i64 0) to i8*), i32 0, i64 2)
				; CHECK-NEXT: store i8* [[CHR_1_0_0_2]], i8** [[PCHR:%.*]], align 8
				; CHECK-NEXT: [[PST_1_0_1_2:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 1
				; CHECK-NEXT: [[CHR_1_0_1_2:%.*]] = call i8* @memchr(i8* noundef nonnull dereferenceable(1) bitcast (%struct.A* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 1, i64 0) to i8*), i32 0, i64 2)
				; CHECK-NEXT: store i8* [[CHR_1_0_1_2]], i8** [[PST_1_0_1_2]], align 8
				; CHECK-NEXT: [[PST_0_0_8_2:%.*]] = getelementptr i8*, i8** [[PCHR]], i64 2
				; CHECK-NEXT: [[CHR_0_0_8_2:%.*]] = call i8* @memchr(i8* noundef nonnull dereferenceable(1) bitcast (i16* getelementptr inbounds ([1 x %struct.A], [1 x %struct.A]* @a, i64 1, i64 0, i32 0, i64 0) to i8*), i32 0, i64 2)
				; CHECK-NEXT: store i8* [[CHR_0_0_8_2]], i8** [[PST_0_0_8_2]], align 8
				; CHECK-NEXT: ret void
				;
				; Verify that the call isn't folded when the first GEP index is excessive.
				%pa1 = getelementptr [1 x %struct.A], [1 x %struct.A]* @a, i64 1, i64 0
				%pi8a1 = bitcast %struct.A* %pa1 to i8*

				%pi8a1p0 = getelementptr i8, i8* %pi8a1, i32 0

				; Don't fold memchr((char*)&a[1] + 0, '\0', 2).
				%pst_1_0_0_2 = getelementptr i8*, i8** %pchr, i32 0
				%chr_1_0_0_2 = call i8* @memchr(i8* %pi8a1p0, i32 0, i64 2)
				store i8* %chr_1_0_0_2, i8** %pst_1_0_0_2

				%pi8a1p1 = getelementptr i8, i8* %pi8a1, i32 1

				; Likewise, don't fold memchr((char*)&a[1] + 1, '\0', 2).
				%pst_1_0_1_2 = getelementptr i8*, i8** %pchr, i32 1
				%chr_1_0_1_2 = call i8* @memchr(i8* %pi8a1p0, i32 0, i64 2)
				store i8* %chr_1_0_1_2, i8** %pst_1_0_1_2

				; Verify that the call isn't folded when the first GEP index is in bounds
				; but the byte offset is excessive.
				%pa0 = getelementptr [1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0
				%pi8a0 = bitcast %struct.A* %pa0 to i8*

				%pi8a0p8 = getelementptr i8, i8* %pi8a0, i32 8

				; Don't fold memchr((char*)&a[0] + 8, '\0', 2).
				%pst_0_0_8_2 = getelementptr i8*, i8** %pchr, i32 2
				%chr_0_0_8_2 = call i8* @memchr(i8* %pi8a0p8, i32 0, i64 2)
				store i8* %chr_0_0_8_2, i8** %pst_0_0_8_2

				ret void
				}


				@ai64 = constant [2 x i64] [i64 0, i64 -1]

				; Verify that a memchr call with an argument consisting of three GEPs
				; is folded.

				define i8* @fold_memchr_gep_gep_gep() {
				; CHECK-LABEL: @fold_memchr_gep_gep_gep(
				; CHECK-NEXT: ret i8* bitcast (i16* getelementptr (i16, i16* bitcast (i32* getelementptr (i32, i32* bitcast (i64* getelementptr inbounds ([2 x i64], [2 x i64]* @ai64, i64 0, i64 1) to i32*), i64 1) to i16*), i64 1) to i8*)
				;

				%p8_1 = getelementptr [2 x i64], [2 x i64]* @ai64, i64 0, i64 1
				%p4_0 = bitcast i64* %p8_1 to i32*
				%p4_1 = getelementptr i32, i32* %p4_0, i64 1

				%p2_0 = bitcast i32* %p4_1 to i16*
				%p2_1 = getelementptr i16, i16* %p2_0, i64 1
				%q2_1 = bitcast i16* %p2_1 to i8*

				%pc = call i8* @memchr(i8* %q2_1, i32 -1, i64 2)
				ret i8* %pc
				}


				%union.U = type { [2 x i32] }

				@u = constant %union.U { [2 x i32] [i32 286331153, i32 35791394] }

				; Verify memchr folding of a union member.

				define i8* @fold_memchr_union_member() {
				; CHECK-LABEL: @fold_memchr_union_member(
				; BE-CHECK-NEXT: ret i8* getelementptr (i8, i8* bitcast (%union.U* @u to i8*), i64 5)
				; LE-CHECK-NEXT: ret i8* bitcast (i32* getelementptr inbounds (%union.U, %union.U* @u, i64 0, i32 0, i64 1) to i8*)
				;
				nikic: These check lines aren't correct and are probably just getting ignored. You need to pass something like `--check-prefixes=CHECK,BE` and `--check-prefixes=CHECK,LE` and regenerate check lines.
				msebor: Right. You do have a good eye for detail!
				%pu = getelementptr %union.U, %union.U* @u, i64 0
				%pi8u = bitcast %union.U* %pu to i8*
				%pi8u_p1 = getelementptr i8, i8* %pi8u, i64 1
				%pc = call i8* @memchr(i8* %pi8u_p1, i32 34, i64 8)
				ret i8* %pc
				}
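
For illustration only (not part of the patch): the byte offsets that the CHECK lines in memchr-9.ll encode can be cross-checked by calling the real memchr on an equivalent constant. The sketch below assumes a plain C++ struct `A` with the same shape and initializer as `%struct.A`/`@a`; `memchrOffset` is a hypothetical helper that reports the hit as a byte offset from the start of the array, or -1 for a null result. Each 16-bit element consists of two equal bytes, so the results do not depend on endianness.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same shape and initializer as %struct.A / @a in the test.
// Byte representation: 00 00 01 01 02 02 03 03.
struct A {
  std::uint16_t First[2];
  std::uint16_t Second[2];
};

static const A a[1] = {{{0, 257}, {514, 771}}};
static_assert(sizeof(a) == 8, "the model assumes no padding in struct A");

// Return the byte offset from the start of `a` at which
// memchr((const char *)a + Off, C, N) finds C, or -1 if it returns null.
// Callers must keep Off + N within sizeof(a).
inline std::ptrdiff_t memchrOffset(std::size_t Off, int C, std::size_t N) {
  const unsigned char *Base = reinterpret_cast<const unsigned char *>(a);
  const void *Hit = std::memchr(Base + Off, C, N);
  if (!Hit)
    return -1;
  return static_cast<const unsigned char *>(Hit) - Base;
}
```

For example, `memchrOffset(1, 3, 6)` corresponds to `memchr((char*)a + 1, '\03', 6)` and yields offset 6, which is the element the last CHECK line of `fold_memchr_A_pIb_cst_cst` addresses with the indices `i32 1, i64 1`.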

llvm/test/Transforms/InstCombine/memcmp-7.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -passes=instcombine -S | FileCheck %s
				;
				; Exercise folding of memcmp calls with addition expressions involving
				; pointers into constant arrays of types larger than char and fractional
				; offsets.

				declare i32 @memcmp(i8*, i8*, i64)

				@i32a = constant [2 x i16] [i16 4386, i16 13124]
				@i32b = constant [2 x i16] [i16 4386, i16 13124]


				define void @fold_memcmp_i32a_i32b_pIb(i32 %I, i32* %pcmp)
				; CHECK-LABEL: @fold_memcmp_i32a_i32b_pIb(
				; CHECK-NEXT: store i32 0, i32* [[PCMP:%.*]], align 4
				; CHECK-NEXT: [[PST_1_1_2:%.*]] = getelementptr i32, i32* [[PCMP]], i64 1
				; CHECK-NEXT: store i32 0, i32* [[PST_1_1_2]], align 4
				; CHECK-NEXT: [[PST_1_1_3:%.*]] = getelementptr i32, i32* [[PCMP]], i64 2
				; CHECK-NEXT: store i32 0, i32* [[PST_1_1_3]], align 4
				; CHECK-NEXT: ret void
				;
				{
				%pi32a = getelementptr [2 x i16], [2 x i16]* @i32a, i32 0, i32 0
				%pi32b = getelementptr [2 x i16], [2 x i16]* @i32b, i32 0, i32 0

				%pi8a = bitcast i16* %pi32a to i8*
				%pi8b = bitcast i16* %pi32b to i8*

				%pi8ap1 = getelementptr i8, i8* %pi8a, i32 1
				%pi8bp1 = getelementptr i8, i8* %pi8b, i32 1

				%pst_1_1_1 = getelementptr i32, i32* %pcmp, i32 0
				%cmp_1_1_1 = call i32 @memcmp(i8* %pi8ap1, i8* %pi8ap1, i64 1)
				store i32 %cmp_1_1_1, i32* %pst_1_1_1

				%pst_1_1_2 = getelementptr i32, i32* %pcmp, i32 1
				%cmp_1_1_2 = call i32 @memcmp(i8* %pi8ap1, i8* %pi8ap1, i64 2)
				store i32 %cmp_1_1_2, i32* %pst_1_1_2

				%pst_1_1_3 = getelementptr i32, i32* %pcmp, i32 2
				%cmp_1_1_3 = call i32 @memcmp(i8* %pi8ap1, i8* %pi8ap1, i64 3)
				store i32 %cmp_1_1_3, i32* %pst_1_1_3

				ret void
				}
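
For illustration only (not part of the patch): the zero results expected by `fold_memcmp_i32a_i32b_pIb` can be reproduced with the real memcmp on two arrays holding the same values as `@i32a` and `@i32b`. Note that the test as written passes `%pi8ap1` for both arguments, whereas this sketch compares the two separate arrays at the same fractional offset. `memcmpSign` is a hypothetical helper that reduces the result to -1, 0, or +1.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same element type and values as @i32a and @i32b in the test.
static const std::uint16_t i32a[2] = {4386, 13124};
static const std::uint16_t i32b[2] = {4386, 13124};

// Compare N bytes of the two arrays starting OffA and OffB bytes in,
// and reduce the memcmp result to -1, 0, or +1.
// Callers must keep OffA + N and OffB + N within the four-byte arrays.
inline int memcmpSign(std::size_t OffA, std::size_t OffB, std::size_t N) {
  const unsigned char *PA = reinterpret_cast<const unsigned char *>(i32a);
  const unsigned char *PB = reinterpret_cast<const unsigned char *>(i32b);
  int R = std::memcmp(PA + OffA, PB + OffB, N);
  return (R > 0) - (R < 0);
}
```

Because both arrays hold identical values, the comparison at byte offset 1 is zero for lengths 1 through 3 regardless of endianness.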


				%struct.A = type { [4 x i8] }
				%struct.B = type { [2 x i8], [2 x i8] }

				@a = constant [1 x %struct.A] [%struct.A { [4 x i8] [i8 1, i8 2, i8 3, i8 4] }]
				@b = constant [1 x %struct.B] [%struct.B { [2 x i8] [i8 1, i8 2], [2 x i8] [i8 3, i8 4]}]

				define void @fold_memcmp_A_B_pIb(i32 %I, i32* %pcmp) {
				; CHECK-LABEL: @fold_memcmp_A_B_pIb(
				; CHECK-NEXT: store i32 0, i32* [[PCMP:%.*]], align 4
				; CHECK-NEXT: [[PST_0_0_2:%.*]] = getelementptr i32, i32* [[PCMP]], i64 1
				; CHECK-NEXT: store i32 0, i32* [[PST_0_0_2]], align 4
				; CHECK-NEXT: [[PST_0_0_3:%.*]] = getelementptr i32, i32* [[PCMP]], i64 2
				; CHECK-NEXT: store i32 0, i32* [[PST_0_0_3]], align 4
				; CHECK-NEXT: [[PST_0_0_4:%.*]] = getelementptr i32, i32* [[PCMP]], i64 3
				; CHECK-NEXT: store i32 0, i32* [[PST_0_0_4]], align 4
				; CHECK-NEXT: [[PST_0_1_1:%.*]] = getelementptr i32, i32* [[PCMP]], i64 4
				; CHECK-NEXT: store i32 -1, i32* [[PST_0_1_1]], align 4
				; CHECK-NEXT: [[PST_0_1_2:%.*]] = getelementptr i32, i32* [[PCMP]], i64 5
				; CHECK-NEXT: store i32 -1, i32* [[PST_0_1_2]], align 4
				; CHECK-NEXT: [[PST_0_1_3:%.*]] = getelementptr i32, i32* [[PCMP]], i64 6
				; CHECK-NEXT: store i32 -1, i32* [[PST_0_1_3]], align 4
				; CHECK-NEXT: [[PST_1_0_1:%.*]] = getelementptr i32, i32* [[PCMP]], i64 4
				; CHECK-NEXT: store i32 1, i32* [[PST_1_0_1]], align 4
				; CHECK-NEXT: [[PST_1_0_2:%.*]] = getelementptr i32, i32* [[PCMP]], i64 5
				; CHECK-NEXT: store i32 1, i32* [[PST_1_0_2]], align 4
				; CHECK-NEXT: [[PST_1_0_3:%.*]] = getelementptr i32, i32* [[PCMP]], i64 6
				; CHECK-NEXT: store i32 1, i32* [[PST_1_0_3]], align 4
				; CHECK-NEXT: ret void
				;
				%pa = getelementptr [1 x %struct.A], [1 x %struct.A]* @a, i64 0, i64 0
				%pb = getelementptr [1 x %struct.B], [1 x %struct.B]* @b, i64 0, i64 0

				%pi8a = bitcast %struct.A* %pa to i8*
				%pi8b = bitcast %struct.B* %pb to i8*

				%pi8ap0 = getelementptr i8, i8* %pi8a, i32 0
				%pi8bp0 = getelementptr i8, i8* %pi8b, i32 0

				; Fold memcmp(&a, &b, 1) to 0;
				%pst_0_0_1 = getelementptr i32, i32* %pcmp, i32 0
				%cmp_0_0_1 = call i32 @memcmp(i8* %pi8ap0, i8* %pi8bp0, i64 1)
				store i32 %cmp_0_0_1, i32* %pst_0_0_1

				; Fold memcmp(&a, &b, 2) to 0;
				%pst_0_0_2 = getelementptr i32, i32* %pcmp, i32 1
				%cmp_0_0_2 = call i32 @memcmp(i8* %pi8ap0, i8* %pi8bp0, i64 2)
				store i32 %cmp_0_0_2, i32* %pst_0_0_2

				; Fold memcmp(&a, &b, 3) to 0;
				%pst_0_0_3 = getelementptr i32, i32* %pcmp, i32 2
				%cmp_0_0_3 = call i32 @memcmp(i8* %pi8ap0, i8* %pi8bp0, i64 3)
				store i32 %cmp_0_0_3, i32* %pst_0_0_3

				; Fold memcmp(&a, &b, 4) to 0;
				%pst_0_0_4 = getelementptr i32, i32* %pcmp, i32 3
				%cmp_0_0_4 = call i32 @memcmp(i8* %pi8ap0, i8* %pi8bp0, i64 4)
				store i32 %cmp_0_0_4, i32* %pst_0_0_4


				%pi8bp1 = getelementptr i8, i8* %pi8b, i32 1

				; Fold memcmp(&a, (char*)&b + 1, 1) to -1;
				%pst_0_1_1 = getelementptr i32, i32* %pcmp, i32 4
				%cmp_0_1_1 = call i32 @memcmp(i8* %pi8ap0, i8* %pi8bp1, i64 1)
				store i32 %cmp_0_1_1, i32* %pst_0_1_1

				; Fold memcmp(&a, (char*)&b + 1, 2) to -1;
				%pst_0_1_2 = getelementptr i32, i32* %pcmp, i32 5
				%cmp_0_1_2 = call i32 @memcmp(i8* %pi8ap0, i8* %pi8bp1, i64 2)
				store i32 %cmp_0_1_2, i32* %pst_0_1_2

				; Fold memcmp(&a, (char*)&b + 1, 3) to -1;
				%pst_0_1_3 = getelementptr i32, i32* %pcmp, i32 6
				%cmp_0_1_3 = call i32 @memcmp(i8* %pi8ap0, i8* %pi8bp1, i64 3)
				store i32 %cmp_0_1_3, i32* %pst_0_1_3


				%pi8ap1 = getelementptr i8, i8* %pi8a, i32 1

				; Fold memcmp((char*)&a + 1, &b, 1) to +1;
				%pst_1_0_1 = getelementptr i32, i32* %pcmp, i32 4
				%cmp_1_0_1 = call i32 @memcmp(i8* %pi8ap1, i8* %pi8bp0, i64 1)
				store i32 %cmp_1_0_1, i32* %pst_1_0_1

				; Fold memcmp((char*)&a + 1, &b, 2) to +1;
				%pst_1_0_2 = getelementptr i32, i32* %pcmp, i32 5
				%cmp_1_0_2 = call i32 @memcmp(i8* %pi8ap1, i8* %pi8bp0, i64 2)
				store i32 %cmp_1_0_2, i32* %pst_1_0_2

				; Fold memcmp((char*)&a + 1, &b, 3) to +1;
				%pst_1_0_3 = getelementptr i32, i32* %pcmp, i32 6
				%cmp_1_0_3 = call i32 @memcmp(i8* %pi8ap1, i8* %pi8bp0, i64 3)
				store i32 %cmp_1_0_3, i32* %pst_1_0_3

				ret void
				}
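The folds checked in fold_memcmp_A_B_pIb can be reproduced at the C level. The following is a minimal sketch, assuming C structs that mirror %struct.A and %struct.B; the helper name cmp_sign is hypothetical and not part of the patch. It reduces the memcmp result to its sign because the C standard only guarantees the sign, whereas the folder emits exactly -1, 0 or +1.

```c
#include <stddef.h>
#include <string.h>

/* C mirrors of %struct.A and %struct.B from the test (assumed layout:
   four contiguous bytes each, no padding). */
struct A { char a[4]; };
struct B { char a[2]; char b[2]; };

static const struct A a = { { 1, 2, 3, 4 } };
static const struct B b = { { 1, 2 }, { 3, 4 } };

/* Hypothetical helper: compare n bytes starting off_a bytes into a and
   off_b bytes into b, and reduce the result to -1, 0 or +1. */
static int cmp_sign(size_t off_a, size_t off_b, size_t n)
{
    int r = memcmp((const char *)&a + off_a, (const char *)&b + off_b, n);
    return (r > 0) - (r < 0);
}
```

With equal offsets the byte sequences match, so the result is 0 for every length up to 4. Shifting only b by one byte compares 1 against 2 first, giving a negative result; shifting only a gives a positive one.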

llvm/test/Transforms/InstCombine/memcmp-8.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -passes=instcombine -S | FileCheck %s
				;
				; Verify that memcmp calls with past-the-end pointers don't cause trouble
				; and are optimally folded.

				declare i32 @memcmp(i8*, i8*, i64)


				@a5 = constant [5 x i8] c"12345";


				; Fold memcmp(a5, a5 + 5, n) to 0 on the assumption that n is 0 otherwise
				; the call would be undefined.

				define i32 @fold_memcmp_a5_a5p5_n(i64 %n) {
				; CHECK-LABEL: @fold_memcmp_a5_a5p5_n(
				; CHECK-NEXT: ret i32 0
				;
				%pa5_p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%pa5_p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%cmp = call i32 @memcmp(i8* %pa5_p0, i8* %pa5_p5, i64 %n)
				ret i32 %cmp
				}


				; Same as above but for memcmp(a5 + 5, a5 + 5, n).

				define i32 @fold_memcmp_a5p5_a5p5_n(i64 %n) {
				; CHECK-LABEL: @fold_memcmp_a5p5_a5p5_n(
				; CHECK-NEXT: ret i32 0
				;
				%pa5_p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%qa5_p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%cmp = call i32 @memcmp(i8* %pa5_p5, i8* %qa5_p5, i64 %n)
				ret i32 %cmp
				}


				; TODO: Likewise, fold memcmp(a5 + i, a5 + 5, n) to 0 on same basis.

				define i32 @fold_memcmp_a5pi_a5p5_n(i32 %i, i64 %n) {
				; CHECK-LABEL: @fold_memcmp_a5pi_a5p5_n(
				; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[I:%.*]] to i64
				; CHECK-NEXT: [[PA5_PI:%.*]] = getelementptr [5 x i8], [5 x i8]* @a5, i64 0, i64 [[TMP1]]
				; CHECK-NEXT: [[CMP:%.*]] = call i32 @memcmp(i8* [[PA5_PI]], i8* getelementptr inbounds ([5 x i8], [5 x i8]* @a5, i64 1, i64 0), i64 [[N:%.*]])
				; CHECK-NEXT: ret i32 [[CMP]]
				;
				%pa5_pi = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 %i
				%pa5_p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%cmp = call i32 @memcmp(i8* %pa5_pi, i8* %pa5_p5, i64 %n)
				ret i32 %cmp
				}
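The reasoning behind the first two folds in this file is that a past-the-end pointer may be passed to memcmp only when the length is zero, so the only defined result is 0. A minimal C sketch of that premise follows; the wrapper name cmp_with_past_end is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Same contents as @a5 in the test: five bytes, no terminating nul. */
static const char a5[5] = { '1', '2', '3', '4', '5' };

/* Hypothetical wrapper around memcmp(a5, a5 + 5, n).  The second pointer
   is one past the end of a5, so the call is defined only for n == 0. */
static int cmp_with_past_end(size_t n)
{
    return memcmp(a5, a5 + 5, n);
}
```

Calling the wrapper with any nonzero n reads past the end of the array and is undefined, which is what lets the optimizer assume the result is 0 for every n.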

llvm/test/Transforms/InstCombine/memrchr-7.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -passes=instcombine -S | FileCheck %s
				;
				; Verify that the result of memrchr calls with past-the-end pointers used
				; in equality expressions don't cause trouble and either are folded when
				; they might be valid or not when they're provably undefined.

				declare i8* @memrchr(i8*, i32, i64)


				@a5 = constant [5 x i8] c"12345"


				; Fold memrchr(a5 + 5, c, 1) == a5 + 5 to an arbitrary constant.
				; The call is transformed to a5[5] == c by the memrchr simplifier, with
				; a5[5] being indeterminate. The equality is then folded with
				; an undefined/arbitrary result.

				define i1 @call_memrchr_ap5_c_1_eq_a(i32 %c, i64 %n) {
				; CHECK-LABEL: @call_memrchr_ap5_c_1_eq_a(
				; CHECK-NEXT: ret i1
				;
				%pap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%qap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 1, i32 0
				%q = call i8* @memrchr(i8* %pap5, i32 %c, i64 1)
				%cmp = icmp eq i8* %q, %qap5
				ret i1 %cmp
				}


				; Fold memrchr(a5 + 5, c, 5) == a5 + 5 to an arbitrary constant.

				define i1 @call_memrchr_ap5_c_5_eq_a(i32 %c, i64 %n) {
				; CHECK-LABEL: @call_memrchr_ap5_c_5_eq_a(
				; CHECK-NEXT: ret i1
				;
				%pap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%qap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 1, i32 0
				%q = call i8* @memrchr(i8* %pap5, i32 %c, i64 5)
				%cmp = icmp eq i8* %q, %qap5
				ret i1 %cmp
				}


				; Fold memrchr(a5 + 5, c, n) == a5 to false.

				define i1 @fold_memrchr_ap5_c_n_eq_a(i32 %c, i64 %n) {
				; CHECK-LABEL: @fold_memrchr_ap5_c_n_eq_a(
				; CHECK-NEXT: ret i1 false
				;
				%pa = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%pap5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%q = call i8* @memrchr(i8* %pap5, i32 %c, i64 %n)
				%cmp = icmp eq i8* %q, %pa
				ret i1 %cmp
				}


				; Fold memrchr(a5 + 5, c, n) == null to true on the basis that n must
				; be zero in order for the call to be valid.

				define i1 @fold_memrchr_ap5_c_n_eqz(i32 %c, i64 %n) {
				; CHECK-LABEL: @fold_memrchr_ap5_c_n_eqz(
				; CHECK-NEXT: ret i1 true
				;
				%p = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%q = call i8* @memrchr(i8* %p, i32 %c, i64 %n)
				%cmp = icmp eq i8* %q, null
				ret i1 %cmp
				}


				; Fold memrchr(a5 + 5, '\0', n) == null to true again on the basis that
				; n must be zero in order for the call to be valid.

				define i1 @fold_memrchr_a_nul_n_eqz(i64 %n) {
				; CHECK-LABEL: @fold_memrchr_a_nul_n_eqz(
				; CHECK-NEXT: ret i1 true
				;
				%p = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%q = call i8* @memrchr(i8* %p, i32 0, i64 %n)
				%cmp = icmp eq i8* %q, null
				ret i1 %cmp
				}
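The last two folds rest on memrchr returning null whenever the length is zero, which is the only length for which a call with a past-the-end pointer is valid. Because memrchr is a GNU extension, the sketch below uses a hypothetical portable stand-in, rfind_byte, to show that behaviour; it is an illustration, not the library implementation.

```c
#include <stddef.h>

/* Same contents as @a5 in the test. */
static const char a5[5] = { '1', '2', '3', '4', '5' };

/* Hypothetical stand-in for memrchr: return a pointer to the last
   occurrence of byte c in the first n bytes of s, or NULL if absent. */
static const void *rfind_byte(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s + n;
    while (n--) {
        if (*--p == (unsigned char)c)
            return p;
    }
    return NULL;
}
```

With n equal to zero the loop body never runs, so no byte is read and the result is NULL regardless of c, matching the `ret i1 true` expected for the comparisons against null.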

llvm/test/Transforms/InstCombine/str-int-3.ll

; Fold atoi(a[1].b) to 1234.
%ia1b = call i32 @atoi(i8* %pa1b)		%ia1b = call i32 @atoi(i8* %pa1b)
%pia1b = getelementptr i32, i32* %pi, i32 3		%pia1b = getelementptr i32, i32* %pi, i32 3
store i32 %ia1b, i32* %pia1b		store i32 %ia1b, i32* %pia1b

ret void		ret void
}		}


; Do not fold atoi with an excessive offset. It's undefined so folding		; Fold atoi with an excessive offset. It's undefined so folding it to zero
; it (e.g., to zero) would be valid and might prevent crashes or returning		; is valid and might prevent crashes or returning a bogus value, even though
; a bogus value but could also prevent detecting the bug by sanitizers.		; it prevents detecting the bug by sanitizers.

define void @call_atoi_offset_out_of_bounds(i32* %pi) {		define void @call_atoi_offset_out_of_bounds(i32* %pi) {
; CHECK-LABEL: @call_atoi_offset_out_of_bounds(		; CHECK-LABEL: @call_atoi_offset_out_of_bounds(
; CHECK-NEXT: [[IA_0_0_32:%.*]] = call i32 @atoi(i8* getelementptr inbounds ([2 x %struct.A], [2 x %struct.A]* @a, i64 1, i64 0, i32 0, i64 0))		; CHECK-NEXT: store i32 0, i32* [[PI:%.*]], align 4
; CHECK-NEXT: store i32 [[IA_0_0_32]], i32* [[PI:%.*]], align 4
; CHECK-NEXT: [[IA_0_0_33:%.*]] = call i32 @atoi(i8* getelementptr ([2 x %struct.A], [2 x %struct.A]* @a, i64 1, i64 0, i32 0, i64 1))		; CHECK-NEXT: [[IA_0_0_33:%.*]] = call i32 @atoi(i8* getelementptr ([2 x %struct.A], [2 x %struct.A]* @a, i64 1, i64 0, i32 0, i64 1))
; CHECK-NEXT: store i32 [[IA_0_0_33]], i32* [[PI]], align 4		; CHECK-NEXT: store i32 [[IA_0_0_33]], i32* [[PI]], align 4
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
; Do not fold atoi((const char*)a + sizeof a).		; Fold atoi((const char*)a + sizeof a) to zero.
%pa_0_0_32 = getelementptr [2 x %struct.A], [2 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 32		%pa_0_0_32 = getelementptr [2 x %struct.A], [2 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 32
%ia_0_0_32 = call i32 @atoi(i8* %pa_0_0_32)		%ia_0_0_32 = call i32 @atoi(i8* %pa_0_0_32)
%pia_0_0_32 = getelementptr i32, i32* %pi, i32 0		%pia_0_0_32 = getelementptr i32, i32* %pi, i32 0
store i32 %ia_0_0_32, i32* %pia_0_0_32		store i32 %ia_0_0_32, i32* %pia_0_0_32

; Likewise, do not fold atoi((const char*)a + sizeof a + 1).		; Likewise, fold atoi((const char*)a + sizeof a + 1) to zero.
%pa_0_0_33 = getelementptr [2 x %struct.A], [2 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 33		%pa_0_0_33 = getelementptr [2 x %struct.A], [2 x %struct.A]* @a, i64 0, i64 0, i32 0, i64 33
%ia_0_0_33 = call i32 @atoi(i8* %pa_0_0_33)		%ia_0_0_33 = call i32 @atoi(i8* %pa_0_0_33)
%pia_0_0_33 = getelementptr i32, i32* %pi, i32 0		%pia_0_0_33 = getelementptr i32, i32* %pi, i32 0
store i32 %ia_0_0_33, i32* %pia_0_0_33		store i32 %ia_0_0_33, i32* %pia_0_0_33

ret void		ret void
}		}


llvm/test/Transforms/InstCombine/strcall-no-nul.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -passes=instcombine -S | FileCheck %s
				;
				; Verify that calls with arguments with pointers just past the end of
				; a string to [a subset of] library functions that expect nul-terminated
				; strings as arguments are folded to safe values. The rationale is that
				; since they are undefined and even though folding them isn't important
				; for efficiency and prevents sanitizers from detecting and reporting
				; them, sanitizers usually don't run, and transforming such invalid
				; calls to something valid is safer than letting the program run off
				; the rails. See the Safe Optimizations for Sanitizers RFC for
				; an in-depth discussion of the trade-offs:
				; https://discourse.llvm.org/t/rfc-safe-optimizations-for-sanitizers

				declare i8* @strchr(i8*, i32)
				declare i8* @strrchr(i8*, i32)
				declare i32 @strcmp(i8*, i8*)
				declare i32 @strncmp(i8*, i8*, i64)
				declare i8* @strstr(i8*, i8*)

				declare i8* @stpcpy(i8*, i8*)
				declare i8* @strcpy(i8*, i8*)
				declare i8* @stpncpy(i8*, i8*, i64)
				declare i8* @strncpy(i8*, i8*, i64)

				declare i64 @strlen(i8*)
				declare i64 @strnlen(i8*, i64)

				declare i8* @strpbrk(i8*, i8*)

				declare i64 @strspn(i8*, i8*)
				declare i64 @strcspn(i8*, i8*)

				declare i32 @sprintf(i8*, i8*, ...)
				declare i32 @snprintf(i8*, i64, i8*, ...)


				@a5 = constant [5 x i8] c"%s\0045";


				; Fold strchr(a5 + 5, '\0') to null.

				define i8* @fold_strchr_past_end() {
				; CHECK-LABEL: @fold_strchr_past_end(
				; CHECK-NEXT: ret i8* getelementptr inbounds ([5 x i8], [5 x i8]* @a5, i64 1, i64 0)
				;
				%p = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%q = call i8* @strchr(i8* %p, i32 0)
				ret i8* %q
				}

				; Fold strcmp(a5, a5 + 5) (and vice versa) to null.

				define void @fold_strcmp_past_end(i32* %pcmp) {
				; CHECK-LABEL: @fold_strcmp_past_end(
				; CHECK-NEXT: store i32 1, i32* [[PCMP:%.*]], align 4
				; CHECK-NEXT: [[PC50:%.*]] = getelementptr i32, i32* [[PCMP]], i64 1
				; CHECK-NEXT: store i32 -1, i32* [[PC50]], align 4
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%c05 = call i32 @strcmp(i8* %p0, i8* %p5)
				%pc05 = getelementptr i32, i32* %pcmp, i32 0
				store i32 %c05, i32* %pc05

				%c50 = call i32 @strcmp(i8* %p5, i8* %p0)
				%pc50 = getelementptr i32, i32* %pcmp, i32 1
				store i32 %c50, i32* %pc50

				ret void
				}


				; Likewise, fold strncmp(a5, a5 + 5, 5) (and vice versa) to null.

				define void @fold_strncmp_past_end(i32* %pcmp) {
				; CHECK-LABEL: @fold_strncmp_past_end(
				; CHECK-NEXT: store i32 1, i32* [[PCMP:%.*]], align 4
				; CHECK-NEXT: [[PC50:%.*]] = getelementptr i32, i32* [[PCMP]], i64 1
				; CHECK-NEXT: store i32 -1, i32* [[PC50]], align 4
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%c05 = call i32 @strncmp(i8* %p0, i8* %p5, i64 5)
				%pc05 = getelementptr i32, i32* %pcmp, i32 0
				store i32 %c05, i32* %pc05

				%c50 = call i32 @strncmp(i8* %p5, i8* %p0, i64 5)
				%pc50 = getelementptr i32, i32* %pcmp, i32 1
				store i32 %c50, i32* %pc50

				ret void
				}


				; Fold strrchr(a5 + 5, '\0') to null.

				define i8* @fold_strrchr_past_end(i32 %c) {
				; CHECK-LABEL: @fold_strrchr_past_end(
				; CHECK-NEXT: ret i8* getelementptr inbounds ([5 x i8], [5 x i8]* @a5, i64 1, i64 0)
				;
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%r = call i8* @strrchr(i8* %p5, i32 0)
				ret i8* %r
				}


				; Fold strstr(a5 + 5, a5) (and vice versa) to null.

				define void @fold_strstr_past_end(i8** %psub) {
				; CHECK-LABEL: @fold_strstr_past_end(
				; CHECK-NEXT: store i8* getelementptr inbounds ([5 x i8], [5 x i8]* @a5, i64 0, i64 0), i8** [[PSUB:%.*]], align 8
				; CHECK-NEXT: [[PS50:%.*]] = getelementptr i8*, i8** [[PSUB]], i64 1
				; CHECK-NEXT: store i8* null, i8** [[PS50]], align 8
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%s05 = call i8* @strstr(i8* %p0, i8* %p5)
				%ps05 = getelementptr i8*, i8** %psub, i32 0
				store i8* %s05, i8** %ps05

				%s50 = call i8* @strstr(i8* %p5, i8* %p0)
				%ps50 = getelementptr i8*, i8** %psub, i32 1
				store i8* %s50, i8** %ps50

				ret void
				}


				; Fold strlen(a5 + 5) to 0.

				define i64 @fold_strlen_past_end() {
				; CHECK-LABEL: @fold_strlen_past_end(
				; CHECK-NEXT: ret i64 0
				;
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%r = call i64 @strlen(i8* %p5)
				nikic (Done): This check line doesn't make a lot of sense ... the syntax is incorrect (and the incorrect syntax is repeated in the cases below) and the ret wouldn't make sense here anyway (because strlen returns an integer not a pointer). The TODO also seems outdated, as this already folds to 0.
				msebor (author, Done): I've fixed the `strlen` typo. I hand-wrote all these lines and I realize not all of them are entirely correct. My main goal was to more prominently mark up the expected failures, but I was also hoping that `llvm-lit` would pick them up somehow and indicate when there are expected failures in the test. I've removed the rest of the `XFAIL-` lines from the final commit but, for future reference, is there a way to do what I want? (I couldn't find an example in the test suite that I didn't add myself.)
				nikic (Not Done): It's possible to mark the entire test as XFAIL using `XFAIL: *`, but I don't think XFAIL interacts with FileCheck in any way. General convention is to check for current codegen and put a TODO/FIXME if it's currently incorrect/suboptimal.
				ret i64 %r
				}


				; TODO: Fold stpcpy(dst, a5 + 5) to (*dst = '\0', dst).

				define i8* @fold_stpcpy_past_end(i8* %dst) {
				; CHECK-LABEL: @fold_stpcpy_past_end(
				; CHECK-NEXT: ret i8* [[DST:%.*]]
				;
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%r = call i8* @strcpy(i8* %dst, i8* %p5)
				ret i8* %r
				}


				; TODO: Fold strcpy(dst, a5 + 5) to (*dst = '\0', dst).

				define i8* @fold_strcpy_past_end(i8* %dst) {
				; CHECK-LABEL: @fold_strcpy_past_end(
				; CHECK-NEXT: ret i8* [[DST:%.*]]
				;
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%r = call i8* @strcpy(i8* %dst, i8* %p5)
				ret i8* %r
				}


				; TODO: Fold stpncpy(dst, a5 + 5, 5) to (memset(dst, 0, 5), dst + 5).

				define i8* @fold_stpncpy_past_end(i8* %dst) {
				; CHECK-LABEL: @fold_stpncpy_past_end(
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(5) [[DST:%.*]], i8 0, i64 5, i1 false)
				; CHECK-NEXT: ret i8* [[DST]]
				;
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%r = call i8* @strncpy(i8* %dst, i8* %p5, i64 5)
				ret i8* %r
				}


				; TODO: Fold strncpy(dst, a5 + 5, 5) to memset(dst, 0, 5).

				define i8* @fold_strncpy_past_end(i8* %dst) {
				; CHECK-LABEL: @fold_strncpy_past_end(
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(5) [[DST:%.*]], i8 0, i64 5, i1 false)
				; CHECK-NEXT: ret i8* [[DST]]
				;
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%r = call i8* @strncpy(i8* %dst, i8* %p5, i64 5)
				ret i8* %r
				}


				; Fold strpbrk(a5, a5 + 5) (and vice versa) to null.

				define void @fold_strpbrk_past_end(i8** %psub) {
				; CHECK-LABEL: @fold_strpbrk_past_end(
				; CHECK-NEXT: store i8* null, i8** [[PSUB:%.*]], align 8
				; CHECK-NEXT: [[PS50:%.*]] = getelementptr i8*, i8** [[PSUB]], i64 1
				; CHECK-NEXT: store i8* null, i8** [[PS50]], align 8
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%s05 = call i8* @strpbrk(i8* %p0, i8* %p5)
				%ps05 = getelementptr i8*, i8** %psub, i32 0
				store i8* %s05, i8** %ps05

				%s50 = call i8* @strpbrk(i8* %p5, i8* %p0)
				%ps50 = getelementptr i8*, i8** %psub, i32 1
				store i8* %s50, i8** %ps50

				ret void
				}


				; Fold strspn(a5, a5 + 5) (and vice versa) to null.

				define void @fold_strspn_past_end(i64* %poff) {
				; CHECK-LABEL: @fold_strspn_past_end(
				; CHECK-NEXT: store i64 0, i64* [[POFF:%.*]], align 4
				; CHECK-NEXT: [[PO50:%.*]] = getelementptr i64, i64* [[POFF]], i64 1
				; CHECK-NEXT: store i64 0, i64* [[PO50]], align 4
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%o05 = call i64 @strspn(i8* %p0, i8* %p5)
				%po05 = getelementptr i64, i64* %poff, i32 0
				store i64 %o05, i64* %po05

				%o50 = call i64 @strspn(i8* %p5, i8* %p0)
				%po50 = getelementptr i64, i64* %poff, i32 1
				store i64 %o50, i64* %po50

				ret void
				}


				; Fold strcspn(a5, a5 + 5) (and vice versa) to null.

				define void @fold_strcspn_past_end(i64* %poff) {
				; CHECK-LABEL: @fold_strcspn_past_end(
				; CHECK-NEXT: store i64 2, i64* [[POFF:%.*]], align 4
				; CHECK-NEXT: [[PO50:%.*]] = getelementptr i64, i64* [[POFF]], i64 1
				; CHECK-NEXT: store i64 0, i64* [[PO50]], align 4
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%o05 = call i64 @strcspn(i8* %p0, i8* %p5)
				%po05 = getelementptr i64, i64* %poff, i32 0
				store i64 %o05, i64* %po05

				%o50 = call i64 @strcspn(i8* %p5, i8* %p0)
				%po50 = getelementptr i64, i64* %poff, i32 1
				store i64 %o50, i64* %po50

				ret void
				}


				; Fold sprintf(dst, a5 + 5) to zero, and also
				; TODO: fold sprintf(dst, "%s", a5 + 5) to zero.

				define void @fold_sprintf_past_end(i32* %pcnt, i8* %dst) {
				; CHECK-LABEL: @fold_sprintf_past_end(
				; CHECK-NEXT: store i32 0, i32* [[PCNT:%.*]], align 4
				; CHECK-NEXT: [[PN05:%.*]] = getelementptr i32, i32* [[PCNT]], i64 1
				; CHECK-NEXT: store i32 0, i32* [[PN05]], align 4
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%n5_ = call i32 (i8*, i8*, ...) @sprintf(i8* %dst, i8* %p5)
				%pn5_ = getelementptr i32, i32* %pcnt, i32 0
				store i32 %n5_, i32* %pn5_

				%n05 = call i32 (i8*, i8*, ...) @sprintf(i8* %dst, i8* %p0, i8* %p5)
				%pn05 = getelementptr i32, i32* %pcnt, i32 1
				store i32 %n05, i32* %pn05

				ret void
				}


				; Fold snprintf(dst, n, a5 + 5) to zero, and also
				; TODO: fold snprintf(dst, n, "%s", a5 + 5) to zero.

				define void @fold_snprintf_past_end(i32* %pcnt, i8* %dst, i64 %n) {
				; CHECK-LABEL: @fold_snprintf_past_end(
				; CHECK-NEXT: [[N5_:%.*]] = call i32 (i8*, i64, i8*, ...) @snprintf(i8* [[DST:%.*]], i64 [[N:%.*]], i8* getelementptr inbounds ([5 x i8], [5 x i8]* @a5, i64 1, i64 0))
				; CHECK-NEXT: store i32 [[N5_]], i32* [[PCNT:%.*]], align 4
				; CHECK-NEXT: [[N05:%.*]] = call i32 (i8*, i64, i8*, ...) @snprintf(i8* [[DST]], i64 [[N]], i8* getelementptr inbounds ([5 x i8], [5 x i8]* @a5, i64 0, i64 0), i8* getelementptr inbounds ([5 x i8], [5 x i8]* @a5, i64 1, i64 0))
				; CHECK-NEXT: [[PN05:%.*]] = getelementptr i32, i32* [[PCNT]], i64 1
				; CHECK-NEXT: store i32 [[N05]], i32* [[PN05]], align 4
				; CHECK-NEXT: ret void
				;
				%p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5

				%n5_ = call i32 (i8*, i64, i8*, ...) @snprintf(i8* %dst, i64 %n, i8* %p5)
				%pn5_ = getelementptr i32, i32* %pcnt, i32 0
				store i32 %n5_, i32* %pn5_

				%n05 = call i32 (i8*, i64, i8*, ...) @snprintf(i8* %dst, i64 %n, i8* %p0, i8* %p5)
				%pn05 = getelementptr i32, i32* %pcnt, i32 1
				store i32 %n05, i32* %pn05

				ret void
				}
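Several of the constants in the CHECK lines above (1 and -1 for strcmp, a5 and null for strstr, 2 and 0 for strcspn, 0 for strlen) are what the C library returns if the past-the-end pointer a5 + 5 is replaced by an empty string. The sketch below records that observation; modelling a5 + 5 as "" is an assumption made for illustration, and the names past_end, struct folded and model_past_end are hypothetical. Note that c"%s\0045" in the IR is the five bytes '%', 's', nul, '4', '5'.

```c
#include <stddef.h>
#include <string.h>

/* Same bytes as @a5 in the test: "%s", an embedded nul, then "45". */
static const char a5[5] = { '%', 's', '\0', '4', '5' };

/* Assumption for illustration only: stand in for a5 + 5 with an empty
   string instead of reading past the end of the array. */
static const char *const past_end = "";

struct folded {
    int cmp_0_5, cmp_5_0;            /* sign of strcmp in each direction */
    const char *str_0_5, *str_5_0;   /* strstr in each direction */
    size_t cspn_0_5, cspn_5_0;       /* strcspn in each direction */
    size_t len_5;                    /* strlen of the stand-in */
};

static int sign(int r) { return (r > 0) - (r < 0); }

/* Hypothetical helper: evaluate the calls with the empty-string model. */
static struct folded model_past_end(void)
{
    struct folded f;
    f.cmp_0_5 = sign(strcmp(a5, past_end));
    f.cmp_5_0 = sign(strcmp(past_end, a5));
    f.str_0_5 = strstr(a5, past_end);
    f.str_5_0 = strstr(past_end, a5);
    f.cspn_0_5 = strcspn(a5, past_end);
    f.cspn_5_0 = strcspn(past_end, a5);
    f.len_5 = strlen(past_end);
    return f;
}
```

The model only explains the expected values; it says nothing about how SimplifyLibCalls arrives at them.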

llvm/test/Transforms/InstCombine/strlen-9.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				;
				; Verify that strlen calls with unterminated constant arrays or with
				; just past-the-end pointers to strings are not folded.
				;
				; RUN: opt < %s -passes=instcombine -S | FileCheck %s

				declare i64 @strlen(i8*)

				@a5 = constant [5 x i8] c"12345"
				@s5 = constant [6 x i8] c"12345\00"
				@z0 = constant [0 x i8] zeroinitializer
				@z5 = constant [5 x i8] zeroinitializer


				; Verify that all the invalid calls below are folded. This is safer than
				; making the library calls even though it prevents sanitizers from reporting
				; the bugs.

				define void @fold_strlen_no_nul(i64* %plen, i32 %i) {
				; CHECK-LABEL: @fold_strlen_no_nul(
				; CHECK-NEXT: store i64 5, i64* [[PLEN:%.*]], align 4
				; CHECK-NEXT: [[PNA5_P5:%.*]] = getelementptr i64, i64* [[PLEN]], i64 1
				; CHECK-NEXT: store i64 0, i64* [[PNA5_P5]], align 4
				; CHECK-NEXT: [[PNS5_P6:%.*]] = getelementptr i64, i64* [[PLEN]], i64 2
				; CHECK-NEXT: store i64 0, i64* [[PNS5_P6]], align 4
				; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[I:%.*]] to i64
				; CHECK-NEXT: [[PA5_PI:%.*]] = getelementptr [5 x i8], [5 x i8]* @a5, i64 0, i64 [[TMP1]]
				; CHECK-NEXT: [[NA5_PI:%.*]] = call i64 @strlen(i8* noundef nonnull dereferenceable(1) [[PA5_PI]])
				; CHECK-NEXT: [[PNA5_PI:%.*]] = getelementptr i64, i64* [[PLEN]], i64 3
				; CHECK-NEXT: store i64 [[NA5_PI]], i64* [[PNA5_PI]], align 4
				; CHECK-NEXT: [[PNZ0_P0:%.*]] = getelementptr i64, i64* [[PLEN]], i64 4
				; CHECK-NEXT: store i64 0, i64* [[PNZ0_P0]], align 4
				; CHECK-NEXT: [[TMP2:%.*]] = sext i32 [[I]] to i64
				; CHECK-NEXT: [[PZ0_PI:%.*]] = getelementptr [0 x i8], [0 x i8]* @z0, i64 0, i64 [[TMP2]]
				; CHECK-NEXT: [[NZ0_PI:%.*]] = call i64 @strlen(i8* noundef nonnull dereferenceable(1) [[PZ0_PI]])
				; CHECK-NEXT: [[PNZ0_PI:%.*]] = getelementptr i64, i64* [[PLEN]], i64 5
				; CHECK-NEXT: store i64 [[NZ0_PI]], i64* [[PNZ0_PI]], align 4
				; CHECK-NEXT: [[PNZ5_P5:%.*]] = getelementptr i64, i64* [[PLEN]], i64 6
				; CHECK-NEXT: store i64 0, i64* [[PNZ5_P5]], align 4
				; CHECK-NEXT: ret void
				;
				; Verify that strlen(a5) is folded to 5.
				%pa0_p0 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 0
				%na5_p0 = call i64 @strlen(i8* %pa0_p0)
				%pna5_p0 = getelementptr i64, i64* %plen, i64 0
				store i64 %na5_p0, i64* %pna5_p0

				; Verify that strlen(a5 + 5) is folded to 0.
				%pa5_p5 = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 5
				%na5_p5 = call i64 @strlen(i8* %pa5_p5)
				%pna5_p5 = getelementptr i64, i64* %plen, i64 1
				store i64 %na5_p5, i64* %pna5_p5

				; Verify that strlen(s5 + 6) is folded to 0.
				%ps5_p6 = getelementptr [6 x i8], [6 x i8]* @s5, i32 0, i32 6
				%ns5_p6 = call i64 @strlen(i8* %ps5_p6)
				%pns5_p6 = getelementptr i64, i64* %plen, i64 2
				store i64 %ns5_p6, i64* %pns5_p6

				; TODO: Verify that strlen(a5 + i) is folded to 5 - i? It's currently
				; not folded because the variable offset makes getConstantDataArrayInfo
				; fail.
				%pa5_pi = getelementptr [5 x i8], [5 x i8]* @a5, i32 0, i32 %i
				%na5_pi = call i64 @strlen(i8* %pa5_pi)
				%pna5_pi = getelementptr i64, i64* %plen, i64 3
				store i64 %na5_pi, i64* %pna5_pi

				; Verify that strlen(z0) is folded to 0.
				%pz0_p0 = getelementptr [0 x i8], [0 x i8]* @z0, i32 0, i32 0
				%nz0_p0 = call i64 @strlen(i8* %pz0_p0)
				%pnz0_p0 = getelementptr i64, i64* %plen, i64 4
				store i64 %nz0_p0, i64* %pnz0_p0

				; TODO: Verify that strlen(z0 + i) is folded to 0. As the case above,
				; this one is not folded either because the variable offset makes
				; getConstantDataArrayInfo fail.

				%pz0_pi = getelementptr [0 x i8], [0 x i8]* @z0, i32 0, i32 %i
				%nz0_pi = call i64 @strlen(i8* %pz0_pi)
				%pnz0_pi = getelementptr i64, i64* %plen, i64 5
				store i64 %nz0_pi, i64* %pnz0_pi

				; Verify that strlen(z5 + 5) is folded to 0.
				%pz5_p5 = getelementptr [5 x i8], [5 x i8]* @z5, i32 0, i32 5
				%nz5_p5 = call i64 @strlen(i8* %pz5_p5)
				%pnz5_p5 = getelementptr i64, i64* %plen, i64 6
				store i64 %nz5_p5, i64* %pnz5_p5

				ret void
				}
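The constant-offset results expected by fold_strlen_no_nul (5, 0, 0, 0 and 0) match a length computation that never reads beyond the array: count bytes up to the first nul or the end of the array, whichever comes first, and return 0 for an offset at or past the end. The following is a minimal sketch of that model; model_strlen is a hypothetical helper and does not reproduce the logic of getConstantDataArrayInfo.

```c
#include <stddef.h>
#include <string.h>

/* Same contents as @a5, @s5 and @z5 in the test. */
static const char a5[5] = { '1', '2', '3', '4', '5' };
static const char s5[6] = "12345";
static const char z5[5] = { 0 };

/* Hypothetical model of the folded result of strlen(arr + off) for an
   array of size bytes: never look past the end of the array. */
static size_t model_strlen(const char *arr, size_t size, size_t off)
{
    if (off >= size)
        return 0;   /* pointer at or past the end: treated as empty */
    const char *start = arr + off;
    const char *nul = memchr(start, '\0', size - off);
    return nul ? (size_t)(nul - start) : size - off;
}
```

For the unterminated a5 the model returns the number of remaining bytes, which is why strlen(a5) is expected to fold to 5 even though the call is undefined in C.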

llvm/test/Transforms/InstCombine/strnlen-1.ll

	}			}


	; Fold strnlen(ax, 1) to *ax ? 1 : 0.			; Fold strnlen(ax, 1) to *ax ? 1 : 0.

	define i64 @fold_strnlen_ax_1() {			define i64 @fold_strnlen_ax_1() {
	; CHECK-LABEL: @fold_strnlen_ax_1(			; CHECK-LABEL: @fold_strnlen_ax_1(
	; CHECK-NEXT: [[STRNLEN_CHAR0:%.*]] = load i8, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @ax, i64 0, i64 0), align 1			; CHECK-NEXT: [[STRNLEN_CHAR0:%.*]] = load i8, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @ax, i64 0, i64 0), align 1
	; CHECK-NEXT: [[STRNLEN_CHAR0CMP_NOT:%.*]] = icmp ne i8 [[STRNLEN_CHAR0]], 0			; CHECK-NEXT: [[STRNLEN_CHAR0CMP:%.*]] = icmp ne i8 [[STRNLEN_CHAR0]], 0
	; CHECK-NEXT: [[STRNLEN_SEL:%.*]] = zext i1 [[STRNLEN_CHAR0CMP_NOT]] to i64			; CHECK-NEXT: [[TMP1:%.*]] = zext i1 [[STRNLEN_CHAR0CMP]] to i64
	; CHECK-NEXT: ret i64 [[STRNLEN_SEL]]			; CHECK-NEXT: ret i64 [[TMP1]]
	;			;
	%ptr = getelementptr [0 x i8], [0 x i8]* @ax, i32 0, i32 0			%ptr = getelementptr [0 x i8], [0 x i8]* @ax, i32 0, i32 0
	%len = call i64 @strnlen(i8* %ptr, i64 1)			%len = call i64 @strnlen(i8* %ptr, i64 1)
	ret i64 %len			ret i64 %len
	}			}


	; Fold strnlen(s5, 0) to 0.			; Fold strnlen(s5, 0) to 0.
	[... 63 lines not shown ...]
	; CHECK-NEXT: ret i64 0			; CHECK-NEXT: ret i64 0
	;			;
	%ptr = getelementptr [9 x i8], [9 x i8]* @s5_3, i32 0, i32 5			%ptr = getelementptr [9 x i8], [9 x i8]* @s5_3, i32 0, i32 5
	%len = call i64 @strnlen(i8* %ptr, i64 5)			%len = call i64 @strnlen(i8* %ptr, i64 5)
	ret i64 %len			ret i64 %len
	}			}


	; Fold strnlen(s5_3 + 6, 5) to 3.			; Fold strnlen(s5_3 + 6, 3) to 3.

	define i64 @fold_strnlen_s5_3_p6_5() {			define i64 @fold_strnlen_s5_3_p6_3() {
	; CHECK-LABEL: @fold_strnlen_s5_3_p6_5(			; CHECK-LABEL: @fold_strnlen_s5_3_p6_3(
	; CHECK-NEXT: ret i64 3			; CHECK-NEXT: ret i64 3
	;			;
	%ptr = getelementptr [9 x i8], [9 x i8]* @s5_3, i32 0, i32 6			%ptr = getelementptr [9 x i8], [9 x i8]* @s5_3, i32 0, i32 6
	%len = call i64 @strnlen(i8* %ptr, i64 5)			%len = call i64 @strnlen(i8* %ptr, i64 3)
				ret i64 %len
				}


				; Fold even the invalid strnlen(s5_3 + 6, 4) call where the bound exceeds
				nikic (Not Done): This TODO is probably not relevant anymore given the updated patch?
				; the number of characters in the array. This is arguably safer than
				; making the library call (although the low bound makes it unlikely that
				; the call would misbehave).

				define i64 @call_strnlen_s5_3_p6_4() {
				; CHECK-LABEL: @call_strnlen_s5_3_p6_4(
				; CHECK-NEXT: ret i64 3
				;
				%ptr = getelementptr [9 x i8], [9 x i8]* @s5_3, i32 0, i32 6
				%len = call i64 @strnlen(i8* %ptr, i64 4)
	ret i64 %len			ret i64 %len
	}			}

llvm/test/Transforms/InstCombine/wcslen-1.ll

;
%hello_p = getelementptr [6 x i32], [6 x i32]* @hello, i64 0, i64 0		%hello_p = getelementptr [6 x i32], [6 x i32]* @hello, i64 0, i64 0
%hello_l = call i64 @wcslen(i32* %hello_p)		%hello_l = call i64 @wcslen(i32* %hello_p)
%eq_hello = icmp eq i64 %hello_l, 0		%eq_hello = icmp eq i64 %hello_l, 0
ret i1 %eq_hello		ret i1 %eq_hello
}		}

define i1 @test_simplify6(i32* %str_p) {		define i1 @test_simplify6(i32* %str_p) {
; CHECK-LABEL: @test_simplify6(		; CHECK-LABEL: @test_simplify6(
; CHECK-NEXT: [[STRLENFIRST:%.*]] = load i32, i32* [[STR_P:%.*]], align 4		; CHECK-NEXT: [[CHAR0:%.*]] = load i32, i32* [[STR_P:%.*]], align 4
; CHECK-NEXT: [[EQ_NULL:%.*]] = icmp eq i32 [[STRLENFIRST]], 0		; CHECK-NEXT: [[EQ_NULL:%.*]] = icmp eq i32 [[CHAR0]], 0
; CHECK-NEXT: ret i1 [[EQ_NULL]]		; CHECK-NEXT: ret i1 [[EQ_NULL]]
;		;
%str_l = call i64 @wcslen(i32* %str_p)		%str_l = call i64 @wcslen(i32* %str_p)
%eq_null = icmp eq i64 %str_l, 0		%eq_null = icmp eq i64 %str_l, 0
ret i1 %eq_null		ret i1 %eq_null
}		}

; Check wcslen(x) != 0 --> *x != 0.		; Check wcslen(x) != 0 --> *x != 0.

define i1 @test_simplify7() {		define i1 @test_simplify7() {
; CHECK-LABEL: @test_simplify7(		; CHECK-LABEL: @test_simplify7(
; CHECK-NEXT: ret i1 true		; CHECK-NEXT: ret i1 true
;		;
%hello_p = getelementptr [6 x i32], [6 x i32]* @hello, i64 0, i64 0		%hello_p = getelementptr [6 x i32], [6 x i32]* @hello, i64 0, i64 0
%hello_l = call i64 @wcslen(i32* %hello_p)		%hello_l = call i64 @wcslen(i32* %hello_p)
%ne_hello = icmp ne i64 %hello_l, 0		%ne_hello = icmp ne i64 %hello_l, 0
ret i1 %ne_hello		ret i1 %ne_hello
}		}

define i1 @test_simplify8(i32* %str_p) {		define i1 @test_simplify8(i32* %str_p) {
; CHECK-LABEL: @test_simplify8(		; CHECK-LABEL: @test_simplify8(
; CHECK-NEXT: [[STRLENFIRST:%.*]] = load i32, i32* [[STR_P:%.*]], align 4		; CHECK-NEXT: [[CHAR0:%.*]] = load i32, i32* [[STR_P:%.*]], align 4
; CHECK-NEXT: [[NE_NULL:%.*]] = icmp ne i32 [[STRLENFIRST]], 0		; CHECK-NEXT: [[NE_NULL:%.*]] = icmp ne i32 [[CHAR0]], 0
; CHECK-NEXT: ret i1 [[NE_NULL]]		; CHECK-NEXT: ret i1 [[NE_NULL]]
;		;
%str_l = call i64 @wcslen(i32* %str_p)		%str_l = call i64 @wcslen(i32* %str_p)
%ne_null = icmp ne i64 %str_l, 0		%ne_null = icmp ne i64 %str_l, 0
ret i1 %ne_null		ret i1 %ne_null
}		}

define i64 @test_simplify9(i1 %x) {		define i64 @test_simplify9(i1 %x) {
[... 102 lines not shown ...]
;
%and = and i32 %x, 15		%and = and i32 %x, 15
%hello_p = getelementptr inbounds [13 x i32], [13 x i32]* @null_hello_mid, i32 0, i32 %and		%hello_p = getelementptr inbounds [13 x i32], [13 x i32]* @null_hello_mid, i32 0, i32 %and
%hello_l = call i64 @wcslen(i32* %hello_p)		%hello_l = call i64 @wcslen(i32* %hello_p)
ret i64 %hello_l		ret i64 %hello_l
}		}

@str16 = constant [1 x i16] [i16 0]		@str16 = constant [1 x i16] [i16 0]

define i64 @test_no_simplify4() {		; Fold the invalid call to zero. This is safer than letting the undefined
; CHECK-LABEL: @test_no_simplify4(		; library call take place even though it prevents sanitizers from detecting
; CHECK-NEXT: [[L:%.*]] = call i64 @wcslen(i32* bitcast ([1 x i16]* @str16 to i32*))		; it.
; CHECK-NEXT: ret i64 [[L]]
		define i64 @test_simplify12() {
		; CHECK-LABEL: @test_simplify12(
		; CHECK-NEXT: ret i64 0
;		;
%l = call i64 @wcslen(i32* bitcast ([1 x i16]* @str16 to i32*))		%l = call i64 @wcslen(i32* bitcast ([1 x i16]* @str16 to i32*))
ret i64 %l		ret i64 %l
}		}

attributes #0 = { null_pointer_is_valid }		attributes #0 = { null_pointer_is_valid }