Page MenuHomePhabricator

Strlen loop idiom recognition
Needs ReviewPublic

Authored by anjankgk on Sep 28 2020, 6:19 PM.

Details

Summary

(Original patch by musman under D83392. I am planning to commit this in his absence, with his consent.)
This patch adds strlen to LoopIdiomRecognize.cpp. It is the first part of 3 patches:

  1. Strlen loop idiom recognition
  2. Strlen16 recognition and creation of new strlen16 intrinsic
  3. Folding of strlen/strlen16 call if they are only used for zero equality comparison (replace with load of first char)

Example that this recognizes:

unsigned strlen(char *Str) {

char *Src = Str;
if (!Src)
  return 0;
for (; *Src;)
  Src++;
return Src - Str;

}

Diff Detail

Event Timeline

anjankgk created this revision.Sep 28 2020, 6:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 28 2020, 6:19 PM
anjankgk requested review of this revision.Sep 28 2020, 6:19 PM
shawnl requested changes to this revision.Oct 2 2020, 6:57 PM

This pass should recognise the nonnull attribute, as it is analyzing against null.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1614

If the passed pointer has the "nonnull" attribute then this condition need not be present.

This revision now requires changes to proceed.Oct 2 2020, 6:57 PM
shawnl added a comment.Oct 3 2020, 5:33 AM

It is possible DCE will handle the null check with "nonnull" attribute, so add the test first.

Otherwise I like how little code this took compared to earlier versions.

(For future reference, Phabricator has an option "commandeer" a revision, to take control if the original author of a patch leaves a patch incomplete for whatever reason. Obviously use sparingly.)

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1614

The icmp isn't against a pointer; it's against an i8. Or, it should be; I guess the code is missing a check that it's i8, as opposed to some arbitrary integer type.

For comparison, it's undefined behavior to call strlen with a null pointer.

1666

I'm not sure how ResInst is connected to the rest of this transform; why are we transforming the operand of some random PHI node?

1683

"dead instructions", not "death instructions".

llvm/test/Transforms/LoopIdiom/recognize-strlen.ll
1

Can you generate the check lines use update_test_checks.py? It isn't obvious what the whole function looks like as a result of this transform.

shawnl resigned from this revision.Oct 4 2020, 4:15 PM
shawnl added inline comments.
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1614

My bad. I was really working from example code (and the tests also have a null check). This is why in my C code I always use '\x00' for NUL (the character) and not 0 or NULL.

This revision now requires review to proceed.Oct 4 2020, 4:15 PM
anjankgk marked 4 inline comments as done.Oct 7 2020, 3:15 PM
anjankgk added inline comments.
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

I added a comment for clarity -

// Finally, we find the phi node that corresponds to the distance between the
// pointers and replace it's uses by the call to strlen in the transformed
// code.
anjankgk updated this revision to Diff 296817.Oct 7 2020, 3:24 PM
anjankgk marked an inline comment as done.

Check for i8 type and use update_test_checks.py for testfile.

efriedma added inline comments.Oct 7 2020, 3:25 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

The patch doesn't appear to prove that anywhere.

anjankgk added inline comments.Oct 7 2020, 3:31 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

I am getting the sub instruction saved in ResInst and replacing that in line# 1689. Is that something that you prefer to be done in a different way?

efriedma added inline comments.Oct 7 2020, 3:38 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

How do you know it's a sub instruction?

anjankgk added inline comments.Oct 7 2020, 8:05 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

I agree. The current code is actually just looking for the phi node from the predecessor (LoopExitBB) block. I think, I should add a code to ensure that is actually a sub instruction. However, I am afraid that that could again lead to a pattern-matching kind of code which was initially posted.

efriedma added inline comments.Oct 7 2020, 10:36 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

What happens if we ignore ResInst, and rewrite LCSSAPhi instead? That still eliminates the inner loop; later passes should optimize the resulting arithmetic.

anjankgk added inline comments.Oct 15 2020, 2:37 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

I tried a few things, but I am not sure how to do that.
This is the i/p BB:

for.end:                                          ; preds = %for.inc
  %incdec.ptr.lcssa = phi i8* [ %incdec.ptr, %for.inc ]
  %sub.ptr.lhs.cast = ptrtoint i8* %incdec.ptr.lcssa to i64
  %sub.ptr.rhs.cast = ptrtoint i8* %Str to i64
  %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
  br label %cleanup

return type of strlen is i64 (same as ResInt which is %sub.ptr.sub), as compared to the type of lcssa instrn (%incdec.ptr.lcssa) which is a i8*. I am not clear how I can rewrite lcssa with the strlen as suggested.

efriedma added inline comments.Oct 15 2020, 2:46 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1666

%incdec.ptr.lcssa is going to be equivalent to something like getelementptr i8, i8* %Str, i64 %strlen. Speaking in generic terms, replace %Str with the start value of LCSSAEv. Add the two values together with SCEV, and use SCEVExpander to materialize it.

anjankgk marked an inline comment as done.Oct 19 2020, 3:24 PM
anjankgk updated this revision to Diff 299193.Oct 19 2020, 3:27 PM

Rewrite the LCSSAPhi instruction to reflect strlen idiom recognition.

Phabricator doesn't track incremental patches the way you're uploading them; please upload the full diff against master.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1665

LCSSAEv->getStart() is the value of the output of the loop; the input to strlen needs to be the input to the load (LoadEv->getStart()).

More tests would be helpful to catch this sort of issue.

1671

getPointerBase() isn't right; it strips all offsets from the pointer.

I'd like to see more regression test coverage of different variations on the values produced by the loop.

It'll be nice to know if this patch worked fine with a set of regression tests.
results on LLVM testsuite, or clang bootstrap will really help.

Thanks for working on this.