Page MenuHomePhabricator

Add 'strlen' formation to LoopIdiomRecognize
Needs ReviewPublic

Authored by nlewycky on Aug 18 2015, 11:56 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

This recognizer finds loops of the form:

for (const char *ptr = p; *ptr; ++ptr, ++len)
  /*empty*/ ;

and converts them into calls to strlen().

Diff Detail

Repository
rL LLVM

Event Timeline

nlewycky updated this revision to Diff 32510.Aug 18 2015, 11:56 PM
nlewycky retitled this revision from to Add 'strlen' formation to LoopIdiomRecognize.
nlewycky updated this object.
nlewycky set the repository for this revision to rL LLVM.
nlewycky added a subscriber: llvm-commits.
Prazek added a subscriber: Prazek.Aug 19 2015, 12:06 AM

I don't think I have enough power to say LGTM, but looks legit to me.

lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1105

I think it would look better if You would put this in the same line. Now it somehow look like
for (... );

IndVar;
IndVar = ...
hfinkel added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1081

Do you also need to skip debug intrinsics, etc.?

1119

Can you please add a test case for this variant with vectors?

sanjoy added a subscriber: sanjoy.Aug 19 2015, 12:34 AM
sanjoy added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
64

I'd open the namespace in just the one function that uses it.

205

Why this->?

1043

Why not declare Ptr as a LoadInst *? Then you won't need this second cast<Instruction>(.

1045

Why not isIntegerTy(8)?

1047

auto * might be cleaner.

1052

auto *?

1057

Why not use a range for here?

1072

How can Start ever not be loop invariant?

My reasoning went like this: it'll have to dominate the entry edge, and therefore the preheader, but if it is defined inside the only BB, then the only BB == the header dominates the preheader, making it part of the loop.

1083

What if these have uses outside the loop? Also why these specifically? Why not also handle, say, mul?

1087

Why not use a range for loop?

1089

Why not

if (auto *OpI = dyn_cast<..)
  if (OpI->getParent() != ...
    ...

Alternatively, why not use Loop::hasLoopInvariantOperands ?

sanjoy added inline comments.Aug 19 2015, 12:40 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
205

Never mind this one -- I missed the LPM on both sides initially.

sanjoy added inline comments.Aug 19 2015, 12:47 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1083

What if these have uses outside the loop?

I'll redact this bit -- I had initially missed what you were doing with the splicing logic below.

nlewycky marked 7 inline comments as done.Aug 19 2015, 2:46 PM
nlewycky added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1043

Because then it can't be used in m_Value(Load).

Do you think I should I add an m_Load instead?

1045

'i8*' not 'i8'. I could do "cast<PointerType>(Ptr->getType())->getElementType()->isIntegerTy(8)" if you prefer?

1057

I wanted to exclude the terminator.

I could add a simple "if (isa<TerminatorInst>(Inst)) continue;".

I don't think we can use "E = llvm::prior(LoopBody->end())" because it may walk off the beginning of the block. At this point, we haven't yet shown that LoopCond, PtrGEP or PtrPHI are actually members instructions inside LoopBody.

1072

I concur. This is now an assertion.

1081

Also, I'm not updating DebugLoc anywhere in this code. If there's debug intrinsics they'll move along with the splice, which is probably close enough to right, as much as any tracking for optimized debug info.

1083

Gave this some thought, decided we can safely extend it.

1087

Not applicable.

sanjoy added inline comments.Aug 19 2015, 3:00 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1043

Do you think I should I add an m_Load instead?

I think that's a good idea, it will be generally useful.

1045

I think what you have is fine. Sorry for misreading the code originally.

1057

I could add a simple "if (isa<TerminatorInst>(Inst)) continue;".

I mildly prefer that.

1087

I may be missing something here, but can't you do for (Value *Op : Inst->operands())?

alexr added a subscriber: alexr.Aug 20 2015, 1:31 PM

This makes me very nervous. We have fielded a lot of user complaints about canonicalizing loops into memcpy calls. strlen feels like a more dubious choice.

My past experience writing SIMD versions of string functions tells me that the typical string length that strlen is tuned for may be very different from the amount of data an open-coded loop is intended to handle.

I would want to see very tuned lowering code to go with this. I'd also very much like to see the lowering code pay attention to target-specific cost information such as the possible cost to cross a dynamic library boundary when weighing the lowering of a call vs expanding inline.