This fixes https://bugs.llvm.org/show_bug.cgi?id=41047.
IRBuilderBase::CreateMemCpy and similar functions are called in memcpy optimization(lib/Transforms/Scalar/MemCpyOptimizer.cpp). I think it is safe to enable tail call.
Details
Diff Detail
- Repository
- rL LLVM
- Build Status
Buildable 29338 Build 29337: arc lint + arc unit
Event Timeline
A "tail" call is a call which does not refer to any memory allocated in the current stack frame. You aren't actually proving that invariant anywhere.
I'm sorry, I misunderstood "tail". I will revert IRBuilder.cpp changes.
I have two ideas to enable tail call.
- Add TailCallElimPass after MemCpyOptPass.
- In the end of MemCpyOptPass, use lib/Transforms/Scalar/TailRecursionElimination.cpp/markTails.
I think 1 is simpler and better.
Is there any problem?
Rerunning the analysis at some point after memcpyopt definitely makes sense. I'm not sure where exactly you'd put it, though... I guess after memcpyopt, nothing is really likely to create new tail calls, so just sticking it after memcpyopt might be fine. Maybe interesting to gather statistics for a few different placements, though.
Actually performing tail recursion elimination after loop optimizations is probably a bad idea, so we might want to split the pass.
I guess after memcpyopt, nothing is really likely to create new tail calls, so just sticking it after memcpyopt might be fine.
I think so too.
As you said, I'd like to split TailCallElimPass into two passes, "TailCallMarkingPass" and "TailCallElimPass".
TailCallMarkingPass is only for marking function calls as "tail".
New TailCallElimPass does what old TailCallElimPass did except for marking the calls.
I will make a new patch for these.
Couldn’t passes like FunctionAttrs / Attributor infer “tail call mark” ?
Not sure how many times these passes go, indeed.
Sure. I already have (down-stream) code for other instruction attributes, adding more is what I wanted to do anyway.
I'll look into and let you know!
For those who missed it: "Attributor" is a proposed replacement of function attribute detection, see D59918 and its child patches.