Currently, tailcallelim does two things:
(1) marks call instructions as "tail" where possible
(2) eliminates tail calls and generates loops in their place
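As a rough source-level illustration of step (2): the pass itself works on LLVM IR, not on C++ source, so the sketch below only shows the shape of the rewrite. The function names sumToRecursive and sumToLoop are hypothetical and do not come from the patch.

```cpp
#include <cstdint>

// Before: the recursive call is the last thing the function does,
// so its result is returned unchanged (a call in tail position).
std::uint64_t sumToRecursive(std::uint64_t n, std::uint64_t acc) {
  if (n == 0)
    return acc;
  return sumToRecursive(n - 1, acc + n);
}

// After: the same computation with the tail call replaced by a loop.
// The arguments of the eliminated call become updates to the parameters.
std::uint64_t sumToLoop(std::uint64_t n, std::uint64_t acc) {
  while (n != 0) {
    acc += n;
    n -= 1;
  }
  return acc;
}
```

Both functions compute the same result; the loop form simply avoids growing the call stack.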
Some passes, such as memcpyopt, add call instructions, but these calls are not marked as "tail" [1]. It is therefore better to run (1) after such passes.
I split tailcallelim into two passes, tailcallmark and tailcallelim:
tailcallmark - only marks function calls as "tail".
tailcallelim - does what the old tailcallelim did, except for marking the calls.
Descriptions of a function's behavior usually go above the function, in a comment starting with three slashes (///). See canMoveAboveCall below for an example.
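For illustration, a minimal sketch of that comment placement. The function isEven is a hypothetical stand-in, not code from this patch; canMoveAboveCall in the source file remains the real example to follow.

```cpp
/// Return true if \p Value is even.
///
/// The description sits directly above the function and uses three
/// slashes so that documentation tools pick it up.
bool isEven(int Value) {
  return Value % 2 == 0;
}
```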