This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Teach the move free before null test opti how to deal with noop casts
ClosedPublic

Authored by qcolombet on Oct 16 2018, 9:56 PM.

Details

Summary

InstCombine features an optimization that essentially replaces:
if (a)

free(a)

into:
free(a)

Right now, this optimization is gated by the minsize attribute and therefore
we only perform it if we can prove that we are going to be able to eliminate
the branch and the destination block.

However when casts are involved the optimization would fail to apply, because
the optimization was not smart enough to realize that it is possible to also
move the casts away from the destination block and that is harmless to the
performance since they are just noops.
E.g.,
foo(int *a)
if (a)

free((char*)a)

Wouldn't be optimized by instcombine, because

  • We would refuse to hoist the bitcast i32* %a to i8 in the source block
  • We would fail to see that bitcast i32* %a to i8 and %a are the same value.

This patch fixes both these problems:

  • It teaches the pattern matching of the comparison how to look

through casts.

  • It checks that whether the additional instruction in the destination block

can be hoisted and are harmless performance-wise.

  • It hoists all the code of the destination block in the source block.

Diff Detail

Repository
rL LLVM

Event Timeline

qcolombet created this revision.Oct 16 2018, 9:56 PM
lebedev.ri added inline comments.
lib/Transforms/InstCombine/InstructionCombining.cpp
2363–2367 ↗(On Diff #169948)
if (!match(TI, m_Br(m_ICmp(Pred,
                           m_CombineOr(m_Specific(Op),
                                       m_Specific(Op->stripPointerCasts())),
                           m_Zero()), TrueBB, FalseBB)))

This is adding to a transform that seems out-of-place for instcombine:

  1. It's a hoist transform that relies on SimplifyCFG to do the real optimization. Does the whole thing belong there (or CGP) instead?
  2. It's the only transform in instcombine that is gated by 'minsize', so it's a limited optimization rather than a canonicalization?

Hi Sanjay,

This is adding to a transform that seems out-of-place for instcombine:
It's a hoist transform that relies on SimplifyCFG to do the real optimization. Does the whole thing belong there (or CGP) instead?

Interestingly I think we had the exact same conversation when it was first added back in 2013, but I cannot find any trace of this discussion... My mind must play me tricks :).
I am happy to move the whole thing in simplifyCFG. That said, strictly speaking that combine doesn't modify the CFG.
CGP seems too late to me.

It's the only transform in instcombine that is gated by 'minsize', so it's a limited optimization rather than a canonicalization?

That is correct, though, I'd argue that a lot of code in instcombine are optimizations too (as opposed to pure cannonicalization.)

How would you like to proceed?

Cheers,
-Quentin

lib/Transforms/InstCombine/InstructionCombining.cpp
2363–2367 ↗(On Diff #169948)

Ah great! I was looking for something like m_CombineOr, but missed it.

Interestingly I think we had the exact same conversation when it was first added back in 2013, but I cannot find any trace of this discussion... My mind must play me tricks :).

I think the original patch was before I got here, so I probably didn't see it...but I did ask about that patch in D45343, so that might be what you are remembering. :)

I am happy to move the whole thing in simplifyCFG. That said, strictly speaking that combine doesn't modify the CFG.

If it doesn't look too hard, moving it over to SimplifyCFG would be great. That way, we can be sure that the empty block removal always happens without interference from some other pass.

CGP seems too late to me.

It's the only transform in instcombine that is gated by 'minsize', so it's a limited optimization rather than a canonicalization?

That is correct, though, I'd argue that a lot of code in instcombine are optimizations too (as opposed to pure cannonicalization.)

How would you like to proceed?

Cheers,
-Quentin

If it doesn't look too hard, moving it over to SimplifyCFG would be great. That way, we can be sure that the empty block removal always happens without interference from some other pass.

Sounds good!

Hi Sanjay,

I had a quick look and it doesn't seem to be a good fit for simplifyCFG.
In particular, simplifyCFG only inspects terminator instructions and it feels out of place to add a scan through the entire list of instructions to find a call to free.

Let me know if you want me to pursue in that direction or if you have another idea.

Cheers,
-Quentin

qcolombet updated this revision to Diff 171526.Oct 29 2018, 9:49 AM
  • Use m_CombineOr instead of repeating the pattern to match
qcolombet marked an inline comment as done.Oct 29 2018, 9:49 AM
spatel accepted this revision.Oct 29 2018, 1:34 PM

I had a quick look and it doesn't seem to be a good fit for simplifyCFG.
In particular, simplifyCFG only inspects terminator instructions and it feels out of place to add a scan through the entire list of instructions to find a call to free.

Let me know if you want me to pursue in that direction or if you have another idea.

I was afraid this was going to be an odd fit no matter what we tried to do with it, so no I don't have any better ideas short of making a standalone pass, and I'm not sure if there are any other transforms like this to justify that effort.

Given that this is a small extension of the existing transform, I'll say LGTM.

This revision is now accepted and ready to land.Oct 29 2018, 1:34 PM