Page MenuHomePhabricator

[PPC] Lower llvm.ppc.cfence(constant) to no-op.

Authored by timshen on May 24 2017, 1:52 PM.



As the test case shows, if a pass constant-fold the generated
llvm.ppc.cfence(%tmp) to llvm.ppc.cfence(0), cfence should lower to
nothing, instead of crash later in legalization.

Event Timeline

timshen created this revision.May 24 2017, 1:52 PM
kbarton added inline comments.May 25 2017, 2:51 PM

I agree with this, but I'm wondering if it could/should be generalized a bit more.
Are there other cases of SDNodes that we could reach here that are not valid?
Alternatively, is it possible to enumerate the legal SD nodes that could be here and remove everything that is not legal?


The triple we normally use is powerpc64le-unknown-linux-gnu.

timshen updated this revision to Diff 100322.May 25 2017, 4:06 PM

Rebase onto D33573, the crasher fix.

timshen marked an inline comment as done.May 25 2017, 4:14 PM
timshen added inline comments.

I dag a bit more and found the root cause, which is fixed in D33573. This patch, however only makes things faster, not more correct.

The optimization opportunity comes from that, a pass proves at compile time, that the load instruction itself doesn't synchronize with any stores, therefore it's Ok to change the parameter to @llvm.ppc.cfence to something else.

The interesting bits are "at compile time". This basically limits the optimization to ConstantSDNode only - other nodes represent behaviors at runtime, and are unlikely to be proven not synchronized with any stores whatsoever.

I suggest to keep these two cases (constant and others), until we find other pessimizations. Does it make sense?

kbarton accepted this revision.Jun 5 2017, 10:39 AM



Yes, that's makes sense. Thanks for looking into this.

This revision is now accepted and ready to land.Jun 5 2017, 10:39 AM
timshen abandoned this revision.Jun 5 2017, 3:51 PM

The problem is not PPC specific, so I'm going to abandon this patch, in order to not demotivate the correct fix. Someone should do it at IR level, when it actually pessimize their code. :)