This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Improve handling of empty VPT blocks in tail predicated loops
ClosedPublic

Authored by dmgreen on Dec 1 2020, 12:59 AM.

Details

Summary

A VPT block that contains only VPST;VCTP or VPT;VCTP becomes invalid once the VCTP is removed. This patch fixes the first case by removing the now-empty block, and bails out for the second, as we have no simple way of converting a VPT to a VCMP.
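The two cases can be illustrated with a small standalone model. This is only a sketch and not the code from the patch: the `Op` enum, the `Outcome` enum and `removeVCTP` are hypothetical names invented here, and a block is modelled as a plain vector whose first element is the VPST or VPT that opens it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for MVE opcodes; these are not the real ARM:: enumerators.
enum class Op { VPST, VPT, VCTP, VADD };

// Hypothetical result type describing what happened to the block.
enum class Outcome { BlockKept, EmptyBlockRemoved, CannotConvert };

// Remove every VCTP from a modelled VPT block and handle a block left empty.
// Block.front() is the VPST or VPT that opens the block.
Outcome removeVCTP(std::vector<Op> &Block) {
  assert(!Block.empty() &&
         (Block.front() == Op::VPST || Block.front() == Op::VPT));

  // Would the block body be empty once the VCTPs are gone?
  bool OnlyVCTPs = std::all_of(Block.begin() + 1, Block.end(),
                               [](Op O) { return O == Op::VCTP; });

  // VPT;VCTP: the VPT also performs a compare, so it cannot simply be
  // dropped. Bail out and leave the block untouched.
  if (OnlyVCTPs && Block.front() == Op::VPT)
    return Outcome::CannotConvert;

  // Erase the VCTPs from the body of the block.
  Block.erase(std::remove(Block.begin() + 1, Block.end(), Op::VCTP),
              Block.end());

  // VPST;VCTP: only the VPST is left, so remove the whole block.
  if (Block.size() == 1) {
    Block.clear();
    return Outcome::EmptyBlockRemoved;
  }
  return Outcome::BlockKept;
}
```

In this model, a block of `{VPST, VCTP}` is emptied, a block of `{VPT, VCTP}` is reported as not convertible and left as it was, and a block such as `{VPST, VCTP, VADD}` simply loses its VCTP.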

Event Timeline

dmgreen created this revision. Dec 1 2020, 12:59 AM
dmgreen requested review of this revision. Dec 1 2020, 12:59 AM
SjoerdMeijer added inline comments. Dec 1 2020, 2:53 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
316

Perhaps I am confused, but should this:

Insts.front()->getOpcode() != ARM::MVE_VPST

not be ==?

dmgreen added inline comments. Dec 1 2020, 3:02 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
316

The block should start with either a VPST or a VPT. This is checking that it's a VPT, by proxy of it not being a VPST. (There are many VPT opcodes, so that is more difficult to check).

samtebbs added inline comments. Dec 1 2020, 5:36 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
316

There is an isVPTOpcode function that you could use to make the check more precise.

dmgreen updated this revision to Diff 309735. Dec 5 2020, 5:51 AM

I've added an assert that checks isVPTOpcode (which I think may already be checked but it's not a bad check to add again if we are relying on it here).
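For readers following the thread, the shape of the check being discussed is roughly as below. This is a standalone sketch, not the code in the diff: the `Op` enum and `blockStartsWithVPT` are hypothetical, and the `isVPTOpcode` here is a local stand-in whose signature is assumed, since the thread only mentions the helper by name.

```cpp
#include <cassert>

// Hypothetical opcode set. The real MVE instruction set has many more VPT
// variants, which is why testing for "not VPST" is the simpler check.
enum class Op { VPST, VPTi8, VPTs32, VPTf32, VCTP32, VADD };

// Local stand-in for the isVPTOpcode helper named in the review; the
// signature is an assumption made for this sketch.
bool isVPTOpcode(Op O) {
  return O == Op::VPTi8 || O == Op::VPTs32 || O == Op::VPTf32;
}

// Returns true when the block is opened by a VPT rather than a VPST.
// The decision is made by proxy (the opcode is not VPST), and the assert
// documents the assumption that the only alternative is a VPT opcode.
bool blockStartsWithVPT(Op First) {
  if (First == Op::VPST)
    return false;
  assert(isVPTOpcode(First) && "VPT block must start with a VPST or a VPT");
  return true;
}
```

The assert costs nothing in release builds and makes the reliance on "not VPST implies VPT" explicit at the point where it is used.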

samtebbs accepted this revision. Dec 14 2020, 1:39 AM

Looks good to me!

This revision is now accepted and ready to land. Dec 14 2020, 1:39 AM