This is an archive of the discontinued LLVM Phabricator instance.

[ARM] MVE: isLegalMaskedLoad
AbandonedPublic

Authored by SjoerdMeijer on Aug 30 2019, 7:16 AM.

Details

Summary

This tweaks TTI hook isLegalMaskedLoadr, which has 2 use cases: it is queried
for vector types, which is what the current implementation supports, but the
vectorizer also queries it supplying scalar types. The latter wasn't taken into
account, and scalar types get rejected as legal masked loads because their size
!= 128 bits.

On MVE, with most instructions being VPT Block compatible, you could argue that
everything can be legally masked (modulo the exceptions), but this new
implementation rejects double-word ints and floats, which seems to be
reasonable for now. This seems to improve codegen for some existing cases, and
I will follow this up with loop vectorization tests that also query this.

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Aug 30 2019, 7:16 AM

It's a shame that the isLegalMaskedLoad works like that! Being given integer types. MVE only has some expands and we don't handle all of them.

Unfortunately I don't think it's quite this simple. We still need to make sure that the vector types that we don't know how to handle are excluded correctly. We don't handle widening loads yet, for example (although we could and it would fix the testcase here I think), That's what the vector width line was trying to guard against, and the tests don't look correct.

Instead, can we just put the VecWidth check behind a check of isVectorTy? So if we are called from the vectorizer, we check the scalar type size and if we are called later from the legalise helper we check the full type. We may just be opening ourselves up to a lot more bugs/inefficiencies, we will have to make sure we test this well before we enable it.

llvm/test/CodeGen/Thumb2/mve-masked-ldst.ll
28

This seems to be doing a 4xi32 cmp, followed by a 16xi8 load using that predicate, and then sign extending the wrong registers in place. It should be sign extending the first 4 value, not the 0,4,8,12 values. If I'm reading this correctly.

I'm surprised it's doing this exactly though. You may be finding some other bug here.

I don't think the old code is correct either! I've reverted the masked loads and stores code until we can fix some of the issues.

Hi Dave, thanks for looking and the clarifications!
Would it be possible to address the "scalar use-case" here? That would allow me to continue with writing test-cases for the vectorizer etc. while you look at handling the vector types?

First things first, I think we need to fix the saving/loading of predicates, which isn't going to be pretty but is I think we have to do. I'll put together a patch for that.

Then it's probably best to re-enable masked loads/stores but behind an option. That'll let you go ahead with your testing, without making everything too inefficient or broken. Will that work for you? So long as you are not needing narrowing/widening masked loads/stores, it should look OK.

Yep, that sounds like a plan, thanks!

When you put a new patch together, I assume you will be touching this function isLegalMaskedLoad again? In other words, I can abandon this change?

SjoerdMeijer abandoned this revision.Sep 13 2019, 12:22 AM

Abandoning, this is addressed by D67186