This is an archive of the discontinued LLVM Phabricator instance.

[SVE] MVT scalable size queries

Authored by huntergr on Aug 28 2019, 4:15 AM.



Size queries for MVTs, split out from D53137.

Contains a fix for FindMemType to avoid using scalable vector types to contain non-scalable types.

Diff Detail

Event Timeline

huntergr created this revision.Aug 28 2019, 4:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 28 2019, 4:15 AM
cameron.mcinally accepted this revision.Aug 28 2019, 7:41 AM



Nit-picky: would it make sense to add an explicit !scalable assert to all these functions?

I see that this particular function is guarded by the asserts in getSizeInBits() and getExtendedSizeInBits(), but there's really no guarantee that future modifications will preserve this behavior.

To be clear, I don't feel strongly about this. Just thinking aloud...

This revision is now accepted and ready to land.Aug 28 2019, 7:41 AM


Thanks for the review; there's still D53137 and D66339 as prerequisites so I won't commit this yet.


I can certainly add asserts more widely to make it obvious instead of relying on lower functions to assert and just commenting on it.

huntergr updated this revision to Diff 226410.Oct 25 2019, 4:50 AM
  • Refactored to match the same approach used in D53137 -- TypeSize returned for all size queries instead of introducing separate interfaces and relying on implicit conversion to be (mostly) compatible with existing code.
  • Added 'isByteSized()' interface to TypeSize.
  • Minor bugfix to DAGCombiner::ForwardStoreValueToDirectLoad, where implicit conversions allowed an unsigned number to wrap and then be converted to a larger signed integer; with TypeSize using uint64_t instead, this showed up during a make check-all run with ubsan active. Can't write a test for that one since the transform bailed out when Offset was non-zero.
rovka added a subscriber: rovka.Oct 29 2019, 3:31 AM
rovka added inline comments.

That's a weird change...


Why does LE care about isVector and LT doesn't?

In any case, I think both functions would benefit from being rewritten using std::tie and the corresponding operators.

huntergr added inline comments.Nov 5 2019, 9:39 AM

I think that was a copy-paste from elsewhere. Will revert.


LT seems to be used with filtered min/max functions below and will only operate on either only vector or only scalar MVTs, so never compares between them.

LE is then explicitly used to remove MVTs from Typesets that may have mixed scalar and vector MVTs, so needs to avoid comparing them.

I'll try reworking them with std::tie.

sdesmalen added inline comments.Nov 6 2019, 3:31 AM

Should this be more explicit here?

const TypeSize &Size = Other.second.getSizeInBits();
if (Size.isScalable())
  ScalableBytes += Other.first * Size.getScalableSize() / 8;
  Bytes += Other.first * Size.getFixedSize() /8;

(Note that this also assumes adding a getScalableSize() to TypeSize)


Is this a good argument to add an interface for isKnownGreater and isKnownGreaterOrEqual (vis-a-vis isKnownSmaller/IsKnownSmallerOrEqual) to TypeSize as suggested in ?

huntergr updated this revision to Diff 228846.Nov 12 2019, 3:35 AM
  • Switched the lambdas in CodeGenDAGPatterns to use std::tie for ordering. This needed the TypeSize objects to be constant (I get a "non-const lvalue reference" error otherwise), so I changed all query methods to return const objects.
  • Changed stack offset to deliberately use a TypeSize. This hit another problem with implicit promotion and signed v. unsigned casting, so I've added explicit casts.
huntergr marked 3 inline comments as done.Nov 12 2019, 3:58 AM

I also changed the SelectionDAGBuilder code for masked loads/stores/gathers/scatters to use the known min size when creating a MachineMemoryOperand, since MMO isn't aware of scalable types yet. I've left TODOs as reminders.


After thinking about it for a bit, I don't think so.

isKnownSmaller would return false for v8f32 vs. nxv4f32, and we would then proceed with a direct comparison of scalable and non-scalable type sizes, which will assert. Using that as the only comparison would also ignore the different ordering in this function, where a v4f32 would be considered less than an i64 since the scalar size is compared first.

It's possible that a different canonical ordering would work as well, but I'm not sure how many places in CodeGenDAGPatterns assume things about the order. Using the std::tie comparisons changes things slightly as it is, as noted below.


I've reworked them to use std::tie, which meant I had to make the size query methods return const TypeSize. Everything still seems to work (make check-all, plus an LNT run).

I don't think the std::tie comparisons are strictly equivalent to what was there before -- if A.getScalarSizeInBits() is greater than B.getScalarSizeInBits() in LT then I think the std::tie comparison would return false immediately instead of progressing to the second comparison as the previous code would.

huntergr added inline comments.Nov 12 2019, 4:01 AM

...actually, I think it works out the same anyway, since you would only get a true result from LT in the second case if the scalar sizes were equal.

rovka added a comment.Nov 13 2019, 3:47 AM

Regarding the change to return const, I'm not convinced that's a good idea (we actually have a clang-tidy check that warns about that). I think it would be better to either name those temporaries or use std::make_tuple instead of std::tie (whichever you prefer).


Could you be a bit more specific about why you need to cast to signed? It was unsigned in the original code too.


I was actually thinking of including isScalableVector and isVector below in the tie (since bool is an integral type and compares the way you'd expect).

huntergr added inline comments.Nov 13 2019, 5:01 AM

This code was buggy before, but didn't crash as getStoreSizeInBits returned a 32 bit unsigned, vs. the uint64_t we now return as an implicit cast.

For the failing case, LDMemType's size was larger than STMemType's size, so wrapped to a very large uint64_t value. That was then divided by 8 (which unset the top bit due to being an unsigned divide), implicitly cast to an int64_t so that Offset (which was zero) could be subtracted.

The next bit of code multiplies Offset by 8, which wraps for a signed value (which UBSan caught).

It didn't happen for the 32 bit result since the top bits were clear at the time of the implicit cast to int64_t.

I suspect the bug would have been found quickly once cases with a non-zero Offset were implemented, but the code below currently bails out in that case.


Yeah, I thought of that too, but ran into problems when trying to make it const (maybe because bool is a primitive?).

I'll try with make_tuple and see what happens.

huntergr updated this revision to Diff 229281.Nov 14 2019, 5:18 AM
  • Removed const qualifier from queries
  • Swapped to std::make_tuple for the comparisons in CodeGenDAGPatterns.cpp
rovka accepted this revision.Nov 15 2019, 5:12 AM

I don't see anything else wrong with this. LGTM if you rename the LE predicate.


Ok, this looks a bit better, but I'm still not happy about the state of this code. I'm not going to hold your patch hostage because of this, since it was a problem of the existing code, but LE is a really really misleading name for this predicate. It suggests that it's defining a partial order, which it isn't. If A is a vector and B is a scalar (or the other way around), we get false for both A <= B and B <= A. That never happens for a mathematical A <= B (since that implies that B < A, which means B <= A should return true). OTOH, fixing it to be a proper partial order would make us remove too much. I think the best solution here would be to use a different name for the predicate - SameKindLE maybe?

This revision was automatically updated to reflect the committed changes.