Size queries for MVTs, split out from D53137.
Contains a fix for FindMemType to avoid using scalable vector types to contain non-scalable types.
Nit-picky: would it make sense to add an explicit !scalable assert to all these functions?
I see that this particular function is guarded by the asserts in getSizeInBits() and getExtendedSizeInBits(), but there's really no guarantee that future modifications will preserve this behavior.
To be clear, I don't feel strongly about this. Just thinking aloud...
I think that was a copy-paste from elsewhere. Will revert.
LT seems to be used with filtered min/max functions below and will only operate on either only vector or only scalar MVTs, so never compares between them.
LE is then explicitly used to remove MVTs from Typesets that may have mixed scalar and vector MVTs, so needs to avoid comparing them.
I'll try reworking them with std::tie.
Should this be more explicit here?
const TypeSize &Size = Other.second.getSizeInBits(); if (Size.isScalable()) ScalableBytes += Other.first * Size.getScalableSize() / 8; else Bytes += Other.first * Size.getFixedSize() /8;
(Note that this also assumes adding a getScalableSize() to TypeSize)
Is this a good argument to add an interface for isKnownGreater and isKnownGreaterOrEqual (vis-a-vis isKnownSmaller/IsKnownSmallerOrEqual) to TypeSize as suggested in https://reviews.llvm.org/D53137#1621351 ?
I also changed the SelectionDAGBuilder code for masked loads/stores/gathers/scatters to use the known min size when creating a MachineMemoryOperand, since MMO isn't aware of scalable types yet. I've left TODOs as reminders.
After thinking about it for a bit, I don't think so.
isKnownSmaller would return false for v8f32 vs. nxv4f32, and we would then proceed with a direct comparison of scalable and non-scalable type sizes, which will assert. Using that as the only comparison would also ignore the different ordering in this function, where a v4f32 would be considered less than an i64 since the scalar size is compared first.
It's possible that a different canonical ordering would work as well, but I'm not sure how many places in CodeGenDAGPatterns assume things about the order. Using the std::tie comparisons changes things slightly as it is, as noted below.
I've reworked them to use std::tie, which meant I had to make the size query methods return const TypeSize. Everything still seems to work (make check-all, plus an LNT run).
I don't think the std::tie comparisons are strictly equivalent to what was there before -- if A.getScalarSizeInBits() is greater than B.getScalarSizeInBits() in LT then I think the std::tie comparison would return false immediately instead of progressing to the second comparison as the previous code would.
Regarding the change to return const, I'm not convinced that's a good idea (we actually have a clang-tidy check that warns about that). I think it would be better to either name those temporaries or use std::make_tuple instead of std::tie (whichever you prefer).
Could you be a bit more specific about why you need to cast to signed? It was unsigned in the original code too.
I was actually thinking of including isScalableVector and isVector below in the tie (since bool is an integral type and compares the way you'd expect).
This code was buggy before, but didn't crash as getStoreSizeInBits returned a 32 bit unsigned, vs. the uint64_t we now return as an implicit cast.
For the failing case, LDMemType's size was larger than STMemType's size, so wrapped to a very large uint64_t value. That was then divided by 8 (which unset the top bit due to being an unsigned divide), implicitly cast to an int64_t so that Offset (which was zero) could be subtracted.
The next bit of code multiplies Offset by 8, which wraps for a signed value (which UBSan caught).
It didn't happen for the 32 bit result since the top bits were clear at the time of the implicit cast to int64_t.
I suspect the bug would have been found quickly once cases with a non-zero Offset were implemented, but the code below currently bails out in that case.
Yeah, I thought of that too, but ran into problems when trying to make it const (maybe because bool is a primitive?).
I'll try with make_tuple and see what happens.
I don't see anything else wrong with this. LGTM if you rename the LE predicate.
Ok, this looks a bit better, but I'm still not happy about the state of this code. I'm not going to hold your patch hostage because of this, since it was a problem of the existing code, but LE is a really really misleading name for this predicate. It suggests that it's defining a partial order, which it isn't. If A is a vector and B is a scalar (or the other way around), we get false for both A <= B and B <= A. That never happens for a mathematical A <= B (since that implies that B < A, which means B <= A should return true). OTOH, fixing it to be a proper partial order would make us remove too much. I think the best solution here would be to use a different name for the predicate - SameKindLE maybe?