For use in AMDGPU selection DAG.
Technically there's precedent with MVT::v3i32 but where does this end? What happens when someone next asks for v8f32 -> v15f32? How tricky would it be for a custom AMDGPU DAG node to encode the vector width instead and it just uses the next pow2 type?
Feel free to commit. I'll rebase this on top of the new state.
Yes, the precedent is v3i32/v5i32.
I agree its a dangerous slope, but I would be happy if we declared v7 as the hard limit for non-pow2 types.
Declaring a hard limit may also benefit code that has to work with these?
On recent hardware we have features which mean there isn't a need for large contiguous vector types in common cases.
However we have to disable this beyond v5 on some common hardware for stability.
Supporting v6/v7 benefits all our hardware going back many generations (where benefit = reduced register pressure).
I can workaround the lack of v6/v7, but our consensus is that we should just add the MVT types (see D103800).
Where does it end for power-of-two types? It seems like they get added whenever any target has a need for them. I don't see why non-power-of-two types would be fundamentally different in that regard.
This commit seems to have injected a failure for x86-32 codegen in Halide; for one of our tests involving vector reductions, this commit causes a new assertion failure: MachineValueType.h:469: llvm::MVT llvm::MVT::changeVectorElementType(llvm::MVT) const: Assertion VecTy.SimpleTy != MVT::INVALID_SIMPLE_VALUE_TYPE && "Simple vector VT not representable by simple integer vector VT!"' failed.`
I'll be adding more details and adding a repro case as soon as I have a small one; posting this now to see if anyone else has noticed a regression/injection.
Debugging into this a little further, it's unclear to me how EVT::changeVectorElementType() can be expected to work reliably: it assumes that if EltVT.isSimple() == true, then MVT::changeVectorElementType(EltVT) will be legal, but this is clearly not the case, since there aren't equivalent widths of MVTs for all element types. (In this case, v6i32 -> v6i16 isn't possible since v6i16 doesn't exist).
But what really puzzles me is why this never broke before, since even before this change, this wasn't a valid assumption (e.g. v3i32 -> v3i8 would have failed similarly before) -- is there logic upstream of this call that intends to guard against this somehow?