This patch also fixes up a number of cases in DAGCombine and
SelectionDAGBuilder where the size of a scalable vector is used in a
fixed-width context (thus triggering an assertion failure).
For every place you're adding if (scalable) return false;, I'd like to see a comment explaining why we're bailing out.
|1255 ↗||(On Diff #232884)|
While you're here, indentation?
Should we have a helper for this pattern?
This is sort of weird for a method named "SelectAddrModeFrameIndexSVE"; should it not just fail?
Is this necessary?
This seems sort of confusing. "Scale" here is implicitly multiplied by vl, and there's isn't any way for the caller to tell except by checking the opcode.
IsLE? Are we supposed to do something different on big-endian targets?
Should we always use PTRUE_B, even for non-byte element sizes, to encourage CSE?
Should we prefer to use ldr/str where legal, to take advantage of the larger immediate offset?
- Added convenience function MemoryLocation::getSizeOrUnknown
- Removed isLE predicate from predicate store patterns.
- Added comments explaining why we bail out of a function when the type is a scalable vector.
- Addressed other suggestions to clean up code.
|1255 ↗||(On Diff #232884)|
Yes, that would be useful. I've added MemoryLocation::getSizeOrUnknown(const TypeSize &)
Agreed, that should not have been there. Fixed.
No, good catch!
I'm not sure if is an actual issue in practice though. Are you suggesting to make Scale a TypeSize instead of an unsigned?
No, that was a misunderstanding on my part. I've removed this now.
Our experience is that vectorized loops have most predicates CSEd anyway. For a loop that operates on two lanes, often a predicate is already available and there is no need to introduce an extra ptrue_b. If a loop using floats is vectorized with VF=2, we don't want operations on <vscale x 2 x float> to use ptrue.b because that would enable operations on all (vscale x) 4 lanes, which may not be valid.
That would not be endian safe, hence the preference to use ST1 (note that the order is dictated by the AAPCS for when passing the vectors by reference). This case of saving/restoring to/from the stack like this is pretty rare. Normal spills and fills will indeed use the STR/LDR instructions. And normal load/store vector instructions that are not storing to a local will likely use other addressing modes like reg+reg.
"how many bytes are dereferenced".
I'm not sure how you're proving that "N" is a FrameIndexSDNode here?
Yes, that would force the callers to explicitly handle scalable types. It looks like some of them don't.
Okay, that makes sense. For the CSE thing, we could maybe add an optimization pass after isel if it's necessary.
- Code in SelectAddrModeFrameIndexSVE now checks if index is a FrameIndexSDNode (rather than assume it is one).
- Fixed whitespace and updated comment.
Given that this is a change propagates through the rest of the code-base, I will do this in a separate patch.