This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Fix getVectorSubVecPointer for scalable subvectors.
ClosedPublic

Authored by sdesmalen on Oct 12 2021, 4:59 AM.

Details

Summary

When inserting a scalable subvector into a scalable vector through
the stack, the index to store to needs to be scaled by vscale.
Before this patch, that didn't yet happen, so it would generate the
wrong offset, thus storing a subvector to the incorrect address
and overwriting the wrong lanes.

For some insert:

nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2)

The offset was not scaled by vscale:

orr     x8, x8, #0x4
st1h    { z0.h }, p0, [sp]
st1h    { z1.d }, p1, [x8]
ld1h    { z0.h }, p0/z, [sp]

And is changed to:

mov x8, sp
st1h { z0.h }, p0, [sp]
st1h { z1.d }, p1, [x8, #1, mul vl]
ld1h { z0.h }, p0/z, [sp]

Diff Detail

Event Timeline

sdesmalen created this revision.Oct 12 2021, 4:59 AM
sdesmalen requested review of this revision.Oct 12 2021, 4:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 12 2021, 4:59 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7874

Can you do something like this:

Index = DAG.getVScale(DL, IdxVT, APInt(IdxVT.getSizeInBits(), Index.getConstantOperandVal(0)));

?
Insead of multiply the index by a scalable vector of size 1?

david-arm added inline comments.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7867–7868

Hi @sdesmalen, it's just a thought, but while you're in this area is it also worth clamping the index for scalable vectors too? The comment above is incorrect, because we do explicitly clamp the index in other places for scalable vectors.

sdesmalen added inline comments.Oct 12 2021, 5:23 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7867–7868

I don't think any clamping is required, because when both the subvector and the vector being inserted into are scalable, we know at compiletime whether the vector index will exceed the size of the input vector.

david-arm added inline comments.Oct 12 2021, 5:28 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7867–7868

Oh ok - I wonder why we do this for fixed-length vectors? I was sort of expecting the problem to be the same for both inserting fixed into fixed and inserting scalable into scalable?

I was specifically worried about what we did in practice for this case:

call <vscale x 8 x half> @llvm.experimental.vector.insert.nxv8f16.nxv2f16(<vscale x 8 x half> %vec, <vscale x 2 x half> %in, i64 10)

because if vscale=1 then we're inserting beyond the end of the vector.

sdesmalen added inline comments.Oct 12 2021, 5:43 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7867–7868

This exists for inserting/extracting fixed from scalable, where we don't know at compile-time if the fixed offset exceeds the scalable vector.

The example you give here:

call <vscale x 8 x half> @llvm.experimental.vector.insert.nxv8f16.nxv2f16(<vscale x 8 x half> %vec, <vscale x 2 x half> %in, i64 10)

is always out of bounds, because vscale*10 > vscale*(8-2) for any vscale.

david-arm added inline comments.Oct 12 2021, 5:58 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7867–7868

OK, I still wasn't sure what you meant here by checking at compile time, but I tried out a test manually and I see that the Verifier emits an error for indices that are too large. So that's fine then!

sdesmalen added inline comments.Oct 12 2021, 7:59 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7874

I'm not sure if you can always know for sure that Index is a constant value so that's why I used an explicit multiply with vscale.

paulwalker-arm added inline comments.Oct 12 2021, 8:47 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7874

To me the rational for not using getConstantOperandAPInt is at odds to the rational for not clamping the index. None of what you say is wrong but there's nothing to say this function has to be used in conjunction with ISD::INSERT_SUBVECTOR, but the function's description does say:

If \p Idx plus the size of \p SubVecVT is out of bounds the returned pointer is unspecified, but the value returned will be such that the entire subvector would be within the vector bounds.

So either this function is only ever used in combination with EXTRACT_SUBVECTOR/INSERT_SUBVECTOR, in which case we can assume Index to be a constant (perhaps even changing the prototype to force this?), or this is a generic helper function and thus Index can be anything and must be clamped to honour the function's description.

sdesmalen updated this revision to Diff 379723.EditedOct 14 2021, 8:04 AM
  • Simplified getVectorSubVecPointer to always clamp dynamically.
  • For the combinations in the table below, this revision changed clampDynamicVectorIndex as follows:
    • Accepts cases: A, B, C, D, E, F (C and F were added)
    • It now also asserts that A and F must have valid indices (i.e. are already validated by their original operation, e.g. EXTRACT/INSERT_SUBVECTOR or EXTRACT/INSERT_VECTOR_ELT.
	  |  index  |  subvector  |  vector
	  |----------------------------------
	A |  const        fixed         fixed
	B |  const        fixed      scalable
	C |  const     scalable      scalable
	D |    var        fixed         fixed
	E |    var        fixed      scalable
	F |    var     scalable      scalable
	
	Note that the following combinations are invalid, because we don't support
	extracting a scalable vector from a fixed-width vector:
	X |  const     scalable         fixed
	X |    var     scalable         fixed
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7822–7824

Given you're not making assumptions as to where Idx is coming from I don't think an assert is safe enough. Sure an asserts build will exit here but a release build could leak/corrupt data, which is a problem the user is trying to prevent, hence calling this function.

Instead I think the assert should be replaced by always clamping Idx (see the MaxIndex calculation below). If the source of Idx is indeed an EXTRACT_SUBVECTOR/INSERT_SUBVECTOR then the invalid index should be asserting as part of getNode rather than getting this far.

sdesmalen updated this revision to Diff 380631.Oct 19 2021, 3:16 AM

Always clamp instead of asserting.

sdesmalen added inline comments.Oct 19 2021, 3:20 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7822–7824

Good point about the generated code possibly doing the wrong thing for non-assert builds. I've changed it to always clamp, and removed the assert. From looking at the uses of getVectorSubVecPointer, it's not really worth passing in some bool to tell whether the value is coming from an insert/extract subvector in order to assert. The nodes where this function is called for insert/extract subvectors, have nodes that must already have been checked for correctness in other places.

paulwalker-arm accepted this revision.Oct 19 2021, 9:52 AM
paulwalker-arm added inline comments.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7869

The assert text is no longer correct.

This revision is now accepted and ready to land.Oct 19 2021, 9:52 AM
sdesmalen marked an inline comment as done.Oct 20 2021, 6:08 AM
sdesmalen added inline comments.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7869

Good spot, I've changed it! Thanks.