This is an archive of the discontinued LLVM Phabricator instance.

[PoC][RISCV] Use scalar register for fixed-length vectors
AbandonedPublic

Authored by wangpc on Jul 18 2023, 12:14 AM.

Details

Summary

This lets us vectorize some loops with small element sizes.

Small vectors such as v4i8, v8i8, and v4i16 fit entirely in a single
scalar register.

Loads and stores can now be vectorized, but there are no vector operations
on scalar registers (the RVP extension is limited too).

I don't know whether this is the right way to go, and no other target
has done anything like this. The changes seem intrusive, and
there is a lot of work to do if we want to go further.

In fact, the example should be optimized into a memcpy call.
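
The example from the patch is not reproduced in this archive, so the loop below is only a hypothetical stand-in for the kind of code under discussion. It is a minimal sketch, assuming non-overlapping buffers and a length that is a multiple of 4; the function names are made up for illustration. It shows a byte-wise copy, the same copy with four i8 elements moved as one 32-bit scalar (roughly what treating v4i8 as a scalar-register type would allow), and the plain memcpy call that such a loop arguably should become.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar form: one byte load and one byte store per iteration. */
void copy_bytes(uint8_t *restrict dst, const uint8_t *restrict src, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* "v4i8 in a scalar register" form (illustrative): four i8 elements
 * travel together as one 32-bit value. The fixed-size memcpy calls are
 * the portable way to express a possibly unaligned 32-bit load/store. */
void copy_bytes_v4i8(uint8_t *restrict dst, const uint8_t *restrict src, size_t n) {
    assert(n % 4 == 0); /* assumption of this sketch: no remainder loop */
    for (size_t i = 0; i < n; i += 4) {
        uint32_t v;
        memcpy(&v, src + i, sizeof v);  /* one 32-bit load  */
        memcpy(dst + i, &v, sizeof v);  /* one 32-bit store */
    }
}

/* The form the loop could be recognized as instead: a single memcpy call. */
void copy_bytes_memcpy(uint8_t *restrict dst, const uint8_t *restrict src, size_t n) {
    memcpy(dst, src, n);
}
```

All three functions produce the same result for valid inputs. The second form contains only loads and stores, which matches the limitation noted above: there is nothing to compute on the packed elements without lane-wise scalar instructions.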

Related discussion:

Diff Detail

Event Timeline

wangpc created this revision. Jul 18 2023, 12:14 AM
Herald added a project: Restricted Project. Jul 18 2023, 12:14 AM
wangpc requested review of this revision. Jul 18 2023, 12:14 AM
wangpc edited the summary of this revision. Jul 18 2023, 12:16 AM
wangpc edited the summary of this revision. Jul 18 2023, 12:24 AM

Does this optimization apply only to load/store sequences that implement memcpy or memset, or something similar?

If so, enabling vectorization in the middle end and making these vector types legal in the backend may be too aggressive and complicated, because no operations are legal on such types. Introducing vectorization and backend support for this case can work, but it affects too much, such as cost-model evaluation and other optimizations, because these types are then treated as legal. Recognizing this pattern as memcpy would make more sense.

Yes, it's limited, and we know that RVP is also not suitable for vectorization. I just posted this PoC here to gather some feedback. :-)
But I think this kind of vectorization is feasible for the XCVsimd extension (D153721).

wangpc abandoned this revision. Jul 20 2023, 12:59 AM