Page MenuHomePhabricator

[RISCV] Support simple fractional steps in matching VID sequences

Authored by frasercrmck on Jul 22 2021, 2:24 AM.



This patch extends the optimization of VID-sequence BUILD_VECTORs
introduced in D104921 to include simple fractional steps composed of a
separated integer numerator and denominator.

A notable limitation in this sequence detection is that only sequences
with steps N/1 or 1/D are found, meaning that the step between elements
and the frequency with which it changes is consistent across the whole
sequence. Fractional steps such as 2/3 won't be matched as those would
involve more complex tracking of state or some level of backtracking.

As is stands, however, this patch is sufficient to match common
interleave-type shuffle indices, for example matching <0,0,1,1> (or
commonly <0,u,1,u> or <u,0,u,1>) to an index sequence divided by 2.

While the optimization is relatively undef-tolerant, due to greedy
pattern-matching there even are some simple patterns which confuse the
sequence detection into identifying either a suboptimal sequence or no
sequence at all.

Currently only fractional-step sequences identified as having a
power-of-two denominator are actually lowered to RVV instructions. This
is to avoid introducing divisions into the generated code.

Diff Detail

Event Timeline

frasercrmck created this revision.Jul 22 2021, 2:24 AM
frasercrmck requested review of this revision.Jul 22 2021, 2:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2021, 2:24 AM
craig.topper added inline comments.Jul 29 2021, 9:34 AM

Is the * in the sequence trying to convey something?


Do we need to worry about ValDiff being INT64_MIN which makes std::abs undefined?


Why is SeqStepNum cast from int64_t to uint64_t? I don't understand why it is the original code either.

frasercrmck added inline comments.Jul 29 2021, 9:48 AM

It's supposed to convey that we might currently be at that element to demonstrate the code comment, but I agree it's not very clear.


Oh good catch, thanks. I'm surprised my sanitizer build didn't hit that. Sounds like a gap in my testing.


It's very possible I went overboard in trying to prevent signed integer overflow e.g. INT64_MIN. I could swear the sanitizers were complaining at me though.


This test sadly decides to spill even more than it did previously, meaning it's technically a regression.

craig.topper added inline comments.Jul 29 2021, 10:02 AM

Good point. I guess I'm only used to thinking about addition/subtract overflow not multiply overflow.

craig.topper added inline comments.Jul 29 2021, 10:07 AM

I guess that's because we have vid and vid/2 both live?

  • rebase
  • avoid use of std::abs to avoid INT64_MIN woes
frasercrmck marked 2 inline comments as done.Aug 2 2021, 8:55 AM
frasercrmck added inline comments.

Yeah to be honest I was surprised I hadn't ever really had to think about it before. I think I'm doing the right thing here but I wondering if it could be cleaner somehow.


That'd be my guess, yeah. It's keeping vid alive in v24 for longer when there aren't really that many registers available: 4, I think. It may even work out cheaper to "rematerialize" it later nearer the use. I don't know how in control we are of those decisions though.

This revision is now accepted and ready to land.Aug 2 2021, 9:33 AM
This revision was automatically updated to reflect the committed changes.
frasercrmck marked an inline comment as done.