Page MenuHomePhabricator

Change the context instruction for computeKnownBits in LoadStoreVectorizer pass

Authored by wvoquine on Apr 30 2021, 5:03 PM.



This change enables cases for which the index value for the first load/store instruction
in a pair could be a function argument. This allows using llvm.assume to provide known
bits information in such cases.

Diff Detail

Event Timeline

wvoquine created this revision.Apr 30 2021, 5:03 PM
wvoquine requested review of this revision.Apr 30 2021, 5:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2021, 5:03 PM
wvoquine added inline comments.Apr 30 2021, 5:46 PM

I've realized this should be "v2i8" in the test name.

wvoquine updated this revision to Diff 342102.Apr 30 2021, 7:16 PM
arsenm added inline comments.May 3 2021, 9:59 AM

How do we know A is the point the vectorized instruction will be inserted, and not B?


e.g. can you add some tests with an assume between the two instructions to vectorize?

wvoquine added inline comments.

By the way, we have discussed some of these points with @rtereshin and @bogner

Neither of OpA and OpB is a point of the vectorization. They both dominate the point of vectorization, but there's no information here about which one is closer to it.

This change only enables more cases to pass: if OpB is nullptr then it means we have bailed long ago in the function:
( See this part:

// At this point A could be a function parameter, i.e. not an instruction
Value *ValA = OpA->getOperand(0);
OpB = dyn_cast<Instruction>(OpB->getOperand(0));
if (!OpB || ValA->getType() != OpB->getType())
  return false;


I just allow it not to bail if OpA appears to be nullptr.

There's no assumption as to which one goes at the point or closer to the point of vectorization.

Also without llvm.assume I couldn't differentiate the two contexts in a test.

And I added a test with the llvm.assume - ld_v2i8_add_different_contexts.

Now, sometimes OpA might be with a better context for the calculation of known bits. In such cases we might fail to vectorize.

If llvm.assume's are inconsistent in the two points then this ought to be an UB.

For an improvement in the future we might need to use the actual vectorization point as the context.

wvoquine added inline comments.May 3 2021, 10:43 AM

Don't you think ld_v2i8_add_different_contexts would count as one, or do you prefer a test without a control flow in it?


There's no assumption as to which one goes at the point or closer to the point of vectorization.

Having said that, if a case is a result of unrolling, in many such occasions, OpB indeed naturally goes after OpA. And that means the context is more precise
(the contexts shouldn't be inconsistent).

wvoquine marked 2 inline comments as done.May 3 2021, 11:02 AM
arsenm added inline comments.May 4 2021, 10:34 AM

ld_v2i8_add_different_contexts does not have an assume between the two loads, so I don't mean that

wvoquine added inline comments.May 6 2021, 12:03 PM

I see some unexpected results when I place llvm.assume between loads, or just before the first load.
I will have to debug a little bit, and will likely update the change.

A safer version could be to still use OpA if it's not nullptr and OpB if it is.

wvoquine added inline comments.May 6 2021, 7:28 PM

In fact llvm.assume itself did block the transformation as it was considered to be writing to the memory.
Similar to other passes we can ignore assume in this case though.

wvoquine updated this revision to Diff 343570.May 6 2021, 8:42 PM

Added tests where the assume intrinsic is placed between the vectorized instructions or even after them in the end of the basic block.

Added assume intrinsic to the list of ignored instructions when detecting whether it's legal to vectorize over an instruction which may read or write to memory.

wvoquine updated this revision to Diff 343573.May 6 2021, 8:55 PM
wvoquine marked an inline comment as done.

Update the test ld_v2i8_add_context to align it with the comment put for the test: the llvm.assume should be placed between the two loads.

wvoquine added inline comments.May 6 2021, 9:02 PM

Now the tests ld_v2i8_add_different_contexts1 and ld_v2i8_add_context have llvm.assume between the vectorized loads.
Also in ld_v2i8_add_context1 the llvm.assume goes after both and a store coming after that, it still works with the single basic block.

wvoquine added inline comments.May 10 2021, 4:33 PM

I've added the tests along with the patch to let them show a difference (ignore llvm.assume itself when checking on memory conflicts between the two loads).

Are there more concerns?

arsenm accepted this revision.May 11 2021, 5:19 PM
This revision is now accepted and ready to land.May 11 2021, 5:19 PM

Thanks for accepting!

Could someone merge the change?

bogner accepted this revision.May 12 2021, 2:41 PM

lg. I'll merge this for you shortly

xbolva00 added inline comments.


wvoquine added inline comments.May 12 2021, 3:31 PM

Thanks! This could be addressed later on with the other intrinsics where that is applicable.