This is an archive of the discontinued LLVM Phabricator instance.

[VectorCombine] Fix alignment in single element store
ClosedPublic

Authored by qiucf on May 31 2021, 10:40 AM.

Details

Summary

This concern was raised in D98240. It's a miscompile and thanks for comments from @lebedev.ri.

Diff Detail

Event Timeline

qiucf created this revision.May 31 2021, 10:40 AM
qiucf requested review of this revision.May 31 2021, 10:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2021, 10:40 AM
tschuett added inline comments.
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
834

This line is too cute for me, but ...

Hm, this first needs more radical fixes - this is currently miscompiling vector indexes: https://alive2.llvm.org/ce/z/aWtH9w
I would suggest to first simply require the index to be constant.

Right, canScalarizeAccess() already does that, okay.
But then, it would be good to have a positive test for @insert_store_nonconst() :)

lebedev.ri added inline comments.May 31 2021, 11:34 AM
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
835

(alive *seems* to be happy with this)
This is going to be precise and optimal for constant indexes,
but i think we can get at least the lower-bound estimate for variable indexes:
the new address will be offset from the base address by DL.getTypeStoreSize(NewElement->getType()),
so i think we can do

else
  NewAlignment = commonAlignment(
          NewAlignment,
          DL.getTypeStoreSize(NewElement->getType()));

Please add positive tests:

; New alignment should be 8
define void @src(<8 x i64>* %q, i64 %s, i32 %idx) {
  %cmp = icmp ult i32 %idx, 2
  call void @llvm.assume(i1 %cmp)

  %i = load <8 x i64>, <8 x i64>* %q, align 8
  %vecins = insertelement <8 x i64> %i, i64 %s, i32 %idx
  store <8 x i64> %vecins, <8 x i64>* %q, align 8
  ret void
}

; New alignment should be 4
define void @src(<8 x i64>* %q, i64 %s, i32 %idx) {
  %cmp = icmp ult i32 %idx, 2
  call void @llvm.assume(i1 %cmp)

  %i = load <8 x i64>, <8 x i64>* %q, align 4
  %vecins = insertelement <8 x i64> %i, i64 %s, i32 %idx
  store <8 x i64> %vecins, <8 x i64>* %q, align 4
  ret void
}
qiucf updated this revision to Diff 348984.Jun 1 2021, 8:36 AM

Add another test (first..)

Please feel free to just directly commit new tests.
The new tests i asked for should be positive tests - they should be getting transformed (missing @llvm.assume())

lebedev.ri requested changes to this revision.Jun 2 2021, 6:48 AM
This revision now requires changes to proceed.Jun 2 2021, 6:48 AM
qiucf updated this revision to Diff 350528.Jun 8 2021, 1:38 AM

Thanks.
This looks fine to me now.
Can anyone spot any issues with the new alignment logic? @fhahn @spatel?

llvm/test/Transforms/VectorCombine/load-insert-store.ll
128

I think we still want those two tests i suggested,
they demonstrate that we don't increase alignment from the maximal one allowed.

Please precommit the tests.

spatel added inline comments.Jun 8 2021, 8:02 AM
llvm/test/Transforms/VectorCombine/load-insert-store.ll
23

How do we justify this increase in alignment?
The original code had minimal align 1, so it could be anything. We are creating a scalar store at an address 6 bytes over that, so it could still be anything?

lebedev.ri added inline comments.Jun 8 2021, 9:26 AM
llvm/test/Transforms/VectorCombine/load-insert-store.ll
23

This change is correct.
Before store <...>, align 1, we have already established that the %q is more aligned,
as per the load <...> with an implicit alignment, which isn't 1.
https://alive2.llvm.org/ce/z/C2qnUc

spatel added inline comments.Jun 8 2021, 9:45 AM
llvm/test/Transforms/VectorCombine/load-insert-store.ll
23

Ah, thanks for explaining. IIUC, we add explicit alignment to all load/store in IR now, so we should add the align 16 to this test to avoid confusion - and a test comment would be nice too :).

128

+1 - additional tests and pre-commit will make this easier to understand.

qiucf updated this revision to Diff 350760.Jun 8 2021, 7:06 PM
qiucf marked an inline comment as done.

Add explicit align to affected test.
Add comment for implicit alignment.

spatel added inline comments.Jun 9 2021, 9:39 AM
llvm/test/Transforms/VectorCombine/load-insert-store.ll
128

Let me know if I'm not seeing it, but we want 1 test with nonconst index where the original alignment is less than the presumed alignment for the new scalar store:

define void @src(<8 x i64>* %q, i64 %s, i32 %idx) {
  %cmp = icmp ult i32 %idx, 2
  call void @llvm.assume(i1 %cmp)
  %i = load <8 x i64>, <8 x i64>* %q, align 4
  %vecins = insertelement <8 x i64> %i, i64 %s, i32 %idx
  store <8 x i64> %vecins, <8 x i64>* %q, align 2 ; make this different just to exercise the logic a bit more
  ret void
}
lebedev.ri requested changes to this revision.Jun 9 2021, 9:40 AM

(better, but still not quite there, my previous comments still stand unaddressed...)

This revision now requires changes to proceed.Jun 9 2021, 9:40 AM
qiucf updated this revision to Diff 351074.Jun 9 2021, 11:31 PM
lebedev.ri accepted this revision.Jun 10 2021, 2:16 AM

LGTM unless there are other comments.
Thanks.

This revision is now accepted and ready to land.Jun 10 2021, 2:16 AM
spatel accepted this revision.Jun 10 2021, 6:12 AM

LGTM

This revision was automatically updated to reflect the committed changes.
qiucf added a comment.Jun 10 2021, 7:32 PM

Landed. Thanks for the review!