
Details

Reviewers

lebedev.ri
fhahn
spatel
efriedma
hgreving
nlopes

Commits

rG2670c7dd5b25: [VectorCombine] Fix alignment in single element store

Summary

This concern was raised in D98240. It is a miscompile; thanks to @lebedev.ri for the comments.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	30 ms	x64 debian > LLVM.Transforms/VectorCombine/AArch64::load-extract-insert-store-scalarization.ll
	90 ms	x64 windows > LLVM.Transforms/VectorCombine/AArch64::load-extract-insert-store-scalarization.ll

Event Timeline

qiucf created this revision. May 31 2021, 10:40 AM

Herald added a subscriber: hiraditya. May 31 2021, 10:40 AM

qiucf requested review of this revision. May 31 2021, 10:40 AM

Herald added a project: Restricted Project. May 31 2021, 10:40 AM

Herald added a subscriber: llvm-commits.

tschuett added a subscriber: tschuett. May 31 2021, 10:44 AM

tschuett added inline comments.

llvm/lib/Transforms/Vectorize/VectorCombine.cpp
834	This line is too cute for me, but ...

Hm, this first needs more radical fixes - this is currently miscompiling vector indexes: https://alive2.llvm.org/ce/z/aWtH9w
I would suggest first simply requiring the index to be constant.

Right, canScalarizeAccess() already does that, okay.
But then, it would be good to have a positive test for @insert_store_nonconst() :)

lebedev.ri added inline comments. May 31 2021, 11:34 AM

llvm/lib/Transforms/Vectorize/VectorCombine.cpp

835

(alive *seems* to be happy with this)
This is going to be precise and optimal for constant indexes,
but I think we can get at least a lower-bound estimate for variable indexes:
the new address will be offset from the base address by a multiple of DL.getTypeStoreSize(NewElement->getType()),
so I think we can do

else
  NewAlignment = commonAlignment(
          NewAlignment,
          DL.getTypeStoreSize(NewElement->getType()));

Please add positive tests:

; New alignment should be 8
define void @src(<8 x i64>* %q, i64 %s, i32 %idx) {
  %cmp = icmp ult i32 %idx, 2
  call void @llvm.assume(i1 %cmp)

  %i = load <8 x i64>, <8 x i64>* %q, align 8
  %vecins = insertelement <8 x i64> %i, i64 %s, i32 %idx
  store <8 x i64> %vecins, <8 x i64>* %q, align 8
  ret void
}

; New alignment should be 4
define void @src(<8 x i64>* %q, i64 %s, i32 %idx) {
  %cmp = icmp ult i32 %idx, 2
  call void @llvm.assume(i1 %cmp)

  %i = load <8 x i64>, <8 x i64>* %q, align 4
  %vecins = insertelement <8 x i64> %i, i64 %s, i32 %idx
  store <8 x i64> %vecins, <8 x i64>* %q, align 4
  ret void
}
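The alignment reasoning in this comment can be sketched outside LLVM as plain C++. This is a minimal standalone model, not the code from the patch. `commonAlign` is a hypothetical stand-in that assumes LLVM's `commonAlignment` returns the largest power of two dividing both the alignment and the byte offset, and `scalarStoreAlign` is an illustrative helper name that does not appear in the review.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::commonAlignment (assumption): the largest
// power of two that divides both a power-of-two alignment and a byte offset.
// An offset of 0 leaves the alignment unchanged.
constexpr std::uint64_t commonAlign(std::uint64_t Align, std::uint64_t Offset) {
  return (Align | Offset) & (~(Align | Offset) + 1);
}

// Illustrative helper (not from the patch): alignment that can be claimed for
// a scalar store to element Idx of a vector whose base pointer is known to be
// VecAlign-aligned. With a variable index the byte offset is some multiple of
// ElemSize, so ElemSize itself gives a safe lower bound.
constexpr std::uint64_t scalarStoreAlign(std::uint64_t VecAlign,
                                         std::uint64_t ElemSize,
                                         bool IdxIsConstant,
                                         std::uint64_t Idx) {
  return IdxIsConstant ? commonAlign(VecAlign, Idx * ElemSize)
                       : commonAlign(VecAlign, ElemSize);
}
```

Under these assumptions, the two suggested tests (8-byte `i64` elements, variable index) work out to `scalarStoreAlign(8, 8, false, 0) == 8` and `scalarStoreAlign(4, 8, false, 0) == 4`, matching the "New alignment should be 8" and "New alignment should be 4" comments.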

Harbormaster completed remote builds in B106941: Diff 348837. May 31 2021, 12:00 PM

Add another test (first..)

Harbormaster completed remote builds in B107049: Diff 348984. Jun 1 2021, 9:12 AM

Please feel free to just directly commit new tests.
The new tests i asked for should be positive tests - they should be getting transformed (missing @llvm.assume())

lebedev.ri requested changes to this revision. Jun 2 2021, 6:48 AM

This revision now requires changes to proceed. Jun 2 2021, 6:48 AM

qiucf updated this revision to Diff 350528. Jun 8 2021, 1:38 AM

Harbormaster completed remote builds in B108158: Diff 350528. Jun 8 2021, 2:10 AM

Thanks.
This looks fine to me now.
Can anyone spot any issues with the new alignment logic? @fhahn @spatel?

llvm/test/Transforms/VectorCombine/load-insert-store.ll
123	I think we still want those two tests I suggested; they demonstrate that we don't increase alignment beyond the maximum allowed. Please precommit the tests.

spatel added inline comments. Jun 8 2021, 8:02 AM

llvm/test/Transforms/VectorCombine/load-insert-store.ll
23	How do we justify this increase in alignment? The original code had minimal `align 1`, so it could be anything. We are creating a scalar store at an address 6 bytes over that, so it could still be anything?

lebedev.ri added inline comments. Jun 8 2021, 9:26 AM

llvm/test/Transforms/VectorCombine/load-insert-store.ll
23	This change is correct. Before `store <...>, align 1`, we have already established that the `%q` is more aligned, as per the `load <...>` with an implicit alignment, which isn't `1`. https://alive2.llvm.org/ce/z/C2qnUc

spatel added inline comments. Jun 8 2021, 9:45 AM

llvm/test/Transforms/VectorCombine/load-insert-store.ll
23	Ah, thanks for explaining. IIUC, we add explicit alignment to all load/store in IR now, so we should add the `align 16` to this test to avoid confusion - and a test comment would be nice too :).
123	+1 - additional tests and pre-commit will make this easier to understand.
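The arithmetic behind the `align 1` to `align 2` change on line 23 can be worked through in a small standalone C++ sketch. It follows the two steps visible in Diff 348837 (take the larger of the store and load alignments, then reduce by the byte offset of the element), but it is only a model: `commonAlign` and `constIndexStoreAlign` are hypothetical names, and the load's implicit alignment is assumed to be 16, as the suggested explicit `align 16` implies.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::commonAlignment (assumption): the largest
// power of two dividing both a power-of-two alignment and a byte offset.
constexpr std::uint64_t commonAlign(std::uint64_t Align, std::uint64_t Offset) {
  return (Align | Offset) & (~(Align | Offset) + 1);
}

// Illustrative helper (not from the patch): alignment of the scalar store for
// a constant index. The base pointer is at least as aligned as the stronger of
// the original load and store; the element sits Idx * ElemSize bytes past it.
constexpr std::uint64_t constIndexStoreAlign(std::uint64_t StoreAlign,
                                             std::uint64_t LoadAlign,
                                             std::uint64_t Idx,
                                             std::uint64_t ElemSize) {
  return commonAlign(StoreAlign > LoadAlign ? StoreAlign : LoadAlign,
                     Idx * ElemSize);
}
```

For `@insert_store_i16_align1`, the store is `align 1`, the load is assumed to be 16-byte aligned, the index is 3, and an `i16` occupies 2 bytes. The base alignment is therefore 16, the offset is 6 bytes, and `constIndexStoreAlign(1, 16, 3, 2)` evaluates to 2.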

Add explicit align to affected test.
Add comment for implicit alignment.

Harbormaster completed remote builds in B108321: Diff 350760. Jun 8 2021, 7:45 PM

spatel added inline comments. Jun 9 2021, 9:39 AM

llvm/test/Transforms/VectorCombine/load-insert-store.ll

123

Let me know if I'm not seeing it, but we want 1 test with nonconst index where the original alignment is less than the presumed alignment for the new scalar store:

define void @src(<8 x i64>* %q, i64 %s, i32 %idx) {
  %cmp = icmp ult i32 %idx, 2
  call void @llvm.assume(i1 %cmp)
  %i = load <8 x i64>, <8 x i64>* %q, align 4
  %vecins = insertelement <8 x i64> %i, i64 %s, i32 %idx
  store <8 x i64> %vecins, <8 x i64>* %q, align 2 ; make this different just to exercise the logic a bit more
  ret void
}
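One way to sanity-check what alignment is safe for a test like this is a bounded brute-force enumeration over base addresses and indexes, loosely in the spirit of the alive2 checks linked earlier. The C++ sketch below is an assumption-laden model: `alignHolds` is a hypothetical helper, it takes the load's `align 4` as the known alignment of `%q`, and it only checks 64 base addresses. Which alignment the final patch emits for this test is not shown in this thread.

```cpp
#include <cstdint>

// Illustrative brute-force check (not from the patch). Returns true if every
// address Base + Idx * ElemSize is a multiple of Claimed, for Base ranging
// over the first 64 multiples of BaseAlign and Idx over [0, NumIdx).
constexpr bool alignHolds(std::uint64_t BaseAlign, std::uint64_t ElemSize,
                          std::uint64_t NumIdx, std::uint64_t Claimed) {
  for (std::uint64_t Base = 0; Base < 64 * BaseAlign; Base += BaseAlign)
    for (std::uint64_t Idx = 0; Idx < NumIdx; ++Idx)
      if ((Base + Idx * ElemSize) % Claimed != 0)
        return false; // found an address that violates the claimed alignment
  return true;
}
```

With a 4-byte-aligned base, 8-byte `i64` elements, and `%idx` restricted to 0 or 1 by the `@llvm.assume`, `alignHolds(4, 8, 2, 4)` is true while `alignHolds(4, 8, 2, 8)` is false, so 4 is the largest alignment this model supports for the scalar store.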

(better, but still not quite there, my previous comments still stand unaddressed...)

This revision now requires changes to proceed. Jun 9 2021, 9:40 AM

qiucf updated this revision to Diff 351074. Jun 9 2021, 11:31 PM

Harbormaster completed remote builds in B108547: Diff 351074. Jun 10 2021, 12:11 AM

LGTM unless there are other comments.
Thanks.

This revision is now accepted and ready to land. Jun 10 2021, 2:16 AM

LGTM

Closed by commit rG2670c7dd5b25: [VectorCombine] Fix alignment in single element store (authored by qiucf). Jun 10 2021, 7:32 PM

This revision was automatically updated to reflect the committed changes.

qiucf added a commit: rG2670c7dd5b25: [VectorCombine] Fix alignment in single element store.

Landed. Thanks for the review!

Diff 348837

llvm/lib/Transforms/Vectorize/VectorCombine.cpp

[...]
       MemoryLocation::get(SI), AA))
     return false;

   Value *GEP = GetElementPtrInst::CreateInBounds(
       SI->getPointerOperand(), {ConstantInt::get(Idx->getType(), 0), Idx});
   Builder.Insert(GEP);
   StoreInst *NSI = Builder.CreateStore(NewElement, GEP);
   NSI->copyMetadata(*SI);
-  if (SI->getAlign() < NSI->getAlign())
-    NSI->setAlignment(SI->getAlign());
+  Align NewAlignment(1);
+  if (auto *C = dyn_cast<ConstantInt>(Idx)) {
+    NewAlignment = std::max(SI->getAlign(), Load->getAlign());
+    NewAlignment = commonAlignment(
+        NewAlignment,
+        C->getZExtValue() * DL.getTypeStoreSize(NewElement->getType()));
+  }
+  NSI->setAlignment(NewAlignment);
   replaceValue(I, *NSI);
   // Need erasing the store manually.
   I.eraseFromParent();
   return true;
 }

 return false;
 }
[...]

llvm/test/Transforms/VectorCombine/load-insert-store.ll

[...]
 entry:
   store <16 x i8> %vecins, <16 x i8>* %q, align 16
   ret void
 }

 define void @insert_store_i16_align1(<8 x i16>* %q, i16 zeroext %s) {
 ; CHECK-LABEL: @insert_store_i16_align1(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds <8 x i16>, <8 x i16>* [[Q:%.*]], i32 0, i32 3
-; CHECK-NEXT:    store i16 [[S:%.*]], i16* [[TMP0]], align 1
+; CHECK-NEXT:    store i16 [[S:%.*]], i16* [[TMP0]], align 2
 ; CHECK-NEXT:    ret void
 ;
 entry:
   %0 = load <8 x i16>, <8 x i16>* %q
   %vecins = insertelement <8 x i16> %0, i16 %s, i32 3
   store <8 x i16> %vecins, <8 x i16>* %q, align 1
   ret void
 }
[...]
 ; CHECK-LABEL: @insert_store_nonconst(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[TMP0:%.*]] = load <16 x i8>, <16 x i8>* [[Q:%.*]], align 16
 ; CHECK-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX:%.*]]
 ; CHECK-NEXT:    store <16 x i8> [[VECINS]], <16 x i8>* [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
 entry:
   %0 = load <16 x i8>, <16 x i8>* %q

This is an archive of the discontinued LLVM Phabricator instance.

[VectorCombine] Fix alignment in single element store
Closed, Public
