This is an archive of the discontinued LLVM Phabricator instance.

Fix an ordering bug in the scalarizer.
Closed · Public

Authored by sheredom on Sep 26 2018, 3:06 AM.


Details

Reviewers

arsenm
mehdi_amini
uabelho
bkramer

Commits

rG3d4579829e85: Fix an ordering bug in the scalarizer.
rL344128: Fix an ordering bug in the scalarizer.

Summary

I've added a new test case that causes the scalarizer to try to use dead-and-erased values. This happens when the basic blocks are not laid out in domination order within the function. To fix this, instead of iterating through the blocks in function order, I walk them in reverse post-order.
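To illustrate why the traversal order matters, here is a minimal standalone sketch of a reverse post-order walk over the CFG shape used in order-bug.ll (entry branches to z, z branches to y, but the blocks are laid out as entry, y, z). It does not use any LLVM API: the names `CFG`, `postOrder`, `reversePostOrder`, and `orderBugCFG` are hypothetical and exist only for this example. The real patch relies on LLVM's `ReversePostOrderTraversal` instead.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a function's CFG: block name -> successor names.
using CFG = std::map<std::string, std::vector<std::string>>;

// Depth-first walk that records each block only after all of its successors.
static void postOrder(const CFG &G, const std::string &BB,
                      std::set<std::string> &Visited,
                      std::vector<std::string> &Out) {
  if (!Visited.insert(BB).second)
    return; // already visited
  auto It = G.find(BB);
  assert(It != G.end() && "block missing from CFG");
  for (const std::string &Succ : It->second)
    postOrder(G, Succ, Visited, Out);
  Out.push_back(BB);
}

// Reverse post-order starting at Entry. Every block appears before the
// blocks it dominates; blocks unreachable from Entry are not visited.
std::vector<std::string> reversePostOrder(const CFG &G,
                                          const std::string &Entry) {
  std::set<std::string> Visited;
  std::vector<std::string> Order;
  postOrder(G, Entry, Visited, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}

// CFG shape of order-bug.ll: entry -> z -> y, laid out as entry, y, z.
CFG orderBugCFG() {
  return {{"entry", {"z"}}, {"y", {}}, {"z", {"y"}}};
}
```

Calling `reversePostOrder(orderBugCFG(), "entry")` yields entry, z, y, so the block defining `%b` (z) is processed before the block using it (y), whereas a walk in layout order would reach y first.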

Diff Detail

Repository: rL LLVM

Event Timeline

sheredom created this revision. Sep 26 2018, 3:06 AM

Herald added subscribers: llvm-commits, wdng. Sep 26 2018, 3:06 AM

Is this the same problem as described in https://bugs.llvm.org/show_bug.cgi?id=28911 ?

uabelho added a subscriber: dstenb. Sep 27 2018, 4:44 AM

In D52540#1247681, @uabelho wrote:

Is this the same problem as described in https://bugs.llvm.org/show_bug.cgi?id=28911 ?

I wasn't seeing the exact same failure as that bug describes - it was failing in Scalarizer::finish in the test included.

But I did pull the example from the bugzilla and tested it and it no longer crashes with my scalarizer fix.

In D52540#1247778, @sheredom wrote:

In D52540#1247681, @uabelho wrote:

Is this the same problem as described in https://bugs.llvm.org/show_bug.cgi?id=28911 ?

I wasn't seeing the exact same failure as that bug describes - it was failing in Scalarizer::finish in the test included.

But I did pull the example from the bugzilla and tested it and it no longer crashes with my scalarizer fix.

I don't know the Scalarizer but from the descriptions of the problems I think it sounds like they are similar or the same,
and if so, there are then two possible solutions.

dstenb described a possible fix in a comment of PR28911 and we've been using that fix for a long time now for our
out-of-tree target without problems.

Perhaps that fix is a hack and using RPOT is the proper way to deal with this, I've no idea. I just wanted to point out the
possibility.

Anyway, I applied the fix in this patch on our local tree and so far so good so I don't have any objections, but I don't
really know the code enough to say it's the right thing to do.

In D52540#1247810, @uabelho wrote:

dstenb described a possible fix in a comment of PR28911 and we've been using that fix for a long time now for our
out-of-tree target without problems.

Perhaps that fix is a hack and using RPOT is the proper way to deal with this, I've no idea. I just wanted to point out the
possibility.

I took a look at dstenb's approach and it is definitely not a hack - it's just solving this issue from a slightly different viewpoint. Their approach just ensures that the lists of values are never accidentally invalidated, whereas mine forces us to walk the BB graph in order such that we should never generate invalid lists in the first place.

Anyway, I applied the fix in this patch on our local tree and so far so good so I don't have any objections, but I don't
really know the code enough to say it's the right thing to do.

Great - glad it works for you!

Should I incorporate the test case from the bugzilla into this commit? Given that it's a slightly different failure case that is also fixed by my change, it seems worth ensuring it doesn't accidentally regress again.

In D52540#1247832, @sheredom wrote:

Should I incorporate the test case from the bugzilla into this commit? Given that it's a slightly different failure case that is also fixed by my change, it seems worth ensuring it doesn't accidentally regress again.

Yes, sure that sounds good!

We have a local regression test for PR28911 that dstenb created when he applied the fix locally for our target:

; RUN: opt %s -scalarizer -verify -S -o - | FileCheck %s

; ModuleID = 'bugpoint-reduced-simplified.bc'
source_filename = "bugpoint-output-32b26ac.bc"
target triple = "x86_64-unknown-linux-gnu"

define void @f3() local_unnamed_addr {
bb1:
  br label %bb2

bb3:
;CHECK-LABEL: bb3:
;CHECK-NEXT: br label %bb4
  %h.10.0.vec.insert = shufflevector <1 x i16> %h.10.1, <1 x i16> undef, <1 x i32> <i32 0>
  br label %bb4

bb2:
;CHECK-LABEL: bb2:
;CHECK: phi i16
  %h.10.1 = phi <1 x i16> [ undef, %bb1 ]
  br label %bb3

bb4:
;CHECK-LABEL: bb4:
;CHECK: phi i16
  %h.10.2 = phi <1 x i16> [ %h.10.0.vec.insert, %bb3 ]
  ret void
}

Perhaps something like that could be used.

Incorporated a related test case from https://bugs.llvm.org/show_bug.cgi?id=28911, which my approach also fixes.

We are using the Scalarizer in our out-of-tree target and I've run some tests with the patch without problems so I think it's ok.
Please wait a day before submitting though in case someone who really knows this code objects, but if not I think it's ok to push.

This revision is now accepted and ready to land. Oct 2 2018, 11:39 PM

Closed by commit rL344128: Fix an ordering bug in the scalarizer. (authored by sheredom). Oct 10 2018, 2:29 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                  Size
llvm/trunk/lib/Transforms/Scalar/Scalarizer.cpp       9 lines
llvm/trunk/test/Transforms/Scalarizer/crash-bug.ll    2 lines
llvm/trunk/test/Transforms/Scalarizer/order-bug.ll    23 lines
llvm/trunk/test/Transforms/Scalarizer/phi-bug.ll      24 lines

Diff 168959

llvm/trunk/lib/Transforms/Scalar/Scalarizer.cpp

 //===- Scalarizer.cpp - Scalarize vector operations -----------------------===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//
 //
 // This pass converts vector operations into scalar operations, in order
 // to expose optimization opportunities on the individual scalar operations.
 // It is mainly intended for targets that do not have vector units, but it
 // may also be useful for revectorizing code to different vector widths.
 //
 //===----------------------------------------------------------------------===//

+#include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Analysis/VectorUtils.h"
 #include "llvm/IR/Argument.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 (259 lines not shown)
   ScalarizeLoadStore =
       M.getContext().getOption<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>();
   return false;
 }

 bool Scalarizer::runOnFunction(Function &F) {
   if (skipFunction(F))
     return false;
   assert(Gathered.empty() && Scattered.empty());
-  for (BasicBlock &BB : F) {
-    for (BasicBlock::iterator II = BB.begin(), IE = BB.end(); II != IE;) {
+  // To ensure we replace gathered components correctly we need to do an ordered
+  // traversal of the basic blocks in the function.
+  ReversePostOrderTraversal<BasicBlock *> RPOT(&F.getEntryBlock());
+  for (BasicBlock *BB : RPOT) {
+    for (BasicBlock::iterator II = BB->begin(), IE = BB->end(); II != IE;) {
       Instruction *I = &*II;
       bool Done = visit(I);
       ++II;
       if (Done && I->getType()->isVoidTy())
         I->eraseFromParent();
     }
   }
   return finish();
 (501 lines not shown)

llvm/trunk/test/Transforms/Scalarizer/crash-bug.ll

 (9 lines not shown)
   %bb2_vec = shufflevector <2 x i16> <i16 0, i16 10000>,
                            <2 x i16> %bb1_vec,
                            <2 x i32> <i32 0, i32 3>
   br label %bb1

 bb1: ; preds = %bb2, %0
   %bb1_vec = phi <2 x i16> [ <i16 100, i16 200>, %0 ], [ %bb2_vec, %bb2 ]
 ;CHECK: bb1:
 ;CHECK: %bb1_vec.i0 = phi i16 [ 100, %0 ], [ 0, %bb2 ]
-;CHECK: %bb1_vec.i1 = phi i16 [ 200, %0 ], [ %bb1_vec.i1, %bb2 ]
+;CHECK: %bb2_vec.i1 = phi i16 [ 200, %0 ], [ %bb2_vec.i1, %bb2 ]
   br i1 undef, label %bb3, label %bb2

 bb3:
   ret void
 }

llvm/trunk/test/Transforms/Scalarizer/order-bug.ll

+; RUN: opt %s -scalarizer -S -o - | FileCheck %s
+
+; This input caused the scalarizer to replace & erase gathered results when
+; future gathered results depended on them being alive
+
+define dllexport spir_func <4 x i32> @main(float %a) {
+entry:
+  %i = insertelement <4 x float> undef, float %a, i32 0
+  br label %z
+
+y:
+; CHECK: %f.upto0 = insertelement <4 x i32> undef, i32 %b.i0, i32 0
+; CHECK: %f.upto1 = insertelement <4 x i32> %f.upto0, i32 %b.i0, i32 1
+; CHECK: %f.upto2 = insertelement <4 x i32> %f.upto1, i32 %b.i0, i32 2
+; CHECK: %f = insertelement <4 x i32> %f.upto2, i32 %b.i0, i32 3
+  %f = shufflevector <4 x i32> %b, <4 x i32> undef, <4 x i32> zeroinitializer
+  ret <4 x i32> %f
+
+z:
+; CHECK: %b.i0 = bitcast float %a to i32
+  %b = bitcast <4 x float> %i to <4 x i32>
+  br label %y
+}

llvm/trunk/test/Transforms/Scalarizer/phi-bug.ll

+; RUN: opt %s -scalarizer -verify -S -o - | FileCheck %s
+
+define void @f3() local_unnamed_addr {
+bb1:
+  br label %bb2
+
+bb3:
+; CHECK-LABEL: bb3:
+; CHECK-NEXT: br label %bb4
+  %h.10.0.vec.insert = shufflevector <1 x i16> %h.10.1, <1 x i16> undef, <1 x i32> <i32 0>
+  br label %bb4
+
+bb2:
+; CHECK-LABEL: bb2:
+; CHECK: phi i16
+  %h.10.1 = phi <1 x i16> [ undef, %bb1 ]
+  br label %bb3
+
+bb4:
+; CHECK-LABEL: bb4:
+; CHECK: phi i16
+  %h.10.2 = phi <1 x i16> [ %h.10.0.vec.insert, %bb3 ]
+  ret void
+}