This is an archive of the discontinued LLVM Phabricator instance.

Support sext and trunc instructions in SCEV delinearization algorithm
Needs Review · Public

Authored by ManuelSelva on Jul 17 2017, 4:29 AM.

Details

Reviewers

grosser
sebpop

Summary

This patch is a first attempt at supporting sext and trunc instructions in the SCEV delinearization algorithm.
The patch includes:

modifications to the algorithm that divides one SCEV by another, so that it supports sext and trunc instructions;
a test case for this new feature.

Event Timeline

ManuelSelva created this revision. Jul 17 2017, 4:29 AM

Herald added subscribers: mzolotukhin, sanjoy. Jul 17 2017, 4:29 AM

ManuelSelva added a project: Restricted Project. Jul 17 2017, 6:00 AM

Add Sebastian as reviewer.

Hi Manuel,

Most changes are simple; I added a couple of style comments. I will probably only get to this on the weekend, but the main thing we need to check is under which conditions ignoring sext/zext is legal. I am afraid this is only OK with non-trivial checks, or if we can get nsw information from the IR.
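To make the legality concern concrete, here is a minimal sketch on plain integers (not SCEV code; all function names are hypothetical). It models an `add i32` without nsw as a wrapping addition and compares sign-extending the sum with summing the sign-extended operands. The two agree only when the 32-bit addition does not wrap, which is the guarantee nsw would provide.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit addition with two's-complement wraparound, modelling an LLVM
// `add i32` that carries no nsw flag. Unsigned arithmetic avoids signed
// overflow UB; the conversion back to int32_t is modular on mainstream
// compilers (and guaranteed to be so since C++20).
constexpr int32_t wrapping_add_i32(int32_t a, int32_t b) {
  return static_cast<int32_t>(static_cast<uint32_t>(a) +
                              static_cast<uint32_t>(b));
}

// sext(a + b): add in 32 bits, then widen to 64 bits.
constexpr int64_t sext_of_sum(int32_t a, int32_t b) {
  return static_cast<int64_t>(wrapping_add_i32(a, b));
}

// sext(a) + sext(b): widen first, then add in 64 bits.
constexpr int64_t sum_of_sext(int32_t a, int32_t b) {
  return static_cast<int64_t>(a) + static_cast<int64_t>(b);
}

// True when the sext can be moved across the add for these inputs.
constexpr bool sext_distributes(int32_t a, int32_t b) {
  return sext_of_sum(a, b) == sum_of_sext(a, b);
}

// Usage: the two orders agree for small values and disagree once the
// 32-bit add wraps.
inline void demo_sext_legality() {
  assert(sext_distributes(1, 2));
  assert(!sext_distributes(INT32_MAX, 1));
}
```

With `INT32_MAX` and 1, the 32-bit sum wraps to `INT32_MIN` before being widened, whereas the 64-bit sum is 2147483648, so treating the sext as transparent would change the value.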

Out of interest: why does your IR have a mix of 32- and 64-bit values in the first place? And why do you have overflow-checked intrinsics?

lib/Analysis/ScalarEvolution.cpp
873	Indentation.
878	Indentation.
925	Alignment?
940	Can this assert ever be reached? It seems you bail out right before the position at which the assert is inserted.

Hi Tobias,

Thank you for looking at this. To first answer your "out of interest" questions: this IR is generated by JavaScriptCore, a JavaScript virtual machine that performs JIT compilation. The virtual machine uses NaN-boxing, i.e., it uses the otherwise unused bit patterns (all of which represent NaN) of the 64-bit floating-point format to represent JavaScript objects. In this representation, 32-bit integers are represented by 64-bit values whose upper 32 bits are the pattern FFFF 0000. As a consequence, to get the integer value from a 64-bit value, the virtual machine generates LLVM code that truncates the value to a 32-bit integer.
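The encoding described above can be sketched as follows. This is an illustration under the stated assumption that a boxed 32-bit integer has FFFF 0000 in its upper 32 bits; the constant and function names are hypothetical and are not taken from JavaScriptCore.

```cpp
#include <cassert>
#include <cstdint>

// Assumed tag: upper 32 bits equal to FFFF 0000 mark a boxed 32-bit integer.
constexpr uint64_t kInt32Tag = 0xFFFF000000000000ULL;

// Box a 32-bit integer: tag in the upper half, payload in the lower half.
constexpr uint64_t box_int32(int32_t v) {
  return kInt32Tag | static_cast<uint32_t>(v);
}

// Check the upper 32 bits against the assumed tag pattern.
constexpr bool is_boxed_int32(uint64_t bits) {
  return (bits >> 32) == 0xFFFF0000u;
}

// Unbox by keeping only the low 32 bits, the C++ counterpart of
// `trunc i64 %x to i32`.
inline int32_t unbox_int32(uint64_t bits) {
  assert(is_boxed_int32(bits));
  return static_cast<int32_t>(static_cast<uint32_t>(bits));
}
```

Under this assumed encoding, the constant -281474976710602 stored by the test case in this revision is the bit pattern 0xFFFF000000000036, i.e. the boxed integer 54.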

Then, when performing arithmetic operations on these 32-bit integers, we need to ensure that they do not overflow: if they do, the result can no longer be represented with the NaN-boxing trick, so we need to do *something* else to represent it properly. This "something" is not present in the test case I provided, in order to keep it as minimal as possible.

If you are curious about NaN-boxing and, more generally, about JavaScript implementations, you can have a look at the following links:

https://manuelselva.github.io/docs/presentations/2017-06-21-Manuel-Selva-js-polyhedral-journees-compil.pdf (slides I made for a recent presentation; see the backup slides)
https://wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations

I'll now look into the details of this patch, to ensure its correctness as you said.

Best,
Manu

alexsusu added a subscriber: alexsusu. Oct 11 2017, 6:38 AM

Hello.
  I applied the patch provided in this review to my LLVM build.
  I would like to draw attention to a bug in the delinearization procedure that is related to type casts (from i32 to i64) and is probably related to the problems this patch was supposed to address.
  I attach a C file that reproduces this bug: the delinearization procedure fails because of a sext from i32 to i64, and Polly finds non-affine access functions, which it shouldn't. Note that line 3 of the same C file contains a commented-out signature of the Test function; when it is uncommented, delinearization works and all access functions are affine.
  I will try to look deeper into the delinearization problem. Please let me know if you can help with it.

Best regards,
  Alex

MultiDim.c (1 KB)
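For readers unfamiliar with the term, the idea behind delinearization can be sketched on concrete integers: a linearized offset `i * cols + j` is divided by the inner dimension, and the quotient and remainder give back the two subscripts. This is only a toy with hypothetical names; it is neither the SCEV-based algorithm nor the contents of MultiDim.c.

```cpp
#include <cassert>
#include <cstdint>

// The two subscripts of a 2-D access A[row][col].
struct Subscripts {
  int64_t row;
  int64_t col;
};

// Linearize A[i][j] for an array with `cols` columns.
constexpr int64_t linearize(int64_t i, int64_t j, int64_t cols) {
  return i * cols + j;
}

// Recover the subscripts by dividing by the inner dimension:
// the quotient is the row, the remainder is the column.
// This is only valid when 0 <= j < cols held during linearization.
inline Subscripts delinearize(int64_t offset, int64_t cols) {
  assert(cols > 0 && offset >= 0);
  return {offset / cols, offset % cols};
}
```

For example, `delinearize(linearize(3, 4, 10), 10)` yields row 3 and column 4. In the compiler the same division is attempted symbolically, which is where a sext wrapped around the 32-bit index expression can get in the way.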

alexsusu added a child revision: D40369: Support sext instruction in SCEV delinearization algorithm (new revision). Nov 22 2017, 11:07 AM

Meinersbur removed a child revision: D40369: Support sext instruction in SCEV delinearization algorithm (new revision). May 2 2018, 10:07 AM

Revision Contents

Path	Size
lib/Analysis/ScalarEvolution.cpp	52 lines
test/Analysis/Delinearization/constant_sext_trunc.ll	83 lines

Diff 106849

lib/Analysis/ScalarEvolution.cpp

Context not available.

   // Except in the trivial case described above, we do not know how to divide
   // Expr by Denominator for the following functions with empty implementation.
-  void visitTruncateExpr(const SCEVTruncateExpr *Numerator) {}
   void visitZeroExtendExpr(const SCEVZeroExtendExpr *Numerator) {}
-  void visitSignExtendExpr(const SCEVSignExtendExpr *Numerator) {}
   void visitUDivExpr(const SCEVUDivExpr *Numerator) {}
   void visitSMaxExpr(const SCEVSMaxExpr *Numerator) {}
   void visitUMaxExpr(const SCEVUMaxExpr *Numerator) {}
   void visitUnknown(const SCEVUnknown *Numerator) {}
   void visitCouldNotCompute(const SCEVCouldNotCompute *Numerator) {}

+  void visitTruncateExpr(const SCEVTruncateExpr *Numerator) {
+    const SCEV *Q, *R;
+    divide(SE, Numerator->getOperand(), Denominator, &Q, &R);
+    Quotient = SE.getTruncateExpr(Q, Numerator->getType());
grosser: Indentation.
+    Remainder = SE.getTruncateExpr(R, Numerator->getType());
+  }
+
+  void visitSignExtendExpr(const SCEVSignExtendExpr *Numerator) {
+    const SCEV *Q, *R;
grosser: Indentation.
+    divide(SE, Numerator->getOperand(), Denominator, &Q, &R);
+    Quotient = SE.getSignExtendExpr(Q, Numerator->getType());
+    Remainder = SE.getSignExtendExpr(R, Numerator->getType());
+  }
+
+
   void visitConstant(const SCEVConstant *Numerator) {
     if (const SCEVConstant *D = dyn_cast<SCEVConstant>(Denominator)) {
       APInt NumeratorVal = Numerator->getAPInt();
Context not available.
       APInt::sdivrem(NumeratorVal, DenominatorVal, QuotientVal, RemainderVal);
       Quotient = SE.getConstant(QuotientVal);
       Remainder = SE.getConstant(RemainderVal);
+      Quotient = SE.getTruncateOrNoop(Quotient, Numerator->getType());
+      Remainder = SE.getTruncateOrNoop(Remainder, Numerator->getType());
       return;
     }
   }
Context not available.
       return cannotDivide(Numerator);
     divide(SE, Numerator->getStart(), Denominator, &StartQ, &StartR);
     divide(SE, Numerator->getStepRecurrence(SE), Denominator, &StepQ, &StepR);
-    // Bail out if the types do not match.
-    Type *Ty = Denominator->getType();
-    if (Ty != StartQ->getType() || Ty != StartR->getType() ||
-        Ty != StepQ->getType() || Ty != StepR->getType())
-      return cannotDivide(Numerator);
-    Quotient = SE.getAddRecExpr(StartQ, StepQ, Numerator->getLoop(),
-                                Numerator->getNoWrapFlags());
-    Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(),
-                                 Numerator->getNoWrapFlags());
+    assert(
+        Numerator->getStart()->getType() == StartQ->getType()
+        && StartQ->getType() == StartR->getType()
+        && "Expected matching types");
+    assert(
+        Numerator->getStepRecurrence(SE)->getType() == StepQ->getType()
+        && StepQ->getType() == StepR->getType()
+        && "Expected matching types");
+    Quotient = SE.getAddRecExpr(StartQ, StepQ, Numerator->getLoop(),
+        Numerator->getNoWrapFlags());
+    Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(),
+        Numerator->getNoWrapFlags());
grosser: Alignment?
   }

   void visitAddExpr(const SCEVAddExpr *Numerator) {
     SmallVector<const SCEV *, 2> Qs, Rs;
-    Type *Ty = Denominator->getType();
+    Type *Ty = Numerator->getType();

     for (const SCEV *Op : Numerator->operands()) {
       const SCEV *Q, *R;
Context not available.
       // Bail out if types do not match.
       if (Ty != Q->getType() || Ty != R->getType())
         return cannotDivide(Numerator);
+      assert(Ty == Q->getType() && Ty == R->getType() &&
+             "Expected matching types");
grosser: Can this assert ever be reached? It seems you bail out right before the position at which the assert is inserted.

       Qs.push_back(Q);
       Rs.push_back(R);
Context not available.
     // The Remainder is obtained by replacing Denominator by 0 in Numerator.
     ValueToValueMap RewriteMap;
     RewriteMap[cast<SCEVUnknown>(Denominator)->getValue()] =
-        cast<SCEVConstant>(Zero)->getValue();
+        cast<SCEVConstant>(SE.getZero(Denominator->getType()))->getValue();
     Remainder = SCEVParameterRewriter::rewrite(Numerator, SE, RewriteMap, true);

     if (Remainder->isZero()) {
       // The Quotient is obtained by replacing Denominator by 1 in Numerator.
       RewriteMap[cast<SCEVUnknown>(Denominator)->getValue()] =
-          cast<SCEVConstant>(One)->getValue();
+          cast<SCEVConstant>(SE.getOne(Denominator->getType()))->getValue();
       Quotient =
           SCEVParameterRewriter::rewrite(Numerator, SE, RewriteMap, true);
       return;
Context not available.
   SCEVDivision(ScalarEvolution &S, const SCEV *Numerator,
                const SCEV *Denominator)
       : SE(S), Denominator(Denominator) {
-    Zero = SE.getZero(Denominator->getType());
-    One = SE.getOne(Denominator->getType());
+    Zero = SE.getZero(Numerator->getType());
+    One = SE.getOne(Numerator->getType());

     // We generally do not know how to divide Expr by Denominator. We
     // initialize the division to a "cannot divide" state to simplify the rest
Context not available.

test/Analysis/Delinearization/constant_sext_trunc.ll

; RUN: opt -delinearize -analyze < %s | FileCheck %s

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

; CHECK: Inst: store i64 -281474976710602, i64* %19
; CHECK-NEXT: In Loop with Header: Block #6
; CHECK-NEXT: AccessFunction: (8 * (sext i32 {{[{][{]}}0,+,(trunc i64 %4 to i32)}<%"Block #3">,+,1}<%"Block #6"> to i64))
; CHECK-NEXT: Base offset: %18
; CHECK-NEXT: ArrayDecl[UnknownSize][%4] with elements of 8 bytes.
; CHECK-NEXT: ArrayRef[(sext i32 {0,+,1}<%"Block #3"> to i64)][{0,+,1}<nuw><nsw><%"Block #6">]

; CHECK: Inst: store i64 -281474976710602, i64* %19
; CHECK-NEXT: In Loop with Header: Block #3
; CHECK-NEXT: AccessFunction: (8 * (sext i32 {(-1 + (trunc i64 %4 to i32)),+,(trunc i64 %4 to i32)}<%"Block #3"> to i64))
; CHECK-NEXT: Base offset: %18
; CHECK-NEXT: ArrayDecl[UnknownSize][%4] with elements of 8 bytes.
; CHECK-NEXT: ArrayRef[(sext i32 {1,+,1}<%"Block #3"> to i64)][-1]

define i64 @jsBody_test() #0 {
Prologue:
  %0 = call i8* @llvm.frameaddress(i32 0)
  %1 = ptrtoint i8* %0 to i64
  %2 = add i64 %1, 56
  %3 = inttoptr i64 %2 to i64*
  %4 = load i64, i64* %3
  %5 = add i64 %1, 48
  %6 = inttoptr i64 %5 to i64*
  %7 = load i64, i64* %6
  %8 = trunc i64 %4 to i32
  %9 = icmp slt i32 0, %8
  %10 = add i64 %7, 8
  %11 = inttoptr i64 %10 to i64*
  %12 = load i64, i64* %11
  %13 = add i64 %12, 0
  br label %"Block #3"

"Block #1": ; preds = %"Block #9"
  br label %"Block #3"

"Block #3": ; preds = %"Block #1", %Prologue
  %.01 = phi i32 [ 0, %Prologue ], [ %22, %"Block #1" ]
  br i1 %9, label %"Block #4", label %"Block #9"

"Block #4": ; preds = %"Block #3"
  %14 = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 %8, i32 %.01)
  %15 = extractvalue { i32, i1 } %14, 0
  br label %"Block #6"

"Block #5": ; preds = %"Block #6"
  br label %"Block #6"

"Block #6": ; preds = %"Block #5", %"Block #4"
  %.0 = phi i32 [ 0, %"Block #4" ], [ %20, %"Block #5" ]
  %16 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %15, i32 %.0)
  %17 = extractvalue { i32, i1 } %16, 0
  %18 = inttoptr i64 %13 to [10000000 x i64]*
  %19 = getelementptr [10000000 x i64], [10000000 x i64]* %18, i32 0, i32 %17
  store i64 -281474976710602, i64* %19
  %20 = add i32 %.0, 1
  %21 = icmp slt i32 %20, %8
  br i1 %21, label %"Block #5", label %"Block #9"

"Block #9": ; preds = %"Block #6", %"Block #3"
  %22 = add i32 %.01, 1
  %23 = icmp slt i32 %22, 1000
  br i1 %23, label %"Block #1", label %"Block #10"

"Block #10": ; preds = %"Block #9"
  ret i64 10
}

; Function Attrs: nounwind readnone
declare i8* @llvm.frameaddress(i32) #1

; Function Attrs: nounwind readnone speculatable
declare { i32, i1 } @llvm.smul.with.overflow.i32(i32, i32) #2

; Function Attrs: nounwind readnone speculatable
declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32) #2

attributes #0 = { "target-features"="-avx" }
attributes #1 = { nounwind readnone }
attributes #2 = { nounwind readnone speculatable }