This is an archive of the discontinued LLVM Phabricator instance.


Differential D38546

[ConstantFolding] Avoid assert when folding ptrtoint of vectorized GEP
ClosedPublic

Authored by bjope on Oct 4 2017, 8:54 AM.

Details

Reviewers

craig.topper
majnemer
davide
filcab
efriedma

Commits

rG1c043a9f2862: [ConstantFolding] Avoid assert when folding ptrtoint of vectorized GEP
rL316430: [ConstantFolding] Avoid assert when folding ptrtoint of vectorized GEP

Summary

Got asserts in llvm::CastInst::getCastOpcode saying:
`DestBits == SrcBits && "Illegal cast to vector (wrong type or size)"' failed.

Problem seemed to be that llvm::ConstantFoldCastInstruction did
not handle ptrtoint cast of a getelementptr returning a vector
correctly. I assume such situations are quite rare, since the
GEP needs to be considered as a constant value (base pointer
being null).
The solution used here is to simply avoid the constant fold
of ptrtoint when the value is a vector. It is not supported,
and by bailing out we do not fail on assertions later on.

Change-Id: Ic8fb38e2014cdf2cf4027b10b9535c32fe044a33
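
The bail-out described in the summary can be sketched outside of LLVM as follows. This is only a minimal illustration under stated assumptions, not the actual patch: `ToyType`, `ToyNullBasedGEP`, `foldScalarOffset` and `tryFoldPtrToInt` are hypothetical names that do not exist in LLVM, and the scalar-only helper merely stands in for folding routines that assert when handed a vector type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a destination type: an integer of `Bits` bits,
// either scalar (NumElts == 1) or a vector (NumElts > 1). Not LLVM API.
struct ToyType {
  unsigned Bits;
  unsigned NumElts;
  bool isVector() const { return NumElts > 1; }
};

// Hypothetical stand-in for a constant GEP whose base pointer is null.
struct ToyNullBasedGEP {
  std::uint64_t ElementSize; // size in bytes of the indexed element type
  std::uint64_t Index;       // constant index applied to the null base
};

// Models a folding routine written with scalar destinations in mind: it
// asserts if it is ever called with a vector destination type.
std::uint64_t foldScalarOffset(const ToyNullBasedGEP &GEP,
                               const ToyType &DestTy) {
  assert(!DestTy.isVector() && "scalar-only helper called with a vector type");
  std::uint64_t Bytes = GEP.ElementSize * GEP.Index;
  if (DestTy.Bits >= 64)
    return Bytes;
  // Truncate to the destination width.
  return Bytes & ((std::uint64_t{1} << DestTy.Bits) - 1);
}

// Returns false for "can't fold". The early return for vectors mirrors the
// idea of the patch: skip the unsupported case instead of asserting later.
bool tryFoldPtrToInt(const ToyNullBasedGEP &GEP, const ToyType &DestTy,
                     std::uint64_t &Result) {
  if (DestTy.isVector())
    return false;
  Result = foldScalarOffset(GEP, DestTy);
  return true;
}
```

In this sketch a scalar destination is folded to a byte offset, while a vector destination leaves `Result` untouched and reports that no fold happened.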

Diff Detail

Build Status

Buildable 11364
Build 11364: arc lint + arc unit

Event Timeline

bjope created this revision. Oct 4 2017, 8:54 AM

bjope added inline comments. Oct 4 2017, 9:01 AM

lib/IR/ConstantFold.cpp
623	Should this perhaps go below the V->isNullValue() check? (I could not see that isNullValue() will be true for a vector of null pointers, and I am not sure that the ConstantInt::get(DestTy, 0) would be correct if DestTy is a vector. So I did put the check here before the isNullValue() check.)

bjope added a reviewer: davide. Oct 9 2017, 6:27 AM

Ping!

Hi @bjope,

Thanks for the patch, but it seems to be trying to paper over another problem instead of fixing it.
I've commented on the specific line where I have concerns.

Thank you,

Filipe

test/Analysis/ConstantFolding/cast-vector.ll
11	This is invalid IR (per langref). The `ptrtoint` constant expression only takes a pointer type argument:

ptrtoint (CST to TYPE)
Convert a pointer typed constant to the corresponding integer constant. TYPE must be an integer type. CST must be of pointer type. The CST value is zero extended, truncated, or unchanged to make it fit in TYPE.

It seems to me that one of the following is the best fix:

- Fix whatever is creating a `ptrtoint` constant expression with a vector argument (make sure to also add an assert to make sure it doesn't happen again)
- Make `ptrtoint` of vectors a valid constant expression, which involves changing the langref and making sure these kinds of folds are valid.

This revision now requires changes to proceed. Oct 16 2017, 11:44 AM

bjope added inline comments. Oct 16 2017, 1:59 PM

test/Analysis/ConstantFolding/cast-vector.ll
11	@filcab: Are you sure? AFAIK this is valid according to verifiers. And it seems to work. The examples for both ptrtoint and inttoptr in the "Instruction Reference" part of the langref include examples with vectors. For ptrtoint, the version of the langref that I've been looking at says:

The ‘ptrtoint‘ instruction takes a value to cast, which must be a value of type pointer or a vector of pointers, and a type to cast it to ty2, which must be an integer or a vector of integers type.

I assume that you quoted the description from the "Constant Expressions" section of the langref. Are you saying that there is a limitation for constant expressions, and that the Instruction Reference part of the langref is only valid for non-constant expressions?

Besides, you say that I should "fix whatever is creating a ptrtoint constant expression with a vector argument". My patch is about bailing out from ConstantFoldCastInstruction without doing any folding, basically leaving the ptrtoint as it was in the input (which follows the example from the Instruction Reference part of the langref; as the arguments aren't constant folded, I guess it can't be seen as a constant expression).

filcab added inline comments. Oct 17 2017, 10:53 AM

test/Analysis/ConstantFolding/cast-vector.ll
11	The `ptrtoint` instruction in the input for your test is ok per langref. The `ptrtoint` constant expression on the output you expect is not. If you look at your `CHECK` line, you're expecting an output like:

`ret <2 x i16> ptrtoint (<2 x i32*> getelementptr ([10 x i32], [10 x i32]* null, <2 x i64> zeroinitializer, <2 x i64> <i64 5, i64 7>) to <2 x i16>)`

This is a `ret` of a `ptrtoint` constant expression which takes a `getelementptr` constant expression. On the input, you have instructions, which are different.

`ConstantFoldCastInstruction` is currently being called on malformed IR which includes that `ptrtoint` constant expression. The IR is already bad (according to langref), so it's not useful to have `ConstantFoldCastInstruction` instead of stopping whatever emitted that IR. (That's just one of the possible fixes)

bjope added inline comments. Oct 17 2017, 2:53 PM

test/Analysis/ConstantFolding/cast-vector.ll
11	So you are saying that constant expressions have special limitations, and that the Instruction Reference part of the langref is only valid for non-constant expressions? (Note that vector types are `first class` types, and the Constant Expressions section of the langref does say "Constant expressions may be of any first class type".)

Is the following also malformed IR then?

    %bar = add <2 x i16> %foo, trunc (<2 x i32> <i32 78, i32 99> to <2 x i16>)

I think it is weird that using vector types in constant expressions with cast operations works just fine if it isn't allowed. Is it perhaps just the langref that is a little bit vague regarding this?

Nevertheless, I do not think that the IR has been "malformed" yet when I hit the assert. My fix is just about bailing out when analysing a `ptrtoint` expression, very much in the same way as is already done for `trunc` inside `ConstantFoldCastInstruction`.

Maybe it is a pity that the constant GEP expression isn't constant evaluated, reducing everything into a ret with a <2 x i16> with two literals. But that is kind of a different problem, and I'm not sure how common it is to have GEPs that are based on null pointers (only way to get a GEP that is a constant expression?), so maybe adding logic for that isn't worth the trouble.

As a comparison, if I change the test case to this (equivalent code but without the vector gep):

    %gep1 = getelementptr inbounds [10 x i32], [10 x i32]* null, i16 0, i16 5
    %gep2 = getelementptr inbounds [10 x i32], [10 x i32]* null, i16 0, i16 7
    %gepvec1 = insertelement <2 x i32*> undef, i32* %gep1, i32 0
    %gepvec2 = insertelement <2 x i32*> %gepvec1, i32* %gep2, i32 1
    %vec = ptrtoint <2 x i32*> %gepvec2 to <2 x i16>
    ret <2 x i16> %vec

then instsimplify will reduce it to

    ret <2 x i16> <i16 ptrtoint (i32* inttoptr (i64 20 to i32*) to i16), i16 ptrtoint (i32* inttoptr (i64 28 to i32*) to i16)>

while instcombine will reduce it to

    ret <2 x i16> <i16 20, i16 28>

So someone has bothered about evaluating constant GEPs that aren't vectors. But it also shows that instsimplify doesn't reduce the ptrtoint and inttoptr fully, not even for the scalar case.

The IR produced in my test case:

    ret <2 x i16> ptrtoint (<2 x i32*> getelementptr ([10 x i32], [10 x i32]* null, <2 x i64> zeroinitializer, <2 x i64> <i64 5, i64 7>) to <2 x i16>)

is accepted both by the verify pass and by llc. So instead of an assert, my original test case (C code) compiles into valid assembler. And I get the same assembler result after llc for all these variants of ret.
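
The literals 20 and 28 in the instcombine result above are just the byte offsets of elements 5 and 7 in an array of 4-byte `i32` values based at address zero, truncated to 16 bits. A minimal standalone sketch of that arithmetic, applied element by element as a fully folded vector result would need, could look like this. The helper names `offsetFromNull` and `offsetsFromNull` are hypothetical and not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not LLVM API: byte offset of element `Index` in an
// array of `ElementSize`-byte elements based at address 0, truncated to 16
// bits (modelling a ptrtoint to i16).
std::uint16_t offsetFromNull(std::uint64_t ElementSize, std::uint64_t Index) {
  return static_cast<std::uint16_t>(ElementSize * Index);
}

// Applies the scalar computation to every lane of a vector of indices.
std::vector<std::uint16_t>
offsetsFromNull(std::uint64_t ElementSize,
                const std::vector<std::uint64_t> &Indices) {
  std::vector<std::uint16_t> Result;
  Result.reserve(Indices.size());
  for (std::uint64_t Index : Indices)
    Result.push_back(offsetFromNull(ElementSize, Index));
  // One result lane per input lane.
  assert(Result.size() == Indices.size());
  return Result;
}
```

Calling `offsetsFromNull(4, {5, 7})` yields the two lanes 20 and 28, since 4 * 5 = 20 and 4 * 7 = 28.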

The constant expression ptrtoint is supposed to have the same rules as the instruction ptrtoint. Looks like we just forgot to update LangRef when pointer vectors were added.

In D38546#900133, @efriedma wrote:

The constant expression ptrtoint is supposed to have the same rules as the instruction ptrtoint. Looks like we just forgot to update LangRef when pointer vectors were added.

Then the test case is valid.
And this patch avoids the assert by not executing the piece of code that does not support a getelementptr that returns a vector of pointers. So it is a simple bugfix for an existing problem.
So can I land this now?

bjope requested review of this revision. Oct 18 2017, 11:43 PM

bjope edited edge metadata.

Needs a FIXME or something explaining why exactly you're returning early; in theory, we should be able to fold vector ptrtoint operations the same way we fold integer ptrtoint operations, given the right logic.

Also, while we're discussing LangRef, would you mind updating it to clarify the rules here?

Add a FIXME as requested by Eli.

I've now added a FIXME describing the reason for not handling vectors at the
moment. And I moved the check closer to the code that needs to be guarded.

Also updated the test case with a second function, in order to show that the
problem exists for both the "offset-of" and "sizeof" kind of expressions.

Harbormaster completed remote builds in B11364: Diff 119784. Oct 22 2017, 3:20 AM

In D38546#902378, @efriedma wrote:

Needs a FIXME or something explaining why exactly you're returning early; in theory, we should be able to fold vector ptrtoint operations the same way we fold integer ptrtoint operations, given the right logic.

Ok! I've updated the patch.

Also, while we're discussing LangRef, would you mind updating it to clarify the rules here?

I've proposed a LangRef update in a separate patch: https://reviews.llvm.org/D39165

LGTM with fixed whitespace.

lib/IR/ConstantFold.cpp
640	Looks like the whitespace got messed up here somehow?

bjope added inline comments. Oct 24 2017, 12:57 AM

lib/IR/ConstantFold.cpp
640	Yes, thanks! I just found out that my emacs config ended up inserting tabs instead of spaces depending on how my local git repo was named - I did not know that - quite unexpected. I'll correct the indents before I push this.

Closed by commit rL316430: [ConstantFolding] Avoid assert when folding ptrtoint of vectorized GEP (authored by bjope). Oct 24 2017, 5:08 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
lib/IR/ConstantFold.cpp	9 lines
test/Analysis/ConstantFolding/cast-vector.ll	32 lines

Diff 119784

lib/IR/ConstantFold.cpp

(614 preceding lines not shown; enclosing context: if (ConstantFP *FPC = dyn_cast<ConstantFP>(V)) {)

       return ConstantInt::get(FPC->getContext(), IntVal);
     }
     return nullptr; // Can't fold.
   case Instruction::IntToPtr: //always treated as unsigned
     if (V->isNullValue()) // Is it an integral null value?
       return ConstantPointerNull::get(cast<PointerType>(DestTy));
     return nullptr; // Other pointer types cannot be casted
   case Instruction::PtrToInt: // always treated as unsigned
     // Is it a null pointer value?
     if (V->isNullValue())
       return ConstantInt::get(DestTy, 0);
     // If this is a sizeof-like expression, pull out multiplications by
     // known factors to expose them to subsequent folding. If it's an
     // alignof-like expression, factor out known factors.
     if (ConstantExpr *CE = dyn_cast<ConstantExpr>(V))
       if (CE->getOpcode() == Instruction::GetElementPtr &&
           CE->getOperand(0)->isNullValue()) {
+        // FIXME: Looks like getFoldedSizeOf(), getFoldedOffsetOf() and
+        // getFoldedAlignOf() don't handle the case when DestTy is a vector of
+        // pointers yet. We end up in asserts in CastInst::getCastOpcode (see
+        // test/Analysis/ConstantFolding/cast-vector.ll). I've only seen this
+        // happen in one "real" C-code test case, so it does not seem to be an
+        // important optimization to handle vectors here. For now, simply bail
+        // out.
+        if (DestTy->isVectorTy())
+          return nullptr;
+
         GEPOperator *GEPO = cast<GEPOperator>(CE);
         Type *Ty = GEPO->getSourceElementType();
         if (CE->getNumOperands() == 2) {
           // Handle a sizeof-like expression.
           Constant *Idx = CE->getOperand(1);
           bool isOne = isa<ConstantInt>(Idx) && cast<ConstantInt>(Idx)->isOne();
           if (Constant *C = getFoldedSizeOf(Ty, DestTy, !isOne)) {
             Idx = ConstantExpr::getCast(CastInst::getCastOpcode(Idx, true,

(1,651 following lines not shown)

test/Analysis/ConstantFolding/cast-vector.ll

This file was added.

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -instsimplify -S | FileCheck %s

; Test constant fold of constant expression GEP used by ptrtoint (the
; "offsetof-like expression" case).
; This used to hit an assert due to not supporting vectors in
; llvm::ConstantFoldCastInstruction when handling ptrtoint.
define <2 x i16> @test1() {
; CHECK-LABEL: @test1(
; CHECK-NEXT:  entry:
; CHECK-NEXT:    ret <2 x i16> ptrtoint (<2 x i32*> getelementptr ([10 x i32], [10 x i32]* null, <2 x i64> zeroinitializer, <2 x i64> <i64 5, i64 7>) to <2 x i16>)
;
entry:
  %gep = getelementptr inbounds [10 x i32], [10 x i32]* null, i16 0, <2 x i16> <i16 5, i16 7>
  %vec = ptrtoint <2 x i32*> %gep to <2 x i16>
  ret <2 x i16> %vec
}

; Test constant fold of constant expression GEP used by ptrtoint (the
; "sizeof-like expression" case).
; This used to hit an assert due to not supporting vectors in
; llvm::ConstantFoldCastInstruction when handling ptrtoint.
define <2 x i16> @test2() {
; CHECK-LABEL: @test2(
; CHECK-NEXT:  entry:
; CHECK-NEXT:    ret <2 x i16> ptrtoint (<2 x i32*> getelementptr (i32, i32* null, <2 x i64> <i64 5, i64 7>) to <2 x i16>)
;
entry:
  %gep = getelementptr i32, i32* null, <2 x i16> <i16 5, i16 7>
  %vec = ptrtoint <2 x i32*> %gep to <2 x i16>
  ret <2 x i16> %vec
}