This is an archive of the discontinued LLVM Phabricator instance.

MergeFuncs should handle dynamic GEPs
Needs Review, Public

Authored by pete on Jan 25 2015, 11:19 PM.


Details

Reviewers

Summary

When compiling std::vector<float> and std::vector<int>, both get a function called push_back_slow_path. This function could be merged except that one does float[i] and the other int[i]. These are identical given that int and float have the same size and alignment.

This patch teaches MergeFunctions about dynamic GEPs, which must return the same offset given 'index * element size' for both LHS and RHS.
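To make the 'index * element size' condition concrete, here is a minimal C++ sketch. It is not code from the patch: `elementOffset` and `sameStride` are hypothetical helpers, and `sizeof` stands in for the alloc size that the pass would obtain from DataLayout.

```cpp
#include <cstdint>

// Byte offset of element i in an array of T: the 'index * element size'
// product the summary refers to. (Hypothetical helper, not from the patch.)
template <typename T>
constexpr std::uint64_t elementOffset(std::uint64_t i) {
  return i * sizeof(T);
}

// Two element types give the same offset for every dynamic index exactly
// when their strides (element sizes) match.
template <typename L, typename R>
constexpr bool sameStride() {
  return sizeof(L) == sizeof(R);
}
```

On a target where float and a 32-bit int are both 4 bytes, `elementOffset<float>(i)` and `elementOffset<std::int32_t>(i)` agree for every `i`, which is why the two push_back_slow_path bodies compute the same addresses.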

This reduces the size of an LTO'd llc by almost 1.5%.

Diff Detail

Event Timeline

Added Stepan as a reviewer.

Hi Pete,
Thanks for the patch. The idea looks good, but I'm still in the middle of the review.

lib/Transforms/IPO/MergeFunctions.cpp
895	"fielf" => "field"
900	Maybe "cast<ConstantInt>"?
926	I stopped here :-)

pete added inline comments. Jan 28 2015, 1:26 PM

lib/Transforms/IPO/MergeFunctions.cpp
895	Missed that. Thanks.
900	I thought that too at first, but then I discovered the magic of vector GEPs. About to upload a patch which does this the right way, i.e., we handle the scalar and vector cases.

Updated to actually handle vector GEPs. Added positive and negative test cases for vector GEP merging.

Hello Pete,
I've looked at this patch. You add support for dynamic GEPs. I was on a non-LLVM project for a while, and dynamic GEPs are quite a new thing to me, at least where you can use arrays as indices. It is also a bit strange that we can use only splat arrays. Anyway, I suppose we should get a vector of pointers in this case, right? Can you provide me with links where I can read more about this?

I've left a few inline comments.
Also, I've launched the test-suite for it.

test/Transforms/MergeFunc/gep.ll
63	Dynamic GEP is quite a new thing. Do we have specs for dynamic GEPs somewhere already? In particular, a rule definition for splat values?
test/Transforms/MergeFunc/gep2.ll
54	What would happen if I create the same method, but change this line to:
	%offset_i = getelementptr %vector_char4_ptr %bc_ptr, <4 x i32><i32 0, i32 0, i32 0, i32 0>, <4 x i32><i32 1, i32 1, i32 1, i32 1>, <4 x i64> %i, <4 x i32><i32 2, i32 2, i32 2, i32 0>
It looks like your patch is going to merge such functions. Is that right?

Hi Stepan

From LangRef (http://llvm.org/docs/LangRef.html#getelementptr-instruction), the interesting part for structures is that the GEP indices must be the same value. If using a vector, it must be a splat of a constant. I guess using a vector of the same constant is redundant (compared to a scalar) except that it keeps all the GEP indices matching in terms of all being vectors or all being scalar. The part you want here is:

"When indexing into a (optionally packed) structure, only i32 integer constants are allowed (when using a vector of indices they must all be the same i32 integer constant)"

For arrays or pointers, it says:

"When indexing into an array, pointer or vector, integers of any width are allowed, and they are not required to be constant"

The main new piece of functionality in MergeFuncs here is that for dynamic GEPs, it should be able to work out that 2 GEPs are equivalent even when the index is dynamic. This is true when the underlying type has the same offset from a[i] to a[i + 1]. E.g., indexing a float[] vs. an int[] will give you the same offset for the same index 'i'. If any of this is unclear in the tests or patch comments, please let me know and I'll be happy to improve them.
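The equivalence rule for a single array index can be sketched as a small model. This is an illustration under simplifying assumptions, not the patch's code: `Index` and `sameStep` are hypothetical names, element sizes are passed in directly rather than queried from DataLayout, and a dynamic index is represented by an opaque id so that "same index value" becomes "same id".

```cpp
#include <cstdint>

// Hypothetical stand-in for one GEP index into an array.
struct Index {
  bool isConstant;      // true: `value` is the literal index
  std::uint64_t value;  // literal index, or an opaque id naming the dynamic value
};

// True when both steps move the pointer by the same number of bytes
// for every possible runtime value of a dynamic index.
constexpr bool sameStep(std::uint64_t sizeL, Index l,
                        std::uint64_t sizeR, Index r) {
  if (l.isConstant != r.isConstant)
    return false;  // constant vs. dynamic never match in this model
  if (l.isConstant)
    return sizeL * l.value == sizeR * r.value;  // e.g. c[2] vs. s[1]
  // Dynamic: same index value and same stride.
  return l.value == r.value && sizeL == sizeR;
}
```

For example, `sameStep(1, Index{true, 2}, 2, Index{true, 1})` holds because a char at index 2 and a short at index 1 are both 2 bytes in, while two dynamic indices only match when the element sizes are equal.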

test/Transforms/MergeFunc/gep.ll
63	I think I answered this in the main comment at the end. Let me know if anything else needs clarification.
test/Transforms/MergeFunc/gep2.ll
54	Ah, you're totally right. If you try to merge 2 GEPs, both with non-splat vectors, then both will give nullptr on these lines:
	ConstIntL = dyn_cast_or_null<ConstantInt>(ConstL->getSplatValue());
	ConstIntR = dyn_cast_or_null<ConstantInt>(ConstR->getSplatValue());
and then we'll get past this test, and crash:
	if (int Res = cmpNumbers(!!ConstIntL, !!ConstIntR))
	return Res;
Nice catch! I'll update the code and tests to handle this.
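The pitfall described here is that a presence comparison returns 0 both when two values are present and when two values are absent. A minimal sketch of a guarded version follows. The `Splat` struct and `cmpSplats` function are hypothetical stand-ins for the nullable result of getSplatValue(), and the `fallback` parameter is an assumption standing in for whatever ordering the caller uses when neither side is a splat.

```cpp
#include <cstdint>

// Three-way comparison in the style of the cmpNumbers helper named above.
constexpr int cmpNumbers(std::uint64_t l, std::uint64_t r) {
  return l < r ? -1 : (l > r ? 1 : 0);
}

// Hypothetical stand-in for a possibly-null splat constant.
struct Splat {
  bool present;         // false models a nullptr from getSplatValue()
  std::uint64_t value;  // only meaningful when present
};

// Decide on presence first, then stop when both sides are absent
// instead of reading a value that is not there.
constexpr int cmpSplats(Splat l, Splat r, int fallback) {
  if (int res = cmpNumbers(l.present, r.present))
    return res;       // exactly one side is a splat
  if (!l.present)
    return fallback;  // neither is a splat: the presence check alone gave 0
  return cmpNumbers(l.value, r.value);
}
```

Without the middle check, two non-splat operands would fall through to the value comparison, which corresponds to the cast on a null pointer in the reviewed code.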

Hi Pete, I'll go on vacation from next week and will be offline until the 10th of March. If you have some diffs to show, please don't hesitate for too long ;-)

Ah, sorry about that. I forgot you were waiting on me. I was about to ping this :)

Updated code to handle non-splat vector constants.

I actually reordered a bunch of stuff here so that we detect early if the GEPs have identical types and operands, and can just return on that case. If that fails, we check for a DataLayout, and if we have one we do the offset checking for dynamic offsets. If we fail to find a splat constant, then at that point we know the GEPs weren't identical so we just use their types to order them which is what would have happened in the code prior to this patch.
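The staged flow described above can be pictured with a toy model. This is only a sketch under stated assumptions, not the updated patch: `ToyGep`, `Stage`, `Outcome` and `compareGeps` are hypothetical names, each GEP is reduced to one dynamic index, and the splat-constant fallback is left out, so type ordering is reached here only when no DataLayout is available.

```cpp
#include <cstdint>

enum class Stage { Identical, Offsets, TypeOrder };

// Toy GEP: one pointer type, one dynamic index, one element size.
struct ToyGep {
  std::uint64_t typeId;    // stands in for the pointer operand type
  std::uint64_t index;     // opaque id of the dynamic index value
  std::uint64_t elemSize;  // alloc size of the indexed element
};

struct Outcome {
  Stage decidedBy;
  int result;  // 0 means "equivalent", otherwise an ordering
};

constexpr int cmp(std::uint64_t l, std::uint64_t r) {
  return l < r ? -1 : (l > r ? 1 : 0);
}

constexpr Outcome compareGeps(ToyGep l, ToyGep r, bool haveDataLayout) {
  // 1. Early exit: identical types and operands.
  if (l.typeId == r.typeId && l.index == r.index)
    return Outcome{Stage::Identical, 0};
  // 2. With a DataLayout, compare what the indices do in bytes.
  if (haveDataLayout) {
    if (int res = cmp(l.index, r.index))
      return Outcome{Stage::Offsets, res};
    return Outcome{Stage::Offsets, cmp(l.elemSize, r.elemSize)};
  }
  // 3. Otherwise the GEPs are known to differ; order them by type.
  if (int res = cmp(l.typeId, r.typeId))
    return Outcome{Stage::TypeOrder, res};
  return Outcome{Stage::TypeOrder, cmp(l.index, r.index)};
}
```

In this model, two GEPs with different pointer types but the same index and element size compare equal only when a DataLayout is present, which matches the intent that the float and int cases merge while the no-DataLayout path keeps its previous type-based ordering.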

I added a new test case in gep2.ll which checks that 2 vector GEPs with different non-splat constants don't merge. Before I made this fix that test would fire the assert you spotted.

Revision Contents

Path	Size
lib/Transforms/IPO/MergeFunctions.cpp	114 lines
test/Transforms/MergeFunc/gep.ll	79 lines
test/Transforms/MergeFunc/gep2.ll	76 lines

Diff 18918

lib/Transforms/IPO/MergeFunctions.cpp

...
int FunctionComparator::cmpConstants(const Constant *L, const Constant *R) {
...
}		}
}		}

/// cmpType - compares two types,		/// cmpType - compares two types,
/// defines total ordering among the types set.		/// defines total ordering among the types set.
/// See method declaration comments for more details.		/// See method declaration comments for more details.
int FunctionComparator::cmpTypes(Type *TyL, Type *TyR) const {		int FunctionComparator::cmpTypes(Type *TyL, Type *TyR) const {

PointerType *PTyL = dyn_cast<PointerType>(TyL);		PointerType *PTyL = dyn_cast<PointerType>(TyL->getScalarType());
PointerType *PTyR = dyn_cast<PointerType>(TyR);		PointerType *PTyR = dyn_cast<PointerType>(TyR->getScalarType());

if (DL) {		if (DL) {
if (PTyL && PTyL->getAddressSpace() == 0) TyL = DL->getIntPtrType(TyL);		if (PTyL && PTyL->getAddressSpace() == 0) TyL = DL->getIntPtrType(TyL);
if (PTyR && PTyR->getAddressSpace() == 0) TyR = DL->getIntPtrType(TyR);		if (PTyR && PTyR->getAddressSpace() == 0) TyR = DL->getIntPtrType(TyR);
}		}

if (TyL == TyR)		if (TyL == TyR)
return 0;		return 0;
...
return cmpNumbers(RMWI->getSynchScope(),
cast<AtomicRMWInst>(R)->getSynchScope());		cast<AtomicRMWInst>(R)->getSynchScope());
}		}
return 0;		return 0;
}		}

// Determine whether two GEP operations perform the same underlying arithmetic.		// Determine whether two GEP operations perform the same underlying arithmetic.
// Read method declaration comments for more details.		// Read method declaration comments for more details.
int FunctionComparator::cmpGEPs(const GEPOperator *GEPL,		int FunctionComparator::cmpGEPs(const GEPOperator *GEPL,
const GEPOperator *GEPR) {		const GEPOperator *GEPR) {

unsigned int ASL = GEPL->getPointerAddressSpace();		unsigned int ASL = GEPL->getPointerAddressSpace();
unsigned int ASR = GEPR->getPointerAddressSpace();		unsigned int ASR = GEPR->getPointerAddressSpace();

if (int Res = cmpNumbers(ASL, ASR))		if (int Res = cmpNumbers(ASL, ASR))
return Res;		return Res;

// When we have target data, we can reduce the GEP down to the value in bytes		// When we have target data, we can reduce the GEP down to the value in bytes
// added to the address.		// added to the address.
if (DL) {		if (DL) {
unsigned BitWidth = DL->getPointerSizeInBits(ASL);		unsigned BitWidth = DL->getPointerSizeInBits(ASL);
APInt OffsetL(BitWidth, 0), OffsetR(BitWidth, 0);		APInt OffsetL(BitWidth, 0), OffsetR(BitWidth, 0);
if (GEPL->accumulateConstantOffset(*DL, OffsetL) &&		if (GEPL->accumulateConstantOffset(*DL, OffsetL) &&
GEPR->accumulateConstantOffset(*DL, OffsetR))		GEPR->accumulateConstantOffset(*DL, OffsetR))
return cmpAPInts(OffsetL, OffsetR);		return cmpAPInts(OffsetL, OffsetR);
}		}

		if (int Res = cmpNumbers(GEPL->getNumOperands(), GEPR->getNumOperands()))
		return Res;

		// If we have datalayout available, we can look at the actual offsets which
		// would be generated for dynamic indices. For example,
		// float a[]; int b[]; a[i] and b[i] will both generate the same offsets from
		// their bases.
		if (!DL) {
if (int Res = cmpNumbers((uint64_t)GEPL->getPointerOperand()->getType(),		if (int Res = cmpNumbers((uint64_t)GEPL->getPointerOperand()->getType(),
(uint64_t)GEPR->getPointerOperand()->getType()))		(uint64_t)GEPR->getPointerOperand()->getType()))
return Res;		return Res;
		} else {
		// Note that the pointer operand may be a vector of pointers. Take the
		// scalar element which holds a pointer.
		Type *EltTyL = GEPL->getPointerOperandType()->getScalarType();
		Type *EltTyR = GEPR->getPointerOperandType()->getScalarType();
		for (unsigned i = 0, e = GEPL->getNumOperands(); i != e; ++i) {
		Value *OpL = GEPL->getOperand(i);
		Value *OpR = GEPR->getOperand(i);

if (int Res = cmpNumbers(GEPL->getNumOperands(), GEPR->getNumOperands()))		// The bases only need to be identical operands, we don't care about
		// their offsets.
		if (!i) {
		if (int Res = cmpValues(OpL, OpR))
return Res;		return Res;
		continue;
		}

for (unsigned i = 0, e = GEPL->getNumOperands(); i != e; ++i) {		StructType *StTyL = dyn_cast<StructType>(EltTyL);
if (int Res = cmpValues(GEPL->getOperand(i), GEPR->getOperand(i)))		StructType *StTyR = dyn_cast<StructType>(EltTyR);
		// TODO: Handle case where only one side is a struct, so long as both
		// sides have a constant at this index, and the field offsets are a match.
		if (int Res = cmpNumbers(!!StTyL, !!StTyR))
		return Res;

		if (StTyL) {
		unsigned IntL = cast<Constant>(OpL)->getUniqueInteger().getZExtValue();
		unsigned IntR = cast<Constant>(OpR)->getUniqueInteger().getZExtValue();
		if (int Res = cmpNumbers(IntL, IntR))
		return Res;

		// Make sure that both left and right get the same offset here.
		// We don't actually care what fields we skip over, so long as we get
		// the same offset.
		uint64_t OffsetL = DL->getStructLayout(StTyL)->getElementOffset(IntL);
		uint64_t OffsetR = DL->getStructLayout(StTyR)->getElementOffset(IntR);
		if (int Res = cmpNumbers(OffsetL, OffsetR))
return Res;		return Res;
		EltTyL = StTyL->getElementType(IntL);
		EltTyR = StTyR->getElementType(IntR);
		continue;
		}

		EltTyL = cast<SequentialType>(EltTyL)->getElementType();
		EltTyR = cast<SequentialType>(EltTyR)->getElementType();

		// If this is a constant subscript, handle it quickly.
		const Constant *ConstL = dyn_cast<Constant>(OpL);
		const Constant *ConstR = dyn_cast<Constant>(OpR);

		if (int Res = cmpNumbers(!!ConstL, !!ConstR))
		return Res;

		if (ConstL) {
		bool ZeroL = ConstL->isNullValue();
		bool ZeroR = ConstR->isNullValue();
		if (int Res = cmpNumbers(ZeroL, ZeroR))
		return Res;
		if (ZeroL) continue;

		// Constant non-zero indices must be the same offset. For example,
		// char c[4]; short s[4];
		// c[2] and s[1] have the same offset.

		// If we have scalars, we can just get the integer index, otherwise
		// for a vector we look for a splatted value.
		const Constant *ConstIntL = nullptr;
		const Constant *ConstIntR = nullptr;
		if (ConstL->getType()->isIntegerTy()) {
		ConstIntL = dyn_cast<ConstantInt>(ConstL);
		ConstIntR = dyn_cast<ConstantInt>(ConstR);
		} else {
		assert(ConstL->getType()->isVectorTy() && "Expected vector type");
		ConstIntL = dyn_cast_or_null<ConstantInt>(ConstL->getSplatValue());
		ConstIntR = dyn_cast_or_null<ConstantInt>(ConstR->getSplatValue());
		}

		if (int Res = cmpNumbers(!!ConstIntL, !!ConstIntR))
		return Res;

		unsigned IntL = cast<ConstantInt>(ConstIntL)->getZExtValue();
		unsigned IntR = cast<ConstantInt>(ConstIntR)->getZExtValue();
		if (int Res = cmpNumbers(DL->getTypeAllocSize(EltTyL) * IntL,
		DL->getTypeAllocSize(EltTyR) * IntR))
		return Res;
		} else {
		// Dynamic indices. These must be the same index value and the same
		// underlying size.
		if (int Res = cmpValues(OpL, OpR))
		return Res;

		if (int Res = cmpNumbers(DL->getTypeAllocSize(EltTyL),
		DL->getTypeAllocSize(EltTyR)))
		return Res;
		}
		}
}		}

return 0;		return 0;
}		}

/// Compare two values used by the two functions under pair-wise comparison. If		/// Compare two values used by the two functions under pair-wise comparison. If
/// this is the first time the values are seen, they're added to the mapping so		/// this is the first time the values are seen, they're added to the mapping so
/// that we will detect mismatches on next use.		/// that we will detect mismatches on next use.

test/Transforms/MergeFunc/gep.ll

This file was added.

				; RUN: opt -S -mergefunc -o - %s | FileCheck %s

				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx10.10.0"

				%vector_float = type { i64, [8 x float] }
				%vector_int = type { { i32, i32 }, [8 x i32] }
				%vector_char4 = type { { i32, i32 }, [8 x <4 x i8>] }
				%vector_short2 = type { { i32, i32 }, [8 x <2 x i16>] }

				declare void @keepalive(i8*)

				; The i'th element of the float and int vectors should be at the same offset.

				; CHECK-DAG: call void @{{vector_float|vector_int}}

				define linkonce_odr void @vector_float(%vector_float* %this, i64 %i) unnamed_addr {
				entry:
				%offset_i = getelementptr %vector_float* %this, i64 0, i32 1, i64 %i
				%bc_offset_i = bitcast float* %offset_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				define linkonce_odr void @vector_int(%vector_int* %this, i64 %i) unnamed_addr {
				entry:
				%offset_i = getelementptr %vector_int* %this, i64 0, i32 1, i64 %i
				%bc_offset_i = bitcast i32* %offset_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				; The character at index 2 should be at the same position as the short at index 1.

				; CHECK-DAG: call void @{{vector_char4_index2|vector_short2_index1}}

				define linkonce_odr void @vector_char4_index2(%vector_char4* %this, i64 %i) unnamed_addr {
				entry:
				%offset_i = getelementptr %vector_char4* %this, i64 0, i32 1, i64 %i, i64 2
				%bc_offset_i = bitcast i8* %offset_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				define linkonce_odr void @vector_short2_index1(%vector_short2* %this, i64 %i) unnamed_addr {
				entry:
				%offset_i = getelementptr %vector_short2* %this, i64 0, i32 1, i64 %i, i64 1
				%bc_offset_i = bitcast i16* %offset_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				; Vector constant indices vs scalar indices shouldn't change whether we can merge.

				; CHECK-DAG: call void @{{vector_char4_vec|vector_short2_vec}}

				%vector_char4_ptr = type <4 x %vector_char4*>
				%vector_short2_ptr = type <4 x %vector_short2*>

				define linkonce_odr void @vector_char4_vec(%vector_char4_ptr %this, <4 x i64> %i) unnamed_addr {
				entry:
				%bc_ptr = bitcast %vector_char4_ptr %this to %vector_char4_ptr
				%offset_i = getelementptr %vector_char4_ptr %bc_ptr, <4 x i32><i32 0, i32 0, i32 0, i32 0>, <4 x i32><i32 1, i32 1, i32 1, i32 1>, <4 x i64> %i, <4 x i32><i32 2, i32 2, i32 2, i32 2>
				%ext_i = extractelement <4 x i8*> %offset_i, i32 0
				%bc_offset_i = bitcast i8* %ext_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				define linkonce_odr void @vector_short2_vec(%vector_char4_ptr %this, <4 x i64> %i) unnamed_addr {
				entry:
				%bc_ptr = bitcast %vector_char4_ptr %this to %vector_short2_ptr
				%offset_i = getelementptr %vector_short2_ptr %bc_ptr, <4 x i32><i32 0, i32 0, i32 0, i32 0>, <4 x i32><i32 1, i32 1, i32 1, i32 1>, <4 x i64> %i, <4 x i32><i32 1, i32 1, i32 1, i32 1>
				%ext_i = extractelement <4 x i16*> %offset_i, i32 0
				%bc_offset_i = bitcast i16* %ext_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

test/Transforms/MergeFunc/gep2.ll

This file was added.

				; RUN: opt -S -mergefunc -o - %s | FileCheck %s

				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx10.10.0"

				%vector_char4 = type { { i32, i32 }, [8 x <4 x i8>] }
				%vector_short2 = type { { i32, i32 }, [8 x <2 x i16>] }

				declare void @keepalive(i8*)

				; The character at index 1 should not be at the same position as the short at index 1.

				define linkonce_odr void @vector_char4_index1(%vector_char4* %this, i64 %i) unnamed_addr {
				entry:
				; CHECK-LABEL: @vector_char4_index1
				; CHECK: getelementptr
				; CHECK: bitcast
				; CHECK: call void @keepalive
				; CHECK: ret void
				%offset_i = getelementptr %vector_char4* %this, i64 0, i32 1, i64 %i, i64 1
				%bc_offset_i = bitcast i8* %offset_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				define linkonce_odr void @vector_short2_index1(%vector_short2* %this, i64 %i) unnamed_addr {
				entry:
				; CHECK-LABEL: @vector_short2_index1
				; CHECK: getelementptr
				; CHECK: bitcast
				; CHECK: call void @keepalive
				; CHECK: ret void
				%offset_i = getelementptr %vector_short2* %this, i64 0, i32 1, i64 %i, i64 1
				%bc_offset_i = bitcast i16* %offset_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				; Vector constant indices vs scalar indices only support splatted indices for now.

				%vector_char4_ptr = type <4 x %vector_char4*>
				%vector_short2_ptr = type <4 x %vector_short2*>

				define linkonce_odr void @vector_char4_vec(%vector_char4_ptr %this, <4 x i64> %i) unnamed_addr {
				entry:
				; CHECK-LABEL: @vector_char4_vec
				; CHECK: bitcast
				; CHECK: getelementptr
				; CHECK: extractelement
				; CHECK: bitcast
				; CHECK: call void @keepalive
				; CHECK: ret void
				%bc_ptr = bitcast %vector_char4_ptr %this to %vector_char4_ptr
				%offset_i = getelementptr %vector_char4_ptr %bc_ptr, <4 x i32><i32 0, i32 0, i32 0, i32 0>, <4 x i32><i32 1, i32 1, i32 1, i32 1>, <4 x i64> %i, <4 x i32><i32 0, i32 2, i32 2, i32 2>
				%ext_i = extractelement <4 x i8*> %offset_i, i32 0
				%bc_offset_i = bitcast i8* %ext_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}

				define linkonce_odr void @vector_short2_vec(%vector_char4_ptr %this, <4 x i64> %i) unnamed_addr {
				entry:
				; CHECK-LABEL: @vector_short2_vec
				; CHECK: bitcast
				; CHECK: getelementptr
				; CHECK: extractelement
				; CHECK: bitcast
				; CHECK: call void @keepalive
				; CHECK: ret void
				%bc_ptr = bitcast %vector_char4_ptr %this to %vector_short2_ptr
				%offset_i = getelementptr %vector_short2_ptr %bc_ptr, <4 x i32><i32 0, i32 0, i32 0, i32 0>, <4 x i32><i32 1, i32 1, i32 1, i32 1>, <4 x i64> %i, <4 x i32><i32 1, i32 1, i32 1, i32 1>
				%ext_i = extractelement <4 x i16*> %offset_i, i32 0
				%bc_offset_i = bitcast i16* %ext_i to i8*
				call void @keepalive(i8* %bc_offset_i)
				ret void
				}
				No newline at end of file