This is an archive of the discontinued LLVM Phabricator instance.

Differential D78810

[InstCombine] Check max alignment before adding attr on aligned_alloc
Needs Review · Public

Authored by bondhugula on Apr 24 2020, 7:54 AM.

Details

Reviewers

clin1
jdoerfert
lebedev.ri

Summary

In InstCombineCalls, check the maximum alignment before adding the alignment
attribute on aligned_alloc. Fixes 45654.
https://bugs.llvm.org/show_bug.cgi?id=45654

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time    Test
160 ms  MLIR.Dialect/Vector::Unknown Unit Message ("")
620 ms  MLIR.mlir-tblgen::Unknown Unit Message ("")

Event Timeline

bondhugula created this revision. Apr 24 2020, 7:54 AM

Herald added a project: Restricted Project. Apr 24 2020, 7:54 AM

Herald added subscribers: llvm-commits, hiraditya.

Thank you for looking into this.
This should instead clamp to the maximal alignment. (the 'is power of 2' check should remain)

This revision now requires changes to proceed. Apr 24 2020, 8:10 AM

Harbormaster failed remote builds in B54574: Diff 259887! Apr 24 2020, 9:09 AM

In D78810#2001925, @lebedev.ri wrote:

Thank you for looking into this.
This should instead clamp to the maximal alignment. (the 'is power of 2' check should remain)

Clamping would be incorrect here, since the allocation risks not providing the desired alignment if it gets promoted to alloca. aligned_alloc doesn't appear to have an upper bound on alignment (any power-of-2 size_t value is fine, I think), while the 1 << 29 limit applies to LLVM's alloca, among others. I think the right/better fix for this could be to just add the attribute to aligned_alloc as long as it's a power of two, but not promote to alloca when it's larger than 1 << 29. But then I'm not sure whether an alignment attribute larger than 1 << 29 would cause other undesired behavior in LLVM. If it did, then not adding the attribute at all for values larger than MaximumAlignment appears to be the safe/right thing.
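The condition being debated can be looked at in isolation. The following is a minimal standalone sketch, not the LLVM code: kMaxAlignment and shouldAddAlignAttr are hypothetical names, and the 1 << 29 value is simply the limit quoted in the comment above.

```cpp
#include <cstdint>

// Hypothetical stand-in for the 1 << 29 limit mentioned in the discussion.
constexpr std::uint64_t kMaxAlignment = std::uint64_t{1} << 29;

// True when V is a nonzero power of two.
constexpr bool isPowerOfTwo(std::uint64_t V) {
  return V != 0 && (V & (V - 1)) == 0;
}

// Hypothetical helper mirroring the guard proposed in this revision:
// attach an alignment attribute only when the constant is a power of two
// that does not exceed the cap; otherwise attach nothing.
constexpr bool shouldAddAlignAttr(std::uint64_t AlignmentVal) {
  return isPowerOfTwo(AlignmentVal) && AlignmentVal <= kMaxAlignment;
}

// A small alignment is accepted; a power of two above the cap is rejected.
static_assert(shouldAddAlignAttr(32), "32 is a power of two below the cap");
static_assert(!shouldAddAlignAttr(std::uint64_t{1} << 30),
              "1 << 30 exceeds the cap");
```

Under this sketch a value such as 1 << 30 gets no attribute at all, which is the "drop" behavior argued for here, as opposed to clamping it to 1 << 29.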

In D78810#2002021, @bondhugula wrote:

In D78810#2001925, @lebedev.ri wrote:

Thank you for looking into this.
This should instead clamp to the maximal alignment. (the 'is power of 2' check should remain)

Clamping would be incorrect here since the allocation risks not providing the desired alignment
if it gets promoted to alloca.

Ah, interesting point, I had not considered it.
But then we already have that problem in other places,
at least clang/lib/CodeGen/CGCall.cpp, AbstractAssumeAlignedAttrEmitter.

aligned_alloc doesn't appear to have an upper bound on alignment
(any power of 2 size_t value is fine I think),
while the 1 << 29 limit is for LLVM's alloca among others.

I think the right/better fix for this could be to just add the attribute to aligned_alloc
as long as it's a power of two, but not promote to alloca when it's larger than 1 << 29.

I'm not sure what you mean. Clearly, as per this patch, that is what we do now,
and it fails because there's an artificial (and bogus, too low) upper limit.

But then I'm not sure if that would cause other undesired behavior in LLVM
with an alignment attribute more than 1 << 29.
If it did, then not adding the attribute at all for larger
than MaximumAlignment appears to be the safe/right thing.

In D78810#2002195, @lebedev.ri wrote:

In D78810#2002021, @bondhugula wrote:

In D78810#2001925, @lebedev.ri wrote:

Thank you for looking into this.
This should instead clamp to the maximal alignment. (the 'is power of 2' check should remain)

Clamping would be incorrect here since the allocation risks not providing the desired alignment
if it gets promoted to alloca.

Ah, interesting point, I had not considered it.
But then we already have that problem in other places,
at least clang/lib/CodeGen/CGCall.cpp, AbstractAssumeAlignedAttrEmitter.

aligned_alloc doesn't appear to have an upper bound on alignment
(any power of 2 size_t value is fine I think),
while the 1 << 29 limit is for LLVM's alloca among others.

I think the right/better fix for this could be to just add the attribute to aligned_alloc
as long as it's a power of two, but not promote to alloca when it's larger than 1 << 29.

I'm not sure what you mean. Clearly, as per this patch, that is what we do now,
and it fails because there's an artificial (and bogus, too low) upper limit.

No, this patch is not adding the alignment attribute at all. So, the clang assertion (in the bug report) goes away. The heap to stack promotion will pull the alignment attribute from the call operand and that would have to be guarded so that the promotion doesn't happen for alignments larger than 1 << 29 (separate bug) because that's what alloca is able to support.
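The separate guard described here, on the heap-to-stack side, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the Attributor's actual logic: canPromoteToAlloca and kMaxAllocaAlignment are hypothetical names, and the alignment is modeled as a plain integer read from the call operand.

```cpp
#include <cstdint>

// Hypothetical stand-in for the largest alignment an alloca can carry,
// using the 1 << 29 figure from the discussion.
constexpr std::uint64_t kMaxAllocaAlignment = std::uint64_t{1} << 29;

// True when V is a nonzero power of two.
constexpr bool isPowerOfTwo(std::uint64_t V) {
  return V != 0 && (V & (V - 1)) == 0;
}

// Hypothetical promotion guard: the alignment comes from the call operand,
// so promotion is refused when the operand is not a constant, is not a
// valid alignment, or is larger than what an alloca can express.
constexpr bool canPromoteToAlloca(bool AlignIsConstant,
                                  std::uint64_t AlignOperand) {
  return AlignIsConstant && isPowerOfTwo(AlignOperand) &&
         AlignOperand <= kMaxAllocaAlignment;
}

static_assert(canPromoteToAlloca(true, 64), "small constant alignment");
static_assert(!canPromoteToAlloca(true, std::uint64_t{1} << 30),
              "too large for an alloca");
```

The point of the sketch is only that this check is independent of whether the return value carries an alignment attribute.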

But then I'm not sure if that would cause other undesired behavior in LLVM
with an alignment attribute more than 1 << 29.
If it did, then not adding the attribute at all for larger
than MaximumAlignment appears to be the safe/right thing.

FWIW, dropping the attribute LGTM. The heap/stack promotion is still a work in progress, is that correct? At least, I don't see it happening in trunk.

In D78810#2098816, @clin1 wrote:

FWIW, dropping the attribute LGTM. The heap/stack promotion is still a work in progress, is that correct? At least, I don't see it happening in trunk.

You'll get basic support if you enable the Attributor (-attributor-enable=cgscc).

In D78810#2002021, @bondhugula wrote:

In D78810#2001925, @lebedev.ri wrote:

Thank you for looking into this.
This should instead clamp to the maximal alignment. (the 'is power of 2' check should remain)

Clamping would be incorrect here, since the allocation risks not providing the desired alignment if it gets promoted to alloca. aligned_alloc doesn't appear to have an upper bound on alignment (any power-of-2 size_t value is fine, I think), while the 1 << 29 limit applies to LLVM's alloca, among others. I think the right/better fix for this could be to just add the attribute to aligned_alloc as long as it's a power of two, but not promote to alloca when it's larger than 1 << 29. But then I'm not sure whether an alignment attribute larger than 1 << 29 would cause other undesired behavior in LLVM. If it did, then not adding the attribute at all for values larger than MaximumAlignment appears to be the safe/right thing.

While I agree this seems to be a safe fix, I fail to see how clamping or promotion to alloca would be problematic:
First, the attribute:
align(X) in IR means the alignment is at least X. We can always use a factor of the real alignment as X, which is what clamping would result in, right? So from the IR semantics standpoint align(1<<29) would be fine if the argument to aligned_alloc is a constant power of two bigger than 1<<29.
Second, the promotion:
If we have a call to XXXalloc that we want to replace with a stack allocation we need to ensure the guaranteed alignment is not decreased. Assuming we don't know the guaranteed alignment of the call, we are out of luck wrt. promotion. Assuming the alignment is too much for the stack, we have to give up as well. Assuming we know the guaranteed alignment and it is a constant small enough for us to put on an alloca instruction, we can go ahead. What I try to say is: The annotation of a minimal alignment (the IR attribute align) is irrelevant when it comes to promotion to alloca.
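The first point above, that a clamped value is still a correct lower bound, can be checked with plain integer arithmetic. This is a minimal sketch with hypothetical names (kMaxAlignment, clampAlignment, isAligned); it models addresses as integers and does not use any LLVM API.

```cpp
#include <cstdint>

// Hypothetical stand-in for the 1 << 29 cap from the discussion.
constexpr std::uint64_t kMaxAlignment = std::uint64_t{1} << 29;

// Clamp a requested power-of-two alignment to the cap.
constexpr std::uint64_t clampAlignment(std::uint64_t Requested) {
  return Requested < kMaxAlignment ? Requested : kMaxAlignment;
}

// True when Addr is a multiple of the power-of-two Align.
constexpr bool isAligned(std::uint64_t Addr, std::uint64_t Align) {
  return (Addr & (Align - 1)) == 0;
}

// An address aligned to 1 << 32 is also aligned to the clamped 1 << 29,
// because every smaller power of two divides a larger one.
constexpr std::uint64_t kRequested = std::uint64_t{1} << 32;
constexpr std::uint64_t kAddr = 3 * kRequested;
static_assert(isAligned(kAddr, kRequested), "aligned as requested");
static_assert(isAligned(kAddr, clampAlignment(kRequested)),
              "still aligned to the clamped value");
```

The converse does not hold: an address aligned to 1 << 29 need not be aligned to 1 << 32, which is why the clamped attribute says nothing about whether a promotion would preserve the requested alignment.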

This review seems to be stuck/dead, consider abandoning if no longer relevant.

This revision now requires review to proceed. Jan 12 2023, 4:46 PM

Herald added a project: Restricted Project. Jan 12 2023, 4:46 PM

Herald added subscribers: StephenFan, bollu.

Some version of this fix made it into:
https://github.com/llvm/llvm-project/commit/8233439fdbf5e11ba4a9f53801008721727f53a5
This review can be closed --- thanks.

Revision Contents

Path                                                  Size
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp  6 lines
llvm/test/Transforms/InstCombine/deref-alloc-fns.ll   9 lines

Diff 259887

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

   if (isMallocLikeFn(&Call, TLI) && Op0C) {
     [...]
     else
       Call.addAttribute(AttributeList::ReturnIndex,
                         Attribute::getWithDereferenceableOrNullBytes(
                             Call.getContext(), Op0C->getZExtValue()));
   } else if (isAlignedAllocLikeFn(&Call, TLI) && Op1C) {
     Call.addAttribute(AttributeList::ReturnIndex,
                       Attribute::getWithDereferenceableOrNullBytes(
                           Call.getContext(), Op1C->getZExtValue()));
-    // Add alignment attribute if alignment is a power of two constant.
+    // Add an alignment attribute if the alignment is a power of two constant
+    // less than the maximum alignment.
     if (Op0C) {
       uint64_t AlignmentVal = Op0C->getZExtValue();
-      if (llvm::isPowerOf2_64(AlignmentVal))
+      if (isPowerOf2_64(AlignmentVal) &&
+          Op0C->getValue().ule(Value::MaximumAlignment))
         Call.addAttribute(AttributeList::ReturnIndex,
                           Attribute::getWithAlignment(Call.getContext(),
                                                       Align(AlignmentVal)));
     }
   } else if (isReallocLikeFn(&Call, TLI) && Op1C) {
     Call.addAttribute(AttributeList::ReturnIndex,
                       Attribute::getWithDereferenceableOrNullBytes(
                           Call.getContext(), Op1C->getZExtValue()));
   [...]

llvm/test/Transforms/InstCombine/deref-alloc-fns.ll

 [...]
 ; CHECK-LABEL: @aligned_alloc_constant_size(
 ; CHECK-NEXT: [[CALL:%.*]] = tail call noalias align 32 dereferenceable_or_null(512) i8* @aligned_alloc(i64 32, i64 512)
 ; CHECK-NEXT: ret i8* [[CALL]]
 ;
   %call = tail call noalias i8* @aligned_alloc(i64 32, i64 512)
   ret i8* %call
 }

-declare noalias i8* @foo(i8*, i8*, i8*)
+declare noalias i8* @escape(i8*, i8*, i8*, i8*)

 define noalias i8* @aligned_alloc_dynamic_args(i64 %align, i64 %size) {
 ; CHECK-LABEL: @aligned_alloc_dynamic_args(
 ; CHECK-NEXT: tail call noalias dereferenceable_or_null(1024) i8* @aligned_alloc(i64 %{{.*}}, i64 1024)
 ; CHECK-NEXT: tail call noalias i8* @aligned_alloc(i64 0, i64 1024)
 ; CHECK-NEXT: tail call noalias i8* @aligned_alloc(i64 32, i64 %{{.*}})
+; CHECK-NEXT: tail call noalias dereferenceable_or_null(4096) i8* @aligned_alloc(i64 9223372036854775807, i64 4096)
 ;
+; No alignment attribute will be added in these cases.
   %call = tail call noalias i8* @aligned_alloc(i64 %align, i64 1024)
   %call_1 = tail call noalias i8* @aligned_alloc(i64 0, i64 1024)
   %call_2 = tail call noalias i8* @aligned_alloc(i64 32, i64 %size)
+  ; Alignment is more than the maximum alignment allowed.
+  ; 9223372036854775807 is 2^63 - 1.
+  %call_3 = tail call noalias i8* @aligned_alloc(i64 9223372036854775807, i64 4096)

-  call i8* @foo(i8* %call, i8* %call_1, i8* %call_2)
+  call i8* @escape(i8* %call, i8* %call_1, i8* %call_2, i8* %call_3)
   ret i8* %call
 }

 define noalias i8* @malloc_constant_size2() {
 ; CHECK-LABEL: @malloc_constant_size2(
 ; CHECK-NEXT: [[CALL:%.*]] = tail call noalias dereferenceable_or_null(80) i8* @malloc(i64 40)
 ; CHECK-NEXT: ret i8* [[CALL]]
 ;
 [...]