This is an archive of the discontinued LLVM Phabricator instance.


Differential D99100

[WIP] Implement RFC: Decomposing deref(N) into deref(N) + nofree
Abandoned · Public

Authored by reames on Mar 22 2021, 11:59 AM.

Details

Reviewers

nlopes
jdoerfert
apilipenko
nlewycky
bollu

Summary

This implements the semantic change to dereferenceability described in the llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree".

At the moment, it shows the (widespread) optimization impact of simply toggling the behavior. My plan is to examine each test change to see whether a) we can generalize the transform slightly so that it does not depend on global deref, or b) appropriate test changes (e.g. adding attributes) make sense without destroying the intent of the test. I plan to tackle each transform in its own review, and rebase this one incrementally as we go.

In addition to the updated tests, there are currently 4 failing tests. Their expected output is difficult to regenerate because of limitations in our auto-update tooling. Updates for them will be included before final review of this patch.

LLVM :: Analysis/BasicAA/dereferenceable.ll
LLVM :: Analysis/ValueTracking/memory-dereferenceable.ll
LLVM :: Transforms/VectorCombine/X86/load-inseltpoison.ll
LLVM :: Transforms/VectorCombine/X86/load.ll

Event Timeline

reames created this revision. (Mar 22 2021, 11:59 AM)

Herald added a reviewer: bollu. (Mar 22 2021, 11:59 AM)

Herald added subscribers: dexonsmith, dantrushin, pengfei and 3 others.

reames requested review of this revision. (Mar 22 2021, 11:59 AM)

Herald added a project: Restricted Project. (Mar 22 2021, 11:59 AM)

Harbormaster completed remote builds in B95060: Diff 332382. (Mar 22 2021, 1:21 PM)

reames mentioned this in D99135: [deref] Implement initial set of inference rules for deref-at-point. (Mar 22 2021, 7:42 PM)

reames added a parent revision: D99138: [deref] Use readonly to infer global dereferenceability in a callee. (Mar 22 2021, 8:20 PM)

reames added a parent revision: D99135: [deref] Implement initial set of inference rules for deref-at-point.

reames added a parent revision: D95815: [deref-at-point] restrict inference of dereferenceability based on allocsize attribute. (Mar 25 2021, 2:59 PM)

reames mentioned this in rGe75a2dfe209d: [tests] Stablize tests for possible change in deref semantics. (Jul 14 2021, 1:05 PM)

reames mentioned this in rG7e496c29e2bc: [tests] Stablize tests for possible change in deref semantics. (Jul 14 2021, 1:37 PM)

Rebase over previous changes and now-stabilized tests.

Harbormaster completed remote builds in B114099: Diff 358754. (Jul 14 2021, 2:30 PM)

This is nearing the point of being ready for real review. When that happens, I'm going to open a new review with a much revised description. We've ended up moving in a direction which doesn't align well with the original framing on the review.

The last major piece before I put this up for review is some performance validation to make sure the impact of this isn't "too bad". Some of that can happen after the review is posted, but I want to at least have a sanity check first.

From a quick look at canBeFreed(), it seems like the case of a nofree argument isn't handled yet.

In D99100#2878416, @nikic wrote:

From a quick look at canBeFreed(), it seems like the case of a nofree argument isn't handled yet.

That's because there was disagreement as to what the semantics of such an argument were. I got frustrated in trying to drive that towards any useful conclusion, and don't plan to return to the topic.

On the general topic, https://reviews.llvm.org/D101701 is still open, but that doesn't seem to directly involve the parameter attribute. I can't find what I'm thinking of, so maybe I just misremembered that particular sub-piece being controversial?

In D99100#2878486, @reames wrote:

That's because there was disagreement as to what the semantics of such an argument were. I got frustrated in trying to drive that towards any useful conclusion, and don't plan to return to the topic.

On the general topic, https://reviews.llvm.org/D101701 is still open, but that doesn't seem to directly involve the parameter attribute. I can't find what I'm thinking of, so maybe I just misremembered that particular sub-piece being controversial?

I think the controversy was about the function attribute only. It's pretty important to me that using dereferenceable plus nofree on an argument works (and does so without any additional nosync requirements), because that means you can get the current dereferenceable semantics back simply by emitting dereferenceable nofree in your frontend wherever you used plain dereferenceable before. That gives a clear migration path for which no regressions should be expected (is that right?)

In D99100#2878515, @nikic wrote:

I think the controversy was about the function attribute only. It's pretty important to me that using dereferenceable plus nofree on an argument works (and does so without any additional nosync requirements), because that means you can get the current dereferenceable semantics back simply by emitting dereferenceable nofree in your frontend wherever you used plain dereferenceable before. That gives a clear migration path for which no regressions should be expected (is that right?)

To be clear, there is no current migration plan for which no regressions are expected. I tried to come up with one, it failed miserably. I'm no longer working towards that goal.

Your point about wanting nofree on a parameter to not require nosync runs exactly into the discussion I linked.

If you want to drive the nofree parameter case, feel free. If you want to collect some numbers and tell me whether this is a real or hypothetical regression, that would be super useful.

Lest I seem too dismissive, let me expand on something I was planning to include in the description of the real patch after we had some preliminary numbers. We will infer nofree+nosync on the function if possible. In practice, this covers a lot of the concerning cases I saw. The biggest hesitation I have is that inlining small functions effectively destroys our ability to infer nofree/nosync regions. We don't really have an answer for context nofree reasoning after heavy inlining. If we're going to have a major regression, I'm expecting it to come from that interaction.
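To make the role of function-level nofree+nosync concrete, here is a minimal sketch of the decision being described. It is a toy model, not the code in Value.cpp: the `DerefFacts` struct and `derefThroughoutFunction` are hypothetical names, and the rule is an assumption read off this discussion and the `nofree nosync` added in the load-partial.ll test change.

```cpp
#include <cassert>

// Hypothetical summary of what is known about a pointer and the function
// it is used in. Illustrative only; these are not LLVM data types.
struct DerefFacts {
  bool DerefAtDefinition; // e.g. a dereferenceable(N) argument at entry
  bool FunctionNoFree;    // enclosing function is known not to free memory
  bool FunctionNoSync;    // ... and not to synchronize with other threads
};

// Returns true if the pointer may be treated as dereferenceable at every
// point in the function, not only where the fact was established.
bool derefThroughoutFunction(const DerefFacts &F, bool UseDerefAtPoint) {
  if (!F.DerefAtDefinition)
    return false;
  // Old (global) semantics: the attribute alone is sufficient.
  if (!UseDerefAtPoint)
    return true;
  // At-point semantics (assumed rule): the fact is extended only when
  // nothing can free the object while the function runs.
  return F.FunctionNoFree && F.FunctionNoSync;
}
```

Under this model, a function that loses its inferred nofree or nosync after inlining a small callee would also lose the extended dereferenceability fact, which is the interaction flagged above as the likely source of regressions.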

@nikic I remember the situation with the parameter attribute. The issue is that the attribute describes actions taken through that particular copy of the pointer, not all copies of the pointer. This makes it easier to infer, but makes it very challenging to use for optimization. As an example, consider the following:
void foo(bool c, char * deref(1) nofree a, char * b) {
  loop {
    if (c) { free(b); break; }
    v = *a;
  }
}
foo(true, p, p);

In this example, we'd want to hoist the load from a, but since b can point to the same object, we can't.
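The aliasing problem can be illustrated with a small self-contained model. This is a sketch under stated assumptions: `Object`, `freeThrough`, and `aStillLiveAfterFoo` are hypothetical names, the `Live` flag stands in for real allocation state, and the loop is bounded so that the example terminates.

```cpp
#include <cassert>

// Toy stand-in for a heap object: Live records whether it has been freed.
// This models the situation; it is not how LLVM represents memory.
struct Object {
  bool Live = true;
  char Data = 'x';
};

// Models free(b): the object dies no matter which pointer copy was used.
void freeThrough(Object *P) { P->Live = false; }

// A plays the role of the deref(1) nofree parameter: this function never
// frees through A. B is unannotated. Returns whether A still points at a
// live object when the loop exits.
bool aStillLiveAfterFoo(bool C, Object *A, Object *B) {
  char V = 0;
  for (int I = 0; I < 4; ++I) {
    if (C) {
      freeThrough(B);
      break;
    }
    V = A->Data; // the load one would like to hoist
  }
  (void)V;
  return A->Live;
}
```

Calling `aStillLiveAfterFoo(true, &P, &P)` returns false even though nothing was freed through `A`, whereas passing two distinct objects returns true. A per-pointer nofree therefore says nothing about the lifetime of the underlying object.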

This was the motivation for the use of noalias in the original proposal. Review discussion quickly made it clear that there was no consensus as to what noalias actually meant, so I dropped that approach. There might be room to drive that forward, but it'll intersect with all the aliasing work involving the same attribute in complicated ways.

Extending this pointer property to an object property was the notion behind the "nofreeobj" idea. I still think that would work if pursued.

Abandoning this WIP in favor of the real patch in D110745.

Revision Contents

Path                                             Size
llvm/docs/LangRef.rst                            34 lines
llvm/lib/IR/Value.cpp                            2 lines
llvm/test/Analysis/BasicAA/dereferenceable.ll    2 lines
llvm/test/CodeGen/X86/load-partial.ll            2 lines
llvm/test/Transforms/LICM/hoist-deref-load.ll    16 lines

Diff 358754

llvm/docs/LangRef.rst

[... 1,263 lines hidden ...]
 ``nonnull``
 This indicates that the parameter or return pointer is not null. This
 attribute may only be applied to pointer typed parameters. This is not
 checked or enforced by LLVM; if the parameter or return pointer is null,
 :ref:`poison value <poisonvalues>` is returned or passed instead.
 The ``nonnull`` attribute should be combined with the ``noundef`` attribute
 to ensure a pointer is not null or otherwise the behavior is undefined.

 ``dereferenceable(<n>)``
-This indicates that the parameter or return pointer is dereferenceable. This
-attribute may only be applied to pointer typed parameters. A pointer that
-is dereferenceable can be loaded from speculatively without a risk of
-trapping. The number of bytes known to be dereferenceable must be provided
-in parentheses. It is legal for the number of bytes to be less than the
-size of the pointee type. The ``nonnull`` attribute does not imply
-dereferenceability (consider a pointer to one element past the end of an
-array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
-``addrspace(0)`` (which is the default address space), except if the
+This indicates that the parameter or return pointer is dereferenceable at
+the instant of the call. This attribute may only be applied to pointer
+typed parameters. The number of bytes known to be dereferenceable must
+be provided in parentheses. It is legal for the number of bytes to be less
+than the size of the pointee type.
+
+A pointer that is dereferenceable at a particular location in the program
+can be loaded from speculatively without a risk of trapping at that
+location. In general, once a memory location becomes dereferenceable, it
+will remain dereferenceable until the underlying object is freed.
+
+The ``nonnull`` attribute does not imply dereferenceability (consider a
+pointer to one element past the end of an array), however
+``dereferenceable(<n>)`` does imply ``nonnull`` in ``addrspace(0)``
+(which is the default address space), except if the
 ``null_pointer_is_valid`` function attribute is present.
 ``n`` should be a positive number. The pointer should be well defined,
 otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
 implies ``noundef``.

 ``dereferenceable_or_null(<n>)``
 This indicates that the parameter or return value isn't both
 non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
[... 4,921 lines hidden ...]
 or switch that it is attached to is completely unpredictable.

 .. _md_dereferenceable:

 '``dereferenceable``' Metadata
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The existence of the ``!dereferenceable`` metadata on the instruction
-tells the optimizer that the value loaded is known to be dereferenceable.
-The number of bytes known to be dereferenceable is specified by the integer
-value in the metadata node. This is analogous to the ''dereferenceable''
-attribute on parameters and return values.
+tells the optimizer that the value loaded is known to be dereferenceable at
+that program location. The number of bytes known to be dereferenceable is
+specified by the integer value in the metadata node. This is analogous to the
+''dereferenceable'' attribute on parameters and return values.

 .. _md_dereferenceable_or_null:

 '``dereferenceable_or_null``' Metadata
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The existence of the ``!dereferenceable_or_null`` metadata on the
 instruction tells the optimizer that the value loaded is known to be either
-dereferenceable or null.
+dereferenceable or null at that program location.
 The number of bytes known to be dereferenceable is specified by the integer
 value in the metadata node. This is analogous to the ''dereferenceable_or_null''
 attribute on parameters and return values.

 .. _llvm.loop:

 '``llvm.loop``'
 ^^^^^^^^^^^^^^^
[... 16,371 lines hidden ...]

llvm/lib/IR/Value.cpp

[... 33 lines hidden ...]
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/ManagedStatic.h"
 #include "llvm/Support/raw_ostream.h"
 #include <algorithm>

 using namespace llvm;

 static cl::opt<unsigned> UseDerefAtPointSemantics(
-    "use-dereferenceable-at-point-semantics", cl::Hidden, cl::init(false),
+    "use-dereferenceable-at-point-semantics", cl::Hidden, cl::init(true),
     cl::desc("Deref attributes and metadata infer facts at definition only"));

 //===----------------------------------------------------------------------===//
 // Value Class
 //===----------------------------------------------------------------------===//
 static inline Type *checkType(Type *Ty) {
   assert(Ty && "Value defined with a null type: Error!");
   return Ty;
[... 1,187 lines hidden ...]

llvm/test/Analysis/BasicAA/dereferenceable.ll

[... 57 lines hidden ...]
 bb:
 store i32 1, i32* %obj, align 4
 store i64 0, i64* %ret, align 8
 %tmp = load i32, i32* %obj, align 4
 ret i32 %tmp
 }

 define i32 @local_and_deref_ret_2() {
 ; CHECK: Function: local_and_deref_ret_2: 2 pointers, 2 call sites
-; CHECK-NEXT: NoAlias: i32* %obj, i32* %ret
+; CHECK-NEXT: MayAlias: i32* %obj, i32* %ret
 bb:
 %obj = alloca i32
 call void @unknown(i32* %obj)
 %ret = call dereferenceable(8) i32* @get_i32_deref8()
 store i32 1, i32* %obj, align 4
 store i32 0, i32* %ret, align 8
 %tmp = load i32, i32* %obj, align 4
 ret i32 %tmp
[... 75 lines hidden ...]

llvm/test/CodeGen/X86/load-partial.ll

[... 150 lines hidden ...]
 ; AVX-NEXT: retq
 %7 = insertelement <4 x float> %5, float %6, i32 1
 %8 = getelementptr inbounds <4 x float>, <4 x float>* %0, i64 0, i64 2
 %9 = load float, float* %8, align 4
 %10 = insertelement <4 x float> %7, float %9, i32 2
 %11 = insertelement <4 x float> %10, float %9, i32 3
 ret <4 x float> %11
 }

-define <4 x float> @load_float4_float3_trunc(<4 x float>* nocapture readonly dereferenceable(16)) {
+define <4 x float> @load_float4_float3_trunc(<4 x float>* nocapture readonly dereferenceable(16)) nofree nosync {
 ; SSE-LABEL: load_float4_float3_trunc:
 ; SSE: # %bb.0:
 ; SSE-NEXT: movaps (%rdi), %xmm0
 ; SSE-NEXT: retq
 ;
 ; AVX-LABEL: load_float4_float3_trunc:
 ; AVX: # %bb.0:
 ; AVX-NEXT: vmovaps (%rdi), %xmm0
[... 225 lines hidden ...]

llvm/test/Transforms/LICM/hoist-deref-load.ll

[... 420 lines hidden ...]
 ; because the dereferenceable meatdata on the c = *cptr load.
 define void @test7(i32* noalias %a, i32* %b, i32** %cptr, i32 %n) #0 {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: [[C:%.*]] = load i32*, i32** [[CPTR:%.*]], align 8, !dereferenceable !0, !align !0
 ; CHECK-NEXT: [[CMP11:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; CHECK-NEXT: br i1 [[CMP11]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
 ; CHECK: for.body.preheader:
-; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[C]], align 4
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDVARS_IV]]
-; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
-; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP1]], 0
+; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
+; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP0]], 0
 ; CHECK-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
 ; CHECK: if.then:
+; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[C]], align 4
 ; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[ARRAYIDX3]], align 4
-; CHECK-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP2]], [[TMP0]]
+; CHECK-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP2]], [[TMP1]]
 ; CHECK-NEXT: store i32 [[MUL]], i32* [[ARRAYIDX]], align 4
 ; CHECK-NEXT: br label [[FOR_INC]]
 ; CHECK: for.inc:
 ; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
 ; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]
 ; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT:%.*]], label [[FOR_BODY]]
 ; CHECK: for.end.loopexit:
[... 47 lines hidden ...]
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: [[C:%.*]] = load i32*, i32** [[CPTR:%.*]], align 8, !dereferenceable_or_null !0, !align !0
 ; CHECK-NEXT: [[NOT_NULL:%.*]] = icmp ne i32* [[C]], null
 ; CHECK-NEXT: br i1 [[NOT_NULL]], label [[NOT_NULL:%.*]], label [[FOR_END:%.*]]
 ; CHECK: not.null:
 ; CHECK-NEXT: [[CMP11:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; CHECK-NEXT: br i1 [[CMP11]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END]]
 ; CHECK: for.body.preheader:
-; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[C]], align 4
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDVARS_IV]]
-; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
-; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP1]], 0
+; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
+; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP0]], 0
 ; CHECK-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
 ; CHECK: if.then:
+; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[C]], align 4
 ; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[ARRAYIDX3]], align 4
-; CHECK-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP2]], [[TMP0]]
+; CHECK-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP2]], [[TMP1]]
 ; CHECK-NEXT: store i32 [[MUL]], i32* [[ARRAYIDX]], align 4
 ; CHECK-NEXT: br label [[FOR_INC]]
 ; CHECK: for.inc:
 ; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
 ; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]
 ; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT:%.*]], label [[FOR_BODY]]
 ; CHECK: for.end.loopexit:
[... 647 lines hidden ...]
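
Read as source code, the two sides of the test7 CHECK lines differ only in where the load through the loaded pointer `c` executes. The following is a rough C++ rendering, assuming the loop body is `if (a[i] > 0) a[i] = b[i] * *c`. That shape is reconstructed from the CHECK lines rather than taken from the test's original source, and both function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// New expected output: the load of *C stays inside the guarded block, so it
// executes only on iterations where A[I] > 0.
void loopLoadInBody(int *A, const int *B, const int *C, int N) {
  for (int I = 0; I < N; ++I)
    if (A[I] > 0)
      A[I] = B[I] * *C;
}

// Old expected output: the load of *C is hoisted into the loop preheader,
// so it executes once whenever N > 0, even if no element of A is positive.
// That speculation is only safe if C is dereferenceable at the preheader.
void loopLoadHoisted(int *A, const int *B, const int *C, int N) {
  if (N <= 0)
    return;
  const int Hoisted = *C;
  for (int I = 0; I < N; ++I)
    if (A[I] > 0)
      A[I] = B[I] * Hoisted;
}
```

Both versions compute the same result when `C` is valid and does not alias `A`. They differ only in whether `*C` may be read on a path where the original program would not have read it.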
