This is an archive of the discontinued LLVM Phabricator instance.


Differential D68484

[PATCH 01/27] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation.
Needs Review, Public

Authored by jeroen.dobbelaere on Oct 4 2019, 1:59 PM.


Details

Reviewers

hfinkel
jdoerfert

Summary

This patch is the first of a series that introduces full restrict support
in LLVM and clang. The full support is based on the original local restrict
patches from Hal Finkel and is an implementation of the
'RFC: Full 'restrict' support in LLVM' [1].

In order to show the dependencies, in what follows, most of the time
a non-functional rebased patch from Hal Finkel is provided, followed
by a patch that enhances the full restrict support and makes everything
compile and run again.

[1] https://lists.llvm.org/pipermail/llvm-dev/2019-October/135672.html

Notes:

The mechanism with the ptr_provenance is such that passes that don't know about it will either work (and maybe miss certain optimizations) or crash. This is considered to be better than producing wrong code.
This set of patches is at the moment not complete. It is tested and works for the use cases of my company. But it is to be expected that some optimization passes will not interact well with it. In our experience, a number of optimization passes do have problems with the optional extra argument for load and store instructions, and they are normally easy to fix. It is possible that we did not yet catch all of those in passes that we don't use.
One item that is known to be missing, is LLVM IR bitcode support for the noalias_sidechannel of the load/store instruction (ascii LLVM IR is supported).
The new pass manager support has been fixed (D68507).
SLPVectorizer issues also have been fixed. (D68517)
The options enabling/disabling full restrict have been improved. (D68484)
A latent problem where invalid llvm-ir was produced during inlining has been fixed. (D68509)
A bug where the noalias depends-on relationship was lost has been fixed. (D68512 and D68521)

Added with the drop of 2020/06/12:

Renaming of 'side channel' to provenance, ptr_provenance
Incorporating Hal Finkel's changes. This should make it easier to review. It also reduces the number of patches to 26.
Handling of llvm.noalias.copy.guard during SROA has been improved.
Handling of Loop Unrolling has been improved.
Fixed a case in -fno-full-restrict where the new annotations were still produced.

Added with the drop of 2020/09/07

llvm-IR bitcode support
DeadArgumentElimination support

Added with the drop of 2021/05/18

coexists with llvm.experimental.noalias.scope.decl
some small improvements/fixes

Notes:

NoAliasInfo.rst describes the noalias intrinsics infrastructure.

This set of patches is based on f8dbd61074176bae92ec360a093ac7bc498c9321 (May 18, 2021)

A convenience patch is available at D69542.

Event Timeline

rpjohnst added a subscriber: rpjohnst.Oct 5 2019, 12:34 AM

jeroen.dobbelaere edited the summary of this revision.Oct 5 2019, 2:34 AM

lebedev.ri added a child revision: D68485: [PATCH 02/38] [noalias] D9375: An llvm.noalias intrinsic.Oct 5 2019, 9:44 AM

The patches in this patch series are not generated with full context (-U99999) (and not fully clang-formatted) :/

This mentions something it calls <channel> but as far as i can tell it is not actually explained *what* that is, what type it is.

In D68484#1696523, @lebedev.ri wrote:

This mentions something it calls <channel> but as far as i can tell it is not actually explained *what* that is, what type it is.

As the syntax description shows, it is the same as the type of the pointer operand <ty>*.
I should maybe have mentioned it explicitly.

Document the type of the noalias_sidechannel for load and store.

craig.topper added a subscriber: craig.topper.Oct 7 2019, 12:26 PM

craig.topper added inline comments.

llvm/docs/LangRef.rst
20452	Extra space after "path"
20455	in -> into?
20472	"allows to track" reads funny. Maybe "allows tracking of"?
20478	intrinsics*

lzutao added a subscriber: lzutao.Oct 7 2019, 1:14 PM

Dushistov added a subscriber: Dushistov.Oct 7 2019, 1:42 PM

Thank you very much for working on this and putting all this into motion!

I started to look at this patch in isolation but with the rough idea of the approach in mind.
I did add various comments, from small wording changes to proposals on how we conceptually describe things.
Given that many comments would be repeated for each intrinsic, I stopped after the first and want to see what people think.

llvm/docs/LangRef.rst
20415	Nit: of `load` and `store` instructions. The "of" sounds weird to me. The documentation below explains how it works, with `restrict` in mind. I personally dislike sentences like this and would just remove it. The intrinsics can also be used to specify alias assumptions that are not restrict based. Arguably that is always true. The section describes the semantics of the alias stuff and how that can be used to model `restrict`. It is implied that other things can be modeled as well.
20432	I think you mix the "templated" definition (`XXX`) with instantiations (`i8*`, `%struct.FOO`, ...). I would prefer we pick either. Precedence says you replace `XXX` with the types of that instantiation.
20458	"introduces alias assumptions": plural vs. singular. "in the normal computation path": this isn't a "known" term for me (see below). "of a pointer and it will be opaque for most optimizations": this is the hope, but it is questionable if it is true and why it is here. I would replace the first sentence with: "The `llvm.noalias` intrinsic attaches alias assumptions to its first argument." The whole pass thing and splitting comes too early (IMHO). I don't know yet what these intrinsics mean but I learn that they are transformed. That said, `llvm.side.noalias` is not described here.
20464	Nit: remove "is used to", just "identifies" Nit: remove "exact" (what does it mean given that we actually move stuff around under the normal "as if" rules) It's not "done inside" and loops are only one example of this. What you want to say is more general: "Whenever a `llvm.noalias.decl` intrinsic is duplicated through code transformations, care must be taken to duplicate and uniquify the scopes and intrinsics. These steps are described in the following." To be honest, I'm not sure if it makes sense to say something like that here already.
20469	Nit: stray "this" in 16284. The inlining sentence does not really clear up anything here, partially because we don't know what is happening.
20473	remove "a blob of"
20484	Either a real object, a constant where the value is relative to 0 or `null`. There is a word missing and 0 is `null`.
20486	This seems odd, why introduce two things that do the same thing.
20489	"entries with a single element each." It represents the variable declaration that contains one or more restrict pointers. I do not understand this sentence.
20493	For both items above: No need for "points to". a restrict variable. Maybe more specific: "the restrict pointer `%p`."
20497	Maybe: "the address of an object with at least one restrict pointer constituent."
20500	I did not understand the above wording.
20517	Nit: "by the following"
20521	"an extra number" is not helpful " related to " -> "describing"/"identifying"
20536	"related to" -> "referencing" or "scoped in"
20666	I mentioned that before, the lady of the lake says XXX is specialized here: https://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic (do we add attributes here? if so, we need attributes here, e.g., `nocapture`).
20675	I dislike the loop body and alloca sentence because they do not convey information. To be honest, I don't know what the second sentence is saying. What I would prefer is to say something like: "The intrinsic identifies the scope of the restrict pointer through a virtual side-effect that ensures the control dependences are preserved. This virtual side-effect will also keep allocations alive and explicit." (I guessed what you want to say wrt. alloca.) Allocas and mallocs should not be treated differently anyway: we have a heap-2-stack transformation now, and for the purpose of this discussion there should not be a difference.
20679	The above reads funny, maybe: "The returned value is a handle to track dependences on the declaration. There is no explicit relationship to the value of the arguments." Also, why do we want an `i8` then? We have `tokens` and we have `i32`, I'd prefer either over an `i8` which is more confusing in this context full of `i8*` that are actually pointers (IMHO).
20688	I would love to remove this duplication, is that possible? Why do we need to talk about `alloca` and "optimized away"? Can't we say: "The first argument `%p.alloca` points to an object in memory with one or more restrict pointers constituents or `null`."
20692	Is it a list with one element or a list with entries that have one element each? What I read earlier sounded different from what I read here.
20701	Copy and paste from somewhere above. I'd avoid duplication if possible in favor of references.
20707	The above sentence is broken somewhere (I think). Maybe make it two.

Thanks for all the feedback ! I added some explanations.

llvm/docs/LangRef.rst
20432	That's true, but imho the full intrinsic names become very long, cluttering the display. For clarity I replaced the type encodings with XXX. This makes it easier to focus on the intrinsics and the actual arguments. I agree that this is not perfect.
20486	The original idea was to treat '%p.addr' sometimes as a pointer to an object and sometimes as an offset. Later it needed to be separated: SROA first splits alloca's into multiple smaller alloca's. Each separate restrict pointer now points to its own alloca (%p.addr), and there is no place to put the offset. You can differentiate by splitting the p.scope, but that would imply duplicating scopes all over the place. The p.objId serves as a convenient and less costly solution to differentiate the pointers in this case.
20489	Hmm. Not sure how to explain it further. What I want to say is (shown with an example):
  int* restrict A;      // one !p.scope, one restrict pointer
  int* restrict B[10];  // another (single) !p.scope, ten restrict pointers
  struct FOO { int* restrict mA; int* mB; int* restrict mC; } C;  // yet another !p.scope, 2 restrict pointers
20679	I think a token has too many restrictions (no PHI, no select). i32 might do. I didn't think too much about it and just settled on i8*.

jdoerfert added inline comments.Oct 7 2019, 4:11 PM

llvm/docs/LangRef.rst
20486	So `objId` is an offset into `p.addr`? If so, let's document it that way. How does this work if there are multiple restrict pointers in the object, e.g. `struct { restrict a; restrict b }`? Maybe it would help if you point me towards the place where I can see this intrinsic in action. At least then I might be able to provide better feedback on the wording.
20489	In that example, what do the `p.scopes` look like? Or, asked differently, is the `p.scope` a consequence of the declaration, and hence does it uniquely identify a declaration?
20679	If the token is too restrictive I'd still prefer an i32 (or similar) to avoid confusion with all the i8 pointers that fly around. The wording will then make it clear that these are tokens.

Here is an example test.c:

struct FOO {
  int* restrict pA;
  int* pB;
  int* restrict pC;
};

void bar(int* a, int* b, int* c) {
  struct FOO tmp = { a, b, c };
  *tmp.pA=42;
  *tmp.pB=43;
  *tmp.pC=44;
}

Compiled as:

clang -mllvm --print-before-all -mllvm -debug -emit-llvm -O2 test.c -S -o -

Before SROA:

%tmp = alloca %struct.FOO, align 8
...
%1 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
...
%pA1 = getelementptr inbounds %struct.FOO, %struct.FOO* %tmp, i32 0, i32 0
%5 = load i32*, i32** %pA1, align 8, !tbaa !9, !noalias !6
%6 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %5, i8* %1, i32** %pA1, i64 0, metadata !6), !tbaa !9, !noalias !6
...
%pC3 = getelementptr inbounds %struct.FOO, %struct.FOO* %tmp, i32 0, i32 2
%8 = load i32*, i32** %pC3, align 8, !tbaa !12, !noalias !6
%9 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %8, i8* %1, i32** %pC3, i64 0, metadata !6), !tbaa !12, !noalias !6

During SROA: Notice how llvm.noalias.decl and llvm.noalias are split, using 0, 8 and 16 for the p.objId:

...
Rewriting alloca partition [0,8) to:   %tmp.sroa.0 = alloca i32*
Found llvm.noalias.decl:   %1 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
New   llvm.noalias.decl:   %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6)
 [...]
  rewriting [0,8) slice #2
    original:   %7 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %2, i32** %pA1, i64 0, metadata !6), !tbaa !9, !noalias !6
          to:   %7 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %1, i32** %tmp.sroa.0, i64 0, metadata !6), !tbaa !9, !noalias !6
 [...]
  rewriting split [0,24) slice #4 (splittable)
    original:   %2 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
          to:   %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6)
 [...]
Rewriting alloca partition [8,16) to:   %tmp.sroa.6 = alloca i32*
Found llvm.noalias.decl:   %2 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
New   llvm.noalias.decl:   %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !6)
  [...]
  rewriting split [0,24) slice #4 (splittable)
    original:   %3 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
          to:   %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !6)
  [...]
  rewriting [8,16) slice #7
    original:   %9 = load i32*, i32** %pB2, align 8, !tbaa !11, !noalias !6
          to:   %tmp.sroa.6.8. = load i32*, i32** %tmp.sroa.6, !tbaa !11, !noalias !6
Rewriting alloca partition [16,24) to:   %tmp.sroa.8 = alloca i32*
Found llvm.noalias.decl:   %3 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
New   llvm.noalias.decl:   %3 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !6)
  [...]
  rewriting split [0,24) slice #4 (splittable)
    original:   %4 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
          to:   %3 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !6)
  [...]
  rewriting [16,24) slice #10
    original:   %12 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %4, i32** %pC3, i64 0, metadata !6), !tbaa !12, !noalias !6
          to:   %12 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %3, i32** %tmp.sroa.8, i64 16, metadata !6), !tbaa !12, !noalias !6
  Speculating PHIs
  Speculating Selects
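
The p.objId values 0, 8 and 16 in the log above line up with the alloca partitions [0,8), [8,16) and [16,24), which in turn look like the byte offsets of pA, pB and pC inside struct FOO on a target with 8-byte pointers. A minimal C sketch to check those offsets follows; `foo_member_offset` is a hypothetical helper for illustration only, not something from the patch series:

```c
#include <stddef.h>

/* Same layout as struct FOO in test.c. */
struct FOO {
  int *restrict pA;
  int *pB;
  int *restrict pC;
};

/* Hypothetical helper (not part of the patch): byte offset of the
   member with the given index, or (size_t)-1 for an unknown index. */
static size_t foo_member_offset(int index) {
  switch (index) {
  case 0: return offsetof(struct FOO, pA);
  case 1: return offsetof(struct FOO, pB);
  case 2: return offsetof(struct FOO, pC);
  default: return (size_t)-1;
  }
}
```

When sizeof(int *) is 8, indices 0, 1 and 2 give 0, 8 and 16. Only the entries for pA and pC correspond to restrict pointers; the decl for offset 8 is the one noted as unused further down.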

Then later on:

Promoting allocas with mem2reg...
Zeoring noalias.decl dep:   %0 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6)
Zeroing operand 2 of   %3 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %0, i32** %tmp.sroa.0, i64 0, metadata !6), !tbaa !9, !noalias !6
[...]
Zeoring noalias.decl dep:   %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !2)
Zeroing operand 2 of   %4 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %2, i32** %tmp.sroa.8, i64 16, metadata !2), !tbaa !10, !noalias !2
[...]
Zeoring noalias.decl dep:   %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !2)
[...]

(aargh, 'Zeoring' should of course be 'Zeroing' ;) )

After this pass, we get:

define dso_local void @bar(i32* %a, i32* %b, i32* %c) #0 {
entry:
  %0 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !2)
  %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 8, metadata !2)  ; This one will be removed later on, as it is not used anywhere.
  %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 16, metadata !2)
  %3 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %a, i8* %0, i32** null, i64 0, metadata !2), !tbaa !5, !noalias !2
  store i32 42, i32* %3, align 4, !tbaa !10, !noalias !2
  store i32 43, i32* %b, align 4, !tbaa !10, !noalias !2
  %4 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %c, i8* %2, i32** null, i64 16, metadata !2), !tbaa !12, !noalias !2
  store i32 44, i32* %4, align 4, !tbaa !10, !noalias !2
  ret void
}
[...]
!2 = !{!3} ; p.scope: 'tmp' in function 'bar', also recycled by !noalias, as it is the only restrict declaration
!3 = distinct !{!3, !4, !"bar: tmp"}
!4 = distinct !{!4, !"bar"}

llvm/docs/LangRef.rst
20486	This is the confusing part for me for the LangRef vs the usage: should the LangRef describe only the high level effect, or can it also describe how llvm treats/optimizes stuff internally? I somehow have the feeling that we might want to have a separate restrict handling document, describing how the intrinsics and metadata work together. Or do you think such a thing also belongs in the LangRef?
20489	Yes, the p.scope is a result of the declaration and uniquely identifies one.
20679	ok. We can consider that.

nickdesaulniers added a subscriber: nickdesaulniers.Oct 8 2019, 11:53 PM

jrmuizel added a subscriber: jrmuizel.Oct 9 2019, 9:48 AM

a.elovikov added a subscriber: a.elovikov.Oct 9 2019, 4:36 PM

a.elovikov added inline comments.

llvm/docs/LangRef.rst
20424	I find it strange to see %side.p on both left and right sides. Is it a typo or does it have some special meaning? After reading till the intrinsics' description I believe it should be just "%p" on the right side.
20758	the `noalias_sidechannel` path Not sure about terminology, but are `@llvm.noalias.arg.guard`/`@llvm.noalias.copy.guard` considered as `noalias_sidechannel`? I'd suggest not to use the spelling from the load/store instructions and have a more general `moved onto the "side" path` (if my understanding is correct here).
20798	No explicit "or null" here. Is that intentional?

jeroen.dobbelaere marked 3 inline comments as done.Oct 10 2019, 1:34 PM

jeroen.dobbelaere added inline comments.

llvm/docs/LangRef.rst
20424	yes, that's a typo. the second %side.p should be %p: %side.p = i8* @llvm.side.noalias.XXX(i8* %p, ...)
20758	The @llvm.noalias.arg.guard combines the normal path with the noalias_sidechannel path. The @llvm.noalias.copy.guard resides on the normal path and adds extra information to a copy operation (memcpy, load/store). I tried to be consistent in terminology when referring to the 'noalias_sidechannel' path. (but I could also use the 'noalias side channel' or something similar).
20798	It can be 'null'

TijmenW added a subscriber: TijmenW.Oct 10 2019, 1:41 PM

a.elovikov added inline comments.Oct 10 2019, 1:46 PM

llvm/docs/LangRef.rst
20758	How about this: It will be transformed into a `llvm.side.noalias` intrinsic and moved onto the `noalias_sidechannel` path for loads/stores and fed into the @llvm.noalias.arg.guard/@llvm.noalias.copy.guard intrinsics for function boundaries/copies respectively.

jeroen.dobbelaere marked an inline comment as done.Oct 10 2019, 1:53 PM

jeroen.dobbelaere added inline comments.

llvm/docs/LangRef.rst
20758	... and fed into the @llvm.noalias.arg.guard intrinsics for function boundaries. (The @llvm.noalias.copy.guard is generated by the clang frontend)

vchuravy added a subscriber: vchuravy.Oct 17 2019, 10:44 AM

izik1 added a subscriber: izik1.Oct 18 2019, 11:39 AM

vtjnash added a subscriber: vtjnash.Oct 27 2019, 8:11 PM

This is a rebase based on 82d3ba87d06f9e2abc6e27d8799587d433c56630

jeroen.dobbelaere mentioned this in D69542: Full Restrict Support - single patch.Oct 28 2019, 6:02 PM

aqjune added a subscriber: aqjune.Oct 30 2019, 7:52 PM

uenoku added a subscriber: uenoku.Nov 2 2019, 6:48 AM

simoll added a subscriber: simoll.Nov 5 2019, 6:29 AM

chriselrod added a subscriber: chriselrod.Dec 3 2019, 2:51 AM

CryZe added a subscriber: CryZe.Feb 28 2020, 7:46 AM

Dushistov awarded a token.Feb 28 2020, 7:49 AM

JonChesterfield added a subscriber: JonChesterfield.Feb 28 2020, 3:23 PM

jeroen.dobbelaere mentioned this in D75285: Mark restrict pointer or reference to const as invariant.Mar 3 2020, 4:51 AM

jeroen.dobbelaere mentioned this in D74935: [LangRef][AliasAnalysis] Clarify `noalias` affects only modified objects.Mar 4 2020, 3:08 AM

jeroen.dobbelaere added a subscriber: alexey.zhikhar.Mar 6 2020, 2:31 AM

jeroen.dobbelaere edited the summary of this revision.Mar 6 2020, 2:52 AM

samgreen added a subscriber: samgreen.Apr 8 2020, 12:11 PM

damageboy added a subscriber: damageboy.Apr 15 2020, 12:17 AM

chrisjackson added a subscriber: chrisjackson.May 14 2020, 3:29 PM

alex added a subscriber: alex.May 27 2020, 9:49 PM

Note: I am working on an updated version of the patches, rebased to a more recent version of the tree; including some bug fixes and taking into account the rename of noalias_sidechannel to ptr_provenance etc.

jeroen.dobbelaere updated this revision to Diff 270414.Jun 12 2020, 9:04 AM

jeroen.dobbelaere retitled this revision from [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. to [PATCH 01/26] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation..

jeroen.dobbelaere edited the summary of this revision.

If I'm understanding correctly, llvm.noalias.arg.guard(p, q) is equivalent to getelementptr q, p-q? And load i8, i8* %p, i8* %p_prov is equivalent to load(llvm.noalias.arg.guard(p, p_prov))? And llvm.provenance.noalias([...], %p.addr, <type>** %prov.p.addr, [...] is equivalent to llvm.noalias([...], llvm.noalias.arg.guard(p, p_prov), [...])? And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

So really, these are the new concepts in the IR:

llvm.noalias.decl: this introduces a new "scope" for aliasing.
llvm.noalias: this associates a pointer with the llvm.noalias.decl.

And the rest can be expressed in terms of those intrinsics and basic IR instructions.

It took me a long time to parse this out; I think the description here needs to be reorganized. It really needs to separate out the semantic core from the detailed dive into the various intrinsics. Maybe into five sections: how noalias scopes work, how separating provenance from pointer values works, a high-level description of the intrinsics, the suggested lowering of the C "restrict", and the detailed description of the individual intrinsics.

Before optimizations, there is the declaration of the restrict pointer and `llvm.noalias` is used whenever the value of the restrict pointer is read.

Maybe explain why you're suggesting this, as opposed to using llvm.noalias when the value is written. (I guess it has something to do with the C standard's definition of "based on"?)
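
For readers less familiar with the C rule referred to here, a small sketch of the "based on" relation may help. It is plain C and says nothing about how the patches lower it; `store_through_restrict` is a hypothetical name used only for illustration. Every pointer that is based on a restrict pointer gets its value by reading that restrict pointer, which is one way to see why a read is a natural place to attach the noalias information:

```c
/* Hypothetical example function, not taken from the patch. */
static int store_through_restrict(int *restrict p, int *out) {
  int *q = p + 1; /* reads p, so q is "based on" p */
  *q = 5;         /* the object is modified through a pointer based on p */
  *out = 7;       /* out is not based on p, so it must not alias *q */
  return *q;      /* a compiler may therefore fold this to 5 */
}
```

Calling it with `out` pointing at the second element of the array passed as `p` would be undefined behavior, which is exactly the assumption the optimizer is allowed to exploit.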

jeroen.dobbelaere edited child revisions, added: D68487: [PATCH 02/27] [noalias] Introduce family of noalias intrinsics.; removed: D68485: [PATCH 02/38] [noalias] D9375: An llvm.noalias intrinsic.Jun 12 2020, 11:30 AM

jeroen.dobbelaere edited the summary of this revision.Jun 12 2020, 11:54 AM

In D68484#2090338, @efriedma wrote:

If I'm understanding correctly, llvm.noalias.arg.guard(p, q) is equivalent to getelementptr q, p-q?

This is not correct: the 'llvm.noalias.arg.guard(p,ptr_provenance)' combines the 'value of the pointer' (p) with the 'provenance of the pointer' (ptr_provenance).
The ptr_provenance does not have a real 'value'. It is more like a dependency.

When you follow both, they should come together at some point, like at the input argument of a function :

the ptr_provenance purpose is to track the llvm.provenance.noalias information (and its dependencies). Normally there are no computations on this path.
the normal 'p' path, should in the best case only contain computations.

Due to inlining, it is possible that somewhere in the flow, the normal 'p' path also contains noalias information. The propagation pass should flatten that out.

And load i8, i8* %p, i8* %p_prov is equivalent to load(llvm.noalias.arg.guard(p, p_prov))?

Yes, this is correct. The ptr_provenance path was added explicitly to the load/store instructions, in order to get the llvm.noalias.arg.guard out of the way of most optimizations.
This makes it much easier to keep the code correct in the presence of optimizations.

And llvm.provenance.noalias([...], %p.addr, <type>** %prov.p.addr, [...] is equivalent to llvm.noalias([...], llvm.noalias.arg.guard(p, p_prov), [...])?

Yes. llvm.provenance.noalias and llvm.noalias are equivalent. The former does track more information, as it is itself also treated like a 'memory instruction', so that llvm.noalias.arg.guard is not needed.

And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

No.

llvm.noalias.copy.guard tells that the pointer it returns has restrict pointers as specified by the struct indices (encoded in the metadata value).

So really, these are the new concepts in the IR:

llvm.noalias.decl: this introduces a new "scope" for aliasing.

llvm.noalias: this associates a pointer with the llvm.noalias.decl.

llvm.noalias.arg.guard: combines a pointer (computation) path with a ptr_provenance path
llvm.noalias.copy.guard: indicates on what indices in memory a restrict pointer is located

And the rest can be expressed in terms of those intrinsics and basic IR instructions.

Yes.
llvm.provenance.noalias was introduced as a 'safeguard', to make it clear that it always must be on the 'ptr_provenance' operand side.
The ptr_provenance operand was introduced to keep the information out of the way of most optimization passes.

It took me a long time to parse this out; I think the description here needs to be reorganized. It really needs to separate out the semantic core from the detailed dive into the various intrinsics. Maybe into five sections: how noalias scopes work, how separating provenance from pointer values works, a high-level description of the intrinsics, the suggested lowering of the C "restrict", and the detailed description of the individual intrinsics.

Yes, that makes sense. I am in the process of putting all of this in a separate document, but I didn't want to wait to get the updated patches out ;)
I hope to update this 01/26 patch early next week with the next iteration of the documentation. This is already very useful input for it !

Before optimizations, there is the declaration of the restrict pointer and `llvm.noalias` is used whenever the value of the restrict pointer is read.

Maybe explain why you're suggesting this, as opposed to using llvm.noalias when the value is written. (I guess it has something to do with the C standard's definition of "based on"?)

Yes. I hope that the updated documentation will make this easier to understand.

Thanks !

Jeroen Dobbelaere

jeroen.dobbelaere edited the summary of this revision.Jun 12 2020, 12:23 PM

In D68484#2090569, @jeroen.dobbelaere wrote:

In D68484#2090338, @efriedma wrote:

If I'm understanding correctly, llvm.noalias.arg.guard(p, q) is equivalent to getelementptr q, p-q?

This is not correct: the 'llvm.noalias.arg.guard(p,ptr_provenance)' combines the 'value of the pointer' (p) with the 'provenance of the pointer' (ptr_provenance).
The ptr_provenance does not have a real 'value'. It is more like a dependency.

When you follow both, they should come together at some point, like at the input argument of a function :

the ptr_provenance purpose is to track the llvm.provenance.noalias information (and its dependencies). Normally there are no computations on this path.

the normal 'p' path, should in the best case only contain computations.

Due to inlining, it is possible that somewhere in the flow, the normal 'p' path also contains noalias information. The propagation pass should flatten that out.

getelementptr q, (ptrtoint(p)-ptrtoint(q)) should return a pointer with provenance of q, and the value of p. (http://llvm.org/docs/LangRef.html#pointer-aliasing-rules). I can't see how it isn't equivalent... unless noalias provenance is somehow different from the usual aliasing rules.
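
As a rough C analogue of the expression above (a sketch only: C pointer arithmetic is not LLVM IR, and `rebase_pointer` is a hypothetical name), the result is computed from q but ends up with the address of p:

```c
#include <stdint.h>

/* Hypothetical helper: returns a pointer derived from q whose address
   equals that of p. Both must point into the same array for the
   pointer arithmetic to be well defined in C. */
static char *rebase_pointer(char *q, char *p) {
  /* Byte distance from q to p, computed on the integer values. */
  intptr_t delta = (intptr_t)((uintptr_t)p - (uintptr_t)q);
  return q + delta;
}
```

The value matches p, while the derivation chain goes through q; that split between "value" and "where it came from" is the distinction the question is probing.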

Does the presence of provenance markings fix https://bugs.llvm.org/show_bug.cgi?id=35229 ?

And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

No.

llvm.noalias.copy.guard tells that the pointer it returns has restrict pointers as specified by the struct indices (encoded in the metadata value).

Oh, I see, it only applies the provenance to loads derived from that pointer, not all loads from the memory.

efriedma added a subscriber: Meinersbur.Jun 12 2020, 2:10 PM

In D68484#2090724, @efriedma wrote:

In D68484#2090569, @jeroen.dobbelaere wrote:

getelementptr q, (ptrtoint(p)-ptrtoint(q)) should return a pointer with provenance of q, and the value of p. (http://llvm.org/docs/LangRef.html#pointer-aliasing-rules). I can't see how it isn't equivalent... unless noalias provenance is somehow different from the usual aliasing rules.

I see now. It is indeed somewhat equivalent. The separate intrinsic makes it easier to convey the specific purpose of the construct and to control the kind of optimizations that we want to allow.
A generalized version of the 'llvm.noalias.arg.guard', maybe something like 'llvm.ptr.provenance %pValue, %pProv1 [, %pProv_i]*', could convey the same information, and could help with fixing the bug you mention.
But, this is not the goal of the full restrict patches, and I would rather start with the current focused set of intrinsics, before trying to expand on it.

Does the presence of provenance markings fix https://bugs.llvm.org/show_bug.cgi?id=35229 ?

No, that problem is not fixed with the full restrict patches.

And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

No.

llvm.noalias.copy.guard tells that the pointer it returns has restrict pointers as specified by the struct indices (encoded in the metadata value).

Oh, I see, it only applies the provenance to loads derived from that pointer, not all loads from the memory.

yes.

MSxDOS added a subscriber: MSxDOS.Jun 15 2020, 10:49 PM

I see now. It is indeed somewhat equivalent. The separate intrinsic makes it easier to convey the specific purpose of the construct and to control the kind of optimizations that we want to allow.

Sure, I wasn't suggesting that you'd want to actually use the getelementptr version, just trying to understand the intended meaning.

A generalized version of the 'llvm.noalias.arg.guard', maybe something like 'llvm.ptr.provenance %pValue, %pProv1 [, %pProv_i]*', could convey the same information, and could help with fixing the bug you mention.

Is there some semantic difference between llvm.noalias.arg.guard and something like llvm.ptr.provenance? Or is it just a difference in the intended use?

In D68484#2097013, @efriedma wrote:

A generalized version of the 'llvm.noalias.arg.guard', maybe something like 'llvm.ptr.provenance %pValue, %pProv1 [, %pProv_i]*', could convey the same information, and could help with fixing the bug you mention.

Is there some semantic difference between llvm.noalias.arg.guard and something like llvm.ptr.provenance? Or is it just a difference in the intended use?

The llvm.noalias.arg.guard is intended to only track noalias dependencies. The llvm.ptr.provenance could be used to track provenance in a more general way (like pointing to the original alloca).

Initial version of 'NoAliasInfo.rst', describing the noalias intrinsics infrastructure.

Notes:

In a future version, 'llvm.noalias' and 'llvm.provenance.noalias' will be merged into a single intrinsic.
Any feedback is welcome!

Matt added a subscriber: Matt.Jun 29 2020, 12:41 PM

jeroen.dobbelaere retitled this revision from [PATCH 01/26] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. to [PATCH 01/26] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation..Jul 7 2020, 3:14 AM

jeroen.dobbelaere edited the summary of this revision. (Show Details)

• jyn514 added a subscriber: • jyn514.Jul 8 2020, 8:43 AM

jeroen.dobbelaere edited the summary of this revision. (Show Details)Jul 10 2020, 9:13 AM

Herald added a subscriber: kosarev. · View Herald TranscriptJul 10 2020, 9:13 AM

In D68484#2116935, @jeroen.dobbelaere wrote:

Notes:

in a future version 'llvm.noalias' and 'llvm.provenance.noalias' will be merged into a single intrinsic.

I was thinking of merging llvm.noalias and llvm.provenance.noalias, but have now decided not to do it:

llvm.noalias is a convenience shortcut to llvm.provenance.noalias + llvm.noalias.arg.guard
keeping the convenience intrinsic reduces the amount of generated code and makes tracking tbaa on the intrinsics easier.

Updated NoAliasInfo.rst to explain the relationship between @llvm.noalias and @llvm.provenance.noalias, @llvm.noalias.arg.guard

Rebased to c06b7e2ab5167ad031745a706204abed1aefd823 (July 14, 2020)

MaskRay added a subscriber: MaskRay.Aug 17 2020, 10:57 AM

Rebased to 9fb46a452d4e5666828c95610ceac8dcd9e4ce16 (September 7, 2020)

Hmm. If anybody knows how to hide the inline comments from an older revision..

jeroen.dobbelaere retitled this revision from [PATCH 01/26] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation. to [PATCH 01/27] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation..Sep 7 2020, 2:37 PM

jdoerfert added subscribers: fhahn, arsenm.Sep 8 2020, 10:36 AM

jordypotman added a subscriber: jordypotman.Sep 9 2020, 1:27 AM

The effect of the patches on the compile time can be found here: https://llvm-compile-time-tracker.com/index.php?branch=dobbelaj-snps/perf/full_restrict-20200907
For some of the regressions, I already have some ideas on how to reduce the impact. I propose to have the discussion in the respective patches.

yaxunl added a subscriber: yaxunl.Sep 15 2020, 12:48 PM

ping

Any feedback on this patch?

Note: On some architectures, you might want to use -mllvm -enable-aa-sched-mi to make use of alias information when scheduling the machine instructions.

ppenzin added a subscriber: ppenzin.Oct 14 2020, 6:20 PM

jeroen.dobbelaere mentioned this in D90104: [LoopUnroll] Duplicate noalias metadata.Oct 26 2020, 1:01 AM

As promised, I've started testing this patch set with Rust. Unfortunately, I quickly ran into an assertion failure on the following reduced test case:

%0 = type { i32 }
%1 = type { i32 }

define internal void @foo0(%0* noalias %ptr) {
    store %0 zeroinitializer, %0* %ptr
    ret void
}

define internal void @foo1(%1* noalias %ptr) {
    store %1 zeroinitializer, %1* %ptr
    ret void
}

define void @bar(%0* %ptr0, %1* %ptr1) {
    call void @foo0(%0* noalias %ptr0)
    call void @foo1(%1* noalias %ptr1)
    ret void
}

Run opt -inline:

opt: /home/nikic/rust/src/llvm-project/llvm/include/llvm/Support/Casting.h:269: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = llvm::Function; Y = llvm::Value; typename llvm::cast_retty<X, Y*>::ret_type = llvm::Function*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build/x86_64-unknown-linux-gnu/llvm/bin/opt -S -inline 
1.	Running pass 'CallGraph Pass Manager' on module '<stdin>'.
 #0 0x0000557429562c40 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x28a8c40)
 #1 0x00005574295608e4 llvm::sys::RunSignalHandlers() (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x28a68e4)
 #2 0x0000557429560a28 SignalHandler(int) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x28a6a28)
 #3 0x00007fcfc4eb43c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
 #4 0x00007fcfc498418b raise /build/glibc-ZN95T4/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #5 0x00007fcfc4963859 abort /build/glibc-ZN95T4/glibc-2.31/stdlib/abort.c:81:7
 #6 0x00007fcfc4963729 get_sysdep_segment_value /build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:509:8
 #7 0x00007fcfc4963729 _nl_load_domain /build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:970:34
 #8 0x00007fcfc4974f36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
 #9 0x0000557428c60829 (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x1fa6829)
#10 0x0000557428c6b554 llvm::IRBuilderBase::CreateNoAliasDeclaration(llvm::Value*, llvm::Value*, llvm::Value*) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x1fb1554)
#11 0x00005574295f3783 AddNoAliasIntrinsics(llvm::CallBase&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::MDNode*&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x2939783)
#12 0x00005574295f4699 llvm::InlineFunction(llvm::CallBase&, llvm::InlineFunctionInfo&, llvm::AAResults*, bool, llvm::Function*) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x293a699)
#13 0x0000557428e49c48 llvm::LegacyInlinerBase::inlineCalls(llvm::CallGraphSCC&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x218fc48)
#14 0x00005574283ac72e (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x16f272e)
#15 0x0000557428cc1503 llvm::legacy::PassManagerImpl::run(llvm::Module&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x2007503)
#16 0x000055742734a7e2 main (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x6907e2)
#17 0x00007fcfc49650b3 __libc_start_main /build/glibc-ZN95T4/glibc-2.31/csu/../csu/libc-start.c:342:3
#18 0x00005574273e4b5e _start (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x72ab5e)

I believe this is a known problem with the name mangling for pointers to anonymous types. I think @fhahn may know more about this, IIRC this came up as a problem with PredicateInfo as well.

jeroen.dobbelaere mentioned this in D91250: Support intrinsic overloading on unnamed types.Nov 11 2020, 4:20 AM

In D68484#2372356, @nikic wrote:

As promised, I've started testing this patch set in rust. Unfortunately I quickly ran into an assertion failure on the following reduced test case:

Thank you for trying this out! D91250 should resolve that problem. Can you try again with that patch applied?

Thanks!

Jeroen Dobbelaere

A few comments inline.

llvm/docs/LangRef.rst
20421	Should this be: `%prov.p = i8* @llvm.provenance.noalias.XXX(i8* %p, i8* %p.decl,` ? Or could you clarify the `%prov.p` on both sides?
llvm/docs/NoAliasInfo.rst
267	This seems like it may be useful to see a fragment/sample of the metadata declarations here? Similar for the `OutOfLoop` snippet.
363	IIUC, this example looks like a good place to clarify the distinction between `p.alloca` and `p.addr` in the LangRef description, along the lines of "in this example the address is the alloca, but in general they may refer to different locations; the alloca could be a struct and the address could point to a member of the struct".
374	Expand with explanation on what does not alias in the example.

jeroen.dobbelaere mentioned this in D92887: [LoopUnroll] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed.Dec 8 2020, 2:46 PM

jryans added a subscriber: jryans.Dec 12 2020, 8:15 AM

PaulGrandperrin added a subscriber: PaulGrandperrin.Jan 4 2021, 6:27 AM

penzn added a subscriber: penzn.Jan 5 2021, 10:25 AM

troyj added a subscriber: troyj.Jan 22 2021, 8:16 AM

jeroen.dobbelaere mentioned this in rG774629641bf3: [LoopUnroll] Use llvm.experimental.noalias.scope.decl for duplicating noalias….Jan 24 2021, 4:49 AM

slanterns added a subscriber: slanterns.Jan 24 2021, 5:41 AM

Note for those that have not been following the LLVM AA Technical Calls: we have introduced part of the infrastructure needed for full restrict by focusing on fixing https://bugs.llvm.org/show_bug.cgi?id=39282 (See D93039, D93040, D92887, D94306).

Next steps involve looking at providing the ptr_provenance infrastructure.

I will also try to provide an update of at least the single patch (D69542) in the coming weeks.

jeroen.dobbelaere mentioned this in rG04790d9cfba3: Support intrinsic overloading on unnamed types.Mar 19 2021, 6:35 AM

jeroen.dobbelaere updated this revision to Diff 346108.May 18 2021, 4:22 AM

jeroen.dobbelaere edited the summary of this revision. (Show Details)

Rebased to f8dbd61074176bae92ec360a093ac7bc498c9321.

Harbormaster completed remote builds in B104984: Diff 346108.May 18 2021, 5:31 AM

ychen added a subscriber: ychen.Jun 13 2021, 2:41 PM

jeroen.dobbelaere mentioned this in D104268: [ptr_provenance] Introduce optional ptr_provenance operand to load/store.Jun 14 2021, 2:53 PM

jeroen.dobbelaere mentioned this in D105805: [NFC] Do not track calls to inlined intrinsics in IFI..Jul 12 2021, 4:02 AM

jeroen.dobbelaere mentioned this in rG1d8030053d46: [NFC] Do not track calls to inlined intrinsics in IFI..Jul 13 2021, 1:36 AM

jeroen.dobbelaere mentioned this in D111160: [UnknownProvenance] Add LLVM-IR support for unknown_provenance.Nov 5 2021, 4:43 AM

I updated and rebased the convenience patch to 8924ba3bf8c6b0e8d14dff455e4e449a426a2700 (November 17, 2021) (See D69542)

anemet added a subscriber: anemet.Jan 21 2022, 9:39 PM

Let me make a proposal for a simpler IR:

declare i8* @llvm.noalias(i8*) noalias argmemonly
declare void @llvm.noalias.end(i8*, i8*) argmemonly

define @foo(i8* %ptr) {

%p2 = call i8* @llvm.noalias(i8* %ptr)

use %p2..

call @llvm.noalias.end(i8* %ptr, i8* %p2)

use %ptr..
}

Access to restrict pointer is bounded by the noalias & noalias.end calls.
Advantages:

Much simpler than the current proposal.
Perf improvements from day one: by tagging the intrinsics properly, LLVM's AA algorithms can already decide that %p2 doesn't alias anything else.
Safe from day one: no memory operations can be moved across the barriers. This is enforced by LLVM IR semantics already, no changes needed!

Further perf improvements can be made, like hoisting llvm.noalias intrinsics, teaching LLVM that these intrinsics don't actually write to memory, etc.

Essentially, I don't see a need to track provenance explicitly with metadata. It's already easily accessible. Explicit tracking adds overhead, so it has to be very well justified. Right now I don't understand the motivation.

Please let me know what you think, especially what use case wouldn't work with the proposal above. Thanks!

In D68484#3282674, @nlopes wrote:

Access to restrict pointer is bounded by the noalias & noalias.end calls.
Advantages:

Much simpler than the current proposal.

Perf improvements from day one: by tagging the intrinsics properly, LLVM's AA algorithms can already decide that %p2 doesn't alias anything else.

Safe from day one: no memory operations can be moved across the barriers. This is enforced by LLVM IR semantics already, no changes needed!

Further perf improvements can be made, like hoisting llvm.noalias intrinsics, teaching LLVM that these intrinsics don't actually write to memory, etc.

Essentially, I don't see a need to track provenance explicitly with metadata. It's already easily accessible. Explicit tracking adds overhead, so it has to be very well justified. Right now I don't understand the motivation.

Please let me know what you think, especially what use case wouldn't work with the proposal above. Thanks!

Hi @nlopes, thanks for looking into this.

I am not sure what you expect the semantics of @llvm.noalias and @llvm.noalias.end to be. Having examples of how this is supposed to work, and of how it would allow us to implement C99 restrict, would be useful.

The current implementation is what it is because of:

for a C99 restrict implementation, simpler is not necessarily 'correct' :(
the need to implement the 'based on' relationship, also in a way that clang is producing code, where this dependency is not easily seen. The same restrict pointer usage can appear in different blocks at different places. They will not be using the same '@llvm.noalias' intrinsic.
One of the aims is also to allow memory operations to be moved across (certain) barriers as much as possible. The used intrinsics should (in the end) get completely out of the way of optimizations. The original @llvm.noalias is opaque, but gets converted into a @llvm.provenance.noalias later on which is put on the ptr_provenance path for that specific reason.

I am not sure what you expect the semantics of @llvm.noalias and @llvm.noalias.end to be. Having examples of how this is supposed to work, and of how it would allow us to implement C99 restrict, would be useful.

I believe these two are sufficient:

declare i8* @llvm.noalias(i8*) noalias argmemonly
declare void @llvm.noalias.end(i8*, i8*) argmemonly

noalias creates a new object with data aliasing that of the input and with the same size. noalias.end() doesn't do anything. It's there to prevent memory operations from crossing the boundary.
These intrinsics delimit the lexical scope of the restrict variables. They are also introduced when inlining a function with noalias parameter attributes.

So:

{
  restrict *q = p;
  use(q);
}
use(p);

is represented as:

%q = call @llvm.noalias(%p)
use(%q)
call @llvm.noalias.end(%p, %q)

use(%p)

The two uses cannot cross the barriers. This is enforced by LLVM out-of-the-box. Of course one can implement transformations to widen or eliminate the barriers.
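As a point of reference for the snippet above, here is a compilable C sketch of a block-scoped restrict pointer. It is not taken from the thread. The function name `bump_in_scope` is hypothetical, and the pointee type `int` is an assumption since the snippet above omits it.

```c
#include <assert.h>
#include <stddef.h>

/* Increments *p through a block-scoped restrict pointer, then reads via p. */
static int bump_in_scope(int *p) {
    assert(p != NULL);
    {
        int *restrict q = p;  /* the restrict promise covers this block only */
        *q += 1;              /* inside the block, the object is accessed via q */
    }
    return *p;                /* after the block, p may be used again */
}
```

The closing brace is the point that the proposed @llvm.noalias.end call would mark: the access through `q` has to stay before it, and the access through `p` has to stay after it.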

the need to implement the 'based on' relationship, also in a way that clang is producing code, where this dependency is not easily seen. The same restrict pointer usage can appear in different blocks at different places.

You need to elaborate this a bit more, otherwise I don't understand what you mean.

One of the aims is also to allow memory operations to be moved across (certain) barriers as much as possible. The used intrinsics should (in the end) get completely out of the way of optimizations. The original @llvm.noalias is opaque, but gets converted into a @llvm.provenance.noalias later on which is put on the ptr_provenance path for that specific reason.

Sure, but barriers exist and cannot be removed. restrict creates a new memory block in a well-defined region. Barriers can be widened and even removed (by giving up on the restrict information). But whether you use intrinsics as barriers or some metadata doesn't matter.
Intrinsics as the ones proposed above have the advantage of being correct from day one and without missing places that need to be updated to learn about the new metadata. This is a huge plus.

the need to implement the 'based on' relationship, also in a way that clang is producing code, where this dependency is not easily seen. The same restrict pointer usage can appear in different blocks at different places.

You need to elaborate this a bit more, otherwise I don't understand what you mean.

How would you map (before _any_ optimizations; aka, what kind of code should clang produce):

// int *p, *q,  *s,  *t;
{
   int *restrict rp = p;
   int *restrict rq = q;
   int *restrict rs = s;

   int * based_on_rp1 = rp + index1;

   use(rp); 
   use(based_on_rp1); // aliases with rp
   use(rq); // only aliases with rq
   use(rs);
   int * based_on_something;
   if (some_input) {
     based_on_something = based_on_rp1;
   } else {
     based_on_something = rs;
   }
   use(based_on_something); // might alias with rs, or rp
   int * based_on_rp2 = rp + index2;
   use(based_on_rp2);  // aliases with rp
   use(rp);
   use(t); // will not alias with anything above
 }
 use(t+index3); // might alias with everything above
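To make the example above concrete, the following is a compilable rendering of it as a C function. This is a sketch under assumptions that are not in the original: `use` is given a hypothetical body that increments the pointee, the indices are `size_t` parameters, and the caller is assumed to pass four pointers into distinct arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for use(): increments the pointed-to int. */
static void use(int *ptr) { *ptr += 1; }

static void based_on_demo(int *p, int *q, int *s, int *t,
                          size_t index1, size_t index2, size_t index3,
                          int some_input) {
    assert(p && q && s && t);
    {
        int *restrict rp = p;
        int *restrict rq = q;
        int *restrict rs = s;

        int *based_on_rp1 = rp + index1;  /* based on rp */

        use(rp);
        use(based_on_rp1);                /* aliases with rp */
        use(rq);                          /* only aliases with rq */
        use(rs);

        int *based_on_something;
        if (some_input) {
            based_on_something = based_on_rp1;
        } else {
            based_on_something = rs;
        }
        use(based_on_something);          /* might alias with rs or rp */

        int *based_on_rp2 = rp + index2;  /* based on rp */
        use(based_on_rp2);
        use(rp);
        use(t);                           /* not based on rp, rq or rs */
    }
    use(t + index3);                      /* outside the restrict scope */
}
```

Note that `based_on_something` is only known to be based on `rp` or `rs` after the control flow merges, which is the kind of dependency the question above asks the IR to represent.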

One of the aims is also to allow memory operations to be moved across (certain) barriers as much as possible. The used intrinsics should (in the end) get completely out of the way of optimizations. The original @llvm.noalias is opaque, but gets converted into a @llvm.provenance.noalias later on which is put on the ptr_provenance path for that specific reason.

Sure, but barriers exist and cannot be removed. restrict creates a new memory block in a well-defined region. Barriers can be widened and even removed (by giving up on the restrict information).
But whether you use intrinsics as barriers or some metadata doesn't matter.

In some way, it does matter. Intrinsics result in real barriers that can only be left out if done explicitly. (aka, dropping them might result in wrong code).
Metadata can be dropped at will, it should not influence correctness. That is the reason why the full restrict implementation uses both: intrinsics for adding the based-on relationship; metadata for the scope.
The goal of the patches is not just to provide restrict support; the goal is to provide restrict support _and_ good optimizations making use of this knowledge.
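One well-known kind of optimization that restrict information enables can be sketched in C as follows. This is an illustrative example, not code from the patches, and the function name `add_bias` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Adds *bias to every element of dst. Because both pointers are restrict,
   a compiler may load *bias once before the loop. Without restrict it has
   to assume that a store to dst[i] could overwrite *bias. */
static void add_bias(int *restrict dst, const int *restrict bias, size_t n) {
    assert(dst != NULL && bias != NULL);
    for (size_t i = 0; i < n; ++i) {
        dst[i] += *bias;
    }
}
```

Moving the load of `*bias` out of the loop is a memory operation crossing other memory operations, which is why the discussion above cares about whether the representation acts as a barrier.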

Intrinsics as the ones proposed above have the advantage of being correct from day one and without missing places that need to be updated to learn about the new metadata. This is a huge plus.

Were you able to check the initial description (https://lists.llvm.org/pipermail/llvm-dev/2019-October/135672.html), as well as the talk I gave at the last LLVM Dev conference?
They should give a decent explanation on why the different concepts are introduced, keeping correctness and optimizations in mind.

Thanks,

Jeroen

In D68484#3283485, @jeroen.dobbelaere wrote:
the need to implement the 'based on' relationship, also in a way that clang is producing code, where this dependency is not easily seen. The same restrict pointer usage can appear in different blocks at different places.

You need to elaborate this a bit more, otherwise I don't understand what you mean.

How would you map (before _any_ optimizations; aka, what kind of code should clang produce):
// int *p, *q,  *s,  *t;
{
   int *restrict rp = p;
   int *restrict rq = q;
   int *restrict rs = s;

   int * based_on_rp1 = rp + index1;

   use(rp); 
   use(based_on_rp1); // aliases with rp
   use(rq); // only aliases with rq
   use(rs);
   int * based_on_something;
   if (some_input) {
     based_on_something = based_on_rp1;
   } else {
     based_on_something = rs;
   }
   use(based_on_something); // might alias with rs, or rp
   int * based_on_rp2 = rp + index2;
   use(based_on_rp2);  // aliases with rp
   use(rp);
   use(t); // will not alias with anything above
 }
 use(t+index3); // might alias with everything above

Thank you for the example. I don't see any complication because what I've proposed can handle these implicitly and leverage LLVM's AA reasoning as-is.
So your example would be compiled to:

%rp = call @llvm.noalias(%p)
%rq = call @llvm.noalias(%q)
%rs = call @llvm.noalias(%s)

%based_on_rp1 = gep %rp, %index1

use(%rp)
use(%based_on_rp1)
use(%rq)  ; only aliases with rq; LLVM gets it automatically
use(%rs)

if (some_input) {
  %based_on_something0 = %based_on_rp1
} else {
  %based_on_something1 = %rs
}
%based_on_something = phi(%based_on_something0, %based_on_something1)
use(%based_on_something)  ; may-alias with rs, or rp

%based_on_rp2 = gep %rp, %index2
use(%based_on_rp2)  ; aliases with rp
use(%rp)
use(%t)  ; will not alias with anything above; LLVM knows that for free

call @llvm.noalias.end(%p, %rp)
call @llvm.noalias.end(%q, %rq)
call @llvm.noalias.end(%s, %rs)

The translation is pretty straightforward AFAICT. And the aliasing properties you want to establish are given for free by the current AA.

I'm sorry I'm late to the party, which I'm sure is frustrating for you, but only recently someone called my attention to this proposal and asked me to review it.

flip1995 added a subscriber: flip1995.Feb 11 2022, 6:14 AM

awarzynski added a subscriber: awarzynski.Jun 7 2022, 1:48 AM

Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2022, 1:48 AM

vzakhari added a subscriber: vzakhari.Dec 2 2022, 4:49 PM

vzakhari added inline comments.

llvm/docs/NoAliasInfo.rst
41	typo: `associate d`
362	Please add parentheses, so that it looks like a call instruction. Same at line 365.
374	Can you please expand on the meaning of "must block optimizations"? Maybe list some (important) optimizations that are blocked and insert a link to `Optimization passes` section below for optimizations that are not blocked but must handle the scopes properly?
398	Should `and` here be `or/and`? I think it might be useful to put an example somewhere that shows what happens when a pointer provenance cannot be proven not to be based on a restrict pointer. For example: void unknown(); extern int *gp1; extern int *gp2; void test(int *restrict p1) { gp1 = p1; unknown(); for (int i = 0; i < 100; ++i) gp2[i] = p1[i] + 1; } As I understand, the store instruction inside the loop uses a non-restrict pointer (`gp2`) with unknown/missing provenance and since `p1` escapes, `gp2` might be based on `p1`. Thus, the load and the store potentially alias. It would be great to see an explanation of the logic used to compute the `MayAlias` result here.
457	Please add parentheses for the calls.
603	Can you please add some example(s)? As I understand, this describes cases with global restrict pointer and with indirect loads of restrict pointers (e.g. `int restrict restrict`).
624	Do we need to represent the aggregate type as an extra argument to make it work with opaque pointers?
721	Is it important that the unknown scope has the function scope as its "parent"? !6 = distinct !{!6, !7, !"test: unknown scope"} !7 = distinct !{!7, !"test"}
890	typo: "the the"
1019	Is this correct?
1250	`past` -> `passed`?
1384	Can you please add notes that `ScopedNoAliasAA` is the alias analysis that is using the `noalias` and `ptr_provenance` information? Do you also have an estimate of the complexity of `alias` queries with the new representation? How is it affected by the number of scopes in the scope lists attached to the load/store, by the length of the provenance chain, etc.?
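The escaping-pointer example from the comment on line 398 can be made compilable as shown below. This is a sketch under assumptions: the globals are pointers, `unknown` is given a hypothetical body that derives `gp2` from the escaped pointer, the function is renamed `test_escape`, and the loop bound becomes a parameter `n`.

```c
#include <assert.h>
#include <stddef.h>

static int *gp1;
static int *gp2;

/* Hypothetical body for unknown(): derives gp2 from the escaped pointer. */
static void unknown(void) { gp2 = gp1; }

static void test_escape(int *restrict p1, size_t n) {
    assert(p1 != NULL);
    gp1 = p1;    /* p1 escapes into a global */
    unknown();   /* an opaque call may derive gp2 from gp1 */
    for (size_t i = 0; i < n; ++i) {
        /* The store through gp2 may hit the memory that is read via p1. */
        gp2[i] = p1[i] + 1;
    }
}
```

With this definition of `unknown`, the store through `gp2` writes the very elements read through `p1`, so an analysis that cannot see into `unknown` has to treat the load and the store as possibly aliasing.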

vzakhari added inline comments.Dec 2 2022, 6:40 PM

llvm/docs/NoAliasInfo.rst
1367	Does the `noalias` attribute become redundant for arguments, since `clang` homes it with `llvm.noalias.decl` and loads from it with `llvm.noalias`?

h-vetinari added a subscriber: h-vetinari.May 23 2023, 4:14 PM

Revision Contents

Path                         Size
llvm/docs/LangRef.rst        330 lines
llvm/docs/NoAliasInfo.rst    1399 lines
llvm/docs/UserGuides.rst     5 lines

Diff 346108

llvm/docs/LangRef.rst

[5,892 unchanged lines not shown]
  This describes a struct with two fields. The first is at offset 0 bytes
  with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
  and has size 4 bytes and has tbaa tag !2.

  Note that the fields need not be contiguous. In this example, there is a
  4 byte gap between the two fields. This gap represents padding which
  does not carry useful data and need not be preserved.

+ .. _noalias_and_aliasscope:

  '``noalias``' and '``alias.scope``' Metadata
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  ``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
  noalias memory-access sets. This means that some collection of memory access
  instructions (loads, stores, memory-accessing calls, etc.) that carry
  ``noalias`` metadata can specifically be specified not to alias with some other
  collection of memory access instructions that carry ``alias.scope`` metadata.
[3,715 unchanged lines not shown]
  '``load``' Instruction
  ^^^^^^^^^^^^^^^^^^^^^^

  Syntax:
  """""""

  ::

- <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
- <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
+ <result> = load [volatile] <ty>, <ty>* <pointer>[, ptr_provenance <ty>* <channel>][,align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
+ <result> = load atomic [volatile] <ty>, <ty>* <pointer>[, ptr_provenance <ty>* <channel>] [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
  !<nontemp_node> = !{ i32 1 }
  !<empty_node> = !{}
  !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
  !<align_node> = !{ i64 <value_alignment> }

  Overview:
  """""""""
[15 unchanged lines not shown]
  Atomic loads produce :ref:`defined <memmodel>` results when they may see
  multiple atomic stores. The type of the pointee must be an integer, pointer, or
  floating-point type whose bit width is a power of two greater than or equal to
  eight and less than or equal to a target-specific size limit. ``align`` must be
  explicitly specified on atomic loads, and the load has undefined behavior if the
  alignment is not set to a value which is at least the size in bytes of the
  pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.

+ The optional ``ptr_provenance`` argument specifies the noalias chain of the
+ pointer operand. It has the same type as the pointer operand. Together with the
+ ``!noalias`` metadata on the instruction, and the ``llvm.provenance.noalias``,
+ ``llvm.noalias.arg.guard`` intrinsics in the chain, this is used to deduce if
+ two load/store instructions may or may not alias. (See `Scoped NoAlias Related
+ Intrinsics`_)

  The optional constant ``align`` argument specifies the alignment of the
  operation (that is, the alignment of the memory address). A value of 0
  or an omitted ``align`` argument means that the operation has the ABI
  alignment for the target. It is the responsibility of the code emitter
  to ensure that the alignment information is correct. Overestimating the
  alignment results in undefined behavior. Underestimating the alignment
  may produce less efficient code. An alignment of 1 is always safe. The
  maximum possible alignment is ``1 << 29``. An alignment value higher
[86 unchanged lines not shown]
  '``store``' Instruction
  ^^^^^^^^^^^^^^^^^^^^^^^

  Syntax:
  """""""

  ::

- store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>] ; yields void
- store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
+ store [volatile] <ty> <value>, <ty>* <pointer>[, ptr_provenance <ty>* <channel>][, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>] ; yields void
+ store atomic [volatile] <ty> <value>, <ty>* <pointer>[, ptr_provenance <ty>* <channel>] [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
  !<nontemp_node> = !{ i32 1 }
  !<empty_node> = !{}

  Overview:
  """""""""

  The '``store``' instruction is used to write to memory.
[15 unchanged lines not shown]
  Atomic loads produce :ref:`defined <memmodel>` results when they may see
  multiple atomic stores. The type of the pointee must be an integer, pointer, or
  floating-point type whose bit width is a power of two greater than or equal to
  eight and less than or equal to a target-specific size limit. ``align`` must be
  explicitly specified on atomic stores, and the store has undefined behavior if
  the alignment is not set to a value which is at least the size in bytes of the
  pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.

+ The optional ``ptr_provenance`` argument specifies the noalias chain of the
+ pointer operand. It has the same type as the pointer operand. Together with the
+ ``!noalias`` metadata on the instruction, and the ``llvm.provenance.noalias``,
+ ``llvm.noalias.arg.guard`` intrinsics in the chain, this is used to deduce if
+ two load/store instructions may or may not alias. (See `Scoped NoAlias Related
+ Intrinsics`_)

  The optional constant ``align`` argument specifies the alignment of the
  operation (that is, the alignment of the memory address). A value of 0
  or an omitted ``align`` argument means that the operation has the ABI
  alignment for the target. It is the responsibility of the code emitter
  to ensure that the alignment information is correct. Overestimating the
  alignment results in undefined behavior. Underestimating the
  alignment may produce less efficient code. An alignment of 1 is always
  safe. The maximum possible alignment is ``1 << 29``. An alignment
	▲ Show 20 Lines • Show All 10,583 Lines • ▼ Show 20 Lines


	::			::

	declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)			declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)

	Overview:			Overview:
	"""""""""			"""""""""

				jdoerfert: Nit: > of `load` and `store` instructions. The "of" sounds weird to me. --- > The documentation below explains how it works, with `restrict` in mind. I personally dislike sentences like this and would just remove it. The intrinsics can also be used to specify alias assumptions that are not restrict based. Arguably that is always true. The section describes the semantics of the alias stuff and how that can be used to model `restrict`. It is implied that other things can be modeled as well.
	The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a			The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
	noalias scope is declared. When the intrinsic is duplicated, a decision must			noalias scope is declared. When the intrinsic is duplicated, a decision must
	also be made about the scope: depending on the reason of the duplication,			also be made about the scope: depending on the reason of the duplication,
	the scope might need to be duplicated as well.			the scope might need to be duplicated as well.


				asbirlea: Should this be: `%prov.p = i8* @llvm.provenance.noalias.XXX(i8* %p, i8* %p.decl,` ? Or could you clarify the `%prov.p` on both sides?
	Arguments:			Arguments:
	""""""""""			""""""""""

				a.elovikov: I find it strange to see %side.p on both left and right sides. Is it a typo or does it have some special meaning? After reading till the intrinsics' description I believe it should be just "%p" on the right side.
				jeroen.dobbelaere (Author): yes, that's a typo. the second %side.p should be %p: %side.p = i8* @llvm.side.noalias.XXX(i8* %p, ...)
	The ``!id.scope.list`` argument is metadata that is a list of ``noalias``			The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
	metadata references. The format is identical to that required for ``noalias``			metadata references. The format is identical to that required for ``noalias``
	metadata. This list must have exactly one element.			metadata. This list must have exactly one element.

	Semantics:			Semantics:
	""""""""""			""""""""""

	The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a			The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
				jdoerfert: I think you mix the "templated" definition (`XXX`) with instantiations (`i8*`, `%struct.FOO`, ...). I would prefer we pick either. Precedence says you replace `XXX` with the types of that instantiation.
				jeroen.dobbelaere (Author): That's true, but imho the full intrinsic name become very long, cluttering the display., For clarity I replaced the type encodings with XXX. This makes it easier to focus on the intrinsics and the actual arguments. I agree that this is not perfect.
	noalias scope is declared. When the intrinsic is duplicated, a decision must			noalias scope is declared. When the intrinsic is duplicated, a decision must
	also be made about the scope: depending on the reason of the duplication,			also be made about the scope: depending on the reason of the duplication,
	the scope might need to be duplicated as well.			the scope might need to be duplicated as well.

	For example, when the intrinsic is used inside a loop body, and that loop is			For example, when the intrinsic is used inside a loop body, and that loop is
	unrolled, the associated noalias scope must also be duplicated. Otherwise, the			unrolled, the associated noalias scope must also be duplicated. Otherwise, the
	noalias property it signifies would spill across loop iterations, whereas it			noalias property it signifies would spill across loop iterations, whereas it
	was only valid within a single iteration.			was only valid within a single iteration.

	.. code-block:: llvm			.. code-block:: llvm

	; This example shows two possible positions for noalias.decl and how they impact the semantics:			; This example shows two possible positions for noalias.decl and how they impact the semantics:
	; If it is outside the loop (Version 1), then %a and %b are noalias across all iterations.			; If it is outside the loop (Version 1), then %a and %b are noalias across all iterations.
	; If it is inside the loop (Version 2), then %a and %b are noalias only within one iteration.			; If it is inside the loop (Version 2), then %a and %b are noalias only within one iteration.
	define void @decl_in_loop(i8* %a.base, i8* %b.base) {			define void @decl_in_loop(i8* %a.base, i8* %b.base) {
	entry:			entry:
	; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop			; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
	br label %loop			br label %loop

	loop:			loop:
				craig.topper: Extra space after "path"
	%a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]			%a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]
	%b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]			%b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]
	; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop			; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
				craig.topper: in -> into?
	%val = load i8, i8* %a, !alias.scope !2			%val = load i8, i8* %a, !alias.scope !2
	store i8 %val, i8* %b, !noalias !2			store i8 %val, i8* %b, !noalias !2
	%a.inc = getelementptr inbounds i8, i8* %a, i64 1			%a.inc = getelementptr inbounds i8, i8* %a, i64 1
				jdoerfert: > introduces alias assumptions plural vs singular > in the normal computation path this isn't a "known" term for me (see below) of a pointer and it will be opaque for most optimizations this is the hope but it is questionable if it is true and why it is here I would replace the first sentence with: "The `llvm.noalias` intrinsic attaches alias assumptions to its first argument." The whole pass thing and splitting comes to early (IMHO). I don't know yet what these intrinsics mean but I learn that they are transformed. That said, `llvm.side.noalias` is not described here.
	%b.inc = getelementptr inbounds i8, i8* %b, i64 1			%b.inc = getelementptr inbounds i8, i8* %b, i64 1
	%cond = call i1 @cond()			%cond = call i1 @cond()
	br i1 %cond, label %loop, label %exit			br i1 %cond, label %loop, label %exit

	exit:			exit:
	ret void			ret void
				jdoerfert: Nit: remove "is used to", just "identifies" Nit: remove "exact" (what does it mean given that we actually move stuff around under the normal "as if" rules) It's not "done inside" and loops are only one example of this. What you want to say is more general: "Whenever a `llvm.noalias.decl` intrinsic is duplicated through code transformations, care must be taken to duplicate and uniquify the scopes and intrinsics. These steps are described in the following." To be honest, I'm not sure if it makes sense to say something like that here already.
	}			}

	!0 = !{!0} ; domain			!0 = !{!0} ; domain
	!1 = !{!1, !0} ; scope			!1 = !{!1, !0} ; scope
	!2 = !{!1} ; scope list			!2 = !{!1} ; scope list
				jdoerfert: Not: stray "this" in 16284 The inlining sentence does not really clear up anything here, partially because we don't know what is happening.

	Multiple calls to `@llvm.experimental.noalias.scope.decl` for the same scope			Multiple calls to `@llvm.experimental.noalias.scope.decl` for the same scope
	are possible, but one should never dominate another. Violations are pointed out			are possible, but one should never dominate another. Violations are pointed out
				craig.topper: "allows to track" reads funny. Maybe "allows tracking of"?
	by the verifier as they indicate a problem in either a transformation pass or			by the verifier as they indicate a problem in either a transformation pass or
				jdoerfert: remove "a blob of"
	the input.			the input.

				Scoped NoAlias Related Intrinsics
				---------------------------------

				craig.topper: intrinsics*
				This set of intrinsics provides the basis for full C99 '``restrict``' support
				(see ISO C99 draft N1256, section 6.7.3.1, 'Formal definition of restrict').

				In C99 restrict, accesses using a pointer based on a restrict pointer ``p``
				are assumed to not alias with other accesses that are not based on ``p``, as
				long as both accesses are within the scope of the declaration of ``p``.
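
As a source-level illustration of the rule above, the following C sketch declares two restrict-qualified parameters. The function name ``scale_into`` is made up for this example and is not part of the patch. Inside the function body, accesses based on ``dst`` may be assumed not to alias accesses based on ``src``; calling it with overlapping buffers would be undefined behavior.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch: both parameters are
 * restrict-qualified, so within this function accesses based on dst are
 * assumed not to alias accesses based on src. */
void scale_into(int *restrict dst, const int *restrict src, size_t n, int k)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}
```

Because the restrict scope is the function body, the assumption holds only for accesses made while ``scale_into`` is executing.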
				jdoerfert: > Either a real object, a constant where the value is relative to 0 or `null`. There is a word missing and 0 is `null`.

				The intrinsics work together with the '``!noalias``'
				jdoerfert: This seems odd, why introduce two things that do the same thing.
				jeroen.dobbelaere (Author): The original idea was to treat '%p.addr' sometimes as a pointer to an object and sometimes as an offset. Later it needed to be separated: SROA first splits alloca's into multiple smaller alloca's. Each separate restrict pointer now points to its own alloca (%p.addr), and there is no place to put the offset. You can differentiate by splitting the p.scope, but that would imply duplicating scopes all over the place. The p.objId serves as a convenient and less costly solution to differentiate the pointers in this case.
				jdoerfert: So `objId` is an offset into `p.addr`? If so, let's document it that way. How does this work if there are multiple restrict pointers in the object, e.g. `struct { restrict a; restrict b }`? Maybe it would help if you point me towards the place where I can see this intrinsic in action. At least then I might be able to provide better feedback on the wording.
				jeroen.dobbelaere (Author): This is the confusing part for me for the LangRef vs the usage: should the LangRef describe only the high level effect, or can it also describe how llvm treats/optimizes stuff internally ? I have somehow the feeling the we might want to have a separate restrict handling document, describing how the intrincs and metadata work together. Or do you think such a thing also belongs to the LangRef ?
				:ref:`metadata <noalias_and_aliasscope>` annotations on memory instructions and
				the ``ptr_provenance`` of ``load`` and ``store`` instructions.

				jdoerfert: "entries with a single element each." > It represents the variable declaration that contains one or more restrict pointers. I do not understand this sentence.
				jeroen.dobbelaere (Author): hmm. Not sure how to explain it further. What I want to say is (shown with an example:) int * restrict A; // one !p.scope, one restrict pointer int * restrict B[10]; // another (single) !p.scope, ten restrict pointers struct FOO { int* restrict mA; int * mB; int* restrict mC; } C; // yet another !p.scope, 2 restrict pointers
				jdoerfert: In that example, how doe the `p.scopes` look like? Or, asked differently, is the `p.scope` a consequence of the declaration, hence does it uniquely identifies a declaration?
				jeroen.dobbelaere (Author): Yes, the p.scope is a result of the declaration and uniquely identifies one.
				The full set of intrinsics is:

				.. code-block:: llvm

				jdoerfert: For both items above: No need for "points to". > a restrict variable. Maybe more specific: "the restrict pointer `%p`."
				%p.decl = i8* @llvm.noalias.decl.XXX(i8** %p.alloca, i32 <p.objId>, metadata !p.scope)
				%p.noalias = i8* @llvm.noalias.XXX(i8* %p, i8* %p.decl, i8** %p.addr,
				i32 <p.objId>, metadata !p.scope) !noalias !VisibleScopes
				%prov.p = i8* @llvm.provenance.noalias.XXX(i8* %p, i8* %p.decl,
				jdoerfert: Maybe: "the address of an object with at least one restrict pointer constituent.
				i8** %p.addr, i8** %prov.p.addr,
				i32 <p.objId>, metadata !p.scope) !noalias !VisibleScopes
				%p.guard = i8* @llvm.noalias.arg.guard.XXX(i8* %p, i8* %prov.p)
				jdoerfert: I did not understand the above wording.
				%p.copy.guard = %struct.FOO* @llvm.noalias.copy.guard.XXX(%struct.FOO* %p.block, i8* %p.decl,
				metadata !p.indices, metadata !p.scope)


				A detailed description of these intrinsics and how they work is given in
				:doc:`Restrict and NoAlias Information in LLVM <NoAliasInfo>`.

				.. _int_noalias_decl:

				'``llvm.noalias.decl``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic. The return type and argument types are encoded
				in ``XXX``.
				jdoerfert: Nit: "by the following"

				::

				declare i8* @llvm.noalias.decl.XXX(<type>* %p.alloca, i32 <p.objId>, metadata !p.scope)
				jdoerfert: "an extra number" is not helpful " related to " -> "describing"/"identifying"

				Overview:
				"""""""""

				``llvm.noalias.decl`` is inserted at the location of a restrict pointer
				declaration. It makes it possible to identify that a restrict scope is only
				valid inside the body of a loop. It also makes it possible to identify that a
				certain ``alloca`` is associated with an object that contains one or more
				restrict pointers.

				The handle it returns is always of type ``i8*`` and does
				not really represent a value. It is merely used to track a dependency on the
				declaration.

				Arguments:
				jdoerfert: "related to" -> "referencing" or "scoped in"
				""""""""""

				The first argument ``%p.alloca`` points to the ``alloca`` that contains one
				or more restrict pointers. It can also be ``null`` if the ``alloca`` has been
				optimized away.

				The second argument ``p.objId`` is an integer representing an object id.

				The third argument ``!p.scope`` is metadata that is a list of ``noalias``
				metadata references. The format is identical to that required for ``noalias``
				metadata. This list must have exactly one element.

				Semantics:
				""""""""""

				The ``llvm.noalias.decl`` intrinsic identifies the location of a restrict
				pointer declaration. When the declaration is inside a loop body, care must
				be taken to duplicate and uniquify the scopes and intrinsics when the loop
				is unrolled. Otherwise the restrict scope could spill across
				iterations.
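
To illustrate why a restrict scope declared inside a loop body must stay confined to one iteration, consider the following C sketch. The function ``shift_left`` is a hypothetical example, not code from the patch. The pointer ``q`` of iteration ``i`` and the pointer ``p`` of iteration ``i + 1`` refer to the same element, which is legal only because each pair of declarations lives in its own iteration.

```c
#include <stddef.h>

/* Hypothetical example: p and q are declared inside the loop body, so their
 * restrict scope covers a single iteration only. */
void shift_left(int *buf, size_t n)
{
    for (size_t i = 0; i + 1 < n; ++i) {
        int *restrict p = &buf[i];     /* written in this iteration */
        int *restrict q = &buf[i + 1]; /* read in this iteration */
        *p = *q;
    }
}
```

If the loop were unrolled while both copies kept the same scope, the read through ``q`` in one copy and the write through ``p`` in the next copy would wrongly be treated as not aliasing.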

				It also associates specific restrict properties with an ``alloca`` and is used
				to propagate those properties to ``llvm.noalias``, ``llvm.provenance.noalias`` and
				``llvm.noalias.copy.guard`` intrinsics when inlining and optimizations make
				the relationship between those intrinsics and the actual variable declaration
				visible.

				A detailed description of these intrinsics and how they work is given in
				:doc:`Restrict and NoAlias Information in LLVM <NoAliasInfo>`.

				.. _int_noalias:

				'``llvm.noalias``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic. The return type and argument types are encoded
				in ``XXX``.

				::

				declare <type>* @llvm.noalias.XXX(<type>* %p, i8* %p.decl, <type>** %p.addr,
				i32 <p.objId>, metadata !p.scope)

				Overview:
				"""""""""

				``llvm.noalias`` is inserted at the point where restrict properties are
				introduced. This is typically done after loading a restrict pointer from
				memory. Its return value can be seen as *the pointer value with restrict
				properties*.

				Arguments:
				""""""""""

				The first argument ``%p`` is the pointer on which the aliasing assumption is
				being placed.

				The second argument ``%p.decl`` refers to the ``llvm.noalias.decl`` that is
				associated with the pointer declaration.

				The third argument ``%p.addr`` is the address in memory of this pointer.

				The fourth argument ``p.objId`` is an integer representing an object id.

				The fifth argument ``!p.scope`` is metadata that is a list of ``noalias``
				metadata references. The format is identical to that required for ``noalias``
				metadata. This list must have exactly one element.

				Semantics:
				""""""""""

				The ``llvm.noalias`` intrinsic adds alias assumptions to the pointer
				computation path. It also blocks optimizations on this computation path.

				It will be transformed into a ``llvm.provenance.noalias`` intrinsic and moved onto
				the ``ptr_provenance`` path, so that pointer optimizations can still be
				done and the restrict information is not lost.

				A detailed description of these intrinsics and how they work is given in
				:doc:`Restrict and NoAlias Information in LLVM <NoAliasInfo>`.

				.. _int_provenance_noalias:

				'``llvm.provenance.noalias``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic. The return type and argument types are encoded
				in ``XXX``.

				::

				declare <type>* @llvm.provenance.noalias.XXX(<type>* %p, i8* %p.decl,
				<type>** %p.addr, <type>** %prov.p.addr,
				i32 <p.objId>, metadata !p.scope)

				Overview:
				"""""""""

				``llvm.provenance.noalias`` is inserted at the point where restrict properties are
				introduced on the ``ptr_provenance``. This is typically done when a
				``llvm.noalias`` is converted into a ``llvm.provenance.noalias``.

				The value it returns (``%prov.p``) represents the pointer ``%p`` on the
				``ptr_provenance``.

				Arguments:
				""""""""""

				The first argument ``%p`` is the pointer (or its ``ptr_provenance``) on
				which the aliasing assumption is being placed.

				The second argument ``%p.decl`` refers to the ``llvm.noalias.decl`` that is
				associated with the pointer declaration.

				The third argument ``%p.addr`` is the address in memory of this pointer.

				The fourth argument ``%prov.p.addr`` represents ``%p.addr`` on the
				``ptr_provenance``.

				The fifth argument ``p.objId`` is an integer representing an object id.

				The sixth argument ``!p.scope`` is metadata that is a list of ``noalias``
				metadata references. The format is identical to that required for ``noalias``
				metadata. This list must have exactly one element.
				jdoerfert: I mentioned that before, the lady of the lake says XXX is specialized here: https://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic (do we add attributes here? if so, we need attributes here, e.g., `nocapture`).


				Semantics:
				""""""""""

				The ``llvm.provenance.noalias`` intrinsic adds alias assumptions to the
				``ptr_provenance`` of the memory instructions that depend on it.

				For this purpose, the ``llvm.provenance.noalias`` itself is also considered to be
				jdoerfert: I dislike the loop body and alloca sentence because they do not convey information. To be honest, I don't know what the second sentence is saying. What I would prefer is to say something like: "The intrinsic identifies the scope of the restrict restrict pointer through a virtual side-effect that ensures the control dependences are preserved. This virtual side-effect will also keep allocations alive and explicit." (I guessed what you want to say wrt. alloca) (allocas and mallocs should not be treated differently anyway, we have a heap-2-stack transformation now and for the purpose of this discussion there should not be a difference anyway)
				a memory instruction and has its own ``ptr_provenance`` for the ``%p.addr``
				argument.

				A detailed description of these intrinsics and how they work is given in
				jdoerfert: The above reads funny, maybe: "The returned value is a handle to track dependences on the declaration. There is no explicit relationship to the value of the arguments." Also, why do we want an `i8` then? We have `tokens` and we have `i32`, I'd prefer either over an `i8` which is more confusing in this context full of `i8` that are actually pointers (IMHO).
				jeroen.dobbelaere (Author): I think a token has to many restrictions (no PHI, no select). i32 might do. I didn't think too much about it and just settled on i8.
				jdoerfert: If the token is too restrictive I'd still prefer an i32 (or similar) to avoid confusion with all the i8 pointers that fly around. The wording will then make it clear that these are tokens.
				jeroen.dobbelaere (Author): ok. We can consider that.
				:doc:`Restrict and NoAlias Information in LLVM <NoAliasInfo>`.


				.. _int_noalias_arg_guard:

				'``llvm.noalias.arg.guard``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				jdoerfert: 1) I would love to remove this duplication, is that possible? 2) Why do we need to talk about `alloca` and "optimized away"? Can't we say: "The first argument `%p.alloca` points to an object in memory with one or more restrict pointers constituents or `null`."
				"""""""

				This is an overloaded intrinsic. The return type and argument types are encoded
				in ``XXX``.
				jdoerfert: Is it a list with one element or a list with entries that have one element each? What I read earlier sounded different from what I read here.

				::

				declare <type>* @llvm.noalias.arg.guard.XXX(<type>* %p, <type>* %prov.p)

				Overview:
				"""""""""

				The ``llvm.noalias.arg.guard`` intrinsic brings alias assumption properties that
				jdoerfert: Copy and paste from somewhere above. I'd avoid duplication if possible in favor of references.
				are on the ``ptr_provenance`` path of a pointer back into the
				computation path of that pointer.

				Arguments:
				""""""""""

				jdoerfert: The above sentence is broken somewhere (I think). Maybe make it two.
				The first argument ``%p`` represents the computation path of the pointer.

				The second argument ``%prov.p`` represents the ``ptr_provenance`` path
				of that pointer.

				Semantics:
				""""""""""

				The ``llvm.noalias.arg.guard`` is typically generated when a ``llvm.noalias``
				intrinsic is converted to a ``llvm.provenance.noalias``, but the pointer escapes
				because it is used as an argument to a function or it is returned.

				It can typically be optimized away after inlining:

				* When it is encountered on the computation path, it is assumed to return the
				  first argument ``%p``.
				* When it is encountered on a ``ptr_provenance`` path, it is assumed to
				  return the second argument ``%prov.p``.
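
The kind of source code where a restrict pointer escapes can be sketched in C as follows. The functions ``fill`` and ``init_pair`` are hypothetical and only show the source-level pattern (a restrict pointer passed as a call argument and returned); how the front end and the passes lower it is not shown here.

```c
#include <stddef.h>

/* Hypothetical callee: receives a plain pointer. */
static void fill(int *p, size_t n, int v)
{
    for (size_t i = 0; i < n; ++i)
        p[i] = v;
}

/* a and b are restrict pointers that escape into fill() as call arguments;
 * a additionally escapes through the return value. */
int *init_pair(int *restrict a, int *restrict b, size_t n)
{
    fill(a, n, 1);
    fill(b, n, 2);
    return a;
}
```

Once ``fill`` is inlined into ``init_pair``, the escaping uses become ordinary stores, which is the situation in which the guard can typically be optimized away.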

				A detailed description of these intrinsics and how they work is given in
				:doc:`Restrict and NoAlias Information in LLVM <NoAliasInfo>`.


				.. _int_noalias_copy_guard:

				'``llvm.noalias.copy.guard``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic. The return type and argument types are encoded
				in ``XXX``.

				::

				declare <type>* @llvm.noalias.copy.guard.XXX(<type>* %p.block, i8* %p.decl,
				metadata !p.indices, metadata !p.scope)

				Overview:
				"""""""""

				``llvm.noalias.copy.guard`` is inserted on the source pointer argument when a
				block of memory that will be copied (either using ``llvm.memcpy`` or a
				combination of ``load``/``store`` instructions) is associated with a variable
				that contains at least one restrict pointer. This could be a ``struct`` that
				contains one or more restrict member pointers, or an array of restrict pointers.

				The intrinsic returns the first argument.
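
A source-level case of such a copy can be sketched in C. The ``struct FOO`` layout follows the example given in the review discussion; the function ``sum_through_copy`` is hypothetical. The struct assignment copies the whole block, including the two restrict member pointers, and the later accesses go through the copied members.

```c
/* Layout taken from the review discussion: two restrict members and one
 * plain pointer member. */
struct FOO {
    int *restrict mA;
    int *mB;
    int *restrict mC;
};

/* Hypothetical helper: the struct assignment is a block copy that carries
 * the restrict pointers mA and mC along with it. */
int sum_through_copy(struct FOO *src)
{
    struct FOO local = *src; /* block copy of the restrict members */
    *local.mA += 1;
    *local.mC += 2;
    return *local.mA + *local.mB + *local.mC;
}
```

A compiler is free to implement the assignment as a ``memcpy`` or as separate loads and stores, which is why the restrict information has to be attached to the copied block rather than to one particular pointer load.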

				Arguments:
				a.elovikov: > the `noalias_sidechannel` path Not sure about terminology, but are `@llvm.noalias.arg.guard`/`@llvm.noalias.copy.guard` considered as `noalias_sidechannel`? I'd suggest not to use the spelling from the load/store instructions and have a more general `moved onto the "side" path` (if my understanding is correct here).
				jeroen.dobbelaere (Author): The @llvm.noalias.arg.guard combines the normal path with the noalias_sidechannel path. The @llvm.noalias.copy.guard resides on the normal path and adds extra information to a copy operation (memcpy, load/store). I tried to be consistent in terminology when referring to the 'noalias_sidechannel' path. (but I could also use the 'noalias side channel' or something similar).
				a.elovikov: How about this: It will be transformed into a `llvm.side.noalias` intrinsic and moved onto the `noalias_sidechannel` path for loads/stores and fed into the @llvm.noalias.arg.guard/@llvm.noalias.copy.guard intrinsics for function boundaries/copies respectively.
				jeroen.dobbelaere (Author): ... and fed into the @llvm.noalias.arg.guard intrinsics for function boundaries. (The @llvm.noalias.copy.guard is generated by the clang frontend)
				""""""""""

				The first argument ``%p.block`` represents the block that will be copied.

				The second argument ``%p.decl`` refers to the ``llvm.noalias.decl`` that is
				associated with the block.

				The third argument ``!p.indices`` refers to a metadata list of metadata. Each
				entry refers to another metadata list of integers, describing the GEP path
				that contains a restrict pointer. A -1 value indicates that any index value
				is a match. See above for an example.
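
The wildcard rule for index lists can be modeled with a few lines of C. This is only a sketch of the matching rule as stated above, not LLVM code: the function name ``gep_path_matches`` and the assumption that pattern and path have the same length are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the matching rule, not LLVM code: a pattern entry
 * of -1 matches any index, every other entry must match exactly. Both
 * arrays are assumed to hold len entries. */
bool gep_path_matches(const long *pattern, const long *path, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (pattern[i] != -1 && pattern[i] != path[i])
            return false;
    }
    return true;
}
```

Under these assumptions, a pattern such as ``{-1, 0}`` would match field 0 of any element of an array of structs, while ``{-1, 2}`` would match field 2.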

				Semantics:
				""""""""""

				The ``llvm.noalias.copy.guard`` provides extra restrict information about a
				block of memory that is copied. When the memory copy is optimized away,
				the pointer accesses can still be matched to the correct restrict
				information.

				A detailed description of these intrinsics and how they work is given in
				:doc:`Restrict and NoAlias Information in LLVM <NoAliasInfo>`.


	Floating Point Environment Manipulation intrinsics			Floating Point Environment Manipulation intrinsics
	--------------------------------------------------			--------------------------------------------------

	These functions read or write floating point environment, such as rounding			These functions read or write floating point environment, such as rounding
	mode or state of floating point exceptions. Altering the floating point			mode or state of floating point exceptions. Altering the floating point
	environment requires special care. See :ref:`Floating Point Environment <floatenv>`.			environment requires special care. See :ref:`Floating Point Environment <floatenv>`.

	'``llvm.flt.rounds``' Intrinsic			'``llvm.flt.rounds``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

	declare i32 @llvm.flt.rounds()			declare i32 @llvm.flt.rounds()
				a.elovikov: No explicit 'or null' here. Is that intentional?
				jeroen.dobbelaere (author): It can be 'null'

	Overview:			Overview:
	"""""""""			"""""""""

	The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.			The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.

	Semantics:			Semantics:
	""""""""""			""""""""""

llvm/docs/NoAliasInfo.rst

This file was added.

========================================

Restrict and NoAlias Information in LLVM

========================================

.. contents::

:local:

:depth: 2

Introduction

============

LLVM provides a number of mechanisms to annotate that two accesses will not

alias. This document describes the provenance annotations on pointers with the

``noalias`` intrinsics, and ``noalias`` metadata on memory instructions

(load/store).

Together, they provide a means to decide if two memory instructions **will not

alias**, by looking at the pointer provenance and combining this with the active

scopes (specified by the ``noalias`` metadata) of the memory instructions.

All of C99 restrict can be mapped onto these annotations, resulting in a

powerful mechanism to provide extra alias information.

Relation with C99 ``restrict``

==============================

The noalias infrastructure can be used to fully support *C99 restrict* [#R1]_:

restrict pointers as function arguments, as local variables, as struct members.

(See [#R2]_: iso-C99-n1256, 6.7.3.1 'Formal definition of restrict').

Modeling ``restrict`` requires two pieces of information for a memory

instruction:

- the *depends on* relationship of the pointer path on a restrict

pointer (pointer provenance)

- the *visibility* of that restrict pointer to the memory instruction

Every variable that is defined in a function (function arguments and local

variables) and contains at least one restrict pointer gets a noalias scope

that is associated with this variable declaration. The ``!noalias`` metadata is

vzakhariUnsubmitted

Not Done

typo: associate d

used to annotate every memory instruction in the function with the restrict

variables (noalias scopes) that are visible to that memory instruction.

Each restrict variable also gets a ``@llvm.noalias.decl`` at the place in the

control flow where it is defined. This identifies if restrict scopes must be

duplicated or not when loops are transformed.

Whenever a restrict pointer is read, a ``@llvm.noalias`` intrinsic is introduced

to indicate the dependency on a restrict pointer. This intrinsic depends on the

pointer value and on the address of the pointer. Different addresses represent

different restrict pointers. Different restrict pointers point to different sets

of objects.

A struct can contain multiple restrict pointers. As such, a single variable

definition (single scope) can contain multiple restrict pointers. The addresses

of the pointers ensure that they can be differentiated.

When a struct is copied using the internal ``@llvm.memcpy``, the ``@llvm.noalias``

intrinsic cannot be used. The ``@llvm.noalias.copy.guard`` provides information

on what parts of the struct represent restrict pointers. This ensures that the

correct dependencies can be reconstructed when the ``@llvm.memcpy`` is optimized

away.

When a pointer to a restrict pointer is dereferenced, there is no local scope

available. These restrict pointers are associated with an ``unknown function

scope``. It is sometimes possible to refine those scopes later to actual

variable definitions.

.. _noaliasinfo_basic_mechanism:

Basic Mechanism

===============

Describing the full set of intrinsics with all their arguments at once might be

confusing. Therefore, we start by explaining the basic mechanism and then build

upon it to add the missing parts.

Throughout the explanation, C99 and LLVM-IR code fragments are used to provide

examples. This does not mean that the provided infrastructure can only be used

for implementing C99 restrict. It just shows that at least C99 restrict can be

mapped onto it. Other provenance based alias annotations will likely map just as

well onto it, without (or with only small) adaptations to the provided

infrastructure.

Single ``noalias`` Scope

------------------------

The following code fragment:

.. code-block:: C

void foo(int *pA, int *pB, int i) {

int * restrict rpA = pA;

int * pX = &rpA[i];

*rpA = 42; // (1) based on rpA

*pB = 43; // (2) not based on rpA

*pX = 44; // (3) based on rpA

}

contains one *restrict* pointer ``rpA``, one pointer ``pX`` depending on it, and

one pointer ``pB`` not depending on ``rpA``. Based on the C99 restrict

description, \*rpA and \*pX can alias with each other. They will not alias with

\*pB.

In pseudo LLVM-IR code, this can be represented as:

.. code-block:: llvm

define void @foo(i32* %pA, i32* %pB, i64 %i) {

%rpA = tail call i32* @llvm.noalias(i32* %pA, metadata !2)

%arrayidx = getelementptr inbounds i32, i32* %rpA, i64 %i

store i32 42, i32* %rpA, !noalias !2 ; (1)

store i32 43, i32* %pB, !noalias !2 ; (2)

store i32 44, i32* %arrayidx, !noalias !2 ; (3)

ret void

}

; MetaData

!2 = !{!3} ; contains a single scope: !3

!3 = distinct !{!3, !4, !"foo: rpA"} ; this scope represents rpA

!4 = distinct !{!4, !"foo"}

* Metadata !2 defines a list of a single scope ``!3`` that represents ``rpA``

* The ``@llvm.noalias`` intrinsic is associated with the single scope ``!3`` in

``metadata !2``. It indicates that accesses based on this pointer are depending

on this ``!3`` scope. They will not alias with accesses *not* depending on the

same ``!3`` scope, as long as the scope is visible to both accesses.

* For this example, the ``!3`` scope is visible to all three stores (``!noalias

!2`` annotation on the stores). Because of this:

* ``(1)`` and ``(3)`` may alias with each other: ``%rpA`` and ``%arrayidx``

depend on the same ``!3`` scope.

* ``(1)`` and ``(3)`` will not alias with ``(2)``: ``%pB`` does not depend on

the ``!3`` scope.

Multiple ``noalias`` Scopes

---------------------------

Let's extend the example:

.. code-block:: C

void foo(int *pA, int *pB, int *pC, int i) {

int * restrict rpA = pA;

int * pX = &rpA[i];

*rpA = 42; // (1) based on rpA

*pB = 43; // (2) not based on rpA

*pX = 44; // (3) based on rpA

{

int * restrict rpC = pC;

// rpA and rpC visible

*rpA = 45; // (4) based on rpA

*pB = 46; // (5) not based on rpA nor rpC

*rpC = 47; // (6) based on rpC

}

}

with the following pseudo LLVM-IR code:

.. code-block:: llvm

define void @foo(i32* %pA, i32* %pB, i32* %pC, i64 %i) {

%rpA = tail call i32* @llvm.noalias(i32* %pA, metadata !2) ; rpA

%arrayidx = getelementptr inbounds i32, i32* %rpA, i64 %i

%rpC = tail call i32* @llvm.noalias(i32* %pC, metadata !11) ; rpC

store i32 42, i32* %rpA, !noalias !2 ; (1) rpA

store i32 43, i32* %pB, !noalias !2 ; (2) rpA

store i32 44, i32* %arrayidx, !noalias !2 ; (3) rpA

store i32 45, i32* %rpA, !noalias !13 ; (4) rpA and rpC

store i32 46, i32* %pB, !noalias !13 ; (5) rpA and rpC

store i32 47, i32* %rpC, !noalias !13 ; (6) rpA and rpC

ret void

}

; MetaData

!2 = !{!3} ; single scope: rpA

!3 = distinct !{!3, !4, !"foo: rpA"}

!4 = distinct !{!4, !"foo"}

!11 = !{!12} ; single scope: rpC

!12 = distinct !{!12, !4, !"foo: rpC"}

!13 = !{!12, !3} ; scopes: rpA and rpC

In this fragment:

* ``%rpA`` is associated with scope ``!3``

* ``%rpC`` is associated with scope ``!12``

* ``(1)``, ``(2)`` and ``(3)`` only see ``rpA``. (scope ``!3``)

* ``(4)``, ``(5)`` and ``(6)`` see ``rpA`` and ``rpC`` (scopes ``!3`` and ``!12``)

Following C99 restrict:

* ``(4)``, ``(5)`` and ``(6)`` will not alias each other.

* ``(6)`` will not alias ``(3)``:

* ``(6)`` is based on ``rpC``, which is visible to ``(6)``, but not to

``(3)`` => no conclusion.

* ``(3)`` is based on ``rpA`` which is visible to both ``(6)`` and ``(3)`` =>

will not alias

* ``(6)`` might alias with ``(2)``:

* ``rpC`` is visible to ``(6)``, but not to ``(2)``.

* There are no other dependencies for those accesses.
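
The reasoning in the two lists above can be replayed with a toy C model. This is a sketch of the basic rule only (a scope visible to both accesses, with exactly one access depending on it, proves no alias); it is not the alias analysis from the patch, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not LLVM code: every noalias scope is one bit. */
enum { SCOPE_RPA = 1, SCOPE_RPC = 2 };

struct access {
  unsigned based_on; /* scopes the pointer depends on (via @llvm.noalias) */
  unsigned visible;  /* scopes listed in the !noalias metadata of the access */
};

/* Returns true when the two accesses are known not to alias: some scope is
 * visible to both accesses and exactly one of them depends on it.
 * Returns false when no conclusion can be drawn (they may alias). */
bool known_no_alias(struct access a, struct access b) {
  unsigned visible_to_both = a.visible & b.visible;
  unsigned based_on_differs = a.based_on ^ b.based_on;
  return (visible_to_both & based_on_differs) != 0;
}

/* Replays the conclusions listed above for the stores of the example. */
void check_example_stores(void) {
  struct access s2 = {0, SCOPE_RPA};
  struct access s3 = {SCOPE_RPA, SCOPE_RPA};
  struct access s4 = {SCOPE_RPA, SCOPE_RPA | SCOPE_RPC};
  struct access s5 = {0, SCOPE_RPA | SCOPE_RPC};
  struct access s6 = {SCOPE_RPC, SCOPE_RPA | SCOPE_RPC};

  assert(known_no_alias(s4, s5));
  assert(known_no_alias(s4, s6));
  assert(known_no_alias(s5, s6));
  assert(known_no_alias(s6, s3));  /* rpA is visible to both, only (3) uses it */
  assert(!known_no_alias(s6, s2)); /* rpC is not visible to (2): may alias */
}
```

The model deliberately ignores everything introduced later in this document (``object P`` addresses, ``objId``, the unknown function scope); it only captures the scope and visibility argument.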

Location of the ``restrict`` Declaration

----------------------------------------

Some optimization passes need to know where a restrict variable has been

declared. Only when that information is known can they perform the correct

transformations.

One of those transformations is *loop unrolling*. When restrict is applicable

across iterations, the loop can be unrolled without extra changes. But when

restrict is only applicable inside a single iteration, care must be taken to

also duplicate the noalias scopes while duplicating the loop body.

The following code example shows those two cases:

.. code-block:: c

void restrictInLoop(int *pA, int *pB, int *pC, long N) {

for (int i=0; i<N; ++i) {

// stores can be reordered inside a single iteration, but not across

// iterations

int * restrict rpA = pA;

int * restrict rpB = pB;

rpB[i] = 2*pC[i];

rpA[i] = 3*pC[i];

}

}

void restrictOutOfLoop(int *pA, int *pB, int *pC, long N) {

// stores through rpA and rpB will never alias and can be reordered.

int * restrict rpA = pA;

int * restrict rpB = pB;

for (int i=0; i<N; ++i) {

rpB[i] = 2*pC[i];

rpA[i] = 3*pC[i];

}

}

The ``@llvm.noalias.decl`` intrinsic is used to track where in the control flow a

restrict variable was introduced. When it is found inside a loop body, it

indicates that the associated *noalias scope* must be duplicated during loop

unrolling.

For the example, the corresponding pieces of LLVM-IR look like:

.. code-block:: llvm

define void @restrictInLoop(i32* %pA, i32* %pB, i32* %pC, i64 %N) {

entry:

%cmp18 = icmp sgt i64 %N, 0

br i1 %cmp18, label %for.body, label %for.cond.cleanup

for.body: ; preds = %entry, %for.body

%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]

%0 = call i8* @llvm.noalias.decl(i32** null, i64 0, metadata !2) ; rpA - inside the loop

%1 = call i8* @llvm.noalias.decl(i32** null, i64 0, metadata !5) ; rpB - inside the loop

...

asbirleaUnsubmitted

Not Done

This seems like it may be useful to see a fragment/sample of the metadata declarations here?
Similar for the OutOfLoop snippet.

and

.. code-block:: llvm

define void @restrictOutOfLoop(i32* %pA, i32* %pB, i32* %pC, i64 %N) {

entry:

%0 = call i8* @llvm.noalias.decl(i32** null, i64 0, metadata !16) ; rpA - outside the loop

%1 = call i8* @llvm.noalias.decl(i32** null, i64 0, metadata !19) ; rpB - outside the loop

%cmp18 = icmp sgt i64 %N, 0

br i1 %cmp18, label %for.body.lr.ph, label %for.cond.cleanup

...

Note: the ``restrictInLoop`` situation is something that can easily happen after

inlining a function with ``restrict`` arguments:

.. code-block:: C

void doCompute(int * restrict rpA, int * restrict rpB, int * pC, long i) {

rpB[i] = 2*pC[i];

rpA[i] = 3*pC[i];

}

void restrictInLoop(int *pA, int *pB, int *pC, long N) {

for (int i=0; i<N; ++i) {

// stores can be reordered inside a single iteration, but not across

// iterations

doCompute(pA, pB, pC, i);

}

}
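
Why the scopes of a declaration inside the loop body must be duplicated can be shown at source level. The sketch below unrolls the loop by hand: every copy of the body keeps its own block with its own ``restrict`` declarations, which is the C counterpart of giving each copy fresh noalias scopes. The function names are hypothetical and this is only an analogy for what an unrolling pass has to do on LLVM-IR.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The loop from the text: rpA and rpB are declared inside the body. */
void restrict_in_loop(int *pA, int *pB, const int *pC, long n) {
  assert(n >= 0);
  for (long i = 0; i < n; ++i) {
    int *restrict rpA = pA;
    int *restrict rpB = pB;
    rpB[i] = 2 * pC[i];
    rpA[i] = 3 * pC[i];
  }
}

/* Hand-unrolled by two. Each copy of the body gets its own block, and with it
 * its own rpA/rpB declarations, mirroring the duplicated noalias scopes. */
void restrict_in_loop_unrolled(int *pA, int *pB, const int *pC, long n) {
  assert(n >= 0);
  long i = 0;
  for (; i + 1 < n; i += 2) {
    { /* copy for iteration i */
      int *restrict rpA = pA;
      int *restrict rpB = pB;
      rpB[i] = 2 * pC[i];
      rpA[i] = 3 * pC[i];
    }
    { /* copy for iteration i + 1: fresh restrict declarations */
      int *restrict rpA = pA;
      int *restrict rpB = pB;
      rpB[i + 1] = 2 * pC[i + 1];
      rpA[i + 1] = 3 * pC[i + 1];
    }
  }
  for (; i < n; ++i) { /* remainder iteration */
    int *restrict rpA = pA;
    int *restrict rpB = pB;
    rpB[i] = 2 * pC[i];
    rpA[i] = 3 * pC[i];
  }
}

/* Compares both versions on non-overlapping arrays. */
bool unrolled_matches_rolled(void) {
  const int c[5] = {1, 2, 3, 4, 5};
  int a1[5], b1[5], a2[5], b2[5];
  restrict_in_loop(a1, b1, c, 5);
  restrict_in_loop_unrolled(a2, b2, c, 5);
  return memcmp(a1, a2, sizeof a1) == 0 && memcmp(b1, b2, sizeof b1) == 0;
}
```

Merging the two inner blocks into one pair of declarations would claim that accesses from different iterations never alias, which the original loop does not promise.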

Provenance Based Alias Annotations

==================================

In principle, the two intrinsics we have seen so far, should be enough to

provide all necessary information. Now that the basic mechanism has been

explained, we can focus on the various arguments and extensions and why they are

needed.

The Basic Infrastructure

------------------------

In ``C99 restrict``, restrictness is associated with ``object P`` [#R2]_. It is

introduced when the pointer value is read from ``object P``. Different ``object

P`` point to different sets of objects. Because of this, the declaration of a

variable that contains multiple restrict pointers (like an array of restrict

pointers, or a struct that has multiple restrict member pointers) will result in

a single ``scope`` that contains multiple ``object P``.

* ``@llvm.noalias.decl %p.alloc, metadata !Scope``

* indicates at what location in the control flow a restrict pointer has been

declared.

* ``%p.alloc`` refers to the ``alloca`` associated with the declaration.

* ``!Scope`` metadata refers to the unique scope, associated with this

declaration.

* Note: the ``@llvm.noalias.decl`` intrinsic can normally not be moved outside

loops. Its purpose is to identify the freedom that a restrict pointer has

with respect to loop bodies.

* ``@llvm.noalias %p, %p.decl, %p.addr``

* introduces ``noalias`` information to the instructions that (directly or

indirectly) depend on this intrinsic. It is created when *reading a restrict

pointer* and is used to track the 'based-on' relationship.

* ``%p`` is the pointer value that was read. This is also the value that is

returned by this intrinsic.

* ``%p.decl`` refers to the ``@llvm.noalias.decl`` that is associated with

this restrict pointer.

* ``%p.addr`` represents the address of ``object P``.

* Note: sometimes the declaration is not known upfront. In that case,

``%p.decl`` is ``null``. After inlining and/or optimizations, it can be

possible to infer the ``llvm.noalias.decl``.

* the tuple < ``%p.addr``, ``!Scope`` > defines the ``object P``.

Example A:

.. code-block:: C

int foo(int* pA, int* pB) {

int * restrict rpA=pA;

*rpA=42;

*pB=99;

return *rpA;

}

In pseudo LLVM-IR, as clang would produce it:

.. code-block:: llvm

define i32 @foo(i32* %pA, i32* %pB) {

%rpA.address = alloca i32*

%rpA.decl = call @llvm.noalias.decl %rpA.address, !metadata !10 ; declaration of a restrict pointer

vzakhariUnsubmitted

Not Done

Please add parenthesis, so that it looks like a call instruction. Same at line 365.

store i32* %pA, i32** %rpA.address, !noalias !10

asbirleaUnsubmitted

Not Done

IIUC, this example looks like a good place to clarify the distinction between p.alloca and p.addr in the LangRef description, along the lines of "in this example the address is the alloca, but in general they may refer to different locations; the alloca could be a struct and the address could point to a member of the struct".

%rpA = load i32*, i32** %rpA.address, !noalias !10

%rpA.1 = i32* call @llvm.noalias %rpA, %rpA.decl, %rpA.address ; reading of a restrict pointer

store i32 42, i32* %rpA.1, !noalias !10

store i32 99, i32* %pB, !noalias !10

%1 = load i32, i32* %rpA.1, !noalias !10

ret i32 %1

}

With this representation, we have enough information to decide whether two

load/stores are not aliasing, based on the ``noalias`` annotations. However,

because the added intrinsics sit in the pointer computation path, they block

most optimizations. Later on we will see how the

asbirleaUnsubmitted

Not Done

Expand with explanation on what does not alias in the example.

vzakhariUnsubmitted

Not Done

Can you please expand on the meaning of "must block optimizations"? Maybe list some (important) optimizations that are blocked and insert a link to Optimization passes section below for optimizations that are not blocked but must handle the scopes properly?

infrastructure is expanded to allow for optimizations.

Summary:

* ``%p.decl = @llvm.noalias.decl %p.alloc, metadata !Scope``

* ``%p.val = @llvm.noalias %p, %p.decl, %p.addr``

Pointer Provenance

------------------

In order to keep track of the dependency on the ``@llvm.noalias`` intrinsics,

but still allow most optimization passes to do their work, an extra optional

operand for ``load``/``store`` instruction is introduced: the ``ptr_provenance``

operand.

The idea is that the *pointer operand* is used for normal pointer

computations. The ``ptr_provenance`` operand is used to track ``noalias``

related dependencies. Optimizations (like LSR) can modify the *pointer operand*

as they see fit. As long as the ``ptr_provenance`` operand is not touched, we

are still able to deduce the noalias related information.

When an optimization introduces a ``load``/``store`` without keeping the

``ptr_provenance`` operand and the ``!noalias`` metadata, we fall back to the

vzakhariUnsubmitted

Not Done

Should and here be or/and?

I think it might be useful to put an example somewhere that shows what happens when a pointer provenance cannot be proven not to be based on a restrict pointer. For example:

void unknown();
extern int *gp1;
extern int *gp2;
void test(int *restrict p1) {
  gp1 = p1;
  unknown();
  for (int i = 0; i < 100; ++i)
    gp2[i] = p1[i] + 1;
}

As I understand, the store instruction inside the loop uses a non-restrict pointer (gp2) with unknown/missing provenance and since p1 escapes, gp2 might be based on p1. Thus, the load and the store potentially alias. It would be great to see an explanation of the logic used to compute the MayAlias result here.

fail-safe *worst case*.

Although the actual pointer computations can be removed from the

``ptr_provenance``, it can still contain *PHI* nodes, *select* instructions and

*casts*.

For clang, it is hard to track the usage of a pointer and it will not generate

the ``ptr_provenance`` operand. At LLVM-IR level, this is much easier. Because

of that, the annotations exist in two states and a conversion pass is introduced:

* Before *noalias propagation*:

This state is produced by clang and sometimes by SROA. The ``@llvm.noalias``

intrinsic is used in the computation path of the pointer. It is treated as a

mostly opaque intrinsic and blocks most optimizations.

* After *noalias propagation*:

A *noalias propagation and conversion* pass is introduced:

* ``@llvm.noalias`` intrinsics are converted into ``@llvm.provenance.noalias``

intrinsics.

* their usage is removed from the main pointer computations of

``load``/``store`` instructions and moved to the ``ptr_provenance`` operand.

* When a pointer depending on a ``@llvm.noalias`` intrinsic is passed as an

argument, returned from a function or stored into memory, a

``@llvm.noalias.arg.guard`` is introduced. This combines the original

pointer computation with the provenance information. After inlining, it is

also used to propagate the noalias information to the ``load``/``store``

instructions.

So, we now have two extra intrinsics:

* ``@llvm.provenance.noalias`` %prov.p, %p.decl, %p.addr

* provides restrict information to a ``ptr_provenance`` operand

* ``%prov.p``: tracks the provenance information associated with the pointer

value that was read.

* ``%p.decl`` refers to the ``@llvm.noalias.decl`` that is associated with the

restrict pointer.

* ``%p.addr``: represents the address of ``object P``.

* ``@llvm.noalias.arg.guard %p, %prov.p``

* combines pointer and ``ptr_provenance`` information when a pointer value

with ``noalias`` dependencies escapes. It is normally used for function

arguments, returns, or stores to memory.

* ``%p`` tracks the pointer computation

* ``%prov.p`` tracks the provenance of the pointer.

After noalias propagation and conversion, example A becomes:

.. code-block:: llvm

define i32 @foo(i32* %pA, i32* %pB) {

%rpA.address = alloca i32*

%rpA.decl = i8* call @llvm.noalias.decl i32* %rpA.address, !metadata !10 ; declaration of a restrict pointer

vzakhariUnsubmitted

Not Done

Please add parenthesis for the calls.

store i32* %pA, i32** %rpA.address, !noalias !10

%rpA = load i32*, i32** %rpA.address, !noalias !10

; reading of a restrict pointer:

%prov.rpA.1 = i32* call @llvm.provenance.noalias i32* %rpA, i8* %rpA.decl, i32* %rpA.address

store i32 42, i32* %rpA, ptr_provenance i32* %prov.rpA.1, !noalias !10

store i32 99, i32* %pB, !noalias !10

%1 = load i32, i32* %rpA, ptr_provenance i32* %prov.rpA.1, !noalias !10

ret i32 %1

}

Summary:

* ``%p.decl = @llvm.noalias.decl %p.alloc, metadata !Scope``

* ``%p.noalias = @llvm.noalias %p, %p.decl, %p.addr``

* ``%prov.p = @llvm.provenance.noalias %prov.p.2, %p.decl, %p.addr``

* ``%p.guard = @llvm.noalias.arg.guard %p, %prov.p``

.. _noalias_vs_provenance_noalias:

``@llvm.noalias`` vs ``@llvm.provenance.noalias``

-------------------------------------------------

The ``@llvm.noalias`` intrinsic is a convenience shortcut for the combination of

``@llvm.provenance.noalias``, which can only reside on the ptr_provenance path,

and ``@llvm.noalias.arg.guard``, which combines the normal pointer with the

ptr_provenance path:

* This results in less initial code to be generated by ``clang``.

* It also helps during SROA when introducing ``noalias`` information for pointers

inside a struct.

* The noalias propagation and conversion pass depends on the property of

``@llvm.provenance.noalias`` to only reside on the ``ptr_provenance`` path to

reduce the amount of work.

.. code-block:: llvm

; Following:

%rpA = load i32*, i32** %rpA.address, !noalias !10

%rpA.1 = i32* call @llvm.noalias %rpA, %rpA.decl, %rpA.address

store i32 42, i32* %rpA.1, !noalias !10

; is a shortcut for:

%rpA = load i32*, i32** %rpA.address, !noalias !10

%rpA.prov = i32* call @llvm.provenance.noalias %rpA, %rpA.decl, %rpA.address

%rpA.guard = i32* call @llvm.noalias.arg.guard %rpA, %rpA.prov

store i32 42, i32* %rpA.guard, !noalias !10

; and after noalias propagation and conversion, this becomes:

%rpA = load i32*, i32** %rpA.address, !noalias !10

%prov.rpA = i32* call @llvm.provenance.noalias %rpA, %rpA.decl, %rpA.address

store i32 42, i32* %rpA, ptr_provenance i32* %prov.rpA, !noalias !10

SROA and Stack optimizations

----------------------------

When SROA eliminates a local variable, we do not have an address for ``object P``

anymore (the alloca is removed and ``%p.addr`` becomes ``null``). At that moment

we can only depend on the ``!Scope`` metadata to differentiate restrict

objects. For convenience, we also add this information to the ``@llvm.noalias``

and ``@llvm.provenance.noalias`` intrinsics.

It is also possible that a single variable declaration contains multiple

restrict pointers (think of a struct containing multiple restrict pointers, or

an array of restrict pointers). For correctness, SROA must introduce new scopes

when splitting it up. But cloning and adapting scopes can be very

expensive. Because of that, we introduce an extra *object ID* (``objId``)

parameter for ``@llvm.noalias.decl``, ``@llvm.noalias`` and

``@llvm.provenance.noalias``. This can be thought of as the *offset in the

variable*. This allows us to differentiate *noalias* dependencies coming from

the same variable, but representing different *noalias* pointers.

Summary:

* ``%p.decl = @llvm.noalias.decl %p.alloc, i64 objId, metadata !Scope``

* ``%p.noalias = @llvm.noalias %p, %p.decl, %p.addr, i64 objId, metadata !Scope``

* ``%prov.p = @llvm.provenance.noalias %prov.p.2, %p.decl, %p.addr, i64 objId, metadata !Scope``

* ``%p.guard = @llvm.noalias.arg.guard %p, %prov.p``

For alias analysis, this means that two ``@llvm.provenance.noalias`` intrinsics represent a

different ``object P0`` and ``object P1`` if:

* ``%p0.addr`` and ``%p1.addr`` are different

* or, ``objId0`` and ``objId1`` are different

* or, ``!Scope0`` and ``!Scope1`` are different
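
The three conditions can be read as a comparison of a small key. The C sketch below is a toy model under explicit assumptions: plain pointers and integers stand in for IR values and metadata nodes, and inequality of these stand-ins is taken to mean "known to be different". The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the operands that identify an object P. Plain integers and
 * pointers replace IR values here; inequality is read as "known different". */
struct object_p_key {
  const void *addr; /* %p.addr, NULL once SROA removed the alloca */
  long obj_id;      /* objId */
  int scope_id;     /* stand-in for the !Scope metadata node */
};

bool is_different_object_p(struct object_p_key p0, struct object_p_key p1) {
  return p0.addr != p1.addr || p0.obj_id != p1.obj_id ||
         p0.scope_id != p1.scope_id;
}

/* After SROA split a struct with two restrict members: same scope, no
 * address left, only objId tells the two pointers apart. */
void check_sroa_case(void) {
  struct object_p_key first = {NULL, 0, 7};
  struct object_p_key second = {NULL, 8, 7};
  assert(is_different_object_p(first, second));
  assert(!is_different_object_p(first, first));
}
```

The ``check_sroa_case`` function shows the situation that motivates ``objId``: once the address is gone and the scope is shared, it is the only remaining discriminator.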

Optimizing a restrict pointer pointing to a restrict pointer

------------------------------------------------------------

Example:

.. code-block:: C

int * restrict * restrict ppA = ...;

int * restrict * restrict ppB = ...;

**ppA=42;

**ppB=99;

return **ppA; // according to C99, 6.7.3.1 paragraph 4, **ppA and **ppB are not aliasing

In order to allow this optimization, we also need to track the ``!noalias`` scope

when the ``@llvm.noalias`` intrinsic is introduced. The ``%p.addr`` parameter in the

``@llvm.provenance.noalias`` version will also get a ``ptr_provenance`` operand,

through the ``%prov.p.addr`` argument.

In short, the ``@llvm.noalias`` and ``@llvm.provenance.noalias`` intrinsics are

treated as if they are memory operations.

Summary:

* ``%p.decl = @llvm.noalias.decl %p.alloc, i64 objId, metadata !Scope``

* ``%p.noalias = @llvm.noalias %p, %p.decl, %p.addr, i64 objId, metadata !Scope, !noalias !VisibleScopes``

* ``%prov.p = @llvm.provenance.noalias %prov.p.2, %p.decl, %p.addr, %prov.p.addr, i64 objId, metadata !Scope, !noalias !VisibleScopes``

* ``%p.guard = @llvm.noalias.arg.guard %p, %prov.p``

For alias analysis, this means that two ``@llvm.provenance.noalias`` intrinsics represent a

different ``object P0`` and ``object P1`` if:

* ``%p0.addr`` and ``%p1.addr`` are different

* or, ``objId0`` and ``objId1`` are different

* or, ``!Scope0`` and ``!Scope1`` are different

* or we can prove that { ``%p0.addr``, ``%prov.p0.addr``, ``!VisibleScopes0`` } and

{ ``%p1.addr``, ``%prov.p1.addr``, ``!VisibleScopes1`` } do not alias for both

intrinsics. (As if we treat each of the two ``@llvm.provenance.noalias`` as a

**store to ``%p.addr``** and we must prove that the two stores do not alias;

also see [#R8]_, question 2)

``Unknown function`` Scope

--------------------------

When the declaration of a restrict pointer is not visible, *C99, 6.7.3.1

paragraph 2*, says that the pointer is assumed to start living from ``main``.

This case can be handled by the ``unknown function`` scope, which is annotated

to the function itself. This can be treated as saying: the scope of this restrict

pointer starts somewhere outside this function. In such a case, the

``@llvm.noalias`` and ``@llvm.provenance.noalias`` will not be associated with a

``@llvm.noalias.decl``. It is possible that after inlining, the scopes can be

refined to a declaration which became visible.

For convenience, each function can have its own ``unknown function`` scope

specified by a ``!noalias !UnknownScope`` metadata attribute on the function itself.
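
A minimal C illustration of a restrict pointer whose declaration is not visible inside the function that uses it. It assumes, as the paragraphs above suggest, that accesses through such a pointer would be attached to the function's ``unknown function`` scope; the names are hypothetical and the snippet is not taken from the patch.

```c
#include <assert.h>

static int g_storage[4];

/* File-scope restrict pointer: no function contains its declaration. */
int *restrict g_rp = g_storage;

int write_through_global(int *pB) {
  assert(pB != g_rp); /* caller contract: pB does not point at *g_rp */
  *g_rp = 42;
  *pB = 99;
  return *g_rp; /* a compiler honoring restrict may fold this to 42 */
}
```

Inside ``write_through_global`` there is no ``@llvm.noalias.decl`` to refer to, so the dependency on ``g_rp`` can only be expressed relative to a scope that starts outside the function.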

vzakhariUnsubmitted

Not Done

Can you please add some example(s)? As I understand, this describes cases with global restrict pointer and with indirect loads of restrict pointers (e.g. int *restrict *restrict).

Aggregate Copies

----------------

Restrictness is introduced by *reading a restrict pointer*. It is not always

possible to add the necessary ``@llvm.noalias`` annotation when this is done. An

aggregate containing one or more restrict pointers can be copied with a single

``load``/``store`` pair or a ``@llvm.memcpy``. This makes it hard to track when a

restrict pointer is copied over. As long as this is treated as a memory escape,

there is no issue. At the moment that the copy is optimized away, we must be

able to reconstruct the ``noalias`` dependencies for correctness.

For this, a final intrinsic is introduced: ``@llvm.noalias.copy.guard``:

* ``@llvm.noalias.copy.guard %p.addr, %p.decl, metadata !Indices, metadata !Scope``

* Guards a ``%p.addr`` object that is copied as a single aggregate or ``@llvm.memcpy``

* ``%p.addr``: the object to guard

* ``%p.decl``: (when available), the ``@llvm.noalias.decl`` associated with the object

* ``!Indices``: this refers to a metadata list. Each element of the list

vzakhariUnsubmitted

Not Done

Do we need to represent the aggregate type as an extra argument to make it work with opaque pointers?

refers to a set of indices where a restrict pointer is located, similar to

the indices for a ``getelementptr``.

* ``!Scope``: the declaration scope of ``%p.decl``

This information allows *SROA* to introduce the needed ``@llvm.noalias`` intrinsics

when a struct is split up.

Summary:

* potential ``!noalias !UnknownScope`` annotation at function level

* ``%p.decl = @llvm.noalias.decl %p.alloc, i64 objId, metadata !Scope``

* ``%p.noalias = @llvm.noalias %p, %p.decl, %p.addr, i64 objId, metadata !Scope, !noalias !VisibleScopes``

* ``%prov.p = @llvm.provenance.noalias %prov.p.2, %p.decl, %p.addr, %prov.p.addr, i64 objId, metadata !Scope, !noalias !VisibleScopes``

* ``%p.guard = @llvm.noalias.arg.guard %p, %prov.p``

* ``%p.addr.guard = @llvm.noalias.copy.guard %p.addr, %p.decl, metadata !Indices, metadata !Scope, !noalias !VisibleScopes``

Optimization passes

-------------------

For correctness, some optimization passes must be aware of the *noalias intrinsics*:

inlining [#R7]_, unrolling [#R6]_, loop rotation, ... Whenever a body is duplicated that

contains a ``@llvm.noalias.decl``, it must be decided how that duplication must be done.

Sometimes new unique scopes must be introduced, sometimes not.

Other optimization passes can perform better by knowing about the ``ptr_provenance``: when

new ``load``/``store`` instructions are introduced, adding ``ptr_provenance``

information can result in better alias analysis for those instructions.

An optimization pass can produce wrong code when a transformation omits the

``ptr_provenance`` operand but keeps the ``!noalias`` information. This can

happen when the ``!noalias`` metadata is copied directly, instead of using

``AAMetadata`` and ``getAAMetadata/setAAMetadata``:

.. code-block:: c++

AAMDNodes AAMD;

OldLoad->getAAMetadata(AAMD);

NewLoad->setAAMetadata(AAMD);

// only do this if it is safe to copy over the 'ptr_provenance' info

// The !noalias info will then also be copied over

NewLoad->setAAMetadataNoAliasProvenance(AAMD);

Possible Future Enhancements

----------------------------

* C++ alias_set

With this framework in place, it should be easy to extend it to support the

*alias_set* proposal [#R3]_. This can be done by tracking a separate *universe

object*, instead of *object P*.

Detailed Description

====================

This section gives a detailed description of the various intrinsics and

metadata.

``!noalias`` Scope Metadata

---------------------------

The ``!noalias`` metadata consists of a *list of scopes*. Each scope is also

associated with the function to which it belongs.

.. code-block:: llvm

; MetaData

!2 = !{!3} ; single scope: rpA

!3 = distinct !{!3, !4, !"foo: rpA"} ; variable 'rpA'

!4 = distinct !{!4, !"foo"} ; function 'foo'

!5 = !{!6}

!6 = distinct !{!6, !7, !"foo: unknown scope"}

!7 = distinct !{!7, !"foo"}

!11 = !{!12} ; single scope: rpC

!12 = distinct !{!12, !4, !"foo: rpC"} ; variable 'rpC'

!13 = !{!12, !3} ; multiple scopes: rpA and rpC

This structure is used in the following places:

* as a single scope:

* used as *metavalue* argument by ``@llvm.noalias.decl``, ``@llvm.noalias``,

``@llvm.provenance.noalias``, ``@llvm.noalias.copy.guard``. (``!2, !11``) to

describe the scope that is associated with the noalias intrinsic.

* used as ``!noalias`` metadata on a function to describe the ``unknown

function scope``. (``!5``)

* as one or more scopes:

* used as ``!noalias`` metadata describing the *visible scopes* on memory

instructions (``load``/``store``) and ``@llvm.noalias`` and

``@llvm.provenance.noalias`` intrinsics.

.. note:: The ``Unknown Function Scope`` is a special scope that is attached

through ``!noalias`` metadata on a function definition. It identifies

vzakhari (Unsubmitted, Not Done):

Is it important that the unknown scope has the function scope as its "parent"?

!6 = distinct !{!6, !7, !"test: unknown scope"}
!7 = distinct !{!7, !"test"}

the scope that is used for *noalias* pointers for which the

declaration is not known.
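
As a rough illustration of when the *unknown function scope* could come into play, the following C sketch reads a restrict pointer through a struct argument, so the declaration of that restrict pointer is not visible inside the function. The names ``Holder`` and ``set_through_holder`` are hypothetical and not part of the patch, and the claim that clang would attach the unknown function scope here is an assumption based on the note above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration; not taken from the patch. */
struct Holder {
  int *restrict p; /* restrict member, declared outside the function below */
};

/* 'h->p' is a restrict pointer that is loaded from memory the function did
 * not declare. No noalias declaration for it is visible in this body, which
 * is the situation the unknown function scope is described for. */
int set_through_holder(struct Holder *h, int *other) {
  assert(h != NULL && h->p != NULL && other != NULL);
  *h->p = 10;  /* access through a restrict pointer of unknown declaration */
  *other = 20; /* ordinary access; the caller promises it is a different object */
  return *h->p + *other;
}
```

The caller is responsible for honoring the restrict promise, for example by letting ``h->p`` and ``other`` point to two different ``int`` objects.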

``ptr_provenance`` path

-----------------------

The ``ptr_provenance`` path is reserved for tracking *noalias* information that

is associated to pointers. Value computations should be omitted as much as

possible.

For memory instructions, this means that the actual pointer value and the

provenance information can be separated. This allows optimization passes to

rewrite the pointer computation and still keep the correct provenance information.

A ``ptr_provenance`` path normally starts:

* with the ``ptr_provenance`` operand of a ``load``/``store`` instruction

* with the ``ptr_provenance`` operand of the ``@llvm.noalias.arg.guard``

intrinsic

* with the ``ptr_provenance`` operand of the ``@llvm.provenance.noalias``

intrinsic

As ``@llvm.provenance.noalias`` can only be part of a ``ptr_provenance``

path, its ``%p`` operand is also part of the ``ptr_provenance`` path.

Although all uses of a ``@llvm.provenance.noalias`` must be on a

``ptr_provenance`` path, following the *based on* path must end at a normal

pointer value. This can for example be the input argument of a

function. Optimizations like inlining can provide extra information for such a

pointer.

Examples

--------

This section contains some examples that are used in the description of the

intrinsics.

.. _noaliasinfo_local_restrict:

Example A: local restrict

"""""""""""""""""""""""""

.. _noaliasinfo_local_restrict_C:

C99 code with local restrict variables:

.. code-block:: C

int foo(int * pA, int i, int *pC) {

int * restrict rpA = pA;

int * restrict rpB = pA+i;

// The three accesses are promised to not alias each other

*rpA = 10;

*rpB = 20;

*pC = 30;

return *rpA+*rpB+*pC;

}

.. _noaliasinfo_local_restrict_llvm_0:

LLVM-IR code as produced by clang:

.. code-block:: llvm

; Function Attrs: nounwind

define dso_local i32 @foo(i32* %pA, i32 %i, i32* %pC) #0 {

entry:

%pA.addr = alloca i32*, align 4

%i.addr = alloca i32, align 4

%pC.addr = alloca i32*, align 4

%rpA = alloca i32*, align 4

%rpB = alloca i32*, align 4

store i32* %pA, i32** %pA.addr, align 4, !tbaa !3, !noalias !7

store i32 %i, i32* %i.addr, align 4, !tbaa !11, !noalias !7

store i32* %pC, i32** %pC.addr, align 4, !tbaa !3, !noalias !7

%0 = bitcast i32** %rpA to i8*

call void @llvm.lifetime.start.p0i8(i64 4, i8* %0) #4, !noalias !7

%1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %rpA, i64 0, metadata !13), !noalias !7

%2 = load i32*, i32** %pA.addr, align 4, !tbaa !3, !noalias !7

store i32* %2, i32** %rpA, align 4, !tbaa !3, !noalias !7

%3 = bitcast i32** %rpB to i8*

call void @llvm.lifetime.start.p0i8(i64 4, i8* %3) #4, !noalias !7

%4 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %rpB, i64 0, metadata !14), !noalias !7

%5 = load i32*, i32** %pA.addr, align 4, !tbaa !3, !noalias !7

%6 = load i32, i32* %i.addr, align 4, !tbaa !11, !noalias !7

%add.ptr = getelementptr inbounds i32, i32* %5, i32 %6

store i32* %add.ptr, i32** %rpB, align 4, !tbaa !3, !noalias !7

%7 = load i32*, i32** %rpA, align 4, !tbaa !3, !noalias !7

%8 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %7, i8* %1, i32** %rpA, i64 0, metadata !13),

!tbaa !3, !noalias !7

store i32 10, i32* %8, align 4, !tbaa !11, !noalias !7

%9 = load i32*, i32** %rpB, align 4, !tbaa !3, !noalias !7

%10 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %9, i8* %4, i32** %rpB, i64 0, metadata !14),

!tbaa !3, !noalias !7

store i32 20, i32* %10, align 4, !tbaa !11, !noalias !7

%11 = load i32*, i32** %pC.addr, align 4, !tbaa !3, !noalias !7

store i32 30, i32* %11, align 4, !tbaa !11, !noalias !7

%12 = load i32*, i32** %rpA, align 4, !tbaa !3, !noalias !7

%13 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %12, i8* %1, i32** %rpA, i64 0, metadata !13),

!tbaa !3, !noalias !7

%14 = load i32, i32* %13, align 4, !tbaa !11, !noalias !7

%15 = load i32*, i32** %rpB, align 4, !tbaa !3, !noalias !7

%16 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %15, i8* %4, i32** %rpB, i64 0, metadata !14),

!tbaa !3, !noalias !7

%17 = load i32, i32* %16, align 4, !tbaa !11, !noalias !7

%add = add nsw i32 %14, %17

%18 = load i32*, i32** %pC.addr, align 4, !tbaa !3, !noalias !7

%19 = load i32, i32* %18, align 4, !tbaa !11, !noalias !7

%add1 = add nsw i32 %add, %19

%20 = bitcast i32** %rpB to i8*

call void @llvm.lifetime.end.p0i8(i64 4, i8* %20) #4

%21 = bitcast i32** %rpA to i8*

call void @llvm.lifetime.end.p0i8(i64 4, i8* %21) #4

ret i32 %add1

}

; ....

!7 = !{!15, !17}

!13 = !{!15}

!14 = !{!17}

!15 = distinct !{!15, !16, !"foo: rpA"}

!16 = distinct !{!16, !"foo"}

!17 = distinct !{!17, !16, !"foo: rpB"}

.. _noaliasinfo_local_restrict_llvm_1:

LLVM-IR code during optimization: stack objects have already been optimized

away, ``@llvm.noalias`` has been converted into ``@llvm.provenance.noalias`` and

propagated to the ``ptr_provenance`` path.

.. code-block:: llvm

; Function Attrs: nounwind

define dso_local i32 @foo(i32* %pA, i32 %i, i32* %pC) #0 {

entry:

%0 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !3)

%1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !6)

%add.ptr = getelementptr inbounds i32, i32* %pA, i32 %i

%2 = call i32* @llvm.provenance.noalias.p0i32.p0i8.p0p0i32.p0p0i32.i64(i32* %pA, i8* %0,

i32** null, i32** undef, i64 0, metadata !3), !tbaa !8, !noalias !12

store i32 10, i32* %pA, ptr_provenance i32* %2, align 4, !tbaa !13, !noalias !12

%3 = call i32* @llvm.provenance.noalias.p0i32.p0i8.p0p0i32.p0p0i32.i64(i32* %add.ptr, i8* %1,

i32** null, i32** undef, i64 0, metadata !6), !tbaa !8, !noalias !12

store i32 20, i32* %add.ptr, ptr_provenance i32* %3, align 4, !tbaa !13, !noalias !12

store i32 30, i32* %pC, align 4, !tbaa !13, !noalias !12

%4 = load i32, i32* %pA, ptr_provenance i32* %2, align 4, !tbaa !13, !noalias !12

%5 = load i32, i32* %add.ptr, ptr_provenance i32* %3, align 4, !tbaa !13, !noalias !12

%add = add nsw i32 %4, %5

%add1 = add nsw i32 %add, 30

ret i32 %add1

}

; ...

!3 = !{!4}

!4 = distinct !{!4, !5, !"foo: rpA"}

!5 = distinct !{!5, !"foo"}

!6 = !{!7}

!7 = distinct !{!7, !5, !"foo: rpB"}

!8 = !{!9, !9, i64 0}

!12 = !{!4, !7}

.. _noaliasinfo_local_restrict_llvm_2:

And LLVM-IR code after optimizations: alias analysis found the the stores do not

vzakhari (Unsubmitted, Not Done):

typo: "the the"

alias to each other and the values have been propagated.

.. code-block:: llvm

; Function Attrs: nounwind

define dso_local i32 @foo(i32* nocapture %pA, i32 %i, i32* nocapture %pC) local_unnamed_addr #0 {

entry:

%0 = tail call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !3)

%1 = tail call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !6)

%add.ptr = getelementptr inbounds i32, i32* %pA, i32 %i

%2 = tail call i32* @llvm.provenance.noalias.p0i32.p0i8.p0p0i32.p0p0i32.i64(i32* %pA, i8* %0,

i32** null, i32** undef, i64 0, metadata !3), !tbaa !8, !noalias !12

store i32 10, i32* %pA, ptr_provenance i32* %2, align 4, !tbaa !13, !noalias !12

%3 = tail call i32* @llvm.provenance.noalias.p0i32.p0i8.p0p0i32.p0p0i32.i64(i32* nonnull %add.ptr, i8* %1,

i32** null, i32** undef, i64 0, metadata !6), !tbaa !8, !noalias !12

store i32 20, i32* %add.ptr, ptr_provenance i32* %3, align 4, !tbaa !13, !noalias !12

store i32 30, i32* %pC, align 4, !tbaa !13, !noalias !12

ret i32 60

}

; ....

!3 = !{!4}

!4 = distinct !{!4, !5, !"foo: rpA"}

!5 = distinct !{!5, !"foo"}

!6 = !{!7}

!7 = distinct !{!7, !5, !"foo: rpB"}

!12 = !{!4, !7}
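
The constant ``ret i32 60`` is only valid because the caller is expected to honor the restrict promise made in the C99 listing of this example. The following harness is a minimal sketch of such a call: ``foo`` is copied from the example, while ``call_foo_without_aliasing`` is a hypothetical helper added for illustration.

```c
#include <assert.h>

/* Copied from the C99 listing of Example A. */
int foo(int *pA, int i, int *pC) {
  int *restrict rpA = pA;
  int *restrict rpB = pA + i;
  /* The three accesses are promised to not alias each other. */
  *rpA = 10;
  *rpB = 20;
  *pC = 30;
  return *rpA + *rpB + *pC;
}

/* Hypothetical helper: calls foo with three distinct objects, so the restrict
 * promise holds and the result matches the folded constant. */
int call_foo_without_aliasing(void) {
  int a[2] = {0, 0};
  int c = 0;
  int result = foo(a, 1, &c); /* rpA -> a[0], rpB -> a[1], pC -> c */
  assert(a[0] == 10 && a[1] == 20 && c == 30);
  return result;
}
```

Calling ``foo`` with ``i == 0`` would make ``rpA`` and ``rpB`` refer to the same object, which is undefined behavior in C99, so the optimizer does not have to preserve any particular result for that call.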

.. _noaliasinfo_pass_restrict:

Example B: pass a restrict pointer

""""""""""""""""""""""""""""""""""

.. _noaliasinfo_pass_restrict_C:

C99 code with local restrict variables:

.. code-block:: C

int fum(int * p);

int foo(int * pA) {

int * restrict rpA = pA;

*rpA = 10;

return fum(rpA);

}

.. _noaliasinfo_pass_restrict_llvm_0:

LLVM-IR code as produced by clang:

.. code-block:: llvm

; Function Attrs: nounwind

define dso_local i32 @foo(i32* %pA) #0 {

entry:

%pA.addr = alloca i32*, align 4

%rpA = alloca i32*, align 4

store i32* %pA, i32** %pA.addr, align 4, !tbaa !3, !noalias !7

%0 = bitcast i32** %rpA to i8*

call void @llvm.lifetime.start.p0i8(i64 4, i8* %0) #5, !noalias !7

%1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %rpA, i64 0, metadata !7), !noalias !7

%2 = load i32*, i32** %pA.addr, align 4, !tbaa !3, !noalias !7

store i32* %2, i32** %rpA, align 4, !tbaa !3, !noalias !7

%3 = load i32*, i32** %rpA, align 4, !tbaa !3, !noalias !7

%4 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %3, i8* %1, i32** %rpA, i64 0, metadata !7),

!tbaa !3, !noalias !7

store i32 10, i32* %4, align 4, !tbaa !10, !noalias !7

%5 = load i32*, i32** %rpA, align 4, !tbaa !3, !noalias !7

%6 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %5, i8* %1, i32** %rpA, i64 0, metadata !7),

!tbaa !3, !noalias !7

%call = call i32 @fum(i32* %6), !noalias !7

%7 = bitcast i32** %rpA to i8*

call void @llvm.lifetime.end.p0i8(i64 4, i8* %7) #5

ret i32 %call

}

.. _noaliasinfo_pass_restrict_llvm_1:

And LLVM-IR code after optimizations: stack objects have been optimized

away; ``@llvm.noalias`` has been converted into ``@llvm.provenance.noalias`` and

propagated to the ``ptr_provenance`` path. A ``@llvm.noalias.arg.guard`` has

been introduced to combine the ``ptr_provenance`` and the pointer value before

passing it to ``@fum``.

.. code-block:: llvm

; Function Attrs: nounwind

define dso_local i32 @foo(i32* %pA) local_unnamed_addr #0 {

entry:

%0 = tail call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !3)

%1 = tail call i32* @llvm.provenance.noalias.p0i32.p0i8.p0p0i32.p0p0i32.i64(i32* %pA, i8* %0,

i32** null, i32** undef, i64 0, metadata !3), !tbaa !6, !noalias !3

store i32 10, i32* %pA, ptr_provenance i32* %1, align 4, !tbaa !10, !noalias !3

%.guard.guard.guard.guard = call i32* @llvm.noalias.arg.guard.p0i32.p0i32(i32* nonnull %pA, i32* %1)

%call = tail call i32 @fum(i32* nonnull %.guard.guard.guard.guard) #4, !noalias !3

ret i32 %call

}

``@llvm.noalias.decl`` Intrinsic

--------------------------------

Syntax:

"""""""

.. code-block:: llvm

%p.decl =

i8* call @llvm.noalias.decl

T* %p.alloca, i64 objId, metadata !Scope

Overview:

"""""""""

Identify where in the control flow a *noalias* declaration happened.

Arguments:

""""""""""

* ``%p.alloca``: points to the ``alloca`` to which this declaration is

associated. Or ``null`` when the ``alloca`` was optimized away.

* ``objId``: an ID that is associated to this declaration. *SROA* treats this as

an offset wrt to the original ``alloca``.

vzakhari (Unsubmitted, Not Done):

* ``objId``: an ID that is associated to this declaration. *SROA* treats this as

- an offset wrt to the original ``alloca``.

+ an offset (in bytes) wrt to the original ``alloca``.

* ``!Scope``: a single scope that is associated with this declaration.

Is this correct?

* ``!Scope``: a single scope that is associated with this declaration.

Semantics:

""""""""""

Identify where in the control flow a *noalias* declaration happened. When this

intrinsic is duplicated, care must be taken to decide if the associated

``!Scope`` metadata must be duplicated as well (in case of loop unrolling) or

not (in case of code hoisting over then/else paths).

The function returns a handle to the *noalias* declaration.
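
The loop unrolling case can be pictured with a restrict pointer that is declared inside a loop body. The C sketch below is an illustration only; ``prefix_sum_in_place`` is a hypothetical function, and the remark about scopes is an interpretation of the paragraph above rather than a statement about the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the restrict pointer 'cur' is declared anew in every
 * iteration, so its promise only covers a single execution of the body. */
void prefix_sum_in_place(int *data, int n) {
  assert(data != NULL || n <= 0);
  for (int i = 1; i < n; ++i) {
    int *restrict cur = &data[i];   /* new noalias declaration per iteration */
    const int *prev = &data[i - 1]; /* written through 'cur' one iteration ago */
    *cur += *prev;
  }
}
```

The store through ``cur`` in one iteration and the load through ``prev`` in the next iteration touch the same element, which is legal because they belong to different instances of the declaration. When the body is unrolled, each copy of the declaration presumably needs its own ``!Scope``, otherwise the two iterations would look as if they shared one restrict pointer and this dependence could be missed.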

Examples:

"""""""""

See :ref:`Example A: local restrict<noaliasinfo_local_restrict>` and

:ref:`Example B: pass a restrict pointer<noaliasinfo_pass_restrict>`.

``@llvm.noalias`` Intrinsic

---------------------------

Syntax:

"""""""

.. code-block:: llvm

%p.noalias =

T* call @llvm.noalias

T* %p, i8* %p.decl,

T** %p.addr, i64 objId, metadata !Scope,

!noalias !VisibleScopes

Overview:

"""""""""

Adds *noalias* provenance information to a pointer.

Arguments:

""""""""""

* ``%p``: the original value of the pointer.

* ``%p.decl``: the associated *noalias* declaration (or ``null`` if the

declaration is not available).

* ``%p.addr``: the address of the pointer.

* ``objId``: the ID that is associated with the noalias declaration. *SROA* treats

this as an offset with respect to the original ``alloca``.

* ``!Scope``: a single scope that is associated with the noalias declaration.

* ``!VisibleScopes``: the scopes related to *noalias* declarations that are

visible at the location in the control flow where the noalias pointer is read from

memory.

Semantics:

""""""""""

Adds *noalias* provenance information so that all memory instructions that

depend on ``%p.noalias`` are known to be based on a pointer with extra *noalias*

info. This is a mostly opaque intrinsic for optimizations. In order to not block

optimizations, it will be converted into a ``@llvm.provenance.noalias`` and

moved to the ``ptr_provenance`` path of memory instructions.

When a ``%p.decl`` is available, the following arguments must match the ones in that

declaration: ``objId``, ``!Scope``.

When ``!Scope`` points to the *unknown function scope*, ``%p.decl`` must be

``null``.

.. note::

``@llvm.noalias`` can be seen as a shortcut for ``@llvm.provenance.noalias``

and ``@llvm.noalias.arg.guard``. See

:ref:`@llvm.noalias vs @llvm.provenance.noalias<noalias_vs_provenance_noalias>`.

Examples:

"""""""""

See :ref:`Example A: local restrict<noaliasinfo_local_restrict_llvm_0>` and

:ref:`Example B: pass a restrict pointer<noaliasinfo_pass_restrict_llvm_0>`.

``@llvm.provenance.noalias`` Intrinsic

--------------------------------------

Syntax:

"""""""

.. code-block:: llvm

%prov.p =

T* call @llvm.provenance.noalias

T* %p, i8* %p.decl,

T** %p.addr, T** %prov.p.addr, i64 objId, metadata !Scope,

!noalias !VisibleScopes

Overview:

"""""""""

Adds *noalias* provenance information to a pointer. This version, which is

similar to ``@llvm.noalias``, must only be found on the ``ptr_provenance`` path.

Arguments:

""""""""""

* ``%p``: the original value of the pointer, or a depending

``@llvm.provenance.noalias``.

* ``%p.decl``: the associated *noalias* declaration (or ``null`` if the

declaration is not available).

* ``%p.addr``: the address of the pointer.

* ``%prov.p.addr``: the ``ptr_provenance`` associated to ``%p.addr``. If this is

``undef``, the original ``%p.addr`` must be followed.

* ``objId``: the ID that is associated with the noalias declaration. *SROA* treats

this as an offset with respect to the original ``alloca``.

* ``!Scope``: a single scope that is associated with the noalias declaration.

* ``!VisibleScopes``: the scopes related to *noalias* declarations that are

visible at the location in the control flow where the noalias pointer is read from

memory.

Semantics:

""""""""""

Adds *noalias* provenance information to a pointer. This is similar to

``@llvm.noalias``, but this version must only be found on the ``ptr_provenance``

path of memory instructions or of the ``@llvm.noalias.arg.guard`` intrinsic.

It can also be found on the path of the ``%prov.p.addr`` and on the ``%p``

arguments of another ``@llvm.provenance.noalias`` intrinsic.

When a ``%p.decl`` is available, the following arguments must match the ones in that

declaration: ``objId``, ``!Scope``.

When ``!Scope`` points to the *unknown function scope*, ``%p.decl`` must be

``null``.

Examples:

"""""""""

See :ref:`Example A: local restrict<noaliasinfo_local_restrict_llvm_1>` and

:ref:`Example B: pass a restrict pointer<noaliasinfo_pass_restrict_llvm_1>`.

``@llvm.noalias.arg.guard`` Intrinsic

-------------------------------------

Syntax:

"""""""

.. code-block:: llvm

%p.guard =

T* call @llvm.noalias.arg.guard

T* %p, T* %prov.p

Overview:

"""""""""

Combines the value of a pointer with its *noalias* provenance information.

Arguments:

""""""""""

* ``%p``: the value of the pointer

* ``%prov.p``: the provenance information associated to ``%p``

Semantics:

""""""""""

Combines the value of a pointer with its *noalias* provenance information. This

is normally introduced when converting ``@llvm.noalias`` into

``@llvm.provenance.noalias`` and the pointer is passed as a function

argument, returned from a function or stored to memory. This intrinsic ensures

that at a later time (after inlining and/or other optimizations), the provenance

information can be propagated to the memory instructions depending on the guard.

Examples:

"""""""""

See :ref:`Example B: pass a restrict pointer<noaliasinfo_pass_restrict_llvm_1>`.

``@llvm.noalias.copy.guard`` Intrinsic

--------------------------------------

Syntax:

"""""""

.. code-block:: llvm

%p.addr.guard =

T* call @llvm.noalias.copy.guard

T* %p.addr, i8* %p.decl,

metadata !Indices,

metadata !Scope,

!noalias !VisibleScopes

Overview:

"""""""""

Annotates that the memory block pointed to by ``%p.addr`` contains *noalias

annotated pointers* (restrict pointers).

Arguments:

""""""""""

* ``%p.addr``: points to the block of memory that will be copied

* ``%p.decl``: the associated *noalias* declaration (or ``null`` if the

declaration is not available).

* ``!Indices``: the set of indices, describing on what locations a *noalias*

pointer can be found.

* ``!Scope``: a single scope that is associated with the noalias declaration.

* ``!VisibleScopes``: the scopes related to *noalias* declarations that are

visible at the location in the control flow where the noalias pointer is read from

memory.

Semantics:

""""""""""

Annotates that the memory block pointed to by ``%p.addr`` contains *noalias

annotated pointers* (restrict pointers). The ``!Indices`` indicate where in

memory the *noalias* pointers are located.

When a block copy (aggregate load/store or ``@llvm.memcpy``) uses

``%p.addr.guard`` as a source, *SROA* is able to reconstruct the implied

``@llvm.noalias`` intrinsics. This ensures that the *noalias* information for

those pointers is tracked.

When a ``%p.decl`` is available, the ``!Scope`` argument must match the one in

that declaration.

When ``!Scope`` points to the *unknown function scope*, ``%p.decl`` must be

``null``.

``!Indices`` points to a list of metadata. Each entry in that list contains a

set of ``i32`` values, corresponding to the indices that would be past to

vzakhari (Unsubmitted, Not Done):

past -> passed?

``getelementptr`` to retrieve a field in the struct. When the ``i32`` value is

**-1**, it indicates that any possible value should be checked (0, 1, 2, ...),

as long as the resulting address fits the size of the memory copy.
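
To make the role of the **-1** wildcard more concrete, the C sketch below expands the two index lists ``{-1, 0}`` and ``{-1, 1, -1, 1}`` from the example in this section into byte offsets and counts the pointers that fit inside a copy of a given size. It is only a model of the description above: ``count_noalias_pointers`` and ``fits`` are hypothetical helpers, and bounding the inner wildcard by the array length is a simplification of this sketch, not something the patch states.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the structs used in the example of this section. */
struct A {
  int m;
  int *restrict p;
};

struct B {
  int *restrict p;
  struct A a[5];
};

/* Length of B::a; bounding the inner wildcard by it is an assumption. */
enum { A_COUNT = 5 };

/* Returns 1 when a pointer stored at 'offset' lies fully inside the copy. */
static int fits(size_t offset, size_t copy_size) {
  return offset <= copy_size && sizeof(int *) <= copy_size - offset;
}

/* Counts the restrict pointers selected by {-1, 0} and {-1, 1, -1, 1} inside
 * a copy of 'copy_size' bytes that starts at a 'struct B'. Each -1 is
 * expanded to 0, 1, 2, ... while the resulting address still fits. */
size_t count_noalias_pointers(size_t copy_size) {
  size_t count = 0;
  assert(offsetof(struct B, p) == 0); /* the first member starts the struct */
  for (size_t k = 0; k * sizeof(struct B) < copy_size; ++k) {
    size_t base = k * sizeof(struct B);
    /* {-1, 0}: field 0 of element k, i.e. b[k].p */
    if (fits(base + offsetof(struct B, p), copy_size))
      ++count;
    /* {-1, 1, -1, 1}: field 1 of b[k].a[j], i.e. b[k].a[j].p */
    for (size_t j = 0; j < A_COUNT; ++j) {
      size_t off = base + offsetof(struct B, a) + j * sizeof(struct A) +
                   offsetof(struct A, p);
      if (fits(off, copy_size))
        ++count;
    }
  }
  return count;
}
```

For a copy of exactly ``sizeof(struct B)`` bytes this finds six locations: ``b[0].p`` and the five ``b[0].a[j].p`` members.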

Examples:

"""""""""

Code example with a ``llvm.noalias.copy.guard``:

* Note the **-1** to represent ``a[i]`` in the indices of ``!15``.

* After optimization, the ``alloca`` is gone. The ``llvm.memcpy`` is also gone,

but the remaining dependency on restrict pointers is kept in the

``llvm.provenance.noalias`` intrinsics. Two are needed for this example: one related to

the declaration of ``struct B tmp`` and one related to the ``unknown function

scope``.

.. code-block:: C

struct B {

int * restrict p;

struct A {

int m;

int * restrict p;

} a[5];

};

void FOO(struct B* b) {

struct B tmp = *b;

*tmp.a[1].p=32;

}

This results in the following code:

.. code-block:: llvm

%struct.B = type { i32*, [5 x %struct.A] }

%struct.A = type { i32, i32* }

; Function Attrs: nounwind

define dso_local void @FOO(%struct.B* %b) #0 !noalias !3 {

entry:

%b.addr = alloca %struct.B*, align 4

%tmp = alloca %struct.B, align 4

store %struct.B* %b, %struct.B** %b.addr, align 4, !tbaa !6, !noalias !10

%0 = bitcast %struct.B* %tmp to i8*

call void @llvm.lifetime.start.p0i8(i64 44, i8* %0) #5, !noalias !10

%1 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.Bs.i64(%struct.B* %tmp, i64 0, metadata !12),

!noalias !10

%2 = load %struct.B*, %struct.B** %b.addr, align 4, !tbaa !6, !noalias !10

%3 = call %struct.B* @llvm.noalias.copy.guard.p0s_struct.Bs.p0i8(%struct.B* %2,

i8* null, metadata !13, metadata !3)

%4 = bitcast %struct.B* %tmp to i8*

%5 = bitcast %struct.B* %3 to i8*

call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %4, i8* align 4 %5, i32 44, i1 false),

!tbaa.struct !16, !noalias !10

%a = getelementptr inbounds %struct.B, %struct.B* %tmp, i32 0, i32 1

%arrayidx = getelementptr inbounds [5 x %struct.A], [5 x %struct.A]* %a, i32 0, i32 1

%p = getelementptr inbounds %struct.A, %struct.A* %arrayidx, i32 0, i32 1

%6 = load i32*, i32** %p, align 4, !tbaa !18, !noalias !10

%7 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %6,

i8* %1, i32** %p, i64 0, metadata !12), !tbaa !18, !noalias !10

store i32 32, i32* %7, align 4, !tbaa !21, !noalias !10

%8 = bitcast %struct.B* %tmp to i8*

call void @llvm.lifetime.end.p0i8(i64 44, i8* %8) #5

ret void

}

; ...

!3 = !{!4}

!4 = distinct !{!4, !5, !"FOO: unknown scope"}

!5 = distinct !{!5, !"FOO"}

!10 = !{!11, !4}

!11 = distinct !{!11, !5, !"FOO: tmp"}

!12 = !{!11}

!13 = !{!14, !15}

!14 = !{i32 -1, i32 0}

!15 = !{i32 -1, i32 1, i32 -1, i32 1}

And after optimizations:

.. code-block:: llvm

; Function Attrs: nounwind

define dso_local void @FOO(%struct.B* nocapture %b) local_unnamed_addr #0 !noalias !3 {

entry:

%0 = tail call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 16, metadata !6)

%tmp.sroa.69.0..sroa_idx10 = getelementptr inbounds %struct.B, %struct.B* %b, i32 0, i32 1, i32 1, i32 1

%tmp.sroa.69.0.copyload = load i32*, i32** %tmp.sroa.69.0..sroa_idx10, align 4, !tbaa.struct !8, !noalias !14

%1 = tail call i32* @llvm.provenance.noalias.p0i32.p0i8.p0p0i32.p0p0i32.i64(i32* %tmp.sroa.69.0.copyload,

i8* null, i32** nonnull %tmp.sroa.69.0..sroa_idx10, i32** undef, i64 0, metadata !3)

%2 = tail call i32* @llvm.provenance.noalias.p0i32.p0i8.p0p0i32.p0p0i32.i64(i32* %1,

i8* %0, i32** null, i32** undef, i64 16, metadata !6), !tbaa !15, !noalias !14

store i32 32, i32* %tmp.sroa.69.0.copyload, ptr_provenance i32* %2, align 4, !tbaa !18, !noalias !14

ret void

}

; ...

!3 = !{!4}

!4 = distinct !{!4, !5, !"FOO: unknown scope"}

!5 = distinct !{!5, !"FOO"}

!6 = !{!7}

!7 = distinct !{!7, !5, !"FOO: tmp"}

Other usages of ``noalias`` inside LLVM

=======================================

``noalias`` attribute on parameters or function

-----------------------------------------------

This indicates that memory locations accessed via pointer values

:ref:`based <pointeraliasing>` on the argument or return value are not also

vzakhari (Unsubmitted, Not Done):

Does noalias attribute becomes redundant for arguments, since clang homes it with llvm.noalias.decl and loads from it with llvm.noalias?

accessed, during the execution of the function, via pointer values not

*based* on the argument or return value.

See :ref:`noalias attribute<noalias>`

``noalias`` and ``alias.scope`` Metadata

----------------------------------------

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic

noalias memory-access sets.

See :ref:`noalias and alias.scope Metadata <noalias_and_aliasscope>`

The usage of this construct is not recommended, as it can result in wrong code

when inlining and loop unrolling optimizations are applied.

vzakhari (Unsubmitted, Not Done):

Can you please add notes that ScopedNoAliasAA is the alias analysis that is using the noalias and ptr_provenance information?
Do you also have estimation of the complexity of alias queries with the new representation? How is it affected by the number of scopes in the scope lists attached to the load/store, by the length of the provenance chain, etc.?

References

==========

.. rubric:: References

.. [#R1] https://en.wikipedia.org/wiki/Restrict

.. [#R2] WG14 N1256: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf (Chapter 6.7.3.1 Formal definition of restrict)

.. [#R3] WG21 N4150: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4150.pdf

.. [#R4] https://reviews.llvm.org/D9375 Hal Finkel's local restrict patches

.. [#R5] https://bugs.llvm.org/show_bug.cgi?id=39240 "clang/llvm looses restrictness, resulting in wrong code"

.. [#R6] https://bugs.llvm.org/show_bug.cgi?id=39282 "Loop unrolling incorrectly duplicates noalias metadata"

.. [#R7] https://www.godbolt.org/z/cUk6To "testcase showing that LLVM-IR is not able to differentiate if restrict is done inside or outside the loop"

.. [#R8] DR294: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_294.htm

.. [#R9] WG14 N2260: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2260.pdf Clarifying the restrict Keyword v2

.. [#R10] RFC: Full 'restrict' support in LLVM https://lists.llvm.org/pipermail/llvm-dev/2019-October/135672.html

llvm/docs/UserGuides.rst

.. toctree::
(42 lines not shown)
 MemorySSA
 MergeFunctions
 MCJITDesignAndImplementation
 ORCv2
 OpaquePointers
 JITLink
 NewPassManager
 NVPTXUsage
+NoAliasInfo
 Phabricator
 Passes
 ReportingGuide
 Remarks
 SourceLevelDebugging
 StackSafetyAnalysis
 SupportLibrary
 TableGen/index
(67 lines not shown)

 :doc:`MergeFunctions`
 Describes functions merging optimization.

 :doc:`AliasAnalysis`
 Information on how to write a new alias analysis implementation or how to
 use existing analyses.

+:doc:`NoAliasInfo`
+Information on how provenance based alias analysis, used to implement C99
+restrict, works.
+
 :doc:`MemorySSA`
 Information about the MemorySSA utility in LLVM, as well as how to use it.

 :doc:`LoopTerminology`
 A document describing Loops and associated terms as used in LLVM.

 :doc:`Vectorizers`
 This document describes the current status of vectorization in LLVM.
(83 lines not shown)

Diff 346108
