
[PATCH 01/27] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation.
Needs Review, Public

Authored by jeroen.dobbelaere on Oct 4 2019, 1:59 PM.

Details

Reviewers
hfinkel
jdoerfert
Summary

This patch is the first of a series that introduces full restrict support
in LLVM and clang. The full support is based on the original local restrict
patches from Hal Finkel and is an implementation of the
'RFC: Full 'restrict' support in LLVM' [1].

In order to show the dependencies, in what follows, most of the time
a non-functional rebased patch from Hal Finkel is provided, followed
by a patch that enhances the full restrict support and makes everything
compile and run again.

[1] https://lists.llvm.org/pipermail/llvm-dev/2019-October/135672.html

Notes:

  • The mechanism with the ptr_provenance is such that passes that don't know about it will either work (and maybe miss certain optimizations) or crash. This is considered to be better than producing wrong code.
  • This set of patches is at the moment not complete. It is tested and works for the use cases of my company. But it is to be expected that some optimization passes will not interact well with it. In our experience, a number of optimization passes do have problems with the optional extra argument for load and store instructions, and they are normally easy to fix. It is possible that we did not yet catch all of those in passes that we don't use.
  • One item that is known to be missing, is LLVM IR bitcode support for the noalias_sidechannel of the load/store instruction (ascii LLVM IR is supported).
  • The new pass manager support has been fixed (D68507).
  • SLPVectorizer issues also have been fixed. (D68517)
  • The options enabling/disabling full restrict have been improved. (D68484)
  • A latent problem where invalid llvm-ir was produced during inlining has been fixed. (D68509)
  • A bug where the noalias depends-on relationship was lost has been fixed. (D68512 and D68521)

Added with the drop of 2020/06/12:

  • Renaming of 'side channel' to provenance, ptr_provenance
  • Incorporating Hal Finkel's changes. This should make it easier to review. It also reduces the number of patches to 26.
  • Handling of llvm.noalias.copy.guard during SROA has been improved.
  • Handling of Loop Unrolling has been improved.
  • Fixed a case in -fno-full-restrict where the new annotations were still produced.

Added with the drop of 2020/09/07:

  • llvm-IR bitcode support
  • DeadArgumentElimination support

Notes:

  • NoAliasInfo.rst describes the noalias intrinsics infrastructure.

This set of patches is based on 9fb46a452d4e5666828c95610ceac8dcd9e4ce16 (September 7, 2020)

A convenience patch is available at D69542.

Event Timeline

jdoerfert added inline comments. Oct 7 2019, 4:11 PM
llvm/docs/LangRef.rst
19245

In that example, what do the p.scopes look like? Or, asked differently, is the p.scope a consequence of the declaration, and hence does it uniquely identify a declaration?

19435

If the token is too restrictive I'd still prefer an i32 (or similar) to avoid confusion with all the i8 pointers that fly around. The wording will then make it clear that these are tokens.

jeroen.dobbelaere marked 3 inline comments as done. Oct 8 2019, 8:24 AM

Here is an example test.c:

struct FOO {
  int* restrict pA;
  int* pB;
  int* restrict pC;
};

void bar(int* a, int* b, int* c) {
  struct FOO tmp = { a, b, c };
  *tmp.pA=42;
  *tmp.pB=43;
  *tmp.pC=44;
}

Compiled as:

clang -mllvm --print-before-all -mllvm -debug -emit-llvm -O2 test.c -S -o -

Before SROA:

%tmp = alloca %struct.FOO, align 8
...
%1 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
...
%pA1 = getelementptr inbounds %struct.FOO, %struct.FOO* %tmp, i32 0, i32 0
%5 = load i32*, i32** %pA1, align 8, !tbaa !9, !noalias !6
%6 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %5, i8* %1, i32** %pA1, i64 0, metadata !6), !tbaa !9, !noalias !6
...
%pC3 = getelementptr inbounds %struct.FOO, %struct.FOO* %tmp, i32 0, i32 2
%8 = load i32*, i32** %pC3, align 8, !tbaa !12, !noalias !6
%9 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %8, i8* %1, i32** %pC3, i64 0, metadata !6), !tbaa !12, !noalias !6

During SROA: Notice how llvm.noalias.decl and llvm.noalias are split, using 0, 8 and 16 for the p.objId:

...
Rewriting alloca partition [0,8) to:   %tmp.sroa.0 = alloca i32*
Found llvm.noalias.decl:   %1 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
New   llvm.noalias.decl:   %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6)
 [...]
  rewriting [0,8) slice #2
    original:   %7 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %2, i32** %pA1, i64 0, metadata !6), !tbaa !9, !noalias !6
          to:   %7 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %1, i32** %tmp.sroa.0, i64 0, metadata !6), !tbaa !9, !noalias !6
 [...]
  rewriting split [0,24) slice #4 (splittable)
    original:   %2 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
          to:   %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6)
 [...]
Rewriting alloca partition [8,16) to:   %tmp.sroa.6 = alloca i32*
Found llvm.noalias.decl:   %2 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
New   llvm.noalias.decl:   %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !6)
  [...]
  rewriting split [0,24) slice #4 (splittable)
    original:   %3 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
          to:   %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !6)
  [...]
  rewriting [8,16) slice #7
    original:   %9 = load i32*, i32** %pB2, align 8, !tbaa !11, !noalias !6
          to:   %tmp.sroa.6.8. = load i32*, i32** %tmp.sroa.6, !tbaa !11, !noalias !6
Rewriting alloca partition [16,24) to:   %tmp.sroa.8 = alloca i32*
Found llvm.noalias.decl:   %3 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
New   llvm.noalias.decl:   %3 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !6)
  [...]
  rewriting split [0,24) slice #4 (splittable)
    original:   %4 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6)
          to:   %3 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !6)
  [...]
  rewriting [16,24) slice #10
    original:   %12 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %4, i32** %pC3, i64 0, metadata !6), !tbaa !12, !noalias !6
          to:   %12 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %3, i32** %tmp.sroa.8, i64 16, metadata !6), !tbaa !12, !noalias !6
  Speculating PHIs
  Speculating Selects
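
The 0, 8 and 16 values used for the p.objId in the log above appear to match the byte offsets of the three pointer members of struct FOO on a target with 8-byte pointers. As a minimal sketch in plain C (the helpers foo_member_offset and check_foo_layout are hypothetical names, not part of the patch), the offsets can be checked like this:

```c
#include <assert.h>
#include <stddef.h>

struct FOO {
  int *restrict pA;
  int *pB;
  int *restrict pC;
};

/* Hypothetical helper: byte offset of the index-th pointer member of FOO. */
static size_t foo_member_offset(int index) {
  switch (index) {
  case 0:
    return offsetof(struct FOO, pA);
  case 1:
    return offsetof(struct FOO, pB);
  default:
    return offsetof(struct FOO, pC);
  }
}

/* Checks that the members are laid out back to back, one pointer apart. */
static void check_foo_layout(void) {
  assert(foo_member_offset(0) == 0);
  assert(foo_member_offset(1) == sizeof(int *));
  assert(foo_member_offset(2) == 2 * sizeof(int *));
}
```

With 8-byte pointers this gives 0, 8 and 16, the same starting points as the alloca partitions [0,8), [8,16) and [16,24) in the log.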

Then later on:

Promoting allocas with mem2reg...
Zeoring noalias.decl dep:   %0 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6)
Zeroing operand 2 of   %3 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %0, i32** %tmp.sroa.0, i64 0, metadata !6), !tbaa !9, !noalias !6
[...]
Zeoring noalias.decl dep:   %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !2)
Zeroing operand 2 of   %4 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %2, i32** %tmp.sroa.8, i64 16, metadata !2), !tbaa !10, !noalias !2
[...]
Zeoring noalias.decl dep:   %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !2)
[...]

(aargh, 'Zeoring' should of course be 'Zeroing' ;) )

After this pass, we get:

define dso_local void @bar(i32* %a, i32* %b, i32* %c) #0 {
entry:
  %0 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !2)
  %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 8, metadata !2)  ; This one will be removed later on, as it is not used anywhere.
  %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 16, metadata !2)
  %3 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %a, i8* %0, i32** null, i64 0, metadata !2), !tbaa !5, !noalias !2
  store i32 42, i32* %3, align 4, !tbaa !10, !noalias !2
  store i32 43, i32* %b, align 4, !tbaa !10, !noalias !2
  %4 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %c, i8* %2, i32** null, i64 16, metadata !2), !tbaa !12, !noalias !2
  store i32 44, i32* %4, align 4, !tbaa !10, !noalias !2
  ret void
}
[...]
!2 = !{!3} ; p.scope: 'tmp' in function 'bar', also recycled by !noalias, as it is the only restrict declaration
!3 = distinct !{!3, !4, !"bar: tmp"}
!4 = distinct !{!4, !"bar"}
llvm/docs/LangRef.rst
19242

This is the confusing part for me for the LangRef vs the usage: should the LangRef describe only the high level effect, or can it also describe how llvm treats/optimizes stuff internally? I somehow have the feeling that we might want to have a separate restrict handling document, describing how the intrinsics and metadata work together. Or do you think such a thing also belongs in the LangRef?

19245

Yes, the p.scope is a result of the declaration and uniquely identifies one.

19435

ok. We can consider that.

a.elovikov added inline comments.
llvm/docs/LangRef.rst
19180

I find it strange to see %side.p on both left and right sides. Is it a typo or does it have some special meaning?

After reading till the intrinsics' description I believe it should be just "%p" on the right side.

19514

the `noalias_sidechannel` path

Not sure about the terminology, but are @llvm.noalias.arg.guard/@llvm.noalias.copy.guard considered to be noalias_sidechannel? I'd suggest not using the spelling from the load/store instructions and instead using a more general wording such as: moved onto the "side" path (if my understanding is correct here).

19554

No explicit "or null" here. Is that intentional?

jeroen.dobbelaere marked 3 inline comments as done. Oct 10 2019, 1:34 PM
jeroen.dobbelaere added inline comments.
llvm/docs/LangRef.rst
19180

Yes, that's a typo. The second %side.p should be %p:

%side.p = i8* @llvm.side.noalias.XXX(i8* %p, ...)

19514

The @llvm.noalias.arg.guard combines the normal path with the noalias_sidechannel path. The @llvm.noalias.copy.guard resides on the normal path and adds extra information to a copy operation (memcpy, load/store).
I tried to be consistent in terminology when referring to the 'noalias_sidechannel' path. (but I could also use the 'noalias side channel' or something similar).

19554

It can be 'null'

a.elovikov added inline comments. Oct 10 2019, 1:46 PM
llvm/docs/LangRef.rst
19514

How about this:

It will be transformed into a `llvm.side.noalias` intrinsic and moved onto
the `noalias_sidechannel` path for loads/stores and fed into the @llvm.noalias.arg.guard/@llvm.noalias.copy.guard intrinsics for function boundaries/copies respectively.

jeroen.dobbelaere marked an inline comment as done. Oct 10 2019, 1:53 PM
jeroen.dobbelaere added inline comments.
llvm/docs/LangRef.rst
19514

... and fed into the @llvm.noalias.arg.guard intrinsics for function boundaries.

(The @llvm.noalias.copy.guard is generated by the clang frontend)

izik1 added a subscriber: izik1. Oct 18 2019, 11:39 AM
jeroen.dobbelaere edited the summary of this revision.
aqjune added a subscriber: aqjune. Oct 30 2019, 7:52 PM
uenoku added a subscriber: uenoku. Nov 2 2019, 6:48 AM
simoll added a subscriber: simoll. Nov 5 2019, 6:29 AM
CryZe added a subscriber: CryZe. Feb 28 2020, 7:46 AM
jeroen.dobbelaere edited the summary of this revision. Mar 6 2020, 2:52 AM
alex added a subscriber: alex. May 27 2020, 9:49 PM

Note: I am working on an updated version of the patches, rebased to a more recent version of the tree; including some bug fixes and taking into account the rename of noalias_sidechannel to ptr_provenance etc.

jeroen.dobbelaere retitled this revision from [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. to [PATCH 01/26] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation..
jeroen.dobbelaere edited the summary of this revision.

If I'm understanding correctly, llvm.noalias.arg.guard(p, q) is equivalent to getelementptr q, p-q? And load i8, i8* %p, i8* %p_prov is equivalent to load(llvm.noalias.arg.guard(p, p_prov))? And llvm.provenance.noalias([...], %p.addr, <type>** %prov.p.addr, [...] is equivalent to llvm.noalias([...], llvm.noalias.arg.guard(p, p_prov), [...])? And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

So really, these are the new concepts in the IR:

  1. llvm.noalias.decl: this introduces a new "scope" for aliasing.
  2. llvm.noalias: this associates a pointer with the llvm.noalias.decl.

And the rest can be expressed in terms of those intrinsics and basic IR instructions.

It took me a long time to parse this out; I think the description here needs to be reorganized. It really needs to separate out the semantic core from the detailed dive into the various intrinsics. Maybe into five sections: how noalias scopes work, how separating provenance from pointer values works, a high-level description of the intrinsics, the suggested lowering of the C "restrict", and the detailed description of the individual intrinsics.


Before optimizations, there is the declaration of the restrict pointer and `llvm.noalias` is used whenever the value of the restrict pointer is read.

Maybe explain why you're suggesting this, as opposed to using llvm.noalias when the value is written. (I guess it has something to do with the C standard's definition of "based on"?)

jeroen.dobbelaere edited the summary of this revision. Jun 12 2020, 11:54 AM

If I'm understanding correctly, llvm.noalias.arg.guard(p, q) is equivalent to getelementptr q, p-q?

This is not correct: the 'llvm.noalias.arg.guard(p,ptr_provenance)' combines the 'value of the pointer' (p) with the 'provenance of the pointer' (ptr_provenance).
The ptr_provenance does not have a real 'value'. It is more like a dependency.

When you follow both, they should come together at some point, like at the input argument of a function:

  • the ptr_provenance purpose is to track the llvm.provenance.noalias information (and its dependencies). Normally there are no computations on this path.
  • the normal 'p' path, should in the best case only contain computations.

Due to inlining, it is possible that somewhere in the flow, the normal 'p' path also contains noalias information. The propagation pass should flatten that out.

And load i8, i8* %p, i8* %p_prov is equivalent to load(llvm.noalias.arg.guard(p, p_prov))?

Yes, this is correct. The ptr_provenance path was added explicitly to the load/store instructions, in order to get the llvm.noalias.arg.guard out of the way of most optimizations.
This makes it much easier to keep the code correct in the presence of optimizations.

And llvm.provenance.noalias([...], %p.addr, <type>** %prov.p.addr, [...] is equivalent to llvm.noalias([...], llvm.noalias.arg.guard(p, p_prov), [...])?

Yes. llvm.provenance.noalias and llvm.noalias are equivalent. The former does track more information, as it is itself also treated like a 'memory instruction', so that llvm.noalias.arg.guard is not needed.

And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

No.

llvm.noalias.copy.guard tells that the pointer it returns has restrict pointers as specified by the struct indices (encoded in the metadata value).
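
As a rough illustration only, here is the kind of C aggregate copy that such an annotation would presumably describe. This is plain C and not LLVM code; the function name copy_foo is hypothetical, and the struct is the FOO from the earlier test.c example:

```c
#include <assert.h>

struct FOO {
  int *restrict pA; /* member index 0: restrict */
  int *pB;          /* member index 1: not restrict */
  int *restrict pC; /* member index 2: restrict */
};

/* An aggregate copy: the restrict pointers move along with the bytes,
   so the copied memory holds restrict pointers at member indices 0 and 2. */
static void copy_foo(struct FOO *dst, const struct FOO *src) {
  *dst = *src;
}
```

The copy itself contains no explicit read of pA or pC, so the information about which members are restrict has to be attached to the copy operation.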

So really, these are the new concepts in the IR:

  1. llvm.noalias.decl: this introduces a new "scope" for aliasing.
  2. llvm.noalias: this associates a pointer with the llvm.noalias.decl.
  3. llvm.noalias.arg.guard: combines a pointer (computation) path with a ptr_provenance path
  4. llvm.noalias.copy.guard: indicates on what indices in memory a restrict pointer is located

And the rest can be expressed in terms of those intrinsics and basic IR instructions.

Yes.
llvm.provenance.noalias was introduced as a 'safeguard', to make it clear that it always must be on the 'ptr_provenance' operand side.
The ptr_provenance operand was introduced to keep the information out of the way of most optimization passes.

It took me a long time to parse this out; I think the description here needs to be reorganized. It really needs to separate out the semantic core from the detailed dive into the various intrinsics. Maybe into five sections: how noalias scopes work, how separating provenance from pointer values works, a high-level description of the intrinsics, the suggested lowering of the C "restrict", and the detailed description of the individual intrinsics.

Yes, that makes sense. I am in the process of putting all of this in a separate document, but I didn't want to wait to get the updated patches out ;)
I hope to update this 01/26 patch early next week with the next iteration of the documentation. This is already very useful input for it !


Before optimizations, there is the declaration of the restrict pointer and `llvm.noalias` is used whenever the value of the restrict pointer is read.

Maybe explain why you're suggesting this, as opposed to using llvm.noalias when the value is written. (I guess it has something to do with the C standard's definition of "based on"?)

Yes. I hope that the updated documentation will make this easier to understand.

Thanks !

Jeroen Dobbelaere

jeroen.dobbelaere edited the summary of this revision. Jun 12 2020, 12:23 PM

If I'm understanding correctly, llvm.noalias.arg.guard(p, q) is equivalent to getelementptr q, p-q?

This is not correct: the 'llvm.noalias.arg.guard(p,ptr_provenance)' combines the 'value of the pointer' (p) with the 'provenance of the pointer' (ptr_provenance).
The ptr_provenance does not have a real 'value'. It is more like a dependency.

When you follow both, they should come together at some point, like at the input argument of a function:

  • the ptr_provenance purpose is to track the llvm.provenance.noalias information (and its dependencies). Normally there are no computations on this path.
  • the normal 'p' path, should in the best case only contain computations.

Due to inlining, it is possible that somewhere in the flow, the normal 'p' path also contains noalias information. The propagation pass should flatten that out.

getelementptr q, (ptrtoint(p)-ptrtoint(q)) should return a pointer with provenance of q, and the value of p. (http://llvm.org/docs/LangRef.html#pointer-aliasing-rules). I can't see how it isn't equivalent... unless noalias provenance is somehow different from the usual aliasing rules.
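
The getelementptr formulation above has a direct C analogue, sketched here under the assumption that p and q point into the same array (otherwise the pointer subtraction is undefined in C). The names rebase and rebase_demo are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* C analogue of `getelementptr q, (ptrtoint(p) - ptrtoint(q))`:
   start from q and step by the distance to p.
   Assumption: p and q point into the same array. */
static char *rebase(char *q, char *p) {
  ptrdiff_t delta = p - q;
  return q + delta;
}

/* Returns 1 when the rebased pointer compares equal to p. */
static int rebase_demo(void) {
  char buf[16] = {0};
  char *q = buf;
  char *p = buf + 5;
  return rebase(q, p) == p;
}
```

The result compares equal to p while being computed from q, which is the "provenance of q, value of p" reading described above.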

Does the presence of provenance markings fix https://bugs.llvm.org/show_bug.cgi?id=35229 ?

And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

No.

llvm.noalias.copy.guard tells that the pointer it returns has restrict pointers as specified by the struct indices (encoded in the metadata value).

Oh, I see, it only applies the provenance to loads derived from that pointer, not all loads from the memory.

getelementptr q, (ptrtoint(p)-ptrtoint(q)) should return a pointer with provenance of q, and the value of p. (http://llvm.org/docs/LangRef.html#pointer-aliasing-rules). I can't see how it isn't equivalent... unless noalias provenance is somehow different from the usual aliasing rules.

I see now. It is indeed somewhat equivalent. The separate intrinsic makes it easier to convey the specific purpose of the construct and to control the kind of optimizations that we want to allow.
A generalized version of the 'llvm.noalias.arg.guard', maybe something like 'llvm.ptr.provenance %pValue, %pProv1 [, %pProv_i]*', could convey the same information, and could help with fixing the bug you mention.
But, this is not the goal of the full restrict patches, and I would rather start with the current focused set of intrinsics, before trying to expand on it.

Does the presence of provenance markings fix https://bugs.llvm.org/show_bug.cgi?id=35229 ?

No, that problem is not fixed with the full restrict patches.

And llvm.noalias.copy.guard is equivalent to loading a pointer, applying llvm.noalias to it, and storing it back to the same address?

No.

llvm.noalias.copy.guard tells that the pointer it returns has restrict pointers as specified by the struct indices (encoded in the metadata value).

Oh, I see, it only applies the provenance to loads derived from that pointer, not all loads from the memory.

yes.

MSxDOS added a subscriber: MSxDOS. Jun 15 2020, 10:49 PM

I see now. It is indeed somewhat equivalent. The separate intrinsic makes it easier to convey the specific purpose of the construct and to control the kind of optimizations that we want to allow.

Sure, I wasn't suggesting that you'd want to actually use the getelementptr version, just trying to understand the intended meaning.

A generalized version of the 'llvm.noalias.arg.guard', maybe something like 'llvm.ptr.provenance %pValue, %pProv1 [, %pProv_i]*', could convey the same information, and could help with fixing the bug you mention.

Is there some semantic difference between llvm.noalias.arg.guard and something like llvm.ptr.provenance? Or is it just a difference in the intended use?

A generalized version of the 'llvm.noalias.arg.guard', maybe something like 'llvm.ptr.provenance %pValue, %pProv1 [, %pProv_i]*', could convey the same information, and could help with fixing the bug you mention.

Is there some semantic difference between llvm.noalias.arg.guard and something like llvm.ptr.provenance? Or is it just a difference in the intended use?

The llvm.noalias.arg.guard is intended to only track noalias dependencies. The llvm.ptr.provenance could be used to track provenance in a more general way (Like pointing to the original alloca).

jeroen.dobbelaere edited the summary of this revision.

Initial version of 'NoAliasInfo.rst', describing the noalias intrinsics infrastructure.

Notes:

  • in a future version 'llvm.noalias' and 'llvm.provenance.noalias' will be merged into a single intrinsic.
  • any feedback is welcome !
Matt added a subscriber: Matt. Jun 29 2020, 12:41 PM
jeroen.dobbelaere retitled this revision from [PATCH 01/26] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. to [PATCH 01/26] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation.. Jul 7 2020, 3:14 AM
jeroen.dobbelaere edited the summary of this revision.
jeroen.dobbelaere edited the summary of this revision.
jeroen.dobbelaere edited the summary of this revision. Jul 10 2020, 9:13 AM

Notes:

  • in a future version 'llvm.noalias' and 'llvm.provenance.noalias' will be merged into a single intrinsic.

I was thinking of merging llvm.noalias and llvm.provenance.noalias but now decided to not do it:

  • llvm.noalias is a convenience shortcut to llvm.provenance.noalias + llvm.noalias.arg.guard
  • keeping the convenience intrinsic reduces the amount of generated code and makes tracking tbaa on the intrinsics easier.
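
The shortcut relationship described in the bullets above can be pictured with a toy model in plain C. This is only a sketch of the composition, not LLVM code: the type tracked_ptr and the functions provenance_noalias, arg_guard and noalias are hypothetical stand-ins named after the intrinsics, and an integer scope id stands in for the real provenance.

```c
#include <assert.h>

/* Toy model: a pointer value plus a tag standing in for its provenance. */
struct tracked_ptr {
  int *value; /* pointer computation path */
  int scope;  /* provenance path: id of the noalias scope */
};

/* Stand-in for llvm.provenance.noalias: yields the provenance of a scope. */
static int provenance_noalias(int scope) { return scope; }

/* Stand-in for llvm.noalias.arg.guard: joins a value with a provenance. */
static struct tracked_ptr arg_guard(int *value, int provenance) {
  struct tracked_ptr t = {value, provenance};
  return t;
}

/* Stand-in for llvm.noalias: the shortcut for the two steps above. */
static struct tracked_ptr noalias(int *value, int scope) {
  return arg_guard(value, provenance_noalias(scope));
}
```

In this model the one-step and two-step forms produce the same result, which is the sense in which the convenience intrinsic saves generated code.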
jeroen.dobbelaere edited the summary of this revision.

Updated NoAliasInfo.rst to explain the relationship between @llvm.noalias and @llvm.provenance.noalias, @llvm.noalias.arg.guard

Rebased to c06b7e2ab5167ad031745a706204abed1aefd823 (July 14, 2020)

jeroen.dobbelaere edited the summary of this revision.

Rebased to 9fb46a452d4e5666828c95610ceac8dcd9e4ce16 (September 7, 2020)

Hmm. If anybody knows how to hide the inline comments from an older revision..

jeroen.dobbelaere retitled this revision from [PATCH 01/26] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation. to [PATCH 01/27] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation.. Sep 7 2020, 2:37 PM

The effect of the patches on the compile time can be found here: https://llvm-compile-time-tracker.com/index.php?branch=dobbelaj-snps/perf/full_restrict-20200907
For some of the regressions, I already have some ideas on how to reduce the impact. I propose to have the discussion on the respective patches.

yaxunl added a subscriber: yaxunl. Sep 15 2020, 12:48 PM

ping

Any feedback on this patch ?

Note: On some architectures, you might want to use -mllvm -enable-aa-sched-mi to make use of alias information when scheduling the machine instructions.

nikic added a subscriber: nikic. Tue, Nov 3, 2:25 PM

As promised, I've started testing this patch set in rust. Unfortunately I quickly ran into an assertion failure on the following reduced test case:

%0 = type { i32 }
%1 = type { i32 }

define internal void @foo0(%0* noalias %ptr) {
    store %0 zeroinitializer, %0* %ptr
    ret void
}

define internal void @foo1(%1* noalias %ptr) {
    store %1 zeroinitializer, %1* %ptr
    ret void
}

define void @bar(%0* %ptr0, %1* %ptr1) {
    call void @foo0(%0* noalias %ptr0)
    call void @foo1(%1* noalias %ptr1)
    ret void
}

Run opt -inline:

opt: /home/nikic/rust/src/llvm-project/llvm/include/llvm/Support/Casting.h:269: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = llvm::Function; Y = llvm::Value; typename llvm::cast_retty<X, Y*>::ret_type = llvm::Function*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build/x86_64-unknown-linux-gnu/llvm/bin/opt -S -inline 
1.	Running pass 'CallGraph Pass Manager' on module '<stdin>'.
 #0 0x0000557429562c40 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x28a8c40)
 #1 0x00005574295608e4 llvm::sys::RunSignalHandlers() (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x28a68e4)
 #2 0x0000557429560a28 SignalHandler(int) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x28a6a28)
 #3 0x00007fcfc4eb43c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
 #4 0x00007fcfc498418b raise /build/glibc-ZN95T4/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #5 0x00007fcfc4963859 abort /build/glibc-ZN95T4/glibc-2.31/stdlib/abort.c:81:7
 #6 0x00007fcfc4963729 get_sysdep_segment_value /build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:509:8
 #7 0x00007fcfc4963729 _nl_load_domain /build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:970:34
 #8 0x00007fcfc4974f36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
 #9 0x0000557428c60829 (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x1fa6829)
#10 0x0000557428c6b554 llvm::IRBuilderBase::CreateNoAliasDeclaration(llvm::Value*, llvm::Value*, llvm::Value*) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x1fb1554)
#11 0x00005574295f3783 AddNoAliasIntrinsics(llvm::CallBase&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::MDNode*&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x2939783)
#12 0x00005574295f4699 llvm::InlineFunction(llvm::CallBase&, llvm::InlineFunctionInfo&, llvm::AAResults*, bool, llvm::Function*) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x293a699)
#13 0x0000557428e49c48 llvm::LegacyInlinerBase::inlineCalls(llvm::CallGraphSCC&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x218fc48)
#14 0x00005574283ac72e (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x16f272e)
#15 0x0000557428cc1503 llvm::legacy::PassManagerImpl::run(llvm::Module&) (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x2007503)
#16 0x000055742734a7e2 main (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x6907e2)
#17 0x00007fcfc49650b3 __libc_start_main /build/glibc-ZN95T4/glibc-2.31/csu/../csu/libc-start.c:342:3
#18 0x00005574273e4b5e _start (build/x86_64-unknown-linux-gnu/llvm/bin/opt+0x72ab5e)

I believe this is a known problem with the name mangling for pointers to anonymous types. I think @fhahn may know more about this, IIRC this came up as a problem with PredicateInfo as well.

As promised, I've started testing this patch set in rust. Unfortunately I quickly ran into an assertion failure on the following reduced test case:

Thank you for trying this out ! D91250 should resolve that problem. Can you try again with that patch applied ?

Thanks !

Jeroen Dobbelaere