This is an archive of the discontinued LLVM Phabricator instance.

llvm/docs/LangRef.rst
20409	`a memory region`. Maybe even something like: `annotates a pointer with memory region that is accessible through that pointer.`
20418	Do we want to specify that `end_offset` > `begin_offset` ?
20444–20446	I have a hard time parsing that sentence, did you mean something like: The only ``inbounds`` addresses that can be computed from pointer ``%ptr`` are those that lie within offsets ``[-8, 24]`` from that pointer. Other offsets will result in a :ref:`poison value <poisonvalues>`.
20445	Given the semantics above, 24 should be excluded: `[8,24)`

arichardson added a subscriber: arichardson.Dec 8 2021, 12:24 AM

lebedev.ri marked an inline comment as done.Dec 8 2021, 2:57 AM

lebedev.ri added inline comments.

llvm/docs/LangRef.rst
20418	`end_offset` can not be required to be bigger than `begin_offset`. E.g. consider the following 32-byte region: `[%ptr - 32, %ptr + 0]`, where `%ptr` is the `end` pointer.
Likewise, i don't think we can specify that the total region size is non-zero, because then we can't declare empty memory regions, where there is only an `end` pointer.
What i think still needs to be better clarified is whether `begin_offset` should specify the offset from the beginning of the memory region to the pointer (i.e. be non-negative (`begin_offset s>= 0`)), or the offset from the pointer to the beginning of the memory region (i.e. be non-positive (`begin_offset s<= 0`)).
I guess, given that `end_offset` specifies the offset from the pointer to the `end` of the region, we want `begin_offset` to be symmetrical, and specify the offset from the pointer to the beginning of the memory region (i.e. be non-positive (`begin_offset s<= 0`)).
20445	Hm, no, why? Is the `end` pointer not `inbounds`?

courbet added inline comments.Dec 8 2021, 4:07 AM

llvm/docs/LangRef.rst
20418	> `end_offset` can not be required to be bigger than `begin_offset`. E.g. consider the following 32-byte region: `[%ptr - 32, %ptr + 0]`, where `%ptr` is the `end` pointer.
In that case, `begin_offset` is `-32` and `end_offset` is `0`, right? So we still have `begin_offset <= end_offset`.
> Likewise, i don't think we can specify that the total region size is non-zero, because then we can't declare empty memory regions, where there is only an `end` pointer.
Yes, you're right, obviously for the case of `int a[0]`. BTW there is an inconsistency between GNU C and C99 on this for flexible array members: in the former case we should not emit an annotation for this while we should for the latter.
> What i think still needs to be better clarified is whether `begin_offset` should specify the offset from the beginning of the memory region to the pointer (i.e. be non-negative (`begin_offset s>= 0`)), or the offset from the pointer to the beginning of the memory region (i.e. be non-positive (`begin_offset s<= 0`))
My vote goes to `begin_offset >= 0` => memory region begins after the pointer.
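
To make the sign question concrete, here is a small Python sketch (not part of the patch). It assumes both offsets are signed and relative to the pointer, which is the "symmetrical" convention lebedev.ri leans towards; the name `region_bounds` and the use of plain integers as addresses are made up for illustration.

```python
def region_bounds(ptr, begin_offset, end_offset):
    """Return (begin, end, size) with both offsets signed and relative to ptr."""
    begin = ptr + begin_offset
    end = ptr + end_offset
    return begin, end, end - begin

# %ptr is the end pointer of a 32-byte region: [%ptr - 32, %ptr + 0].
print(region_bounds(1000, -32, 0))   # (968, 1000, 32)
# Empty region where only the end pointer exists.
print(region_bounds(1000, 0, 0))     # (1000, 1000, 0)
```

Under this convention both examples from the thread satisfy `begin_offset <= end_offset`, so a non-strict ordering would not rule out either of them.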

Misc wording improvements.

Harbormaster completed remote builds in B138120: Diff 392711.Dec 8 2021, 5:32 AM

(@courbet i believe i have addressed your comments)

courbet added inline comments.Dec 10 2021, 1:14 AM

llvm/docs/LangRef.rst
20428–20431	I'm still confused as to why we need that restriction. For example, why do we want to disallow the following:
  %p2 = getelementptr inbounds i32, i32* %p1, i64 -42
  %p2_bounded = @llvm.memory.region.decl.p0i8(%p2, 42, 43)
  %p3 = getelementptr inbounds i32, i32* %p2_bounded, i64 42
  %v = load i32, i32* %p3, align 4

lebedev.ri marked an inline comment as done.Dec 10 2021, 1:24 AM

lebedev.ri added inline comments.

llvm/docs/LangRef.rst
10355–10362	> I'm still confused as to why we need that restriction.

nikic added a subscriber: nikic.Dec 13 2021, 3:02 AM

nikic added inline comments.

llvm/docs/LangRef.rst
20428–20431	At least as defined, `%p2_bounded` would already be poison in that case, because `%p2_bounded` would be before the declared memory region. The fact that `%p3` would point back into the memory region doesn't matter in that case, because the pointer is already poison.
20472	Doesn't this sentence contradict the first sentence in "Semantics"? If you want to make a distinction between inbounds/non-inbounds, then I think you have to do that in terms of restricting the visible allocated object, rather than saying that any pointer based on it cannot be outside the range. That would mean that something like `%p = memory.region.decl(%p0, 8, 16)` would not be poison, though dereferencing it would be and doing an inbounds gep would be, while doing a non-inbounds gep by 8 and then dereferencing would be legal.
llvm/include/llvm/IR/Intrinsics.td
1186	While technically correct, annotating it `Returned` means that the intrinsic is simply going to be optimized away. You want to drop `returned` here and instead add the intrinsic to `isIntrinsicReturningPointerAliasingArgumentWithoutCapturing()`, which handles various intrinsics of this kind.
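
As a rough illustration of nikic's reading of the `%p2_bounded` example, the following Python toy model treats pointers as integers and poison as `None`. It assumes the offsets are byte offsets and follows the "as defined" wording, where the returned pointer must itself lie inside `[ptr+begin_offset, ptr+end_offset)`. The helper names `region_decl` and `gep` are hypothetical and do not correspond to any LLVM API.

```python
POISON = None  # stand-in for an LLVM poison value in this toy model

def region_decl(ptr, begin_offset, end_offset):
    """'As defined' wording: the returned pointer must itself lie in the region."""
    if ptr is POISON:
        return POISON
    if begin_offset <= 0 < end_offset:   # ptr is in [ptr+begin, ptr+end)
        return ptr
    return POISON

def gep(ptr, byte_offset):
    """Pointer arithmetic that only propagates poison; bounds are not modelled."""
    return POISON if ptr is POISON else ptr + byte_offset

p1 = 4096                               # arbitrary address standing in for %p1
p2 = gep(p1, -42 * 4)                   # i32 elements, so -42 * 4 bytes
p2_bounded = region_decl(p2, 42, 43)    # region is [p2+42, p2+43); p2 lies before it
p3 = gep(p2_bounded, 42 * 4)
print(p2_bounded, p3)                   # None None
```

In this model `%p3` never gets a chance to "point back" into the region, because the value it is derived from is already poison.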

Rebased, addressed review notes.
Ping, i feel like i'd want to start making progress here.

Herald added a project: Restricted Project.Mar 2 2022, 12:13 PM

lebedev.ri added inline comments.Mar 2 2022, 12:13 PM

llvm/docs/LangRef.rst
20472	Hmm, i do not remember why i've added that footprint here. I believe we don't want to make a distinction between inbounds/non-inbounds.

Harbormaster completed remote builds in B152218: Diff 412505.Mar 2 2022, 1:17 PM

I still believe this should be a generic assume.passthrough intrinsic with an operand bundle for the specific use case.
I briefly described that here: https://lists.llvm.org/pipermail/llvm-dev/2021-December/154281.html

The intrinsic would also be speculatable, readnone, all the good stuff.
The resulting value has all the annotated properties *or* is poison.
This matches attributes on call sites and as such we can retain attributes from call sites:

call @foo(i8* %p, i8* align(16) nonnull %p);

will become

%arg1 = llvm.assume.passthrough(%p) ["align"(16), "nonnull"]
call @foo(i8* %p, i8* align(16) nonnull %arg1);

just before inlining. This way the information is retained properly.
Note that we should already retain it with llvm.assume if there is a noundef as well.

Does anybody else have any thoughts/opinions on this?

In D115274#3361333, @jdoerfert wrote:

> I still believe this should be a generic assume.passthrough intrinsic with an operand bundle for the specific use case.

Is that a blocker?
I'm guessing you also argue that the noalias stuff should likewise also be designed this way?

This concept of sub-objects also shows up in the 'restrict' stuff. The difference is that with 'restrict' the aliasing constraints are dynamic: the promise is that e.g. two (dynamic) accesses in the original program don't alias. Here the constraints are static.

Ideally we would have a single solution for all this stuff, but instead right now we have multiple disjoint solutions. Each one will have to go through the pain of patching all optimizers so they understand the new intrinsics so they don't throttle back. It's not ideal.
I just wish we could sit physically in some place for a few days and work on a solution once and for all.

It's also very hard to discuss designs when so much code has been written already (especially for the 'restrict' stuff).

Regarding this patch in particular, I would mention that the storage of the returned sub-object is shared with the parent object, just to make it explicit. The concept looks ok.
Again, my concern is that to be able to use this new intrinsic from clang by default will take a lot of work. You'll likely face many perf regressions before patching a bunch of optimizations.

Thanks for taking a look!

In D115274#3362337, @nlopes wrote:

> This concept of sub-objects also shows up in the 'restrict' stuff. The difference is that with 'restrict' the aliasing constraints are dynamic: the promise is that e.g. two (dynamic) accesses in the original program don't alias. Here the constraints are static.
>
> Ideally we would have a single solution for all this stuff, but instead right now we have multiple disjoint solutions. Each one will have to go through the pain of patching all optimizers so they understand the new intrinsics so they don't throttle back. It's not ideal.

Right.

> I just wish we could sit physically in some place for a few days and work on a solution once and for all.
>
> It's also very hard to discuss designs when so much code has been written already (especially for the 'restrict' stuff).

FWIW, currently i don't have any further code for this.

> Regarding this patch in particular, I would mention that the storage of the returned sub-object is shared with the parent object, just to make it explicit. The concept looks ok.

Done.

> Again, my concern is that to be able to use this new intrinsic from clang by default will take a lot of work. You'll likely face many perf regressions before patching a bunch of optimizations.

I acknowledge this reality. I'm just not sure we have a better approach than that.
It would be good to come up with "a solution once and for all",
but i'm not sure how that would look.

Harbormaster completed remote builds in B154009: Diff 414962.Mar 13 2022, 4:26 PM

ping

lebedev.ri mentioned this in D114988: [IR] `GetElementPtrInst`: per-index `inrange` support.Mar 21 2022, 7:49 AM

dtemirbulatov added a subscriber: dtemirbulatov.Mar 21 2022, 7:57 AM

In general I like this and would be very happy if we had a better way of dealing with sub-objects in LLVM.

Looking at this from the CHERI perspective, it seems like we could enforce this intrinsic at runtime by lowering it the same way as we currently do for the out-of-tree `@llvm.cheri.bounds.set()`. That intrinsic also creates a memory subregion; the only difference, as far as I can see, is that negative start offsets have to be encoded by doing a negative GEP first, since we only have a pointer+length argument. And, of course, that it is enforced at runtime by narrowing the bounds that are part of the CHERI pointer.
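
The re-encoding described here is plain arithmetic, which the following Python sketch spells out. It only models the conversion from `(ptr, begin_offset, end_offset)` to a base pointer plus length; it does not model the CHERI intrinsic itself, and the function name `to_pointer_and_length` is an assumption made for illustration.

```python
def to_pointer_and_length(ptr, begin_offset, end_offset):
    """Re-express (ptr, begin_offset, end_offset) as (base, length).

    The base is the pointer moved by begin_offset (the "negative GEP" when
    begin_offset is negative); the length is the size of the region.
    """
    if end_offset < begin_offset:
        raise ValueError("region end precedes its beginning")
    base = ptr + begin_offset
    length = end_offset - begin_offset
    return base, length

print(to_pointer_and_length(1000, -8, 24))  # (992, 32)
```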

llvm/docs/LangRef.rst
20452	Do you envision this being used for all sub-object pointer creations? If so it might need a flag to disable it since it might break some C patterns such as `container_of`. According to https://godbolt.org/z/evTbejaMf the container_of macro results in an inbounds GEP, so with sufficient inlining things might break? About three years ago I spent quite a lot of time enforcing sub-object bounds at runtime using CHERI. Almost all code works just fine but there are things such as container_of() that require opt-out annotations. I wrote about the incompatibilities that I found in Chapter 5 of https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-949.pdf. TL;DR: not many changes needed - about 50 annotations across the entire FreeBSD source tree. Almost all annotations due to container_of or emulation of C++ inheritance in C.

Thank you for taking a look!

llvm/docs/LangRef.rst
20452	I'm not sure what you mean by "all sub-object pointer creations". Roughly, front-ends should emit this intrinsic on some pointer with some bounds iff they know that it would be UB to go from that specific pointer (aka, as per def-use) outside of the specified bounds. The one case we know of is C arrays within structs.

arichardson added inline comments.Mar 21 2022, 4:04 PM

llvm/docs/LangRef.rst
20452	By sub-object pointer creations I mean something like `&obj->field`. You could conceivably treat that as declaring a new sub-object bounded to just `field`. E.g. something like this: https://godbolt.org/z/bM7j1bxqs

lebedev.ri marked 2 inline comments as done.Mar 21 2022, 4:13 PM

lebedev.ri added a subscriber: aaron.ballman.

lebedev.ri added inline comments.

llvm/docs/LangRef.rst
20452	I think the question is slightly wrong. It's up to front-ends to decide when they can and can't emit this. If you are asking about https://godbolt.org/z/adq5EWx17, then as per previous conversations about this, i do believe that code to be well-defined and not UB. IOW, i do not believe that as per the current C/C++ standards wording each data member of a struct is its own sub-object from which you are not allowed to get to its neighbor objects. But perhaps @aaron.ballman wants to correct me on this.

arichardson added inline comments.Mar 21 2022, 4:29 PM

llvm/docs/LangRef.rst
20452	Yes absolutely agreed that this is purely up to the frontend, I just assumed you had plans to update clang to emit the new intrinsic. It has been a long time since I looked at the C standard with regards to subobjects but if I recall correctly you are right that this is not defined as being illegal. However, doesn't that also mean that you can access the member before/after an embedded array?

lebedev.ri marked 2 inline comments as done.Mar 21 2022, 4:39 PM

lebedev.ri added inline comments.

llvm/docs/LangRef.rst
20452	Right. Having a pointer to the array member of a struct isn't going to do anything. The magic happens when you have a pointer to the element of the array member: https://godbolt.org/z/cjE5bY4G4 <- manually crafted
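
For the array-element case, a front-end would need offsets that cover the whole array relative to the element pointer. The Python sketch below shows one way that arithmetic could look, assuming byte offsets and a region spanning exactly the array; the function name `element_region_offsets` and the example struct are hypothetical and not taken from the patch.

```python
def element_region_offsets(index, num_elements, element_size):
    """Offsets in bytes, relative to &array[index], that cover the whole array."""
    if not 0 <= index <= num_elements:
        raise ValueError("index must address an element or the one-past-the-end position")
    begin_offset = -index * element_size
    end_offset = (num_elements - index) * element_size
    return begin_offset, end_offset

# struct S { int before; int arr[8]; int after; }; pointer to arr[2], 4-byte int assumed
print(element_region_offsets(2, 8, 4))   # (-8, 24)
```

A pointer to the array member itself corresponds to `index == 0`, where the begin offset is zero and the end offset is the array size.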

aaron.ballman added inline comments.Mar 22 2022, 5:37 AM

llvm/docs/LangRef.rst
20452	> IOW, i do not believe that as per the current C/C++ standards wording each data member of a struct is its own sub-object from which you are not allowed to get to its neighbor objects. But perhaps @aaron.ballman wants to correct me on this.
The rules on pointer arithmetic in C++ are at http://eel.is/c++draft/expr.add#4, so https://godbolt.org/z/adq5EWx17 looks like UB to me in C++. C2x 6.5.6p9 reads similarly, so it also looks like UB to me in C. What language rules make you think otherwise?

lebedev.ri marked 2 inline comments as done.Mar 22 2022, 6:40 AM

lebedev.ri added inline comments.

llvm/docs/LangRef.rst
20452	Hm, thank you for correcting me, that does not match my recollection from the previous time we discussed this. What about https://godbolt.org/z/8T54zxT43, is that pointer well-defined?

aaron.ballman added inline comments.Mar 22 2022, 6:59 AM

llvm/docs/LangRef.rst
20452	I believe that's also UB for the same reason -- the subtraction violates http://eel.is/c++draft/expr.add#4.3, so we can't say much about the resulting pointer value. Note, a slightly different case of using `+ 1` instead of `- 1` is valid because any object can be treated as an array of one (https://eel.is/c++draft/basic.compound#3.sentence-12) and a one-past-the-end pointer is valid (http://eel.is/c++draft/expr.add#4.2) so long as it's not dereferenced.

@aaron.ballman thank you!

llvm/docs/LangRef.rst
20452	Right, `+1` (`end` pointer) is non-dereferenceable, and is fine. Hmmmm, that sounds too good to be true. Then i stand corrected, i guess we could emit it even for such cases, although that will likely break code, and i'm not sure if a sanitizer could be implemented to catch the UB.

In D115274#3361864, @lebedev.ri wrote:

> Does anybody else have any thoughts/opinions on this?
>
> In D115274#3361333, @jdoerfert wrote:
>> I still believe this should be a generic assume.passthrough intrinsic with an operand bundle for the specific use case.
>
> Is that a blocker?

No. Though, we should put a TODO somewhere if someone comes looking for a nice project to cleanup our messes.

In D115274#3403284, @jdoerfert wrote:

> In D115274#3361864, @lebedev.ri wrote:
>> Does anybody else have any thoughts/opinions on this?
>>
>> In D115274#3361333, @jdoerfert wrote:
>>> I still believe this should be a generic assume.passthrough intrinsic with an operand bundle for the specific use case.
>>
>> Is that a blocker?
>
> No. Though, we should put a TODO somewhere if someone comes looking for a nice project to cleanup our messes.

Awesome! There has been a lot of approval of the proposal, but does anyone want to rubber-stamp it before i land this?

I am happy with this but I feel like someone else should also approve it :)

llvm/docs/LangRef.rst
20452	Not a sanitizer for x86, but you can use CHERI LLVM and compile with -cheri-bounds=subobject-safe to detect such violations at runtime. Code can then be run on QEMU or also on Arm's Morello boards that were recently released (if you happen to be one of the recipients). We use that flag for the FreeBSD kernel and it works very well with only a few minor adjustments. I believe that a sanitizer for non-CHERI hardware would require complete provenance (bounds) shadow memory for every pointer so would be a big engineering effort.

This revision is now accepted and ready to land.Mar 23 2022, 1:45 PM

FWIW, this sounds like a generally useful feature to me and I believe it can be used to better express some C and C++ semantics.

Ping. Do you plan to commit the change? Any blockage?

In D115274#3432740, @dtemirbulatov wrote:

> Ping. Do you plan to commit the change? Any blockage?

I don't know. I think i'm waiting for the dust to settle on the github pr debacle.

In D115274#3432750, @lebedev.ri wrote:

> In D115274#3432740, @dtemirbulatov wrote:
>> Ping. Do you plan to commit the change? Any blockage?
>
> I don't know. I think i'm waiting for the dust to settle on the github pr debacle.

Any context on that? I feel like I missed something here...

Anyway, I find the overall wording here still a bit confusing. I would put more emphasis on the fact that this effectively restricts the "allocated object" to a certain offset range, which should have the following three effects:

For non-inbounds GEPs, this should have no effect.
For inbounds GEPs, if the GEP goes outside the range [ptr+begin_offset, ptr+end_offset], the GEP result is poison.
For accesses, if the access is outside the range [ptr+begin_offset, ptr+end_offset-1], the behavior is undefined.

The current wording mostly emphasizes the middle point, but not so much the first and the last. And the fact that the "one past the end of the region" address is only valid for GEP inbounds but not for accesses is probably important for optimization purposes.
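
The three effects listed above can be sketched as a small Python classifier. This is only a model of the proposed wording, with offsets in bytes relative to the annotated pointer; the function names and the `size` parameter for multi-byte accesses are assumptions added for illustration.

```python
def gep_result(begin_offset, end_offset, offset, inbounds):
    """Classify a GEP by `offset` bytes from the annotated pointer."""
    if not inbounds:
        return "ok"                       # effect 1: non-inbounds GEPs are unaffected
    if begin_offset <= offset <= end_offset:
        return "ok"                       # effect 2: one-past-the-end is still allowed
    return "poison"

def access_result(begin_offset, end_offset, offset, size=1):
    """Classify a `size`-byte access starting `offset` bytes from the annotated pointer."""
    if begin_offset <= offset and offset + size - 1 <= end_offset - 1:
        return "ok"
    return "undefined behavior"           # effect 3

print(gep_result(-8, 24, 24, inbounds=True))    # ok
print(gep_result(-8, 24, 25, inbounds=True))    # poison
print(gep_result(-8, 24, 25, inbounds=False))   # ok
print(access_result(-8, 24, 24))                # undefined behavior
print(access_result(-8, 24, 20, size=4))        # ok
```

The asymmetry between the two functions is the point made in the last paragraph: offset `end_offset` is a valid `inbounds` GEP result but not a valid address to access.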

llvm/docs/LangRef.rst
20417	nit: "offset to" -> "offset from"?
llvm/include/llvm/IR/Intrinsics.td
1185	nit: Is the ReadNone here actually meaningful if the whole intrinsic is already IntrNoMem?

In D115274#3432812, @nikic wrote:

> In D115274#3432750, @lebedev.ri wrote:
>> In D115274#3432740, @dtemirbulatov wrote:
>>> Ping. Do you plan to commit the change? Any blockage?
>>
>> I don't know. I think i'm waiting for the dust to settle on the github pr debacle.
>
> Any context on that? I feel like I missed something here...

The TLDR is that i don't feel like contributing to projects that continuously shit on their fellow contributors.
Migration from phab to github pull requests will likely be my tipping point.

> Anyway, I find the overall wording here still a bit confusing. I would put more emphasis on the fact that this effectively restricts the "allocated object" to a certain offset range, which should have the following three effects:
>
> For non-inbounds GEPs, this should have no effect.
>
> For inbounds GEPs, if the GEP goes outside the range [ptr+begin_offset, ptr+end_offset], the GEP result is poison.
>
> For accesses, if the access is outside the range [ptr+begin_offset, ptr+end_offset-1], the behavior is undefined.
>
> The current wording mostly emphasizes the middle point, but not so much the first and the last. And the fact that the "one past the end of the region" address is only valid for GEP inbounds but not for accesses is probably important for optimization purposes.

brooks added a subscriber: brooks.Apr 8 2022, 1:47 PM

This comment was removed by brooks.

bsmith added a subscriber: bsmith.May 9 2022, 7:22 AM

Matt added a subscriber: Matt.Sep 3 2022, 11:52 AM

Herald added a subscriber: kosarev.Sep 3 2022, 11:52 AM

It will be great to have this patch landed. It will help to rewrite D126533 and make progress. Alternatively, would it be OK for someone else to commandeer this patch?
Apologies for the community issues around phabricator and github pr. But I guess it will take time to solve.

Herald added a subscriber: StephenFan.Apr 5 2023, 3:42 AM

efriedma mentioned this in D150192: Allow clang to emit inrange metadata when generating code for array subscripts.May 10 2023, 9:52 AM

simeon added a child revision: D152275: Use memory region declaration intrinsic when generating code for array subscripts.Jun 6 2023, 7:51 AM

Revision Contents

Path	Size
llvm/docs/LangRef.rst	39 lines
llvm/include/llvm/IR/Intrinsics.td	9 lines
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp	3 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	1 line
llvm/test/CodeGen/AMDGPU/GlobalISel/memory_region_decl.ll	21 lines
llvm/test/CodeGen/X86/memory_region_decl.ll	32 lines

Diff 392490

llvm/docs/LangRef.rst

[... 10,346 lines not shown ...]
define i32* @foo(%struct.ST* %s) {
  %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1 ; yields %struct.ST*:%t1
  %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2 ; yields %struct.RT*:%t2
  %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1 ; yields [10 x [20 x i32]]*:%t3
  %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5 ; yields [20 x i32]*:%t4
  %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13 ; yields i32*:%t5
  ret i32* %t5
}

If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
following rules is violated:

* The base pointer has an in bounds address of an allocated object, which
  means that it points into an allocated object, or to its end. The only
  in bounds address for a null pointer in the default address-space is the
  null pointer itself.
* If the type of an index is larger than the pointer index type, the
  truncation to the pointer index type preserves the signed value.
* The multiplication of an index by the type size does not wrap the pointer
  index type in a signed sense (``nsw``).
* The successive addition of offsets (without adding the base address) does
  not wrap the pointer index type in a signed sense (``nsw``).
* The successive addition of the current address, interpreted as an unsigned
  number, and an offset, interpreted as a signed number, does not wrap the
[... 10,015 lines not shown ...]
Semantics:
""""""""""

Returns another pointer that aliases its argument but which has no associated
``invariant.group`` metadata.
It does not read any memory and can be speculated.


		.. _int_memory_region_decl:

		'``llvm.memory.region.decl``' Intrinsic
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""

		::

		declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone returned <ptr>, i64 <begin_offset>, i64 <end_offset>) nofree nosync nounwind readnone speculatable willreturn

		Overview:
		"""""""""

		The '``llvm.memory.region.decl``' intrinsic annotates memory region.

		Arguments:
		""""""""""

		This is an overloaded intrinsic. The memory region can belong to any address
		space. The first argument is a pointer into the memory region. The returned
		pointer, which is the first argument, must belong to the same address space
		as the argument. The second argument specifies the offset to the pointer (the
		first argument) at which the memory region begins. The third argument specifies
		the offset to the pointer (the first argument) at which the memory region ends.

		Semantics:
		""""""""""

		The returned pointer, and, transitively, any pointer that is def-use based on
		that pointer, points into the memory region ``[ptr+begin_offset, ptr+end_offset)``,
		or is a :ref:`poison value <poisonvalues>` otherwise.

		This intrinsic is intended to be an optimization hint, there are no correctness
		concerns with completely ignoring and/or dropping it. The main use-case is
		to be able to annotate array bounds in C family of languages,
		which may allow alloca splitting, and better alias analysis.


.. _constrainedfp:

Constrained Floating-Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating-point
operations when specific rounding mode or floating-point exception behavior is
required. By default, LLVM optimization passes assume that the rounding mode is
round-to-nearest and that floating-point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can create miscompiles if mixing of constrained and normal
operations is done. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.
		arichardson: Do you envision this being used for all sub-object pointer creations? If so it might need a flag to disable it since it might break some C patterns such as `container_of`. According to https://godbolt.org/z/evTbejaMf the container_of macro results in an inbounds GEP, so with sufficient inlining things might break? About three years ago I spent quite a lot of time enforcing sub-object bounds at runtime using CHERI. Almost all code works just fine but there are things such as container_of() that require opt-out annotations. I wrote about the incompatibilities that I found in Chapter 5 of https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-949.pdf. TL;DR: not many changes needed - about 50 annotations across the entire FreeBSD source tree. Almost all annotations are due to container_of or emulation of C++ inheritance in C.
		lebedev.ri (Author): I'm not sure what you mean by "all sub-object pointer creations". Roughly, front-ends should emit this intrinsic on some pointer with some bounds iff they know that it would be UB to go from that specific pointer (aka, as per def-use) outside of the specified bounds. The one case we know of is C arrays within structs.
		arichardson: By sub-object pointer creations I mean something like `&obj->field`. You could conceivably treat that as declaring a new sub-object bounded to just `field`. E.g. something like this: https://godbolt.org/z/bM7j1bxqs
		lebedev.ri (Author): I think the question is slightly wrong. It's up to front-ends to decide when they can and can't emit this. If you are asking about https://godbolt.org/z/adq5EWx17, then as per previous conversations about this, i do believe that code to be well-defined and not UB. IOW, i do not believe that as per the current C/C++ standards wording each data member of a struct is its own sub-object from which you are not allowed to get to its neighbor objects. But perhaps @aaron.ballman wants to correct me on this.
		arichardson: Yes absolutely agreed that this is purely up to the frontend, I just assumed you had plans to update clang to emit the new intrinsic. It has been a long time since I looked at the C standard with regards to subobjects but if I recall correctly you are right that this is not defined as being illegal. However, doesn't that also mean that you can access the member before/after an embedded array?
		lebedev.ri (Author): Right. Having a pointer to the array member of a struct isn't going to do anything. The magic happens when you have a pointer to the element of the array member: https://godbolt.org/z/cjE5bY4G4 <- manually crafted
		aaron.ballman: > IOW, i do not believe that as per the current C/C++ standards wording each data member of a struct is its own sub-object from which you are not allowed to get to its neighbor objects. But perhaps @aaron.ballman wants to correct me on this.
		The rules on pointer arithmetic in C++: http://eel.is/c++draft/expr.add#4, so https://godbolt.org/z/adq5EWx17 looks like UB to me in C++. C2x 6.5.6p9 reads similarly, so it also looks like UB to me in C. What language rules make you think otherwise?
		lebedev.ri (Author): Hm, thank you for correcting me, that does not match my recollection from the previous time we discussed this. What about https://godbolt.org/z/8T54zxT43, is that pointer well-defined?
		aaron.ballman: I believe that's also UB for the same reason -- the subtraction violates http://eel.is/c++draft/expr.add#4.3, so we can't say much about the resulting pointer value. Note, a slightly different case of using `+ 1` instead of `- 1` is valid because any object can be treated as an array of one (https://eel.is/c++draft/basic.compound#3.sentence-12) and a one-past-the-end pointer is valid (http://eel.is/c++draft/expr.add#4.2) so long as it's not dereferenced.
		lebedev.ri (Author): Right, `+1` (`end` pointer) is non-dereferenceable, and is fine. Hmmmm, that sounds too good to be true. Then i stand corrected, i guess we could emit it even for such cases, although that will likely break code, and i'm not sure if a sanitizer could be implemented to catch the UB.
		arichardson: Not a sanitizer for x86, but you can use CHERI LLVM and compile with -cheri-bounds=subobject-safe to detect such violations at runtime. Code can then be run on QEMU or also on Arm's Morello boards that were recently released (if you happen to be one of the recipients). We use that flag for the FreeBSD kernel and it works very well with only a few minor adjustments. I believe that a sanitizer for non-CHERI hardware would require complete provenance (bounds) shadow memory for every pointer so would be a big engineering effort.
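		For readers unfamiliar with the `container_of` pattern this thread keeps returning to, here is a minimal C++ sketch of the idea. `ListNode`, `Widget`, `widget_from_link` and `sum_ids` are hypothetical names, not code from the patch or from the linked godbolt examples. The helper walks backwards from a pointer to an embedded field to the enclosing object, which is the kind of out-of-field pointer arithmetic that per-field bounds would turn into poison. Whether the language standards permit it is the point debated above, so treat this as an illustration of the pattern rather than a claim that it is well-defined.

```cpp
#include <cstddef>

// An intrusive list link embedded in a larger object.
struct ListNode {
  ListNode *next;
};

struct Widget {
  int id;
  ListNode link;
};

// Hypothetical container_of-style helper: recovers the enclosing Widget from
// a pointer to its `link` field by subtracting the field's offset.
inline Widget *widget_from_link(ListNode *node) {
  char *raw = reinterpret_cast<char *>(node) - offsetof(Widget, link);
  return reinterpret_cast<Widget *>(raw);
}

// Walks a chain of links and sums the ids of the enclosing Widgets.
inline int sum_ids(ListNode *head) {
  int total = 0;
  for (ListNode *n = head; n != nullptr; n = n->next)
    total += widget_from_link(n)->id;
  return total;
}
```

		If a front-end annotated `&w.link` with a region covering only `link`, the subtraction in `widget_from_link` would leave that region, which is why arichardson reports needing opt-out annotations for such code.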

Each of these intrinsics corresponds to a normal floating-point operation. The		Each of these intrinsics corresponds to a normal floating-point operation. The
data arguments and the return value are the same as the corresponding FP		data arguments and the return value are the same as the corresponding FP
operation.		operation.

The rounding mode argument is a metadata string specifying what		The rounding mode argument is a metadata string specifying what
assumptions, if any, the optimizer can make when transforming constant		assumptions, if any, the optimizer can make when transforming constant
values. Some constrained FP intrinsics omit this argument. If required		values. Some constrained FP intrinsics omit this argument. If required
by the intrinsic, this argument must be one of the following strings:		by the intrinsic, this argument must be one of the following strings:

::		::

"round.dynamic"		"round.dynamic"
"round.tonearest"		"round.tonearest"
"round.downward"		"round.downward"
"round.upward"		"round.upward"
"round.towardzero"		"round.towardzero"
"round.tonearestaway"		"round.tonearestaway"

If this argument is "round.dynamic" optimization passes must assume that the		If this argument is "round.dynamic" optimization passes must assume that the
		nikic: Doesn't this sentence contradict the first sentence in "Semantics"? If you want to make a distinction between inbounds/non-inbounds, then I think you have to do that in terms of restricting the visible allocated object, rather than saying that any pointer based on it cannot be outside the range. That would mean that something like `%p = memory.region.decl(%p0, 8, 16)` would not be poison, though dereferencing it would be and doing an inbounds gep would be, while doing a non-inbounds gep by 8 and then dereferencing would be legal.
		lebedev.ri (Author): Hmm, i do not remember why i've added that footprint here. I believe we don't want to make a distinction between inbounds/non-inbounds.
rounding mode is unknown and may change at runtime. No transformations that		rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.		depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the		The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes. If the argument is any of these values		similarly named IEEE rounding modes. If the argument is any of these values
optimization passes may perform transformations as long as they are consistent		optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.		with the specified rounding mode.


llvm/include/llvm/IR/Intrinsics.td

def int_launder_invariant_group : DefaultAttrsIntrinsic<[llvm_anyptr_ty],
[LLVMMatchType<0>],		[LLVMMatchType<0>],
[IntrInaccessibleMemOnly, IntrSpeculatable, IntrWillReturn]>;		[IntrInaccessibleMemOnly, IntrSpeculatable, IntrWillReturn]>;


def int_strip_invariant_group : DefaultAttrsIntrinsic<[llvm_anyptr_ty],		def int_strip_invariant_group : DefaultAttrsIntrinsic<[llvm_anyptr_ty],
[LLVMMatchType<0>],		[LLVMMatchType<0>],
[IntrSpeculatable, IntrNoMem, IntrWillReturn]>;		[IntrSpeculatable, IntrNoMem, IntrWillReturn]>;

		// Declares that the returned pointer (the first argument),
		// and any pointer that is (transitively) def-use based on that pointer,
		// points into the memory region [ptr+begin_offset, ptr+end_offset),
		// or is poison otherwise.
		def int_memory_region_decl : DefaultAttrsIntrinsic<[llvm_anyptr_ty],
		[LLVMMatchType<0> /*ptr*/, llvm_i64_ty /*begin_offset*/,
		llvm_i64_ty /*end_offset*/], [IntrNoMem, IntrSpeculatable,
		nikic: nit: Is the ReadNone here actually meaningful if the whole intrinsic is already IntrNoMem?
		NoCapture<ArgIndex<0>>, Returned<ArgIndex<0>>, ReadNone<ArgIndex<0>>]>;
		nikic: While technically correct, annotating it `Returned` means that the intrinsic is simply going to be optimized away. You want to drop `returned` here and instead add the intrinsic to `isIntrinsicReturningPointerAliasingArgumentWithoutCapturing()`, which handles various intrinsics of this kind.

//===------------------------ Stackmap Intrinsics -------------------------===//		//===------------------------ Stackmap Intrinsics -------------------------===//
//		//
def int_experimental_stackmap : DefaultAttrsIntrinsic<[],		def int_experimental_stackmap : DefaultAttrsIntrinsic<[],
[llvm_i64_ty, llvm_i32_ty, llvm_vararg_ty],		[llvm_i64_ty, llvm_i32_ty, llvm_vararg_ty],
[Throws]>;		[Throws]>;
def int_experimental_patchpoint_void : DefaultAttrsIntrinsic<[],		def int_experimental_patchpoint_void : DefaultAttrsIntrinsic<[],
[llvm_i64_ty, llvm_i32_ty,		[llvm_i64_ty, llvm_i32_ty,
llvm_ptr_ty, llvm_i32_ty,		llvm_ptr_ty, llvm_i32_ty,

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp

case Intrinsic::invariant_start: {
return true;		return true;
}		}
case Intrinsic::invariant_end:		case Intrinsic::invariant_end:
return true;		return true;
case Intrinsic::expect:		case Intrinsic::expect:
case Intrinsic::annotation:		case Intrinsic::annotation:
case Intrinsic::ptr_annotation:		case Intrinsic::ptr_annotation:
case Intrinsic::launder_invariant_group:		case Intrinsic::launder_invariant_group:
case Intrinsic::strip_invariant_group: {		case Intrinsic::strip_invariant_group:
		case Intrinsic::memory_region_decl: {
// Drop the intrinsic, but forward the value.		// Drop the intrinsic, but forward the value.
MIRBuilder.buildCopy(getOrCreateVReg(CI),		MIRBuilder.buildCopy(getOrCreateVReg(CI),
getOrCreateVReg(*CI.getArgOperand(0)));		getOrCreateVReg(*CI.getArgOperand(0)));
return true;		return true;
}		}
case Intrinsic::assume:		case Intrinsic::assume:
case Intrinsic::experimental_noalias_scope_decl:		case Intrinsic::experimental_noalias_scope_decl:
case Intrinsic::var_annotation:		case Intrinsic::var_annotation:

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp


#include "llvm/IR/VPIntrinsics.def"

case Intrinsic::is_constant:		case Intrinsic::is_constant:
llvm_unreachable("llvm.is.constant.* should have been lowered already");		llvm_unreachable("llvm.is.constant.* should have been lowered already");

case Intrinsic::annotation:		case Intrinsic::annotation:
case Intrinsic::ptr_annotation:		case Intrinsic::ptr_annotation:
case Intrinsic::launder_invariant_group:		case Intrinsic::launder_invariant_group:
case Intrinsic::strip_invariant_group:		case Intrinsic::strip_invariant_group:
		case Intrinsic::memory_region_decl:
// Drop the intrinsic, but forward the value		// Drop the intrinsic, but forward the value
setValue(&I, getValue(I.getOperand(0)));		setValue(&I, getValue(I.getOperand(0)));
return;		return;

case Intrinsic::assume:		case Intrinsic::assume:
case Intrinsic::experimental_noalias_scope_decl:		case Intrinsic::experimental_noalias_scope_decl:
case Intrinsic::var_annotation:		case Intrinsic::var_annotation:
case Intrinsic::sideeffect:		case Intrinsic::sideeffect:

llvm/test/CodeGen/AMDGPU/GlobalISel/memory_region_decl.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -global-isel -mtriple=amdgcn-unknown-amdhsa < %s | FileCheck %s

				declare i8* @llvm.memory.region.decl.p0i8(i8*, i64, i64)

				define i8* @test_i8(i8* %ptr, i64 %begin_off, i64 %end_off) {
				; CHECK-LABEL: test_i8:
				; CHECK: ; %bb.0:
				; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; CHECK-NEXT: s_setpc_b64 s[30:31]
				%r = call i8* @llvm.memory.region.decl.p0i8(i8* %ptr, i64 %begin_off, i64 %end_off)
				ret i8* %r
				}

				define i8* @test_i8_naive(i8* %ptr, i64 %begin_off, i64 %end_off) {
				; CHECK-LABEL: test_i8_naive:
				; CHECK: ; %bb.0:
				; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; CHECK-NEXT: s_setpc_b64 s[30:31]
				ret i8* %ptr
				}

llvm/test/CodeGen/X86/memory_region_decl.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=x86_64-unknown-linux | FileCheck %s --check-prefixes=X64
				; RUN: llc < %s -mtriple=i686-unknown-linux | FileCheck %s --check-prefixes=X86

				declare i8* @llvm.memory.region.decl.p0i8(i8*, i64, i64)

				define i8* @test_i8(i8* %ptr, i64 %begin_off, i64 %end_off) {
				; X64-LABEL: test_i8:
				; X64: # %bb.0:
				; X64-NEXT: movq %rdi, %rax
				; X64-NEXT: retq
				;
				; X86-LABEL: test_i8:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: retl
				%r = call i8* @llvm.memory.region.decl.p0i8(i8* %ptr, i64 %begin_off, i64 %end_off)
				ret i8* %r
				}

				define i8* @test_i8_naive(i8* %ptr, i64 %begin_off, i64 %end_off) {
				; X64-LABEL: test_i8_naive:
				; X64: # %bb.0:
				; X64-NEXT: movq %rdi, %rax
				; X64-NEXT: retq
				;
				; X86-LABEL: test_i8_naive:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: retl
				ret i8* %ptr
				}


[IR][RFC] Memory region declaration intrinsic (Accepted, Public)


Diff 392490

llvm/docs/LangRef.rst

llvm/include/llvm/IR/Intrinsics.td

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

llvm/test/CodeGen/AMDGPU/GlobalISel/memory_region_decl.ll

llvm/test/CodeGen/X86/memory_region_decl.ll
