This is an archive of the discontinued LLVM Phabricator instance.

BasicAA: Recognize inttoptr as isEscapeSource
ClosedPublic

Authored by JosephTremoulet on Apr 29 2021, 8:41 AM.

Details

Summary

Pointers escape when converted to integers, so a pointer produced by
converting an integer to a pointer must not be a local non-escaping
object.

Diff Detail

Event Timeline

JosephTremoulet requested review of this revision.Apr 29 2021, 8:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2021, 8:41 AM

I've verified this doesn't break anything in check-all or in test-suite with SpecInt2017 included. I'm happy for suggestions if there are other testing paces this should be put through. Thanks.

I believe this is correct, and also not affected by open ptrtoint/inttoptr issues.

llvm/test/Analysis/BasicAA/noalias-inttoptr.ll
11

We should also have negative (MayAlias) tests for a) non-local object and b) local escaping object.

  • Include negative tests and alloca as well as noalias parameter
JosephTremoulet marked an inline comment as done.Apr 29 2021, 10:47 AM

I believe this is correct, and also not affected by open ptrtoint/inttoptr issues.

Thanks for confirming!

nikic accepted this revision.Apr 29 2021, 11:39 AM

LGTM, but I'd suggest waiting a bit in case there are further comments -- it's a tricky area.

This revision is now accepted and ready to land.Apr 29 2021, 11:39 AM

isNonEscapingLocalObject returns true if the pointer is a noalias argument, and it's worth clarifying why this is okay somewhere.

The noalias argument case is slightly different from other cases because its address could have been ptrtoint-ed already. For example:

%i = ptrtoint i8* %ptr0 to i64
call void @f(i8* %ptr0, i64 %i)

define void @f(i8* noalias %ptr, i64 %i) {
  ; access %ptr <- isNonEscapingLocalObject(%ptr) is true, but
  ; access inttoptr(%i) <- this can exactly touch %ptr.
}

It isn't clear whether the second access is successfully done according to the definition of based-on in LangRef.
Should we clarify that noalias ptr (%ptr) is not based on the original pointer (%ptr0)?

isNonEscapingLocalObject returns true if the pointer is a noalias argument, and it's worth clarifying why this is okay somewhere.

The noalias argument case is slightly different from other cases because its address could have been ptrtoint-ed already. For example:

%i = ptrtoint i8* %ptr0 to i64
call void @f(i8* %ptr0, i64 %i)

define void @f(i8* noalias %ptr, i64 %i) {
  ; access %ptr <- isNonEscapingLocalObject(%ptr) is true, but
  ; access inttoptr(%i) <- this can exactly touch %ptr.
}

It isn't clear whether the second access is successfully done according to the definition of based-on in LangRef.
Should we clarify that noalias ptr (%ptr) is not based on the original pointer (%ptr0)?

By my read, it isn't that %ptr isn't based on %ptr0 (I think it is based on %ptr0, though the LangRef based-on definition doesn't explicitly address copies from call arguments/parameters/returns, phis, selects, extractelement/extractvalue, etc.; maybe it should?).
Rather, I think that it's this language in the definition of noalias:

This indicates that memory locations accessed via pointer values
 :ref:`based <pointeraliasing>` on the argument or return value are not also
 accessed, during the execution of the function, via pointer values not
 *based* on the argument or return value. ... The caller shares the responsibility with the callee for
 ensuring that these requirements are met.

which means that the example you show is an incorrect use of the noalias attribute, because of the access via inttoptr(%i), which is not based on %ptr.

aqjune accepted this revision.May 1 2021, 11:33 PM

isNonEscapingLocalObject returns true if the pointer is a noalias argument, and it's worth clarifying why this is okay somewhere.

The noalias argument case is slightly different from other cases because its address could have been ptrtoint-ed already. For example:

%i = ptrtoint i8* %ptr0 to i64
call void @f(i8* %ptr0, i64 %i)

define void @f(i8* noalias %ptr, i64 %i) {
  ; access %ptr <- isNonEscapingLocalObject(%ptr) is true, but
  ; access inttoptr(%i) <- this can exactly touch %ptr.
}

It isn't clear whether the second access is successfully done according to the definition of based-on in LangRef.
Should we clarify that noalias ptr (%ptr) is not based on the original pointer (%ptr0)?

By my read, it isn't that %ptr isn't based on %ptr0 (I think it is based on %ptr0, though the LangRef based-on definition doesn't explicitly address copies from call arguments/parameters/returns, phis, selects, extractelement/extractvalue, etc.; maybe it should?).
Rather, I think that it's this language in the definition of noalias:

This indicates that memory locations accessed via pointer values
 :ref:`based <pointeraliasing>` on the argument or return value are not also
 accessed, during the execution of the function, via pointer values not
 *based* on the argument or return value. ... The caller shares the responsibility with the callee for
 ensuring that these requirements are met.

which means that the example you show is an incorrect use of the noalias attribute, because of the access via inttoptr(%i), which is not based on %ptr.

Makes sense!

nlopes accepted this revision.May 2 2021, 3:05 PM

There's this comment in isNonEscapingLocalObject:

// If this is an argument that corresponds to a byval or noalias argument,
// then it has not escaped before entering the function.  Check if it escapes
// inside the function.

Which supports this patch. Though this claim doesn't seem very clear to me. e.g. we mark memcpy's arguments with noalias. It's not guaranteed case that the input pointers haven't escaped before. So the exact meaning of noalias is not clear to me.
I've read the C11 standard re restrict, and it is omissive w.r.t. to int2ptr casts. Though I think this patch respects the spirit of the standard. We just need to nail the exact semantics of noalias at some point..
TL;DR: LGTM.

There's this comment in isNonEscapingLocalObject:

// If this is an argument that corresponds to a byval or noalias argument,
// then it has not escaped before entering the function.  Check if it escapes
// inside the function.

Which supports this patch. Though this claim doesn't seem very clear to me. e.g. we mark memcpy's arguments with noalias. It's not guaranteed case that the input pointers haven't escaped before. So the exact meaning of noalias is not clear to me.
I've read the C11 standard re restrict, and it is omissive w.r.t. to int2ptr casts. Though I think this patch respects the spirit of the standard. We just need to nail the exact semantics of noalias at some point..
TL;DR: LGTM.

Yeah I think the guarantee is something more like: if it did escape before the function, then it won't be accessed during the function via any pointer based on that escape (because such an access couldn't be based on the noalias parameter). But I'm technically misusing "based on" there, because it's defined as a predicate on pointer values, whereas "that escape" is a use...

Anyway, thanks!

LGTM, but I'd suggest waiting a bit in case there are further comments -- it's a tricky area.

@nikic, a week later there have been two comments, both approvals. Do you think it's good to go now?

Thanks

nikic added a comment.May 6 2021, 1:26 PM

@JosephTremoulet Yeah, that's more than enough! Please go ahead.

This revision was landed with ongoing or failed builds.May 7 2021, 7:49 AM
This revision was automatically updated to reflect the committed changes.