This is an archive of the discontinued LLVM Phabricator instance.

Allow clang to emit inrange metadata when generating code for array subscripts
Needs Review · Public

Authored by simeon-imgtec on May 9 2023, 4:14 AM.

Details

Summary

As out-of-bounds array accesses are undefined in C/C++, we can capture this restriction using the newly introduced inrange attribute on individual GEP indices. The underlying motivation is that it would enable less conservative inferences in (basic) alias analysis, in turn improving passes such as GVN and LICM.

As the inrange keyword is currently supported only on constant GEPs, we extend that support to GEP instructions and modify clang to emit the inrange flag for array subscripts where it is safe to do so.
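To make the motivation concrete, here is a minimal C sketch of the kind of array subscript the summary appears to target. The struct `S` and the function `sum_with_stores` are hypothetical names chosen for illustration and are not taken from the patch. The comments describe the inference the summary hopes to enable, not the observed behaviour of any particular compiler version.

```c
/* Hypothetical example: two adjacent arrays inside one object. */
struct S {
    int a[4];
    int b[4];
};

/* Adds s->b[0] to a running total n times while storing to s->a[i].
 *
 * In C, s->a[i] is only defined for i in [0, 4), so the store can never
 * modify s->b[0]. A GEP that is merely in bounds of the whole struct does
 * not express this, because i == 4 would still land inside *s.
 * If the index were marked as in range of a[], alias analysis could
 * (under that assumption) treat the load of s->b[0] as loop-invariant. */
int sum_with_stores(struct S *s, int i, int n) {
    int total = 0;
    for (int k = 0; k < n; ++k) {
        s->a[i] = k;       /* store through a variable index into a[] */
        total += s->b[0];  /* load from the neighbouring array b[] */
    }
    return total;
}
```

For example, with `b[0] == 7`, the call `sum_with_stores(&s, 2, 3)` returns 21 whether or not the load is hoisted; the proposed flag would only affect how freely the optimizer may move it.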

Event Timeline

simeon-imgtec created this revision. May 9 2023, 4:14 AM
simeon-imgtec created this object with visibility "All Users".
simeon-imgtec requested review of this revision. May 9 2023, 4:14 AM
simeon-imgtec added a child revision: Restricted Differential Revision. May 9 2023, 5:28 AM

From what I recall, "inrange" is actually more restrictive than the normal C/C++ array indexing rules. Specifically, the bits regarding comparisons. "inrange" was designed to allow splitting globals indexed using inrange.

That isn't to say that functionality like this isn't useful, but we probably need to call it something different. And I'm not sure an attribute on a GEP is actually the best way to represent the semantics. It might make sense to use a separate intrinsic to represent this, especially given the direction of https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699 .
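As an illustration of the comparison concern, the following C sketch shows two operations on subscript-derived pointers that are permitted in C. It assumes, based on my reading of the LangRef wording for inrange, that comparisons against pointers outside the selected element and ptrtoint-style conversions are not defined for inrange-derived pointers; that reading should be checked against the LangRef itself. The names `S`, `end_of_a_is_start_of_b`, and `address_of_element` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example: two adjacent arrays inside one object. */
struct S {
    int a[4];
    int b[4];
};

/* Equality comparison between the one-past-the-end pointer of a[] and the
 * first element of b[]. C defines this comparison; the operands point into
 * different subobjects, which is the case assumed to be problematic if
 * both subscripts carried inrange. */
bool end_of_a_is_start_of_b(struct S *s) {
    return &s->a[4] == &s->b[0];
}

/* Converting a subscript-derived pointer to an integer is allowed in C
 * (the resulting value is implementation-defined). */
uintptr_t address_of_element(struct S *s, int i) {
    return (uintptr_t)&s->a[i];
}
```

If clang attached the flag to every array subscript, code like this would presumably need to keep working, which is one reason a differently named construct with weaker rules might be needed.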

Hi Eli,

I really appreciate your input. ^^
Do you have any thoughts on what the intrinsic should look like, and how it would be associated with a GEP before ptradd arrives?

efriedma changed the visibility from "All Users" to "Public (No Login Required)". May 10 2023, 9:28 AM