This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
include/clang/Basic/
-
clang/
-
Basic/
-
Attr.td
1/1
AttrDocs.td
-
lib/
-
CodeGen/
15/18
CGExpr.cpp
-
Sema/
5/5
SemaDeclAttr.cpp
-
test/
-
CodeGen/
-
bpf-preserve-static-offset-arr.c
-
bpf-preserve-static-offset-bitfield.c
-
bpf-preserve-static-offset-lvalue.c
-
bpf-preserve-static-offset-non-bpf.c
-
bpf-preserve-static-offset-pai.c
-
Misc/
-
pragma-attribute-supported-attributes-list.test
-
Sema/
-
bpf-attr-preserve-static-offset-warns-nonbpf.c
-
bpf-attr-preserve-static-offset-warns.c
2/2
bpf-attr-preserve-static-offset.c
-
llvm/
-
include/llvm/IR/
-
llvm/
-
IR/
1/4
Intrinsics.td
-
IntrinsicsBPF.td
-
lib/Target/BPF/
-
Target/
-
BPF/
-
BPF.h
-
BPFAbstractMemberAccess.cpp
-
BPFCORE.h
-
BPFCheckAndAdjustIR.cpp
3/4
BPFPreserveStaticOffset.cpp
-
BPFTargetMachine.cpp
-
CMakeLists.txt
-
test/CodeGen/BPF/preserve-static-offset/
-
CodeGen/
-
BPF/
-
preserve-static-offset/
6/6
load-align.ll
-
load-arr-pai.ll
-
load-atomic.ll
-
load-chain-2.ll
-
load-chain-oob.ll
-
load-chain-u8-oob.ll
-
load-chain-u8-type-mismatch.ll
-
load-chain-u8.ll
-
load-chain.ll
-
load-inline.ll
-
load-non-const.ll
-
load-ptr-pai.ll
-
load-simple.ll
-
load-struct-pai.ll
-
load-undo-align.ll
-
load-undo-chain-oob.ll
-
load-undo-chain-u8.ll
-
load-undo-chain.ll
-
load-undo-simple.ll
-
load-undo-volatile.ll
-
load-union-pai.ll
-
load-unroll-inline.ll
-
load-unroll.ll
-
load-volatile.ll
-
load-zero.ll
-
store-align.ll
-
store-atomic.ll
-
store-chain-2.ll
-
store-chain-oob.ll
-
store-chain-u8-oob.ll
-
store-chain-u8.ll
-
store-chain.ll
-
store-pai.ll
-
store-simple.ll
-
store-undo-align.ll
-
store-undo-chain-oob.ll
-
store-undo-chain-u8.ll
-
store-undo-chain.ll
-
store-undo-simple.ll
-
store-undo-volatile.ll
-
store-unroll-inline.ll
-
store-volatile.ll
-
store-zero.ll

Differential D133361

[BPF] Attribute preserve_static_offset for structs
ClosedPublic

Authored by eddyz87 on Sep 6 2022, 8:31 AM.

Download Raw Diff

Details

Reviewers

yonghong-song
aaron.ballman
erichkeane
ast

Commits

rG030b8cb1561d: [BPF] Attribute preserve_static_offset for structs
rGcb13e9286b6d: [BPF] Attribute preserve_static_offset for structs

Summary

This commit adds a new BPF specific structure attribte
__attribute__((preserve_static_offset)) and a pass to deal with it.

This attribute may be attached to a struct or union declaration, where
it notifies the compiler that this structure is a "context" structure.
The following limitations apply to context structures:

runtime environment might patch access to the fields of this type by updating the field offset;

BPF verifier limits access patterns allowed for certain data types. E.g. struct __sk_buff and struct bpf_sock_ops. For these types only LD/ST <reg> <static-offset> memory loads and stores are allowed.

This is so because offsets of the fields of these structures do not match real offsets in the running kernel. During BPF program load/verification loads and stores to the fields of these types are rewritten so that offsets match real offsets. For this rewrite to happen static offsets have to be encoded in the instructions.

See kernel/bpf/verifier.c:convert_ctx_access function in the Linux kernel source tree for details.

runtime environment might disallow access to the field of the type through modified pointers.

During BPF program verification a tag PTR_TO_CTX is tracked for register values. In case if register with such tag is modified BPF programs are not allowed to read or write memory using register. See kernel/bpf/verifier.c:check_mem_access function in the Linux kernel source tree for details.

Access to the structure fields is translated to IR as a sequence:

(load (getelementptr %ptr %offset)) or
(store (getelementptr %ptr %offset))

During instruction selection phase such sequences are translated as a
single load instruction with embedded offset, e.g. LDW %ptr, %offset,
which matches access pattern necessary for the restricted
set of types described above (when %offset is static).

Multiple optimizer passes might separate these instructions, this
includes:

SimplifyCFGPass (sinking)
InstCombine (sinking)
GVN (hoisting)

The preserve_static_offset attribute marks structures for which the
following transformations happen:

at the early IR processing stage:
- (load (getelementptr ...)) replaced by call to intrinsic llvm.bpf.getelementptr.and.load;
- (store (getelementptr ...)) replaced by call to intrinsic llvm.bpf.getelementptr.and.store;
at the late IR processing stage this modification is undone.

Such handling prevents various optimizer passes from generating
sequences of instructions that would be rejected by BPF verifier.

The attribute((preserve_static_offset)) has a priority over
attribute((preserve_access_index)). When preserve_access_index
attribute is present preserve access index transformations are not
applied.

This addresses the issue reported by the following thread:

https://lore.kernel.org/bpf/CAA-VZPmxh8o8EBcJ=m-DH4ytcxDFmo0JKsm1p1gf40kS0CE3NQ@mail.gmail.com/T/#m4b9ce2ce73b34f34172328f975235fc6f19841b6

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

eddyz87 created this revision.Sep 6 2022, 8:31 AM

Herald added a reviewer: aaron.ballman. · View Herald TranscriptSep 6 2022, 8:31 AM

Herald added a project: Restricted Project. · View Herald Transcript

Herald added subscribers: hiraditya, mgorny. · View Herald Transcript

Harbormaster completed remote builds in B185229: Diff 458181.Sep 6 2022, 10:23 AM

Added folding for GEP chains with constant indices:
- inline anonymous unions work as expected
- type casts to char* with consequitive array access work as expected
Moved context rewrite to a late callback function processing pipeline stage:
- necessary to run the rewrite after loop unrolling, as rewrite can't handle non-constant indices for non structural GEP chains

Harbormaster completed remote builds in B186229: Diff 459560.Sep 12 2022, 3:33 PM

added GVN pass after context rewrite to handle cases clobbered by marker calls
added marker simplification and context rewrite passes for opt pipeline
more test cases

Harbormaster completed remote builds in B186824: Diff 460356.Sep 15 2022, 5:41 AM

Merged simplify and rewrite passes, various adjustments to unclobber CSE and GVN opportunities

Harbormaster completed remote builds in B187413: Diff 461103.Sep 18 2022, 5:43 PM

Merge getelementptr and get.and.load/store calls for function inlining

Harbormaster completed remote builds in B187563: Diff 461317.Sep 19 2022, 1:56 PM

Force doesNotAccessMemory for context marker function

Harbormaster completed remote builds in B188071: Diff 462038.Sep 21 2022, 4:24 PM

Moved readonly/writeonly attrs as properties of call site to handle volatile load and stores

Harbormaster completed remote builds in B188565: Diff 462709.Sep 25 2022, 4:03 AM

Rebase, replaced .c based back-end changes by .ll based ones.

Herald added a subscriber: zzheng. · View Herald TranscriptJun 26 2023, 7:55 AM

Harbormaster completed remote builds in B241179: Diff 534561.Jun 26 2023, 10:26 AM

renamed llvm.bpf.context.marker to llvm.context.marker.bpf (to avoid issues with intrinsic being looked up in the wrong table and not getting an ID / Attrs for CallInst);
use new pass manager for tests.

Harbormaster completed remote builds in B241663: Diff 535210.Jun 27 2023, 8:37 PM

removed lots of uses of auto
removed usage of std::set

Harbormaster completed remote builds in B241775: Diff 535380.Jun 28 2023, 8:44 AM

Rebase.

Harbormaster completed remote builds in B242273: Diff 536044.Jun 29 2023, 7:48 PM

Generate immarg for index arguments of gep.and.load, gep.and.store.

Hi Yonghong,

Could you please take a look at this revision? I tried to describe the mechanics in the description / commit message and comments. Below are details regarding testing.
I tested these changes using a Linux Kernel special branch that declares decl_tag("ctx") for relevant functions (available here).
While comparing object files generated without decl_tag("ctx") and with decl_tag("ctx") I found some differences in 5 tests. (I use "with ctx" and "without ctx" below as a shorthand).

bpf_flow.bpf.o

Without ctx: 2713 instructions
With ctx: 2711 instructions

Difference: 2 load instructions appear in different BB and this requires two additional register to register assignments in "without ctx" case.
Difference is caused by GVNPass::PerformLoadPRE sub-pass and disappears if -mllvm -enable-load-pre=false option is passed. As the name implies this sub-pass operates only on load instructions.

connect4_prog.bpf.o

Without ctx: 351 instructions
With ctx: 351 instructions

Difference: one redundant load instruction is put in it's own basic block when compiled without ctx.

c
static __inline int set_notsent_lowat(struct bpf_sock_addr *ctx)
{ ... ctx->type ... }

int connect_v4_prog(struct bpf_sock_addr *ctx)
{
	...
	if (set_notsent_lowat(ctx))
		return 0;

	if (ctx->type ...) // ctx->type load is partially redundat after inlining
		return 0;
}

In the without ctx this is done by a part of JumpThreadingPass:

c++
/// simplifyPartiallyRedundantLoad - If LoadI is an obviously partially
/// redundant load instruction, eliminate it by replacing it with a PHI node.
/// This is an important optimization that encourages jump threading, and needs
/// to be run interlaced with other jump threading tasks.
bool JumpThreadingPass::simplifyPartiallyRedundantLoad(LoadInst *LoadI)

test_cls_redirect.bpf.o

Without ctx: 1616 instructions
With ctx: 1615 instructions

Difference: some differences in basic blocks placement. IR basic block structure is identical up until 'Branch Probability Basic Block Placement' machine pass.
Before that in without ctx case there is one additional basic block, which is introduced by GVNPass on process_icmpv4 function. Differences caused by GVNPass::PerformLoadPRE sub-pass which calls
llvm::SplitCriticalEdge that introduces the basic block.

test_misc_tcp_hdr_options.bpf.o

Without ctx: 611 instructions
With ctx: 612 instructions

Difference: one more load instruction generated in with ctx case. Happens in the following code snippet:

c
unsigned int nr_data = 0;
...
static int check_active_hdr_in(struct bpf_sock_ops *skops)
{
	...
	if (... < skops->skb_len)  // (1)
		nr_data++;
	...
	if (... == skops->skb_len) // (2)
		nr_pure_ack++;
	...
}

When compiled without ctx EarlyCSEPass reuses skops->skb_len computed at (1) for point (2). However, when compiled with ctx this does not happen. This happens because clang is not sure if skops, passed as parameter to getelementptr.and.load shares memory with nr_data. The following modification removes all differences:

c
static int check_active_hdr_in(struct bpf_sock_ops * restrict skops)

test_tcpnotify_kern.bpf.o

Without ctx: 57 instructions
With ctx: 58 instructions

Difference: one more store instruction generated, CFG structure differs slightly.
Difference is caused by SimplifyCFGPass, difference disappears if -mllvm -sink-common-insts=false option is supplied. The following IR snippet is the cause:

llvm
...
if.then:                                          ; preds = %entry
  %1 = getelementptr inbounds %struct.bpf_sock_ops, ptr %skops, i64 0, i32 1
  store i32 -1, ptr %1, align 4, !tbaa !240
  br label %cleanup30
...
...
sw.epilog:                                        ; preds = %if.end, ...
  %rv.0 = phi i32 [ -1, %sw.default ], ...
  %7 = getelementptr inbounds %struct.bpf_sock_ops, ptr %skops, i64 0, i32 1
  store i32 %rv.0, ptr %7, align 4, !tbaa !240
  br label %cleanup30, !dbg !278

cleanup30:                                        ; preds = %sw.epilog, %if.then
  %retval.0 = phi i32 [ 0, %if.then ], [ 1, %sw.epilog ], !dbg !226
  ret i32 %retval.0

Without ctx SimplifyCFGPass sinks getelementptr and store instructions to cleanup30 BB:

llvm
cleanup30:                                        ; preds = %sw.bb19, ...
  %rv.0.sink = phi i32 [ -1, %entry ], ...
  %retval.0 = phi i32 [ 0, %entry ], ...
  %6 = getelementptr inbounds %struct.bpf_sock_ops, ptr %skops, i64 0, i32 1, !dbg !226
  store i32 %rv.0.sink, ptr %6, align 4, !dbg !226, !tbaa !270
  ret i32 %retval.0, !dbg !271

For ctx case snippet looks as follows:

llvm
if.then:                                          ; preds = %entry
  tail call void ... @llvm.bpf.getelementptr.and.store.i32
    (i32 -1, ptr nonnull writeonly elementtype(%struct.bpf_sock_ops) %skops, ...
  br label %cleanup30
...
...
sw.epilog:                                        ; preds = %if.end ...
  %rv.0 = phi i32 [ -1, %sw.default ], ...
  call void ... @llvm.bpf.getelementptr.and.store.i32
    (i32 %rv.0, ptr nonnull writeonly elementtype(%struct.bpf_sock_ops) %skops, ...
  br label %cleanup30

cleanup30:                                        ; preds = %sw.epilog, %if.then
  %retval.0 = phi i32 [ 0, %if.then ], [ 1, %sw.epilog ], !dbg !228
  ret i32 %retval.0, !dbg !279

Note that in if.then the instruction is tail call while for sw.epilog the instruction is call. These instructions are considered not to be the same by SimplifyCFG.cpp:canSinkInstructions. I have an old merge request that fixes this: https://reviews.llvm.org/D134743 . Need to rebase it and ping someone to review.

Herald added a reviewer: aaron.ballman. · View Herald TranscriptJul 2 2023, 6:20 PM

Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript

Herald added subscribers: llvm-commits, cfe-commits, jdoerfert. · View Herald Transcript

Harbormaster completed remote builds in B242719: Diff 536643.Jul 2 2023, 6:33 PM

Removed special Sema processing and btf_decl_tag("ctx") prioritization in CGExpr, preserve access index calls are now handled by BPFContextMarker transformation.
Correctness fixes for isPointerOperand().

Harbormaster completed remote builds in B255148: Diff 553819.Aug 27 2023, 6:08 PM

yonghong-song added inline comments.Aug 27 2023, 8:19 PM

llvm/include/llvm/IR/Intrinsics.td
2472	Is it possible to make this builtin as BPF specific one?

eddyz87 added inline comments.Aug 28 2023, 3:18 AM

llvm/include/llvm/IR/Intrinsics.td
2472	Currently `llvm::Intrinsic::context_marker_bpf` gets defined in `llvm/IR/Intrinsics.h` (via include of generated file `IntrinsicEnums.inc`, same as `preserve_{struct,union,array}_access_index`). BPF specific intrinsics are defined in `llvm/IR/IntrinsicsBPF.h` (generated directly w/o .inc intermediary). Thus, if I move `context_marker_bpf` to `IntrinsicsBPF.td` I would have to include `IntrinsicsBPF.h` in `CGExpr.cpp`. However, I don't see any target specific includes in that file.

git-clang-format fixes.

Harbormaster completed remote builds in B255184: Diff 553884.Aug 28 2023, 4:29 AM

I will try to review other parts of code in the next few days.

llvm/include/llvm/IR/Intrinsics.td
2472	I went through the related clang code and indeed found it is hard to get a BPF target defined function in CGF or CGM. On the other hand, we can consider this new builtin under the umbrella "Intrinsics that are used to preserve debug information". Maybe we can rename the intrinsic name to 'int_preserve_context_marker'? The goal of this builtin to preserve certain load/store which should be immune from optimizations. I try to generalize this so your function name 'wrapWithBPFContextMarker' can be renamed to 'wrapWithContextMarker'. There is no need to mention BPF any more. In the commit message, you can mention that similar to int_preserve_array_access_index, int_preserve_context_marker is only implemented in BPF backend. But other architectures can implement processing these intrinsics if they want to achieve some results similar to bpf backend. WDYT?

eddyz87 added inline comments.Aug 29 2023, 5:22 AM

llvm/include/llvm/IR/Intrinsics.td
2472	I can rename these things, but tbh I don't think this functionality would be useful anywhere outside BPF, thus such renaming would be kind-of deceptive (and in case it would be useful, the renaming could be done at the time of second use). Something more generic might look like `!btf_decl_tag <str-value>` metadata node attached to something. However, in the current form this would require transfer of such decl tag from type to function parameters and variables, e.g.: struct foo { ... } __attribute__((btf_decl_tag("ctx"))); void bar(struct foo p) { ... } During code-gen for `bar` use rule like "if function parameter has type annotated with btf_decl_tag, attach metadata to such parameter": define void @bar(ptr %p !btf_decl_tag !1) { ... } !1 = { "ctx" } Such rule looks a bit weird: tag is transferred from type to it's usage what usages should be annotated? we care about function parameters but from generic point of view `alloca`s or field accesses should be annotated as well. The same metadata approach but with "ctx" attributes annotating function parameters (as you suggested originally, if I recall correctly) seems to be most generic and least controversial of all, e.g.: void bar(struct foo p __attribute__((btf_decl_tag("ctx")))) { ... } Converted to: define void @bar(ptr %p !btf_decl_tag !1) { ... } !1 = { "ctx" } However, this is less ergonomic for BPF, because user will have to annotate function parameters. (On the other hand, no changes in the kernel headers would be necessary).

I can rename these things, but tbh I don't think this functionality would be useful anywhere outside BPF, thus such renaming would be kind-of deceptive (and in case it would be useful, the renaming could be done at the time of second use).

Then let us keep this for now. We can decide what to do after clang frontend people reviewed this code.

I can rename these things, but tbh I don't think this functionality would be useful anywhere outside BPF, thus such renaming would be kind-of deceptive (and in case it would be useful, the renaming could be done at the time of second use).

I agree that it's not useful outside of BPF, but it's useful outside of 'ctx'. I think 'preserve_constant_field_offset' would be more accurate description of the restriction.
We can expand in the doc that it's a constant offset when field of the struct is accessed.

Also instead of btf_tag it would be better to add another builtin similar to preserve_access_index.
Currently we add attribute((preserve_access_index)) to trigger CO-RE.
This one will be a new attribute((preserve_constant_field_offset)) that will be specified manually either in uapi/bpf.h or in vmlinux.h on some structs
and it will have precedence over preserve_access_index, so
(attribute((preserve_access_index)), apply_to = record) in vmlinux.h will be ignored on such structs.
Otherwise it's a bit odd that special names inside btf_tag have stronger rules than other attribute((preserve_access_index)).

In D133361#4628951, @ast wrote:

I agree that it's not useful outside of BPF, but it's useful outside of 'ctx'. I think 'preserve_constant_field_offset' would be more accurate description of the restriction.
We can expand in the doc that it's a constant offset when field of the struct is accessed.

Also instead of btf_tag it would be better to add another builtin similar to preserve_access_index.
Currently we add attribute((preserve_access_index)) to trigger CO-RE.
This one will be a new attribute((preserve_constant_field_offset)) that will be specified manually either in uapi/bpf.h or in vmlinux.h on some structs

This makes sense, I'll adjust the implementation to use new attribute (also will try to use metadata instead of intrinsic call, replace by intrinsic on the back-end side).

and it will have precedence over preserve_access_index, so
(attribute((preserve_access_index)), apply_to = record) in vmlinux.h will be ignored on such structs.
Otherwise it's a bit odd that special names inside btf_tag have stronger rules than other attribute((preserve_access_index)).

Note on current propagation logic: whenever there is an expression E of type T, where T is a struct annotated with btf_decl_tag("ctx"), all usages of E are traversed recursively visiting getelementptr and calls to preserve_{struct,union,array}_index, the latter are replaced by getelementptr. E.g.:

#define __pai __attribute__((preserve_access_context));
#define __ctx __attribute__((btf_decl_tag("ctx")))
struct bar { int a; int b; } __pai;
struct foo {
  struct bar b;
} __ctx __pai;
... struct foo f; ...
... f.b.bb ...

The call to preserve_struct_index generated for bb in f.b.bb would be replaced by getelementptr and later by getelementptr.and.load.
However, context structures don't have nested structures at the moment. The list of context structures is:

__sk_buff
bpf_cgroup_dev_ctx
bpf_sk_lookup
bpf_sock
bpf_sock_addr
bpf_sock_ops
bpf_sockopt
bpf_sysctl
sk_msg_md
sk_reuseport_md
xdp_md

In D133361#4629081, @eddyz87 wrote:
Note on current propagation logic: whenever there is an expression E of type T, where T is a struct annotated with btf_decl_tag("ctx"), all usages of E are traversed recursively visiting getelementptr and calls to preserve_{struct,union,array}_index, the latter are replaced by getelementptr. E.g.:
#define __pai __attribute__((preserve_access_context));
#define __ctx __attribute__((btf_decl_tag("ctx")))
struct bar { int a; int b; } __pai;
struct foo {
  struct bar b;
} __ctx __pai;
... struct foo f; ...
... f.b.bb ...
The call to preserve_struct_index generated for bb in f.b.bb would be replaced by getelementptr and later by getelementptr.and.load.
However, context structures don't have nested structures at the moment. The list of context structures is:

__sk_buff

bpf_cgroup_dev_ctx

bpf_sk_lookup

bpf_sock

bpf_sock_addr

bpf_sock_ops

bpf_sockopt

bpf_sysctl

sk_msg_md

sk_reuseport_md

xdp_md

Right. Such recursive propagation of PAI is necessary. For btf_tag we cannot do it. Always propagating it won't be correct.
New preserve_const_field_offset would need to be propagated too and we actually have nested unions.
__bpf_md_ptr is such example. btf_tag wouldn't propagate into that union, but attr(preserve_const_field_offset) should.

After retesting kernel build with LLVM=1 and libbpf patch to reconstruct btf_decl_tag [1], statistics for BPF selftests looks as follows:

out of 653 object files 13 have some differences with and w/o this change;
for 2 programs there is small instruction count increase (+2 insn total);
for 5 programs there is small instruction decrease (-6 insn total);
6 programs differ slightly but number of instructions is the same.

(The differences are insignificant, the rest of the comment could be skipped as it is probably not interesting for anyone but me).

Differences for first 5 programs were already described in this comment. The rest of the differences in described below.

netns_cookie_prog.bpf.o

Without ctx: 46 instructions
With ctx: 46 instructions

Instruction reordering:

 <get_netns_cookie_sk_msg>:
 	r6 = r1
-	r2 = *(u64 *)(r6 + 0x48)
 	r1 = *(u32 *)(r6 + 0x10)
-	if w1 != 0xa goto +0xb <LBB1_4>
+	if w1 != 0xa goto +0xc <LBB1_4>
+	r2 = *(u64 *)(r6 + 0x48)
 	if r2 == 0x0 goto +0xa <LBB1_4>
 	r1 = 0x0 ll
 	r3 = 0x0

The difference is introduced by "Machine code sinking" transformation. Before the transformation both 0x48 and 0x10 loads reside in the same basic block:

;; Old:
bb.0.entry:
  ...
  %0:gpr = CORE_LD64 345, %2:gpr, @"llvm.sk_msg_md:0:72$0:10:0"
  %9:gpr32 = CORE_LD32 350, %2:gpr, @"llvm.sk_msg_md:0:16$0:2"
  JNE_ri_32 killed %9:gpr32, 10, %bb.3

;; New:
bb.0.entry:
  ...
  %0:gpr = LDD %2:gpr, 72
  %3:gpr32 = LDW32 %2:gpr, 16
  JNE_ri_32 killed %3:gpr32, 10, %bb.3

Note: CORE pseudo-instructions are replaced by regular loads because btf_decl_tag("ctx") has priority over preserve_access_index attribute. The "Machine code sinking" transformation (MachineSink.cpp) can move LDD, LDW instructions, but can't move CORE_LD* because CORE_LD* instructions are marked as MCID::UnmodeledSideEffects in BPFGenInstrInfo.inc (maybe something to adjust):

// called from MachineSinking::SinkInstruction
bool MachineInstr::isSafeToMove(AAResults *AA, bool &SawStore) const {
  if (... hasUnmodeledSideEffects())
    return false;
  ...
}

sock_destroy_prog.bpf.o

Without ctx: 102 instructions
With ctx: 101 instructions

In the following code fragment:

	if (ctx->protocol == IPPROTO_TCP)
		bpf_map_update_elem(&tcp_conn_sockets, &key, &sock_cookie, 0);
	else if (ctx->protocol == IPPROTO_UDP)
		bpf_map_update_elem(&udp_conn_sockets, &keyc, &sock_cookie, 0);
	else
		return 1;

Version w/o btf_decl_tag("ctx") keeps two loads for ctx->protocol because of the llvm.bpf.passthrough call. Version with btf_decl_tag("ctx") eliminates second load via a combination of EarlyCSEPass/InstCombinePass/SimplifyCFGPass passes.

socket_cookie_prog.bpf.o

Without ctx: 66 instructions
With ctx: 66 instructions

For the following C code fragment:

SEC("sockops")
int update_cookie_sockops(struct bpf_sock_ops *ctx)
{
	struct bpf_sock *sk = ctx->sk;
	struct socket_cookie *p;

	if (ctx->family != AF_INET6)
		return 1;

	if (ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
		return 1;

	if (!sk)
		return 1;
    ...
}

Code with decl_tag("ctx") does reordering for ctx->sk load relative to ctx->family and ctx->op loads:

//  old                                       new
 <update_cookie_sockops>:                  <update_cookie_sockops>:
    r6 = r1                                   r6 = r1
-   r2 = *(u64 *)(r6 + 0xb8)
    r1 = *(u32 *)(r6 + 0x14)                  r1 = *(u32 *)(r6 + 0x14)
    if w1 != 0xa goto +0x13 <LBB1_6>          if w1 != 0xa goto +0x14 <LBB1_6>
    r1 = *(u32 *)(r6 + 0x0)                   r1 = *(u32 *)(r6 + 0x0)
    if w1 != 0x3 goto +0x11 <LBB1_6>          if w1 != 0x3 goto +0x12 <LBB1_6>
                                          +   r2 = *(u64 *)(r6 + 0xb8)
    if r2 == 0x0 goto +0x10 <LBB1_6>          if r2 == 0x0 goto +0x10 <LBB1_6>
    r1 = 0x0 ll                               r1 = 0x0 ll
    r3 = 0x0                                  r3 = 0x0

Code w/o decl_tag("ctx") uses CORE_LD* instructions for these loads and does not reorder loads due to reasons as in netns_cookie_prog.bpf.o.

test_lwt_reroute.bpf.o

Without ctx: 18 instructions
With ctx: 17 instructions

The difference boils down EarlyCSEPass being able to remove last load in the store/load pair:

llvm
; Before EarlyCSEPass
  store i32 %and, ptr %mark24, align 8
  %mark25 = getelementptr inbounds %struct.__sk_buff, ptr %skb, i32 0, i32 2
  %19 = load i32, ptr %mark25, align 8
  %cmp26 = icmp eq i32 %19, 0
; After EarlyCSEPass
  %and = and i32 %cond, 255
  store i32 %and, ptr %mark, align 8
  %cmp26 = icmp eq i32 %and, 0

And unable to do so when get.element.and.{store,load} intrinsics are used. Which leads to slight codegen differences downstream.

test_sockmap_invalid_update.bpf.o

Without ctx: 13 instructions
With ctx: 12 instructions

In the following C fragment:

c
	if (skops->sk)
		bpf_map_update_elem(&map, &key, skops->sk, 0);

Code with decl_tag("ctx") loads skops->sk only once. Code w/o decl_tag("ctx") uses CO-RE relocations and does load twice. As with sock_destroy_prog.bpf.o, EarlyCSEPass does not consolidate identical %x = call llvm.bpf.passthrough; load %x pairs.

type_cast.bpf.o

Without ctx: 96 instructions
With ctx: 96 instructions

__builtin_memcpy(name, dev->name, IFNAMSIZ) is unrolled in a different order. No idea why.

core_kern.bpf.o, test_verif_scale2.bpf.o

For both programs number of instructions is unchanged (11249, 12286). Some instructions have different order after DAG->DAG Pattern Instruction Selection. Instruction selection with CO-RE and non-CO-RE loads produces slightly different result.

In D133361#4629292, @ast wrote:

...
Right. Such recursive propagation of PAI is necessary. For btf_tag we cannot do it. Always propagating it won't be correct.
New preserve_const_field_offset would need to be propagated too and we actually have nested unions.
__bpf_md_ptr is such example. btf_tag wouldn't propagate into that union, but attr(preserve_const_field_offset) should.

Hi Alexei,

It just occurred to me that such an attribute would also require DWARF and BTF encoding in order to get reflected in vmlinux.h (which we already have for btf_decl_tag). Given this I think we can rename decl tag "ctx" to btf_decl_tag("preserve_const_field_offset") but we should still keep it a btf_decl_tag. I'll try to replace usage of bpf_context_marker intrinsic by metadata, if that fails will just rename the intrinsic to preserve_const_field_offset.
What do you think? (Sorry, I should have thought this through last week).

In D133361#4637008, @eddyz87 wrote:

In D133361#4629292, @ast wrote:

...
Right. Such recursive propagation of PAI is necessary. For btf_tag we cannot do it. Always propagating it won't be correct.
New preserve_const_field_offset would need to be propagated too and we actually have nested unions.
__bpf_md_ptr is such example. btf_tag wouldn't propagate into that union, but attr(preserve_const_field_offset) should.

Hi Alexei,

It just occurred to me that such an attribute would also require DWARF and BTF encoding in order to get reflected in vmlinux.h (which we already have for btf_decl_tag). Given this I think we can rename decl tag "ctx" to btf_decl_tag("preserve_const_field_offset") but we should still keep it a btf_decl_tag. I'll try to replace usage of bpf_context_marker intrinsic by metadata, if that fails will just rename the intrinsic to preserve_const_field_offset.
What do you think? (Sorry, I should have thought this through last week).

I still feel that new attr is much cleaner from llvm implementation/design perspective and vmlinux.h inconvenience should be low priority in this considerations.
Since ctx only applies to uapi/bpf.h header the users don't have to use vmlinux.h. I know that today we have pains combining uapi headers and vmlinux.h and several solutions were
proposed. None were accepted yet, but that shouldn't mean that we should sacrifice llvm implementation due to orthogonal issues.
As a temporary workaround for vmlinux.h we can have uapi/bpf.h to do attr(preserve_const_field_offset) _and_ btf_decl_tag("bpf_ctx_struct"). Then teach pahole to emit attr(preserve_const_field_offset)
when it sees btf_decl_tag("bpf_ctx_struct") in vmlinux BTF.
Other workarounds are possible.

eddyz87 updated this revision to Diff 556819.Sep 14 2023, 4:50 PM

eddyz87 retitled this revision from [BPF] Attribute btf_decl_tag("ctx") for structs to [BPF] Attribute preserve_static_offset for structs.

Use (new) attribute((preserve_static_offset)) instead of attribute((btf_decl_tag("ctx"))).
Rename files, passes, etc accordingly.

Harbormaster completed remote builds in B257253: Diff 556819.Sep 14 2023, 5:41 PM

Fix for failed unit test pragma-attribute-supported-attributes-list.test

Harbormaster completed remote builds in B257280: Diff 556858.Sep 15 2023, 8:51 AM

ast added a subscriber: jemarch.Sep 19 2023, 2:36 AM

I don't have a whole lot of opinions on the attribute itself (I'm not super familiar with BPF), but did spot some things to do on the Clang side. This will also need reviewers for the LLVM changes -- any ideas on who usually reviews BPF-related changes in LLVM?

clang/lib/Sema/SemaDeclAttr.cpp
7682–7683
9040
clang/test/Sema/bpf-attr-preserve-static-offset.c
2	You should also add a `-verify` test that verifies we diagnose applying the attribute to something other than a structure or union, accepts no arguments, is diagnosed on a non-BPF target, etc to ensure we've got correct diagnostic behavior.

The CFE stuff is pretty innocuous and I don't see a reason to stop this on our accord, but would like someone familiar with the LLVM stuff to review this at one point.

clang/lib/CodeGen/CGExpr.cpp
3837
3841	You can use `E->getType()->getPointeeType()` here instead, unless you REALLY care that this is only a normal-pointer deref (and not a PMF or reference type).
3924	I'd suggest just making `hasBPFPreserveStaticOffset` be `nullptr` tolerant and remove the 1st half of this.
clang/lib/Sema/SemaDeclAttr.cpp
7686	This should use `BPFPreserveStaticOffsetAttr::Create` instead of using placement `new`. We've been meaning to translate these all at one point, but never got around to it.
9040	Or this, this is probably better.
llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll
62	Are we sure we want to do something like this? It seems this both depends on YOUR computer AND us never releasing Clang 18.

Rebase, changes as requested by @aaron.ballman and @erichkeane.

Hi @aaron.ballman, @erichkeane,

Thank you for taking a look.
I beleive this commit covers all feedback except "clang version"
metadata comment by @erichkeane, I added inline reply there.

This will also need reviewers for the LLVM changes -- any ideas on
who usually reviews BPF-related changes in LLVM?

I'll communicate with @ast and @yonghong-song.

Harbormaster completed remote builds in B257698: Diff 557502.Sep 29 2023, 5:18 PM

In D133361#4652102, @eddyz87 wrote:

Rebase, changes as requested by @aaron.ballman and @erichkeane.

Hi @aaron.ballman, @erichkeane,

Thank you for taking a look.
I beleive this commit covers all feedback except "clang version"
metadata comment by @erichkeane, I added inline reply there.

This will also need reviewers for the LLVM changes -- any ideas on
who usually reviews BPF-related changes in LLVM?

I'll communicate with @ast and @yonghong-song.

I don't see the comment response you had to me.

clang/lib/CodeGen/CGExpr.cpp
3843	getPointeeType can also return nullptr, so unless you have a test elsewhere to ensure it isn't, you likely have to do a little more work here (and if so, I'd need an assert).

In D133361#4652368, @erichkeane wrote:

I don't see the comment response you had to me.

Sorry, forgot to click submit.

clang/lib/Sema/SemaDeclAttr.cpp
7686	Replaced by a call to `handleSimpleAttribute` as suggested by Aaron.
clang/test/Sema/bpf-attr-preserve-static-offset.c
2	`-verify` looks neat, thank you. I've added two test cases: clang/test/Sema/bpf-attr-preserve-static-offset-warns.c clang/test/Sema/bpf-attr-preserve-static-offset-warns-nonbpf.c Those check for things you listed: can't be used for non-bpf target can't take parameters is error to put it on something other than struct / union. Tbh, I don't know if there is anything else to test in this regard.
llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll
62	Are you sure this would be an issue? The specific line is not a part of a CHECK and I tried the following command using my system's llvm 16 opt: opt -O2 -mtriple=bpf-pc-linux -S -o - load-align.ll And module was loaded / processed w/o any issues. In general grepping shows that people don't usually mask these in tests: $ cd llvm/test/CodeGen/ $ ag '{!"clang version' \| wc -l 452

eddyz87 added inline comments.Oct 2 2023, 7:33 AM

clang/lib/CodeGen/CGExpr.cpp
3843	Is it? I actually double-checked this before pushing an update, clangd jumps to the following definition: QualType Type::getPointeeType() const { if (const auto *PT = getAs<PointerType>()) return PT->getPointeeType(); ... return {}; } The `getAsRecordDecl()` can return null indeed, but that null is checked.

erichkeane added inline comments.Oct 2 2023, 7:38 AM

clang/lib/CodeGen/CGExpr.cpp
3843	More correctly, it returns an empty `QualType`. The `operator->` on that will cause a `nullptr`, causing the call to `Type::getAsRecordDecl` to have a `nullptr` `this`.
llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll
62	I don't write LLVM tests ever, so I'm not sure. It just seems odd to provide that much irrelevant info, perhaps one of hte LLVM reviewers can comment. Also, look at those ~450 and see what they contain?

eddyz87 added inline comments.Oct 2 2023, 7:45 AM

clang/lib/CodeGen/CGExpr.cpp

3843

Oh, right, I'll add an additional check, thank you.

llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll

Also, look at those ~450 and see what they contain?

Same random clang versions:

$ ag '{!"clang version' | head
X86/debug-loclists.ll:129:!6 = !{!"clang version 10.0.0 (trunk 374581) (llvm/trunk 374579)"}
X86/dbg-combine.ll:84:!11 = !{!"clang version 3.7.0 (trunk 227074)"}
X86/debuginfo-locations-dce.ll:46:!6 = !{!"clang version 8.0.0 (trunk 339665)"}
X86/pr31242.ll:48:!5 = !{!"clang version 4.0.0 (trunk 288844)"}
X86/catchpad-regmask.ll:140:!1 = !{!"clang version 3.8.0 "}
X86/debug-nodebug-crash.ll:48:!5 = !{!"clang version 4.0.0"}
X86/limit-split-cost.mir:64:  !2 = !{!"clang version 7.0.0 (trunk 335057)"}
X86/swap.ll:169:!1 = !{!"clang version 9.0.0 (trunk 352631) (llvm/trunk 352632)"}
X86/dwarf-aranges-available-externally.ll:65:!16 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git 2f52a868225755ebfa5242992d3a650ac6aadce7)"}
X86/label-annotation.ll:96:!7 = !{!"clang version 9.0.0 (git@github.com:llvm/llvm-project.git 7f9a008a2db285aca57bfa0c09858c9527a7aa98)"}

erichkeane added inline comments.Oct 2 2023, 7:47 AM

llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll
62	It seems at least removing your home-path would be a good idea, but I can't really review these either. Note the 'trunk `#`' is from our SVN days, so that isn't particularly useful at all. IMO everything in the parens is worthless in the tests, but hopefully someone familiar with the LLVM tests can stop by and correct me.

eddyz87 added inline comments.Oct 2 2023, 7:57 AM

llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll
62	Most of the tests use generated IR as a starting point and it looks like people don't bother to hide these details. But I checked and test passes if I replace this string with just "clang", I'll update the tests, not a big deal.

Additional changes to address feedback from @erichkeane:

Update for hasBPFPreserveStaticOffset: added null check for getPointeeType() result (note: currently this function is only called for array base, so getPointeeType() always returns a non-null but I think that having a null check instead of assert is still better in this context).
Update for all .ll test cases to remove references to clang version or compilation directory from all metadata.

eddyz87 marked 4 inline comments as done.Oct 2 2023, 3:03 PM

Harbormaster completed remote builds in B257726: Diff 557541.Oct 2 2023, 7:36 PM

2 nits on the CFE, else LGTM! Obviously you still need an LLVM reviewer for the backend changes.

clang/lib/CodeGen/CGExpr.cpp
3844	We override `operator bool` to make this work.
3846	if possible?

This revision is now accepted and ready to land.Oct 3 2023, 7:29 AM

Rebase, added "const" qualifier in hasBPFPreserveStaticOffset().

eddyz87 added inline comments.Oct 3 2023, 5:14 PM

clang/lib/CodeGen/CGExpr.cpp

3844

Sorry, just to clarify, currently such modification fails with the following error:

clang/lib/CodeGen/CGExpr.cpp:3710:7: error: invalid argument type 'QualType' to unary expression
  if (!PointeeType)
      ^~~~~~~~~~~~
1 error generated.

And you want me to modify QualType as follows:

--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -796,6 +796,8 @@ public:
     return getTypePtr();
   }
 
+  explicit operator bool() const { return isNull(); }
+
   bool isCanonical() const;
   bool isCanonicalAsParam() const;

Right?

Harbormaster completed remote builds in B257750: Diff 557580.Oct 3 2023, 9:06 PM

erichkeane added inline comments.Oct 4 2023, 6:53 AM

clang/lib/CodeGen/CGExpr.cpp
3844	No, don't do that, you can leave it just checking isNull. I could have sworn we already had that operator, but perhaps it was removed at one point.

eddyz87 marked 3 inline comments as done.Oct 4 2023, 6:56 AM

eddyz87 added inline comments.

clang/lib/CodeGen/CGExpr.cpp
3844	Understood. Thank you for the review.

ast added inline comments.Oct 10 2023, 11:55 AM

clang/lib/CodeGen/CGExpr.cpp
3925	If I'm reading this correctly wrapping with preserve_static_offset doesn't prevent further preserver_access_index wrapping which is a wasted effort for pai at the end ?

eddyz87 marked an inline comment as done.Oct 10 2023, 12:00 PM

eddyz87 added inline comments.

clang/lib/CodeGen/CGExpr.cpp
3925	Yes, pai calls are undone in `BPFPreserveStaticOffset.cpp:removePAICalls()`. I can put back the logic that suppresses pai if preserve static offset is present.

ast added inline comments.Oct 10 2023, 1:15 PM

clang/lib/CodeGen/CGExpr.cpp
3925	I see. I guess I missed a previous discussion. Why this approach was chosen?

eddyz87 added inline comments.Oct 10 2023, 1:20 PM

clang/lib/CodeGen/CGExpr.cpp
3925	Initial version used `__attribute__((btf_decl_tag("ctx")))` and Yonghong did not want to have prioritization between `btf_decl_tag` and `preserve_access_index` basing on decl tag string parameter. Now this limitation is gone (and I think this was one of your arguments in favor of separate attribute).

ast accepted this revision.Oct 10 2023, 1:28 PM

ast added inline comments.

clang/lib/CodeGen/CGExpr.cpp
3925	Ahh. Right. It made sense to avoid special treatment of strings in decl_tag, but now it's gone and PSO takes precedence over PAI. Here we're adding PAI just to remove it later. Looks like a waste of cpu cycles and code. Unless applying PAI to outer struct and PSO in inner makes implementation tricky. I doubt we need to support such combo though. I'm fine cleaning this up in a follow up. If such cleanup makes sense at all.

eddyz87 added inline comments.Oct 10 2023, 1:30 PM

clang/lib/CodeGen/CGExpr.cpp
3925	If such cleanup makes sense at all. I think it does, I'll implement this change and push an update.

I went through LLVM part of the change, mostly look good to me from high-level and I need to go through another pass with details.
It would be great if you can post corresponding kernel patches which utilizes this functionality?

clang/include/clang/Basic/AttrDocs.td
2227	g->b.a ?
llvm/lib/Target/BPF/BPFPreserveStaticOffset.cpp
83	Is it possible that such a pattern already changed by GVN/CSE so later on converting to preserve_static_offset becomes impossible? I guess it is unlikely but want to double check. Maybe add a comment to clarify.
103	So the above implies that this is bad for CSE/GVN due to different representation, right? But we want to do (1) since we want to avoid later CSE/GVN etc, so argument of clobbering is not that important, right?

I tried a couple of examples with both preserve_access_index and preserve_static_offset (both 'preserve_access_index preserve_static_offset' and 'preserve_static_offset preserve_access_index'). Looks they cooperate with each other properly. preserve_static_offset is for base pointer while
preserve_access_index is for offset. So the current implementation seems OKAY. But preserve_static_offset actually means 'static offset' (no offset change),
so preserve_access_index seems not necessary, so if possible maybe clang frontend should make a choice to remove preserve_access_index if
preserve_static_offset is there. But this can be done later (need frontend review again, need to update doc etc.)

Rebase.

Fix for documentation type g->a.b -> g->b.a.

eddyz87 marked an inline comment as done.Wed, Nov 29, 4:45 PM

eddyz87 added inline comments.

llvm/lib/Target/BPF/BPFPreserveStaticOffset.cpp
83	Actually yes, that is possible. It can probably be mitigated by allowing non-constant indices in the get.and.{load,store} functions. As we already agreed to land this, I'll do this as a separate MR.
103	We want to forbid some changes by CSE/GVN but allow others. E.g. suppose the following example: struct foo { int a; int b; } __attribute__((preserve_static_offset)); void consume(int); static void bar(int p) { consume(p); } void foo(struct foo *foo) { consume(foo->b); bar(&foo->b); } After inlining of `bar` to `foo` it would look like: consume(foo->b); consume(foo->b); And I attempted not to break CSE/GVN for such situations.

Harbormaster completed remote builds in B258141: Diff 558190.Wed, Nov 29, 8:11 PM

Rebase, hope for green CI.

Harbormaster completed remote builds in B258146: Diff 558196.Thu, Nov 30, 6:50 AM

Closed by commit rGcb13e9286b6d: [BPF] Attribute preserve_static_offset for structs (authored by eddyz87). · Explain WhyThu, Nov 30, 9:46 AM

This revision was automatically updated to reflect the committed changes.

eddyz87 added a commit: rGcb13e9286b6d: [BPF] Attribute preserve_static_offset for structs.

eddyz87 added a reverting change: rG248446980317: Revert "[BPF] Attribute preserve_static_offset for structs".Thu, Nov 30, 12:30 PM

Reopening revision to fix errors reported by testbot.

This revision is now accepted and ready to land.Thu, Nov 30, 5:20 PM

Fixes for issues reported by testbot:

bpf-preserve-static-offset-bitfield.c updated to use -target bpfel instead of -target bpf to avoid different code generation between big and little endian machines.
BPFPreserveStaticOffset.cpp:removePAICalls() updated to avoid use after free for WorkList elements: series of if (isXXX(V)) removeXXX(V) modified to be else-if statement.

Harbormaster completed remote builds in B258150: Diff 558201.Thu, Nov 30, 6:18 PM

Rebase.

Harbormaster completed remote builds in B258153: Diff 558210.Tue, Dec 5, 6:29 AM

Closed by commit rG030b8cb1561d: [BPF] Attribute preserve_static_offset for structs (authored by eddyz87). · Explain WhyTue, Dec 5, 9:25 AM

This revision was automatically updated to reflect the committed changes.

eddyz87 added a commit: rG030b8cb1561d: [BPF] Attribute preserve_static_offset for structs.

Revision Contents

Path

Size

clang/

include/

clang/

Basic/

Attr.td

8 lines

AttrDocs.td

37 lines

lib/

CodeGen/

CGExpr.cpp

34 lines

Sema/

SemaDeclAttr.cpp

3 lines

test/

CodeGen/

bpf-preserve-static-offset-arr.c

33 lines

bpf-preserve-static-offset-bitfield.c

31 lines

bpf-preserve-static-offset-lvalue.c

28 lines

bpf-preserve-static-offset-non-bpf.c

18 lines

bpf-preserve-static-offset-pai.c

29 lines

Misc/

pragma-attribute-supported-attributes-list.test

1 line

Sema/

bpf-attr-preserve-static-offset-warns-nonbpf.c

6 lines

bpf-attr-preserve-static-offset-warns.c

23 lines

bpf-attr-preserve-static-offset.c

27 lines

llvm/

include/

llvm/

IR/

Intrinsics.td

4 lines

IntrinsicsBPF.td

39 lines

lib/

Target/

BPF/

BPF.h

19 lines

BPFAbstractMemberAccess.cpp

77 lines

BPFCORE.h

4 lines

BPFCheckAndAdjustIR.cpp

56 lines

BPFPreserveStaticOffset.cpp

680 lines

BPFTargetMachine.cpp

11 lines

CMakeLists.txt

1 line

test/

CodeGen/

BPF/

preserve-static-offset/

66 lines

93 lines

66 lines

82 lines

73 lines

74 lines

load-chain-u8-type-mismatch.ll

73 lines

71 lines

68 lines

85 lines

75 lines

114 lines

71 lines

105 lines

67 lines

load-undo-chain-oob.ll

74 lines

load-undo-chain-u8.ll

68 lines

load-undo-chain.ll

73 lines

load-undo-simple.ll

65 lines

load-undo-volatile.ll

64 lines

load-union-pai.ll

110 lines

load-unroll-inline.ll

108 lines

95 lines

62 lines

57 lines

59 lines

60 lines

77 lines

67 lines

store-chain-u8-oob.ll

67 lines

68 lines

64 lines

136 lines

60 lines

62 lines

store-undo-chain-oob.ll

67 lines

store-undo-chain-u8.ll

62 lines

store-undo-chain.ll

68 lines

store-undo-simple.ll

61 lines

store-undo-volatile.ll

61 lines

store-unroll-inline.ll

104 lines

store-volatile.ll

56 lines

store-zero.ll

51 lines

Diff 558212

clang/include/clang/Basic/Attr.td

	Show First 20 Lines • Show All 2,018 Lines • ▼ Show 20 Lines
	def BPFPreserveAccessIndex : InheritableAttr,			def BPFPreserveAccessIndex : InheritableAttr,
	TargetSpecificAttr<TargetBPF> {			TargetSpecificAttr<TargetBPF> {
	let Spellings = [Clang<"preserve_access_index">];			let Spellings = [Clang<"preserve_access_index">];
	let Subjects = SubjectList<[Record], ErrorDiag>;			let Subjects = SubjectList<[Record], ErrorDiag>;
	let Documentation = [BPFPreserveAccessIndexDocs];			let Documentation = [BPFPreserveAccessIndexDocs];
	let LangOpts = [COnly];			let LangOpts = [COnly];
	}			}

				def BPFPreserveStaticOffset : InheritableAttr,
				TargetSpecificAttr<TargetBPF> {
				let Spellings = [Clang<"preserve_static_offset">];
				let Subjects = SubjectList<[Record], ErrorDiag>;
				let Documentation = [BPFPreserveStaticOffsetDocs];
				let LangOpts = [COnly];
				}

	def BTFDeclTag : InheritableAttr {			def BTFDeclTag : InheritableAttr {
	let Spellings = [Clang<"btf_decl_tag">];			let Spellings = [Clang<"btf_decl_tag">];
	let Args = [StringArgument<"BTFDeclTag">];			let Args = [StringArgument<"BTFDeclTag">];
	let Subjects = SubjectList<[Var, Function, Record, Field, TypedefName],			let Subjects = SubjectList<[Var, Function, Record, Field, TypedefName],
	ErrorDiag>;			ErrorDiag>;
	let Documentation = [BTFDeclTagDocs];			let Documentation = [BTFDeclTagDocs];
	let LangOpts = [COnly];			let LangOpts = [COnly];
	}			}
	▲ Show 20 Lines • Show All 2,313 Lines • Show Last 20 Lines

clang/include/clang/Basic/AttrDocs.td

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 2,193 Lines • ▼ Show 20 Lines	def BPFPreserveAccessIndexDocs : Documentation {
let Content = [{		let Content = [{
Clang supports the ``__attribute__((preserve_access_index))``		Clang supports the ``__attribute__((preserve_access_index))``
attribute for the BPF target. This attribute may be attached to a		attribute for the BPF target. This attribute may be attached to a
struct or union declaration, where if -g is specified, it enables		struct or union declaration, where if -g is specified, it enables
preserving struct or union member access debuginfo indices of this		preserving struct or union member access debuginfo indices of this
struct or union, similar to clang ``__builtin_preserve_access_index()``.		struct or union, similar to clang ``__builtin_preserve_access_index()``.
}];		}];
}		}

		def BPFPreserveStaticOffsetDocs : Documentation {
		let Category = DocCatFunction;
		let Content = [{
		Clang supports the ``__attribute__((preserve_static_offset))``
		attribute for the BPF target. This attribute may be attached to a
		struct or union declaration. Reading or writing fields of types having
		such annotation is guaranteed to generate LDX/ST/STX instruction with
		offset corresponding to the field.

		For example:

		.. code-block:: c

		struct foo {
		int a;
		int b;
		};

		struct bar {
		int a;
		struct foo b;
		} __attribute__((preserve_static_offset));

		void buz(struct bar *g) {
		g->b.a = 42;
		yonghong-songUnsubmitted Done Reply Inline Actions g->b.a ? yonghong-song: g->b.a ?
		}

		The assignment to ``g``'s field would produce an ST instruction with
		offset 8: ``*(u32)(r1 + 8) = 42;``.

		Without this attribute generated instructions might be different,
		depending on optimizations behavior. E.g. the example above could be
		rewritten as ``r1 += 8; *(u32)(r1 + 0) = 42;``.
		}];
		}

def BTFDeclTagDocs : Documentation {		def BTFDeclTagDocs : Documentation {
let Category = DocCatFunction;		let Category = DocCatFunction;
let Content = [{		let Content = [{
Clang supports the ``__attribute__((btf_decl_tag("ARGUMENT")))`` attribute for		Clang supports the ``__attribute__((btf_decl_tag("ARGUMENT")))`` attribute for
all targets. This attribute may be attached to a struct/union, struct/union		all targets. This attribute may be attached to a struct/union, struct/union
field, function, function parameter, variable or typedef declaration. If -g is		field, function, function parameter, variable or typedef declaration. If -g is
specified, the ``ARGUMENT`` info will be preserved in IR and be emitted to		specified, the ``ARGUMENT`` info will be preserved in IR and be emitted to
dwarf. For BPF targets, the ``ARGUMENT`` info will be emitted to .BTF ELF		dwarf. For BPF targets, the ``ARGUMENT`` info will be emitted to .BTF ELF
▲ Show 20 Lines • Show All 5,452 Lines • Show Last 20 Lines

clang/lib/CodeGen/CGExpr.cpp

Show First 20 Lines • Show All 3,827 Lines • ▼ Show 20 Lines static QualType getFixedSizeElementType(const ASTContext &ctx,

const VariableArrayType *vla) { const VariableArrayType *vla) {

QualType eltType; QualType eltType;

do { do {

eltType = vla->getElementType(); eltType = vla->getElementType();

} while ((vla = ctx.getAsVariableArrayType(eltType))); } while ((vla = ctx.getAsVariableArrayType(eltType)));

return eltType; return eltType;

} }

static bool hasBPFPreserveStaticOffset(const RecordDecl *D) {

return D && D->hasAttr<BPFPreserveStaticOffsetAttr>();

erichkeaneUnsubmitted

Done

static bool hasBPFPreserveStaticOffset(const RecordDecl *D) {

- return D->getAttr<BPFPreserveStaticOffsetAttr>();

+ return D->hasAttr<BPFPreserveStaticOffsetAttr>();

}

static bool hasBPFPreserveStaticOffset(const Expr *E) {

erichkeane:

}

static bool hasBPFPreserveStaticOffset(const Expr *E) {

if (!E)

erichkeaneUnsubmitted

Done

You can use E->getType()->getPointeeType() here instead, unless you REALLY care that this is only a normal-pointer deref (and not a PMF or reference type).

erichkeane: You can use `E->getType()->getPointeeType()` here instead, unless you REALLY care that this is…

return false;

QualType PointeeType = E->getType()->getPointeeType();

erichkeaneUnsubmitted

Done

getPointeeType can also return nullptr, so unless you have a test elsewhere to ensure it isn't, you likely have to do a little more work here (and if so, I'd need an assert).

erichkeane: getPointeeType can also return nullptr, so unless you have a test elsewhere to ensure it isn't…

eddyz87AuthorUnsubmitted

Done

Is it? I actually double-checked this before pushing an update, clangd jumps to the following definition:

QualType Type::getPointeeType() const {
  if (const auto *PT = getAs<PointerType>())
    return PT->getPointeeType();
  ...
  return {};
}

The getAsRecordDecl() can return null indeed, but that null is checked.

eddyz87: Is it? I actually double-checked this before pushing an update, clangd jumps to the following…

erichkeaneUnsubmitted

Done

More correctly, it returns an empty QualType. The operator-> on that will cause a nullptr, causing the call to Type::getAsRecordDecl to have a nullptr this.

erichkeane: More correctly, it returns an empty `QualType`. The `operator->` on that will cause a…

eddyz87AuthorUnsubmitted

Done

Oh, right, I'll add an additional check, thank you.

eddyz87: Oh, right, I'll add an additional check, thank you.

if (PointeeType.isNull())

erichkeaneUnsubmitted

Done

QualType PointeeType = E->getType()->getPointeeType();

- if (PointeeType.isNull())

+ if (!PointeeType)

return false;

We override operator bool to make this work.

erichkeane: We override `operator bool` to make this work.

eddyz87AuthorUnsubmitted

Done

Sorry, just to clarify, currently such modification fails with the following error:

clang/lib/CodeGen/CGExpr.cpp:3710:7: error: invalid argument type 'QualType' to unary expression
  if (!PointeeType)
      ^~~~~~~~~~~~
1 error generated.

And you want me to modify QualType as follows:

--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -796,6 +796,8 @@ public:
     return getTypePtr();
   }
 
+  explicit operator bool() const { return isNull(); }
+
   bool isCanonical() const;
   bool isCanonicalAsParam() const;

Right?

eddyz87: Sorry, just to clarify, currently such modification fails with the following error: ```…

erichkeaneUnsubmitted

Done

No, don't do that, you can leave it just checking isNull. I could have sworn we already had that operator, but perhaps it was removed at one point.

erichkeane: No, don't do that, you can leave it just checking isNull. I could have sworn we already had…

eddyz87AuthorUnsubmitted

Done

Understood.
Thank you for the review.

eddyz87: Understood. Thank you for the review.

return false;

if (const auto *BaseDecl = PointeeType->getAsRecordDecl())

erichkeaneUnsubmitted

Done

return false;

- if (auto *BaseDecl = PointeeType->getAsRecordDecl())

+ if (const auto *BaseDecl = PointeeType->getAsRecordDecl())

return hasBPFPreserveStaticOffset(BaseDecl);

if possible?

erichkeane: if possible?

return hasBPFPreserveStaticOffset(BaseDecl);

return false;

}

// Wraps Addr with a call to llvm.preserve.static.offset intrinsic.

static Address wrapWithBPFPreserveStaticOffset(CodeGenFunction &CGF,

Address &Addr) {

if (!CGF.getTarget().getTriple().isBPF())

return Addr;

llvm::Function *Fn =

CGF.CGM.getIntrinsic(llvm::Intrinsic::preserve_static_offset);

llvm::CallInst *Call = CGF.Builder.CreateCall(Fn, {Addr.getPointer()});

return Address(Call, Addr.getElementType(), Addr.getAlignment());

}

/// Given an array base, check whether its member access belongs to a record /// Given an array base, check whether its member access belongs to a record

/// with preserve_access_index attribute or not. /// with preserve_access_index attribute or not.

static bool IsPreserveAIArrayBase(CodeGenFunction &CGF, const Expr *ArrayBase) { static bool IsPreserveAIArrayBase(CodeGenFunction &CGF, const Expr *ArrayBase) {

if (!ArrayBase || !CGF.getDebugInfo()) if (!ArrayBase || !CGF.getDebugInfo())

return false; return false;

// Only support base as either a MemberExpr or DeclRefExpr. // Only support base as either a MemberExpr or DeclRefExpr.

// DeclRefExpr to cover cases like: // DeclRefExpr to cover cases like:

▲ Show 20 Lines • Show All 45 Lines • ▼ Show 20 Lines if (auto vla = CGF.getContext().getAsVariableArrayType(eltType)) {

eltType = getFixedSizeElementType(CGF.getContext(), vla); eltType = getFixedSizeElementType(CGF.getContext(), vla);

} }

// We can use that to compute the best alignment of the element. // We can use that to compute the best alignment of the element.

CharUnits eltSize = CGF.getContext().getTypeSizeInChars(eltType); CharUnits eltSize = CGF.getContext().getTypeSizeInChars(eltType);

CharUnits eltAlign = CharUnits eltAlign =

getArrayElementAlign(addr.getAlignment(), indices.back(), eltSize); getArrayElementAlign(addr.getAlignment(), indices.back(), eltSize);

if (hasBPFPreserveStaticOffset(Base))

erichkeaneUnsubmitted

Done

I'd suggest just making hasBPFPreserveStaticOffset be nullptr tolerant and remove the 1st half of this.

erichkeane: I'd suggest just making `hasBPFPreserveStaticOffset` be `nullptr` tolerant and remove the 1st…

addr = wrapWithBPFPreserveStaticOffset(CGF, addr);

astUnsubmitted

Not Done

If I'm reading this correctly wrapping with preserve_static_offset doesn't prevent further preserver_access_index wrapping which is a wasted effort for pai at the end ?

ast: If I'm reading this correctly wrapping with preserve_static_offset doesn't prevent further…

eddyz87AuthorUnsubmitted

Done

Yes, pai calls are undone in BPFPreserveStaticOffset.cpp:removePAICalls(). I can put back the logic that suppresses pai if preserve static offset is present.

eddyz87: Yes, pai calls are undone in `BPFPreserveStaticOffset.cpp:removePAICalls()`. I can put back the…

astUnsubmitted

Not Done

I see. I guess I missed a previous discussion. Why this approach was chosen?

ast: I see. I guess I missed a previous discussion. Why this approach was chosen?

eddyz87AuthorUnsubmitted

Done

Initial version used __attribute__((btf_decl_tag("ctx"))) and Yonghong did not want to have prioritization between btf_decl_tag and preserve_access_index basing on decl tag string parameter. Now this limitation is gone (and I think this was one of your arguments in favor of separate attribute).

eddyz87: Initial version used `__attribute__((btf_decl_tag("ctx")))` and Yonghong did not want to have…

astUnsubmitted

Not Done

Ahh. Right. It made sense to avoid special treatment of strings in decl_tag, but now it's gone and PSO takes precedence over PAI. Here we're adding PAI just to remove it later. Looks like a waste of cpu cycles and code. Unless applying PAI to outer struct and PSO in inner makes implementation tricky. I doubt we need to support such combo though. I'm fine cleaning this up in a follow up. If such cleanup makes sense at all.

ast: Ahh. Right. It made sense to avoid special treatment of strings in decl_tag, but now it's gone…

eddyz87AuthorUnsubmitted

Done

If such cleanup makes sense at all.

I think it does, I'll implement this change and push an update.

eddyz87: > If such cleanup makes sense at all. I think it does, I'll implement this change and push an…

llvm::Value *eltPtr; llvm::Value *eltPtr;

auto LastIndex = dyn_cast<llvm::ConstantInt>(indices.back()); auto LastIndex = dyn_cast<llvm::ConstantInt>(indices.back());

if (!LastIndex || if (!LastIndex ||

(!CGF.IsInPreservedAIRegion && !IsPreserveAIArrayBase(CGF, Base))) { (!CGF.IsInPreservedAIRegion && !IsPreserveAIArrayBase(CGF, Base))) {

eltPtr = emitArraySubscriptGEP( eltPtr = emitArraySubscriptGEP(

CGF, addr.getElementType(), addr.getPointer(), indices, inbounds, CGF, addr.getElementType(), addr.getPointer(), indices, inbounds,

signedIndices, loc, name); signedIndices, loc, name);

} else { } else {

▲ Show 20 Lines • Show All 612 Lines • ▼ Show 20 Lines const bool UseVolatile = isAAPCS(CGM.getTarget()) &&

CGM.getCodeGenOpts().AAPCSBitfieldWidth && CGM.getCodeGenOpts().AAPCSBitfieldWidth &&

Info.VolatileStorageSize != 0 && Info.VolatileStorageSize != 0 &&

field->getType() field->getType()

.withCVRQualifiers(base.getVRQualifiers()) .withCVRQualifiers(base.getVRQualifiers())

.isVolatileQualified(); .isVolatileQualified();

Address Addr = base.getAddress(*this); Address Addr = base.getAddress(*this);

unsigned Idx = RL.getLLVMFieldNo(field); unsigned Idx = RL.getLLVMFieldNo(field);

const RecordDecl *rec = field->getParent(); const RecordDecl *rec = field->getParent();

if (hasBPFPreserveStaticOffset(rec))

Addr = wrapWithBPFPreserveStaticOffset(*this, Addr);

if (!UseVolatile) { if (!UseVolatile) {

if (!IsInPreservedAIRegion && if (!IsInPreservedAIRegion &&

(!getDebugInfo() || !rec->hasAttr<BPFPreserveAccessIndexAttr>())) { (!getDebugInfo() || !rec->hasAttr<BPFPreserveAccessIndexAttr>())) {

if (Idx != 0) if (Idx != 0)

// For structs, we GEP to the field that the record layout suggests. // For structs, we GEP to the field that the record layout suggests.

Addr = Builder.CreateStructGEP(Addr, Idx, field->getName()); Addr = Builder.CreateStructGEP(Addr, Idx, field->getName());

} else { } else {

llvm::DIType *DbgInfo = getDebugInfo()->getOrCreateRecordType( llvm::DIType *DbgInfo = getDebugInfo()->getOrCreateRecordType(

▲ Show 20 Lines • Show All 56 Lines • ▼ Show 20 Lines if (base.getTBAAInfo().isMayAlias() ||

// Update the final access type and size. // Update the final access type and size.

FieldTBAAInfo.AccessType = CGM.getTBAATypeInfo(FieldType); FieldTBAAInfo.AccessType = CGM.getTBAATypeInfo(FieldType);

FieldTBAAInfo.Size = FieldTBAAInfo.Size =

getContext().getTypeSizeInChars(FieldType).getQuantity(); getContext().getTypeSizeInChars(FieldType).getQuantity();

} }

Address addr = base.getAddress(*this); Address addr = base.getAddress(*this);

if (hasBPFPreserveStaticOffset(rec))

addr = wrapWithBPFPreserveStaticOffset(*this, addr);

if (auto *ClassDef = dyn_cast<CXXRecordDecl>(rec)) { if (auto *ClassDef = dyn_cast<CXXRecordDecl>(rec)) {

if (CGM.getCodeGenOpts().StrictVTablePointers && if (CGM.getCodeGenOpts().StrictVTablePointers &&

ClassDef->isDynamicClass()) { ClassDef->isDynamicClass()) {

// Getting to any field of dynamic object requires stripping dynamic // Getting to any field of dynamic object requires stripping dynamic

// information provided by invariant.group. This is because accessing // information provided by invariant.group. This is because accessing

// fields may leak the real address of dynamic object, which could result // fields may leak the real address of dynamic object, which could result

// in miscompilation when leaked pointer would be compared. // in miscompilation when leaked pointer would be compared.

auto *stripped = Builder.CreateStripInvariantGroup(addr.getPointer()); auto *stripped = Builder.CreateStripInvariantGroup(addr.getPointer());

▲ Show 20 Lines • Show All 1,262 Lines • Show Last 20 Lines

clang/lib/Sema/SemaDeclAttr.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 7,673 Lines • ▼ Show 20 Lines

} }

static void handleBPFPreserveAccessIndexAttr(Sema &S, Decl *D, static void handleBPFPreserveAccessIndexAttr(Sema &S, Decl *D,

const ParsedAttr &AL) { const ParsedAttr &AL) {

auto *Rec = cast<RecordDecl>(D); auto *Rec = cast<RecordDecl>(D);

handleBPFPreserveAIRecord(S, Rec); handleBPFPreserveAIRecord(S, Rec);

Rec->addAttr(::new (S.Context) BPFPreserveAccessIndexAttr(S.Context, AL)); Rec->addAttr(::new (S.Context) BPFPreserveAccessIndexAttr(S.Context, AL));

} }

static bool hasBTFDeclTagAttr(Decl *D, StringRef Tag) { static bool hasBTFDeclTagAttr(Decl *D, StringRef Tag) {

aaron.ballmanUnsubmitted

Done

Rec->addAttr(::new (S.Context) BPFPreserveAccessIndexAttr(S.Context, AL));

}

- static void handleBPFPreserveStaticOffsetAttr(Sema &S, Decl *D,

- const ParsedAttr &AL) {

- auto *Rec = cast<RecordDecl>(D);

- Rec->addAttr(::new (S.Context) BPFPreserveStaticOffsetAttr(S.Context, AL));

- }

static bool hasBTFDeclTagAttr(Decl *D, StringRef Tag) {

for (const auto *I : D->specific_attrs<BTFDeclTagAttr>()) {

aaron.ballman:

for (const auto *I : D->specific_attrs<BTFDeclTagAttr>()) { for (const auto *I : D->specific_attrs<BTFDeclTagAttr>()) {

if (I->getBTFDeclTag() == Tag) if (I->getBTFDeclTag() == Tag)

return true; return true;

erichkeaneUnsubmitted

Done

This should use BPFPreserveStaticOffsetAttr::Create instead of using placement new. We've been meaning to translate these all at one point, but never got around to it.

erichkeane: This should use `BPFPreserveStaticOffsetAttr::Create` instead of using placement `new`. We've…

eddyz87AuthorUnsubmitted

Done

Replaced by a call to handleSimpleAttribute as suggested by Aaron.

eddyz87: Replaced by a call to `handleSimpleAttribute` as suggested by Aaron.

} }

return false; return false;

} }

static void handleBTFDeclTagAttr(Sema &S, Decl *D, const ParsedAttr &AL) { static void handleBTFDeclTagAttr(Sema &S, Decl *D, const ParsedAttr &AL) {

StringRef Str; StringRef Str;

if (!S.checkStringLiteralArgumentAttr(AL, 0, Str)) if (!S.checkStringLiteralArgumentAttr(AL, 0, Str))

return; return;

▲ Show 20 Lines • Show All 1,336 Lines • ▼ Show 20 Lines case ParsedAttr::AT_AMDGPUNumVGPR:

handleAMDGPUNumVGPRAttr(S, D, AL); handleAMDGPUNumVGPRAttr(S, D, AL);

break; break;

case ParsedAttr::AT_AVRSignal: case ParsedAttr::AT_AVRSignal:

handleAVRSignalAttr(S, D, AL); handleAVRSignalAttr(S, D, AL);

break; break;

case ParsedAttr::AT_BPFPreserveAccessIndex: case ParsedAttr::AT_BPFPreserveAccessIndex:

handleBPFPreserveAccessIndexAttr(S, D, AL); handleBPFPreserveAccessIndexAttr(S, D, AL);

break; break;

case ParsedAttr::AT_BPFPreserveStaticOffset:

handleSimpleAttribute<BPFPreserveStaticOffsetAttr>(S, D, AL);

aaron.ballmanUnsubmitted

Done

case ParsedAttr::AT_BPFPreserveStaticOffset:

- handleBPFPreserveStaticOffsetAttr(S, D, AL);

+ handleSimpleAttribute<BPFPreserveStaticOffsetAttr>(S, D, AL);

break;

aaron.ballman:

erichkeaneUnsubmitted

Done

Or this, this is probably better.

erichkeane: Or this, this is probably better.

break;

case ParsedAttr::AT_BTFDeclTag: case ParsedAttr::AT_BTFDeclTag:

handleBTFDeclTagAttr(S, D, AL); handleBTFDeclTagAttr(S, D, AL);

break; break;

case ParsedAttr::AT_WebAssemblyExportName: case ParsedAttr::AT_WebAssemblyExportName:

handleWebAssemblyExportNameAttr(S, D, AL); handleWebAssemblyExportNameAttr(S, D, AL);

break; break;

case ParsedAttr::AT_WebAssemblyImportModule: case ParsedAttr::AT_WebAssemblyImportModule:

handleWebAssemblyImportModuleAttr(S, D, AL); handleWebAssemblyImportModuleAttr(S, D, AL);

▲ Show 20 Lines • Show All 1,064 Lines • Show Last 20 Lines

clang/test/CodeGen/bpf-preserve-static-offset-arr.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
				// REQUIRES: bpf-registered-target
				// RUN: %clang -cc1 -triple bpf -disable-llvm-passes -S -emit-llvm -o - %s \
				// RUN: \| FileCheck %s

				// Check that call to preserve.static.offset is generated when array
				// member of a struct marked with __attribute__((preserve_static_offset))
				// is accessed.

				#define __ctx __attribute__((preserve_static_offset))

				struct foo {
				struct {
				int a;
				} b[7];
				} __ctx;

				// CHECK-LABEL: define dso_local i32 @arr_access
				// CHECK-SAME: (ptr noundef [[P:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
				// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP1:%.*]] = call ptr @llvm.preserve.static.offset(ptr [[TMP0]])
				// CHECK-NEXT: [[B:%.]] = getelementptr inbounds [[STRUCT_FOO:%.]], ptr [[TMP1]], i32 0, i32 0
				// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [7 x %struct.anon], ptr [[B]], i64 0, i64 2
				// CHECK-NEXT: [[A:%.]] = getelementptr inbounds [[STRUCT_ANON:%.]], ptr [[ARRAYIDX]], i32 0, i32 0
				// CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[A]], align 4
				// CHECK-NEXT: ret i32 [[TMP2]]
				//
				int arr_access(struct foo *p) {
				return p->b[2].a;
				}

clang/test/CodeGen/bpf-preserve-static-offset-bitfield.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
				// REQUIRES: bpf-registered-target
				// RUN: %clang -cc1 -triple bpfel -disable-llvm-passes -S -emit-llvm -o - %s \
				// RUN: \| FileCheck %s

				// Check that call to preserve.static.offset is generated when bitfield
				// from a struct marked with __attribute__((preserve_static_offset)) is
				// accessed.

				#define __ctx __attribute__((preserve_static_offset))

				struct foo {
				unsigned a:1;
				} __ctx;

				// CHECK-LABEL: define dso_local void @lvalue_bitfield
				// CHECK-SAME: (ptr noundef [[P:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
				// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP1:%.*]] = call ptr @llvm.preserve.static.offset(ptr [[TMP0]])
				// CHECK-NEXT: [[BF_LOAD:%.*]] = load i8, ptr [[TMP1]], align 4
				// CHECK-NEXT: [[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], -2
				// CHECK-NEXT: [[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 1
				// CHECK-NEXT: store i8 [[BF_SET]], ptr [[TMP1]], align 4
				// CHECK-NEXT: ret void
				//
				void lvalue_bitfield(struct foo *p) {
				p->a = 1;
				}

clang/test/CodeGen/bpf-preserve-static-offset-lvalue.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
				// REQUIRES: bpf-registered-target
				// RUN: %clang -cc1 -triple bpf -disable-llvm-passes -S -emit-llvm -o - %s \
				// RUN: \| FileCheck %s

				// Check that call to preserve.static.offset is generated when field of
				// a struct marked with __attribute__((preserve_static_offset)) is accessed.

				#define __ctx __attribute__((preserve_static_offset))

				struct foo {
				int a;
				} __ctx;

				// CHECK-LABEL: define dso_local void @lvalue
				// CHECK-SAME: (ptr noundef [[P:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
				// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP1:%.*]] = call ptr @llvm.preserve.static.offset(ptr [[TMP0]])
				// CHECK-NEXT: [[A:%.]] = getelementptr inbounds [[STRUCT_FOO:%.]], ptr [[TMP1]], i32 0, i32 0
				// CHECK-NEXT: store i32 42, ptr [[A]], align 4
				// CHECK-NEXT: ret void
				//
				void lvalue(struct foo *p) {
				p->a = 42;
				}

clang/test/CodeGen/bpf-preserve-static-offset-non-bpf.c

This file was added.

				// REQUIRES: x86-registered-target
				// RUN: %clang -cc1 -triple x86_64 -disable-llvm-passes -S -emit-llvm -o - %s \
				// RUN: \| FileCheck %s

				// Verify that __attribute__((preserve_static_offset))
				// has no effect for non-BPF target.

				#define __ctx __attribute__((preserve_static_offset))

				struct foo {
				int a;
				} __ctx;

				// CHECK-NOT: @llvm_preserve_static_offset

				int bar(struct foo *p) {
				return p->a;
				}

clang/test/CodeGen/bpf-preserve-static-offset-pai.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
				// REQUIRES: bpf-registered-target
				// RUN: %clang -cc1 -triple bpf -disable-llvm-passes -S -emit-llvm -o - %s \
				// RUN: \| FileCheck %s

				// Verify that preserve_static_offset does not interfere with
				// preserve_access_index at IR generation stage.

				#define __ctx __attribute__((preserve_static_offset))
				#define __pai __attribute__((preserve_access_index))

				struct foo {
				int a;
				} __ctx __pai;

				// CHECK-LABEL: define dso_local i32 @bar
				// CHECK-SAME: (ptr noundef [[P:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
				// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
				// CHECK-NEXT: [[TMP1:%.*]] = call ptr @llvm.preserve.static.offset(ptr [[TMP0]])
				// CHECK-NEXT: [[A:%.]] = getelementptr inbounds [[STRUCT_FOO:%.]], ptr [[TMP1]], i32 0, i32 0
				// CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[A]], align 4
				// CHECK-NEXT: ret i32 [[TMP2]]
				//
				int bar(struct foo *p) {
				return p->a;
				}

clang/test/Misc/pragma-attribute-supported-attributes-list.test

	Show All 17 Lines
	// CHECK-NEXT: AnyX86NoCfCheck (SubjectMatchRule_hasType_functionType)			// CHECK-NEXT: AnyX86NoCfCheck (SubjectMatchRule_hasType_functionType)
	// CHECK-NEXT: ArcWeakrefUnavailable (SubjectMatchRule_objc_interface)			// CHECK-NEXT: ArcWeakrefUnavailable (SubjectMatchRule_objc_interface)
	// CHECK-NEXT: ArmBuiltinAlias (SubjectMatchRule_function)			// CHECK-NEXT: ArmBuiltinAlias (SubjectMatchRule_function)
	// CHECK-NEXT: AssumeAligned (SubjectMatchRule_objc_method, SubjectMatchRule_function)			// CHECK-NEXT: AssumeAligned (SubjectMatchRule_objc_method, SubjectMatchRule_function)
	// CHECK-NEXT: Assumption (SubjectMatchRule_function, SubjectMatchRule_objc_method)			// CHECK-NEXT: Assumption (SubjectMatchRule_function, SubjectMatchRule_objc_method)
	// CHECK-NEXT: Availability ((SubjectMatchRule_record, SubjectMatchRule_enum, SubjectMatchRule_enum_constant, SubjectMatchRule_field, SubjectMatchRule_function, SubjectMatchRule_namespace, SubjectMatchRule_objc_category, SubjectMatchRule_objc_implementation, SubjectMatchRule_objc_interface, SubjectMatchRule_objc_method, SubjectMatchRule_objc_property, SubjectMatchRule_objc_protocol, SubjectMatchRule_record, SubjectMatchRule_type_alias, SubjectMatchRule_variable))			// CHECK-NEXT: Availability ((SubjectMatchRule_record, SubjectMatchRule_enum, SubjectMatchRule_enum_constant, SubjectMatchRule_field, SubjectMatchRule_function, SubjectMatchRule_namespace, SubjectMatchRule_objc_category, SubjectMatchRule_objc_implementation, SubjectMatchRule_objc_interface, SubjectMatchRule_objc_method, SubjectMatchRule_objc_property, SubjectMatchRule_objc_protocol, SubjectMatchRule_record, SubjectMatchRule_type_alias, SubjectMatchRule_variable))
	// CHECK-NEXT: AvailableOnlyInDefaultEvalMethod (SubjectMatchRule_type_alias)			// CHECK-NEXT: AvailableOnlyInDefaultEvalMethod (SubjectMatchRule_type_alias)
	// CHECK-NEXT: BPFPreserveAccessIndex (SubjectMatchRule_record)			// CHECK-NEXT: BPFPreserveAccessIndex (SubjectMatchRule_record)
				// CHECK-NEXT: BPFPreserveStaticOffset (SubjectMatchRule_record)
	// CHECK-NEXT: BTFDeclTag (SubjectMatchRule_variable, SubjectMatchRule_function, SubjectMatchRule_record, SubjectMatchRule_field, SubjectMatchRule_type_alias)			// CHECK-NEXT: BTFDeclTag (SubjectMatchRule_variable, SubjectMatchRule_function, SubjectMatchRule_record, SubjectMatchRule_field, SubjectMatchRule_type_alias)
	// CHECK-NEXT: BuiltinAlias (SubjectMatchRule_function)			// CHECK-NEXT: BuiltinAlias (SubjectMatchRule_function)
	// CHECK-NEXT: CFAuditedTransfer (SubjectMatchRule_function)			// CHECK-NEXT: CFAuditedTransfer (SubjectMatchRule_function)
	// CHECK-NEXT: CFConsumed (SubjectMatchRule_variable_is_parameter)			// CHECK-NEXT: CFConsumed (SubjectMatchRule_variable_is_parameter)
	// CHECK-NEXT: CFGuard (SubjectMatchRule_function)			// CHECK-NEXT: CFGuard (SubjectMatchRule_function)
	// CHECK-NEXT: CFICanonicalJumpTable (SubjectMatchRule_function)			// CHECK-NEXT: CFICanonicalJumpTable (SubjectMatchRule_function)
	// CHECK-NEXT: CFUnknownTransfer (SubjectMatchRule_function)			// CHECK-NEXT: CFUnknownTransfer (SubjectMatchRule_function)
	// CHECK-NEXT: CPUDispatch (SubjectMatchRule_function)			// CHECK-NEXT: CPUDispatch (SubjectMatchRule_function)
	▲ Show 20 Lines • Show All 179 Lines • Show Last 20 Lines

clang/test/Sema/bpf-attr-preserve-static-offset-warns-nonbpf.c

This file was added.

				// RUN: %clang_cc1 -fsyntax-only -verify %s

				#define __pso __attribute__((preserve_static_offset))

				struct foo { int a; } __pso; // expected-warning{{unknown attribute}}
				union quux { int a; } __pso; // expected-warning{{unknown attribute}}

clang/test/Sema/bpf-attr-preserve-static-offset-warns.c

This file was added.

				// RUN: %clang_cc1 -fsyntax-only -verify -triple bpf-pc-linux-gnu %s

				#define __pso __attribute__((preserve_static_offset))

				// These are correct usages.
				struct foo { int a; } __pso;
				union quux { int a; } __pso;
				struct doug { int a; } __pso __attribute__((packed));

				// Rest are incorrect usages.
				typedef int bar __pso; // expected-error{{attribute only applies to}}
				struct goo {
				int a __pso; // expected-error{{attribute only applies to}}
				};
				int g __pso; // expected-error{{attribute only applies to}}
				__pso void ffunc1(void); // expected-error{{attribute only applies to}}
				void ffunc2(int a __pso); // expected-error{{attribute only applies to}}
				void ffunc3(void) {
				int a __pso; // expected-error{{attribute only applies to}}
				}

				struct buz { int a; } __attribute__((preserve_static_offset("hello"))); // \
				expected-error{{attribute takes no arguments}}

clang/test/Sema/bpf-attr-preserve-static-offset.c

This file was added.

				// RUN: %clang_cc1 -fsyntax-only -ast-dump -triple bpf-pc-linux-gnu %s \| FileCheck %s

				aaron.ballmanUnsubmitted Done Reply Inline Actions You should also add a `-verify` test that verifies we diagnose applying the attribute to something other than a structure or union, accepts no arguments, is diagnosed on a non-BPF target, etc to ensure we've got correct diagnostic behavior. aaron.ballman: You should also add a `-verify` test that verifies we diagnose applying the attribute to…
				eddyz87AuthorUnsubmitted Done Reply Inline Actions `-verify` looks neat, thank you. I've added two test cases: clang/test/Sema/bpf-attr-preserve-static-offset-warns.c clang/test/Sema/bpf-attr-preserve-static-offset-warns-nonbpf.c Those check for things you listed: can't be used for non-bpf target can't take parameters is error to put it on something other than struct / union. Tbh, I don't know if there is anything else to test in this regard. eddyz87: `-verify` looks neat, thank you. I've added two test cases: - clang/test/Sema/bpf-attr…
				// The 'preserve_static_offset' attribute should be propagated to
				// inline declarations (foo's 'b', 'bb', 'c' but not 'd').
				//
				// CHECK: RecordDecl {{.*}} struct foo definition
				// CHECK-NEXT: BPFPreserveStaticOffsetAttr
				// CHECK-NEXT: FieldDecl {{.*}} a
				// CHECK-NEXT: RecordDecl {{.*}} struct definition
				// CHECK-NEXT: FieldDecl {{.*}} aa
				// CHECK-NEXT: FieldDecl {{.*}} b
				// CHECK-NEXT: RecordDecl {{.*}} union bar definition
				// CHECK-NEXT: BPFPreserveStaticOffsetAttr
				// CHECK-NEXT: FieldDecl {{.*}} a
				// CHECK-NEXT: FieldDecl {{.*}} b

				struct foo {
				int a;
				struct {
				int aa;
				} b;
				} __attribute__((preserve_static_offset));

				union bar {
				int a;
				long b;
				} __attribute__((preserve_static_offset));

llvm/include/llvm/IR/Intrinsics.td

Show First 20 Lines • Show All 2,463 Lines • ▼ Show 20 Lines	def int_preserve_union_access_index : DefaultAttrsIntrinsic<[llvm_anyptr_ty],
[IntrNoMem,		[IntrNoMem,
ImmArg<ArgIndex<1>>]>;		ImmArg<ArgIndex<1>>]>;
def int_preserve_struct_access_index : DefaultAttrsIntrinsic<[llvm_anyptr_ty],		def int_preserve_struct_access_index : DefaultAttrsIntrinsic<[llvm_anyptr_ty],
[llvm_anyptr_ty, llvm_i32_ty,		[llvm_anyptr_ty, llvm_i32_ty,
llvm_i32_ty],		llvm_i32_ty],
[IntrNoMem,		[IntrNoMem,
ImmArg<ArgIndex<1>>,		ImmArg<ArgIndex<1>>,
ImmArg<ArgIndex<2>>]>;		ImmArg<ArgIndex<2>>]>;
		def int_preserve_static_offset : DefaultAttrsIntrinsic<[llvm_ptr_ty],
		yonghong-songUnsubmitted Not Done Reply Inline Actions Is it possible to make this builtin as BPF specific one? yonghong-song: Is it possible to make this builtin as BPF specific one?
		eddyz87AuthorUnsubmitted Done Reply Inline Actions Currently `llvm::Intrinsic::context_marker_bpf` gets defined in `llvm/IR/Intrinsics.h` (via include of generated file `IntrinsicEnums.inc`, same as `preserve_{struct,union,array}_access_index`). BPF specific intrinsics are defined in `llvm/IR/IntrinsicsBPF.h` (generated directly w/o .inc intermediary). Thus, if I move `context_marker_bpf` to `IntrinsicsBPF.td` I would have to include `IntrinsicsBPF.h` in `CGExpr.cpp`. However, I don't see any target specific includes in that file. eddyz87: Currently `llvm::Intrinsic::context_marker_bpf` gets defined in `llvm/IR/Intrinsics.h` (via…
		yonghong-songUnsubmitted Not Done Reply Inline Actions I went through the related clang code and indeed found it is hard to get a BPF target defined function in CGF or CGM. On the other hand, we can consider this new builtin under the umbrella "Intrinsics that are used to preserve debug information". Maybe we can rename the intrinsic name to 'int_preserve_context_marker'? The goal of this builtin to preserve certain load/store which should be immune from optimizations. I try to generalize this so your function name 'wrapWithBPFContextMarker' can be renamed to 'wrapWithContextMarker'. There is no need to mention BPF any more. In the commit message, you can mention that similar to int_preserve_array_access_index, int_preserve_context_marker is only implemented in BPF backend. But other architectures can implement processing these intrinsics if they want to achieve some results similar to bpf backend. WDYT? yonghong-song: I went through the related clang code and indeed found it is hard to get a BPF target defined…
		eddyz87AuthorUnsubmitted Not Done Reply Inline Actions I can rename these things, but tbh I don't think this functionality would be useful anywhere outside BPF, thus such renaming would be kind-of deceptive (and in case it would be useful, the renaming could be done at the time of second use). Something more generic might look like `!btf_decl_tag <str-value>` metadata node attached to something. However, in the current form this would require transfer of such decl tag from type to function parameters and variables, e.g.: struct foo { ... } __attribute__((btf_decl_tag("ctx"))); void bar(struct foo p) { ... } During code-gen for `bar` use rule like "if function parameter has type annotated with btf_decl_tag, attach metadata to such parameter": define void @bar(ptr %p !btf_decl_tag !1) { ... } !1 = { "ctx" } Such rule looks a bit weird: tag is transferred from type to it's usage what usages should be annotated? we care about function parameters but from generic point of view `alloca`s or field accesses should be annotated as well. The same metadata approach but with "ctx" attributes annotating function parameters (as you suggested originally, if I recall correctly) seems to be most generic and least controversial of all, e.g.: void bar(struct foo p __attribute__((btf_decl_tag("ctx")))) { ... } Converted to: define void @bar(ptr %p !btf_decl_tag !1) { ... } !1 = { "ctx" } However, this is less ergonomic for BPF, because user will have to annotate function parameters. (On the other hand, no changes in the kernel headers would be necessary). eddyz87: I can rename these things, but tbh I don't think this functionality would be useful anywhere…
		[llvm_ptr_ty],
		[IntrNoMem, IntrSpeculatable,
		ReadNone <ArgIndex<0>>]>;

//===------------ Intrinsics to perform common vector shuffles ------------===//		//===------------ Intrinsics to perform common vector shuffles ------------===//

def int_experimental_vector_reverse : DefaultAttrsIntrinsic<[llvm_anyvector_ty],		def int_experimental_vector_reverse : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[LLVMMatchType<0>],		[LLVMMatchType<0>],
[IntrNoMem]>;		[IntrNoMem]>;

def int_experimental_vector_splice : DefaultAttrsIntrinsic<[llvm_anyvector_ty],		def int_experimental_vector_splice : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
▲ Show 20 Lines • Show All 113 Lines • Show Last 20 Lines

llvm/include/llvm/IR/IntrinsicsBPF.td

Show All 31 Lines	let TargetPrefix = "bpf" in { // All intrinsics start with "llvm.bpf."
def int_bpf_preserve_enum_value : ClangBuiltin<"__builtin_bpf_preserve_enum_value">,		def int_bpf_preserve_enum_value : ClangBuiltin<"__builtin_bpf_preserve_enum_value">,
Intrinsic<[llvm_i64_ty], [llvm_i32_ty, llvm_ptr_ty, llvm_i64_ty],		Intrinsic<[llvm_i64_ty], [llvm_i32_ty, llvm_ptr_ty, llvm_i64_ty],
[IntrNoMem]>;		[IntrNoMem]>;
def int_bpf_passthrough : ClangBuiltin<"__builtin_bpf_passthrough">,		def int_bpf_passthrough : ClangBuiltin<"__builtin_bpf_passthrough">,
Intrinsic<[llvm_any_ty], [llvm_i32_ty, llvm_any_ty], [IntrNoMem]>;		Intrinsic<[llvm_any_ty], [llvm_i32_ty, llvm_any_ty], [IntrNoMem]>;
def int_bpf_compare : ClangBuiltin<"__builtin_bpf_compare">,		def int_bpf_compare : ClangBuiltin<"__builtin_bpf_compare">,
Intrinsic<[llvm_i1_ty], [llvm_i32_ty, llvm_anyint_ty, llvm_anyint_ty],		Intrinsic<[llvm_i1_ty], [llvm_i32_ty, llvm_anyint_ty, llvm_anyint_ty],
[IntrNoMem]>;		[IntrNoMem]>;
		def int_bpf_getelementptr_and_load : ClangBuiltin<"__builtin_bpf_getelementptr_and_load">,
		Intrinsic<[llvm_any_ty],
		[llvm_ptr_ty, // base ptr for getelementptr
		llvm_i1_ty, // volatile
		llvm_i8_ty, // atomic order
		llvm_i8_ty, // synscope id
		llvm_i8_ty, // alignment
		llvm_i1_ty, // inbounds
		llvm_vararg_ty], // indices for getelementptr insn
		[IntrNoCallback,
		IntrNoFree,
		IntrWillReturn,
		NoCapture <ArgIndex<0>>,
		ImmArg <ArgIndex<1>>, // volatile
		ImmArg <ArgIndex<2>>, // atomic order
		ImmArg <ArgIndex<3>>, // synscope id
		ImmArg <ArgIndex<4>>, // alignment
		ImmArg <ArgIndex<5>>, // inbounds
		]>;
		def int_bpf_getelementptr_and_store : ClangBuiltin<"__builtin_bpf_getelementptr_and_store">,
		Intrinsic<[],
		[llvm_any_ty, // value to store
		llvm_ptr_ty, // base ptr for getelementptr
		llvm_i1_ty, // volatile
		llvm_i8_ty, // atomic order
		llvm_i8_ty, // syncscope id
		llvm_i8_ty, // alignment
		llvm_i1_ty, // inbounds
		llvm_vararg_ty], // indexes for getelementptr insn
		[IntrNoCallback,
		IntrNoFree,
		IntrWillReturn,
		NoCapture <ArgIndex<1>>,
		ImmArg <ArgIndex<2>>, // volatile
		ImmArg <ArgIndex<3>>, // atomic order
		ImmArg <ArgIndex<4>>, // syncscope id
		ImmArg <ArgIndex<5>>, // alignment
		ImmArg <ArgIndex<6>>, // inbounds
		]>;
}		}

llvm/lib/Target/BPF/BPF.h

//===-- BPF.h - Top-level interface for BPF representation ------- C++ --===//		//===-- BPF.h - Top-level interface for BPF representation ------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_LIB_TARGET_BPF_BPF_H		#ifndef LLVM_LIB_TARGET_BPF_BPF_H
#define LLVM_LIB_TARGET_BPF_BPF_H		#define LLVM_LIB_TARGET_BPF_BPF_H

#include "MCTargetDesc/BPFMCTargetDesc.h"		#include "MCTargetDesc/BPFMCTargetDesc.h"
		#include "llvm/IR/Instructions.h"
#include "llvm/IR/PassManager.h"		#include "llvm/IR/PassManager.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Target/TargetMachine.h"		#include "llvm/Target/TargetMachine.h"

namespace llvm {		namespace llvm {
class BPFTargetMachine;		class BPFTargetMachine;
class PassRegistry;		class PassRegistry;

Show All 36 Lines	public:

static bool isRequired() { return true; }		static bool isRequired() { return true; }
};		};

class BPFAdjustOptPass : public PassInfoMixin<BPFAdjustOptPass> {		class BPFAdjustOptPass : public PassInfoMixin<BPFAdjustOptPass> {
public:		public:
PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);		PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
};		};

		class BPFPreserveStaticOffsetPass
		: public PassInfoMixin<BPFPreserveStaticOffsetPass> {
		bool AllowPartial;

		public:
		BPFPreserveStaticOffsetPass(bool AllowPartial) : AllowPartial(AllowPartial) {}
		PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);

		static bool isRequired() { return true; }

		static std::pair<GetElementPtrInst , LoadInst >
		reconstructLoad(CallInst *Call);

		static std::pair<GetElementPtrInst , StoreInst >
		reconstructStore(CallInst *Call);
		};

} // namespace llvm		} // namespace llvm

#endif		#endif

llvm/lib/Target/BPF/BPFAbstractMemberAccess.cpp

Show First 20 Lines • Show All 166 Lines • ▼ Show 20 Lines	private:
void traceGEP(GetElementPtrInst GEP, CallInst Parent,		void traceGEP(GetElementPtrInst GEP, CallInst Parent,
CallInfo &ParentInfo);		CallInfo &ParentInfo);
void collectAICallChains(Function &F);		void collectAICallChains(Function &F);

bool IsPreserveDIAccessIndexCall(const CallInst *Call, CallInfo &Cinfo);		bool IsPreserveDIAccessIndexCall(const CallInst *Call, CallInfo &Cinfo);
bool IsValidAIChain(const MDNode *ParentMeta, uint32_t ParentAI,		bool IsValidAIChain(const MDNode *ParentMeta, uint32_t ParentAI,
const MDNode *ChildMeta);		const MDNode *ChildMeta);
bool removePreserveAccessIndexIntrinsic(Function &F);		bool removePreserveAccessIndexIntrinsic(Function &F);
void replaceWithGEP(std::vector<CallInst *> &CallList,
uint32_t NumOfZerosIndex, uint32_t DIIndex);
bool HasPreserveFieldInfoCall(CallInfoStack &CallStack);		bool HasPreserveFieldInfoCall(CallInfoStack &CallStack);
void GetStorageBitRange(DIDerivedType *MemberTy, Align RecordAlignment,		void GetStorageBitRange(DIDerivedType *MemberTy, Align RecordAlignment,
uint32_t &StartBitOffset, uint32_t &EndBitOffset);		uint32_t &StartBitOffset, uint32_t &EndBitOffset);
uint32_t GetFieldInfo(uint32_t InfoKind, DICompositeType *CTy,		uint32_t GetFieldInfo(uint32_t InfoKind, DICompositeType *CTy,
uint32_t AccessIndex, uint32_t PatchImm,		uint32_t AccessIndex, uint32_t PatchImm,
MaybeAlign RecordAlignment);		MaybeAlign RecordAlignment);

Value computeBaseAndAccessKey(CallInst Call, CallInfo &CInfo,		Value computeBaseAndAccessKey(CallInst Call, CallInfo &CInfo,
std::string &AccessKey, MDNode *&BaseMeta);		std::string &AccessKey, MDNode *&BaseMeta);
MDNode computeAccessKey(CallInst Call, CallInfo &CInfo,		MDNode computeAccessKey(CallInst Call, CallInfo &CInfo,
std::string &AccessKey, bool &IsInt32Ret);		std::string &AccessKey, bool &IsInt32Ret);
uint64_t getConstant(const Value *IndexValue);
bool transformGEPChain(CallInst *Call, CallInfo &CInfo);		bool transformGEPChain(CallInst *Call, CallInfo &CInfo);
};		};

std::map<std::string, GlobalVariable *> BPFAbstractMemberAccess::GEPGlobals;		std::map<std::string, GlobalVariable *> BPFAbstractMemberAccess::GEPGlobals;
} // End anonymous namespace		} // End anonymous namespace

bool BPFAbstractMemberAccess::run(Function &F) {		bool BPFAbstractMemberAccess::run(Function &F) {
LLVM_DEBUG(dbgs() << "******** Abstract Member Accesses ********\n");		LLVM_DEBUG(dbgs() << "******** Abstract Member Accesses ********\n");
▲ Show 20 Lines • Show All 124 Lines • ▼ Show 20 Lines	static uint32_t calcArraySize(const DICompositeType *CTy, uint32_t StartDim) {
return DimSize;		return DimSize;
}		}

static Type getBaseElementType(const CallInst Call) {		static Type getBaseElementType(const CallInst Call) {
// Element type is stored in an elementtype() attribute on the first param.		// Element type is stored in an elementtype() attribute on the first param.
return Call->getParamElementType(0);		return Call->getParamElementType(0);
}		}

		static uint64_t getConstant(const Value *IndexValue) {
		const ConstantInt *CV = dyn_cast<ConstantInt>(IndexValue);
		assert(CV);
		return CV->getValue().getZExtValue();
		}

/// Check whether a call is a preserve_*_access_index intrinsic call or not.		/// Check whether a call is a preserve_*_access_index intrinsic call or not.
bool BPFAbstractMemberAccess::IsPreserveDIAccessIndexCall(const CallInst *Call,		bool BPFAbstractMemberAccess::IsPreserveDIAccessIndexCall(const CallInst *Call,
CallInfo &CInfo) {		CallInfo &CInfo) {
if (!Call)		if (!Call)
return false;		return false;

const auto *GV = dyn_cast<GlobalValue>(Call->getCalledOperand());		const auto *GV = dyn_cast<GlobalValue>(Call->getCalledOperand());
if (!GV)		if (!GV)
▲ Show 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	if (GV->getName().startswith("llvm.bpf.preserve.enum.value")) {
else		else
CInfo.AccessIndex = BTF::ENUM_VALUE;		CInfo.AccessIndex = BTF::ENUM_VALUE;
return true;		return true;
}		}

return false;		return false;
}		}

void BPFAbstractMemberAccess::replaceWithGEP(std::vector<CallInst *> &CallList,		static void replaceWithGEP(CallInst *Call, uint32_t DimensionIndex,
uint32_t DimensionIndex,
uint32_t GEPIndex) {		uint32_t GEPIndex) {
for (auto *Call : CallList) {
uint32_t Dimension = 1;		uint32_t Dimension = 1;
if (DimensionIndex > 0)		if (DimensionIndex > 0)
Dimension = getConstant(Call->getArgOperand(DimensionIndex));		Dimension = getConstant(Call->getArgOperand(DimensionIndex));

Constant *Zero =		Constant *Zero =
ConstantInt::get(Type::getInt32Ty(Call->getParent()->getContext()), 0);		ConstantInt::get(Type::getInt32Ty(Call->getParent()->getContext()), 0);
SmallVector<Value *, 4> IdxList;		SmallVector<Value *, 4> IdxList;
for (unsigned I = 0; I < Dimension; ++I)		for (unsigned I = 0; I < Dimension; ++I)
IdxList.push_back(Zero);		IdxList.push_back(Zero);
IdxList.push_back(Call->getArgOperand(GEPIndex));		IdxList.push_back(Call->getArgOperand(GEPIndex));

auto *GEP = GetElementPtrInst::CreateInBounds(		auto *GEP = GetElementPtrInst::CreateInBounds(
getBaseElementType(Call), Call->getArgOperand(0), IdxList, "", Call);		getBaseElementType(Call), Call->getArgOperand(0), IdxList, "", Call);
Call->replaceAllUsesWith(GEP);		Call->replaceAllUsesWith(GEP);
Call->eraseFromParent();		Call->eraseFromParent();
}		}

		void BPFCoreSharedInfo::removeArrayAccessCall(CallInst *Call) {
		replaceWithGEP(Call, 1, 2);
		}

		void BPFCoreSharedInfo::removeStructAccessCall(CallInst *Call) {
		replaceWithGEP(Call, 0, 1);
		}

		void BPFCoreSharedInfo::removeUnionAccessCall(CallInst *Call) {
		Call->replaceAllUsesWith(Call->getArgOperand(0));
		Call->eraseFromParent();
}		}

bool BPFAbstractMemberAccess::removePreserveAccessIndexIntrinsic(Function &F) {		bool BPFAbstractMemberAccess::removePreserveAccessIndexIntrinsic(Function &F) {
std::vector<CallInst *> PreserveArrayIndexCalls;		std::vector<CallInst *> PreserveArrayIndexCalls;
std::vector<CallInst *> PreserveUnionIndexCalls;		std::vector<CallInst *> PreserveUnionIndexCalls;
std::vector<CallInst *> PreserveStructIndexCalls;		std::vector<CallInst *> PreserveStructIndexCalls;
bool Found = false;		bool Found = false;

Show All 18 Lines	bool BPFAbstractMemberAccess::removePreserveAccessIndexIntrinsic(Function &F) {
// is transformed to		// is transformed to
// addr = GEP(base, dimenion's zero's, index)		// addr = GEP(base, dimenion's zero's, index)
// . addr = preserve_union_access_index(base, di_index)		// . addr = preserve_union_access_index(base, di_index)
// is transformed to		// is transformed to
// addr = base, i.e., all usages of "addr" are replaced by "base".		// addr = base, i.e., all usages of "addr" are replaced by "base".
// . addr = preserve_struct_access_index(base, gep_index, di_index)		// . addr = preserve_struct_access_index(base, gep_index, di_index)
// is transformed to		// is transformed to
// addr = GEP(base, 0, gep_index)		// addr = GEP(base, 0, gep_index)
replaceWithGEP(PreserveArrayIndexCalls, 1, 2);		for (CallInst *Call : PreserveArrayIndexCalls)
replaceWithGEP(PreserveStructIndexCalls, 0, 1);		BPFCoreSharedInfo::removeArrayAccessCall(Call);
for (auto *Call : PreserveUnionIndexCalls) {		for (CallInst *Call : PreserveStructIndexCalls)
Call->replaceAllUsesWith(Call->getArgOperand(0));		BPFCoreSharedInfo::removeStructAccessCall(Call);
Call->eraseFromParent();		for (CallInst *Call : PreserveUnionIndexCalls)
}		BPFCoreSharedInfo::removeUnionAccessCall(Call);

return Found;		return Found;
}		}

/// Check whether the access index chain is valid. We check		/// Check whether the access index chain is valid. We check
/// here because there may be type casts between two		/// here because there may be type casts between two
/// access indexes. We want to ensure memory access still valid.		/// access indexes. We want to ensure memory access still valid.
bool BPFAbstractMemberAccess::IsValidAIChain(const MDNode *ParentType,		bool BPFAbstractMemberAccess::IsValidAIChain(const MDNode *ParentType,
▲ Show 20 Lines • Show All 148 Lines • ▼ Show 20 Lines	for (auto &I : BB) {
if (!IsPreserveDIAccessIndexCall(Call, CInfo) \|\|		if (!IsPreserveDIAccessIndexCall(Call, CInfo) \|\|
AIChain.find(Call) != AIChain.end())		AIChain.find(Call) != AIChain.end())
continue;		continue;

traceAICall(Call, CInfo);		traceAICall(Call, CInfo);
}		}
}		}

uint64_t BPFAbstractMemberAccess::getConstant(const Value *IndexValue) {
const ConstantInt *CV = dyn_cast<ConstantInt>(IndexValue);
assert(CV);
return CV->getValue().getZExtValue();
}

/// Get the start and the end of storage offset for \p MemberTy.		/// Get the start and the end of storage offset for \p MemberTy.
void BPFAbstractMemberAccess::GetStorageBitRange(DIDerivedType *MemberTy,		void BPFAbstractMemberAccess::GetStorageBitRange(DIDerivedType *MemberTy,
Align RecordAlignment,		Align RecordAlignment,
uint32_t &StartBitOffset,		uint32_t &StartBitOffset,
uint32_t &EndBitOffset) {		uint32_t &EndBitOffset) {
uint32_t MemberBitSize = MemberTy->getSizeInBits();		uint32_t MemberBitSize = MemberTy->getSizeInBits();
uint32_t MemberBitOffset = MemberTy->getOffsetInBits();		uint32_t MemberBitOffset = MemberTy->getOffsetInBits();

▲ Show 20 Lines • Show All 546 Lines • Show Last 20 Lines

llvm/lib/Target/BPF/BPFCORE.h

//===- BPFCORE.h - Common info for Compile-Once Run-EveryWhere -- C++ --===//		//===- BPFCORE.h - Common info for Compile-Once Run-EveryWhere -- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_LIB_TARGET_BPF_BPFCORE_H		#ifndef LLVM_LIB_TARGET_BPF_BPFCORE_H
#define LLVM_LIB_TARGET_BPF_BPFCORE_H		#define LLVM_LIB_TARGET_BPF_BPFCORE_H

#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
		#include "llvm/IR/Instructions.h"

namespace llvm {		namespace llvm {

class BasicBlock;		class BasicBlock;
class Instruction;		class Instruction;
class Module;		class Module;

class BPFCoreSharedInfo {		class BPFCoreSharedInfo {
Show All 27 Lines	public:

/// llvm.bpf.passthrough builtin seq number		/// llvm.bpf.passthrough builtin seq number
static uint32_t SeqNum;		static uint32_t SeqNum;

/// Insert a bpf passthrough builtin function.		/// Insert a bpf passthrough builtin function.
static Instruction insertPassThrough(Module M, BasicBlock *BB,		static Instruction insertPassThrough(Module M, BasicBlock *BB,
Instruction *Input,		Instruction *Input,
Instruction *Before);		Instruction *Before);
		static void removeArrayAccessCall(CallInst *Call);
		static void removeStructAccessCall(CallInst *Call);
		static void removeUnionAccessCall(CallInst *Call);
};		};

} // namespace llvm		} // namespace llvm

#endif		#endif

llvm/lib/Target/BPF/BPFCheckAndAdjustIR.cpp

//===------------ BPFCheckAndAdjustIR.cpp - Check and Adjust IR -----------===//		//===------------ BPFCheckAndAdjustIR.cpp - Check and Adjust IR -----------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Check IR and adjust IR for verifier friendly codes.		// Check IR and adjust IR for verifier friendly codes.
// The following are done for IR checking:		// The following are done for IR checking:
// - no relocation globals in PHI node.		// - no relocation globals in PHI node.
// The following are done for IR adjustment:		// The following are done for IR adjustment:
// - remove __builtin_bpf_passthrough builtins. Target independent IR		// - remove __builtin_bpf_passthrough builtins. Target independent IR
// optimizations are done and those builtins can be removed.		// optimizations are done and those builtins can be removed.
		// - remove llvm.bpf.getelementptr.and.load builtins.
		// - remove llvm.bpf.getelementptr.and.store builtins.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "BPF.h"		#include "BPF.h"
#include "BPFCORE.h"		#include "BPFCORE.h"
#include "BPFTargetMachine.h"		#include "BPFTargetMachine.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/DebugInfoMetadata.h"		#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
		#include "llvm/IR/IntrinsicsBPF.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"		#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"		#include "llvm/Transforms/Utils/BasicBlockUtils.h"

#define DEBUG_TYPE "bpf-check-and-opt-ir"		#define DEBUG_TYPE "bpf-check-and-opt-ir"
Show All 11 Lines	public:
virtual void getAnalysisUsage(AnalysisUsage &AU) const override;		virtual void getAnalysisUsage(AnalysisUsage &AU) const override;

private:		private:
void checkIR(Module &M);		void checkIR(Module &M);
bool adjustIR(Module &M);		bool adjustIR(Module &M);
bool removePassThroughBuiltin(Module &M);		bool removePassThroughBuiltin(Module &M);
bool removeCompareBuiltin(Module &M);		bool removeCompareBuiltin(Module &M);
bool sinkMinMax(Module &M);		bool sinkMinMax(Module &M);
		bool removeGEPBuiltins(Module &M);
};		};
} // End anonymous namespace		} // End anonymous namespace

char BPFCheckAndAdjustIR::ID = 0;		char BPFCheckAndAdjustIR::ID = 0;
INITIALIZE_PASS(BPFCheckAndAdjustIR, DEBUG_TYPE, "BPF Check And Adjust IR",		INITIALIZE_PASS(BPFCheckAndAdjustIR, DEBUG_TYPE, "BPF Check And Adjust IR",
false, false)		false, false)

ModulePass *llvm::createBPFCheckAndAdjustIR() {		ModulePass *llvm::createBPFCheckAndAdjustIR() {
▲ Show 20 Lines • Show All 294 Lines • ▼ Show 20 Lines	bool BPFCheckAndAdjustIR::sinkMinMax(Module &M) {

return Changed;		return Changed;
}		}

void BPFCheckAndAdjustIR::getAnalysisUsage(AnalysisUsage &AU) const {		void BPFCheckAndAdjustIR::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<LoopInfoWrapperPass>();		AU.addRequired<LoopInfoWrapperPass>();
}		}

		static void unrollGEPLoad(CallInst *Call) {
		auto [GEP, Load] = BPFPreserveStaticOffsetPass::reconstructLoad(Call);
		GEP->insertBefore(Call);
		Load->insertBefore(Call);
		Call->replaceAllUsesWith(Load);
		Call->eraseFromParent();
		}

		static void unrollGEPStore(CallInst *Call) {
		auto [GEP, Store] = BPFPreserveStaticOffsetPass::reconstructStore(Call);
		GEP->insertBefore(Call);
		Store->insertBefore(Call);
		Call->eraseFromParent();
		}

		static bool removeGEPBuiltinsInFunc(Function &F) {
		SmallVector<CallInst *> GEPLoads;
		SmallVector<CallInst *> GEPStores;
		for (auto &BB : F)
		for (auto &Insn : BB)
		if (auto *Call = dyn_cast<CallInst>(&Insn))
		if (auto *Called = Call->getCalledFunction())
		switch (Called->getIntrinsicID()) {
		case Intrinsic::bpf_getelementptr_and_load:
		GEPLoads.push_back(Call);
		break;
		case Intrinsic::bpf_getelementptr_and_store:
		GEPStores.push_back(Call);
		break;
		}

		if (GEPLoads.empty() && GEPStores.empty())
		return false;

		for_each(GEPLoads, unrollGEPLoad);
		for_each(GEPStores, unrollGEPStore);

		return true;
		}

		// Rewrites the following builtins:
		// - llvm.bpf.getelementptr.and.load
		// - llvm.bpf.getelementptr.and.store
		// As (load (getelementptr ...)) or (store (getelementptr ...)).
		bool BPFCheckAndAdjustIR::removeGEPBuiltins(Module &M) {
		bool Changed = false;
		for (auto &F : M)
		Changed = removeGEPBuiltinsInFunc(F) \|\| Changed;
		return Changed;
		}

bool BPFCheckAndAdjustIR::adjustIR(Module &M) {		bool BPFCheckAndAdjustIR::adjustIR(Module &M) {
bool Changed = removePassThroughBuiltin(M);		bool Changed = removePassThroughBuiltin(M);
Changed = removeCompareBuiltin(M) \|\| Changed;		Changed = removeCompareBuiltin(M) \|\| Changed;
Changed = sinkMinMax(M) \|\| Changed;		Changed = sinkMinMax(M) \|\| Changed;
		Changed = removeGEPBuiltins(M) \|\| Changed;
return Changed;		return Changed;
}		}

bool BPFCheckAndAdjustIR::runOnModule(Module &M) {		bool BPFCheckAndAdjustIR::runOnModule(Module &M) {
checkIR(M);		checkIR(M);
return adjustIR(M);		return adjustIR(M);
}		}

llvm/lib/Target/BPF/BPFPreserveStaticOffset.cpp

This file was added.

				//===------ BPFPreserveStaticOffset.cpp -----------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// TLDR: replaces llvm.preserve.static.offset + GEP + load / store
				// with llvm.bpf.getelementptr.and.load / store
				//
				// This file implements BPFPreserveStaticOffsetPass transformation.
				// This transformation address two BPF verifier specific issues:
				//
				// (a) Access to the fields of some structural types is allowed only
				// using load and store instructions with static immediate offsets.
				//
				// Examples of such types are `struct __sk_buff` and `struct
				// bpf_sock_ops`. This is so because offsets of the fields of
				// these structures do not match real offsets in the running
				// kernel. During BPF program load LDX and STX instructions
				// referring to the fields of these types are rewritten so that
				// offsets match real offsets. For this rewrite to happen field
				// offsets have to be encoded as immediate operands of the
				// instructions.
				//
				// See kernel/bpf/verifier.c:convert_ctx_access function in the
				// Linux kernel source tree for details.
				//
				// (b) Pointers to context parameters of BPF programs must not be
				// modified before access.
				//
				// During BPF program verification a tag PTR_TO_CTX is tracked for
				// register values. In case if register with such tag is modified
				// BPF program is not allowed to read or write memory using this
				// register. See kernel/bpf/verifier.c:check_mem_access function
				// in the Linux kernel source tree for details.
				//
				// The following sequence of the IR instructions:
				//
				// %x = getelementptr %ptr, %constant_offset
				// %y = load %x
				//
				// Is translated as a single machine instruction:
				//
				// LDW %ptr, %constant_offset
				//
				// In order for cases (a) and (b) to work the sequence %x-%y above has
				// to be preserved by the IR passes.
				//
				// However, several optimization passes might sink `load` instruction
				// or hoist `getelementptr` instruction so that the instructions are
				// no longer in sequence. Examples of such passes are:
				// SimplifyCFGPass, InstCombinePass, GVNPass.
				// After such modification the verifier would reject the BPF program.
				//
				// To avoid this issue the patterns like (load/store (getelementptr ...))
				// are replaced by calls to BPF specific intrinsic functions:
				// - llvm.bpf.getelementptr.and.load
				// - llvm.bpf.getelementptr.and.store
				//
				// These calls are lowered back to (load/store (getelementptr ...))
				// by BPFCheckAndAdjustIR pass right before the translation from IR to
				// machine instructions.
				//
				// The transformation is split into the following steps:
				// - When IR is generated from AST the calls to intrinsic function
				// llvm.preserve.static.offset are inserted.
				// - BPFPreserveStaticOffsetPass is executed as early as possible
				// with AllowPatial set to true, this handles marked GEP chains
				// with constant offsets.
				// - BPFPreserveStaticOffsetPass is executed at ScalarOptimizerLateEPCallback
				// with AllowPatial set to false, this handles marked GEP chains
				// with offsets that became constant after loop unrolling, e.g.
				// to handle the following code:
				//
				// struct context { int x[4]; } __attribute__((preserve_static_offset));
				//
				// struct context *ctx = ...;
				// #pragma clang loop unroll(full)
				// for (int i = 0; i < 4; ++i)
				// foo(ctx->x[i]);
				//
				yonghong-songUnsubmitted Not Done Reply Inline Actions Is it possible that such a pattern already changed by GVN/CSE so later on converting to preserve_static_offset becomes impossible? I guess it is unlikely but want to double check. Maybe add a comment to clarify. yonghong-song: Is it possible that such a pattern already changed by GVN/CSE so later on converting to…
				eddyz87AuthorUnsubmitted Done Reply Inline Actions Actually yes, that is possible. It can probably be mitigated by allowing non-constant indices in the get.and.{load,store} functions. As we already agreed to land this, I'll do this as a separate MR. eddyz87: Actually yes, that is possible. It can probably be mitigated by allowing non-constant indices…
				// The early BPFPreserveStaticOffsetPass run is necessary to allow
				// additional GVN / CSE opportunities after functions inlining.
				// The relative order of optimization applied to function:
				// - early stage (1)
				// - ...
				// - function inlining (2)
				// - ...
				// - loop unrolling
				// - ...
				// - ScalarOptimizerLateEPCallback (3)
				//
				// When function A is inlined into function B all optimizations for A
				// are already done, while some passes remain for B. In case if
				// BPFPreserveStaticOffsetPass is done at (3) but not done at (1)
				// the code after (2) would contain a mix of
				// (load (gep %p)) and (get.and.load %p) usages:
				// - the (load (gep %p)) would come from the calling function;
				// - the (get.and.load %p) would come from the callee function.
				// Thus clobbering CSE / GVN passes done after inlining.

				yonghong-songUnsubmitted Done Reply Inline Actions So the above implies that this is bad for CSE/GVN due to different representation, right? But we want to do (1) since we want to avoid later CSE/GVN etc, so argument of clobbering is not that important, right? yonghong-song: So the above implies that this is bad for CSE/GVN due to different representation, right? But…
				eddyz87AuthorUnsubmitted Done Reply Inline Actions We want to forbid some changes by CSE/GVN but allow others. E.g. suppose the following example: struct foo { int a; int b; } __attribute__((preserve_static_offset)); void consume(int); static void bar(int p) { consume(p); } void foo(struct foo foo) { consume(foo->b); bar(&foo->b); } After inlining of `bar` to `foo` it would look like: consume(foo->b); consume(foo->b); And I attempted not to break CSE/GVN for such situations. eddyz87:* We want to forbid some changes by CSE/GVN but allow others. E.g. suppose the following example…
				#include "BPF.h"
				#include "BPFCORE.h"
				#include "llvm/ADT/SmallPtrSet.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/IR/Argument.h"
				#include "llvm/IR/Attributes.h"
				#include "llvm/IR/BasicBlock.h"
				#include "llvm/IR/Constants.h"
				#include "llvm/IR/DebugInfoMetadata.h"
				#include "llvm/IR/DiagnosticInfo.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/Intrinsics.h"
				#include "llvm/IR/IntrinsicsBPF.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/ErrorHandling.h"

				#define DEBUG_TYPE "bpf-preserve-static-offset"

				using namespace llvm;

				static const unsigned GepAndLoadFirstIdxArg = 6;
				static const unsigned GepAndStoreFirstIdxArg = 7;

				static bool isIntrinsicCall(Value *I, Intrinsic::ID Id) {
				if (auto *Call = dyn_cast<CallInst>(I))
				if (Function *Func = Call->getCalledFunction())
				return Func->getIntrinsicID() == Id;
				return false;
				}

				static bool isPreserveStaticOffsetCall(Value *I) {
				return isIntrinsicCall(I, Intrinsic::preserve_static_offset);
				}

				static CallInst isGEPAndLoad(Value I) {
				if (isIntrinsicCall(I, Intrinsic::bpf_getelementptr_and_load))
				return cast<CallInst>(I);
				return nullptr;
				}

				static CallInst isGEPAndStore(Value I) {
				if (isIntrinsicCall(I, Intrinsic::bpf_getelementptr_and_store))
				return cast<CallInst>(I);
				return nullptr;
				}

				template <class T = Instruction>
				static DILocation mergeDILocations(SmallVector<T > &Insns) {
				DILocation Merged = (Insns.begin())->getDebugLoc();
				for (T *I : Insns)
				Merged = DILocation::getMergedLocation(Merged, I->getDebugLoc());
				return Merged;
				}

				static CallInst makeIntrinsicCall(Module M,
				Intrinsic::BPFIntrinsics Intrinsic,
				ArrayRef<Type *> Types,
				ArrayRef<Value *> Args) {

				Function *Fn = Intrinsic::getDeclaration(M, Intrinsic, Types);
				return CallInst::Create(Fn, Args);
				}

				static void setParamElementType(CallInst Call, unsigned ArgNo, Type Type) {
				LLVMContext &C = Call->getContext();
				Call->addParamAttr(ArgNo, Attribute::get(C, Attribute::ElementType, Type));
				}

				static void setParamReadNone(CallInst *Call, unsigned ArgNo) {
				LLVMContext &C = Call->getContext();
				Call->addParamAttr(ArgNo, Attribute::get(C, Attribute::ReadNone));
				}

				static void setParamReadOnly(CallInst *Call, unsigned ArgNo) {
				LLVMContext &C = Call->getContext();
				Call->addParamAttr(ArgNo, Attribute::get(C, Attribute::ReadOnly));
				}

				static void setParamWriteOnly(CallInst *Call, unsigned ArgNo) {
				LLVMContext &C = Call->getContext();
				Call->addParamAttr(ArgNo, Attribute::get(C, Attribute::WriteOnly));
				}

				namespace {
				struct GEPChainInfo {
				bool InBounds;
				Type *SourceElementType;
				SmallVector<Value *> Indices;
				SmallVector<GetElementPtrInst *> Members;

				GEPChainInfo() { reset(); }

				void reset() {
				InBounds = true;
				SourceElementType = nullptr;
				Indices.clear();
				Members.clear();
				}
				};
				} // Anonymous namespace

				template <class T = std::disjunction<LoadInst, StoreInst>>
				static void fillCommonArgs(LLVMContext &C, SmallVector<Value *> &Args,
				GEPChainInfo &GEP, T *Insn) {
				Type *Int8Ty = Type::getInt8Ty(C);
				Type *Int1Ty = Type::getInt1Ty(C);
				// Implementation of Align guarantees that ShiftValue < 64
				unsigned AlignShiftValue = Log2_64(Insn->getAlign().value());
				Args.push_back(GEP.Members[0]->getPointerOperand());
				Args.push_back(ConstantInt::get(Int1Ty, Insn->isVolatile()));
				Args.push_back(ConstantInt::get(Int8Ty, (unsigned)Insn->getOrdering()));
				Args.push_back(ConstantInt::get(Int8Ty, (unsigned)Insn->getSyncScopeID()));
				Args.push_back(ConstantInt::get(Int8Ty, AlignShiftValue));
				Args.push_back(ConstantInt::get(Int1Ty, GEP.InBounds));
				Args.append(GEP.Indices.begin(), GEP.Indices.end());
				}

				static Instruction makeGEPAndLoad(Module M, GEPChainInfo &GEP,
				LoadInst *Load) {
				SmallVector<Value *> Args;
				fillCommonArgs(M->getContext(), Args, GEP, Load);
				CallInst *Call = makeIntrinsicCall(M, Intrinsic::bpf_getelementptr_and_load,
				{Load->getType()}, Args);
				setParamElementType(Call, 0, GEP.SourceElementType);
				Call->applyMergedLocation(mergeDILocations(GEP.Members), Load->getDebugLoc());
				Call->setName((*GEP.Members.rbegin())->getName());
				if (Load->isUnordered()) {
				Call->setOnlyReadsMemory();
				Call->setOnlyAccessesArgMemory();
				setParamReadOnly(Call, 0);
				}
				for (unsigned I = GepAndLoadFirstIdxArg; I < Args.size(); ++I)
				Call->addParamAttr(I, Attribute::ImmArg);
				Call->setAAMetadata(Load->getAAMetadata());
				return Call;
				}

				static Instruction makeGEPAndStore(Module M, GEPChainInfo &GEP,
				StoreInst *Store) {
				SmallVector<Value *> Args;
				Args.push_back(Store->getValueOperand());
				fillCommonArgs(M->getContext(), Args, GEP, Store);
				CallInst *Call =
				makeIntrinsicCall(M, Intrinsic::bpf_getelementptr_and_store,
				{Store->getValueOperand()->getType()}, Args);
				setParamElementType(Call, 1, GEP.SourceElementType);
				if (Store->getValueOperand()->getType()->isPointerTy())
				setParamReadNone(Call, 0);
				Call->applyMergedLocation(mergeDILocations(GEP.Members),
				Store->getDebugLoc());
				if (Store->isUnordered()) {
				Call->setOnlyWritesMemory();
				Call->setOnlyAccessesArgMemory();
				setParamWriteOnly(Call, 1);
				}
				for (unsigned I = GepAndStoreFirstIdxArg; I < Args.size(); ++I)
				Call->addParamAttr(I, Attribute::ImmArg);
				Call->setAAMetadata(Store->getAAMetadata());
				return Call;
				}

				static unsigned getOperandAsUnsigned(CallInst *Call, unsigned ArgNo) {
				if (auto *Int = dyn_cast<ConstantInt>(Call->getOperand(ArgNo)))
				return Int->getValue().getZExtValue();
				std::string Report;
				raw_string_ostream ReportS(Report);
				ReportS << "Expecting ConstantInt as argument #" << ArgNo << " of " << *Call
				<< "\n";
				report_fatal_error(StringRef(Report));
				}

				static GetElementPtrInst reconstructGEP(CallInst Call, int Delta) {
				SmallVector<Value *> Indices;
				Indices.append(Call->data_operands_begin() + 6 + Delta,
				Call->data_operands_end());
				Type *GEPPointeeType = Call->getParamElementType(Delta);
				auto *GEP =
				GetElementPtrInst::Create(GEPPointeeType, Call->getOperand(Delta),
				ArrayRef<Value *>(Indices), Call->getName());
				GEP->setIsInBounds(getOperandAsUnsigned(Call, 5 + Delta));
				return GEP;
				}

				template <class T = std::disjunction<LoadInst, StoreInst>>
				static void reconstructCommon(CallInst Call, GetElementPtrInst GEP, T *Insn,
				int Delta) {
				Insn->setVolatile(getOperandAsUnsigned(Call, 1 + Delta));
				Insn->setOrdering((AtomicOrdering)getOperandAsUnsigned(Call, 2 + Delta));
				Insn->setSyncScopeID(getOperandAsUnsigned(Call, 3 + Delta));
				unsigned AlignShiftValue = getOperandAsUnsigned(Call, 4 + Delta);
				Insn->setAlignment(Align(1ULL << AlignShiftValue));
				GEP->setDebugLoc(Call->getDebugLoc());
				Insn->setDebugLoc(Call->getDebugLoc());
				Insn->setAAMetadata(Call->getAAMetadata());
				}

				std::pair<GetElementPtrInst , LoadInst >
				BPFPreserveStaticOffsetPass::reconstructLoad(CallInst *Call) {
				GetElementPtrInst *GEP = reconstructGEP(Call, 0);
				Type *ReturnType = Call->getFunctionType()->getReturnType();
				auto *Load = new LoadInst(ReturnType, GEP, "",
				/* These would be set in reconstructCommon */
				false, Align(1));
				reconstructCommon(Call, GEP, Load, 0);
				return std::pair{GEP, Load};
				}

				std::pair<GetElementPtrInst , StoreInst >
				BPFPreserveStaticOffsetPass::reconstructStore(CallInst *Call) {
				GetElementPtrInst *GEP = reconstructGEP(Call, 1);
				auto *Store = new StoreInst(Call->getOperand(0), GEP,
				/* These would be set in reconstructCommon */
				false, Align(1));
				reconstructCommon(Call, GEP, Store, 1);
				return std::pair{GEP, Store};
				}

				static bool isZero(Value *V) {
				auto *CI = dyn_cast<ConstantInt>(V);
				return CI && CI->isZero();
				}

				// Given a chain of GEP instructions collect information necessary to
				// merge this chain as a single GEP instruction of form:
				// getelementptr %<type>, ptr %p, i32 0, <field_idx1>, <field_idx2>, ...
				static bool foldGEPChainAsStructAccess(SmallVector<GetElementPtrInst *> &GEPs,
				GEPChainInfo &Info) {
				if (GEPs.empty())
				return false;

				if (!all_of(GEPs, [=](GetElementPtrInst *GEP) {
				return GEP->hasAllConstantIndices();
				}))
				return false;

				GetElementPtrInst *First = GEPs[0];
				Info.InBounds = First->isInBounds();
				Info.SourceElementType = First->getSourceElementType();
				Type *ResultElementType = First->getResultElementType();
				Info.Indices.append(First->idx_begin(), First->idx_end());
				Info.Members.push_back(First);

				for (auto *Iter = GEPs.begin() + 1; Iter != GEPs.end(); ++Iter) {
				GetElementPtrInst GEP = Iter;
				if (!isZero(*GEP->idx_begin())) {
				Info.reset();
				return false;
				}
				if (!GEP->getSourceElementType() \|\|
				GEP->getSourceElementType() != ResultElementType) {
				Info.reset();
				return false;
				}
				Info.InBounds &= GEP->isInBounds();
				Info.Indices.append(GEP->idx_begin() + 1, GEP->idx_end());
				Info.Members.push_back(GEP);
				ResultElementType = GEP->getResultElementType();
				}

				return true;
				}

				// Given a chain of GEP instructions collect information necessary to
				// merge this chain as a single GEP instruction of form:
				// getelementptr i8, ptr %p, i64 %offset
				static bool foldGEPChainAsU8Access(SmallVector<GetElementPtrInst *> &GEPs,
				GEPChainInfo &Info) {
				if (GEPs.empty())
				return false;

				GetElementPtrInst *First = GEPs[0];
				const DataLayout &DL = First->getModule()->getDataLayout();
				LLVMContext &C = First->getContext();
				Type *PtrTy = First->getType()->getScalarType();
				APInt Offset(DL.getIndexTypeSizeInBits(PtrTy), 0);
				for (GetElementPtrInst *GEP : GEPs) {
				if (!GEP->accumulateConstantOffset(DL, Offset)) {
				Info.reset();
				return false;
				}
				Info.InBounds &= GEP->isInBounds();
				Info.Members.push_back(GEP);
				}
				Info.SourceElementType = Type::getInt8Ty(C);
				Info.Indices.push_back(ConstantInt::get(C, Offset));

				return true;
				}

				static void reportNonStaticGEPChain(Instruction *Insn) {
				auto Msg = DiagnosticInfoUnsupported(
				*Insn->getFunction(),
				Twine("Non-constant offset in access to a field of a type marked "
				"with preserve_static_offset might be rejected by BPF verifier")
				.concat(Insn->getDebugLoc()
				? ""
				: " (pass -g option to get exact location)"),
				Insn->getDebugLoc(), DS_Warning);
				Insn->getContext().diagnose(Msg);
				}

				static bool allZeroIndices(SmallVector<GetElementPtrInst *> &GEPs) {
				return GEPs.empty() \|\| all_of(GEPs, [=](GetElementPtrInst *GEP) {
				return GEP->hasAllZeroIndices();
				});
				}

				static bool tryToReplaceWithGEPBuiltin(Instruction *LoadOrStoreTemplate,
				SmallVector<GetElementPtrInst *> &GEPs,
				Instruction *InsnToReplace) {
				GEPChainInfo GEPChain;
				if (!foldGEPChainAsStructAccess(GEPs, GEPChain) &&
				!foldGEPChainAsU8Access(GEPs, GEPChain)) {
				return false;
				}
				Module *M = InsnToReplace->getModule();
				if (auto *Load = dyn_cast<LoadInst>(LoadOrStoreTemplate)) {
				Instruction *Replacement = makeGEPAndLoad(M, GEPChain, Load);
				Replacement->insertBefore(InsnToReplace);
				InsnToReplace->replaceAllUsesWith(Replacement);
				}
				if (auto *Store = dyn_cast<StoreInst>(LoadOrStoreTemplate)) {
				Instruction *Replacement = makeGEPAndStore(M, GEPChain, Store);
				Replacement->insertBefore(InsnToReplace);
				}
				return true;
				}

				// Check if U->getPointerOperand() == I
				static bool isPointerOperand(Value I, User U) {
				if (auto *L = dyn_cast<LoadInst>(U))
				return L->getPointerOperand() == I;
				if (auto *S = dyn_cast<StoreInst>(U))
				return S->getPointerOperand() == I;
				if (auto *GEP = dyn_cast<GetElementPtrInst>(U))
				return GEP->getPointerOperand() == I;
				if (auto *Call = isGEPAndLoad(U))
				return Call->getArgOperand(0) == I;
				if (auto *Call = isGEPAndStore(U))
				return Call->getArgOperand(1) == I;
				return false;
				}

				static bool isInlineableCall(User *U) {
				if (auto *Call = dyn_cast<CallInst>(U))
				return Call->hasFnAttr(Attribute::InlineHint);
				return false;
				}

				static void rewriteAccessChain(Instruction *Insn,
				SmallVector<GetElementPtrInst *> &GEPs,
				SmallVector<Instruction *> &Visited,
				bool AllowPatial, bool &StillUsed);

				static void rewriteUses(Instruction *Insn,
				SmallVector<GetElementPtrInst *> &GEPs,
				SmallVector<Instruction *> &Visited, bool AllowPatial,
				bool &StillUsed) {
				for (User *U : Insn->users()) {
				auto *UI = dyn_cast<Instruction>(U);
				if (UI && (isPointerOperand(Insn, UI) \|\| isPreserveStaticOffsetCall(UI) \|\|
				isInlineableCall(UI)))
				rewriteAccessChain(UI, GEPs, Visited, AllowPatial, StillUsed);
				else
				LLVM_DEBUG({
				llvm::dbgs() << "unsupported usage in BPFPreserveStaticOffsetPass:\n";
				llvm::dbgs() << " Insn: " << *Insn << "\n";
				llvm::dbgs() << " User: " << *U << "\n";
				});
				}
				}

				// A DFS traversal of GEP chain trees starting from Root.
				//
				// Recursion descends through GEP instructions and
				// llvm.preserve.static.offset calls. Recursion stops at any other
				// instruction. If load or store instruction is reached it is replaced
				// by a call to `llvm.bpf.getelementptr.and.load` or
				// `llvm.bpf.getelementptr.and.store` intrinsic.
				// If `llvm.bpf.getelementptr.and.load/store` is reached the accumulated
				// GEPs are merged into the intrinsic call.
				// If nested calls to `llvm.preserve.static.offset` are encountered these
				// calls are marked for deletion.
				//
				// Parameters description:
				// - Insn - current position in the tree
				// - GEPs - GEP instructions for the current branch
				// - Visited - a list of visited instructions in DFS order,
				// order is important for unused instruction deletion.
				// - AllowPartial - when true GEP chains that can't be folded are
				// not reported, otherwise diagnostic message is show for such chains.
				// - StillUsed - set to true if one of the GEP chains could not be
				// folded, makes sense when AllowPartial is false, means that root
				// preserve.static.offset call is still in use and should remain
				// until the next run of this pass.
				static void rewriteAccessChain(Instruction *Insn,
				SmallVector<GetElementPtrInst *> &GEPs,
				SmallVector<Instruction *> &Visited,
				bool AllowPatial, bool &StillUsed) {
				auto MarkAndTraverseUses = [&]() {
				Visited.push_back(Insn);
				rewriteUses(Insn, GEPs, Visited, AllowPatial, StillUsed);
				};
				auto TryToReplace = [&](Instruction *LoadOrStore) {
				// Do nothing for (preserve.static.offset (load/store ..)) or for
				// GEPs with zero indices. Such constructs lead to zero offset and
				// are simplified by other passes.
				if (allZeroIndices(GEPs))
				return;
				if (tryToReplaceWithGEPBuiltin(LoadOrStore, GEPs, Insn)) {
				Visited.push_back(Insn);
				return;
				}
				if (!AllowPatial)
				reportNonStaticGEPChain(Insn);
				StillUsed = true;
				};
				if (isa<LoadInst>(Insn) \|\| isa<StoreInst>(Insn)) {
				TryToReplace(Insn);
				} else if (isGEPAndLoad(Insn)) {
				auto [GEP, Load] =
				BPFPreserveStaticOffsetPass::reconstructLoad(cast<CallInst>(Insn));
				GEPs.push_back(GEP);
				TryToReplace(Load);
				GEPs.pop_back();
				delete Load;
				delete GEP;
				} else if (isGEPAndStore(Insn)) {
				// This case can't be merged with the above because
				// `delete Load` / `delete Store` wants a concrete type,
				// destructor of Instruction is protected.
				auto [GEP, Store] =
				BPFPreserveStaticOffsetPass::reconstructStore(cast<CallInst>(Insn));
				GEPs.push_back(GEP);
				TryToReplace(Store);
				GEPs.pop_back();
				delete Store;
				delete GEP;
				} else if (auto *GEP = dyn_cast<GetElementPtrInst>(Insn)) {
				GEPs.push_back(GEP);
				MarkAndTraverseUses();
				GEPs.pop_back();
				} else if (isPreserveStaticOffsetCall(Insn)) {
				MarkAndTraverseUses();
				} else if (isInlineableCall(Insn)) {
				// Preserve preserve.static.offset call for parameters of
				// functions that might be inlined. These would be removed on a
				// second pass after inlining.
				// Might happen when a pointer to a preserve_static_offset
				// structure is passed as parameter of a function that would be
				// inlined inside a loop that would be unrolled.
				if (AllowPatial)
				StillUsed = true;
				} else {
				SmallString<128> Buf;
				raw_svector_ostream BufStream(Buf);
				BufStream << *Insn;
				report_fatal_error(
				Twine("Unexpected rewriteAccessChain Insn = ").concat(Buf));
				}
				}

				static void removeMarkerCall(Instruction *Marker) {
				Marker->replaceAllUsesWith(Marker->getOperand(0));
				Marker->eraseFromParent();
				}

				static bool rewriteAccessChain(Instruction *Marker, bool AllowPatial,
				SmallPtrSetImpl<Instruction *> &RemovedMarkers) {
				SmallVector<GetElementPtrInst *> GEPs;
				SmallVector<Instruction *> Visited;
				bool StillUsed = false;
				rewriteUses(Marker, GEPs, Visited, AllowPatial, StillUsed);
				// Check if Visited instructions could be removed, iterate in
				// reverse to unblock instructions higher in the chain.
				for (auto V = Visited.rbegin(); V != Visited.rend(); ++V) {
				if (isPreserveStaticOffsetCall(*V)) {
				removeMarkerCall(*V);
				RemovedMarkers.insert(*V);
				} else if ((*V)->use_empty()) {
				(*V)->eraseFromParent();
				}
				}
				return StillUsed;
				}

				static std::vector<Instruction *>
				collectPreserveStaticOffsetCalls(Function &F) {
				std::vector<Instruction *> Calls;
				for (Instruction &Insn : instructions(F))
				if (isPreserveStaticOffsetCall(&Insn))
				Calls.push_back(&Insn);
				return Calls;
				}

				bool isPreserveArrayIndex(Value *V) {
				return isIntrinsicCall(V, Intrinsic::preserve_array_access_index);
				}

				bool isPreserveStructIndex(Value *V) {
				return isIntrinsicCall(V, Intrinsic::preserve_struct_access_index);
				}

				bool isPreserveUnionIndex(Value *V) {
				return isIntrinsicCall(V, Intrinsic::preserve_union_access_index);
				}

				static void removePAICalls(Instruction *Marker) {
				auto IsPointerOperand = [](Value Op, User U) {
				if (auto *GEP = dyn_cast<GetElementPtrInst>(U))
				return GEP->getPointerOperand() == Op;
				if (isPreserveStaticOffsetCall(U) \|\| isPreserveArrayIndex(U) \|\|
				isPreserveStructIndex(U) \|\| isPreserveUnionIndex(U))
				return cast<CallInst>(U)->getArgOperand(0) == Op;
				return false;
				};

				SmallVector<Value *, 32> WorkList;
				WorkList.push_back(Marker);
				do {
				Value *V = WorkList.pop_back_val();
				for (User *U : V->users())
				if (IsPointerOperand(V, U))
				WorkList.push_back(U);
				auto *Call = dyn_cast<CallInst>(V);
				if (!Call)
				continue;
				if (isPreserveArrayIndex(V))
				BPFCoreSharedInfo::removeArrayAccessCall(Call);
				else if (isPreserveStructIndex(V))
				BPFCoreSharedInfo::removeStructAccessCall(Call);
				else if (isPreserveUnionIndex(V))
				BPFCoreSharedInfo::removeUnionAccessCall(Call);
				} while (!WorkList.empty());
				}

				// Look for sequences:
				// - llvm.preserve.static.offset -> getelementptr... -> load
				// - llvm.preserve.static.offset -> getelementptr... -> store
				// And replace those with calls to intrinsics:
				// - llvm.bpf.getelementptr.and.load
				// - llvm.bpf.getelementptr.and.store
				static bool rewriteFunction(Function &F, bool AllowPartial) {
				LLVM_DEBUG(dbgs() << "********** BPFPreserveStaticOffsetPass (AllowPartial="
				<< AllowPartial << ") ************\n");

				auto MarkerCalls = collectPreserveStaticOffsetCalls(F);
				SmallPtrSet<Instruction *, 16> RemovedMarkers;

				LLVM_DEBUG(dbgs() << "There are " << MarkerCalls.size()
				<< " preserve.static.offset calls\n");

				if (MarkerCalls.empty())
				return false;

				for (auto *Call : MarkerCalls)
				removePAICalls(Call);

				for (auto *Call : MarkerCalls) {
				if (RemovedMarkers.contains(Call))
				continue;
				bool StillUsed = rewriteAccessChain(Call, AllowPartial, RemovedMarkers);
				if (!StillUsed \|\| !AllowPartial)
				removeMarkerCall(Call);
				}

				return true;
				}

				PreservedAnalyses
				llvm::BPFPreserveStaticOffsetPass::run(Function &F,
				FunctionAnalysisManager &AM) {
				return rewriteFunction(F, AllowPartial) ? PreservedAnalyses::none()
				: PreservedAnalyses::all();
				}

llvm/lib/Target/BPF/BPFTargetMachine.cpp

	Show First 20 Lines • Show All 99 Lines • ▼ Show 20 Lines
	void BPFTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {			void BPFTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
	PB.registerPipelineParsingCallback(			PB.registerPipelineParsingCallback(
	[](StringRef PassName, FunctionPassManager &FPM,			[](StringRef PassName, FunctionPassManager &FPM,
	ArrayRef<PassBuilder::PipelineElement>) {			ArrayRef<PassBuilder::PipelineElement>) {
	if (PassName == "bpf-ir-peephole") {			if (PassName == "bpf-ir-peephole") {
	FPM.addPass(BPFIRPeepholePass());			FPM.addPass(BPFIRPeepholePass());
	return true;			return true;
	}			}
				if (PassName == "bpf-preserve-static-offset") {
				FPM.addPass(BPFPreserveStaticOffsetPass(false));
				return true;
				}
	return false;			return false;
	});			});
	PB.registerPipelineStartEPCallback(			PB.registerPipelineStartEPCallback(
	[=](ModulePassManager &MPM, OptimizationLevel) {			[=](ModulePassManager &MPM, OptimizationLevel) {
	FunctionPassManager FPM;			FunctionPassManager FPM;
				FPM.addPass(BPFPreserveStaticOffsetPass(true));
	FPM.addPass(BPFAbstractMemberAccessPass(this));			FPM.addPass(BPFAbstractMemberAccessPass(this));
	FPM.addPass(BPFPreserveDITypePass());			FPM.addPass(BPFPreserveDITypePass());
	FPM.addPass(BPFIRPeepholePass());			FPM.addPass(BPFIRPeepholePass());
	MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));			MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
	});			});
	PB.registerPeepholeEPCallback([=](FunctionPassManager &FPM,			PB.registerPeepholeEPCallback([=](FunctionPassManager &FPM,
	OptimizationLevel Level) {			OptimizationLevel Level) {
	FPM.addPass(SimplifyCFGPass(SimplifyCFGOptions().hoistCommonInsts(true)));			FPM.addPass(SimplifyCFGPass(SimplifyCFGOptions().hoistCommonInsts(true)));
	});			});
				PB.registerScalarOptimizerLateEPCallback(
				[=](FunctionPassManager &FPM, OptimizationLevel Level) {
				// Run this after loop unrolling but before
				// SimplifyCFGPass(... .sinkCommonInsts(true))
				FPM.addPass(BPFPreserveStaticOffsetPass(false));
				});
	PB.registerPipelineEarlySimplificationEPCallback(			PB.registerPipelineEarlySimplificationEPCallback(
	[=](ModulePassManager &MPM, OptimizationLevel) {			[=](ModulePassManager &MPM, OptimizationLevel) {
	MPM.addPass(BPFAdjustOptPass());			MPM.addPass(BPFAdjustOptPass());
	});			});
	}			}

	void BPFPassConfig::addIRPasses() {			void BPFPassConfig::addIRPasses() {
	addPass(createBPFCheckAndAdjustIR());			addPass(createBPFCheckAndAdjustIR());
	Show All 36 Lines

llvm/lib/Target/BPF/CMakeLists.txt

Show All 20 Lines	add_llvm_target(BPFCodeGen
BPFCheckAndAdjustIR.cpp		BPFCheckAndAdjustIR.cpp
BPFFrameLowering.cpp		BPFFrameLowering.cpp
BPFInstrInfo.cpp		BPFInstrInfo.cpp
BPFIRPeephole.cpp		BPFIRPeephole.cpp
BPFISelDAGToDAG.cpp		BPFISelDAGToDAG.cpp
BPFISelLowering.cpp		BPFISelLowering.cpp
BPFMCInstLower.cpp		BPFMCInstLower.cpp
BPFPreserveDIType.cpp		BPFPreserveDIType.cpp
		BPFPreserveStaticOffset.cpp
BPFRegisterInfo.cpp		BPFRegisterInfo.cpp
BPFSelectionDAGInfo.cpp		BPFSelectionDAGInfo.cpp
BPFSubtarget.cpp		BPFSubtarget.cpp
BPFTargetMachine.cpp		BPFTargetMachine.cpp
BPFMIPeephole.cpp		BPFMIPeephole.cpp
BPFMIChecking.cpp		BPFMIChecking.cpp
BPFMISimplifyPatchable.cpp		BPFMISimplifyPatchable.cpp
BTFDebug.cpp		BTFDebug.cpp
Show All 26 Lines

llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of a load instruction for a field with non-standard
				; alignment by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; typedef int aligned_int __attribute__((aligned(128)));
				;
				; struct foo {
				; int _;
				; aligned_int a;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p) {
				; consume(p->a);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, [124 x i8], i32, [124 x i8] }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 2
				%1 = load i32, ptr %a, align 128, !tbaa !2
				call void @consume(i32 noundef %1)
				ret void
				}

				; CHECK: %[[a1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.foo) %{{[^,]+}},
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 7, i1 true, i32 immarg 0, i32 immarg 2)
				; ^^^^
				; alignment 2**7
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK-NEXT: call void @consume(i32 noundef %[[a1]])
				; CHECK: attributes #[[v2]] = { memory(argmem: read) }

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 128}
				erichkeaneUnsubmitted Done Reply Inline Actions Are we sure we want to do something like this? It seems this both depends on YOUR computer AND us never releasing Clang 18. erichkeane: Are we sure we want to do something like this? It seems this both depends on YOUR computer AND…
				eddyz87AuthorUnsubmitted Done Reply Inline Actions Are you sure this would be an issue? The specific line is not a part of a CHECK and I tried the following command using my system's llvm 16 opt: opt -O2 -mtriple=bpf-pc-linux -S -o - load-align.ll And module was loaded / processed w/o any issues. In general grepping shows that people don't usually mask these in tests: $ cd llvm/test/CodeGen/ $ ag '{!"clang version' \| wc -l 452 eddyz87: Are you sure this would be an issue? The specific line is not a part of a CHECK and I tried the…
				erichkeaneUnsubmitted Done Reply Inline Actions I don't write LLVM tests ever, so I'm not sure. It just seems odd to provide that much irrelevant info, perhaps one of hte LLVM reviewers can comment. Also, look at those ~450 and see what they contain? erichkeane: I don't write LLVM tests ever, so I'm not sure. It just seems odd to provide that much…
				eddyz87AuthorUnsubmitted Done Reply Inline Actions Also, look at those ~450 and see what they contain? Same random clang versions: $ ag '{!"clang version' \| head X86/debug-loclists.ll:129:!6 = !{!"clang version 10.0.0 (trunk 374581) (llvm/trunk 374579)"} X86/dbg-combine.ll:84:!11 = !{!"clang version 3.7.0 (trunk 227074)"} X86/debuginfo-locations-dce.ll:46:!6 = !{!"clang version 8.0.0 (trunk 339665)"} X86/pr31242.ll:48:!5 = !{!"clang version 4.0.0 (trunk 288844)"} X86/catchpad-regmask.ll:140:!1 = !{!"clang version 3.8.0 "} X86/debug-nodebug-crash.ll:48:!5 = !{!"clang version 4.0.0"} X86/limit-split-cost.mir:64: !2 = !{!"clang version 7.0.0 (trunk 335057)"} X86/swap.ll:169:!1 = !{!"clang version 9.0.0 (trunk 352631) (llvm/trunk 352632)"} X86/dwarf-aranges-available-externally.ll:65:!16 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git 2f52a868225755ebfa5242992d3a650ac6aadce7)"} X86/label-annotation.ll:96:!7 = !{!"clang version 9.0.0 (git@github.com:llvm/llvm-project.git 7f9a008a2db285aca57bfa0c09858c9527a7aa98)"} eddyz87: > Also, look at those ~450 and see what they contain? Same random clang versions: ``` $ ag '{!
				erichkeaneUnsubmitted Done Reply Inline Actions It seems at least removing your home-path would be a good idea, but I can't really review these either. Note the 'trunk `#`' is from our SVN days, so that isn't particularly useful at all. IMO everything in the parens is worthless in the tests, but hopefully someone familiar with the LLVM tests can stop by and correct me. erichkeane: It seems at least removing your home-path would be a good idea, but I can't really review these…
				eddyz87AuthorUnsubmitted Done Reply Inline Actions Most of the tests use generated IR as a starting point and it looks like people don't bother to hide these details. But I checked and test passes if I replace this string with just "clang", I'll update the tests, not a big deal. eddyz87: Most of the tests use generated IR as a starting point and it looks like people don't bother to…
				!3 = !{!"foo", !4, i64 0, !4, i64 128}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-arr-pai.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				; #define __pai __attribute__((preserve_access_index))
				;
				; struct bar {
				; int a[7];
				; } __pai __ctx;
				;
				; int buz(struct bar *p) {
				; return p->a[5];
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -debug-info-kind=limited -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.bar = type { [7 x i32] }

				; Function Attrs: nounwind
				define dso_local i32 @buz(ptr noundef %p) #0 !dbg !10 {
				entry:
				call void @llvm.dbg.value(metadata ptr %p, metadata !18, metadata !DIExpression()), !dbg !19
				%0 = call ptr @llvm.preserve.static.offset(ptr %p), !dbg !20
				%1 = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.bar) %0, i32 0, i32 0), !dbg !20, !llvm.preserve.access.index !14
				%2 = call ptr @llvm.preserve.array.access.index.p0.p0(ptr elementtype([7 x i32]) %1, i32 1, i32 5), !dbg !21, !llvm.preserve.access.index !3
				%3 = load i32, ptr %2, align 4, !dbg !21, !tbaa !22
				ret i32 %3, !dbg !26
				}

				; CHECK: define dso_local i32 @buz(ptr noundef %[[p:.]]) {{.}} {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: call void @llvm.dbg.value
				; CHECK-NEXT: %[[v5:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.bar) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 0, i32 immarg 5)
				; CHECK-SAME: #[[v6:.*]], !tbaa
				; CHECK-NEXT: ret i32 %[[v5]]
				; CHECK-NEXT: }
				; CHECK: attributes #[[v6]] = { memory(argmem: read) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32 immarg, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.array.access.index.p0.p0(ptr, i32 immarg, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.value(metadata, metadata, metadata) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nosync nounwind willreturn memory(none) }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!7, !8}
				!llvm.ident = !{!9}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, retainedTypes: !2, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "some-file.c", directory: "/some/dir/")
				!2 = !{!3}
				!3 = !DICompositeType(tag: DW_TAG_array_type, baseType: !4, size: 224, elements: !5)
				!4 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!5 = !{!6}
				!6 = !DISubrange(count: 7)
				!7 = !{i32 2, !"Debug Info Version", i32 3}
				!8 = !{i32 1, !"wchar_size", i32 4}
				!9 = !{!"clang"}
				!10 = distinct !DISubprogram(name: "buz", scope: !1, file: !1, line: 8, type: !11, scopeLine: 8, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0, retainedNodes: !17)
				!11 = !DISubroutineType(types: !12)
				!12 = !{!4, !13}
				!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64)
				!14 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "bar", file: !1, line: 4, size: 224, elements: !15)
				!15 = !{!16}
				!16 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !14, file: !1, line: 5, baseType: !3, size: 224)
				!17 = !{!18}
				!18 = !DILocalVariable(name: "p", arg: 1, scope: !10, file: !1, line: 8, type: !13)
				!19 = !DILocation(line: 0, scope: !10)
				!20 = !DILocation(line: 9, column: 13, scope: !10)
				!21 = !DILocation(line: 9, column: 10, scope: !10)
				!22 = !{!23, !23, i64 0}
				!23 = !{!"int", !24, i64 0}
				!24 = !{!"omnipotent char", !25, i64 0}
				!25 = !{!"Simple C/C++ TBAA"}
				!26 = !DILocation(line: 9, column: 3, scope: !10)

llvm/test/CodeGen/BPF/preserve-static-offset/load-atomic.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of atomic load instruction by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int _;
				; int a;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p) {
				; int r;
				; __atomic_load(&p->a, &r, 2);
				; consume(r);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%1 = load atomic i32, ptr %a acquire, align 4
				call void @consume(i32 noundef %1)
				ret void
				}

				; CHECK: %[[a1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr elementtype(%struct.foo) %[[p:.*]],
				; i1 false, i8 4, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				; ^^^^
				; atomic order
				; CHECK-NOT: #{{[0-9]+}}
				; CHECK-NEXT: call void @consume(i32 noundef %[[a1]])

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				declare void @consume(i32 noundef) #3

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-2.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds GEP chains that end by
				; getelementptr.and.load.
				;
				; Source (modified by hand):
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p) {
				; consume(p->b.bb);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -
				;
				; And modified to fold last getelementptr/load as a single
				; getelementptr.and.load.

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%bb1 = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.load.i32
				(ptr readonly elementtype(%struct.bar) %b,
				i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				#4, !tbaa !2
				call void @consume(i32 noundef %bb1)
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.*]])
				; CHECK: %[[bb1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK-NEXT: call void @consume(i32 noundef %[[bb1]])
				; CHECK: attributes #[[v2]] = { memory(argmem: read) }

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nounwind willreturn
				declare i32 @llvm.bpf.getelementptr.and.load.i32(ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { nocallback nofree nounwind willreturn }
				attributes #4 = { memory(argmem: read) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-oob.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset keeps track of 'inbounds' flags while
				; folding chain of GEP instructions.
				;
				; Source (IR modified by hand):
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a[2];
				; };
				;
				; struct bar {
				; int a;
				; struct foo b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void buz(struct bar *p) {
				; consume(p->b.a[1]);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -
				;
				; Modified to remove one of the 'inbounds' from one of the GEP instructions.

				%struct.bar = type { i32, %struct.foo }
				%struct.foo = type { [2 x i32] }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.bar, ptr %0, i32 0, i32 1
				%a = getelementptr %struct.foo, ptr %b, i32 0, i32 0
				%arrayidx = getelementptr inbounds [2 x i32], ptr %a, i64 0, i64 1
				%1 = load i32, ptr %arrayidx, align 4, !tbaa !2
				call void @consume(i32 noundef %1)
				ret void
				}

				; CHECK: %[[v1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.bar) %{{[^,]+}},
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 false,
				; ^^^^^^^^
				; not inbounds
				; CHECK-SAME: i32 immarg 0, i32 immarg 1, i32 immarg 0, i64 immarg 1)
				; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				; folded gep chain
				; CHECK-NEXT: call void @consume(i32 noundef %[[v1]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !3, i64 0}
				!3 = !{!"int", !4, i64 0}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-u8-oob.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds chain of GEP instructions.
				; The GEP chain in this example has unexpected shape and thus is
				; folded as i8 access.
				;
				; Source (IR modified by hand):
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; char a[2];
				; };
				;
				; struct bar {
				; char a;
				; struct foo b;
				; } __ctx;
				;
				; extern void consume(char);
				;
				; void buz(struct bar *p) {
				; consume((&p->b)[1].a[1]);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -
				;
				; Modified to remove 'inbounds' from one of the GEP instructions.

				%struct.bar = type { i8, %struct.foo }
				%struct.foo = type { [2 x i8] }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.bar, ptr %0, i32 0, i32 1
				%arrayidx = getelementptr inbounds %struct.foo, ptr %b, i64 1
				; ^^^^^
				; folded as i8 access because of this index
				%a = getelementptr %struct.foo, ptr %arrayidx, i32 0, i32 0
				%arrayidx1 = getelementptr inbounds [2 x i8], ptr %a, i64 0, i64 1
				%1 = load i8, ptr %arrayidx1, align 1, !tbaa !2
				call void @consume(i8 noundef signext %1)
				ret void
				}

				; CHECK: %[[v1:.*]] = call i8 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i8
				; CHECK-SAME: (ptr readonly elementtype(i8) %{{[^,]+}},
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 0, i1 false, i64 immarg 4)
				; ^^^^^^^^ ^^^^^^^^^^^^
				; not inbounds ---' \|
				; offset from 'struct bar' start -------------'
				; CHECK-NEXT: call void @consume(i8 noundef signext %[[v1]])

				declare void @consume(i8 noundef signext) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !3, i64 0}
				!3 = !{!"omnipotent char", !4, i64 0}
				!4 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-u8-type-mismatch.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds chain of GEP instructions.
				; The GEP chain in this example has unexpected shape and thus is
				; folded as i8 access.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; char aa;
				; char bb;
				; };
				;
				; struct bar {
				; char a;
				; struct foo b;
				; } __ctx;
				;
				; extern void consume(char);
				;
				; void buz(struct bar *p) {
				; consume(((struct foo )(((char)&p->b) + 1))->bb);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.bar = type { i8, %struct.foo }
				%struct.foo = type { i8, i8 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.bar, ptr %0, i32 0, i32 1
				%add.ptr = getelementptr inbounds i8, ptr %b, i64 1
				; ~~
				; these types do not match, thus GEP chain is folded as an offset
				; ~~~~~~~~~~~
				%bb = getelementptr inbounds %struct.foo, ptr %add.ptr, i32 0, i32 1
				%1 = load i8, ptr %bb, align 1, !tbaa !2
				call void @consume(i8 noundef signext %1)
				ret void
				}

				; CHECK: %[[bb1:.*]] = call i8 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i8
				; CHECK-SAME: (ptr readonly elementtype(i8) %{{[^,]+}},
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 0, i1 true, i64 immarg 3)
				; ^^^^^^^^^^^^
				; offset from 'struct bar' start
				; CHECK-NEXT: call void @consume(i8 noundef signext %[[bb1]])

				declare void @consume(i8 noundef signext) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 1}
				!3 = !{!"foo", !4, i64 0, !4, i64 1}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-u8.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds chain of GEP instructions.
				; The GEP chain in this example has unexpected shape and thus is
				; folded as i8 access.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; char a[2];
				; };
				;
				; struct bar {
				; char a;
				; struct foo b;
				; } __ctx;
				;
				; extern void consume(char);
				;
				; void buz(struct bar *p) {
				; consume((&p->b)[1].a[1]);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.bar = type { i8, %struct.foo }
				%struct.foo = type { [2 x i8] }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.bar, ptr %0, i32 0, i32 1
				%arrayidx = getelementptr inbounds %struct.foo, ptr %b, i64 1
				; ^^^^^
				; folded as i8 access because of this index
				%a = getelementptr inbounds %struct.foo, ptr %arrayidx, i32 0, i32 0
				%arrayidx1 = getelementptr inbounds [2 x i8], ptr %a, i64 0, i64 1
				%1 = load i8, ptr %arrayidx1, align 1, !tbaa !2
				call void @consume(i8 noundef signext %1)
				ret void
				}

				; CHECK: %[[v1:.*]] = call i8 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i8
				; CHECK-SAME: (ptr readonly elementtype(i8) %{{[^,]+}},
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 0, i1 true, i64 immarg 4)
				; ^^^^^^^^^^^^
				; offset from 'struct bar' start
				; CHECK-NEXT: call void @consume(i8 noundef signext %[[v1]])

				declare void @consume(i8 noundef signext) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !3, i64 0}
				!3 = !{!"omnipotent char", !4, i64 0}
				!4 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds chain of GEP instructions.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a[2];
				; };
				;
				; struct bar {
				; int a;
				; struct foo b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void buz(struct bar *p) {
				; consume(p->b.a[1]);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.bar = type { i32, %struct.foo }
				%struct.foo = type { [2 x i32] }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.bar, ptr %0, i32 0, i32 1
				%a = getelementptr inbounds %struct.foo, ptr %b, i32 0, i32 0
				%arrayidx = getelementptr inbounds [2 x i32], ptr %a, i64 0, i64 1
				%1 = load i32, ptr %arrayidx, align 4, !tbaa !2
				call void @consume(i32 noundef %1)
				ret void
				}

				; CHECK: %[[v1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.bar) %{{[^,]+}},
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true,
				; CHECK-SAME: i32 immarg 0, i32 immarg 1, i32 immarg 0, i64 immarg 1)
				; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				; folded gep chain
				; CHECK-NEXT: call void @consume(i32 noundef %[[v1]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !3, i64 0}
				!3 = !{!"int", !4, i64 0}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-inline.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check position of bpf-preserve-static-offset pass in the pipeline:
				; - preserve.static.offset call is preserved if address is passed as
				; a parameter to an inline-able function;
				; - second bpf-preserve-static-offset pass (after inlining) should introduce
				; getelementptr.and.load call using the preserved marker.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; static inline void bar(struct bar *p){
				; consume(p->bb);
				; }
				;
				; void quux(struct foo *p) {
				; bar(&p->b);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @quux(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				call void @bar(ptr noundef %b)
				ret void
				}

				; Function Attrs: inlinehint nounwind
				define internal void @bar(ptr noundef %p) #1 {
				entry:
				%bb = getelementptr inbounds %struct.bar, ptr %p, i32 0, i32 1
				%0 = load i32, ptr %bb, align 4, !tbaa !2
				call void @consume(i32 noundef %0)
				ret void
				}

				; CHECK: define dso_local void @quux(ptr nocapture noundef readonly %[[p:.*]])
				; CHECK: %[[bb_i1:.*]] = tail call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i64 immarg 0, i32 immarg 1, i32 immarg 1)
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK-NEXT: tail call void @consume(i32 noundef %[[bb_i1]])
				; CHECK: attributes #[[v2]] = { memory(argmem: read) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				declare void @consume(i32 noundef) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { inlinehint nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"bar", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-non-const.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s 2>&1 \| FileCheck %s
				;
				; If load offset is not a constant bpf-preserve-static-offset should report a
				; warning and remove preserve.static.offset call.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a[7];
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p, unsigned long i) {
				; consume(p->a[i]);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -debug-info-kind=line-tables-only -triple bpf \
				; -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				; CHECK: warning: some-file.c:10:11: in function bar void (ptr, i64):
				; CHECK-SAME: Non-constant offset in access to a field of a type marked with
				; CHECK-SAME: preserve_static_offset might be rejected by BPF verifier

				%struct.foo = type { [7 x i32] }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p, i64 noundef %i) #0 !dbg !5 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p), !dbg !8
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 0, !dbg !8
				%arrayidx = getelementptr inbounds [7 x i32], ptr %a, i64 0, i64 %i, !dbg !9
				%1 = load i32, ptr %arrayidx, align 4, !dbg !9, !tbaa !10
				call void @consume(i32 noundef %1), !dbg !14
				ret void, !dbg !15
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.]], i64 noundef %[[i:.]])
				; CHECK: %[[a:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 0, !dbg
				; CHECK-NEXT: %[[arrayidx:.*]] = getelementptr inbounds [7 x i32], ptr %[[a]], i64 0, i64 %[[i]], !dbg
				; CHECK-NEXT: %[[v5:.]] = load i32, ptr %[[arrayidx]], align 4, !dbg {{.}}, !tbaa
				; CHECK-NEXT: call void @consume(i32 noundef %[[v5]]), !dbg

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!2, !3}
				!llvm.ident = !{!4}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "some-file.c", directory: "/some/dir/")
				!2 = !{i32 2, !"Debug Info Version", i32 3}
				!3 = !{i32 1, !"wchar_size", i32 4}
				!4 = !{!"clang"}
				!5 = distinct !DISubprogram(name: "bar", scope: !1, file: !1, line: 9, type: !6, scopeLine: 9, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0)
				!6 = !DISubroutineType(types: !7)
				!7 = !{}
				!8 = !DILocation(line: 10, column: 14, scope: !5)
				!9 = !DILocation(line: 10, column: 11, scope: !5)
				!10 = !{!11, !11, i64 0}
				!11 = !{!"int", !12, i64 0}
				!12 = !{!"omnipotent char", !13, i64 0}
				!13 = !{!"Simple C/C++ TBAA"}
				!14 = !DILocation(line: 10, column: 3, scope: !5)
				!15 = !DILocation(line: 11, column: 1, scope: !5)

llvm/test/CodeGen/BPF/preserve-static-offset/load-ptr-pai.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				; #define __pai __attribute__((preserve_access_index))
				;
				; struct bar {
				; int a;
				; int b;
				; } __pai;
				;
				; struct buz {
				; int _1;
				; struct bar *b;
				; } __pai __ctx;
				;
				; void foo(struct buz *p) {
				; p->b->b = 42;
				; }
				;
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes \
				; -debug-info-kind=limited -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.buz = type { i32, ptr }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @foo(ptr noundef %p) #0 !dbg !5 {
				entry:
				call void @llvm.dbg.value(metadata ptr %p, metadata !20, metadata !DIExpression()), !dbg !21
				%0 = call ptr @llvm.preserve.static.offset(ptr %p), !dbg !22
				%1 = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.buz) %0, i32 1, i32 1), !dbg !22, !llvm.preserve.access.index !9
				%2 = load ptr, ptr %1, align 8, !dbg !22, !tbaa !23
				%3 = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.bar) %2, i32 1, i32 1), !dbg !29, !llvm.preserve.access.index !15
				store i32 42, ptr %3, align 4, !dbg !30, !tbaa !31
				ret void, !dbg !33
				}

				; CHECK: define dso_local void @foo(ptr noundef %[[p:.]]) {{.}} {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: call void @llvm.dbg.value
				; CHECK-NEXT: %[[v5:.*]] = call ptr (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.p0
				; CHECK-SAME: (ptr readonly elementtype(%struct.buz) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 3, i1 true, i32 immarg 0, i32 immarg 1)
				; CHECK-SAME: #[[v6:.*]], !tbaa
				; CHECK-NEXT: %[[v8:.*]] =
				; CHECK-SAME: call ptr @llvm.preserve.struct.access.index.p0.p0
				; CHECK-SAME: (ptr elementtype(%struct.bar) %[[v5]], i32 1, i32 1),
				; CHECK-SAME: !dbg ![[#]], !llvm.preserve.access.index ![[#]]
				; CHECK-NEXT: store i32 42, ptr %[[v8]], align 4, !dbg ![[#]], !tbaa
				; CHECK-NEXT: ret void, !dbg
				; CHECK-NEXT: }

				; CHECK : attributes #[[v6]] = { memory(argmem: read) }


				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32 immarg, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.value(metadata, metadata, metadata) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nosync nounwind willreturn memory(none) }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!2, !3}
				!llvm.ident = !{!4}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "some-file.c", directory: "/some/dir/")
				!2 = !{i32 2, !"Debug Info Version", i32 3}
				!3 = !{i32 1, !"wchar_size", i32 4}
				!4 = !{!"clang"}
				!5 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 14, type: !6, scopeLine: 14, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0, retainedNodes: !19)
				!6 = !DISubroutineType(types: !7)
				!7 = !{null, !8}
				!8 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !9, size: 64)
				!9 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "buz", file: !1, line: 9, size: 128, elements: !10)
				!10 = !{!11, !13}
				!11 = !DIDerivedType(tag: DW_TAG_member, name: "_1", scope: !9, file: !1, line: 10, baseType: !12, size: 32)
				!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!13 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !9, file: !1, line: 11, baseType: !14, size: 64, offset: 64)
				!14 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !15, size: 64)
				!15 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "bar", file: !1, line: 4, size: 64, elements: !16)
				!16 = !{!17, !18}
				!17 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !15, file: !1, line: 5, baseType: !12, size: 32)
				!18 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !15, file: !1, line: 6, baseType: !12, size: 32, offset: 32)
				!19 = !{!20}
				!20 = !DILocalVariable(name: "p", arg: 1, scope: !5, file: !1, line: 14, type: !8)
				!21 = !DILocation(line: 0, scope: !5)
				!22 = !DILocation(line: 15, column: 6, scope: !5)
				!23 = !{!24, !28, i64 8}
				!24 = !{!"buz", !25, i64 0, !28, i64 8}
				!25 = !{!"int", !26, i64 0}
				!26 = !{!"omnipotent char", !27, i64 0}
				!27 = !{!"Simple C/C++ TBAA"}
				!28 = !{!"any pointer", !26, i64 0}
				!29 = !DILocation(line: 15, column: 9, scope: !5)
				!30 = !DILocation(line: 15, column: 11, scope: !5)
				!31 = !{!32, !25, i64 4}
				!32 = !{!"bar", !25, i64 0, !25, i64 4}
				!33 = !DILocation(line: 16, column: 1, scope: !5)

llvm/test/CodeGen/BPF/preserve-static-offset/load-simple.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of a simple load instruction by bpf-preserve-static-offset.
				; Verify:
				; - presence of gep.and.load intrinsic call
				; - correct attributes for intrinsic call
				; - presence of tbaa annotations
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int _;
				; int a;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p) {
				; consume(p->a);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%1 = load i32, ptr %a, align 4, !tbaa !2
				call void @consume(i32 noundef %1)
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.*]])
				; CHECK: %[[a1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				; CHECK-SAME: #[[v1:.*]], !tbaa
				; CHECK-NEXT: call void @consume(i32 noundef %[[a1]])

				; CHECK: declare i32
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32(ptr nocapture, {{.}}) #[[v2:.]]

				; CHECK: attributes #[[v2]] = { nocallback nofree nounwind willreturn }
				; CHECK: attributes #[[v1]] = { memory(argmem: read) }

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-struct-pai.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				; #define __pai __attribute__((preserve_access_index))
				;
				; struct foo {
				; int a;
				; int b;
				; };
				;
				; struct bar {
				; int _1;
				; int _2;
				; struct foo c;
				; } __pai __ctx;
				;
				; int buz(struct bar *p) {
				; return p->c.b;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes \
				; -debug-info-kind=limited -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.bar = type { i32, i32, %struct.foo }
				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local i32 @buz(ptr noundef %p) #0 !dbg !5 {
				entry:
				call void @llvm.dbg.value(metadata ptr %p, metadata !20, metadata !DIExpression()), !dbg !21
				%0 = call ptr @llvm.preserve.static.offset(ptr %p), !dbg !22
				%1 = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.bar) %0, i32 2, i32 2), !dbg !22, !llvm.preserve.access.index !10
				%b = getelementptr inbounds %struct.foo, ptr %1, i32 0, i32 1, !dbg !23
				%2 = load i32, ptr %b, align 4, !dbg !23, !tbaa !24
				ret i32 %2, !dbg !30
				}

				; CHECK: define dso_local i32 @buz(ptr noundef %[[p:.]]) {{.}} {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: call void @llvm.dbg.value
				; CHECK-NEXT: %[[b1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.bar) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 2, i32 immarg 1)
				; CHECK-SAME: #[[v5:.*]], !tbaa
				; CHECK-NEXT: ret i32 %[[b1]]
				; CHECK-NEXT: }

				; CHECK: attributes #[[v5]] = { memory(argmem: read) }


				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32 immarg, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.value(metadata, metadata, metadata) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nosync nounwind willreturn memory(none) }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!2, !3}
				!llvm.ident = !{!4}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "some-file.c", directory: "/some/dir/")
				!2 = !{i32 2, !"Debug Info Version", i32 3}
				!3 = !{i32 1, !"wchar_size", i32 4}
				!4 = !{!"clang"}
				!5 = distinct !DISubprogram(name: "buz", scope: !1, file: !1, line: 15, type: !6, scopeLine: 15, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0, retainedNodes: !19)
				!6 = !DISubroutineType(types: !7)
				!7 = !{!8, !9}
				!8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!9 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !10, size: 64)
				!10 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "bar", file: !1, line: 9, size: 128, elements: !11)
				!11 = !{!12, !13, !14}
				!12 = !DIDerivedType(tag: DW_TAG_member, name: "_1", scope: !10, file: !1, line: 10, baseType: !8, size: 32)
				!13 = !DIDerivedType(tag: DW_TAG_member, name: "_2", scope: !10, file: !1, line: 11, baseType: !8, size: 32, offset: 32)
				!14 = !DIDerivedType(tag: DW_TAG_member, name: "c", scope: !10, file: !1, line: 12, baseType: !15, size: 64, offset: 64)
				!15 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "foo", file: !1, line: 4, size: 64, elements: !16)
				!16 = !{!17, !18}
				!17 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !15, file: !1, line: 5, baseType: !8, size: 32)
				!18 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !15, file: !1, line: 6, baseType: !8, size: 32, offset: 32)
				!19 = !{!20}
				!20 = !DILocalVariable(name: "p", arg: 1, scope: !5, file: !1, line: 15, type: !9)
				!21 = !DILocation(line: 0, scope: !5)
				!22 = !DILocation(line: 16, column: 13, scope: !5)
				!23 = !DILocation(line: 16, column: 15, scope: !5)
				!24 = !{!25, !26, i64 12}
				!25 = !{!"bar", !26, i64 0, !26, i64 4, !29, i64 8}
				!26 = !{!"int", !27, i64 0}
				!27 = !{!"omnipotent char", !28, i64 0}
				!28 = !{!"Simple C/C++ TBAA"}
				!29 = !{!"foo", !26, i64 0, !26, i64 4}
				!30 = !DILocation(line: 16, column: 3, scope: !5)

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-align.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that getelementptr.and.load unroll restores alignment spec.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; typedef int aligned_int __attribute__((aligned(128)));
				;
				; struct foo {
				; int _;
				; aligned_int a;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p) {
				; consume(p->a);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, [124 x i8], i32, [124 x i8] }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%a1 = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.load.i32
				(ptr readonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 7, i1 true, i32 immarg 0, i32 immarg 2)
				#4, !tbaa !2
				call void @consume(i32 noundef %a1)
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.*]])
				; CHECK: %[[a11:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 2
				; CHECK: %[[v2:.*]] = load i32, ptr %[[a11]], align 128
				; CHECK: call void @consume(i32 noundef %[[v2]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nounwind willreturn
				declare i32 @llvm.bpf.getelementptr.and.load.i32(ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { nocallback nofree nounwind willreturn }
				attributes #4 = { memory(argmem: read) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 128}
				!3 = !{!"foo", !4, i64 0, !4, i64 128}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-chain-oob.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that getelementptr.and.load unroll can skip 'inbounds' flag.
				;
				; Source (IR modified by hand):
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void buz(struct foo *p) {
				; consume(p->b.bb);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -
				;
				; Modified to set 'inbounds' flag to false.

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%bb1 = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.load.i32
				(ptr readonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 2, i1 false, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				#4, !tbaa !2
				call void @consume(i32 noundef %bb1)
				ret void
				}

				; CHECK: define dso_local void @buz(ptr noundef %[[p:.*]])
				; CHECK: %[[bb11:.*]] = getelementptr %struct.foo, ptr %[[p]], i32 0, i32 1, i32 1
				; CHECK: %[[v2:.*]] = load i32, ptr %[[bb11]], align 4
				; CHECK: call void @consume(i32 noundef %[[v2]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nounwind willreturn
				declare i32 @llvm.bpf.getelementptr.and.load.i32(ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { nocallback nofree nounwind willreturn }
				attributes #4 = { memory(argmem: read) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-chain-u8.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check unroll of getelementptr.and.load when direct memory offset is
				; used instead of field indexes.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; char aa;
				; char bb;
				; };
				;
				; struct bar {
				; char a;
				; struct foo b;
				; } __ctx;
				;
				; extern void consume(char);
				;
				; void buz(struct bar *p) {
				; consume(((struct foo )(((char)&p->b) + 1))->bb);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%bb1 = call i8 (ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.load.i8
				(ptr readonly elementtype(i8) %p,
				i1 false, i8 0, i8 1, i8 0, i1 true, i64 immarg 3)
				#4, !tbaa !2
				call void @consume(i8 noundef signext %bb1)
				ret void
				}

				; CHECK: define dso_local void @buz(ptr noundef %[[p:.*]])
				; CHECK: %[[bb11:.*]] = getelementptr inbounds i8, ptr %[[p]], i64 3
				; CHECK: %[[v2:.*]] = load i8, ptr %[[bb11]], align 1
				; CHECK: call void @consume(i8 noundef signext %[[v2]])

				declare void @consume(i8 noundef signext) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nounwind willreturn
				declare i8 @llvm.bpf.getelementptr.and.load.i8(ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { nocallback nofree nounwind willreturn }
				attributes #4 = { memory(argmem: read) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 1}
				!3 = !{!"foo", !4, i64 0, !4, i64 1}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-chain.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check unroll of getelementptr.and.load when several field indexes
				; are specified in a chain.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void buz(struct foo *p) {
				; consume(p->b.bb);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%bb1 = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.load.i32
				(ptr readonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				#4, !tbaa !2
				call void @consume(i32 noundef %bb1)
				ret void
				}

				; CHECK: define dso_local void @buz(ptr noundef %[[p:.*]])
				; CHECK: %[[bb11:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 1, i32 1
				; CHECK: %[[v2:.*]] = load i32, ptr %[[bb11]], align 4
				; CHECK: call void @consume(i32 noundef %[[v2]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nounwind willreturn
				declare i32 @llvm.bpf.getelementptr.and.load.i32(ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { nocallback nofree nounwind willreturn }
				attributes #4 = { memory(argmem: read) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-simple.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check unroll of getelementptr.and.load.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; int b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p){
				; consume(p->b);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%b1 = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.load.i32
				(ptr readonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				#4, !tbaa !2
				call void @consume(i32 noundef %b1)
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.]]) #[[v1:.]] {
				; CHECK: %[[b11:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 1
				; CHECK-NEXT: %[[v2:.*]] = load i32, ptr %[[b11]], align 4
				; CHECK-NEXT: call void @consume(i32 noundef %[[v2]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nounwind willreturn
				declare i32 @llvm.bpf.getelementptr.and.load.i32(ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { nocallback nofree nounwind willreturn }
				attributes #4 = { memory(argmem: read) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-volatile.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that unroll of getelementptr.and.load restores volatile.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; volatile int b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p){
				; consume(p->b);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%b1 = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.load.i32
				(ptr elementtype(%struct.foo) %p,
				i1 true, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1),
				!tbaa !2
				call void @consume(i32 noundef %b1)
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.*]])
				; CHECK: %[[b11:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 1
				; CHECK: %[[v2:.*]] = load volatile i32, ptr %[[b11]], align 4
				; CHECK: call void @consume(i32 noundef %[[v2]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nounwind willreturn
				declare i32 @llvm.bpf.getelementptr.and.load.i32(ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #3

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #3 = { nocallback nofree nounwind willreturn }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-union-pai.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				; #define __pai __attribute__((preserve_access_index))
				;
				; struct foo {
				; char a[10];
				; } __pai;
				;
				; struct bar {
				; int a;
				; int b;
				; } __pai;
				;
				; union buz {
				; struct foo a;
				; struct bar b;
				; } __pai __ctx;
				;
				; int quux(union buz *p) {
				; return p->b.b;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -debug-info-kind=limited -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local i32 @quux(ptr noundef %p) #0 !dbg !5 {
				entry:
				call void @llvm.dbg.value(metadata ptr %p, metadata !26, metadata !DIExpression()), !dbg !27
				%0 = call ptr @llvm.preserve.static.offset(ptr %p), !dbg !28
				%1 = call ptr @llvm.preserve.union.access.index.p0.p0(ptr %0, i32 1), !dbg !28, !llvm.preserve.access.index !10
				%2 = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.bar) %1, i32 1, i32 1), !dbg !29, !llvm.preserve.access.index !21
				%3 = load i32, ptr %2, align 4, !dbg !29, !tbaa !30
				ret i32 %3, !dbg !33
				}

				; CHECK: define dso_local i32 @quux(ptr noundef %[[p:.]]) {{.}} {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: call void @llvm.dbg.value
				; CHECK-NEXT: %[[v5:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.bar) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				; CHECK-SAME: #[[v6:.*]], !tbaa
				; CHECK-NEXT: ret i32 %[[v5]]
				; CHECK-NEXT: }
				; CHECK: attributes #[[v6]] = { memory(argmem: read) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.union.access.index.p0.p0(ptr, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32 immarg, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.value(metadata, metadata, metadata) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nosync nounwind willreturn memory(none) }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!2, !3}
				!llvm.ident = !{!4}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "some-file.c", directory: "/some/dir/")
				!2 = !{i32 2, !"Debug Info Version", i32 3}
				!3 = !{i32 1, !"wchar_size", i32 4}
				!4 = !{!"clang"}
				!5 = distinct !DISubprogram(name: "quux", scope: !1, file: !1, line: 18, type: !6, scopeLine: 18, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0, retainedNodes: !25)
				!6 = !DISubroutineType(types: !7)
				!7 = !{!8, !9}
				!8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!9 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !10, size: 64)
				!10 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "buz", file: !1, line: 13, size: 96, elements: !11)
				!11 = !{!12, !20}
				!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10, file: !1, line: 14, baseType: !13, size: 80)
				!13 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "foo", file: !1, line: 4, size: 80, elements: !14)
				!14 = !{!15}
				!15 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !13, file: !1, line: 5, baseType: !16, size: 80)
				!16 = !DICompositeType(tag: DW_TAG_array_type, baseType: !17, size: 80, elements: !18)
				!17 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
				!18 = !{!19}
				!19 = !DISubrange(count: 10)
				!20 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10, file: !1, line: 15, baseType: !21, size: 64)
				!21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "bar", file: !1, line: 8, size: 64, elements: !22)
				!22 = !{!23, !24}
				!23 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !21, file: !1, line: 9, baseType: !8, size: 32)
				!24 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !21, file: !1, line: 10, baseType: !8, size: 32, offset: 32)
				!25 = !{!26}
				!26 = !DILocalVariable(name: "p", arg: 1, scope: !5, file: !1, line: 18, type: !9)
				!27 = !DILocation(line: 0, scope: !5)
				!28 = !DILocation(line: 19, column: 13, scope: !5)
				!29 = !DILocation(line: 19, column: 15, scope: !5)
				!30 = !{!31, !31, i64 0}
				!31 = !{!"omnipotent char", !32, i64 0}
				!32 = !{!"Simple C/C++ TBAA"}
				!33 = !DILocation(line: 19, column: 3, scope: !5)

llvm/test/CodeGen/BPF/preserve-static-offset/load-unroll-inline.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check position of bpf-preserve-static-offset pass in the pipeline:
				; - preserve.static.offset call is preserved if address is passed as
				; a parameter to an inline-able function;
				; - second bpf-preserve-static-offset pass (after inlining) should introduce
				; getelementptr.and.load call using the preserved marker after loops
				; unrolling;
				; - readonly and tbaa attributes should allow replacement of
				; getelementptr.and.load calls by CSE transformation.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; int b[4];
				; } __ctx;
				;
				; extern void consume(int);
				;
				; static inline void bar(int * restrict p) {
				; consume(p[1]);
				; }
				;
				; void quux(struct foo *p){
				; unsigned long i = 0;
				; #pragma clang loop unroll(full)
				; while (i < 2) {
				; bar(p->b);
				; ++i;
				; }
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, [4 x i32] }

				; Function Attrs: nounwind
				define dso_local void @quux(ptr noundef %p) #0 {
				entry:
				br label %while.cond

				while.cond: ; preds = %while.body, %entry
				%i.0 = phi i64 [ 0, %entry ], [ %inc, %while.body ]
				%cmp = icmp ult i64 %i.0, 2
				br i1 %cmp, label %while.body, label %while.end

				while.body: ; preds = %while.cond
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%arraydecay = getelementptr inbounds [4 x i32], ptr %b, i64 0, i64 0
				call void @bar(ptr noundef %arraydecay)
				%inc = add i64 %i.0, 1
				br label %while.cond, !llvm.loop !2

				while.end: ; preds = %while.cond
				ret void
				}

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #1

				; Function Attrs: inlinehint nounwind
				define internal void @bar(ptr noalias noundef %p) #2 {
				entry:
				%arrayidx = getelementptr inbounds i32, ptr %p, i64 1
				%0 = load i32, ptr %arrayidx, align 4, !tbaa !5
				call void @consume(i32 noundef %0)
				ret void
				}

				; CHECK: define dso_local void @quux(ptr nocapture noundef readonly %[[p:.*]])
				; CHECK: %[[v1:.*]] = tail call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i64 immarg 0, i32 immarg 1, i64 immarg 1)
				; CHECK: tail call void @consume(i32 noundef %[[v1]])
				; CHECK: tail call void @consume(i32 noundef %[[v1]])

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #3

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #1

				declare void @consume(i32 noundef) #4

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }
				attributes #2 = { inlinehint nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #3 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #4 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = distinct !{!2, !3, !4}
				!3 = !{!"llvm.loop.mustprogress"}
				!4 = !{!"llvm.loop.unroll.full"}
				!5 = !{!6, !6, i64 0}
				!6 = !{!"int", !7, i64 0}
				!7 = !{!"omnipotent char", !8, i64 0}
				!8 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-unroll.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check position of bpf-preserve-static-offset pass in the pipeline:
				; preserve.static.offset call should be preserved long enough to allow
				; introduction of getelementptr.and.load after loops unrolling.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; int b[4];
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p){
				; unsigned long i = 0;
				; #pragma clang loop unroll(full)
				; while (i < 2)
				; consume(p->b[i++]);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, [4 x i32] }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				br label %while.cond

				while.cond: ; preds = %while.body, %entry
				%i.0 = phi i64 [ 0, %entry ], [ %inc, %while.body ]
				%cmp = icmp ult i64 %i.0, 2
				br i1 %cmp, label %while.body, label %while.end

				while.body: ; preds = %while.cond
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%inc = add i64 %i.0, 1
				%arrayidx = getelementptr inbounds [4 x i32], ptr %b, i64 0, i64 %i.0
				%1 = load i32, ptr %arrayidx, align 4, !tbaa !2
				call void @consume(i32 noundef %1)
				br label %while.cond, !llvm.loop !6

				while.end: ; preds = %while.cond
				ret void
				}

				; CHECK: define dso_local void @bar(ptr nocapture noundef readonly %[[p:.*]])
				; CHECK: %[[v1:.*]] = tail call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i64 immarg 0, i32 immarg 1, i64 immarg 0)
				; CHECK-SAME: #[[attrs:.*]], !tbaa
				; CHECK-NEXT: tail call void @consume(i32 noundef %[[v1]])
				; CHECK-NEXT: %[[v2:.*]] = tail call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr readonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i64 immarg 0, i32 immarg 1, i64 immarg 1)
				; CHECK-SAME: #[[attrs]], !tbaa
				; CHECK-NEXT: tail call void @consume(i32 noundef %[[v2]])
				; CHECK: attributes #[[attrs]] = { memory(argmem: read) }

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #1

				declare void @consume(i32 noundef) #2

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #3

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }
				attributes #2 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #3 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !3, i64 0}
				!3 = !{!"int", !4, i64 0}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}
				!6 = distinct !{!6, !7, !8}
				!7 = !{!"llvm.loop.mustprogress"}
				!8 = !{!"llvm.loop.unroll.full"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-volatile.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of a volatile load instruction by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int _;
				; volatile int a;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p) {
				; consume(p->a);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%1 = load volatile i32, ptr %a, align 4, !tbaa !2
				call void @consume(i32 noundef %1)
				ret void
				}

				; CHECK: %[[a1:.*]] = call i32 (ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.load.i32
				; CHECK-SAME: (ptr elementtype(%struct.foo) %{{[^,]+}},
				; CHECK-SAME: i1 true, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				; ^^^^^^^^
				; volatile
				; CHECK-NOT: #{{[0-9]+}}
				; CHECK-NEXT: call void @consume(i32 noundef %[[a1]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/load-zero.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that loads from zero offset are not modified by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p) {
				; consume(p->a);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 0
				%1 = load i32, ptr %a, align 4, !tbaa !2
				call void @consume(i32 noundef %1)
				ret void
				}

				; CHECK: entry:
				; CHECK-NEXT: %[[a:.]] = getelementptr inbounds %struct.foo, ptr %[[p:.]], i32 0, i32 0
				; CHECK-NEXT: %[[v2:.*]] = load i32, ptr %[[a]], align 4, !tbaa
				; CHECK-NEXT: call void @consume(i32 noundef %[[v2]])

				declare void @consume(i32 noundef) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 0}
				!3 = !{!"foo", !4, i64 0}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-align.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of a store instruction for a field with non-standard
				; alignment by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; typedef int aligned_int __attribute__((aligned(128)));
				;
				; struct foo {
				; int _;
				; aligned_int a;
				; } __ctx;
				;
				; void bar(struct foo *p) {
				; p->a = 7;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, [124 x i8], i32, [124 x i8] }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 2
				store i32 7, ptr %a, align 128, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @bar(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK: tail call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 7,
				; CHECK-SAME: ptr writeonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 7, i1 true, i32 immarg 0, i32 immarg 2)
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK: attributes #[[v2]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 128}
				!3 = !{!"foo", !4, i64 0, !4, i64 128}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-atomic.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of atomic store instruction by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int _;
				; int a;
				; } __ctx;
				;
				; void bar(struct foo *p) {
				; int r;
				; r = 7;
				; __atomic_store(&p->a, &r, 3);
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				store atomic i32 7, ptr %a release, align 4
				ret void
				}

				; CHECK: define dso_local void @bar(ptr nocapture noundef %[[p:.*]])
				; CHECK: tail call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 7,
				; CHECK-SAME: ptr elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 5, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				; CHECK-NOT: #{{[0-9]+}}
				; CHECK-NEXT: ret void

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #2

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }
				attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-2.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds GEP chains that end by
				; getelementptr.and.store.
				;
				; Source (modified by hand):
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; void buz(struct foo *p) {
				; p->b.bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -
				;
				; And modified to fold last getelementptr/store as a single
				; getelementptr.and.store.

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.store.i32
				(i32 42,
				ptr writeonly elementtype(%struct.bar) %b,
				i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				#3, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr noundef %[[p:.*]])
				; CHECK: call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 42,
				; CHECK-SAME: ptr writeonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK: attributes #[[v2]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nounwind willreturn
				declare void @llvm.bpf.getelementptr.and.store.i32(i32, ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nounwind willreturn }
				attributes #3 = { memory(argmem: write) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-oob.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset keeps track of 'inbounds' flags while
				; folding chain of GEP instructions.
				;
				; Source (IR modified by hand):
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; void buz(struct foo *p) {
				; p->b.bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -
				;
				; Modified to remove one of the 'inbounds' from one of the GEP instructions.

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%bb = getelementptr %struct.bar, ptr %b, i32 0, i32 1
				store i32 42, ptr %bb, align 4, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK: tail call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 42,
				; CHECK-SAME: ptr writeonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 false, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK: attributes #[[v2]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-u8-oob.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds chain of GEP instructions.
				; The GEP chain in this example has type mismatch and thus is
				; folded as i8 access.
				;
				; Source (modified by hand):
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; char aa;
				; char bb;
				; };
				;
				; struct bar {
				; char a;
				; struct foo b;
				; } __ctx;
				;
				; void buz(struct bar *p) {
				; ((struct foo )(((char)&p->b) + 1))->bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -
				;
				; Modified to remove one of the 'inbounds' from one of the getelementptr.

				%struct.bar = type { i8, %struct.foo }
				%struct.foo = type { i8, i8 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.bar, ptr %0, i32 0, i32 1
				%add.ptr = getelementptr i8, ptr %b, i64 1
				%bb = getelementptr inbounds %struct.foo, ptr %add.ptr, i32 0, i32 1
				store i8 42, ptr %bb, align 1, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK: tail call void (i8, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i8
				; CHECK-SAME: (i8 42,
				; CHECK-SAME: ptr writeonly elementtype(i8) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 0, i1 false, i64 immarg 3)
				; CHECK-SAME: #[[v2:.]], !tbaa ![[v3:.]]
				; CHECK: attributes #[[v2]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 1}
				!3 = !{!"foo", !4, i64 0, !4, i64 1}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-u8.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds chain of GEP instructions.
				; The GEP chain in this example has type mismatch and thus is
				; folded as i8 access.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; char aa;
				; char bb;
				; };
				;
				; struct bar {
				; char a;
				; struct foo b;
				; } __ctx;
				;
				; void buz(struct bar *p) {
				; ((struct foo )(((char)&p->b) + 1))->bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.bar = type { i8, %struct.foo }
				%struct.foo = type { i8, i8 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.bar, ptr %0, i32 0, i32 1
				%add.ptr = getelementptr inbounds i8, ptr %b, i64 1
				; ~~
				; these types do not match, thus GEP chain is folded as an offset
				; ~~~~~~~~~~~
				%bb = getelementptr inbounds %struct.foo, ptr %add.ptr, i32 0, i32 1
				store i8 42, ptr %bb, align 1, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK: tail call void (i8, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i8
				; CHECK-SAME: (i8 42,
				; CHECK-SAME: ptr writeonly elementtype(i8) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 0, i1 true, i64 immarg 3)
				; CHECK-SAME: #[[v2:.]], !tbaa ![[v3:.]]
				; CHECK: attributes #[[v2]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 1}
				!3 = !{!"foo", !4, i64 0, !4, i64 1}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that bpf-preserve-static-offset folds chain of GEP instructions.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; void buz(struct foo *p) {
				; p->b.bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%bb = getelementptr inbounds %struct.bar, ptr %b, i32 0, i32 1
				store i32 42, ptr %bb, align 4, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK: tail call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 42,
				; CHECK-SAME: ptr writeonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK: attributes #[[v2]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/store-pai.ll

This file was added.

				; RUN: opt -passes=bpf-preserve-static-offset -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				; #define __pai __attribute__((preserve_access_index))
				;
				; struct foo {
				; char a[10];
				; } __pai;
				;
				; struct bar {
				; int a;
				; int b;
				; } __pai;
				;
				; struct buz {
				; int _1;
				; int _2;
				; int _3;
				; union {
				; struct foo a;
				; struct bar b[7];
				; };
				; } __pai __ctx;
				;
				; void quux(struct buz *p) {
				; p->b[5].b = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes \
				; -debug-info-kind=limited -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.buz = type { i32, i32, i32, %union.anon }
				%union.anon = type { [7 x %struct.bar] }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @quux(ptr noundef %p) #0 !dbg !31 {
				entry:
				call void @llvm.dbg.value(metadata ptr %p, metadata !36, metadata !DIExpression()), !dbg !37
				%0 = call ptr @llvm.preserve.static.offset(ptr %p), !dbg !38
				%1 = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.buz) %0, i32 3, i32 3), !dbg !38, !llvm.preserve.access.index !4
				%2 = call ptr @llvm.preserve.union.access.index.p0.p0(ptr %1, i32 1), !dbg !38, !llvm.preserve.access.index !3
				%3 = call ptr @llvm.preserve.array.access.index.p0.p0(ptr elementtype([7 x %struct.bar]) %2, i32 1, i32 5), !dbg !39, !llvm.preserve.access.index !21
				%4 = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.bar) %3, i32 1, i32 1), !dbg !40, !llvm.preserve.access.index !22
				store i32 42, ptr %4, align 4, !dbg !41, !tbaa !42
				ret void, !dbg !45
				}

				; CHECK: define dso_local void @quux(ptr noundef %[[p:.]]) {{.}} {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: call void @llvm.dbg.value
				; CHECK-NEXT: call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 42,
				; CHECK-SAME: ptr writeonly elementtype(i8) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i64 immarg 56)
				; CHECK-SAME: #[[v5:.*]], !tbaa
				; CHECK-NEXT: ret void, !dbg
				; CHECK-NEXT: }
				; CHECK: attributes #[[v5]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32 immarg, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.union.access.index.p0.p0(ptr, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
				declare ptr @llvm.preserve.array.access.index.p0.p0(ptr, i32 immarg, i32 immarg) #2

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.value(metadata, metadata, metadata) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nosync nounwind willreturn memory(none) }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!28, !29}
				!llvm.ident = !{!30}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, retainedTypes: !2, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "some-file.c", directory: "/some/dir/")
				!2 = !{!3, !21}
				!3 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !4, file: !1, line: 17, size: 448, elements: !11)
				!4 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "buz", file: !1, line: 13, size: 544, elements: !5)
				!5 = !{!6, !8, !9, !10}
				!6 = !DIDerivedType(tag: DW_TAG_member, name: "_1", scope: !4, file: !1, line: 14, baseType: !7, size: 32)
				!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!8 = !DIDerivedType(tag: DW_TAG_member, name: "_2", scope: !4, file: !1, line: 15, baseType: !7, size: 32, offset: 32)
				!9 = !DIDerivedType(tag: DW_TAG_member, name: "_3", scope: !4, file: !1, line: 16, baseType: !7, size: 32, offset: 64)
				!10 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !1, line: 17, baseType: !3, size: 448, offset: 96)
				!11 = !{!12, !20}
				!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !3, file: !1, line: 18, baseType: !13, size: 80)
				!13 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "foo", file: !1, line: 4, size: 80, elements: !14)
				!14 = !{!15}
				!15 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !13, file: !1, line: 5, baseType: !16, size: 80)
				!16 = !DICompositeType(tag: DW_TAG_array_type, baseType: !17, size: 80, elements: !18)
				!17 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
				!18 = !{!19}
				!19 = !DISubrange(count: 10)
				!20 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !3, file: !1, line: 19, baseType: !21, size: 448)
				!21 = !DICompositeType(tag: DW_TAG_array_type, baseType: !22, size: 448, elements: !26)
				!22 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "bar", file: !1, line: 8, size: 64, elements: !23)
				!23 = !{!24, !25}
				!24 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !22, file: !1, line: 9, baseType: !7, size: 32)
				!25 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !22, file: !1, line: 10, baseType: !7, size: 32, offset: 32)
				!26 = !{!27}
				!27 = !DISubrange(count: 7)
				!28 = !{i32 2, !"Debug Info Version", i32 3}
				!29 = !{i32 1, !"wchar_size", i32 4}
				!30 = !{!"clang"}
				!31 = distinct !DISubprogram(name: "quux", scope: !1, file: !1, line: 23, type: !32, scopeLine: 23, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0, retainedNodes: !35)
				!32 = !DISubroutineType(types: !33)
				!33 = !{null, !34}
				!34 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !4, size: 64)
				!35 = !{!36}
				!36 = !DILocalVariable(name: "p", arg: 1, scope: !31, file: !1, line: 23, type: !34)
				!37 = !DILocation(line: 0, scope: !31)
				!38 = !DILocation(line: 24, column: 6, scope: !31)
				!39 = !DILocation(line: 24, column: 3, scope: !31)
				!40 = !DILocation(line: 24, column: 11, scope: !31)
				!41 = !DILocation(line: 24, column: 13, scope: !31)
				!42 = !{!43, !43, i64 0}
				!43 = !{!"omnipotent char", !44, i64 0}
				!44 = !{!"Simple C/C++ TBAA"}
				!45 = !DILocation(line: 25, column: 1, scope: !31)

llvm/test/CodeGen/BPF/preserve-static-offset/store-simple.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of a simple store instruction by bpf-preserve-static-offset.
				; Verify:
				; - presence of gep.and.store intrinsic call
				; - correct attributes for intrinsic call
				; - presence of tbaa annotations
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int _;
				; int a;
				; } __ctx;
				;
				; void bar(struct foo *p) {
				; p->a = 7;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				store i32 7, ptr %a, align 4, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @bar(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK: tail call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 7,
				; CHECK-SAME: ptr writeonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				; CHECK-SAME: #[[v2:.*]], !tbaa
				; CHECK: attributes #[[v2]] = { memory(argmem: write) }

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-align.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that getelementptr.and.store unroll restores alignment spec.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; typedef int aligned_int __attribute__((aligned(128)));
				;
				; struct foo {
				; int _;
				; aligned_int a;
				; } __ctx;
				;
				; void bar(struct foo *p) {
				; p->a = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, [124 x i8], i32, [124 x i8] }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.store.i32
				(i32 42,
				ptr writeonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 7, i1 true, i32 immarg 0, i32 immarg 2)
				#3, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.*]])
				; CHECK: %[[v2:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 2
				; CHECK: store i32 42, ptr %[[v2]], align 128
				; CHECK: ret void

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nounwind willreturn
				declare void @llvm.bpf.getelementptr.and.store.i32(i32, ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nounwind willreturn }
				attributes #3 = { memory(argmem: write) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 128}
				!3 = !{!"foo", !4, i64 0, !4, i64 128}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-chain-oob.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that getelementptr.and.load unroll can skip 'inbounds' flag.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; void buz(struct foo *p) {
				; p->b.bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.store.i32
				(i32 42,
				ptr writeonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 2, i1 false, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				#3, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr noundef %[[p:.*]])
				; CHECK: %[[v2:.*]] = getelementptr %struct.foo, ptr %[[p]], i32 0, i32 1, i32 1
				; CHECK: store i32 42, ptr %[[v2]], align 4
				; CHECK: ret void

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nounwind willreturn
				declare void @llvm.bpf.getelementptr.and.store.i32(i32, ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nounwind willreturn }
				attributes #3 = { memory(argmem: write) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-chain-u8.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check unroll of getelementptr.and.store when direct memory offset is
				; used instead of field indexes.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; char aa;
				; char bb;
				; };
				;
				; struct bar {
				; char a;
				; struct foo b;
				; } __ctx;
				;
				; void buz(struct bar *p) {
				; ((struct foo )(((char)&p->b) + 1))->bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				call void (i8, ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.store.i8
				(i8 42,
				ptr writeonly elementtype(i8) %p,
				i1 false, i8 0, i8 1, i8 0, i1 true, i64 immarg 3)
				#3, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr noundef %[[p:.*]])
				; CHECK: %[[v2:.*]] = getelementptr inbounds i8, ptr %[[p]], i64 3
				; CHECK: store i8 42, ptr %[[v2]], align 1

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nounwind willreturn
				declare void @llvm.bpf.getelementptr.and.store.i8(i8, ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nounwind willreturn }
				attributes #3 = { memory(argmem: write) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 1}
				!3 = !{!"foo", !4, i64 0, !4, i64 1}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-chain.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check unroll of getelementptr.and.store when several field indexes
				; are specified in a chain.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct bar {
				; int aa;
				; int bb;
				; };
				;
				; struct foo {
				; int a;
				; struct bar b;
				; } __ctx;
				;
				; void buz(struct foo *p) {
				; p->b.bb = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, %struct.bar }
				%struct.bar = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @buz(ptr noundef %p) #0 {
				entry:
				call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.store.i32
				(i32 42,
				ptr writeonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1, i32 immarg 1)
				#3, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @buz(ptr noundef %[[p:.*]])
				; CHECK: %[[v2:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 1, i32 1
				; CHECK: store i32 42, ptr %[[v2]], align 4
				; CHECK: ret void

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nounwind willreturn
				declare void @llvm.bpf.getelementptr.and.store.i32(i32, ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nounwind willreturn }
				attributes #3 = { memory(argmem: write) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 8}
				!3 = !{!"foo", !4, i64 0, !7, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!"bar", !4, i64 0, !4, i64 4}

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-simple.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check unroll of getelementptr.and.store.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; int b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p){
				; p->b = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.store.i32
				(i32 42,
				ptr writeonly elementtype(%struct.foo) %p,
				i1 false, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1)
				#3, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.*]])
				; CHECK: %[[v2:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 1
				; CHECK: store i32 42, ptr %[[v2]], align 4

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nounwind willreturn
				declare void @llvm.bpf.getelementptr.and.store.i32(i32, ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nounwind willreturn }
				attributes #3 = { memory(argmem: write) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-volatile.ll

This file was added.

				; RUN: opt --bpf-check-and-opt-ir -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that unroll of getelementptr.and.store restores volatile.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; volatile int b;
				; } __ctx;
				;
				; extern void consume(int);
				;
				; void bar(struct foo *p){
				; p->b = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=sroa,bpf-preserve-static-offset -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				@llvm.bpf.getelementptr.and.store.i32
				(i32 42,
				ptr elementtype(%struct.foo) %p,
				i1 true, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1),
				!tbaa !2
				ret void
				}

				; CHECK: define dso_local void @bar(ptr noundef %[[p:.*]])
				; CHECK: entry:
				; CHECK: %[[v2:.*]] = getelementptr inbounds %struct.foo, ptr %[[p]], i32 0, i32 1
				; CHECK: store volatile i32 42, ptr %[[v2]], align 4

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				; Function Attrs: nocallback nofree nounwind willreturn
				declare void @llvm.bpf.getelementptr.and.store.i32(i32, ptr nocapture, i1 immarg, i8 immarg, i8 immarg, i8 immarg, i1 immarg, ...) #2

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
				attributes #2 = { nocallback nofree nounwind willreturn }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-unroll-inline.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check position of bpf-preserve-static-offset pass in the pipeline:
				; - preserve.static.offset call is preserved if address is passed as
				; a parameter to an inline-able function;
				; - second bpf-preserve-static-offset pass (after inlining) should introduce
				; getelementptr.and.store call using the preserved marker after loops
				; unrolling;
				; - memory(argmem: readwrite) and tbaa attributes should allow
				; removing one getelementptr.and.store call.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; int b[4];
				; } __ctx;
				;
				; static inline void bar(int * restrict p, unsigned long i) {
				; p[0] = i;
				; }
				;
				; void quux(struct foo *p){
				; unsigned long i = 0;
				; #pragma clang loop unroll(full)
				; while (i < 2) {
				; bar(p->b, i);
				; ++i;
				; }
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, [4 x i32] }

				; Function Attrs: nounwind
				define dso_local void @quux(ptr noundef %p) #0 {
				entry:
				br label %while.cond

				while.cond: ; preds = %while.body, %entry
				%i.0 = phi i64 [ 0, %entry ], [ %inc, %while.body ]
				%cmp = icmp ult i64 %i.0, 2
				br i1 %cmp, label %while.body, label %while.end

				while.body: ; preds = %while.cond
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				%arraydecay = getelementptr inbounds [4 x i32], ptr %b, i64 0, i64 0
				call void @bar(ptr noundef %arraydecay, i64 noundef %i.0)
				%inc = add i64 %i.0, 1
				br label %while.cond, !llvm.loop !2

				while.end: ; preds = %while.cond
				ret void
				}

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #1

				; Function Attrs: inlinehint nounwind
				define internal void @bar(ptr noalias noundef %p, i64 noundef %i) #2 {
				entry:
				%conv = trunc i64 %i to i32
				%arrayidx = getelementptr inbounds i32, ptr %p, i64 0
				store i32 %conv, ptr %arrayidx, align 4, !tbaa !5
				ret void
				}

				; CHECK: define dso_local void @quux(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK-NEXT: entry:
				; CHECK-NEXT: tail call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 1,
				; CHECK-SAME: ptr writeonly elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 false, i8 0, i8 1, i8 2, i1 true, i64 immarg 0, i32 immarg 1)
				; CHECK-NEXT: ret void

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #3

				; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
				declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }
				attributes #2 = { inlinehint nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #3 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = distinct !{!2, !3, !4}
				!3 = !{!"llvm.loop.mustprogress"}
				!4 = !{!"llvm.loop.unroll.full"}
				!5 = !{!6, !6, i64 0}
				!6 = !{!"int", !7, i64 0}
				!7 = !{!"omnipotent char", !8, i64 0}
				!8 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-volatile.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check handling of a volatile store instruction by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; volatile int b;
				; } __ctx;
				;
				; void bar(struct foo *p) {
				; p->b = 42;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32, i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%b = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
				store volatile i32 42, ptr %b, align 4, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @bar(ptr nocapture noundef %[[p:.*]])
				; CHECK: tail call void (i32, ptr, i1, i8, i8, i8, i1, ...)
				; CHECK-SAME: @llvm.bpf.getelementptr.and.store.i32
				; CHECK-SAME: (i32 42,
				; CHECK-SAME: ptr elementtype(%struct.foo) %[[p]],
				; CHECK-SAME: i1 true, i8 0, i8 1, i8 2, i1 true, i32 immarg 0, i32 immarg 1),
				; CHECK-NOT: #{{[0-9]+}}
				; CHECK-SAME: !tbaa

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 4}
				!3 = !{!"foo", !4, i64 0, !4, i64 4}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

llvm/test/CodeGen/BPF/preserve-static-offset/store-zero.ll

This file was added.

				; RUN: opt -O2 -mtriple=bpf-pc-linux -S -o - %s \| FileCheck %s
				;
				; Check that stores from zero offset are not modified by bpf-preserve-static-offset.
				;
				; Source:
				; #define __ctx __attribute__((preserve_static_offset))
				;
				; struct foo {
				; int a;
				; } __ctx;
				;
				; void bar(struct foo *p) {
				; p->a = 0;
				; }
				;
				; Compilation flag:
				; clang -cc1 -O2 -triple bpf -S -emit-llvm -disable-llvm-passes -o - \
				; \| opt -passes=function(sroa) -S -o -

				%struct.foo = type { i32 }

				; Function Attrs: nounwind
				define dso_local void @bar(ptr noundef %p) #0 {
				entry:
				%0 = call ptr @llvm.preserve.static.offset(ptr %p)
				%a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 0
				store i32 0, ptr %a, align 4, !tbaa !2
				ret void
				}

				; CHECK: define dso_local void @bar(ptr nocapture noundef writeonly %[[p:.*]])
				; CHECK-NEXT: entry:
				; CHECK-NEXT: store i32 0, ptr %[[p]], align 4, !tbaa
				; CHECK-NEXT: ret void

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare ptr @llvm.preserve.static.offset(ptr readnone) #1

				attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang"}
				!2 = !{!3, !4, i64 0}
				!3 = !{!"foo", !4, i64 0}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}

This is an archive of the discontinued LLVM Phabricator instance.

[BPF] Attribute preserve_static_offset for structsClosedPublic

Details

Diff Detail

Event Timeline

bpf_flow.bpf.o

connect4_prog.bpf.o

test_cls_redirect.bpf.o

test_misc_tcp_hdr_options.bpf.o

test_tcpnotify_kern.bpf.o

netns_cookie_prog.bpf.o

sock_destroy_prog.bpf.o

socket_cookie_prog.bpf.o

test_lwt_reroute.bpf.o

test_sockmap_invalid_update.bpf.o

type_cast.bpf.o

core_kern.bpf.o, test_verif_scale2.bpf.o

Revision Contents

Diff 558212

clang/include/clang/Basic/Attr.td

clang/include/clang/Basic/AttrDocs.td

clang/lib/CodeGen/CGExpr.cpp

clang/lib/Sema/SemaDeclAttr.cpp

clang/test/CodeGen/bpf-preserve-static-offset-arr.c

clang/test/CodeGen/bpf-preserve-static-offset-bitfield.c

clang/test/CodeGen/bpf-preserve-static-offset-lvalue.c

clang/test/CodeGen/bpf-preserve-static-offset-non-bpf.c

clang/test/CodeGen/bpf-preserve-static-offset-pai.c

clang/test/Misc/pragma-attribute-supported-attributes-list.test

clang/test/Sema/bpf-attr-preserve-static-offset-warns-nonbpf.c

clang/test/Sema/bpf-attr-preserve-static-offset-warns.c

clang/test/Sema/bpf-attr-preserve-static-offset.c

llvm/include/llvm/IR/Intrinsics.td

llvm/include/llvm/IR/IntrinsicsBPF.td

llvm/lib/Target/BPF/BPF.h

llvm/lib/Target/BPF/BPFAbstractMemberAccess.cpp

llvm/lib/Target/BPF/BPFCORE.h

llvm/lib/Target/BPF/BPFCheckAndAdjustIR.cpp

llvm/lib/Target/BPF/BPFPreserveStaticOffset.cpp

llvm/lib/Target/BPF/BPFTargetMachine.cpp

llvm/lib/Target/BPF/CMakeLists.txt

llvm/test/CodeGen/BPF/preserve-static-offset/load-align.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-arr-pai.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-atomic.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-2.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-oob.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-u8-oob.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-u8-type-mismatch.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain-u8.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-chain.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-inline.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-non-const.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-ptr-pai.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-simple.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-struct-pai.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-align.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-chain-oob.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-chain-u8.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-chain.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-simple.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-undo-volatile.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-union-pai.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-unroll-inline.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-unroll.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-volatile.ll

llvm/test/CodeGen/BPF/preserve-static-offset/load-zero.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-align.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-atomic.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-2.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-oob.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-u8-oob.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain-u8.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-chain.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-pai.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-simple.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-align.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-chain-oob.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-chain-u8.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-chain.ll

llvm/test/CodeGen/BPF/preserve-static-offset/store-undo-simple.ll

[BPF] Attribute preserve_static_offset for structs
ClosedPublic