Details

Reviewers

eli.friedman
ast
lebedev.ri

Commits

rGe3919c6baf9a: [BPF] add new intrinsics preserve_{array,union,struct}_access_index
rL365423: [BPF] add new intrinsics preserve_{array,union,struct}_access_index
rG75c2a6709e80: [BPF] add new intrinsics preserve_{array,union,struct}_access_index
rL365352: [BPF] add new intrinsics preserve_{array,union,struct}_access_index

Summary

For background on the BPF CO-RE project, please refer to

http://vger.kernel.org/bpfconf2019.html

In summary, BPF CO-RE intends to compile bpf programs
that can be adjusted for struct/union layout changes, so the same
program can run on multiple kernels, with adjustment
before loading based on the native kernel structures.

In order to do this, we need to keep track of the GEP (getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information in an IR optimization pass is hard,
as various optimizations may have tweaked the GEP. Also, because
a union is replaced by a structure in IR, it is impossible to track
the field index for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introduced.

addr = preserve_array_access_index(base, index, dimension)
addr = preserve_union_access_index(base, di_index)
addr = preserve_struct_access_index(base, gep_index, di_index)

here,

base: the base pointer for the array/union/struct access.
index: the last access index for array, the same for IR/DebugInfo layout.
dimension: the array dimension.
gep_index: the access index based on IR layout.
di_index: the access index based on user/debuginfo types.

For example, consider the following:

$ cat test.c
struct sk_buff {
   int i;
   int b1:1;
   int b2:2;
   union {
     struct {
       int o1;
       int o2;
     } o;
     struct {
       char flags;
       char dev_id;
     } dev;
     int netid;
   } u[10];
};  

static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
    = (void *) 4;

#define _(x) (__builtin_preserve_access_index(x))

int bpf_prog(struct sk_buff *ctx) {
  char dev_id;
  bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
  return dev_id;
}
$ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
  test.c >& log

The generated IR looks like below:

...
define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
  %2 = alloca %struct.sk_buff*, align 8
  %3 = alloca i8, align 1
  store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
  call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
  call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
  call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
  %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
  %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
  %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
       %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
  %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
       [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
  %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
       %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
  %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
  %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
       %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
  %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
  %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
  %13 = sext i8 %12 to i32, !dbg !54
  call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
  ret i32 %13, !dbg !57
}

!19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
!26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
!34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,

. The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
. The "%7 = ..." represents array subscript "5".
. The "%8 = ..." represents union member "dev" with index 1 for DI layout.
. The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, by traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access indices
can be recovered.
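The traversal described above can be sketched in plain C. This is only an illustrative model, not the BPF backend's code: the `struct access` record, its field names, and `collect_di_indices` are hypothetical, and the bitcast at "%9 = ..." is simply skipped. Each record stands for one preserve_*_access_index call and points at the call that produced its base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one preserve_*_access_index call. */
enum access_kind { ACCESS_STRUCT, ACCESS_ARRAY, ACCESS_UNION };

struct access {
    enum access_kind kind;
    int gep_index;             /* IR-layout index; -1 where it does not apply */
    int di_index;              /* debuginfo index, or the array subscript */
    const struct access *base; /* call that produced this call's base, or NULL */
};

/* The chain for &ctx->u[5].dev.dev_id, taken from the IR in the summary. */
static const struct access acc_u      = { ACCESS_STRUCT,  2, 3, NULL };      /* %6  */
static const struct access acc_elem   = { ACCESS_ARRAY,  -1, 5, &acc_u };    /* %7  */
static const struct access acc_dev    = { ACCESS_UNION,  -1, 1, &acc_elem }; /* %8  */
static const struct access acc_dev_id = { ACCESS_STRUCT,  1, 1, &acc_dev };  /* %10 */

/*
 * Walk from the last call back to the root, then store the debuginfo
 * indices root-first. Returns the number of indices, or 0 if `out`
 * is too small.
 */
static size_t collect_di_indices(const struct access *last, int *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    for (const struct access *a = last; a != NULL; a = a->base)
        n++;
    if (n > cap)
        return 0;

    size_t i = n;
    for (const struct access *a = last; a != NULL; a = a->base)
        out[--i] = a->di_index;
    return n;
}
```

Calling `collect_di_indices(&acc_dev_id, out, 8)` fills `out` with the indices 3, 5, 1, 1, matching the per-call description of "%6", "%7", "%8" and "%10" above.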

The intrinsics also contain enough information to regenerate code for the IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.
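As a rough illustration of that lowering, the sketch below models each intrinsic as plain pointer arithmetic on the host. It is an assumption-laden model, not the backend's implementation: the helper names are hypothetical, and the union and inner struct from test.c are given names (`union sk_u`, `struct dev_info`) only so that `offsetof` and `sizeof` can refer to them.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as test.c, with hypothetical names for the anonymous types. */
struct dev_info { char flags; char dev_id; };
union sk_u {
    struct { int o1; int o2; } o;
    struct dev_info dev;
    int netid;
};
struct sk_buff {
    int i;
    int b1:1;
    int b2:2;
    union sk_u u[10];
};

/* Struct access: equivalent to a GEP selecting one member. */
static char *struct_member(void *base, size_t member_offset)
{
    assert(base != NULL);
    return (char *)base + member_offset;
}

/* Array access: equivalent to a GEP selecting one element. */
static char *array_element(void *base, size_t elem_size, size_t index)
{
    assert(base != NULL);
    return (char *)base + elem_size * index;
}

/* Union access: all members share the base address, so this is the identity. */
static void *union_member(void *base)
{
    return base;
}

/* Address of ctx->u[5].dev.dev_id, built step by step like the IR above. */
static char *dev_id_addr(struct sk_buff *ctx)
{
    char *u    = struct_member(ctx, offsetof(struct sk_buff, u));
    char *elem = array_element(u, sizeof(union sk_u), 5);
    void *dev  = union_member(elem);
    return struct_member(dev, offsetof(struct dev_info, dev_id));
}
```

On the host, `dev_id_addr(&s)` should equal `&s.u[5].dev.dev_id`; for relocation, the point is that the offsets fed into each step would come from the target kernel's BTF rather than from the compile-time layout.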

Diff Detail

Repository: rL LLVM

Event Timeline

yonghong-song created this revision. May 10 2019, 3:38 PM

Herald added a project: Restricted Project. May 10 2019, 3:38 PM

Herald added subscribers: llvm-commits, arphaman, kosarev.

Missing langref changes.
What about other targets, does this need some default expansion?

@lebedev.ri Thanks for looking at this patch. The following two patches are related, so I have added them here.

https://reviews.llvm.org/D61809
https://reviews.llvm.org/D61524

Could you be more specific about where to add the langref for this intrinsic?

Currently, I am guarding the usage of the intrinsics in clang under bpf, so other targets won't be able to use them.
If there is demand, we can implement a callback function in TargetInfo so that different targets can decide whether
they want to use these intrinsics.

We can implement the default expansion; do you know the proper place to do that?

yonghong-song added a child revision: D61524: [BPF] Support for compile once and run everywhere. May 10 2019, 4:01 PM

yonghong-song added a child revision: D61809: [BPF] Preserve debuginfo array/union/struct type/access index.

split intrinsics into three, and add langref.

@lebedev.ri Regarding the comment "What about other targets, does this need some default expansion?": currently these intrinsics are only used by bpf and are lowered to GEPs in an IR pass implemented by the bpf backend. I can certainly implement some default expansion if it is desirable. Is SelectionDAG the proper place to implement the default lowering?

jdoerfert added a subscriber: jdoerfert. May 16 2019, 12:24 PM

efriedma added a subscriber: efriedma. May 16 2019, 12:33 PM

efriedma added inline comments.

docs/LangRef.rst
16961 ↗	(On Diff #199844)	inst_name seems unnecessary. Don't you need some argument to indicate the type of the array elements?
16992 ↗	(On Diff #199844)	How do you plan to use this?
17025 ↗	(On Diff #199844)	I'm not sure about depending on digging the necessary information out of debug info, as opposed to generating some dedicated data structure. It feels more complicated, since it's not clear what parts of the debug info, exactly, you're depending on. Looking up the struct type by name seems error-prone; if you're depending on the debug info anyway, could you pass the debug info for the type directly as a metadata argument?
include/llvm/IR/IRBuilder.h
2270 ↗	(On Diff #199844)	Don't pass StringRefs by const reference (see https://llvm.org/docs/ProgrammersManual.html#the-stringref-class ).

@eli.friedman Thanks for the review. I just did some experiments with some performance-sensitive bpf programs. It looks like the approach I am taking here causes, in the worst case, 30% more instructions to be generated by llvm at the -O2 level.
So the current approach, which blindly converts GEPs to intrinsics and later converts the unnecessary intrinsics back to regular GEPs and the necessary ones to relocatable GEPs, won't work.
The main reasons for the performance degradation are:

. intrinsics cause less optimal code, which may contribute a 5% performance loss for relatively small routines; for large routines, I have seen losses of more than 20%.
. intrinsics impact inlining decisions, and less inlining has a negative impact on performance.

Therefore, instead of blindly converting GEPs to intrinsics, I will try a different approach: design some bpf-specific intrinsics to help the bpf backend generate relocatable GEPs. Considering the potential performance impact,
this approach will only be used when people are willing to trade some performance for portability, in cases where bpf performance is not super critical.

include/llvm/IR/IRBuilder.h
2270 ↗	(On Diff #199844)	Thanks for letting me know. Will correct this in all other occasions as well.

> intrinsics cause less optimal code, which may contribute a 5% performance loss for relatively small routines; for large routines, I have seen losses of more than 20%.

Could you clarify what, exactly, the issue is here? Is the problem extra arithmetic? Extra memory operations? Something else? You could probably mitigate the impact here with a few small changes to optimizations. Or maybe you could make the intrinsic return the relevant offset, rather than actually performing the GEP itself. (The key here is that you have to make sure the offset is opaque to the optimizer, so it doesn't make illegal transforms.)

There's probably some fundamental performance loss involved in making the struct offsets opaque values, no matter how that is represented in the IR.

> intrinsics impact inlining decisions, and less inlining has a negative impact on performance.

This is something you can probably fix with a trivial tweak to the inlining heuristics. The inliner asks the target for the cost of a call using TargetTransformInfo.

change new intrinsic format, add num_of_zeros for array, remove inst_name

This update of the three patches consists mostly of a few intrinsic format changes (array: +num_of_zeros for gep; array/struct: -inst_name) and the corresponding adjustments to fix bugs.
This will unblock other kernel bpf developers, who can work in parallel on the kernel side.

I have not looked at the performance aspect yet. That is what I will do next, to see where the performance loss comes from and how to avoid it.
I have not explored the approach of adding metadata to instructions/intrinsics yet. I will do that once I have done some performance analysis to ensure the
current approach does not cause performance degradation for perf-critical bpf applications.

docs/LangRef.rst
16961 ↗	(On Diff #199844)	I include inst_name only to generate original GEP similar to what clang did. But I agree that inst_name is unnecessary. The <type2> in the above is the pointer to the array element type.

I did some analysis on the performance side, specifically, # of instructions generated.
My example is cilium at https://github.com/cilium/cilium/tree/master/bpf, bpf program bpf_lxc.c.

(1). First, I made the following change:
-bash-4.4$ git diff
diff --git a/include/llvm/Analysis/TargetTransformInfoImpl.h b/include/llvm/Analysis/TargetTransformInfoImpl.h
index a1e1f9b07aa..544bbc93a4f 100644
--- a/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -781,6 +781,9 @@ public:
     case Intrinsic::coro_suspend:
     case Intrinsic::coro_param:
     case Intrinsic::coro_subfn_addr:
+    case Intrinsic::preserve_array_access_index:
+    case Intrinsic::preserve_union_access_index:
+    case Intrinsic::preserve_struct_access_index:
       // These intrinsics don't actually represent code after lowering.
       return TTI::TCC_Free;
     }
-bash-4.4$
so that all previously inlined functions are still inlined. There is probably a better way to do this than
just returning TTI::TCC_Free; the change here is just for demonstration purposes.

(2). Make the following change to cilium so that the data array is aligned to 8 bytes and
double-word stores can always be used.

diff --git a/bpf/lib/icmp6.h b/bpf/lib/icmp6.h
index fcae3d3ed..bb397685d 100644
--- a/bpf/lib/icmp6.h
+++ b/bpf/lib/icmp6.h
@@ -224,7 +224,7 @@ static inline be32 compute_icmp6_csum(char data[80], u16 payload_len,
 static inline int icmp6_send_time_exceeded(struct sk_buff *skb, int nh_off)
 {
        /* FIXME: Fix code below to not require this init */
-       char data[80] = {};
+       char data[80] __attribute__((aligned(8))) = {};
        struct icmp6hdr *icmp6hoplim;
        struct ipv6hdr *ipv6hdr;
        char *upper; /* icmp6 or tcp or udp */

(3).
Compile bpf_lxc.c with and without offset relocation, and compare the number of instructions.
Note that bpf_lxc.c does not use bpf_probe_read, so here we purely measure the cost
of preserve_*_access_index in the generated code.

The function sizes without offset reloc:
-bash-4.4$ llvm-readelf -s bpf_lxc.o | grep FUNC

826: 0000000000000000   336 FUNC    GLOBAL DEFAULT    3 __send_drop_notify
829: 0000000000000000   656 FUNC    GLOBAL DEFAULT   27 handle_policy
830: 0000000000000000   944 FUNC    GLOBAL DEFAULT   29 handle_to_container
831: 0000000000000000   720 FUNC    GLOBAL DEFAULT   17 handle_xgress
832: 0000000000000000  1880 FUNC    GLOBAL DEFAULT   15 tail_handle_arp
833: 0000000000000000 28264 FUNC    GLOBAL DEFAULT   13 tail_handle_ipv4
834: 0000000000000000 29800 FUNC    GLOBAL DEFAULT   11 tail_handle_ipv6
835: 0000000000000000  3272 FUNC    GLOBAL DEFAULT    9 tail_icmp6_handle_ns
836: 0000000000000000  2480 FUNC    GLOBAL DEFAULT    5 tail_icmp6_send_echo_reply
837: 0000000000000000  3968 FUNC    GLOBAL DEFAULT    7 tail_icmp6_send_time_exceeded
838: 0000000000000000 11824 FUNC    GLOBAL DEFAULT   23 tail_ipv4_policy
839: 0000000000000000 12568 FUNC    GLOBAL DEFAULT   25 tail_ipv4_to_endpoint
840: 0000000000000000 14144 FUNC    GLOBAL DEFAULT   19 tail_ipv6_policy
841: 0000000000000000 15328 FUNC    GLOBAL DEFAULT   21 tail_ipv6_to_endpoint

The function sizes with offset reloc enabled:
-bash-4.4$ llvm-readelf -s bpf_lxc.o | grep FUNC

851: 0000000000000000   336 FUNC    GLOBAL DEFAULT    3 __send_drop_notify
854: 0000000000000000   680 FUNC    GLOBAL DEFAULT   27 handle_policy
855: 0000000000000000   960 FUNC    GLOBAL DEFAULT   29 handle_to_container
856: 0000000000000000   744 FUNC    GLOBAL DEFAULT   17 handle_xgress
857: 0000000000000000  1904 FUNC    GLOBAL DEFAULT   15 tail_handle_arp
858: 0000000000000000 29048 FUNC    GLOBAL DEFAULT   13 tail_handle_ipv4
859: 0000000000000000 31808 FUNC    GLOBAL DEFAULT   11 tail_handle_ipv6
860: 0000000000000000  3280 FUNC    GLOBAL DEFAULT    9 tail_icmp6_handle_ns
861: 0000000000000000  2528 FUNC    GLOBAL DEFAULT    5 tail_icmp6_send_echo_reply
862: 0000000000000000  4056 FUNC    GLOBAL DEFAULT    7 tail_icmp6_send_time_exceeded
863: 0000000000000000 12224 FUNC    GLOBAL DEFAULT   23 tail_ipv4_policy
864: 0000000000000000 13088 FUNC    GLOBAL DEFAULT   25 tail_ipv4_to_endpoint
865: 0000000000000000 14792 FUNC    GLOBAL DEFAULT   19 tail_ipv6_policy
866: 0000000000000000 15992 FUNC    GLOBAL DEFAULT   21 tail_ipv6_to_endpoint

In summary, with relocatable GEPs, we get a 0-7% regression in function size.
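The 0-7% range can be re-derived from the two llvm-readelf listings above. The following is a small sketch that copies the Size column (in bytes) for each function and computes the relative growth; the struct and function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Function sizes in bytes, copied from the two listings above. */
struct func_size {
    const char *name;
    unsigned before; /* without offset reloc */
    unsigned after;  /* with offset reloc enabled */
};

static const struct func_size sizes[] = {
    { "__send_drop_notify",              336,   336 },
    { "handle_policy",                   656,   680 },
    { "handle_to_container",             944,   960 },
    { "handle_xgress",                   720,   744 },
    { "tail_handle_arp",                1880,  1904 },
    { "tail_handle_ipv4",              28264, 29048 },
    { "tail_handle_ipv6",              29800, 31808 },
    { "tail_icmp6_handle_ns",           3272,  3280 },
    { "tail_icmp6_send_echo_reply",     2480,  2528 },
    { "tail_icmp6_send_time_exceeded",  3968,  4056 },
    { "tail_ipv4_policy",              11824, 12224 },
    { "tail_ipv4_to_endpoint",         12568, 13088 },
    { "tail_ipv6_policy",              14144, 14792 },
    { "tail_ipv6_to_endpoint",         15328, 15992 },
};

/* Relative size growth in percent. */
static double regression_pct(unsigned before, unsigned after)
{
    assert(before > 0);
    return 100.0 * ((double)after - (double)before) / (double)before;
}

/* Smallest regression over all listed functions. */
static double min_regression_pct(void)
{
    double best = regression_pct(sizes[0].before, sizes[0].after);
    for (size_t i = 1; i < sizeof sizes / sizeof sizes[0]; i++) {
        double r = regression_pct(sizes[i].before, sizes[i].after);
        if (r < best)
            best = r;
    }
    return best;
}

/* Largest regression over all listed functions. */
static double max_regression_pct(void)
{
    double worst = regression_pct(sizes[0].before, sizes[0].after);
    for (size_t i = 1; i < sizeof sizes / sizeof sizes[0]; i++) {
        double r = regression_pct(sizes[i].before, sizes[i].after);
        if (r > worst)
            worst = r;
    }
    return worst;
}
```

With these numbers the minimum is 0% (__send_drop_notify is unchanged) and the maximum is a little under 7% (tail_handle_ipv6), which is consistent with the range quoted above.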

Look at function handle_policy(),
good code:

44:       b7 02 00 00 00 00 00 00 r2 = 0
45:       7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2
46:       7b 2a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r2
47:       b7 02 00 00 00 01 00 00 r2 = 256
48:       7b 2a e8 ff 00 00 00 00 *(u64 *)(r10 - 24) = r2
49:       87 01 00 00 00 00 00 00 r1 = -r1
50:       73 1a e8 ff 00 00 00 00 *(u8 *)(r10 - 24) = r1
51:       bf a2 00 00 00 00 00 00 r2 = r10
52:       07 02 00 00 e8 ff ff ff r2 += -24
53:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
55:       85 00 00 00 01 00 00 00 call 1

regression:

44:       b7 02 00 00 00 00 00 00 r2 = 0
45:       7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2
46:       7b 2a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r2
47:       7b 2a e8 ff 00 00 00 00 *(u64 *)(r10 - 24) = r2
48:       87 01 00 00 00 00 00 00 r1 = -r1
49:       73 1a e8 ff 00 00 00 00 *(u8 *)(r10 - 24) = r1
50:       71 a1 e9 ff 00 00 00 00 r1 = *(u8 *)(r10 - 23)
51:       57 01 00 00 fc 00 00 00 r1 &= 252
52:       47 01 00 00 01 00 00 00 r1 |= 1
53:       73 1a e9 ff 00 00 00 00 *(u8 *)(r10 - 23) = r1
54:       bf a2 00 00 00 00 00 00 r2 = r10
55:       07 02 00 00 e8 ff ff ff r2 += -24
56:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
58:       85 00 00 00 01 00 00 00 call 1

Look at function tail_handle_ipv6(),
good code:

49:       85 00 00 00 19 00 00 00 call 25
50:       15 07 0e 00 80 00 00 00 if r7 == 128 goto +14 <LBB4_7>
51:       55 07 54 00 87 00 00 00 if r7 != 135 goto +84 <LBB4_10>
52:       b7 01 00 00 02 00 00 00 r1 = 2
53:       63 16 34 00 00 00 00 00 *(u32 *)(r6 + 52) = r1
54:       b7 01 00 00 0e 00 00 00 r1 = 14
55:       63 16 30 00 00 00 00 00 *(u32 *)(r6 + 48) = r1
56:       bf 61 00 00 00 00 00 00 r1 = r6
57:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
59:       b7 03 00 00 04 00 00 00 r3 = 4

regression:

49:       85 00 00 00 19 00 00 00 call 25
50:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
52:       63 1a a4 ff 00 00 00 00 *(u32 *)(r10 - 92) = r1
53:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
55:       63 1a a0 ff 00 00 00 00 *(u32 *)(r10 - 96) = r1
56:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
58:       63 1a 9c ff 00 00 00 00 *(u32 *)(r10 - 100) = r1
59:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
61:       63 1a 98 ff 00 00 00 00 *(u32 *)(r10 - 104) = r1
62:       15 08 0e 00 80 00 00 00 if r8 == 128 goto +14 <LBB4_6>
63:       55 08 46 00 87 00 00 00 if r8 != 135 goto +70 <LBB4_10>
64:       b7 01 00 00 02 00 00 00 r1 = 2
65:       63 16 34 00 00 00 00 00 *(u32 *)(r6 + 52) = r1
66:       b7 01 00 00 0e 00 00 00 r1 = 14
67:       63 16 30 00 00 00 00 00 *(u32 *)(r6 + 48) = r1
68:       bf 61 00 00 00 00 00 00 r1 = r6
69:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
71:       b7 03 00 00 04 00 00 00 r3 = 4

Maybe some passes make conservative decisions in the presence of
intrinsic calls, even if they are marked as not accessing memory.

I will proceed to do the following:

. consider how to limit the scope of using intrinsics. We may introduce
  another intrinsic for explicit opt-in, to identify the places where
  preserve_*_access_index is needed. Note that for a typical bpf program,
  preserve_*_access_index intrinsics are not needed in most places.
. if we proceed with this approach, I will also look at the # of insn regressions
  so we can still have reasonable performance even if intrinsics are used.

removed typename from intrinsics, using metadata (ditype) instead

@eli.friedman @rsmith @lebedev.ri I just uploaded a new set of patches. Notable changes include:

. introduce the clang intrinsic base = __builtin_preserve_access_index(base), which is used to identify relocatable GEPs. This is an opt-in approach for the user, so we do not pay the performance penalty of transforming GEPs to intrinsics and later back to GEPs.
. for the IR intrinsics preserve_*_access_index, struct/union names are removed from the argument list. Instead, a new metadata kind, MD_preserve_access_index, is introduced. This metadata holds the struct/union debuginfo type and is attached to the preserve_*_access_index calls.

The noticeable changes in the BPF backend:

. During the IR pass, we create a global variable to represent the relocatable GEP. This global also has MD_preserve_access_index metadata attached, which is available later in AsmPrinter, to avoid any type name comparison.
. a BPF subtarget feature, checkoffsetreloc, is implemented to warn users of any bpf_probe_read() without a relocatable GEP. This is mostly for debugging.

Please let me know what you think. Thanks!

Don't think i will be of any help here.

docs/LangRef.rst
16960 ↗	(On Diff #200640)	what's `anons` ?

lebedev.ri resigned from this revision. May 23 2019, 10:28 AM

anakryiko added a subscriber: anakryiko. May 24 2019, 12:23 PM

yonghong-song edited the summary of this revision. Jul 3 2019, 9:38 AM

@eli.friedman has reviewed a related patch (https://reviews.llvm.org/D61809) and is okay with the whole approach. @ast is also okay with the patch, so I will start to land soon. Thanks!

This revision was not accepted when it landed; it landed in state Needs Review. Jul 8 2019, 10:11 AM

Closed by commit rL365352: [BPF] add new intrinsics preserve_{array,union,struct}_access_index (authored by yhs).

This revision was automatically updated to reflect the committed changes.

This patch lacks tests for various parts, intrinsics, metadata, ... Also descriptions are missing.

@efriedma @eli.friedman

llvm/trunk/docs/LangRef.rst
17348	I fail to see what `llvm.preserve.array.access.index` preserves, at least given this description.
17386	why `2` in `type2`, also above.
17397	Where is the description of this metadata?

@jdoerfert Thanks. I will address your comments and submit standalone test cases for these intrinsics. I do have some BPF unit tests using these intrinsics. But I agree that some standalone unit test cases are also needed.

The followup patch for this is at https://reviews.llvm.org/D64606.

Abandon this revision for now as one of its variants has been merged.

Diff 208452

llvm/trunk/docs/LangRef.rst

::

      declare i8* @llvm.objc.storeWeak(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.

				Preserving Debug Information Intrinsics
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				These intrinsics are used to carry certain debuginfo together with
				IR-level operations. For example, it may be desirable to
				know the structure/union name and the original user-level field
				indices. Such information got lost in IR GetElementPtr instruction
				since the IR types are different from debugInfo types and unions
				are converted to structs in IR.

				'``llvm.preserve.array.access.index``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""
				::

				declare <type2>
				@llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
				i32 dim,
				i32 index)

				Overview:
				"""""""""

				The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
				based on array base ``base``, array dimension ``dim`` and the last access index ``index``
				into the array.

				Arguments:
				""""""""""

				The ``base`` is the array base address. The ``dim`` is the array dimension.
				The ``base`` is a pointer if ``dim`` equals 0.
				The ``index`` is the last access index into the array or pointer.

				Semantics:
				""""""""""

				The '``llvm.preserve.array.access.index``' intrinsic produces the same result
				as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``.

				'``llvm.preserve.union.access.index``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""
				::

				declare <type>
				@llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
				i32 di_index)

				Overview:
				"""""""""

				The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
				``di_index`` and returns the ``base`` address.
				The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
				to provide union debuginfo type.

				Arguments:
				""""""""""

				The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.

				Semantics:
				""""""""""

				The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.

				'``llvm.preserve.struct.access.index``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""
				::

				declare <type2>
				@llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
				i32 gep_index,
				i32 di_index)

				Overview:
				"""""""""

				The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
				based on struct base ``base`` and IR struct member index ``gep_index``.
				The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
				to provide struct debuginfo type.

				Arguments:
				""""""""""

				The ``base`` is the structure base address. The ``gep_index`` is the struct member index
				based on IR structures. The ``di_index`` is the struct member index based on debuginfo.

				Semantics:
				""""""""""

				The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
				as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.

llvm/trunk/include/llvm/IR/IRBuilder.h

  Value *CreateExtractInteger(const DataLayout &DL, Value *From,
  ...
    assert(ExtractedTy->getBitWidth() <= IntTy->getBitWidth() &&
           "Cannot extract to a larger integer!");
    if (ExtractedTy != IntTy) {
      V = CreateTrunc(V, ExtractedTy, Name + ".trunc");
    }
    return V;
  }

  Value *CreatePreserveArrayAccessIndex(Value *Base, unsigned Dimension,
                                        unsigned LastIndex) {
    assert(isa<PointerType>(Base->getType()) &&
           "Invalid Base ptr type for preserve.array.access.index.");
    auto *BaseType = Base->getType();

    Value *LastIndexV = getInt32(LastIndex);
    Constant *Zero = ConstantInt::get(Type::getInt32Ty(Context), 0);
    SmallVector<Value *, 4> IdxList;
    for (unsigned I = 0; I < Dimension; ++I)
      IdxList.push_back(Zero);
    IdxList.push_back(LastIndexV);

    Type *ResultType =
        GetElementPtrInst::getGEPReturnType(Base, IdxList);

    Module *M = BB->getParent()->getParent();
    Function *FnPreserveArrayAccessIndex = Intrinsic::getDeclaration(
        M, Intrinsic::preserve_array_access_index, {ResultType, BaseType});

    Value *DimV = getInt32(Dimension);
    CallInst *Fn =
        CreateCall(FnPreserveArrayAccessIndex, {Base, DimV, LastIndexV});

    return Fn;
  }

  Value *CreatePreserveUnionAccessIndex(Value *Base, unsigned FieldIndex,
                                        MDNode *DbgInfo) {
    assert(isa<PointerType>(Base->getType()) &&
           "Invalid Base ptr type for preserve.union.access.index.");
    auto *BaseType = Base->getType();

    Module *M = BB->getParent()->getParent();
    Function *FnPreserveUnionAccessIndex = Intrinsic::getDeclaration(
        M, Intrinsic::preserve_union_access_index, {BaseType, BaseType});

    Value *DIIndex = getInt32(FieldIndex);
    CallInst *Fn =
        CreateCall(FnPreserveUnionAccessIndex, {Base, DIIndex});
    Fn->setMetadata(LLVMContext::MD_preserve_access_index, DbgInfo);

    return Fn;
  }

  Value *CreatePreserveStructAccessIndex(Value *Base, unsigned Index,
                                         unsigned FieldIndex, MDNode *DbgInfo) {
    assert(isa<PointerType>(Base->getType()) &&
           "Invalid Base ptr type for preserve.struct.access.index.");
    auto *BaseType = Base->getType();

    Value *GEPIndex = getInt32(Index);
    Constant *Zero = ConstantInt::get(Type::getInt32Ty(Context), 0);
    Type *ResultType =
        GetElementPtrInst::getGEPReturnType(Base, {Zero, GEPIndex});

    Module *M = BB->getParent()->getParent();
    Function *FnPreserveStructAccessIndex = Intrinsic::getDeclaration(
        M, Intrinsic::preserve_struct_access_index, {ResultType, BaseType});

    Value *DIIndex = getInt32(FieldIndex);
    CallInst *Fn = CreateCall(FnPreserveStructAccessIndex,
                              {Base, GEPIndex, DIIndex});
    Fn->setMetadata(LLVMContext::MD_preserve_access_index, DbgInfo);

    return Fn;
  }

private:
  /// Helper function that creates an assume intrinsic call that
  /// represents an alignment assumption on the provided Ptr, Mask, Type
  /// and Offset. It may be sometimes useful to do some other logic
  /// based on this alignment check, thus it can be stored into 'TheCheck'.
  CallInst *CreateAlignmentAssumptionHelper(const DataLayout &DL,
                                            Value *PtrValue, Value *Mask,
                                            Type *IntPtrTy, Value *OffsetValue,
  ...

llvm/trunk/include/llvm/IR/Intrinsics.td

...
// Clear cache intrinsic, default to ignore (ie. emit nothing)
// maps to void __clear_cache() on supporting platforms
def int_clear_cache : Intrinsic<[], [llvm_ptr_ty, llvm_ptr_ty],
                                [], "llvm.clear_cache">;

// Intrinsic to detect whether its argument is a constant.
def int_is_constant : Intrinsic<[llvm_i1_ty], [llvm_any_ty], [IntrNoMem], "llvm.is.constant">;


//===-------------------------- Masked Intrinsics -------------------------===//
//
def int_masked_store : Intrinsic<[], [llvm_anyvector_ty,
                                      LLVMAnyPointerType<LLVMMatchType<0>>,
                                      llvm_i32_ty,
                                      LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],
                                 [IntrArgMemOnly, ImmArg<2>]>;

...
def int_loop_decrement_reg :
  Intrinsic<[llvm_anyint_ty],
            [llvm_anyint_ty, llvm_anyint_ty], [IntrNoDuplicate]>;

//===----- Intrinsics that are used to provide predicate information -----===//

def int_ssa_copy : Intrinsic<[llvm_any_ty], [LLVMMatchType<0>],
                             [IntrNoMem, Returned<0>]>;

//===------- Intrinsics that are used to preserve debug information -------===//

def int_preserve_array_access_index : Intrinsic<[llvm_anyptr_ty],
                                                [llvm_anyptr_ty, llvm_i32_ty,
                                                 llvm_i32_ty],
                                                [IntrNoMem, ImmArg<1>, ImmArg<2>]>;
def int_preserve_union_access_index : Intrinsic<[llvm_anyptr_ty],
                                                [llvm_anyptr_ty, llvm_i32_ty],
                                                [IntrNoMem, ImmArg<1>]>;
def int_preserve_struct_access_index : Intrinsic<[llvm_anyptr_ty],
                                                 [llvm_anyptr_ty, llvm_i32_ty,
                                                  llvm_i32_ty],
                                                 [IntrNoMem, ImmArg<1>,
                                                  ImmArg<2>]>;

//===----------------------------------------------------------------------===//
// Target-specific intrinsics
//===----------------------------------------------------------------------===//

include "llvm/IR/IntrinsicsPowerPC.td"
include "llvm/IR/IntrinsicsX86.td"
include "llvm/IR/IntrinsicsARM.td"
include "llvm/IR/IntrinsicsAArch64.td"
...

llvm/trunk/include/llvm/IR/LLVMContext.h

 [... 93 lines hidden ...]
   enum : unsigned {
     MD_type = 19, // "type"
     MD_section_prefix = 20, // "section_prefix"
     MD_absolute_symbol = 21, // "absolute_symbol"
     MD_associated = 22, // "associated"
     MD_callees = 23, // "callees"
     MD_irr_loop = 24, // "irr_loop"
     MD_access_group = 25, // "llvm.access.group"
     MD_callback = 26, // "callback"
+    MD_preserve_access_index = 27, // "llvm.preserve.*.access.index"
   };

   /// Known operand bundle tag IDs, which always have the same value. All
   /// operand bundle tags that LLVM has special knowledge of are listed here.
   /// Additionally, this scheme allows LLVM to efficiently check for specific
   /// operand bundle tags without comparing strings.
   enum : unsigned {
     OB_deopt = 0, // "deopt"
 [... 251 lines hidden ...]

llvm/trunk/lib/IR/LLVMContext.cpp

 [... 57 lines hidden ...]
   std::pair<unsigned, StringRef> MDKinds[] = {
     {MD_type, "type"},
     {MD_section_prefix, "section_prefix"},
     {MD_absolute_symbol, "absolute_symbol"},
     {MD_associated, "associated"},
     {MD_callees, "callees"},
     {MD_irr_loop, "irr_loop"},
     {MD_access_group, "llvm.access.group"},
     {MD_callback, "callback"},
+    {MD_preserve_access_index, "llvm.preserve.access.index"},
   };

   for (auto &MDKind : MDKinds) {
     unsigned ID = getMDKindID(MDKind.second);
     assert(ID == MDKind.first && "metadata kind id drifted");
     (void)ID;
   }

 [... 277 lines hidden ...]
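
Note on the LLVMContext.cpp hunk above: the "metadata kind id drifted" assert depends on registration order. getMDKindID hands out the next free ID the first time it sees a name, so each entry's position in the MDKinds table has to match its enum value. That is why the new kind is appended as the last entry with ID 27. The following is a minimal, self-contained C++ sketch of that idea under stated assumptions. It is not LLVM's implementation: ToyKindRegistry, registerFixedKinds, and the three toy MD_* names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Toy stand-in for a metadata-kind registry (hypothetical, not LLVM's class):
// a name seen for the first time receives the next sequential ID.
class ToyKindRegistry {
  std::unordered_map<std::string, unsigned> Ids;

public:
  unsigned getMDKindID(const std::string &Name) {
    auto It = Ids.find(Name);
    if (It != Ids.end())
      return It->second; // Already registered: return the existing ID.
    unsigned NewId = static_cast<unsigned>(Ids.size());
    Ids.emplace(Name, NewId);
    return NewId;
  }
};

// Fixed kind IDs, shortened to three hypothetical entries for illustration.
enum : unsigned { MD_first = 0, MD_second = 1, MD_third = 2 };

// Registers the fixed kinds in table order. Returns false if any name ends up
// with an ID different from its enum value, i.e. if the IDs "drifted".
inline bool registerFixedKinds(ToyKindRegistry &R) {
  const std::pair<unsigned, const char *> Kinds[] = {
      {MD_first, "first"}, {MD_second, "second"}, {MD_third, "third"}};
  for (const auto &K : Kinds)
    if (R.getMDKindID(K.second) != K.first)
      return false;
  return true;
}
```

In this sketch, calling registerFixedKinds on a fresh registry succeeds, and any name registered afterwards gets the next ID after the fixed ones. Registering an unrelated name first shifts every fixed ID by one, which is the situation the assert in the hunk is meant to catch.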

This is an archive of the discontinued LLVM Phabricator instance.

[BPF] add new intrinsics preserve_{array,union,struct}_access_index
Abandoned, Public

Details

Diff Detail

Event Timeline

Revision Contents

Diff 208452

llvm/trunk/docs/LangRef.rst

llvm/trunk/include/llvm/IR/IRBuilder.h

llvm/trunk/include/llvm/IR/Intrinsics.td

llvm/trunk/include/llvm/IR/LLVMContext.h

llvm/trunk/lib/IR/LLVMContext.cpp
