This is an archive of the discontinued LLVM Phabricator instance.

[XRay] Typed event logging intrinsic
ClosedPublic

Authored by kpw on Apr 13 2018, 11:27 AM.

Details

Summary

Add an LLVM intrinsic for type discriminated event logging with XRay.
Similar to the existing intrinsic for custom events, but also accepts
a type tag argument to allow plugins to be aware of different types
and semantically interpret logged events they know about without
choking on those they don't.

Relies on a symbol defined in compiler-rt patch D43668. I may wait
to submit before I can see demo everything working together including
a still to come clang patch.

Diff Detail

Repository
rL LLVM

Event Timeline

kpw created this revision.Apr 13 2018, 11:27 AM
kpw added a comment.Apr 13 2018, 11:35 AM

Adding some comments for questions I had. Please chime in if you can shed light on these.

include/llvm/IR/Intrinsics.td
900 ↗(On Diff #142446)

I confess I copied IntrWriteMem from the custom event intrinsic without understanding why it is the right choice.

lib/Target/X86/X86MCInstLower.cpp
1098–1100 ↗(On Diff #142446)

The reason I didn't do this now is that I wasn't sure whether it should be done with a sled version change.

While it is obviously a change, it doesn't require a change in the patching/unpatching logic.

test/CodeGen/X86/xray-typed-event-log.ll
13 ↗(On Diff #142446)

I would have liked to have an additional test for when the registers match up and the sled uses noops, but I couldn't figure out how to force that in IR.

I tried a variant with "call cc 78 void ..." to try and force SystemV, but I think at the ir level, but it didn't have the desired effect. Any ideas?

dberris accepted this revision.Apr 14 2018, 10:55 PM

LGTM

include/llvm/IR/Intrinsics.td
900 ↗(On Diff #142446)

This is so that the optimisation passes cannot assume that the call to the intrinsic will not touch any global memory.

lib/Target/X86/X86MCInstLower.cpp
1098–1100 ↗(On Diff #142446)

I think we can change the lowering now, as long as the number of bytes doesn't change. Also we don't need to change the sled version just for the re-ordering.

1188 ↗(On Diff #142446)

Yes, mostly because we don't want to make the stack-based and/or constant global value handling special. It could be possible to support those too, but it's not as simple (and it's unclear whether we can do that with the same amount of bytes in the sled).

test/CodeGen/X86/xray-typed-event-log.ll
13 ↗(On Diff #142446)

There's a lot of randomness in how the register allocators decide which registers to use. Unfortunately I don't think we can force that at the IR level. If you'd like to see how this works at the MIR level, that's an option but I don't know enough how to do this yet. Maybe look around at existing MIR tests to see whether that might be something you can try?

This revision is now accepted and ready to land.Apr 14 2018, 10:55 PM
kpw added inline comments.Apr 15 2018, 1:23 PM
include/llvm/IR/Intrinsics.td
900 ↗(On Diff #142446)

What about IntrReadMem as well? IntrWriteMem is commented as

This intrinsic writes to unspecified memory, but does not
read from memory, and has no other side effects. This means dead stores
before calls to this intrinsics may be removed.

lib/Target/X86/X86MCInstLower.cpp
1098–1100 ↗(On Diff #142446)

That sounds fine, I'll add a change to this patch.

1188 ↗(On Diff #142446)

Is there a better way we can support the contract that the intrinsic is invoked with register arguments than runtime assert?

test/CodeGen/X86/xray-typed-event-log.ll
13 ↗(On Diff #142446)

I'll take a look at some MIR tests. Thanks.

dberris added inline comments.Apr 15 2018, 5:30 PM
include/llvm/IR/Intrinsics.td
900 ↗(On Diff #142446)

Technically, the ReadOnly<1> attribute preserves the data pointed to, which is the minimal requirement for this intrinsic.

IntrReadMem signals that globals may be read, but the intrinsic itself really doesn't need this. This is so that we minimise the interference of the intrinsic when it's emitted in a function that may be in-lined. We want to say that this function acts like some sort of write barrier, but not a read barrier.

lib/Target/X86/X86MCInstLower.cpp
1188 ↗(On Diff #142446)

The way we've forced the arguments to be in registers when lowering the pseudo-instruction is how we're telling the register allocators to do this for us. Until we come up with a way to force specific registers to be used at the MachineInstr level (which up until now I'm not sure is possible or desirable) we're going to be at the mercy of the default calling convention and therefore have to handle re-ordering the inputs per call-site.

The assertion here is meant to signal that we (us working on XRay) and the intervening optimisations on the machine instructions, have not messed up the fact that the arguments to the intrinsic call have to be in registers. Now there might be a way (similar to how the stack maps do it) to have much less interference with both typed events and custom events if we're able to record where the data is stored (could be global, could be in the stack already, could be a constant in a register already, etc), and emit the appropriate instrumentation point (with versioning and maybe some additional metadata).

We could do the more complicated thing which will reduce the register pressure from calling the intrinsic functions. But we'd need to measure the effects of that before committing to that course of action. :)

That's not to say we're not going to do it or that it will not be worth it -- I think we should do it, it's just a matter of "not now". But "not now" is closer to "in the next couple of months" rather than "never". :D

This revision was automatically updated to reflect the committed changes.