This is an archive of the discontinued LLVM Phabricator instance.

[XRay] Change xray_instr_map sled addresses from absolute to PC relative for x86-64
ClosedPublic

Authored by MaskRay on Apr 13 2020, 11:35 PM.

Details

Summary

xray_instr_map contains absolute addresses of sleds, which are relocated
by R_*_RELATIVE when linked in -pie or -shared mode.

By making these addresses relative to PC, we can avoid the dynamic
relocations and remove the SHF_WRITE flag from xray_instr_map. We can
thus save VM pages containg xray_instr_map (because they are not
modified).

This patch changes x86-64. Subsequent changes will change powerpc64le
and AArch64.

Diff Detail

Event Timeline

MaskRay created this revision.Apr 13 2020, 11:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2020, 11:35 PM
Herald added subscribers: Restricted Project, danielkiss, steven.zhang and 2 others. · View Herald Transcript
MaskRay marked an inline comment as done.Apr 13 2020, 11:39 PM
MaskRay added inline comments.
llvm/test/CodeGen/X86/xray-section-group.ll
1

The test can actually be merged into the main test.

I may change the main test to be similar to patchable-function-entry.ll

ianlevesque accepted this revision.Apr 14 2020, 12:03 PM

Is "llvm-xray extract" still able to symbolize these correctly? I don't know offhand what test coverage is there for that. The change looks great to me otherwise.

This revision is now accepted and ready to land.Apr 14 2020, 12:03 PM
MaskRay updated this revision to Diff 257563.Apr 14 2020, 5:00 PM

Fix llvm-xray

MaskRay updated this revision to Diff 257564.Apr 14 2020, 5:02 PM

Fix DebugInfo/X86/xray-split-dwarf-interaction.ll

Is "llvm-xray extract" still able to symbolize these correctly? I don't know offhand what test coverage is there for that. The change looks great to me otherwise.

Thanks. I should have added a note that the binaries used by llvm-xray tests are large (du -sh => 2M) and I really dislike the prebuilt binaries. They permanently increased the repository size by 2M.

The address: fields may look weird because they haven't been fixed. I'll defer that to a future change.

LGTM, but I'm concerned about the backwards compatibility story.

compiler-rt/lib/xray/xray_interface_internal.h
35

How do you handle backwards-compatibility here? i.e. what happens when you link in a new compiler-rt implementation with a previous-versioned XRay instrumented .o file?

My suspicion is you'll need to increment the version to determine whether you're using pc-relative addressing or whether it's the old style absolute addressing. Otherwise there'd be potential for random memory addresses and potential corruption going on.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
3217

const bool perhaps?

MaskRay added a comment.EditedApr 14 2020, 8:46 PM

LGTM, but I'm concerned about the backwards compatibility story.

We could pick a different section name. Do you have a suggestion? Doing that can change a lot of tests and I need some ack before moving toward that goal.

I want to avoid something like xray_instr_maps_ro because really the read-only section is commonplace, and I don't want to penalize the good readonly section name...

LGTM, but I'm concerned about the backwards compatibility story.

We could pick a different section name. Do you have a suggestion? Doing that can change a lot of tests and I need some ack before moving toward that goal.

I want to avoid something like xray_instr_maps_ro because really the read-only section is commonplace, and I don't want to penalize the good readonly section name...

As I mentioned in the other inline comment, you can use the 'Version' field for each entry. If you increment the version number in the compiler, then you can change the runtime code to support some older versions. Doing that allows you to avoid renaming the section and maintaining some backwards compatibility.

LGTM, but I'm concerned about the backwards compatibility story.

We could pick a different section name. Do you have a suggestion? Doing that can change a lot of tests and I need some ack before moving toward that goal.

I want to avoid something like xray_instr_maps_ro because really the read-only section is commonplace, and I don't want to penalize the good readonly section name...

As I mentioned in the other inline comment, you can use the 'Version' field for each entry. If you increment the version number in the compiler, then you can change the runtime code to support some older versions. Doing that allows you to avoid renaming the section and maintaining some backwards compatibility.

Supporting two versions of Sled.Address can make the runtime overly complex I think?

--- a/compiler-rt/lib/xray/xray_interface.cpp
+++ b/compiler-rt/lib/xray/xray_interface.cpp
@@ -264,14 +264,14 @@ XRayPatchingStatus controlPatching(bool Enable) XRAY_NEVER_INSTRUMENT {
   // now we're assuming we can mprotect the whole section of text between the
   // minimum sled address and the maximum sled address (+ the largest sled
   // size).
-  auto MinSled = InstrMap.Sleds[0];
-  auto MaxSled = InstrMap.Sleds[InstrMap.Entries - 1];
+  auto *MinSled = &InstrMap.Sleds[0];
+  auto *MaxSled = &InstrMap.Sleds[InstrMap.Entries - 1];
   for (std::size_t I = 0; I < InstrMap.Entries; I++) {
     const auto &Sled = InstrMap.Sleds[I];
-    if (Sled.Address < MinSled.Address)
-      MinSled = Sled;
-    if (Sled.Address > MaxSled.Address)
-      MaxSled = Sled;
+    if (Sled.address() < MinSled->address())
+      MinSled = &Sled;
+    if (Sled.address() > MaxSled->address())
+      MaxSled = &Sled;
   }

One approach is to teach the member function about return Version < 1 ? Address : reinterpret_cast<uint64_t>(&Address) + Address; but this will impose overhead on every operation.

x86 CUSTOM_EVENT has used version 1. Bump versions of all SledKind's to 2?

--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1510,7 +1510,7 @@ void X86AsmPrinter::LowerPATCHABLE_EVENT_CALL(const MachineInstr &MI,
   // Record the sled version. Older versions of this sled were spelled
   // differently, so we let the runtime handle the different offsets we're
   // using.
-  recordSled(CurSled, MI, SledKind::CUSTOM_EVENT, 1);
+  recordSled(CurSled, MI, SledKind::CUSTOM_EVENT, 2);
 }

It'd be nice to reject old versions.

LGTM, but I'm concerned about the backwards compatibility story.

We could pick a different section name. Do you have a suggestion? Doing that can change a lot of tests and I need some ack before moving toward that goal.

I want to avoid something like xray_instr_maps_ro because really the read-only section is commonplace, and I don't want to penalize the good readonly section name...

As I mentioned in the other inline comment, you can use the 'Version' field for each entry. If you increment the version number in the compiler, then you can change the runtime code to support some older versions. Doing that allows you to avoid renaming the section and maintaining some backwards compatibility.

Supporting two versions of Sled.Address can make the runtime overly complex I think?

--- a/compiler-rt/lib/xray/xray_interface.cpp
+++ b/compiler-rt/lib/xray/xray_interface.cpp
@@ -264,14 +264,14 @@ XRayPatchingStatus controlPatching(bool Enable) XRAY_NEVER_INSTRUMENT {
   // now we're assuming we can mprotect the whole section of text between the
   // minimum sled address and the maximum sled address (+ the largest sled
   // size).
-  auto MinSled = InstrMap.Sleds[0];
-  auto MaxSled = InstrMap.Sleds[InstrMap.Entries - 1];
+  auto *MinSled = &InstrMap.Sleds[0];
+  auto *MaxSled = &InstrMap.Sleds[InstrMap.Entries - 1];
   for (std::size_t I = 0; I < InstrMap.Entries; I++) {
     const auto &Sled = InstrMap.Sleds[I];
-    if (Sled.Address < MinSled.Address)
-      MinSled = Sled;
-    if (Sled.Address > MaxSled.Address)
-      MaxSled = Sled;
+    if (Sled.address() < MinSled->address())
+      MinSled = &Sled;
+    if (Sled.address() > MaxSled->address())
+      MaxSled = &Sled;
   }

One approach is to teach the member function about return Version < 1 ? Address : reinterpret_cast<uint64_t>(&Address) + Address; but this will impose overhead on every operation.

x86 CUSTOM_EVENT has used version 1. Bump versions of all SledKind's to 2?

--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1510,7 +1510,7 @@ void X86AsmPrinter::LowerPATCHABLE_EVENT_CALL(const MachineInstr &MI,
   // Record the sled version. Older versions of this sled were spelled
   // differently, so we let the runtime handle the different offsets we're
   // using.
-  recordSled(CurSled, MI, SledKind::CUSTOM_EVENT, 1);
+  recordSled(CurSled, MI, SledKind::CUSTOM_EVENT, 2);
 }

This is my preferred solution -- and it won't just be the custom events, it will be for all the instrumentation points (entry, exit, etc.) since they're all going to be affected. For versioning I don't expect that we stick with the monotonically increasing numbers either, we might be able to get away with dates (say 2005 to represent 2020-05) and start communicating/coordinating a policy to drop support for older versions.

The cost of doing that comparison is not really that high and I'd be surprised if it's significant enough to show up as a significant detectable overhead.

It'd be nice to reject old versions.

Someday, yes, if we can teach the linker (or use symbol collision tricks) so that we can't link together object files with different XRay instrumentation versions. That decision might even need to be done by the runtime.

MaskRay updated this revision to Diff 258895.Apr 20 2020, 9:12 PM

Implement version 2

dberris accepted this revision.Apr 21 2020, 3:00 AM

LGTM -- thanks!

llvm/lib/Target/X86/X86MCInstLower.cpp
1511–1514

One suggestion: consider documenting what happened with version 1, and why we're going to version 2 here (for the future reader).

MaskRay updated this revision to Diff 259021.Apr 21 2020, 9:10 AM

Add comments about version 2

This revision was automatically updated to reflect the committed changes.