This is an archive of the discontinued LLVM Phabricator instance.


Differential D25030

[XRay] Support for for tail calls for ARM no-Thumb
ClosedPublic

Authored by rSerge on Sep 28 2016, 8:45 AM.

Details

Reviewers

dberris
echristo
rnk

Commits

rG156f6cafc287: [XRay] Support for for tail calls for ARM no-Thumb
rL284456: [XRay] Support for for tail calls for ARM no-Thumb

Summary

This patch adds simplified support for tail calls on ARM with XRay instrumentation.

Known issue: when compiled with the generic flags -O3 -g -fxray-instrument -Wall -std=c++14 -ffunction-sections -fdata-sections (this list omits my target-specific flags such as --target=armv7-linux-gnueabihf), the following program

#include <cstdio>
#include <cassert>
#include <xray/xray_interface.h>

[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fC() { 
  std::printf("In fC()\n");
}

[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fB() { 
  std::printf("In fB()\n");
  fC();
}

[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fA() { 
  std::printf("In fA()\n");
  fB();
}

// Avoid infinite recursion in case the logging function is instrumented (so calls logging
//   function again).
[[clang::xray_never_instrument]] void simplyPrint(int32_t functionId, XRayEntryType xret)
{
  printf("XRay: functionId=%d type=%d.\n", int(functionId), int(xret));
}

int main(int argc, char* argv[]) {
  __xray_set_handler(simplyPrint);

  printf("Patching...\n");
  __xray_patch();
  fA();

  printf("Unpatching...\n");
  __xray_unpatch();       
  fA();

  return 0;
}

gives the following output:

Patching...
XRay: functionId=3 type=0.
In fA()
XRay: functionId=3 type=1.
XRay: functionId=2 type=0.
In fB()
XRay: functionId=2 type=1.
XRay: functionId=1 type=0.
XRay: functionId=1 type=1.
In fC()
Unpatching...
In fA()
In fB()
In fC()

So for function fC() the exit sled fires too early: before "In fC()" is printed, rather than at the actual function exit.
Debugging shows that this happens because the printf call in fC() is itself emitted as a tail call. The exit sled of fC() executes first, and only then does control jump into printf. It seems we can't do anything about this with the current approach (i.e. within the simplification described in https://reviews.llvm.org/D23988 ).
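
The ordering described above can be reproduced with a small model that does not depend on XRay at all. This is only a sketch: the event log, callee() and fC_model() are hypothetical stand-ins for the handler output, printf and the instrumented fC(), and the sled calls are written out by hand where the compiler would place them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log standing in for the XRay handler output.
static std::vector<std::string> events;

// Stand-in for printf, the function that fC() reaches through a tail call.
static void callee() { events.push_back("In fC()"); }

// Model of an instrumented fC() whose last action is a tail call.
static void fC_model() {
  events.push_back("entry fC");  // entry sled
  events.push_back("exit fC");   // exit sled sits before the tail jump
  callee();                      // tail call: control does not come back to fC
}

// Runs the model once and returns the recorded events in order.
std::vector<std::string> run_model() {
  events.clear();
  fC_model();
  return events;
}
```

Calling run_model() yields the exit event ahead of the callee's output, which matches the functionId=1 lines in the log above.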

Diff Detail

Repository: rL LLVM

Event Timeline

rSerge updated this revision to Diff 72832. Sep 28 2016, 8:45 AM

rSerge retitled this revision to [XRay] Support for for tail calls for ARM no-Thumb.

rSerge updated this object.

rSerge added reviewers: dberris, rnk, sanjoy, echristo.

rSerge added subscribers: llvm-commits, iid_iunknown.

Herald added subscribers: dberris, samparker, mehdi_amini and 2 others. Sep 28 2016, 8:45 AM

rSerge added parent revisions: D23931: [XRay] ARM 32-bit no-Thumb support in LLVM, D23933: [XRay] ARM 32-bit no-Thumb support in compiler-rt. Sep 28 2016, 8:49 AM

I don't think I'm qualified to review this patch, getting it out of my queue.

Why is this doing something different from the x86_64 implementation? In the future we'd like to be able to define different semantics for a tail exit, which means we need to distinguish which kind of exit is happening (a tail call is just one; another is for exceptions, etc.), and this makes the ARM implementation semantically different.

This revision now requires changes to proceed. Oct 4 2016, 10:07 PM

As I understand it, the x86_64 implementation of tail calls reuses the trampoline of function exit tracing, but normal function exit tracing jumps into the trampoline, while tail call tracing calls the exit trampoline as a function. Because on ARM normal function exit tracing already calls the exit trampoline as a function, the patch did not need to switch between a jump and a call instruction. Thus for ARM, to reproduce the same functionality as for x86_64, it was sufficient to check whether the instruction is a tail call.

In D25030#562462, @rSerge wrote:

As I understand it, the x86_64 implementation of tail calls reuses the trampoline of function exit tracing, but normal function exit tracing jumps into the trampoline, while tail call tracing calls the exit trampoline as a function. Because on ARM normal function exit tracing already calls the exit trampoline as a function, the patch did not need to switch between a jump and a call instruction. Thus for ARM, to reproduce the same functionality as for x86_64, it was sufficient to check whether the instruction is a tail call.

This doesn't identify whether an exit is a tail call exit though (from the instrumentation map perspective), which will cause the runtime to treat it like a normal exit. This is the current state as a transitional point to when we change the entry type being passed onto the logging function, to differentiate between normal exits and tail exits. The point of the change in X86 is to make this transition staged, instead of abrupt -- lay down the sleds, differentiate them in the instrumentation map, and then later on handle the tail calls differently.
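
To make the staging concrete, here is a rough sketch of the idea: the instrumentation map records three sled kinds, while the runtime for now reports a tail call sled as a normal exit. The enum names and reportedType() are hypothetical and are not the compiler-rt API; the only values taken from this review are type=0 for entry and type=1 for exit, as seen in the output in the summary.

```cpp
#include <cassert>

// Hypothetical mirror of the sled kinds recorded in the instrumentation map.
enum class SledKind { FunctionEnter, FunctionExit, TailCall };

// Hypothetical entry type passed to the logging handler
// (0 on entry and 1 on exit, matching the output in the summary).
enum class EntryType : int { Entry = 0, Exit = 1 };

// Transitional mapping: tail call sleds are distinguishable in the map,
// but the handler still sees them as normal exits.
constexpr EntryType reportedType(SledKind kind) {
  switch (kind) {
    case SledKind::FunctionEnter: return EntryType::Entry;
    case SledKind::FunctionExit:  return EntryType::Exit;
    case SledKind::TailCall:      return EntryType::Exit;
  }
  return EntryType::Exit;  // unreachable for valid kinds
}

static_assert(reportedType(SledKind::TailCall) == EntryType::Exit,
              "tail exits are reported as normal exits during the transition");
```

Once the handler interface gains a dedicated tail exit type, only the TailCall case of such a mapping would need to change; the sleds and the map entries are already in place.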

Implemented more staging for the real tail call implementation.

Thanks @rSerge -- let me know if you want me to submit on your behalf again (or if you've already gotten repository write access). :)

This revision is now accepted and ready to land. Oct 16 2016, 10:20 PM

In D25030#571365, @dberris wrote:

Thanks @rSerge -- let me know if you want me to submit on your behalf again (or if you've already gotten repository write access). :)

@dberris, yes, please deliver it to mainline. I haven't got write access yet.

Closed by commit rL284456: [XRay] Support for for tail calls for ARM no-Thumb (authored by dberris). Oct 17 2016, 11:03 PM

This revision was automatically updated to reflect the committed changes.

Done @rSerge -- in the future, it would make it much easier/simpler to commit this if you use arcanist to create the patches (on git or svn).

If you're manually making the diffs, please use -p1 format (have a "before" directory and "after" directory), and run against clang-format before uploading. This way we don't introduce whitespace issues that later need to be fixed up (trailing whitespace being the biggest offender).

If you're using git it should be easy to create a patch using git diff.

Cheers

In D25030#572543, @dberris wrote:

Done @rSerge -- in the future, it would make it much easier/simpler to commit this if you use arcanist to create the patches (on git or svn).

If you're manually making the diffs, please use -p1 format (have a "before" directory and "after" directory), and run against clang-format before uploading. This way we don't introduce whitespace issues that later need to be fixed up (trailing whitespace being the biggest offender).

If you're using git it should be easy to create a patch using git diff.

Cheers

Ok, I'll try to set up one of those. I am making the patches as suggested by the developer policy: svn diff --diff-cmd=diff -x -U999999 > llvm.diff
Thanks for delivering!

Revision Contents

Path                                                 Size
llvm/trunk/lib/CodeGen/XRayInstrumentation.cpp       13 lines
llvm/trunk/lib/Target/ARM/ARMAsmPrinter.h            1 line
llvm/trunk/lib/Target/ARM/ARMAsmPrinter.cpp          3 lines
llvm/trunk/lib/Target/ARM/ARMMCInstLower.cpp         5 lines
llvm/trunk/test/CodeGen/ARM/xray-tail-call-sled.ll   53 lines

Diff 74946

llvm/trunk/lib/CodeGen/XRayInstrumentation.cpp

[89 lines not shown; context: for (auto &I : Terminators)]
     I->eraseFromParent();
 }

 void XRayInstrumentation::prependRetWithPatchableExit(MachineFunction &MF,
   const TargetInstrInfo *TII)
 {
   for (auto &MBB : MF) {
     for (auto &T : MBB.terminators()) {
+      unsigned Opc = 0;
       if (T.isReturn()) {
-        // Prepend the return instruction with PATCHABLE_FUNCTION_EXIT
-        BuildMI(MBB, T, T.getDebugLoc(),
-                TII->get(TargetOpcode::PATCHABLE_FUNCTION_EXIT));
+        Opc = TargetOpcode::PATCHABLE_FUNCTION_EXIT;
+      }
+      if (TII->isTailCall(T)) {
+        Opc = TargetOpcode::PATCHABLE_TAIL_CALL;
+      }
+      if (Opc != 0) {
+        // Prepend the return instruction with PATCHABLE_FUNCTION_EXIT or
+        // PATCHABLE_TAIL_CALL .
+        BuildMI(MBB, T, T.getDebugLoc(),TII->get(Opc));
       }
     }
   }
 }

 bool XRayInstrumentation::runOnMachineFunction(MachineFunction &MF) {
   auto &F = *MF.getFunction();
   auto InstrAttr = F.getFnAttribute("function-instrument");
[50 lines not shown]
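
The selection logic in this hunk can be exercised in isolation with a small model. This is a sketch under stated assumptions: Opcode and Terminator are hypothetical stand-ins for TargetOpcode and MachineInstr, and selectSledOpcode() is not an LLVM function. It only mirrors the order of the two checks in the patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for the TargetOpcode values used in the patch.
enum class Opcode { None, PatchableFunctionExit, PatchableTailCall };

// Hypothetical summary of the two properties the patch queries on a terminator.
struct Terminator {
  bool isReturn;
  bool isTailCall;
};

// Mirrors the patched logic: the tail call check runs second, so it
// overrides the plain exit opcode when both properties hold.
constexpr Opcode selectSledOpcode(const Terminator &t) {
  Opcode opc = Opcode::None;
  if (t.isReturn)
    opc = Opcode::PatchableFunctionExit;
  if (t.isTailCall)
    opc = Opcode::PatchableTailCall;
  return opc;
}
```

A terminator that is neither a return nor a tail call yields Opcode::None, which corresponds to the Opc != 0 guard: no sled is prepended in that case.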

llvm/trunk/lib/Target/ARM/ARMAsmPrinter.h

[106 lines not shown; context: public:]

   //===------------------------------------------------------------------===//
   // XRay implementation
   //===------------------------------------------------------------------===//
 public:
   // XRay-specific lowering for ARM.
   void LowerPATCHABLE_FUNCTION_ENTER(const MachineInstr &MI);
   void LowerPATCHABLE_FUNCTION_EXIT(const MachineInstr &MI);
+  void LowerPATCHABLE_TAIL_CALL(const MachineInstr &MI);
   // Helper function that emits the XRay sleds we've collected for a particular
   // function.
   void EmitXRayTable();

 private:
   void EmitSled(const MachineInstr &MI, SledKind Kind);

   // Helpers for EmitStartOfAsmFile() and EmitEndOfAsmFile()
[38 lines not shown]

llvm/trunk/lib/Target/ARM/ARMAsmPrinter.cpp

[2,044 lines not shown; context: case ARM::tInt_WIN_eh_sjlj_longjmp: {]
     return;
   }
   case ARM::PATCHABLE_FUNCTION_ENTER:
     LowerPATCHABLE_FUNCTION_ENTER(*MI);
     return;
   case ARM::PATCHABLE_FUNCTION_EXIT:
     LowerPATCHABLE_FUNCTION_EXIT(*MI);
     return;
+  case ARM::PATCHABLE_TAIL_CALL:
+    LowerPATCHABLE_TAIL_CALL(*MI);
+    return;
   }

   MCInst TmpInst;
   LowerARMMachineInstrToMCInst(MI, TmpInst, *this);

   EmitToStreamer(*OutStreamer, TmpInst);
 }

[11 lines not shown]

llvm/trunk/lib/Target/ARM/ARMMCInstLower.cpp

[213 lines not shown; context: void ARMAsmPrinter::LowerPATCHABLE_FUNCTION_ENTER(const MachineInstr &MI)]
   EmitSled(MI, SledKind::FUNCTION_ENTER);
 }

 void ARMAsmPrinter::LowerPATCHABLE_FUNCTION_EXIT(const MachineInstr &MI)
 {
   EmitSled(MI, SledKind::FUNCTION_EXIT);
 }

+void ARMAsmPrinter::LowerPATCHABLE_TAIL_CALL(const MachineInstr &MI)
+{
+  EmitSled(MI, SledKind::TAIL_CALL);
+}
+
 void ARMAsmPrinter::EmitXRayTable()
 {
   if (Sleds.empty())
     return;
   if (Subtarget->isTargetELF()) {
     auto *Section = OutContext.getELFSection(
         "xray_instr_map", ELF::SHT_PROGBITS,
         ELF::SHF_ALLOC | ELF::SHF_GROUP | ELF::SHF_MERGE, 0,
[17 lines not shown]

llvm/trunk/test/CodeGen/ARM/xray-tail-call-sled.ll

+; RUN: llc -filetype=asm -o - -mtriple=armv7-unknown-linux-gnu < %s | FileCheck %s
+
+define i32 @callee() nounwind noinline uwtable "function-instrument"="xray-always" {
+; CHECK: .p2align 2
+; CHECK-LABEL: Lxray_sled_0:
+; CHECK-NEXT: b #20
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-LABEL: Ltmp0:
+  ret i32 0
+; CHECK-NEXT: mov r0, #0
+; CHECK-NEXT: .p2align 2
+; CHECK-LABEL: Lxray_sled_1:
+; CHECK-NEXT: b #20
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-LABEL: Ltmp1:
+; CHECK-NEXT: bx lr
+}
+
+define i32 @caller() nounwind noinline uwtable "function-instrument"="xray-always" {
+; CHECK: .p2align 2
+; CHECK-LABEL: Lxray_sled_2:
+; CHECK-NEXT: b #20
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-LABEL: Ltmp2:
+; CHECK: .p2align 2
+; CHECK-LABEL: Lxray_sled_3:
+; CHECK-NEXT: b #20
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-NEXT: nop
+; CHECK-LABEL: Ltmp3:
+  %retval = tail call i32 @callee()
+; CHECK: b callee
+  ret i32 %retval
+}
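
As a side note on the sled layout checked above, the b #20 followed by six nops can be sanity-checked with a little arithmetic. The sketch below assumes ARM (non-Thumb) state, where every instruction is 4 bytes and the PC reads as the address of the current instruction plus 8; the constant and function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kInstrBytes = 4;   // ARM (non-Thumb) instruction size
constexpr std::uint32_t kSledInstrs = 7;   // 1 branch + 6 nops, as in the CHECK lines
constexpr std::uint32_t kPcReadAhead = 8;  // PC reads as current instruction + 8

// Address reached by a "b #offset" located at branchAddr.
constexpr std::uint32_t branchTarget(std::uint32_t branchAddr, std::uint32_t offset) {
  return branchAddr + kPcReadAhead + offset;
}

// First address after a sled that starts at sledAddr.
constexpr std::uint32_t sledEnd(std::uint32_t sledAddr) {
  return sledAddr + kSledInstrs * kInstrBytes;
}

static_assert(branchTarget(0x1000, 20) == sledEnd(0x1000),
              "b #20 lands on the first instruction after the sled");
```

Under these assumptions an unpatched sled simply falls through to the instruction that follows it (the bx lr or the b callee in the test), so the six nops are never executed.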