Page MenuHomePhabricator

[sanitizer][sancov] Refactor GetNextInstructionPc/GetPreviousInstructionPc
ClosedPublic

Authored by MaskRay on Feb 22 2022, 3:09 PM.

Details

Summary

x86 uses offset 1 while most RISC architectures use offset 4.
Check x86 first to prevent changes for new RISC architectures.

For sancov, getPreviousInstructionPc does not use the PC - 8 code path for
Sparc. Triple.h does not have isSparc yet and I hope someone can investigate
why Mips/Sparc use the weird -8.

Diff Detail

Event Timeline

MaskRay created this revision.Feb 22 2022, 3:09 PM
MaskRay requested review of this revision.Feb 22 2022, 3:09 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 22 2022, 3:09 PM
vitalybuka accepted this revision.Feb 22 2022, 4:06 PM

Maybe split into 3 patches for easier bisecting/reverting as usually fuzzers include all 3 components?

compiler-rt/lib/sanitizer_common/sanitizer_stacktrace.cpp
49

same

compiler-rt/lib/sanitizer_common/sanitizer_stacktrace.h
101

SANITIZER_I386 || SANITIZER_X64

This revision is now accepted and ready to land.Feb 22 2022, 4:06 PM
MaskRay updated this revision to Diff 410667.Feb 22 2022, 4:13 PM
MaskRay marked 2 inline comments as done.

Use SANITIZER_I386 || SANITIZER_X32 || SANITIZER_X64.

Will split when pushing.

This revision was landed with ongoing or failed builds.Feb 22 2022, 4:20 PM
This revision was automatically updated to reflect the committed changes.

For sancov, getPreviousInstructionPc does not use the PC - 8 code path for
Sparc. Triple.h does not have isSparc yet and I hope someone can investigate
why Mips/Sparc use the weird -8.

@MaskRay
IMO, The SPARC and MIPS branching operations have a branch delay slot, 4 more bytes occupied.
I did an experiment on MIPS64EL machine as following:

// cat caller.c 
void foo() {
	void *caller_pc = __builtin_return_address(0);
}

int main() {
	foo();
}

Debug with gdb:

print caller_pc variable : 
(gdb) p caller_pc
$1 = (void *) 0x555555554b54 <main+48>

Dump of assembler code for function main:
...
   0x0000555555554b38 <+20>:	lui	gp,0x2
   0x0000555555554b3c <+24>:	daddiu	gp,gp,-32340
   0x0000555555554b40 <+28>:	daddu	gp,gp,t9
=> 0x0000555555554b44 <+32>:	ld	v0,-32680(gp)
   0x0000555555554b48 <+36>:	move	t9,v0
   0x0000555555554b4c <+40>:	bal	0x555555554af0 <foo>   /// caller_pc - 8
   0x0000555555554b50 <+44>:	nop                            /// caller_pc - 4
   0x0000555555554b54 <+48>:	move	v0,zero                /// caller_pc
   0x0000555555554b58 <+52>:	move	sp,s8
...
ro added a comment.Feb 23 2022, 4:52 AM

@MaskRay
IMO, The SPARC and MIPS branching operations have a branch delay slot, 4 more bytes occupied.

Indeed, but with an additional caveat: to be fully correct, the test program needs to become

void foo() {
  void *caller_pc = __builtin_extract_return_addr(__builtin_return_address(0));
}

int main() {
  foo();
}

On most CPUs, __builtin_extract_return_addr is a no-op, but not on SPARC, and since D91607 clang implements it correctly like gcc already did. I needed that in D91608 to fix a couple of sanitizer_common failures when enabling that testing on SPARC. With those changes in place, I get

(gdb) p caller_pc
$3 = (void *) 0x100000e18 <main+12>
(gdb) x/4 $3-8
   0x100000e10 <main+4>:	call  0x100000df4 <foo> <- caller-pc - 8
   0x100000e14 <main+8>:	nop 
   0x100000e18 <main+12>:	ret                                   <- caller_pc
   0x100000e1c <main+16>:	restore  %g0, 0, %o0

There are more CPUs affected in clang, compared to gcc: ARM, MIPS, and S390, but the failure goes unnoticed because sanitizer_common testing is enabled for none of them.

This seems to have caused build bot failures on s390x: https://lab.llvm.org/buildbot/#/builders/94/builds/7935.

The patch changes the logic for s390x -- it now uses 4 when it used to use 1. 4 clearly is wrong - we have instructions of sizes 2, 4, or 6 bytes. In general it is not possible to determine the "previous" instruction just from a PC value, similar to Intel, so using - 1 seems the best option - at least this results in a value guaranteed to be *within* the previous instruction.

Also, I noticed this bit in test/tsan/test.h:

//The const kPCInc must be in sync with StackTrace::GetPreviousInstructionPc

This now no longer seems to be the case.

This seems to have caused build bot failures on s390x: https://lab.llvm.org/buildbot/#/builders/94/builds/7935.

The patch changes the logic for s390x -- it now uses 4 when it used to use 1. 4 clearly is wrong - we have instructions of sizes 2, 4, or 6 bytes. In general it is not possible to determine the "previous" instruction just from a PC value, similar to Intel, so using - 1 seems the best option - at least this results in a value guaranteed to be *within* the previous instruction.

Also, I noticed this bit in test/tsan/test.h:

//The const kPCInc must be in sync with StackTrace::GetPreviousInstructionPc

This now no longer seems to be the case.

Sent D120432 to use -1 for s390x.

compiler-rt/lib/sanitizer_common/sanitizer_stacktrace.h
101

I'll use SANITIZER_I386 || SANITIZER_X32 || SANITIZER_X64 to cover x86-32.