This is an archive of the discontinued LLVM Phabricator instance.

tsan: split thread into logical and physical state
AbandonedPublic

Authored by dvyukov on Feb 26 2016, 8:58 AM.

Details

Reviewers
kcc
zatrazz
Summary

Currently ThreadState holds both logical state (required for race-detection algorithm, user-visible)
and physical state (various caches, most notably malloc cache). Move physical state in a new
Process entity. Besides just being the right thing from abstraction point of view, this solves several
problems:

  1. Cache everything on P level in Go. Currently we cache on a mix of goroutine and OS thread levels.

This unnecessary increases memory consumption.

  1. Properly handle free operations in Go. Frees are issue by GC which don't have goroutine context.

As the result we could not do anything more than just clearing shadow. For example, we leaked
sync objects and heap block descriptors.

  1. This will allow to get rid of libc malloc in Go (now we have Processor context for internal allocator cache).

This in turn will allow to get rid of dependency on libc entirely.

  1. Potentially we can make Processor per-CPU in C++ mode instead of per-thread, which will

reduce resource consumption.
The distinction between Thread and Processor is currently used only by Go, C++ creates Processor per OS thread,
which is equivalent to the current scheme.

Diff Detail

Event Timeline

dvyukov updated this revision to Diff 49197.Feb 26 2016, 8:58 AM
dvyukov retitled this revision from to tsan: split thread into logical and physical state.
dvyukov updated this object.
dvyukov added a reviewer: kcc.

Kuba, Renato, Bill,

I would like to ask one of you to try to figure out why it crashes on darwin/arm64/powerpc. I don't have accesses to such machines, and it works on linux/amd64. There is probably some initialization order issue, maybe common for all these platforms. This blocks some important changes for Go.

Thanks in advance

zatrazz edited edge metadata.Feb 29 2016, 11:04 AM

This is indeed an initialization order issue where 'ProcCreate' is trying to access the ThreadState object through internal_alloc without it has been proper initialized (on ThreadStart function). I think it would be better to use plain allocation for 'Process' objects. I check this fix on top the patch on both aarch64 and powerpc64le and it shows no regression:

diff --git a/lib/tsan/rtl/tsan_rtl_proc.cc b/lib/tsan/rtl/tsan_rtl_proc.cc
index 710a3bf..c396350 100644
--- a/lib/tsan/rtl/tsan_rtl_proc.cc
+++ b/lib/tsan/rtl/tsan_rtl_proc.cc
@@ -19,7 +19,7 @@
 namespace __tsan {

 Processor *ProcCreate() {
-  void *mem = internal_alloc(MBlockProcessor, sizeof(Processor));
+  void *mem = InternalAlloc(sizeof(Processor));
   internal_memset(mem, 0, sizeof(Processor));
   Processor *proc = new(mem) Processor;
   proc->thr = nullptr;
@@ -41,7 +41,7 @@ void ProcDestroy(Processor *proc) {
   if (common_flags()->detect_deadlocks)
      ctx->dd->DestroyPhysicalThread(proc->dd_pt);
   proc->~Processor();
-  internal_free(proc);
+  InternalFree(proc);
 }

 void ProcWire(Processor *proc, ThreadState *thr) {

Thank you very much!
I will try to resubmit with your change tomorrow.

It was working on integration with Go runtime but then was sidetracked. I want to shake out things on Go side and ensure that it all works end-to-end, and only then submit the llvm part.

dvyukov abandoned this revision.Apr 15 2021, 1:20 AM