This is an archive of the discontinued LLVM Phabricator instance.

[tsan] Implement __tsan_get_alloc_stack and __tsan_locate_address to query pointer types and allocation stacks of heap pointers
ClosedPublic

Authored by kubamracek on Dec 10 2016, 1:57 PM.

Details

Summary

In ASan, we have __asan_locate_address and __asan_get_alloc_stack, which is used in LLDB/Xcode to show the allocation backtrace for a heap memory object. This patch implements the same for TSan.

Diff Detail

Repository
rL LLVM

Event Timeline

kubamracek updated this revision to Diff 81007.Dec 10 2016, 1:57 PM
kubamracek retitled this revision from to [tsan] Implement __tsan_get_alloc_stack to query the allocation stack of a heap pointer.
kubamracek updated this object.
kubamracek added reviewers: dvyukov, zaks.anna, kcc.
kubamracek set the repository for this revision to rL LLVM.
kubamracek added a project: Restricted Project.
kubamracek added a subscriber: llvm-commits.
dvyukov added inline comments.Dec 12 2016, 11:49 AM
lib/tsan/rtl/tsan_debugging.cc
168 ↗(On Diff #81007)

Since the interface is different from asan, it may makes sense to return block begin and size as well. What do you think?

181 ↗(On Diff #81007)

os_id can be stale if status is not ThreadStatusRunning
I think it's better to return something more predictable in such case, e.g. 0.

kubamracek added inline comments.Dec 14 2016, 11:10 AM
lib/tsan/rtl/tsan_debugging.cc
168 ↗(On Diff #81007)

Makes sense. What about going further and adding a TSan version of __asan_locate_address (see asan_debugging.cc), which will "describe" a pointer and return its bounds if known, i.e. having two APIs:

__tsan_locate_address(addr) ... returns region name, start address, block size
__tsan_get_alloc_stack(addr) ... returns trace, thread ID

Does that sound better or worse than having just one

__tsan_get_alloc_stack(addr) ... returns trace, thread ID, start address and block size

?

181 ↗(On Diff #81007)

Hm. At least on Darwin, GetTid() returns a unique ID for the current boot of the OS. So it's correct even after the thread finished, and even after the process has ended (as long as you don't reboot). Is gettid() on Linux different?

I'm already using os_id to "match" threads in TSan reports: If we see the same os_id, we know it's the same thread even when that thread doesn't exist anymore. I'd like not to lose this feature.

Does Linux reuse the thread ID soon enough for this to be a problem? If yes, how about we reset the os_id in the ThreadRegistry when a thread ends (for Linux, and keep the value for Darwin)?

dvyukov added inline comments.Dec 15 2016, 4:09 AM
lib/tsan/rtl/tsan_debugging.cc
168 ↗(On Diff #81007)

tsan_locate_address(addr) ... returns region name, start address, block size
tsan_get_alloc_stack(addr) ... returns trace, thread ID

sounds better, more orthogonal and some use cases may not need stack/thread

181 ↗(On Diff #81007)

Linux reuses PIDs FIFO, but it still reuses them.
On closer inspection, there is a more realistic source of bogus values -- tid's are recycled, and ThreadContext we obtain can be for a different thread. Tid does not uniquely identify ThreadContext, it's tid+epoch that does. But we don't have epoch in MBLock.
So I think it's fine to leave code as is.

kubamracek updated this revision to Diff 81667.Dec 15 2016, 3:02 PM
kubamracek retitled this revision from [tsan] Implement __tsan_get_alloc_stack to query the allocation stack of a heap pointer to [tsan] Implement __tsan_get_alloc_stack and __tsan_locate_address to query pointer types and allocation stacks of heap pointers.
kubamracek updated this object.
kubamracek removed rL LLVM as the repository for this revision.

Updating patch to also add __tsan_locate_address. Adding a testcase.

dvyukov accepted this revision.Dec 19 2016, 1:52 AM
dvyukov edited edge metadata.
This revision is now accepted and ready to land.Dec 19 2016, 1:52 AM
This revision was automatically updated to reflect the committed changes.