This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Don't value-instrument llvm.global_ctors and llvm.global_dtors functions
Needs Review
Public

Authored by xur on May 2 2016, 10:54 PM.

Details

Reviewers
davidxl
Summary

The value-profiling runtime (__llvm_profile_instrument_target) allocates memory dynamically (i.e. via malloc). malloc can be overridden by another memory allocator through llvm.global_ctors functions, so value-instrumenting these functions can result in a deadlock.

This patch disables value instrumentation (for indirect calls) of functions referenced in llvm.global_ctors and llvm.global_dtors.
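
The check described above can be sketched outside LLVM as follows. This is a minimal model, not the patch itself: StructorEntry, collectStructorFunctions and shouldValueInstrument are hypothetical names, and functions are represented by their names rather than by LLVM Function objects.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of llvm.global_ctors or
// llvm.global_dtors; the real entries also carry an associated-data pointer.
struct StructorEntry {
    int priority;
    std::string function;  // name of the constructor or destructor function
};

// Collect the names of all functions referenced directly by either list.
std::set<std::string> collectStructorFunctions(
    const std::vector<StructorEntry> &ctors,
    const std::vector<StructorEntry> &dtors) {
    std::set<std::string> names;
    for (const StructorEntry &e : ctors) {
        assert(!e.function.empty() && "entry must name a function");
        names.insert(e.function);
    }
    for (const StructorEntry &e : dtors) {
        assert(!e.function.empty() && "entry must name a function");
        names.insert(e.function);
    }
    return names;
}

// Indirect-call value profiling is emitted only for functions that are
// not referenced by the constructor or destructor lists.
bool shouldValueInstrument(const std::string &function,
                           const std::set<std::string> &structorFunctions) {
    return structorFunctions.count(function) == 0;
}
```

Under this model only functions named directly in the two lists are skipped; anything they call is still instrumented.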

Event Timeline

xur updated this revision to Diff 55948. May 2 2016, 10:54 PM
xur retitled this revision from to [PGO] Don't value-instrument llvm.global_ctors and llvm.global_dtors functions.
xur updated this object.
xur added a reviewer: davidxl.
xur added subscribers: llvm-commits, xur.
davidxl edited edge metadata. May 3 2016, 9:39 AM

This does not work well for -O0 compilation: __cxx_global_var_init also needs to be skipped, but it is not directly referenced by llvm.global_ctors. It is probably simpler to just skip functions in the startup section.

@llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I_t.cc, i8* null }]

; Function Attrs: uwtable
define internal void @__cxx_global_var_init() #0 section ".text.startup" {
  call void @_ZN1AC1Ev(%struct.A* @a)
  ret void
}

; Function Attrs: uwtable
define internal void @_GLOBAL__sub_I_t.cc() #0 section ".text.startup" {
  call void @__cxx_global_var_init()
  ret void
}
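
The section-based alternative suggested here could look roughly like the sketch below. It is only an illustration: FunctionInfo and isInStartupSection are hypothetical names, and the default section name ".text.startup" is taken from the IR above, not from any target query.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified view of a function: its name and the section
// it was assigned to (empty when no explicit section is set).
struct FunctionInfo {
    std::string name;
    std::string section;
};

// Returns true when the function sits in the startup section and would
// therefore be skipped by value instrumentation under this scheme.
// Assumption: the startup section name is passed in by the caller; the
// default matches the IR excerpt above.
bool isInStartupSection(const FunctionInfo &f,
                        const std::string &startupSection = ".text.startup") {
    assert(!startupSection.empty() && "startup section name must be given");
    return f.section == startupSection;
}
```

With the two functions from the excerpt, both __cxx_global_var_init and _GLOBAL__sub_I_t.cc are skipped, while a function without an explicit section is not.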

silvas added a subscriber: silvas. May 3 2016, 9:20 PM

> This does not work well for O0 compilation. __cxx_global_var_init also needs to be skipped, but it is not directly referenced by llvm.global_ctors. Probably just skip functions in startup section.

I don't think that is a good solution. That section is not always applied, and its name varies from platform to platform even when it is (see getStaticInitSectionSpecifier in clang and its overrides); duplicating that information in this pass is undesirable.

Maybe we could do a call-graph walk starting from the llvm.global_[cd]tors functions? A naive walk is probably not sufficient (it would find too many things), but we could restrict it to functions that are used transitively by llvm.global_[cd]tors and have no other uses.
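
One way to read the restricted walk is as a fixed-point computation: start from the functions listed in llvm.global_[cd]tors and add a callee only when every one of its callers is already in the set. The sketch below models that idea with plain containers. CallGraph and structorOnlyFunctions are hypothetical names, and the model assumes direct calls only, so address-taken functions and other non-call uses are not accounted for.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call-graph model: caller name -> names of direct callees.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns the roots plus every function whose callers all lie inside the
// returned set, i.e. functions reachable only through the roots.
std::set<std::string> structorOnlyFunctions(const CallGraph &cg,
                                            const std::set<std::string> &roots) {
    // Build the reverse edges: callee -> set of callers.
    std::map<std::string, std::set<std::string>> callers;
    for (const auto &edge : cg)
        for (const std::string &callee : edge.second)
            callers[callee].insert(edge.first);

    std::set<std::string> skip = roots;
    bool changed = true;
    while (changed) {  // iterate until no more functions can be added
        changed = false;
        for (const auto &entry : callers) {
            if (skip.count(entry.first)) continue;
            assert(!entry.second.empty());  // every callee has a caller
            bool onlyUsedBySkipped = true;
            for (const std::string &caller : entry.second) {
                if (!skip.count(caller)) {
                    onlyUsedBySkipped = false;
                    break;
                }
            }
            if (onlyUsedBySkipped) {
                skip.insert(entry.first);
                changed = true;
            }
        }
    }
    return skip;
}
```

For the IR earlier in the thread, with _GLOBAL__sub_I_t.cc as the only root, this picks up __cxx_global_var_init. A constructor such as _ZN1AC1Ev would be left instrumented if it were also called from elsewhere (for example from main), which is the "too many things" case the restriction is meant to avoid.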