This is an archive of the discontinued LLVM Phabricator instance.

[memprof] Restrict memprof profile generation to clang only.
Abandoned · Public

Authored by snehasish on Mar 6 2023, 2:23 PM.

Details

Reviewers
tejohnson
Summary

This disables the memprof tests on gcc since allocation sites are affected by
optimizations. Adopted from the approach in
compiler-rt/test/asan/TestCases/heap-overflow.cpp

Example failure: https://lab.llvm.org/buildbot/#/builders/247/builds/2185

Event Timeline

snehasish created this revision. Mar 6 2023, 2:23 PM
Herald added a project: Restricted Project. Mar 6 2023, 2:23 PM
Herald added a subscriber: Enna1.
snehasish requested review of this revision. Mar 6 2023, 2:23 PM
Herald added a project: Restricted Project. Mar 6 2023, 2:23 PM
Herald added a subscriber: Restricted Project.

I'm not sure I understand what is happening - presumably gcc is being used to build clang and compiler-rt. How is that affecting the allocation sites?

I assumed it was due to optimizations; however, after looking more closely, the behaviour is due to a difference in debug information for the compiler-rt runtime. Consider the basic test case below:

#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv) {
  char *x = (char *)malloc(10);
  memset(x, 0, 10);
  free(x);
  x = (char *)malloc(10);
  memset(x, 0, 10);
  free(x);
  return 0;
}

With both gcc and clang, we record 4 allocation sites. However, two of them are from the application and two are from the runtime only [1]. We filter the records and drop frames that come from the runtime using the check here [2]. For gcc, this check fails since the path is just the filename memprof_malloc_linux.cpp, as opposed to compiler-rt/lib/memprof/memprof_malloc_linux.cpp when built with clang. The isRuntime check looks for memprof/memprof_*, so the bare filename does not match. This seems to be an artifact of how the runtime was compiled in this environment, and perhaps we can make this check more robust by listing the runtime filenames directly?

[1] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/memprof/memprof_malloc_linux.cpp#L58
[2] https://github.com/llvm/llvm-project/blob/main/llvm/lib/ProfileData/RawMemProfReader.cpp#L151-L154
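To make the mismatch concrete, here is a minimal sketch of the two matching strategies discussed above. It is not the code in RawMemProfReader.cpp: the function names are hypothetical, and the filename list is illustrative (only memprof_malloc_linux.cpp is named in this thread; the other entries are assumptions).

```cpp
#include <cstddef>
#include <string_view>

// Directory-based check, modelled on the "memprof/memprof_*" pattern
// described above. Hypothetical name; not the real helper.
bool isRuntimePathByDirectory(std::string_view Path) {
  return Path.find("memprof/memprof_") != std::string_view::npos;
}

// Filename-based alternative: strip any directory prefix and compare the
// basename against an explicit list of runtime source files.
bool isRuntimePathByFilename(std::string_view Path) {
  const std::size_t Slash = Path.find_last_of("/\\");
  const std::string_view Base =
      Slash == std::string_view::npos ? Path : Path.substr(Slash + 1);

  // Illustrative list. Entries other than memprof_malloc_linux.cpp are
  // assumptions about which runtime files define interceptors.
  static constexpr std::string_view RuntimeFiles[] = {
      "memprof_malloc_linux.cpp",
      "memprof_interceptors.cpp",
      "memprof_new_delete.cpp",
  };
  for (std::string_view File : RuntimeFiles)
    if (Base == File)
      return true;
  return false;
}
```

With the bare filename memprof_malloc_linux.cpp, the directory-based check returns false while the filename-based check returns true. Both return true for compiler-rt/lib/memprof/memprof_malloc_linux.cpp.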

Also, looking into why we have these extra interceptions, it looks like we set memprof_init_is_running to false before initialization has finished, so a subsequent call to Symbolizer::LateInitialize is intercepted [1].

[1] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/memprof/memprof_rtl.cpp#L201-L217
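The ordering issue can be illustrated with a toy model. This is only a sketch under stated assumptions, not the memprof runtime: ToyRuntime, InterceptedAlloc, and Init are hypothetical names, and the single flag stands in for memprof_init_is_running.

```cpp
// Toy model of an interceptor that skips recording while init is running.
struct ToyRuntime {
  bool InitIsRunning = false; // stand-in for memprof_init_is_running
  int RecordedSites = 0;      // allocation sites that reach the profile

  // Stand-in for an intercepted malloc: record only outside of init.
  void InterceptedAlloc() {
    if (!InitIsRunning)
      ++RecordedSites;
  }

  // ClearFlagEarly models clearing the flag before the late
  // initialization step, which itself allocates.
  void Init(bool ClearFlagEarly) {
    InitIsRunning = true;
    InterceptedAlloc(); // allocation during early init: skipped
    if (ClearFlagEarly)
      InitIsRunning = false;
    InterceptedAlloc(); // allocation made by the late-init step
    InitIsRunning = false;
  }
};
```

In this model, clearing the flag early records one runtime-only allocation site, while delaying the reset until the end of Init records none.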

> This seems to be an artifact of how the runtime was compiled in this environment, and perhaps we can make this check more robust by listing the runtime filenames directly?

Yeah, that seems like the best option.

> Also looking into why we have these extra interceptions, it looks like we set memprof_init_is_running to false and a subsequent call to Symbolizer::LateInitialize is intercepted [1].

I don't recall why we set this to false at this specific point; it seems like it should be OK to delay it. And then I wonder if we can combine memprof_init_is_running with memprof_init_done?

> I don't recall why we set this to false at this specific point; it seems like it should be OK to delay it. And then I wonder if we can combine memprof_init_is_running with memprof_init_done?

Delaying it seems to be fine on the small tests. Let me look into what the other flags do and try to simplify if we can.

> Yeah, that seems like the best option.

Sent D145521 to match filenames which define interceptors.

> I don't recall why we set this to false at this specific point; it seems like it should be OK to delay it. And then I wonder if we can combine memprof_init_is_running with memprof_init_done?

> Delaying it seems to be fine on the small tests. Let me look into what the other flags do and try to simplify if we can.

Done in D145528.

snehasish abandoned this revision. Mar 8 2023, 11:25 AM