This is an archive of the discontinued LLVM Phabricator instance.

[esan] EfficiencySanitizer base runtime library
ClosedPublic

Authored by bruening on Apr 15 2016, 11:07 AM.

Details

Summary

Adds the initial version of a runtime library for the new
EfficiencySanitizer ("esan") family of tools. The library includes:

+ Slowpath code via callouts from the compiler instrumentation for

each memory access.

+ Registration of atexit() to call finalization code.

+ Runtime option flags controlled by the environment variable

ESAN_OPTIONS.  The common sanitizer flags are supported such as
verbosity and log_path.

Still TODO: common code for libc interceptors and shadow memory mapping,
and tool-specific code for shadow state updating.

Diff Detail

Repository
rL LLVM

Event Timeline

bruening updated this revision to Diff 53918.Apr 15 2016, 11:07 AM
bruening retitled this revision from to [esan] EfficiencySanitizer base runtime library.
bruening updated this object.
bruening added a reviewer: eugenis.
bruening added subscribers: kcc, zhaoqin, llvm-commits.
vitalybuka added inline comments.Apr 15 2016, 3:17 PM
lib/esan/esan.cc
54 ↗(On Diff #53918)

This looks inconsistent with functions below.
why it's SizeLog, not just Size and round up to the power of 2 here?

lib/esan/esan.h
41 ↗(On Diff #53918)

I am new to this code, so is there a convention for this?
Why not enum or even: constexpr sizeLog(size_t size) {...}

aizatsky added inline comments.Apr 15 2016, 3:37 PM
cmake/config-ix.cmake
633 ↗(On Diff #53918)

is CFI_SUPPORTED_ARCH expected here?

lib/esan/esan.cc
84 ↗(On Diff #53918)

replace by a call to InitializeCommonFlags (sanitizer_flags.h)

lib/esan/esan_interface.cc
25 ↗(On Diff #53918)

I'm not sure if it will help with inlining because you add calls during instrumentation.

47 ↗(On Diff #53918)

any special reason why not kSizeLog16?

lib/esan/esan_interface.h
37 ↗(On Diff #53918)

Is there a reason why you have lots of _N functions instead of

__esan_aligned_write(void* Addr, size_t sizeLog)?

Each one of these makes this kind of call anyway.

bruening added inline comments.Apr 15 2016, 5:42 PM
cmake/config-ix.cmake
633 ↗(On Diff #53918)

No, that is an error: it should be ESAN_.

lib/esan/esan.cc
54 ↗(On Diff #53918)

The log is passed from the compiler partly to simplify the indexing in its own code. We're not doing anything with the log in the runtime, so it would probably be cleaner to change this to Size. I'll do that.

84 ↗(On Diff #53918)

Agreed.

lib/esan/esan.h
41 ↗(On Diff #53918)

This is based on what tsan does. However, we are leaning toward just throwing these out here as the log is not used for anything.

lib/esan/esan_interface.cc
47 ↗(On Diff #53918)

No, this is an artifact coming from tsan. For us we are not limited to 8 so this could become a single call (and as mentioned above we can move away from the log here). I will change it.

lib/esan/esan_interface.h
37 ↗(On Diff #53918)

The initial reason was precedence: that's what asan and tsan do. A (maybe minor) reason not to change is that if profiling shows this slowpath time causing bottlenecks we can more easily write dedicated handlers for each size.

bruening marked 6 inline comments as done.Apr 18 2016, 8:42 PM
bruening updated this revision to Diff 54155.Apr 18 2016, 8:44 PM

Addresses reviewer comments

eugenis added inline comments.Apr 19 2016, 10:57 AM
lib/esan/esan.h
21 ↗(On Diff #54155)

typo: headers

32 ↗(On Diff #54155)

sanitizer_internal_defs.h:# define GET_CALLER_PC() (uptr)__builtin_return_address(0)

lib/esan/esan_interface.h
14 ↗(On Diff #54155)

If this is meant to be included by the instrumented program, it should go under include/sanitizer. And not use sanitizer_internal_defs.h.

aizatsky added inline comments.Apr 19 2016, 1:00 PM
lib/esan/esan_interface.h
14 ↗(On Diff #54155)

Maybe this should be esan_internal_interface.h? I don't suppose any functions here should be called manually.

38 ↗(On Diff #54155)

Please add a general comment about this group of functions: when are they called (and when they are not). Comments on interface between instrumenter/runtime are really helpful.

bruening updated this revision to Diff 54287.Apr 19 2016, 3:47 PM
bruening marked 3 inline comments as done.

Address reviewer comments: rename header, fix typo.

lib/esan/esan_interface.h
14 ↗(On Diff #54155)

Removing that comment line. It was based on lib/tsan/tsan_interface.h (which should be in include/sanitizer?) but our only use is insertion via instrumentation.

14 ↗(On Diff #54155)

Will rename to esan_interface_internal.h to mirror asan and msan. Unfortunately that will likely make the prior comments not linked to the new file unless this review system is that advanced.

filcab added a subscriber: filcab.Apr 20 2016, 8:38 AM

Thanks again, for working on this.
Please add tests so that we're always sure the current code is building+running correctly.

lib/esan/esan.cc
66 ↗(On Diff #54287)

If you end up calling the "range access" function on both the above cases, then I see no reason to have it split from the other two.
processRangeAccess should be able to do everything.

It's trivial to think of occasions when this wouldn't happen. And I understand this is a skeleton, but I'd prefer (for now) to either:
Remove the processRangeAccess be called only when the others aren't. Or only have processRangeAccess.
(In the future, the "other path" will be added, I'm sure. But I don't think it's necessary for the current skeleton proposal, and it's very easy for some of this to fall into dead-code land)

lib/esan/esan.h
41 ↗(On Diff #54287)

Mem vs Unaligned seems weird. Maybe Aligned vs Unaligned explains it better.

lib/esan/esan_interface.cc
23 ↗(On Diff #54287)

__esan_init() will be called more than once.
We should make sure we're always called with the same tool as argument.

aizatsky accepted this revision.Apr 20 2016, 11:16 AM
aizatsky edited edge metadata.

One nit. LG otherwise.

lib/esan/esan_interface_internal.h
38 ↗(On Diff #54287)

Specify if it will insert call before or after the read/write.

This revision is now accepted and ready to land.Apr 20 2016, 11:16 AM
kubamracek added inline comments.Apr 20 2016, 11:29 AM
lib/esan/esan.h
38–40 ↗(On Diff #54287)

All the other sanitizers use capitalized function names (with the first letter capitalized as well). Can we do this in ESan as well? E.g. InitializeLibrary, FinalizeLibrary, ProcessMemAccess, etc.

aizatsky added inline comments.Apr 20 2016, 11:34 AM
lib/esan/esan.h
38–40 ↗(On Diff #54287)

Chandler asked us specifically to adhere to LLVM coding and naming guidelines.

bruening added inline comments.Apr 20 2016, 11:41 AM
lib/esan/esan.h
38–40 ↗(On Diff #54287)

Right, I talked to Kostya about this before and he asked for new files to use LLVM standards, while the existing files would remain internally consistent with their current style.

bruening marked 4 inline comments as done.Apr 20 2016, 3:13 PM

Please add tests so that we're always sure the current code is building+running correctly.

The skeleton framework was split up for smaller reviews: libc interceptors, shadow mapping, and test infrastructure were deliberately not included here. I can merge test infra into here but there is not much that can be tested at this point.

bruening updated this revision to Diff 54432.Apr 20 2016, 3:13 PM
bruening edited edge metadata.

Add test infrastructure

Almost LGTM with a few minor comments, now.
Thank you!

lib/esan/CMakeLists.txt
7 ↗(On Diff #54432)

No need to have this variable if it's the same as ESAN_CFLAGS

lib/esan/esan.cc
31 ↗(On Diff #54432)

EsanOptsEnv, to keep consistency with other abbreviations of Options
Since you're changing this line, why not the shorter static const char EsanOptsEnv[] = "..."?

lib/esan/esan_interface_internal.h
24 ↗(On Diff #54432)

This is an internal definition of the interface, it should always be compiled as C++.

28 ↗(On Diff #54432)

Pop this off to something like lib/esan/esan_internal_defs.h.

29 ↗(On Diff #54432)

In the instrumentation pass, you get the function with an i32 argument.
Should we fix the size of this enum to be 32-bits (I don't know of any practical problems in a common arch with this as it is off the top of my head, though)?

lib/sanitizer_common/sanitizer_common_interceptors.inc
35 ↗(On Diff #54432)

Please split this comment addition to a separate commit.

test/esan/lit.cfg
28 ↗(On Diff #54432)

Since this is new and adhering to LLVM conventions, we can probably remove the .cc, right?

31 ↗(On Diff #54432)

I really don't like testing host_os. ESan availability will depend on the target, not the host.
Right now there's no better option in lit, so let's do it this way.
But I'll ask you to try and keep test command lines to constructs that lit's shell knows how to parse, if possible/easy.

34 ↗(On Diff #54432)

You can merge this with the above:

if config.host_os not in ['Linux'] or config.target_arch != 'x86_64':
  config.unsupported = True
bruening marked 5 inline comments as done.Apr 21 2016, 11:13 AM
bruening added inline comments.
lib/esan/esan_interface_internal.h
24 ↗(On Diff #54432)

This is the interface between the compiler instrumentation and the runtime library. It is C for simplicity. This matches what all the existing sanitizers do.

28 ↗(On Diff #54432)

To me that sounds like definitions internal to the runtime library and not shared with anything else, while these are part of the interface with the compiler instrumentation. (Having "internal" in the name of this file was not my first choice: requested as part of this review.) The parallel files tsan_interface.h and asan_interface_internal.h also have types in the same file.

29 ↗(On Diff #54432)

You mean, make this a type-less enumerator and typedef u32 as ToolType? That is less appealing. I will add a static_assert on the sizeof to the .cc file.

lib/sanitizer_common/sanitizer_common_interceptors.inc
35 ↗(On Diff #54432)

It is a continuation of this header comment about what macros the including file needs to define, making a comment on one of them. Probably it looks different only seeing this patch window of a few lines. Please take a look at the entire top of this file.

test/esan/lit.cfg
28 ↗(On Diff #54432)

Hmm, I suppose the runtime library implementation files should be renamed from esan.cc to esan.cpp, etc. as well. That will churn the review diff but few comments are in the impl files.

31 ↗(On Diff #54432)

Note that every existing sanitizer lit.cfg tests host_os.

bruening updated this revision to Diff 54550.Apr 21 2016, 11:14 AM
bruening marked 2 inline comments as done.

Rename .cc to .cpp; add unusually-sized callouts.

filcab accepted this revision.Apr 21 2016, 11:49 AM
filcab added a reviewer: filcab.

Not much to do now except for some final problems. I'll LGTM since I might not be able to reply today and I only need minor fixes.

lib/esan/esan_interface_internal.h
25 ↗(On Diff #54550)

Sorry, I have the annotation in the wrong line. I'm not complaining about the extern "C".
The #ifdef __cplusplus is always true, since this is the internal (for the library only) definition of the API, which we know will always be compiled as C++. (Check ASan's asan_interface_internal.h. TSan has the ifdefs because of one unit test. Otherwise, it wouldn't need the ifdef either)

29 ↗(On Diff #54550)

My problem was with having the type defined twice, but it seems I messed up when looking for ESAN_CacheFrag.
The constants are only defined once inside compiler-rt, so it's good. Sorry about that.

30 ↗(On Diff #54550)

No. It was more about fixing the underlying type:
http://en.cppreference.com/w/cpp/language/enum
Version (2) of "Unscoped enumeration".

lib/sanitizer_common/sanitizer_common_interceptors.inc
35 ↗(On Diff #54550)

The problem is that later, when blaming that file, people will see this commit, and possibly spend time looking at it because of an unrelated change.
I'm totally ok with adding the comment, I think it's useful. But I don't think it belongs to the "Introduce ESan" commit.

test/esan/lit.cfg
29 ↗(On Diff #54550)

Since these are all new files, probably it would be best to use .cpp to adhere more to llvm current practice.

32 ↗(On Diff #54550)

Yeah, unfortunately this is what we have now.
No problems here, as there's no good solution you can start using. I ended up ranting a bit. Sorry about the noise.

bruening added inline comments.Apr 21 2016, 1:04 PM
lib/esan/esan_interface_internal.h
30 ↗(On Diff #54550)

That is on the list of C++11 allowed in LLVM, but putting it inside the extern "C" feels wrong (hence why it seemed the only solution was what I mentioned). I suppose it's fine if there is no C program that includes the header, which is the case if we are able to remove the ifdef __cplusplus.

bruening marked 4 inline comments as done.Apr 21 2016, 1:19 PM
bruening added inline comments.
lib/sanitizer_common/sanitizer_common_interceptors.inc
35 ↗(On Diff #54550)

Oh, I misread "to a separate commit" as "to a separate comment".

This revision was automatically updated to reflect the committed changes.
bruening marked an inline comment as done.