
Discussion: Darwin Sanitizers Stable ABI
ClosedPublic

Authored by rsundahl on Feb 9 2023, 12:45 PM.

Details

Summary

Darwin Sanitizers Stable ABI

We wish to make it possible to include the AddressSanitizer (ASan) runtime implementation in OSes and for this we need a stable ASan ABI. Based on previous discussions about this topic, our understanding is that freezing the present ABI would impose an excessive burden on other sanitizer developers and for unrelated platforms. Therefore, we propose adding a secondary stable ABI for our use and anyone else in the community seeking the same. We believe that we can define a stable ABI with minimal burden on the community, expecting only to keep existing tests running and implementing stubs when new features are added. We are okay with trading performance for stability with no impact for existing users of ASan while minimizing the maintenance burden for ASan maintainers. We wish to commit this functionality to the LLVM project to maintain it there. This new and stable ABI will abstract away the implementation details allowing new and novel approaches to ASan for developers, researchers and others.

Details

Rather than adding a lot of conditional code to the LLVM instrumentation phase, which would incur excessive complexity and maintenance cost in every place that emits a runtime call, we propose a “shim” layer that maps the unstable ABI to the stable ABI:

  • A static library (.a library) shim that maps the existing ASan ABI to a generalized, smaller and stable ABI. The library would implement the __asan functions and call into the new ABI. For example:
    • void __asan_load1(uptr p) { __asan_abi_loadn(p, 1, true); }
    • void __asan_load2(uptr p) { __asan_abi_loadn(p, 2, true); }
    • void __asan_noabort_load16(uptr p) { __asan_abi_loadn(p, 16, false); }
    • void __asan_poison_cxx_array_cookie(uptr p) { __asan_abi_pac(p); }
  • This “shim” library would only be used by people who opt in: A compilation flag in the Clang driver will be used to gate the use of the stable ABI workflow.
  • Utilize the existing ability for the ASan instrumentation to prefer runtime calls instead of inlined direct shadow memory accesses.
  • Pursue (under the new driver flag) a better separation of abstraction and implementation with:
    • LLVM instrumentation: Calling out for all poisoning, checking and unpoisoning.
    • Runtime: Implementing the stable ABI and being responsible for the implementation details of the shadow memory.
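
The forwarding pattern in the bullets above can be sketched as a small self-contained program. This is an illustration only, not the shim from the patch: the `shim_demo` namespace, the underscore-free function names, the `LoadCheck` record, and the recording "runtime" are all hypothetical stand-ins chosen so the sketch compiles on its own and does not collide with a real ASan runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace shim_demo {

using uptr = std::uintptr_t;

// One recorded call into the (hypothetical) stable ABI.
struct LoadCheck {
  uptr addr;
  std::size_t size;
  bool abort_on_error;
};

// Stand-in for a runtime that implements the stable ABI. A real runtime
// would consult shadow memory here; this one only records the request.
inline std::vector<LoadCheck> &recorded_checks() {
  static std::vector<LoadCheck> checks;
  return checks;
}

// Stable entry point: one generalized function instead of one per size.
inline void asan_abi_loadn(uptr p, std::size_t size, bool abort_on_error) {
  assert(size > 0 && "a load check must cover at least one byte");
  recorded_checks().push_back({p, size, abort_on_error});
}

// Shim: the size-specific entry points that the instrumentation emits
// forward to the generalized stable entry point.
inline void asan_load1(uptr p) { asan_abi_loadn(p, 1, true); }
inline void asan_load2(uptr p) { asan_abi_loadn(p, 2, true); }
inline void asan_noabort_load16(uptr p) { asan_abi_loadn(p, 16, false); }

} // namespace shim_demo
```

The point of the design is that the instrumentation-facing surface can grow or change while the runtime only ever sees the smaller generalized call.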

Maintenance

Our aim is that the maintenance burden on the sanitizer developer community be negligible. Stable ABI tests will always pass for non-Darwin platforms. Changes to the existing ABI which would require a change to the shim have been infrequent as the ASan ABI is already relatively stable. Rarely, a change that impacts the contract between LLVM and the shim will occur. Among such foreseeable changes are: 1) changes to a function signature, 2) additions of new functions, or 3) deprecation of an existing function. Following are some examples of reasonable responses to those changes:

  • Example: An existing ABI function is changed to return the input parameter on success or NULL on failure. In this scenario, a reasonable change to the shim would be to modify the function signature appropriately and to simply guess at a common-sense implementation.
    • uptr __asan_load1(uptr p) { __asan_abi_loadn(p, 1, true); return p; }
  • Example: An additional function is added for performance reasons. It has a very similar function signature to other similarly named functions and logically is an extension of that same pattern. In this case it would make sense to apply the same logic as the existing entry points:
    • void __asan_load128(uptr p) { __asan_abi_loadn(p, 128, true); }
  • Example: An entry point is added to the existing ABI for which there is no obvious stable ABI implementation: In this case, doing nothing in a no-op stub would be acceptable, assuming existing features of ASan can still work without an actual implementation of this new function.
    • void __asan_prefetch(uptr p) { }
  • Example: An entrypoint in the existing ABI is deprecated and/or deleted:
    • (Delete the entrypoint from the shim.)

We’re looking for buy-in for this level of support.

(Note: Upon acceptance of the general concepts herein, we will add a controlling clang flag, cmake integration, contract for the stable ABI, and the appropriate test infrastructure.)

Diff Detail

Event Timeline


@kcc @eugenis @MaskRay @vitalybuka Ok to go with this? All new functionality is under the added flag so not expecting any surprises. Rename asabi->asan_abi as suggested.

Thanks, asan_abi LGTM.

I don't have good reasons to object to this patch, but I suspect it's sub-optimal. Still, we may gain valuable experience.

Rather than adding a lot of conditional code to the LLVM instrumentation phase

We do this for hwasan for Android, and to some extent msan for Chromium. @eugenis maybe can share more info.

Based on previous discussions about this topic, our understanding is that freezing the present ABI would impose an excessive burden on other sanitizer developers and for unrelated platforms.

I guess we just have no way to enforce that. A couple of buildbots with "stable clang" + "HEAD runtime" and "HEAD clang" + "stable runtime" that do some non-trivial build, e.g. a clang bootstrap, can enforce that. We can at least enforce the default set of flags.

Small insignificant note from me: When this lands, please be sure to add me as co-author.
https://github.blog/2018-01-29-commit-together-with-co-authors/

I'm fine with it in general. Is asan_abi.cpp meant as a temporary stub? It's not even linked anywhere in the current version.

Right, we should be using it... We will add a test that compiles and links to it as affirmation that we have covered all of the entrypoints that ASan generates. It's also the case that asan_abi_shim.h is redundant since asan_abi_shim.cpp now gets its declarations from ../asan/asan_interface_internal.h which is of course the source of truth for what the shim should be implementing. We will remove that as well and update. Thanks.

Small insignificant note from me: When this lands, please be sure to add me as co-author.
https://github.blog/2018-01-29-commit-together-with-co-authors/

I've not seen this before @thetruestblue! I certainly will do so!

rsundahl updated this revision to Diff 521479.Thu, May 11, 3:34 PM

Added testing. Removed asan_abi_shim.h.

rsundahl updated this revision to Diff 521482.Thu, May 11, 3:43 PM

Fixed nits (missing newlines at end of files)

I'm fine with it in general. Is asan_abi.cpp meant as a temporary stub? It's not even linked anywhere in the current version.

We now use it during testing to close the loop on the question of whether the file is "complete" in the sense that it satisfies the minimal "no-op" implementation of the abi. We also moved from having a hand-curated include file to using the actual interface file which should be the root truth for what needs to be in there. We discovered a few additional functions that were in asan_interface.h but aren't strictly part of the interface between the instrumentation and the runtime but rather are used intra-runtime. Some other routines living in asan_interface.h are really documented "helper" functions. Maybe these should be aggregated somewhere else and/or under a different namespace. For now we ignore those entrypoints by listing them in compiler-rt/lib/asan_abi/darwin_exclude_symbols.inc
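
The completeness check described above amounts to a set difference: every symbol in the interface must either be implemented by the shim or appear in the exclusion list. A minimal sketch of that logic follows, assuming the symbol lists have already been extracted (the actual test works on symbol listings from the built objects). The `coverage_demo` namespace and `missing_symbols` helper are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

namespace coverage_demo {

using SymbolSet = std::set<std::string>;

// Returns the interface symbols that are neither implemented by the shim
// nor explicitly excluded. An empty result means the shim is "complete".
inline SymbolSet missing_symbols(const SymbolSet &interface_symbols,
                                 const SymbolSet &shim_symbols,
                                 const SymbolSet &excluded_symbols) {
  // Step 1: interface symbols that the shim does not define.
  SymbolSet not_in_shim;
  std::set_difference(interface_symbols.begin(), interface_symbols.end(),
                      shim_symbols.begin(), shim_symbols.end(),
                      std::inserter(not_in_shim, not_in_shim.end()));
  // Step 2: drop the ones that are deliberately ignored for now.
  SymbolSet missing;
  std::set_difference(not_in_shim.begin(), not_in_shim.end(),
                      excluded_symbols.begin(), excluded_symbols.end(),
                      std::inserter(missing, missing.end()));
  return missing;
}

} // namespace coverage_demo
```

A test built on this idea fails whenever a new entrypoint is added to the interface without either a shim implementation or an explicit exclusion, which is what keeps the shim from silently falling behind.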

vitalybuka added inline comments.Thu, May 11, 4:17 PM
compiler-rt/lib/asan_abi/asan_abi_shim.cpp
63

static_assert

MaskRay added inline comments.Thu, May 11, 5:22 PM
compiler-rt/lib/asan_abi/asan_abi_shim.cpp
15

Comments are usually a complete sentence with a period. There are exceptions, but a "Globals" needs elaboration to make it better understandable by a reader.

17

2-space indentation, here and throughout.

42

C++ prefers () instead of (void).

487

You may apply clang-format, which may turn this into (void *)0, but nullptr likely works better.

compiler-rt/test/asan_abi/TestCases/Darwin/llvm_interface_symbols.cpp
2

excess spaces

-O2 ... -O0?

5

One blank line suffices.

10

We generally prefer llvm counterparts to the system binary utilities. Use llvm-nm

16

unneeded ^//$ lines, here and throughout.

17

Does Darwin sed accept ; to combine multiple -e into one -e with ;?

22

2-space indentation for | continuation lines as well

27

sort %t.imports. See Useless use of cat

compiler-rt/test/asan_abi/TestCases/linkstaticlibrary.cpp
2

-O2 ... -O0

I'm fine with it in general. Is asan_abi.cpp meant as a temporary stub? It's not even linked anywhere in the current version.

We now use it during testing to close the loop on the question of whether the file is "complete" in the sense that it satisfies the minimal "no-op" implementation of the abi. We also moved from having a hand-curated include file to using the actual interface file which should be the root truth for what needs to be in there. We discovered a few additional functions that were in asan_interface.h but aren't strictly part of the interface between the instrumentation and the runtime but rather are used intra-runtime. Some other routines living in asan_interface.h are really documented "helper" functions. Maybe these should be aggregated somewhere else and/or under a different namespace. For now we ignore those entrypoints by listing them in compiler-rt/lib/asan_abi/darwin_exclude_symbols.inc

How is compiler-rt/lib/asan_abi/darwin_exclude_symbols.inc used in the upstream and downstream build system?
In this patch this file is only used by one test?

clang/include/clang/Driver/Options.td
1794

See BooleanFFlag. Some existing sanitizer options are not written with the best practice.

If -fno-sanitize-stable-abi is the default, there is usually no need to have a duplicate help message Disable .... Documenting the non-default boolean option suffices.

clang/lib/Driver/SanitizerArgs.cpp
917

Existing code unnecessarily reads the previous value (false) of the variable. No need to copy that for new code.

1289

Optional nit: I added -mllvm= as an alias in D143325. You can use -mllvm=-asan-instrumentation-with-call-threshold=0 to decrease the number/length of cc1 options.

Add some comments before if (StableABI) { why the two cl::opt options are specified.

clang/test/Driver/fsanitize.c
266 ↗(On Diff #521482)

I think the tests should go to a new file fsanitize-stable-abi.c. The checks are different enough from the rest of fsanitize.c (which can be split).

269 ↗(On Diff #521482)

I presume that you want to test the positive forms with Darwin triples like arm64-apple-darwin?

We can even argue that the option should lead to an error for non-Darwin triples.

compiler-rt/docs/asan_abi.md
1 ↗(On Diff #521482)

The existing compiler-rt/docs docs use .rst. Better to use .rst to not introduce more than one format for one subproject.

.rst is used much more than .md in llvm-project anyway.

compiler-rt/lib/asan_abi/asan_abi.h
14

use clang-format to sort the headers. I'd expect that stdbool and stddef are placed together for any sorting behavior.

16

add a blank line before extern "C" {

compiler-rt/test/asan_abi/lit.cfg.py
10

This workaround is unneeded. I sent D150410 to clean up other lit.cfg.py files.

21

is None is better.

84

!=

@kcc @eugenis @MaskRay @vitalybuka Ok to go with this? All new functionality is under the added flag so not expecting any surprises. Rename asabi->asan_abi as suggested.

Thanks, asan_abi LGTM.

I don't have good reasons to object to this patch, but I suspect it's sub-optimal. Still, we may gain valuable experience.

Rather than adding a lot of conditional code to the LLVM instrumentation phase

We do this for hwasan for Android, and to some extent msan for Chromium. @eugenis maybe can share more info.

Based on previous discussions about this topic, our understanding is that freezing the present ABI would impose an excessive burden on other sanitizer developers and for unrelated platforms.

I guess we just have no way to enforce that. A couple of buildbots with "stable clang" + "HEAD runtime" and "HEAD clang" + "stable runtime" that do some non-trivial build, e.g. a clang bootstrap, can enforce that. We can at least enforce the default set of flags.

Very sorry for my belated response. I feel that I am not a decision maker, so I am waiting on the maintainers.
I do care about driver options (as a code owner) and sanitizer maintainability (my interest), though. I have left some comments and will be happy when they are resolved.

The documentation compiler-rt/docs/asan_abi.md probably needs more polishing. The current style reads like a request for comments, not like documentation for something already in tree.

compiler-rt/docs/asan_abi.md
3 ↗(On Diff #521482)

The introductory paragraph is written in the style of an RFC, not of official documentation.
Official documentation should be written as though the feature has already been accepted.
Sentences about topics such as maintenance costs can be moved to the Maintenance chapter.

I have a simplified introductory paragraph:

Some OSes like Darwin want to include the AddressSanitizer runtime by establishing a stable ASan ABI. lib/asan_abi contains a secondary stable ABI for our use and others in the community. This new ABI will have minimal impact on the community, prioritizing stability over performance.

Feel free to add more sentences if you feel this is too simplified.

Note that .rst uses two backticks where Markdown uses one.

7 ↗(On Diff #521482)

Similarly, words like "propose" are RFC-style wording, not for the official documentation. For the official documentation, you just say what this is.

14 ↗(On Diff #521482)

This “shim” library will only be used when -fsanitize-stable-abi is specified in the Clang driver.

thetruestblue added inline comments.Thu, May 11, 8:12 PM
compiler-rt/test/asan_abi/lit.cfg.py
84

The thought here was to leave the basic lit patterns intact so they can be expanded to other OSes if others want to in the future. But if there's no desire for that, it doesn't make a big difference to me.

rsundahl updated this revision to Diff 522417.Mon, May 15, 8:20 PM

Applied suggestions from reviewers

Cleaned up options parsing
Moved test into standalone file fsanitize-stable-abi.c
Changed target triple to arm64-apple-darwin
Changed documentation style from proposal to specification
Changed format of documentation from .md to .rst
Applied clang-format to entire diff
Removed extraneous spaces and lines
Expanded comments to sentences
Switched to static_assert()

rsundahl marked 22 inline comments as done.Mon, May 15, 8:44 PM

Thank you for your review and thoughtful input @eugenis, @MaskRay and @vitalybuka. I think we're close to having everything incorporated. (@MaskRay, the doc files went from .md to .rst and I implemented all of your suggestions there.)

clang/include/clang/Driver/Options.td
1794

(Not sure if this is exactly what you meant @MaskRay but I think it's close.)

clang/lib/Driver/SanitizerArgs.cpp
1289

I couldn't get this one to work. Did I do it wrong? (Couldn't find example in the code to go from.)

Tried:

if (StableABI) {
  CmdArgs.push_back("-mllvm=-asan-instrumentation-with-call-threshold=0");
  CmdArgs.push_back("-mllvm=-asan-max-inline-poisoning-size=0");
}

Got:

error: unknown argument: '-mllvm=-asan-instrumentation-with-call-threshold=0'
error: unknown argument: '-mllvm=-asan-max-inline-poisoning-size=0'
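
For comparison, here is a sketch of the two-token spelling, where `-mllvm` and its value are pushed as separate arguments. One plausible reading of the errors above is that the joined `-mllvm=` alias is recognized by the driver but not by the cc1 invocation these arguments are forwarded to; that is an assumption, not something confirmed in this thread. The `driver_demo` namespace, the `ArgList` alias, and `add_stable_abi_args` are hypothetical stand-ins so the sketch compiles without the Clang sources.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace driver_demo {

// Stand-in for the driver's argument list type (an assumption for this sketch).
using ArgList = std::vector<std::string>;

// Appends the two backend options that the stable ABI mode relies on,
// each as a separate "-mllvm" token followed by its value.
inline void add_stable_abi_args(bool stable_abi, ArgList &cmd_args) {
  if (!stable_abi)
    return;
  // A threshold of 0 asks the instrumentation to use runtime calls for
  // memory-access checks rather than inlined shadow-memory checks.
  cmd_args.push_back("-mllvm");
  cmd_args.push_back("-asan-instrumentation-with-call-threshold=0");
  // A maximum inline poisoning size of 0 asks for shadow poisoning to go
  // through the runtime as well.
  cmd_args.push_back("-mllvm");
  cmd_args.push_back("-asan-max-inline-poisoning-size=0");
}

} // namespace driver_demo
```

The later update in this review ("Missed one file in revert of combined -mllvm= change") indicates the combined spelling was backed out.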
compiler-rt/lib/asan_abi/asan_abi_shim.cpp
42

These are actually "C".
Added:

extern "C" {
...
}
487

clang-format left this as-is. I suspect this is because I also added the extern "C" brackets.

compiler-rt/test/asan_abi/TestCases/Darwin/llvm_interface_symbols.cpp
17

I couldn't get the same behavior out of the intersection of GNU and BSD sed. I tried hard in https://reviews.llvm.org/D138824 and landed on the -e's. I don't recall exactly what the problem was with semicolons, just that I was relieved when I found a format that worked for all the platforms.

27

Good point!

compiler-rt/test/asan_abi/lit.cfg.py
10

Wasn't actually used anyway but good to know!

84

I landed on just an else clause. Let me know if that's ok @thetruestblue.

rsundahl marked 8 inline comments as done.Mon, May 15, 8:46 PM
rsundahl marked 4 inline comments as done.Mon, May 15, 9:00 PM

Suggestions for compiler-rt/docs/asan_abi.md are captured in the successor file compiler-rt/docs/ASanABI.rst and marked complete.

rsundahl marked 2 inline comments as done.Mon, May 15, 9:02 PM

Suggestions for compiler-rt/docs/asan_abi.md are captured in the successor file compiler-rt/docs/ASanABI.rst and marked complete.

rsundahl updated this revision to Diff 522452.Mon, May 15, 11:31 PM

Missed one file in revert of combined -mllvm= change.

rsundahl updated this revision to Diff 522592.Tue, May 16, 6:46 AM

Renamed darwin_exclude_symbols.inc to asan_abi_tbd.txt.

This file contains the entrypoints that aren't strictly in the interface
between the instrumentation and the runtime but may still be part of a
public API that needs to have a home in Stable ABI. For now we acknowledge
them in the existing ABI and explicitly list them as TBD until we have a
complete story for how the shim deals with them.

MaskRay accepted this revision.Tue, May 16, 10:33 AM

@MaskRay, thank you for your approval. @eugenis, you were added as a blocking reviewer by @vitalybuka. If you are still without objection, can we get your approval and merge? Thank you all for your input.

MaskRay added inline comments.Wed, May 17, 7:26 PM
compiler-rt/docs/ASanABI.rst
30

How does the 2-space indentation render in the built HTML? It may look good, I ask just in case.

compiler-rt/lib/asan_abi/asan_abi.h
19

asan_abi.h is C++ only (extern "C" isn't allowed in C). (void) should be replaced with ().

compiler-rt/lib/asan_abi/asan_abi_shim.cpp
15

Below there is no :. You may omit this : as well.

346
compiler-rt/test/asan_abi/lit.cfg.py
14

This file mixes single quotes and double quotes (the file it copied from does so as well). Pick one and be consistent!

rsundahl marked an inline comment as done.Thu, May 18, 8:44 AM
rsundahl added inline comments.
compiler-rt/docs/ASanABI.rst
30

How does the 2-space indentation render in the built HTML? It may look good, I ask just in case.

IDK so I played around with it. Global search/replace 2 spaces with 4 does not affect rendering of the block above at all. In the block below there was an effect which was to indent the code blocks one more stop. The reason for this seems to be that the "current indent level" is advanced by 2 in a bulleted list, so below, the code block statement is actually aligned with the bulleted paragraph above it. After the GSR, the code block is indented and rendered further indented. Since bullets advance the "current indent level" by 2, maybe a more natural indent for the source code (.rst) is to use 2 as well. Seems to read a little easier in the source and avoids having to think about all the multiple of 4's after a bullet being "off by 2".

rsundahl marked an inline comment as done.Thu, May 18, 8:46 AM
rsundahl updated this revision to Diff 523409.Thu, May 18, 9:32 AM

Implement suggestions from latest review.

rsundahl marked 4 inline comments as done.Thu, May 18, 9:37 AM

Hello Egenii,

Thank you for your time and consideration of this PR. Since you last commented, @vitalybuka has approved the PR and added @MaskRay and yourself as blocking reviewers. @MaskRay has approved and we are awaiting your approval if you remain positive to it (or let us know if you have any new reservations).

Thank you,
-Roy Sundahl

eugenis accepted this revision.Wed, May 24, 2:38 PM
This revision is now accepted and ready to land.Wed, May 24, 2:38 PM
MaskRay accepted this revision.Wed, May 24, 8:21 PM

LGTM.

compiler-rt/docs/ASanABI.rst
16

Delete .... Sample content implies that this is a code fragment and does not contain everything, so ... is redundant.

23

Quote all compiler driver options with double backticks.

rsundahl updated this revision to Diff 525644.Thu, May 25, 8:44 AM

Apply proper backticks quotes to options. Remove redundant ellipses.

rsundahl marked 2 inline comments as done.Thu, May 25, 8:48 AM

Thanks for your time and guidance getting this landed.

This revision was landed with ongoing or failed builds.Thu, May 25, 8:58 AM
This revision was automatically updated to reflect the committed changes.