This is an archive of the discontinued LLVM Phabricator instance.

Correctly report error when interceptor points at an unlinked symbol
Needs ReviewPublic

Authored by serge-sans-paille on Jul 31 2020, 2:04 AM.

Details

Summary
  • This is a demonstration of a patch, I'd like to generalize it and use proper setup **

This is a preleminary patch for this issue: https://bugzilla.redhat.com/show_bug.cgi?id=1827338
Basically, libclang_rt.asan provides a weak version of the crypt_r symbol (among others) which prevents the linker from raising a link error if the user forgets to link with the crypt library. Adding the appropriate check for crypt_r during interceptor installation helps to report the error correctly.

I think the check code should be put in a more generic place (where?) and maybe the error message be improved (?)

Diff Detail

Event Timeline

serge-sans-paille requested review of this revision.Jul 31 2020, 2:04 AM
serge-sans-paille created this revision.
fhahn added a reviewer: yln.Sep 1 2020, 7:14 AM
yln added a comment.Sep 1 2020, 10:40 AM
This comment was removed by yln.
yln added a comment.Sep 1 2020, 10:47 AM

On Linux, I think we define our interceptors like this:

# define DECLARE_WRAPPER(ret_type, func, ...) \
    extern "C" ret_type func(__VA_ARGS__) \
    __attribute__((weak, alias("__interceptor_" #func), visibility("default")));
...
extern "C" INTERCEPTOR_ATTRIBUTE \
  ret_type WRAP(func)(__VA_ARGS__) {
    x = REAL(func)(arg1, arg2);
    return x;
}

And REAL(func) is just a function pointer to the intercepted function retrieved via dlsym() during initialization.
This allows us to define interceptors in the ASan runtime for functions that aren't necessarily present in the original program.
In your example: what happens if the original program doesn't use and doesn't want to link the crypt library?

I am not sure if there is a way around this?
You can use ASAN_OPTIONS=verbosity=1 to get warnings for functions that couldn't be retrieved for interception.

(On macOS, interception is aided by the linker and works slightly different.)

dmajor added a subscriber: dmajor.Sep 1 2020, 2:41 PM

Thanks for the rely!

And REAL(func) is just a function pointer to the intercepted function retrieved via dlsym() during initialization.
This allows us to define interceptors in the ASan runtime for functions that aren't necessarily present in the original program.
In your example: what happens if the original program doesn't use and doesn't want to link the crypt library?

Then we get an error, which is unexpected.

I am not sure if there is a way around this?

extern "C" INTERCEPTOR_ATTRIBUTE \
  ret_type WRAP(func)(__VA_ARGS__) {
    x = REAL(func)(arg1, arg2);
    return x;
}

We could dynamically check if REAL is correctly set before calling it. With the correct `__builtin_expect` the overhead may not be too significant?

You can use ASAN_OPTIONS=verbosity=1 to get warnings for functions that couldn't be retrieved for interception.

(On macOS, interception is aided by the linker and works slightly different.)

yln added a comment.Sep 8 2020, 11:31 AM

Let me recap for my own understanding. The original issue below:

Basically, libclang_rt.asan provides a weak version of the crypt_r symbol (among others) which prevents the linker from raising a link error if the user forgets to link with the crypt library.

I can't think of a way to get around this root issue (without significant changes to how interception works today) and since it's a linking issue, we can't solve it with a runtime check.
However, adding a runtime check could give us better diagnostics instead of just a "random" nullptr dereference in an ASan/sanitizer interceptor.

During initialization, we can do a check, but I can't think of a way to distinguish the two cases for crypt_r not being present:

  • program doesn't use crypt_r and doesn't want to link it, or
  • program uses crypt_r but we forgot to link it.

Adding a check on delegation (for every REAL(func)) would certainly be possible.

From my point of view this is not worth doing because:

  • Sounds like a high risk change.
  • It doesn't solve/diagnose the root issue.
  • This sounds like a fringe issue; I hope everyone evenutally compiles their program without sanitizers at some point, and
  • compiling without -fsanitize=xxx should give the right linker diagnostic.

I am happy to have my mind changed (and others should probably agree).

Thanks for the recap, that's a very clarifying one.

In D85011#2261601, @yln wrote:

Let me recap for my own understanding. The original issue below:

Basically, libclang_rt.asan provides a weak version of the crypt_r symbol (among others) which prevents the linker from raising a link error if the user forgets to link with the crypt library.

I can't think of a way to get around this root issue (without significant changes to how interception works today) and since it's a linking issue, we can't solve it with a runtime check.
However, adding a runtime check could give us better diagnostics instead of just a "random" nullptr dereference in an ASan/sanitizer interceptor.

During initialization, we can do a check, but I can't think of a way to distinguish the two cases for crypt_r not being present:

  • program doesn't use crypt_r and doesn't want to link it, or
  • program uses crypt_r but we forgot to link it.

Adding a check on delegation (for every REAL(func)) would certainly be possible.

I agree with everything until that point. That's my understanding too.

From my point of view this is not worth doing because:

  • Sounds like a high risk change.

I was thinking of something like this

--- a/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc
@@ -9797,6 +9797,9 @@ INTERCEPTOR(char *, crypt_r, char *key, char *salt, void *data) {
                                    __sanitizer::struct_crypt_data_sz);
     COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1);
   }
+  else {
+      # print error, or abort
+  }
   return res;
 }
 #define INIT_CRYPT_R COMMON_INTERCEPT_FUNCTION(crypt_r);
  • It doesn't solve/diagnose the root issue.

I think that a message like "trying to call function xxxx that's not linked into that program" is somehow a good hint.

  • This sounds like a fringe issue; I hope everyone evenutally compiles their program without sanitizers at some point, and
  • compiling without -fsanitize=xxx should give the right linker diagnostic.

I'd tend to agree with that one. But some user may disagree ;-)

I am happy to have my mind changed (and others should probably agree).