This is an archive of the discontinued LLVM Phabricator instance.

[RFC][nsan] A Floating-point numerical sanitizer.
Needs ReviewPublic

Authored by courbet on Mar 3 2021, 6:21 AM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

LLVM has sanitizers for thread safety, memory errors, undefined behavior, and more. We propose nsan as a new sanitizer for numerical (floating-point) issues.

For each floating point IR instruction, the instrumentation pass inserts
an equivalent instruction in double the precision (e.g. float -> double,
double -> fp128), called the "shadow".

For any instruction that is observable outside of a function (return,
function call with float parameters, store, ...), the results of the
original and shadow computations are compared, and a warning is emitted
if the values do not match.

Original values in memory are shadowed in double the precision, with an
additional tag for each byte to represent the memory type (untyped,
float, double, long double, ...). Libc functions (and the corresponding
llvm intrinsics) are intercepted to copy or set shadow types accordingly.

Unit tests include some well-known examples of numerical instabilities.

nsan is still a work in progress, but this patch is able to run and detect
issues on several applications, including the whole SPECfp benchmark
suite.

nsan-instrumented applications are typically 3-100x slower and take ~4x
more memory than the original.

More details can be found in this paper: https://arxiv.org/abs/2102.12782

Diff Detail

Event Timeline

courbet created this revision.Mar 3 2021, 6:21 AM
courbet requested review of this revision.Mar 3 2021, 6:21 AM
Herald added projects: Restricted Project, Restricted Project, Restricted Project.Mar 3 2021, 6:21 AM
Herald added subscribers: llvm-commits, Restricted Project, cfe-commits.

When bootstrapping LLVM with nsan, only a few issues are reported.

Several of them stem from using double to measure elapsed time in seconds: we measure a start time and an end time, then subtract them. The resulting error depends on the arbitrary magnitude of the time since epoch, so the error grows as time passes. This is especially visible when measuring short intervals (e.g. a few microseconds), which are tiny compared to the time since epoch.

For example, one test shows more than 2% error:

WARNING: NumericalStabilitySanitizer: inconsistent shadow results while checking store to address 0x4b87860
double       precision  (native): dec: 0.00000858306884765625  hex: 0x1.20000000000000000000p-17
__float128   precision  (shadow): dec: 0.00000880600000000000  hex: 0x9.3bd7b64e9fe4fc000000p-20
shadow truncated to double      : dec: 0.00000880600000000000  hex: 0x1.277af6c9d3fca0000000p-17
Relative error: 2.53158247040370201937% (2^47 epsilons)
Absolute error: 0x1.debdb274ff27e0000000p-23
(131595325226954 ULPs == 14.1 digits == 46.9 bits)
    #0 0x119db71 in llvm::TimeRecord::operator-=(llvm::TimeRecord const&) [...]/llvm/llvm-project/llvm/include/llvm/Support/Timer.h:63:14
    #1 0x119db71 in llvm::Timer::stopTimer() [...]/llvm/llvm-project/llvm/lib/Support/Timer.cpp:176:8
    #2 0x108b1d2 in llvm::TimePassesHandler::stopTimer(llvm::StringRef) [...]/llvm/llvm-project/llvm/lib/IR/PassTimingInfo.cpp:248:14
    #3 0x108b1d2 in llvm::TimePassesHandler::runAfterPass(llvm::StringRef) [...]/llvm/llvm-project/llvm/lib/IR/PassTimingInfo.cpp:267:3
    #4 0x108e159 in llvm::TimePassesHandler::registerCallbacks(llvm::PassInstrumentationCallbacks&)::$_2::operator()(llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&) const [...]/llvm/llvm-project/llvm/lib/IR/PassTimingInfo.cpp:281:15
    #5 0x108e159 in void llvm::detail::UniqueFunctionBase<void, llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&>::CallImpl<llvm::TimePassesHandler::registerCallbacks(llvm::PassInstrumentationCallbacks&)::$_2>(void*, llvm::StringRef, llvm::Any&, llvm::PreservedAnalyses const&) [...]/llvm/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:204:12
    #6 0xa4f826 in llvm::unique_function<void (llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&)>::operator()(llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&) [...]/llvm/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:366:12
    #7 0xa4f826 in void llvm::PassInstrumentation::runAfterPass<llvm::Module, (anonymous namespace)::MyPass2>((anonymous namespace)::MyPass2 const&, llvm::Module const&, llvm::PreservedAnalyses const&) const [...]/llvm/llvm-project/llvm/include/llvm/IR/PassInstrumentation.h:227:9
    #8 0xa4f826 in (anonymous namespace)::TimePassesTest_CustomOut_Test::TestBody() [...]/llvm/llvm-project/llvm/unittests/IR/TimePassesTest.cpp:137:6
...
qiucf added a subscriber: qiucf.Mar 4 2021, 4:20 AM
scanon added a subscriber: scanon.Mar 10 2021, 8:51 AM

Is there a mechanism to instruct the sanitizer to ignore a specific expression or function? From a cursory reading, I am mildly concerned about a deluge of false positives from primitives that compute exact (or approximate) residuals; these are acting to eliminate or precisely control floating-point errors, but tend to show up as "unstable" in a naive analysis that isn't aware of them.

Yes: as with all sanitizers, the frontend (clang) sets an annotation on each function in the program. nsan can be disabled for a specific function with the no_sanitize attribute.

If nsan is disabled for a specific function, any return value is re-extended to shadow precision, and the shadow computations resume from there. This is equivalent to assuming that the function, its parameters, and any memory it reads were correct.

Matt added a subscriber: Matt.Mar 25 2021, 3:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 22 2022, 11:39 AM
olologin added a comment.EditedNov 22 2022, 11:44 AM

@courbet
Hi, sorry for pinging you.
Is this review stalled? Is there a reason for that? Judging by the paper, this sanitizer looks promising. I was recently planning to try out nsan on a huge codebase, but I only found this review, so I wonder if I should build clang with your patch myself and try it.

Maybe I could help with something if you don't have time to complete this review, but I doubt my knowledge of the LLVM codebase is good enough for serious tasks :)

Interest in this within the LLVM community did not seem overwhelming, so I did not pursue it.

There was some out-of-tree work by the Interflop project (in particular @titeup), e.g. https://github.com/Thukisdo/NSan-interflop-runtime; I'm not sure whether they are still working on it.

If you want to try to get this submitted, feel free to take over!

This looks very interesting, and I wish I could help as well (but I'll likely have very little time in December due to the holidays :)). Hmm, I can't find it in the email archive. Perhaps there hasn't been a discussion, so people are unaware of this work...

This looks very interesting. I had a discussion with someone at the recent LLVM Dev Meeting about the possibility of something like this. However, rather than tracking error based on data precision, I am interested in tracking errors introduced by fast-math optimizations. For instance, someone has a program which has been verified to be within its accuracy requirements, but they want it to run faster, so they enable fast-math. Sometimes this works, but other times the fast-math optimizations introduce an unacceptable amount of error. What I'd like is to be able to trace the problem back to the part of the original source that was sensitive to this error, so that I can disable fast-math locally for just that operation.

Another potential use for this sort of technology relates to an RFC that I just posted (https://reviews.llvm.org/D138867). There I'm trying to introduce a mechanism that allows the compiler to replace builtin math operations (calls to math library functions or equivalent builtins in platforms like SYCL or CUDA) based on a specified accuracy requirement. One of the challenges with this is verifying that the substitute call is really as accurate as it claims. For example, let's say I want to call cosf() and I need 4 ulp accuracy. The standard GNU libm implementation claims 1 ulp accuracy, so it's not necessarily useful as a point of comparison. But if we had a shadow computation that converted the input value to double and called cos() then converted that result back to float, that should give me the correctly rounded result, right? Or I could use a shadow call to one of the various correctly rounded implementations that are becoming available. It would be great to use nsan to verify the results from these alternate library calls.