Page MenuHomePhabricator

Fix ASAN execution for the MLIR Python tests
ClosedPublic

Authored by mehdi_amini on Oct 2 2021, 7:48 PM.

Details

Summary

First the leak sanitizer has to be disabled, as even an empty script
leads to leak detection with Python.
Then we need to preload the ASAN runtime, as the main binary (python)
won't be linked against it. This will only work on Linux right now.

Event Timeline

mehdi_amini created this revision.Oct 2 2021, 7:48 PM
mehdi_amini requested review of this revision.Oct 2 2021, 7:48 PM
stellaraccident accepted this revision.Oct 2 2021, 10:28 PM
stellaraccident added inline comments.
mlir/test/lit.cfg.py
87

I've found this to be quite a bit more fiddly than this, with both compiler and specific Linux variant dependence. I've actually found modern GCC (>=10) to be more likely to work out of the box for me with ASAN in these odd setups.

Depending on which vintage of libasan is used, there are numerous bugs with DSOs+exception handling. On somewhat older versions (~>=18 months), this causes infinite loops when pybind code throws an exception (it was showing up for me when we throw StopIteration, which is done in non exceptional situations even). Somewhat more recent (~1 year old) have another bug where you also have to LD_PRELOAD the C++ standard library on systems that were linked with --as-needed (e.g. from my recent history, this one with gcc-asan: env LD_PRELOAD="/usr/lib/gcc/x86_64-linux-gnu/10/libasan.so /usr/lib/x86_64-linux-gnu/libstdc++.so.6" python runtime_errors.py). Apparently, this happens on various Debian/Ubuntu but not on SUSE.

I'm supportive of giving this a shot, as it has a higher chance of working than what we have now, but I don't think we've seen the last of issues.

This revision is now accepted and ready to land.Oct 2 2021, 10:28 PM
mlir/test/lit.cfg.py
87

Nit: I thought lit commands were more structured than this (being a subset of shell). When working interactively, I typically do environment modifications via env. Not sure if it matters here.

You may also want to open code the $({config.host_cxx} -print-file-name=libclang_rt.asan-{config.host_arch}.so) in Python (in the lit.cfg.py), because then you can feature detect on whether you could find an appropriate asan library -- and if not disable the feature gating Python tests.

Something like:

import subprocess
try:
  asan_rt_lib = subprocess.check_output([config.host_cxx, f"-print-file-name=libclang_rt.asan-{config.host_arch}.so")
except CalledProcessError:
  config.enable_python_bindings = False
This revision was automatically updated to reflect the committed changes.
mehdi_amini added inline comments.Oct 2 2021, 10:38 PM
mlir/test/lit.cfg.py
87

Ouch, seems like we could blacklist some known bad configs as we find them I think?

Right now I got it to work on Ubuntu with clang (setting up more buildbots upstream)

Got another idea last night: instead of trying to magically find the ASAN runtime, what about requiring the user to provide the path with a CMake option? (-DMLIR_PRELOAD_ASAN_FOR_PYTHON_TESTS=<path>)

Got another idea last night: instead of trying to magically find the ASAN runtime, what about requiring the user to provide the path with a CMake option? (-DMLIR_PRELOAD_ASAN_FOR_PYTHON_TESTS=<path>)

I like that. In my experience there is enough environment variability here that having the knobs to overcome it is half the battle.

Trying to think of a more inclusive name and failing. We can rename the option if it gets used for other things I guess. This would also be needed for anything that included an ASAN built runtime library into a host executable which doesn't link ASAN. I think that is just python for now.

An advantage to making this a cmake option is that we can emit a verbose warning when asan is enabled and this is not.