This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
test/tsan/
-
tsan/
-
cond_cancel.c
-
mutex_cycle2.c

Differential D18378

Fix test failure on unrelated warnings.
Needs ReviewPublic

Authored by yabinc on Mar 22 2016, 2:47 PM.

Download Raw Diff

Details

Reviewers

dvyukov
llvm-commits

Summary

The test failure message is as below:
<stdin>:1:10: error: CHECK-NOT: string occurred!

17677==WARNING: Program is run with randomized virtual address space, which wouldn't work with ThreadSanitizer.

compiler-rt/test/tsan/cond_cancel.c:2:15: note: CHECK-NOT: pattern specified here
// CHECK-NOT: WARNING

Diff Detail

Event Timeline

yabinc updated this revision to Diff 51347.Mar 22 2016, 2:47 PM

yabinc retitled this revision from to Fix test failure on unrelated warnings..

yabinc updated this object.

yabinc added reviewers: llvm-commits, rengolin, dvyukov.Mar 22 2016, 2:49 PM

yabinc added a subscriber: srhines.

That's not the only test that failed, and also, I don't think this is going to fix the test. The warning is saying that the test will not work with address randomization, so maybe, we shouldn't be forcing it everywhere.

Can you revert the patches for now, so we can discuss without a red bot?

Fix mutex_cycle2.c.

remove change in tsan_platform_linux.cc

Thanks for reminding. I have tested on x86_64 (by enabling personality(ADDR_NO_RANDOMIZE) on x86_64).

Printing the "WARNING: Program is run with randomized virtual address space, which wouldn't work with ThreadSanitizer" all the time is not OK. It's definitely not OK for x86_64. There are programs that generate machine-parsable output (other programs), and it is not OK to alter their output unless there is a bug.
I LGTMed the change for arm64 because as far as I understand tsan will just stop working as users upgrade to newer kernels. But it is not OK to alter program output all the time on arm64 either. So I assumed that the supported way to run tsan programs on arm64 now is to disable ASLR beforehand (unless until we have a better fix for the shadow mapping problem).
Based on that the right fix is to run tsan tests on arm64 with ASLR pre-disabled in the test driver (because that's we expect users to do now).
I understand that it is suboptimal for users, but at least tsan is working for them now (and hopefully majority of programs won't fail because of the additional output, so users can just ignore that warning). Long-term we may want to improve shadow memory mapping on arm64 so that it supports ASLR (Renato expressed concerns about the mapping slowdown and it is still an open question how to do it without the additional slowdown).

Yabin,

When I asked you to revert the patch it was because buildbots in the red make way to more bugs being inserted. Not only you didn't revert, but you continued to push a fix that was wrong, untested and not actually fixing anything. This is not the behaviour we expected in this community.

This is not the first time that the Android team at Google pushes patches to the sanitizers that were only tested on their own Android phones, and this is not acceptable. Please, communicate internally that this disrupts the work of everyone else involved and have been noted by others as bad community practices.

I have just reverted in r264150, and from now on, I will need a much larger body of proof for all your next patches, including testing on AArch64 Linux, Android as well as x86_64 Linux. Changes in the sanitizer may look simple for your single platform, but they have profound implications to every other architecture.

regards,
--renato

It can make sense to limit such changes to only the platform where we have a problem at hand (e.g. #if SANITIZER_ANDROID && defined(aarch64)). Especially if the platform is not working at all at the moment.
While it is generally good to make useful features available on all supported platforms, testing of such changes is tough and they have good chances of breaking things. If other platforms hit the same issue in future, maintainers can just extend the preprocessor condition (but they have a possibility to test it).
I guess should not have been LGTMed the original change with just #if defined(aarch64) (though, I guess all aarch64 users will hit this issue sooner or later).

I applogize for not testing tsan patches well before uploading them to llvm. I will do my best to
test them on other platforms as well as Android. I didn't revert the patch because I think fixing
broken patches instead of reverting them is the first choice. But thanks for reverting it as I
didn't fix it in time. I will upload the patch to make it Android specific soon.

At Diff1 of this patch, I made it Android specific. And zatrazz recommended
to use VReport() instead of Report() to avoid breaking tests. It doesn't break tests any more when I run make check-tsan.

This patch broke not *all* AArch64 buildbots, but "some".

Can you specify what build it may break? As far as I know, export TSAN_OPTIONS="verbosity=1" makes tsan print the verbose warning. But make check-tsan doesn't fail, I think it may be because make check-tsan clears environment variable TSAN_OPTIONS. Maybe you want me to fix the tests, comment on http://reviews.llvm.org/D18378?

if this fix is *guaranteed* to work on *every* possible Android configuration on the planet, present and future, than I'm ok with the change.

I think this is an unresonable promise for me. I have tested on the aarch64 android devices I have: N5X, N9, and N6P, which are all 39-VMAs,
without and with the kernel patch. Unless you have specific considerations or actionable suggests, I can't do anything further.

The previous change was *guaranteed* to work with any Linux kernel, so we accepted for all AArch64, but that was a false assumption. So forgive me if I don't trust this new assumption as well.

I don't think that was a false assumption. The previous change broken the test because of the tests. There is no sense so far that the change doesn't work with any Linux kernel.

Concretely, what I'm asking is proof that this is true for all Android configurations. Virtual address randomization is a crucial security feature, and I'd be surprised if Google never uses it as we move into 64-bit land, especially as Android takes on more roles in automotive, home automation and small laptops.

I think the current plan on android is only using tsan for debugging by platform developers and app developers, so security issue isn't involved so far (or never will be). If the tsan maintainers determine that they don't want to disable virtual address randomization, android developers have to disable virtual address space manually before tsan has a workable solution. The only way to make something true for all Android configurations is to write a Compatibility Test Suite for it, but I can't do it before I make tsan usable by platform developers (because how can you test that something is true before you can't run the test).

sorry, I commented on the wrong patch.

This is an old one, maybe should be abandoned?

In D18378#381239, @dvyukov wrote:

I LGTMed the change for arm64 because as far as I understand tsan will just stop working as users upgrade to newer kernels. But it is not OK to alter program output all the time on arm64 either. So I assumed that the supported way to run tsan programs on arm64 now is to disable ASLR beforehand (unless until we have a better fix for the shadow mapping problem).

Just a comment here: TSAN will not stop working when users upgrade to new kernels, as @zatrazz made it work for 39, 42 and 48 VMA kernels.

Based on that the right fix is to run tsan tests on arm64 with ASLR pre-disabled in the test driver (because that's we expect users to do now).
I understand that it is suboptimal for users, but at least tsan is working for them now (and hopefully majority of programs won't fail because of the additional output, so users can just ignore that warning). Long-term we may want to improve shadow memory mapping on arm64 so that it supports ASLR (Renato expressed concerns about the mapping slowdown and it is still an open question how to do it without the additional slowdown).

TSAN is working well for AArch64 on Linux and, AFAIK, Android, so I'm not sure we should treat it differently than x86_64.

Revision Contents

Path

Size

test/

tsan/

cond_cancel.c

2 lines

mutex_cycle2.c