Page MenuHomePhabricator

[ADT] Shuffle containers before sorting to uncover non-deterministic behavior
ClosedPublic

Authored by mgrang on Oct 24 2017, 9:55 AM.

Details

Summary

std::sort and array_pod_sort both use non-stable sorting algorithms.
This means that the relative order of elements with the same key is
undefined. This patch is an attempt to uncover such scenarios by
randomly shuffling all containers before sorting, if EXPENSIVE_CHECKS
is enabled.

Here's the bugzilla for this: https://bugs.llvm.org/show_bug.cgi?id=35135

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
vsk added a comment.Oct 24 2017, 1:04 PM

@mgrang thanks, should be fixed in r316490.

mgrang edited the summary of this revision. (Show Details)Oct 24 2017, 1:16 PM
dexonsmith edited edge metadata.Oct 24 2017, 2:17 PM

std::sort and array_pod_sort both use quicksort

FWIW, most standards-compliant std::sort implementations use introsort... although it's also unstable.

NOTE: To randomly shuffle before std::sort we may have to change all calls to std::sort with llvm::sort.

It looks to me like the current overloads of llvm::sort take a parallel execution policy, which would be unfortunate boilerplate to add. Perhaps you should create an overload in ADT/STLExtras.h with the same signature as std::sort (that wraps std::sort with the initial call to std::random_shuffle).

std::sort and array_pod_sort both use quicksort

FWIW, most standards-compliant std::sort implementations use introsort... although it's also unstable.

NOTE: To randomly shuffle before std::sort we may have to change all calls to std::sort with llvm::sort.

It looks to me like the current overloads of llvm::sort take a parallel execution policy, which would be unfortunate boilerplate to add. Perhaps you should create an overload in ADT/STLExtras.h with the same signature as std::sort (that wraps std::sort with the initial call to std::random_shuffle).

Sure, I can add overload for llvm::sort which takes ExecutionPolicy and invoke std::sort. However, it would still mean changing all calls to std::sort with llvm::sort if this feature needs to be tested. Is there a way to do this w/o changing the callers? #define std::sort llvm::sort won't work because of the double colon (::), redefining std::sort will result in duplicate definitions, etc.

std::sort and array_pod_sort both use quicksort

FWIW, most standards-compliant std::sort implementations use introsort... although it's also unstable.

NOTE: To randomly shuffle before std::sort we may have to change all calls to std::sort with llvm::sort.

It looks to me like the current overloads of llvm::sort take a parallel execution policy, which would be unfortunate boilerplate to add. Perhaps you should create an overload in ADT/STLExtras.h with the same signature as std::sort (that wraps std::sort with the initial call to std::random_shuffle).

Sure, I can add overload for llvm::sort which takes ExecutionPolicy and invoke std::sort.

You mean, add an overload that does not take ExecutionPolicy?

However, it would still mean changing all calls to std::sort with llvm::sort if this feature needs to be tested. Is there a way to do this w/o changing the callers? #define std::sort llvm::sort won't work because of the double colon (::), redefining std::sort will result in duplicate definitions, etc.

I think the best thing to do is to update the callers to call llvm::sort instead of std::sort.

lattner resigned from this revision.Oct 24 2017, 9:24 PM
mgrang updated this revision to Diff 120187.Oct 24 2017, 10:41 PM
mgrang edited the summary of this revision. (Show Details)
mgrang removed a reviewer: lattner.

Added std::sort overloads for ExecutionPolicy.

dexonsmith added inline comments.Oct 25 2017, 9:12 AM
include/llvm/ADT/STLExtras.h
812–819

Doesn't this already exist in llvm/Support/Parallel.h?

mgrang added inline comments.Oct 25 2017, 10:39 AM
include/llvm/ADT/STLExtras.h
812–819

Yea ... it apparently does. So I guess we don't need the ExecutionPolicy overloads here. Will remove them. Thanks!

mgrang updated this revision to Diff 120294.Oct 25 2017, 12:12 PM

Removed overloads for ExecutionPolicy as they are already present in llvm/Support/Parallel.h.

mgrang marked an inline comment as done.Oct 25 2017, 12:12 PM
chandlerc requested changes to this revision.Oct 25 2017, 3:43 PM

I don't really like the ever growing set of configuration macros here.

I would much prefer we have one ABI-breaking check macro and just use that for everything here...

include/llvm/ADT/STLExtras.h
796

This was deprecated in C++14, I think you want std::shuffle now with a PRNG parameter.

This revision now requires changes to proceed.Oct 25 2017, 3:43 PM

I don't really like the ever growing set of configuration macros here.

I would much prefer we have one ABI-breaking check macro and just use that for everything here...

Thanks @chandlerc. I am fine with enabling this based on LLVM_ABI_BREAKING_CHECKS. However, this macro is ON by default for +asserts builds. This means all existing +asserts bots would start failing as soon as this patch gets merged since there are about 15 unit test failures. Do we want to gate patches if they show up sort order failures uncovered by shuffling?
My idea was to control this via a separate macro so that we can isolate shuffle+sort failures into a separate bot which may make it easier to fix without gating the regular flow. Thoughts?

I don't really like the ever growing set of configuration macros here.

I would much prefer we have one ABI-breaking check macro and just use that for everything here...

Thanks @chandlerc. I am fine with enabling this based on LLVM_ABI_BREAKING_CHECKS. However, this macro is ON by default for +asserts builds. This means all existing +asserts bots would start failing as soon as this patch gets merged since there are about 15 unit test failures. Do we want to gate patches if they show up sort order failures uncovered by shuffling?
My idea was to control this via a separate macro so that we can isolate shuffle+sort failures into a separate bot which may make it easier to fix without gating the regular flow. Thoughts?

I agree with Chandler that we should avoid the proliferation of macros and use ABI_BREAKING_CHECKS instead.
Is it possible to fixing the failures before enabling the switch? (or, in the same commit?) That should avoid bot breakage [I understand it's chasing a moving target, as we did with EXPENSIVE_CHECK, but it turned out to be worth in the long run fixing all the tests and turning the switch on instead of introducing new macros, cc: @RKSimon ]

mgrang updated this revision to Diff 120362.Oct 25 2017, 10:47 PM
mgrang edited edge metadata.

I don't really like the ever growing set of configuration macros here.

I would much prefer we have one ABI-breaking check macro and just use that for everything here...

Thanks @chandlerc. I am fine with enabling this based on LLVM_ABI_BREAKING_CHECKS. However, this macro is ON by default for +asserts builds. This means all existing +asserts bots would start failing as soon as this patch gets merged since there are about 15 unit test failures. Do we want to gate patches if they show up sort order failures uncovered by shuffling?
My idea was to control this via a separate macro so that we can isolate shuffle+sort failures into a separate bot which may make it easier to fix without gating the regular flow. Thoughts?

I agree with Chandler that we should avoid the proliferation of macros and use ABI_BREAKING_CHECKS instead.
Is it possible to fixing the failures before enabling the switch? (or, in the same commit?) That should avoid bot breakage [I understand it's chasing a moving target, as we did with EXPENSIVE_CHECK, but it turned out to be worth in the long run fixing all the tests and turning the switch on instead of introducing new macros, cc: @RKSimon ]

Thanks Davide. So the consensus seems to be that ABI_BREAKING_CHECKS would be a better option to base this patch on. I have updated my patch accordingly.
We would need to fix the existing failures before this can be merged. I will solicit help from the code owners/developers of the breaking tests to get them fixed. Once those are fixed I would push patches to change std::sort to llvm::sort for llvm/clang/polly/compiler-rt.

mgrang marked an inline comment as done.Oct 25 2017, 10:51 PM

I don't really like the ever growing set of configuration macros here.

I would much prefer we have one ABI-breaking check macro and just use that for everything here...

Thanks @chandlerc. I am fine with enabling this based on LLVM_ABI_BREAKING_CHECKS. However, this macro is ON by default for +asserts builds. This means all existing +asserts bots would start failing as soon as this patch gets merged since there are about 15 unit test failures. Do we want to gate patches if they show up sort order failures uncovered by shuffling?
My idea was to control this via a separate macro so that we can isolate shuffle+sort failures into a separate bot which may make it easier to fix without gating the regular flow. Thoughts?

I agree with Chandler that we should avoid the proliferation of macros and use ABI_BREAKING_CHECKS instead.

I'm actually a bit confused by this. How is this ABI-breaking? If (1) I'm right, and it isn't ABI-breaking, and (2) we're not going to have a custom macro, then I think this should just be behind NDEBUG.

ABI_BREAKING_CHECKS is meant to be for NDEBUG-like checks that actually break the ABI. One could, in theory, build Clang with assertions, build LLVM without assertions, and still a functional compiler by overriding one of the defaults for ABI_BREAKING_CHECKS. I don't see how that applies to these checks.

dexonsmith requested changes to this revision.Oct 26 2017, 8:48 AM
dexonsmith added inline comments.
include/llvm/ADT/STLExtras.h
778–781

I don't see how toggling these checks would be ABI-breaking, so I think this is the wrong macro to use.

This revision now requires changes to proceed.Oct 26 2017, 8:48 AM

I don't really like the ever growing set of configuration macros here.

I would much prefer we have one ABI-breaking check macro and just use that for everything here...

Thanks @chandlerc. I am fine with enabling this based on LLVM_ABI_BREAKING_CHECKS. However, this macro is ON by default for +asserts builds. This means all existing +asserts bots would start failing as soon as this patch gets merged since there are about 15 unit test failures. Do we want to gate patches if they show up sort order failures uncovered by shuffling?
My idea was to control this via a separate macro so that we can isolate shuffle+sort failures into a separate bot which may make it easier to fix without gating the regular flow. Thoughts?

I agree with Chandler that we should avoid the proliferation of macros and use ABI_BREAKING_CHECKS instead.

I'm actually a bit confused by this. How is this ABI-breaking? If (1) I'm right, and it isn't ABI-breaking, and (2) we're not going to have a custom macro, then I think this should just be behind NDEBUG.

ABI_BREAKING_CHECKS is meant to be for NDEBUG-like checks that actually break the ABI. One could, in theory, build Clang with assertions, build LLVM without assertions, and still a functional compiler by overriding one of the defaults for ABI_BREAKING_CHECKS. I don't see how that applies to these checks.

I haven't looked very closely, mine was just a meta comment. If they don't break the ABI, probably ABI_BREAKING_CHECKS is not the correct place, of course, but I can't comment further on that.
FWIW, NDEBUG is probably OK if the cost of the check is nearly zero, otherwise I would put them under LLVM_EXPENSIVE_CHECKS (we have a bot for that already).
That said, my point that we shouldn't have yet another macro stands, I guess, independently from the bucket we decide to throw these checks in.

That said, my point that we shouldn't have yet another macro stands, I guess, independently from the bucket we decide to throw these checks in.

Yup, no quarrel there.

I don't really like the ever growing set of configuration macros here.

I would much prefer we have one ABI-breaking check macro and just use that for everything here...

Thanks @chandlerc. I am fine with enabling this based on LLVM_ABI_BREAKING_CHECKS. However, this macro is ON by default for +asserts builds. This means all existing +asserts bots would start failing as soon as this patch gets merged since there are about 15 unit test failures. Do we want to gate patches if they show up sort order failures uncovered by shuffling?
My idea was to control this via a separate macro so that we can isolate shuffle+sort failures into a separate bot which may make it easier to fix without gating the regular flow. Thoughts?

I agree with Chandler that we should avoid the proliferation of macros and use ABI_BREAKING_CHECKS instead.

I'm actually a bit confused by this. How is this ABI-breaking? If (1) I'm right, and it isn't ABI-breaking, and (2) we're not going to have a custom macro, then I think this should just be behind NDEBUG.

ABI_BREAKING_CHECKS is meant to be for NDEBUG-like checks that actually break the ABI. One could, in theory, build Clang with assertions, build LLVM without assertions, and still a functional compiler by overriding one of the defaults for ABI_BREAKING_CHECKS. I don't see how that applies to these checks.

That said, my point that we shouldn't have yet another macro stands, I guess, independently from the bucket we decide to throw these checks in.

Yup, no quarrel there.

I guess +asserts and -asserts builds should be comparable in terms of ninja check failures. I would prefer not to add these under ABI_BREAKING. How about we add these either under EXPENSIVE_CHECKS or rename LLVM_REVERSE_ITERATION as LLVM_NON_DETERMINISM_CHECKS and add these under that?

Btw .. I ran shuffle+sort with all targets enabled and see 39 ninja check failures (previously I had built only ARM/AArch64 targets).

As with Davide, my argument was mostly "use an existing bucket" not "it should be in bucket X". =]

I'm happy with this being under NDEBUG if this doesn't break ABI. As long as shuffle is O(N), adding an O(N) check to an O(N * log(N)) algorithm seems totally fine for NDEBUG, so I'd rather not stuff it behind the "expensive checks" macro.

I guess +asserts and -asserts builds should be comparable in terms of ninja check failures. I would prefer not to add these under ABI_BREAKING. How about we add these either under EXPENSIVE_CHECKS or rename LLVM_REVERSE_ITERATION as LLVM_NON_DETERMINISM_CHECKS and add these under that?

Btw .. I ran shuffle+sort with all targets enabled and see 39 ninja check failures (previously I had built only ARM/AArch64 targets).

The suggestion is that we should fix these failures before we land this unless there is some reason that will be infeasible to do. That way we don't have new failures?

As with Davide, my argument was mostly "use an existing bucket" not "it should be in bucket X". =]

I'm happy with this being under NDEBUG if this doesn't break ABI. As long as shuffle is O(N), adding an O(N) check to an O(N * log(N)) algorithm seems totally fine for NDEBUG, so I'd rather not stuff it behind the "expensive checks" macro.

Makes sense.

I guess +asserts and -asserts builds should be comparable in terms of ninja check failures. I would prefer not to add these under ABI_BREAKING. How about we add these either under EXPENSIVE_CHECKS or rename LLVM_REVERSE_ITERATION as LLVM_NON_DETERMINISM_CHECKS and add these under that?

Btw .. I ran shuffle+sort with all targets enabled and see 39 ninja check failures (previously I had built only ARM/AArch64 targets).

The suggestion is that we should fix these failures before we land this unless there is some reason that will be infeasible to do. That way we don't have new failures?

Nod, that was my suggestion too. I completely understand it's annoying as people continue checking in code that might break other stuffs, but, I guess this frustration will also speed up the fix :)

mgrang updated this revision to Diff 120762.Oct 29 2017, 7:12 PM
mgrang edited edge metadata.

Enabled the change under NDEBUG.

fhahn added a subscriber: fhahn.Oct 31 2017, 2:58 AM

@mgrang I noticed you resolved PR35135 - what are you intending to do with this patch?

All outstanding unit test failures uncovered by this patch have now been resolved on the tip. Can we now please get this patch reviewed and merged?
Once this patch merges we can do a mass replacement of std::sort to llvm::sort in the entire code base.

mgrang edited the summary of this revision. (Show Details)Feb 25 2018, 12:03 PM

Wouldn't this kind of testing be better under EXPENSIVE_CHECKS instead of NDEBUG?

Wouldn't this kind of testing be better under EXPENSIVE_CHECKS instead of NDEBUG?

Complexity-wise std::shuffle is not too expensive (it's O(n) while sorting is O(n*log n). So I'm not sure if this should be under EXPENSIVE_CHECKS.
With that said, I am totally fine shoving this under EXPENSIVE_CHECKS as long as we have consensus that that's the most appropriate macro for this :) Otherwise, I guess this can remain under NDEBUG.
Thoughts?

Ping for reviews please.

mgrang added a comment.Mar 7 2018, 8:10 AM

Ping 2 for reviews please.

@davide @efriedma @chandlerc Does anyone want this under NDEBUG or should it be moved to EXPENSIVE_CHECKS?

davide added a comment.Mar 8 2018, 9:23 AM

std::shuffle has linear time complexity, so I'd rather not pessimize the DEBUG build even more.

EXPENSIVE_CHECKS is fine.

OK, let's go for EXPENSIVE_CHECKS

mgrang added a comment.Mar 8 2018, 5:10 PM

OK, let's go for EXPENSIVE_CHECKS

Sounds good! Thanks everyone for your feedback. I will push a patch soon.

mgrang updated this revision to Diff 137690.Mar 8 2018, 7:28 PM
mgrang edited the summary of this revision. (Show Details)

Based std::shuffle on EXPENSIVE_CHECKS instead of NDEBUG.

mgrang updated this revision to Diff 137693.Mar 8 2018, 7:34 PM
RKSimon accepted this revision.Mar 9 2018, 6:06 AM

LGTM, with one minor

include/llvm/ADT/STLExtras.h
803

Please expand this a little:

// Provide wrappers to std::sort which shuffle the elements before sorting
// to help uncover non-deterministic behavior (PR35135).
mgrang updated this revision to Diff 137828.Mar 9 2018, 1:36 PM

Expanded comment.

mgrang marked an inline comment as done.Mar 9 2018, 1:36 PM
RKSimon accepted this revision.Mar 10 2018, 6:12 AM

Thanks, LGTM

This revision was not accepted when it landed; it landed in state Needs Review.Mar 10 2018, 11:02 AM
This revision was automatically updated to reflect the committed changes.