This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse] Fix strict weak ordering in sorting
ClosedPublic

Authored by danlark on Jun 5 2023, 6:31 AM.

Details

Summary

These checks have been fired when comparing same elements like comp(a, a). Let's fix it. I don't have commit rights.

Danila Kutenin
kutdanila@yandex.ru

Diff Detail

Event Timeline

danlark created this revision.Jun 5 2023, 6:31 AM
Herald added a project: Restricted Project. · View Herald Transcript
danlark requested review of this revision.Jun 5 2023, 6:31 AM
Peiming added a subscriber: Peiming.Jun 5 2023, 2:16 PM

Is there any change to the std::sort that requires such a change? In these two particular cases, there should not be any duplicated item from the array (otherwise, it leads to logical bugs).

Is there any change to the std::sort that requires such a change? In these two particular cases, there should not be any duplicated item from the array (otherwise, it leads to logical bugs).

std::sort is allowed to call comparator on the same element.

Is there any change to the std::sort that requires such a change? In these two particular cases, there should not be any duplicated item from the array (otherwise, it leads to logical bugs).

std::sort is allowed to call comparator on the same element.

But do you have a example that trigger this? The sparse compiler has a non-dup property, and we would like to see the example that leads to this?

Is there any change to the std::sort that requires such a change? In these two particular cases, there should not be any duplicated item from the array (otherwise, it leads to logical bugs).

std::sort is allowed to call comparator on the same element.

But do you have a example that trigger this? The sparse compiler has a non-dup property, and we would like to see the example that leads to this?

We added checks in debug mode so it deliberately will https://reviews.llvm.org/D150264

In practice if there are more than 30 elements, it could and can (for example, by being a pivot and waiting for the false condition): https://stackoverflow.com/questions/38966516/should-sorting-algorithm-pass-same-element-in-the-comparison-function

In practice if there are more than 30 elements, it could and can (for example, by being a pivot and waiting for the false condition): https://stackoverflow.com/questions/38966516/should-sorting-algorithm-pass-same-element-in-the-comparison-function

I understand the general requirement for sort, but this sort is only used in particular situations by the sparse compiler. So my question is, do you have a test case that would trigger the assert at L35.
If so, we would like to understand the example code before removing the assert.

In practice if there are more than 30 elements, it could and can (for example, by being a pivot and waiting for the false condition): https://stackoverflow.com/questions/38966516/should-sorting-algorithm-pass-same-element-in-the-comparison-function

I understand the general requirement for sort, but this sort is only used in particular situations by the sparse compiler. So my question is, do you have a test case that would trigger the assert at L35.
If so, we would like to understand the example code before removing the assert.

I was not able to reproduce the issue. If we go down to the specifics of std::sort in LLVM, it should take 25 elements to find when comp(a, a) can be called. If we can make here 25 elements anyhow and reorder them as we want, we can fail it

Any update on this? I am of course willing to accept this if needed, but I would really like to see the test case that exposes the new behavior (should not be too hard to set up, if you have any sequence that exposes this, you can simply construct one sparse vector with indices identical to that sequence).

Any update on this? I am of course willing to accept this if needed, but I would really like to see the test case that exposes the new behavior (should not be too hard to set up, if you have any sequence that exposes this, you can simply construct one sparse vector with indices identical to that sequence).

I checked and with the new implementation it's impossible to have comp(a, a) in the algorithm itself. My suggestion will be to change the assert to something like

assert(std::addressof(l) == std::addressof(r) || l != r);

assert(std::addressof(l) == std::addressof(r) || l != r);

That L35 change would be acceptable to me ;-)
What about the L569 change?

danlark updated this revision to Diff 542402.Jul 20 2023, 3:03 AM

assert(std::addressof(l) == std::addressof(r) || l != r);

That L35 change would be acceptable to me ;-)
What about the L569 change?

Done, sorry for a long response

danlark updated this revision to Diff 542498.Jul 20 2023, 6:48 AM

It looks like Dialect/SparseTensor/sparse_conv_2d_slice_based.mlir is failing.

assert(std::addressof(l) == std::addressof(r) || l != r);

Personally, I'd be a lot more comfortable with instead replacing the assertion with if (LLVM_UNLIKELY(l == r)) return false; which is just short-circuiting what the rest of the comparison function would compute anyways (modulo that it doesn't cause assertion error when l/r doesn't occur in order).

Note that LoopId is just unsigned, so there's no funny business going on with the operator== in the for-loop.

rnk added a subscriber: rnk.Aug 11 2023, 1:46 PM

Any thoughts on this? As I understand it, this is the last remaining strict weak ordering violation in llvm-project, and after that folks can enable the checks for this in libc++. The other two changes (D155811, D155809) landed.

Any thoughts on this? As I understand it, this is the last remaining strict weak ordering violation in llvm-project, and after that folks can enable the checks for this in libc++. The other two changes (D155811, D155809) landed.

No, there are 3-4 more, I am unable to fix with my knowledge of llvm. One example is https://github.com/llvm/llvm-project/issues/64121

Ping. Looks like Dialect/SparseTensor/sparse_conv_2d_slice_based.mlir was a flake, I cannot reproduce

Peiming added a comment.EditedAug 15 2023, 8:59 AM

Ping. Looks like Dialect/SparseTensor/sparse_conv_2d_slice_based.mlir was a flake, I cannot reproduce

We changed the std::sort to std::stable_sort in LoopEmitter.cpp@574 to fix some ordering issue a while back, it might have been already resolved.

Ping. What is required for me here? I believe the change is very safe to submit

aartbik accepted this revision.Aug 18 2023, 1:59 PM

Sorry this took so long!

This revision is now accepted and ready to land.Aug 18 2023, 1:59 PM

Please, submit on my behalf

Submitting....

aartbik updated this revision to Diff 552519.Aug 22 2023, 3:30 PM

rebased with main

aartbik added inline comments.Aug 22 2023, 3:48 PM
mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.cpp
38

hmm, the i++ was replaced with i
(did this pass tests? :-)

Fixing...

Oops but seems to pass :)

aartbik updated this revision to Diff 552526.Aug 22 2023, 3:57 PM

fixed typo on i++

Submitting...

This revision was automatically updated to reflect the committed changes.