This is an archive of the discontinued LLVM Phabricator instance.

[pstl] Fix a number of bugs with parallel partial sort
AbandonedPublic

Authored by nadiasvertex on Apr 3 2021, 2:37 PM.

Details

Reviewers
ldionne
jdoerfert
Group Reviewers
Restricted Project

Diff Detail

Event Timeline

nadiasvertex created this revision.Apr 3 2021, 2:37 PM
nadiasvertex requested review of this revision.Apr 3 2021, 2:37 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Quuxplusone added inline comments.
pstl/include/pstl/internal/parallel_backend_omp.h
556–558

As long as you're changing these anyway, might as well simplify.

569

New lines 577, 570, 573: seems like you should undo these whitespace diffs (manually if necessary).

I don't suppose it would make sense to use a heap instead of linearly scanning the vector in __update_largest_item every time one is inserted? I mean, that's the whole point of non-parallel partial_sort... but I guess the idea here is that rearranging the elements once would be less cache-friendly than linearly scanning them multiple times?

600–601

if you wanted to keep this simple

619

That this ever worked indicates that we could use some more "evil" test cases with overloaded/deleted operator, and so on. The iterators in libcxx/test/support/test_iterators.h are already "evil" in this way; we'd just need some new tests that use those iterators with parallel algorithms.

My ADL senses are tingling with those ADL calls to begin and end.

(Are there tests for parallel partial_sort? If it's this buggy, why aren't those tests failing now?)

630

Gratuitous whitespace diff here lost the indentation of the trailing return type.

760

I think this was meant to call std::iter_swap(__current_item, __swap_item), actually. Calling std::swap qualified seems like a bug, but calling std::iter_swap qualified would be normal practice.

859

Technically, all of these ADL calls are unsafe and should be qualified, like __omp_backend::__parallel_stable_sort_body(...). You're changing so much in the PR currently that I feel like you should go ahead and ADL-proof it too; but I wouldn't make ADL the main point of the PR. ;)

nadiasvertex added inline comments.Apr 3 2021, 4:49 PM
pstl/include/pstl/internal/parallel_backend_omp.h
569

Yes, exactly. I didn't do timed tests, but when I look at managing a heap, operationally it's hard to see how it could be faster than the linear scan, when K is small. If K starts to get large, one wonders why a partial sort is even being done. If I maintain a heap there's a lot of copying and moving going on. This approach lets me overwrite existing items in place and enjoy the benefits of the cache for the search.

Also, during the merge operation used to reduce the chunks I have to update the largest item, so I would be doing a lot of heap maintenance. But if you think it would be better, I can look into doing that instead.

600–601

In general I thought it better to use std:distance, but if the preference is to use explicit + and - operators, I can do that.

859

Actually, this is all "new" code. I have another parent revision where most of this code is introduced. However, the parallel partial sort is buggy, so I am working on fixing it here. I will make the ADL fixes you suggest.

The partial sort now succeeds for most of the collections I try, but it fails to produce correct results in some cases, I think due to some asymmetry in the reduction. I thought I had fixed it (which is why I posted this) until I did more testing.

nadiasvertex abandoned this revision.Aug 31 2021, 7:35 AM

This work will continue on https://reviews.llvm.org/D99836