This is an archive of the discontinued LLVM Phabricator instance.

[Support] Sink LLD's parallel algorithm wrappers to support
ClosedPublic

Authored by rnk on May 4 2020, 8:12 PM.

Details

Summary

Essentially takes the lld/Common/Threads.h wrappers and moves them to
the llvm/Support/Paralle.h algorithm header.

The changes are:

  • Add range overloads
  • Use the sequential algorithm directly when 1 thread is requested (skips task grouping)
  • Fix the index type of for_each_n to size_t. Nobody in LLVM was using any other parameter, and it made overload resolution hard for for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t.

Remove Threads.h and update LLD for that.

This is a prerequisite for parallel public symbol processing in the PDB
library, which is in LLVM.

Diff Detail

Event Timeline

rnk created this revision.May 4 2020, 8:12 PM
Herald added a project: Restricted Project. · View Herald Transcript

Add range overloads
Use the sequential algorithm directly when 1 thread is requested (skips task grouping)
Fix the index type of for_each_n to size_t. Nobody in LLVM was using any other parameter, and it made overload resolution hard for for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t.

These changes all look good to me. I have a question regarding the intended usage:

lld/COFF/LLDMapFile.cpp
77

Isn't parallel::par redundant? Can we just use

parallel::for_each_n((size_t)0, syms.size(), [&](size_t i) { ... }) ?

aganea added a comment.May 5 2020, 6:23 AM

This is a prerequisite for parallel public symbol processing in the PDB
library, which is in LLVM.

Thanks for working on this Reid :-)

lld/COFF/LLDMapFile.cpp
77

+1

lld/ELF/Writer.cpp
1752

sym, as indicated by the clang-tidy log.

rnk marked an inline comment as done.May 5 2020, 9:51 AM
rnk added inline comments.
lld/COFF/LLDMapFile.cpp
77

IMO yes, but I think these APIs are designed to be drop-in replaceable by the parallel STL APIs standardized in C++17, and those take a policy. I think the idea is that there could be other policies for GPUs, accelerators, SIMD variants, etc, although I'm not super familiar with them.

I peaked at llvm-project/pstl/include/internal/glue_algorithm_impl.h, which declares the C++17 standard APIs, and they are already slightly different from what LLVM has here. (No range wrappers, for_each_n does not take size_t.) If we are going to be different, we might as well simplify the API. I will remove the policy parameter and reupload this change. I believe there are only two other in-tree clients of this API in MLIR.

rnk updated this revision to Diff 262175.May 5 2020, 11:06 AM
  • Drop policy parameter

While doing this, I may have re-discovered the reason that the committee
designed the APIs this way. With the parameter, you can use ADL to write this:

sort(parallel::par, vec.begin(), vec.end(), ...);
for_each(parallel::par, vec.begin(), vec.end(), ...);

You can even do using namespace (llvm|std)::parallel; and write this:

sort(par, vec.begin(), vec.end(), ...);

If you drop the policy parameter, using the llvm::parallel namespace becomes
almost impossible because it makes every unqualified call to sort ambiguous.

WDYT? I would be happy to go back and rewrite all the code to this:

sort([llvm::]parallel::par, ...);

As in, drop the "parallel::" prefix from "parallel::sort(...)", and rely on
ADL. Maybe that is too tricky, though.

rriddle accepted this revision.May 5 2020, 11:08 AM

LGTM for anything in MLIR.

This revision is now accepted and ready to land.May 5 2020, 11:08 AM

If the goal is to replace Parallel.h with pSTL sometime in the future, then maybe it's better to keep the STL-like API and the policy (but I wouldn't favor that ATM)
I find it pleasant to read the way it is -- if you want to protect from someone doing using llvm::parallel, why not keep the LLD nomenclature? llvm::parallelSort(...). It will not be ambiguous, even if you do using llvm; parallelSort(...);.

rnk updated this revision to Diff 262223.May 5 2020, 2:15 PM
  • rename to parallelSort, parallelForEach[N]
MaskRay accepted this revision.EditedMay 5 2020, 2:43 PM

Seems the conclusion is that llvm::parallelSort(..) is better than llvm::parallel::sort(...) or llvm::parallel(parallel::seq, ...).

It is nice to avoid ADL when doing the refactoring. llvm::parallelSort also avoids functions with same signatures but overloaded in different namespaces (std:: llvm::parallel::)

aganea added a comment.May 5 2020, 2:55 PM

LGTM as well, thank you!

lld/lib/ReaderWriter/MachO/LayoutPass.cpp
464–468

Could this be llvm::parallelSort(vec, ...)?

llvm/unittests/Support/ParallelTest.cpp
33

parallelSort(array)?

aganea accepted this revision.May 5 2020, 2:55 PM
rnk marked 2 inline comments as done.May 5 2020, 3:18 PM

Thanks!

llvm/unittests/Support/ParallelTest.cpp
33

The compiler didn't like that one:

llvm-project/llvm/unittests/Support/ParallelTest.cpp:33:3: error: no matching function for call to 'parallelSort'
  parallelSort(array);
  ^~~~~~~~~~~~
llvm-project/llvm/include/llvm/Support/Parallel.h:204:6: note: candidate template ignored: substitution failure [with RangeTy = unsigned int (&)[1048576]]: reference to type 'unsigned int [1048576]' requires an initializer
This revision was automatically updated to reflect the committed changes.
lld/ELF/Writer.cpp