Implement the PSTL backend using OpenMP.
Why did you prefer to make the whole class template rather than making operator() with template parameters?
const at the end please
__first2 supposed to be OutputIterator semantically. Let's rename the second template parameter to OutputIterator and the last function parameter to __d_first or something. Otherwise, readability suffers.
This comment is applicable if the callable is required. Why do you need to be templatized? Could those template parameters be tied to operator()
I think it should not be a big deal to create the header with the __buffer class within and include it to both TBB backend and OpenMP backend. Otherwise we will commit relatively big code duplication
I am not sure if we need this variable basing on Mikhail's comments (probably, it's still useful for some algorithms) but if it remains in the code please mark it inline.
Is there any reason to use trailing return types? As I remember correctly you use those only in this file.
Why do you use std::intptr_t? it's a type to represent the pointer as the signed value. If you already check (with enable_if) that your type is integral and you have two same types (by function signature) you may case directly to Index, may not you?
chuck_policy (a least to me) is a very confusing name in the context of Parallel STL. I had an impression that use have some special execution policy or you wrap ExecutionPolicy object with this function or something like that. Cannot propose better from the top of my head. Need to think a little bit about that.
I would recommend to split those variable to make each in its line for the sake of readability.
If I don't miss something you always construct chunk_policy object with 3 arguments. Why do you need field initialization? I would prefer to leave this struct trivially_default_constructible unless we have strong reasons not to do that.
I think the file header (here and in the all headers) should be arranged with "pstl LLVM" file header, like this: (w/o copyright - the contributors listed in https://github.com/llvm/llvm-project/blob/main/pstl/CREDITS.txt)
No reason. Happy to change it.
I'm not sure why you think adding a _func suffix makes this clearer. I'm not opposed to it, but most functors in the STL aren't suffixed with "_func" and that's not confusing.
I agree. This was an accident.
Isn't calling omp_get_num_threads() slower than reading a variable?
This is fixed.
I made this change for the oneDPL commit, but it was deferred. I can make this change here if that is the consensus.
Many algorithms chunk up the work. If we don't have a variable we don't know how big to make the chunks. I'm happy to make it inline. Also, my testing shows that it should probably be 2048.
I prefer them. If the preference in PSTL is to not, I can change it.
Good point! I will fix that!
chunk_policy is intended to refer to the information needed to apply the chunking algorithm. We could call it chunk_boundaries or __chunk_partition_details.
I think that the compiler complains if I don't initialize the values. I can remove it and see. It might have been an artifact from other refactoring.
Christopher, one more comment - we discussed and agreed that a better name for macro _PSTL_PAR_BACKEND_OMP is
(I've counted 3 matches in the patch.)
Talking about the redundant parentheses on line 45 - yes, perhaps, I can agree that it improves readability... But, here, on the line 53, I guess, it worsens readability, IMHO.
__parallel_merge_body( __mid - __xs, __xe - __mid,...
According to my and Ruslan's comments about usage of "std::distance" in case of random access iterators - yes, as we agreed, you have changed it by difference.... but why here (and on line 55) it was kept?
The lines (83-87) look a kind of "merge artifacts"....
Addressed all known concerns with the last patch.
I did a search and replace. Those parentheses were a side-effect of a mistake I made in the regular expression. Sorry. This is fixed now.
Yup. It wasn't clear to me that this was also required to be a random access iterator. I have made the change here too.
I misunderstood what you were talking about. I have fixed this.
I've put some small comments but they are definitely not a showstopper for this patch to be merged. Whoever merges it please just make sure to apply those.
Other than that, looks good to me. Remaining improvements can be done later.
Please add new line when merge this review
In all (or almost all) files the guard macro is not started from underscore. It's inconsistent with the rest of the code and makes somebody think that this macro is public.
It probably doesn't deserve one more patch but I would like to make sure it's properly merged to main branch.
Thanks for the patch! Please ping me once you've applied those comments and we should be close to ready to ship this.
Please fix. This should be _PSTL_INTERNAL_OMP_PARALLEL_FOR_H (and similarly throughout).
Missing #include <cstddef>
It should be sufficient to use std::distance, not ::std::distance. Any reason why you're doing the latter? Applies in a bunch of places.
This error message isn't too helpful. Something like this would probably be better:
#if !defined(_OPENMP) # error The OpenMP PSTL backend requires OpenMP, but it looks like your platform doesn't implement it. #endif
Is the above message accurate and what you're trying to convey to the user? Also, is there any way that you'd be able to include <omp.h> yet OpenMP wouldn't be available? If not, then the #if !defined(_OPENMP) block is useless altogether.
Resolved all outstanding comments.
Someone mentioned for ADL it is better to root it in the global space. I am fine with taking it out.
Probably true. This was reduced from a larger block that checked various openmp versions. I will remove it.
Together with Mikhail we reviewed that patch and it looks good. The tests were launched on the (almost) latest code base in oneDPL CI (just several configurations, to be honest). Anyway, no issues observed.
Do you want us to commit this patch?
Let's work on enabling the tests in the CI?
I believe it can be done separately if you don't have opposite opinion
by Mikhail Dvorskiy, 10/15/2021 03:36 PM
[pstl] Initial implementation of OpenMP backend, on behalf of Christopher Nelson email@example.com
A couple of parallel patterns still remains serial - "Parallel partial sort", and "Parallel transform scan" - there are //TODOs in the code.