This is an archive of the discontinued LLVM Phabricator instance.

[OPENMP] Info in release notes about OpenMP support in clang.
ClosedPublic

Authored by ABataev on Jul 9 2015, 5:20 AM.

Details

Summary

Add info about completion of OpenMP 3.1 + support for some elements of OpenMP 4.0

Diff Detail

Repository
rL LLVM

Event Timeline

ABataev updated this revision to Diff 29316.Jul 9 2015, 5:20 AM
ABataev retitled this revision from to [OPENMP] Info in release notes about OpenMP support in clang..
ABataev updated this object.
ABataev added a reviewer: rsmith.
ABataev added a subscriber: cfe-commits.

Suggest we modify to:

OpenMP Support

Clang 3.7 fully supports OpenMP 3.1 and is reported to work on several platforms,
including x86, x86-64 and Power.

In addition to OpenMP 3.1, several important elements of version 4.0 of the
standard are supported as well:

  • `omp simd`, `omp for simd` and `omp parallel for simd` pragmas
  • atomic constructs
  • `proc_bind` clause of `omp parallel` pragma
  • `depend` clause of `omp task` pragma
  • `omp cancel` and `omp cancellation point` pragmas
  • `omp taskgroup` pragma

We plan to continue work on 4.0 for clang 3.8. Please see this link for up-to-date
status <https://github.com/clang-omp/clang/wiki/Status-of-supported-OpenMP-constructs>.
Contributors to this work include AMD, Argonne National Lab., IBM, Intel, Texas Instruments, University of Houston and many others.
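
For readers unfamiliar with the 4.0 constructs named in the suggested text, here is a minimal sketch in C of three of them: `omp simd`, `omp parallel for simd`, and the `depend` clause of `omp task` used inside `omp taskgroup`. The function names are hypothetical and do not come from the patch. The code is written so that it produces the same results when the pragmas are ignored, that is, when it is built without any OpenMP flag.

```c
#include <assert.h>

/* Dot product; `omp simd` asks the compiler to vectorize the loop. */
float dot_simd(const float *a, const float *b, int n)
{
    float sum = 0.0f;
    assert(n >= 0);
    #pragma omp simd reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* y = alpha * x + y; `omp parallel for simd` combines threading and SIMD. */
void saxpy(float alpha, const float *x, float *y, int n)
{
    assert(n >= 0);
    #pragma omp parallel for simd
    for (int i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Two tasks ordered by `depend` clauses; `taskgroup` waits for both. */
int produce_then_consume(void)
{
    int value = 0;
    int result = 0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp taskgroup
        {
            /* Producer: writes `value`. */
            #pragma omp task depend(out: value) shared(value)
            value = 21;

            /* Consumer: must run after the producer because it reads `value`. */
            #pragma omp task depend(in: value) shared(value, result)
            result = 2 * value;
        }
    }
    return result;
}
```

When built with `-fopenmp=libomp` (the flag discussed in this review), the loops may be vectorized or threaded. `produce_then_consume()` returns 42 either way, because the `depend` clauses order the two tasks and the serial fallback runs them in program order.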

rsmith accepted this revision.Jul 21 2015, 1:24 PM
rsmith edited edge metadata.

This looks fine to me. Please remember that this should be committed to the 3.7 branch, not to Clang trunk.

Regarding Michael Wong's suggestion:

  • The addition of "We plan to continue work on 4.0 for clang 3.8. Please see this link for up-to-date status https://github.com/clang-omp/clang/wiki/Status-of-supported-OpenMP-constructs." is useful information, but doesn't seem directly relevant for the release notes for Clang 3.7, because it's not information about the 3.7 release. I'm happy with this with or without that change.
  • We have historically not included lists of contributors in our release notes. While attribution of this kind is important, a better place for it is probably LLVM's CREDITS.TXT or similar rather than here (we want these credits to remain for the lifetime of the project, not just for one release).
This revision is now accepted and ready to land.Jul 21 2015, 1:24 PM

Michael, will you add credits info somewhere?

hans added a subscriber: hans.Jul 22 2015, 3:56 PM

Can the note include a link to the documentation describing how to use OpenMP with Clang?

And speaking of that, what is the situation there? As far as I understand, the "switching CLANG_DEFAULT_OPENMP_RUNTIME to libomp" discussion is still not resolved. Does -fopenmp work out of the box now? -fopenmp=libomp? Where is the user supposed to get the runtime lib from? It's not currently part of the pre-built binaries that ship as part of the release.
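
One way a user might check whether a given flag really enables OpenMP is to look at the `_OPENMP` macro and at how many threads execute a parallel region. The sketch below is only an illustration of that check; the function names are hypothetical, and nothing here is taken from the Clang documentation.

```c
#include <assert.h>

/* Value of the _OPENMP macro (a yyyymm date identifying the supported
 * specification), or 0 if the compiler did not enable OpenMP. */
long openmp_spec_date(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0L;
#endif
}

/* Number of threads that actually executed a parallel region.
 * Returns 1 when the pragmas are ignored or only one thread is used. */
int count_parallel_threads(void)
{
    int count = 0;
    #pragma omp parallel
    {
        /* Each thread in the team increments the shared counter once. */
        #pragma omp atomic
        count++;
    }
    assert(count >= 1);
    return count;
}
```

The OpenMP specification defines `_OPENMP` as a yyyymm date: 201107 corresponds to OpenMP 3.1 and 201307 to 4.0. What a particular Clang build reports for each `-fopenmp=` value is not stated in this thread, so treat the result as something to check rather than assume.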

docs/ReleaseNotes.rst
121 ↗(On Diff #29316)

Nit: "Now fully supported" and "reported to work on several platforms" reads slightly contradictory to me.

+Jonathan Peyton

We had a discussion on this on our Wednesday morning call. Jonathan Peyton
has added the cmake infrastructure to support this and is automating the
library tests to support the switch for libomp. Jonathan will be able to
correct me if I am wrong. This we hope will enable the setting of the
default switch of -fopenmp. If there are additional requirements, please
let us know. Thanks.

hans added a comment.Jul 22 2015, 7:01 PM

+Jonathan Peyton

We had a discussion on this on our Wednesday morning call. Jonathan Peyton
has added the cmake infrastructure to support this and is automating the
library tests to support the switch for libomp. Jonathan will be able to
correct me if I am wrong. This we hope will enable the setting of the
default switch of -fopenmp. If there are additional requirements, please
let us know. Thanks.

I would like to see that setting enabled first, before claiming full support in the release notes, though. If the release notes say it's fully supported, it needs to Just Work out of the box.

Also, I do want to point out that it's now pretty late for this kind of change in the 3.7 process. Not necessarily too late, but pretty late. It would be good to have a plan for the potential scenario where the default switch doesn't make it into this release: even if it's not on by default, how do we get the most value out of it in this release, can it be made easy for users to experiment with it and provide feedback, etc.

The completeness of the OpenMP 3.1 support in 3.7.0 branch can be seen on x86_64-apple-darwin by using it to run the ctest of OpenMP3.1_Validation test suite from http://web.cs.uh.edu/~hpctools/openmp...

#Tested Directive         	t	ct	ot	oct
has_openmp                	100	100	100	100
omp_atomic                	100	100	100	100
omp_barrier               	100	100	100	100
omp_critical              	100	100	100	100
omp_flush                 	100	100	100	100
omp_for_firstprivate      	100	100	100	100
omp_for_lastprivate       	100	90	100	80
omp_for_ordered           	100	100	100	100
omp_for_private           	100	100	100	100
omp_for_reduction         	100	100	100	100
omp_for_schedule_dynamic  	100	100	100	100
omp_for_schedule_guided   	100	100	100	100
omp_for_schedule_static   	100	100	100	100
omp_for_nowait            	100	100	100	100
omp_get_num_threads       	100	100	100	100
omp_get_wtick             	100	100	100	100
omp_get_wtime             	100	100	100	100
omp_in_parallel           	100	100	100	100
omp_lock                  	100	100	100	100
omp_master                	100	100	100	100
omp_nest_lock             	100	100	100	100
omp_parallel_copyin       	100	100	100	100
omp_parallel_for_firstprivate 	100	100	100	100
omp_parallel_for_lastprivate 	100	100	100	100
omp_parallel_for_ordered  	100	100	100	100
omp_parallel_for_private  	100	100	100	100
omp_parallel_for_reduction 	100	100	100	100
omp_parallel_num_threads  	100	100	100	100
omp_parallel_sections_firstprivate 	100	100	100	100
omp_parallel_sections_lastprivate 	100	100	100	100
omp_parallel_sections_private 	100	100	100	100
omp_parallel_sections_reduction 	100	100	100	85
omp_section_firstprivate  	100	100	100	100
omp_section_lastprivate   	100	100	100	100
omp_section_private       	100	100	100	100
omp_sections_reduction    	100	100	100	95
omp_sections_nowait       	100	100	100	100
omp_parallel_for_if       	100	100	100	100
omp_single_copyprivate    	100	100	100	100
omp_single_nowait         	100	100	100	100
omp_single_private        	100	100	100	100
omp_single                	100	100	100	100
omp_test_lock             	100	100	100	100
omp_test_nest_lock        	100	100	100	100
omp_threadprivate         	100	100	-	-
	
omp_parallel_default      	100	100	100	100
omp_parallel_shared       	100	100	100	100
omp_parallel_private      	100	100	100	100
omp_parallel_firstprivate 	100	100	100	100
omp_parallel_if           	100	100	100	100
omp_parallel_reduction    	100	100	100	100
omp_for_collapse          	100	100	100	100
omp_master_3              	100	100	100	100
omp_task                  	100	100	100	100
omp_task_if               	100	100	100	100
omp_task_untied           	0	-	0	-
omp_task_shared           	100	100	100	100
omp_task_private          	100	100	100	100
omp_task_firstprivate     	100	100	100	100
omp_taskwait              	100	100	100	100
omp_taskyield             	100	100	10	-
omp_task_final            	0	-	0	-


Summary:
S Number of tested Open MP constructs: 62
S Number of used tests:                123
S Number of failed tests:              5
S Number of successful tests:          118
S + from this were verified:           114

Normal tests:
N Number of failed tests:              2
N + from this fail compilation:        0
N + from this timed out                0
N Number of successful tests:          60
N + from this were verified:           59

Orphaned tests:
O Number of failed tests:              3
O + from this fail compilation:        0
O + from this timed out                0
O Number of successful tests:          58
O + from this were verified:           55

which compares very favorably to the results from using FSF gcc 5.2.0...

#Tested Directive         	t	ct	ot	oct
has_openmp                	100	100	100	100
omp_atomic                	100	60	100	35
omp_barrier               	100	100	100	100
omp_critical              	100	0	100	0
omp_flush                 	100	0	100	0
omp_for_firstprivate      	100	100	100	100
omp_for_lastprivate       	100	100	100	95
omp_for_ordered           	100	100	100	100
omp_for_private           	100	100	100	100
omp_for_reduction         	100	100	100	100
omp_for_schedule_dynamic  	100	100	100	100
omp_for_schedule_guided   	100	100	100	100
omp_for_schedule_static   	100	100	100	100
omp_for_nowait            	100	100	100	100
omp_get_num_threads       	100	100	100	100
omp_get_wtick             	0	-	0	-
omp_get_wtime             	100	100	100	100
omp_in_parallel           	100	100	100	100
omp_lock                  	100	55	100	50
omp_master                	100	100	100	100
omp_nest_lock             	100	40	100	25
omp_parallel_copyin       	100	100	100	100
omp_parallel_for_firstprivate 	100	100	100	100
omp_parallel_for_lastprivate 	100	100	100	100
omp_parallel_for_ordered  	100	100	100	100
omp_parallel_for_private  	100	100	100	100
omp_parallel_for_reduction 	100	100	100	100
omp_parallel_num_threads  	100	100	100	100
omp_parallel_sections_firstprivate 	100	100	100	100
omp_parallel_sections_lastprivate 	100	100	100	100
omp_parallel_sections_private 	100	100	100	100
omp_parallel_sections_reduction 	100	25	100	15
omp_section_firstprivate  	100	100	100	100
omp_section_lastprivate   	100	100	100	100
omp_section_private       	100	100	100	100
omp_sections_reduction    	100	30	100	5
omp_sections_nowait       	100	100	100	100
omp_parallel_for_if       	100	100	100	100
omp_single_copyprivate    	100	100	100	100
omp_single_nowait         	100	100	100	100
omp_single_private        	100	100	100	100
omp_single                	100	100	100	100
omp_test_lock             	100	60	100	45
omp_test_nest_lock        	100	60	100	40
omp_threadprivate         	100	100	-	-
	
omp_parallel_default      	100	100	100	100
omp_parallel_shared       	100	100	100	100
omp_parallel_private      	100	100	100	100
omp_parallel_firstprivate 	100	100	100	100
omp_parallel_if           	100	100	100	100
omp_parallel_reduction    	100	100	100	100
omp_for_collapse          	100	100	100	100
omp_master_3              	100	100	100	100
omp_task                  	100	100	100	100
omp_task_if               	100	100	100	100
omp_task_untied           	0	-	0	-
omp_task_shared           	100	100	100	100
omp_task_private          	100	100	100	100
omp_task_firstprivate     	100	100	100	100
omp_taskwait              	100	100	100	100
omp_taskyield             	100	45	10	-
omp_task_final            	0	-	0	-


Summary:
S Number of tested Open MP constructs: 62
S Number of used tests:                123
S Number of failed tests:              7
S Number of successful tests:          116
S + from this were verified:           96

Normal tests:
N Number of failed tests:              3
N + from this fail compilation:        0
N + from this timed out                0
N Number of successful tests:          59
N + from this were verified:           49

Orphaned tests:
O Number of failed tests:              4
O + from this fail compilation:        0
O + from this timed out                0
O Number of successful tests:          57
O + from this were verified:           47

For comparison, the results from the ctest of the OpenMP3.1_Validation test suite using the current -fopenmp=libgomp default in the 3.7.0 branch are very poor, as expected, since clang doesn't generate any OpenMP code for the libgomp case...

#Tested Directive         	t	ct	ot	oct
has_openmp                	0	-	0	-
omp_atomic                	100	0	100	0
omp_barrier               	0	-	0	-
omp_critical              	100	0	100	0
omp_flush                 	0	-	0	-
omp_for_firstprivate      	100	0	100	0
omp_for_lastprivate       	100	0	100	0
omp_for_ordered           	100	0	100	0
omp_for_private           	100	0	100	0
omp_for_reduction         	100	0	100	0
omp_for_schedule_dynamic  	100	0	100	0
omp_for_schedule_guided   	0	-	0	-
omp_for_schedule_static   	0	-	0	-
omp_for_nowait            	0	-	0	-
omp_get_num_threads       	100	100	100	100
omp_get_wtick             	100	100	100	100
omp_get_wtime             	100	100	100	100
omp_in_parallel           	0	-	0	-
omp_lock                  	100	0	100	0
omp_master                	100	0	100	0
omp_nest_lock             	100	0	100	0
omp_parallel_copyin       	100	0	100	0
omp_parallel_for_firstprivate 	100	0	100	0
omp_parallel_for_lastprivate 	100	0	100	0
omp_parallel_for_ordered  	100	0	100	0
omp_parallel_for_private  	100	0	100	0
omp_parallel_for_reduction 	100	0	100	0
omp_parallel_num_threads  	100	0	100	0
omp_parallel_sections_firstprivate 	100	0	100	0
omp_parallel_sections_lastprivate 	100	0	100	0
omp_parallel_sections_private 	100	100	100	100
omp_parallel_sections_reduction 	100	0	100	0
omp_section_firstprivate  	100	0	100	0
omp_section_lastprivate   	100	0	100	0
omp_section_private       	100	100	100	100
omp_sections_reduction    	100	0	100	0
omp_sections_nowait       	0	-	0	-
omp_parallel_for_if       	100	0	100	0
omp_single_copyprivate    	100	0	100	0
omp_single_nowait         	100	0	100	0
omp_single_private        	0	-	0	-
omp_single                	100	0	100	0
omp_test_lock             	100	0	100	0
omp_test_nest_lock        	100	0	100	0
omp_threadprivate         	100	0	-	-
	
omp_parallel_default      	100	0	100	0
omp_parallel_shared       	100	0	100	0
omp_parallel_private      	100	100	100	100
omp_parallel_firstprivate 	100	0	100	0
omp_parallel_if           	100	0	100	0
omp_parallel_reduction    	100	0	100	0
omp_for_collapse          	100	0	100	0
omp_master_3              	100	0	100	0
omp_task                  	0	-	0	-
omp_task_if               	100	0	100	0
omp_task_untied           	0	-	0	-
omp_task_shared           	100	0	100	0
omp_task_private          	100	100	100	100
omp_task_firstprivate     	0	-	0	-
omp_taskwait              	100	0	100	0
omp_taskyield             	0	-	0	-
omp_task_final            	0	-	0	-


Summary:
S Number of tested Open MP constructs: 62
S Number of used tests:                123
S Number of failed tests:              28
S Number of successful tests:          95
S + from this were verified:           14

Normal tests:
N Number of failed tests:              14
N + from this fail compilation:        0
N + from this timed out                0
N Number of successful tests:          48
N + from this were verified:           7

Orphaned tests:
O Number of failed tests:              14
O + from this fail compilation:        0
O + from this timed out                0
O Number of successful tests:          47
O + from this were verified:           7
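
The three Summary blocks above can be compared at a glance with a little arithmetic. The sketch below copies the counts from this comment, checks that failed plus successful equals the number of used tests, and computes the share of used tests that were verified. The struct, the labels and the function names are hypothetical, and the counts are taken at face value without any claim about how the suite defines "verified".

```c
#include <assert.h>

/* One "Summary:" block from the validation-suite output. */
struct suite_summary {
    const char *config;  /* label for the compiler configuration */
    int used;            /* Number of used tests */
    int failed;          /* Number of failed tests */
    int successful;      /* Number of successful tests */
    int verified;        /* "+ from this were verified" */
};

/* Counts copied from the three Summary blocks in this comment. */
static const struct suite_summary RESULTS[3] = {
    { "clang 3.7.0 branch, libomp",  123,  5, 118, 114 },
    { "FSF gcc 5.2.0",               123,  7, 116,  96 },
    { "clang 3.7.0 branch, libgomp", 123, 28,  95,  14 },
};

/* Nonzero if the counts in a summary add up. */
int summary_is_consistent(const struct suite_summary *s)
{
    return s->failed + s->successful == s->used
        && s->verified <= s->successful;
}

/* Percentage of used tests that were verified, rounded down. */
int verified_percent(const struct suite_summary *s)
{
    assert(s->used > 0);
    return 100 * s->verified / s->used;
}
```

With these counts, the verified share works out to 92% for the libomp run, 78% for gcc 5.2.0 and 11% for the libgomp default.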

hans added a comment.Jul 23 2015, 9:34 AM

Jack, I'm not trying to question the completeness of your implementation. My apologies if it was interpreted that way.

I'm just trying to make sure the release notes match the actual user experience.

ABataev updated this revision to Diff 30556.Jul 23 2015, 8:15 PM
ABataev edited edge metadata.

Updates on current status after comments from Hans

In D11059#210215, @hans wrote:

-fopenmp=libomp?

This one does work "out of the box" (provided the user has the runtime library available). As far as I can see, Alexey updated his patch to reflect this.

In D11059#210215, @hans wrote:

Where is the user supposed to get the runtime lib from? It's not currently part of the pre-built binaries that ship as part of the release.

The OpenMP runtime sources (along with build instructions) have been part of the LLVM release since 3.5. As I understand it, only the core clang + llvm compilers are supplied as pre-built binaries; the rest is available as source code only.

Yours,

Andrey Bokhanko

Software Engineer
Intel Compiler Team
Intel

So is the default of -fopenmp=libgomp going to be left in place just for the 3.7.0 release or for all future 3.7.x maintenance releases? Frankly, this decision to favor a non-functional OpenMP implementation over our own OpenMP library is baffling if the goal is to get widespread testing of this new feature.

jhowarth added a comment.EditedJul 24 2015, 11:44 AM

Also, if we are going to leave the default for CLANG_DEFAULT_OPENMP_RUNTIME set to libgomp, wouldn't it be better to at least modify cfe-3.7.0.src/CMakeLists.txt so that the user could pass -DCLANG_DEFAULT_OPENMP_RUNTIME=libomp to override that default in their own builds of 3.7.0 rather than forcing them to invoke -fopenmp=libomp? Currently we lock them into this unless they manually edit the CMakeLists.txt.

hans added a comment.Jul 24 2015, 2:19 PM
In D11059#210215, @hans wrote:

-fopenmp=libomp?

This one does work "out of the box" (provided the user has the runtime library available). As far as I can see, Alexey updated his patch to reflect this.

Great.

In D11059#210215, @hans wrote:

Where is the user supposed to get the runtime lib from? It's not currently part of the pre-built binaries that ship as part of the release.

The OpenMP runtime sources (along with build instructions) have been part of the LLVM release since 3.5. As I understand it, only the core clang + llvm compilers are supplied as pre-built binaries; the rest is available as source code only.

compiler-rt, libc++ and other libraries which integrate nicely with the LLVM build are also part of the pre-built binaries, modulo platform support.

I have a patch at http://reviews.llvm.org/D11494 that would facilitate building the run-time as part of the release process and shipping it as a separate download on the release page. I think that would make it easier for users who wish to experiment with Clang's OpenMP support.

So is the default of -fopenmp=libgomp going to be left in place just for the 3.7.0 release or for all future 3.7.x maintenance releases?

The maintenance releases only contain bug fixes. I don't think changing this flag would be in scope.

Frankly, this decision to favor a non-functional OpenMP implementation over our own OpenMP library is baffling if the goal is to get widespread testing of this new feature.

Baffling or not, that is still the state on trunk, and I haven't seen any discussion or patches towards changing it. If nothing changes on trunk, there's nothing to consider for merging to 3.7.

hans accepted this revision.Jul 24 2015, 2:20 PM
hans added a reviewer: hans.

Updates on current status after comments from Hans

Thanks!

I've committed this to the branch in r243164.

This revision was automatically updated to reflect the committed changes.