- User Since
- Feb 5 2016, 9:28 AM (372 w, 2 d)
Fri, Mar 24
Jul 9 2022
Thank you @spatel
Seems my old write access is not valid anymore. Can I ask someone to push this commit?
Jul 8 2022
Removed dead function. Thanks @spatel for spotting this!
Following review comments.
@RKSimon turned out doing a simple copy-paste wasn't the best idea. I've corrected the code proposed within one of my comments and put it into a new patch. It passes all test now, nothing runs into infinite loop.
Jul 7 2022
@RKSimon could you give some suggestions where to go next with this?
Jul 6 2022
@RKSimon and @spatel you were right, this code can be happily removed, the fmul expansion is still happening when it's needed. I've uploaded a completely new commit.
Jul 4 2022
Jul 1 2022
Hi @RKSimon, thanks for your comments, although I'd like to obtain some of your guidance on what can I do to address them in the context of this commit.
Jun 29 2022
Sadly, powi is handled differently (than powf, pow and powl which all result in calling optimizePow() modified by this patch) in the SimplifyLibCalls.cpp code, so extending this patch in that direction would go beyond the initial scope and may overstretch my approved upstreaming activity. I would rather create separate commit covering powi later.
Jun 28 2022
Another review comment addressed.
Comment 2 addressed.
Addressed review comment 1.
Jun 27 2022
Jun 25 2022
Sep 29 2021
I do realize that this patch isn't the best solution for this problem, but I wonder what it would be. Maybe LICM (where the decision whether to hoist is made) should rely on something more than BasicAA, but which pass should do such analysis and basing on what criteria?
Sep 28 2021
Apr 9 2020
May 16 2019
May 11 2019
May 10 2019
Jan 21 2019
It's ok, works for me now.
This commit causes Flang unit tests to fail.
Nov 7 2018
Not much I'm able to say. The error is the malformed integer value resulting in incorrect computation result, more often on AArch64, less often on x86_64, both architectures running under Linux. The code does something unusual so I'm not surprised no one faced this before. Also keep in mind that I would never find it if hyperbarriers weren't introduced for all architectures.
I need to contact someone who introduced KMP_REVERSE_HYPER_BAR implementation (unfortunately, it was added here in a huge collective commit years ago, Tue Oct 7 16:25:50 2014).
I have a piece of proprietary code that stopped to fail on AArch64 after I reverted D40358. Digging deeper I found out that it's not the problem with hyperbariers per se. As I undefined KMP_REVERSE_HYPER_BAR, the problem went away. What else I found is that the problem can be reproduced on x86_64 too, only less frequent (once per
for i in `seq 1 1000`; do
execution). As I'd have to spend a time to figure out the logic behind it and verify its validity, in the meantime, can someone familiar with this code have a look at it and confirm reverse implementation is sane?
Oct 5 2018
This commit is bit damaging for those who want to use libomp.a for -static built applications. As KMP_DYNAMIC_LIB is hardcoded to 1 in kmp_config.h.cmake, dlopen will always be used causing following link-time warning:
warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
This makes virtually impossible to copy libomp.a across various Linux boxes or shipping libomp.a in binary libomp packages addressed to various Linux systems.
Sep 7 2018
Sep 5 2018
Jul 25 2018
Feb 9 2018
OMPT test cases all pass after that change.
Can't compile this
Jan 11 2018
I suspect proposed change in kmp_gsupport.cpp might of confused you (as I check only for arch but not for the compiler). Note that gsupport (GOMP interface for libomp) must be compatible with what GCC-compiled programs do, otherwise compiling libomp with clang and running test cases compiled with GCC would lead to confusing inconsistency.
Ahh, I didn't say it isn't working, but I also didn't explicitly said that: when libomp+OMPT is built with clang on 32-bit ARM, all of OMPT test cases pass iff they are compiled with clang.
The problem is on 32-bit ARM and GCC.
Jan 10 2018
Ok, relevant line changed. I didn't notice that although kmp_platform.h (where KMP_ARCH_ARM is defined) is included from callback.h, kmp_os.h where KMP_COMPILER_GCC is defined is not included.
Also note that both clang++ and g++ define __GNUC__ hence full check for GNU C++ compiler looks as follows:
#if defined(__GNUC__) && !defined(__clang__) #warning this is a GNU compiler #else #warning this is NOT a GNU compiler #endif
Guys, I can think of approach as follows, but fixing all relevant lit tests to make FileCheck directives aware of relevant architecture+compiler combinations is beyond my imagination:
Jan 9 2018
Does a Clang built runtime work for GCC? In particular, do all tests pass when building with Clang but adding -DOPENMP_TEST_C_COMPILER=gcc -DOPENMP_TEST_CXX_COMPILER=g++?
I really like this message that proposed patch causes on ARM + gcc when someone's trying to set -DLIBOMP_OMPT_SUPPORT=ON:
CMake Error at runtime/cmake/LibompUtils.cmake:27 (message): LIBOMP: OpenMP Tools Interface requested but not available in this implementation Call Stack (most recent call first): runtime/CMakeLists.txt:316 (libomp_error_say)
This is more clean solution than passing NULL which effects in less pleasant failures.
Nah, doing architecture && compiler doesn't seem to work. I've got some better idea.
Found that __builtin_frame_address(1) gives wrong answer in __kmp_api_GOMP_parallel (a.k.a KMP_API_NAME_GOMP_PARALLEL as in kmp_gsupport.cpp). Then I've found out that this won't work with gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60109
Status: RESOLVED WONTFIX
We can mark it UNSUPPORTED on arm32/gcc
It's in https://reviews.llvm.org/D41854
Jan 8 2018
It's kmp_gsupport.cpp issue I presume, was it ever tested on any 32-bit platform? I'll try to look at it through the day.
The older gcc version the worse it gets, with 4.9.1-16ubuntu6 it's:
I didn't even know these tests are allowed to be compiled with anything else than clang 6.0+...
what's the next step here?
I did my tests on 32-bit dual-core ARMv7 Cortex-A9 machine. Just in case I also tested it on 64-bit 96-cores ARMv8.1 machine to make sure there are no regressions.
Dec 23 2017
@Hahnfeld gosh, you're right and I don't work with 32-bit systems nowadays so I need to track down one of those still kept online. Being far from the office isn't helping, so it won't happen immediately.
Dec 22 2017
Dec 21 2017
Dec 13 2017
Dec 10 2017
Example of old format layout (note that 'Processor :' and 'processor :' won't collide as we don't call strncasecmp()):
Dec 9 2017
Dec 8 2017
I tried various number of threads with OMP_NUM_THREADS (in range between 4 and 96 on 96 cores dual socket ARMv8.1 machine, and between 2 and 4 on 4 cores single socket ARMv8 machine) and enforced affinity with OMP_PROC_BIND=spread, so I guess it answers your question.
+1, although I ran this omp_single_copyprivate.c example in a loop 10000 times on two different AArch64 machines (ARMv8 and ARMv8.1) and didn't observe any hangs, before and after applying this patch.