Details
- Reviewers
spatel RKSimon craig.topper hfinkel
Event Timeline
add _mm256_cmp_pd double version
add comments in lib/CodeGen/CGBuiltin.cpp
replaced 0xf with _CMP_TRUE_UQ in avx-builtins.c
Should we handle the 'pd256' version the same way?
How about the 0xb ('false') constant? It should produce a zero here?
Can or should we deal with the signalling versions (0x1b, 0x1f) too?
lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
7949–7959 | Hm, it looks like 0xb (_CMP_FALSE_OQ) is ordered, so that is not possible, and 0x1b or 0x1f might emit a signal. |
lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
7949–7959 | I didn't follow this reasoning. It's probably helpful to run the program attached to PR28110 ( https://bugs.llvm.org/show_bug.cgi?id=28110 ) to confirm or deny if these predicates behave like you expect. |
We should've asked this first: is that fold allowed in the default FPENV state that we assume that clang is operating in?
I suppose it is FE_ALL_EXCEPT.
Ping. [andrew.w.kaylor, scanon] Is it OK to assume that FP exceptions are off by default and to allow this transformation to constants in the IR, given that we know we would get an exception with "1.00 -nan" for _mm256_cmp_ps(a, b, 15) if FP exceptions were enabled?
Update after http://lists.llvm.org/pipermail/llvm-dev/2017-June/114120.html. Added 0x1b(_CMP_FALSE_OS), 0x1f(_CMP_TRUE_US) handling.
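For readers following along, here is a minimal sketch of the four immediates this thread ends up treating as constant-producing. It is not the CGBuiltin.cpp code from the patch: the helper name `foldedCompareResult` and the `kCmp*` constants are hypothetical, and the sketch assumes FP exceptions are ignored, as discussed above.

```cpp
#include <optional>

// Predicate immediates as named in this review thread.
constexpr unsigned kCmpFalseOQ = 0x0b; // _CMP_FALSE_OQ
constexpr unsigned kCmpTrueUQ  = 0x0f; // _CMP_TRUE_UQ
constexpr unsigned kCmpFalseOS = 0x1b; // _CMP_FALSE_OS
constexpr unsigned kCmpTrueUS  = 0x1f; // _CMP_TRUE_US

// Hypothetical helper (an illustration, not the actual patch).
// Returns true if every lane of the compare result is all-ones,
// false if every lane is all-zeros, and std::nullopt if the result
// depends on the operands and so cannot be folded to a constant.
// Assumption: FP exceptions are not observed, so the signalling
// variants (0x1b, 0x1f) are folded the same way as the quiet ones.
constexpr std::optional<bool> foldedCompareResult(unsigned imm) {
  switch (imm) {
  case kCmpFalseOQ:
  case kCmpFalseOS:
    return false;
  case kCmpTrueUQ:
  case kCmpTrueUS:
    return true;
  default:
    return std::nullopt;
  }
}
```

Under that assumption, `foldedCompareResult(0x0f)` holds `true`, while an operand-dependent predicate such as 0x00 yields `std::nullopt`.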
Functionally, I think this is correct and complete now. See inline for some nits.
lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
7925–7926 | Fix comment to something like: "Except for predicates that create constants, ..." | |
7934 | would produce --> produces | |
7940 | Formatting: over 80-col limit. | |
7950 | would produce --> produces | |
7956 | Formatting: over 80-col limit. | |