This is an archive of the discontinued LLVM Phabricator instance.

[ConstantFold] Optimize xor undef, undef to undef
AbandonedPublic

Authored by aqjune on May 6 2020, 4:59 PM.

Details

Summary

This patch makes ConstantFold optimize xor undef, undef to undef , which was mentioned at D79452.
Correctness of the transformation: http://volta.cs.utah.edu:8080/z/GAs5QJ

Checked that it did not miscompile testsuite & SPECCPU 2017 . The used testsuite commit was 44dad74558616adcdf54a289db4466c7ab7d2563
because more recent testsuite raised https://bugs.llvm.org/show_bug.cgi?id=45818 (with & without this change).

Diff Detail

Event Timeline

aqjune created this revision.May 6 2020, 4:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 6 2020, 4:59 PM

This is correct, strictly speaking, but has the situation with misguided users really improved any? That same code using _mm_xor_ps() or whatever to zero __m128 variables is probably still out there, and clang still doesn't print any diagnostic by default if you try to do that. And this still isn't really a useful fold.

spatel added a comment.May 7 2020, 5:31 AM

This is correct, strictly speaking, but has the situation with misguided users really improved any? That same code using _mm_xor_ps() or whatever to zero __m128 variables is probably still out there, and clang still doesn't print any diagnostic by default if you try to do that. And this still isn't really a useful fold.

Clang does recognize this given the right flags, but I agree there's more risk than reward here - folding to "0" should be almost as effective for optimization. There's code out there that will probably start misbehaving in very difficult-to-diagnose ways if we do this.

$ cat xor.c 
#include <xmmintrin.h>

int f() {
  int x;
  return x ^ x;
}

__m128 g() {
  __m128 x;
  return _mm_xor_si128(x, x);
}
$ ./clang xor.c -c
$ ./clang xor.c -c -Wall
xor.c:5:10: warning: variable 'x' is uninitialized when used here [-Wuninitialized]
  return x ^ x;
         ^
xor.c:4:8: note: initialize the variable 'x' to silence this warning
  int x;
       ^
        = 0
xor.c:10:24: warning: variable 'x' is uninitialized when used here [-Wuninitialized]
  return _mm_xor_si128(x, x);
                       ^
xor.c:9:3: note: variable 'x' is declared here
  __m128 x;
  ^
2 warnings generated.
xbolva00 added inline comments.
llvm/test/Transforms/InstCombine/xor-undef.ll
1

Create proper test file with CHECKs? (Avoid grep)

I agree, even though this is technically correct, the risk/reward isn't worth it. LLVM used to do this pervasively, but we switched to producing zero out of practical concerns.

aqjune abandoned this revision.May 7 2020, 10:46 AM

I reconsidered about possible risks and I became to agree that this patch required more discussion about its effectiveness in practice.
The similar case would be the optimizations that exploit signed overflow, but unlike signed overflow, I could not detect any performance benefit at least from testsuite (with LTO enabled); all binaries remained the same. Since folding it to undef allows more freedom in theory, I thought it was a low-hanging fruit, but it doesn't seem to be in reality.
So, I agree that it is okay to have the xor undef, undef -> zero transformation. It works in practice, and it does not break existing source programs that uses the pattern to zero-ize variables.