This is an archive of the discontinued LLVM Phabricator instance.

[x86] fold fand (fxor X, -1) Y --> fandn X, Y
ClosedPublic

Authored by spatel on Dec 3 2016, 8:14 AM.

Details

Summary

I noticed this gap in the scalar FP-logic matching with:
D26712
and:
rL287171

AFAIK, the vector versions of this already work.
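
For readers unfamiliar with the fold in the title, here is a minimal C++ sketch of the bit-level identity it relies on. It is not code from the patch. It assumes that fandn X, Y means (~X) & Y on the raw bit pattern, matching the ANDNPS/ANDNPD instructions. The helper names (bitsOf, fandOfFxorBits, fandnBits) are hypothetical, and the results stay in the integer domain so that NaN bit patterns are not disturbed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: view the bits of a double as a 64-bit integer.
static std::uint64_t bitsOf(double d) {
  std::uint64_t u;
  std::memcpy(&u, &d, sizeof u);
  return u;
}

// Unfused form from the title: fand (fxor X, -1), Y
std::uint64_t fandOfFxorBits(double x, double y) {
  std::uint64_t notX = bitsOf(x) ^ ~std::uint64_t{0}; // fxor X, all-ones
  return notX & bitsOf(y);                            // fand with Y
}

// Fused form (assumed semantics): fandn X, Y == (~X) & Y
std::uint64_t fandnBits(double x, double y) {
  return ~bitsOf(x) & bitsOf(y);
}
```

Both functions return the same bits for any inputs, which is why the two-node pattern can be replaced by a single node. A familiar instance is fandn with X = -0.0, which clears the sign bit of Y.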

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 80180. Dec 3 2016, 8:14 AM
spatel retitled this revision to [x86] fold fand (fxor X, -1) Y --> fandn X, Y.
spatel updated this object.
spatel added reviewers: craig.topper, delena, RKSimon.
spatel added a subscriber: llvm-commits.
delena added inline comments. Dec 3 2016, 10:58 AM
lib/Target/X86/X86ISelLowering.cpp
31745 ↗(On Diff #80180)

It should work for scalar and vector types, right? You check only scalar VT (f32, f64) here.

spatel added inline comments. Dec 3 2016, 11:45 AM
lib/Target/X86/X86ISelLowering.cpp
31745 ↗(On Diff #80180)

I think vector types always go through combineANDXORWithAllOnesIntoANDNP() for this transform, because we peek through the bitcasts to find the integer logic ops for vectors. For scalars, we transform to the X86-specific FP-logic nodes, so they need a separate path. I'm not sure that split is necessary, but we had load-folding bugs when we tried to handle vectors and scalars together.

So this example already works without this patch:

define <2 x double> @FsANDNPSrr(<2 x double> %x, <2 x double> %y) {
  %bc1 = bitcast <2 x double> %x to <2 x i64>
  %bc2 = bitcast <2 x double> %y to <2 x i64>
  %not = xor <2 x i64> %bc2, <i64 -1, i64 -1>
  %and = and <2 x i64> %bc1, %not
  %bc3 = bitcast <2 x i64> %and to <2 x double>
  ret <2 x double> %bc3
}

$ ./llc -o - andn.ll 
  andnps	%xmm0, %xmm1
  movaps	%xmm1, %xmm0
  retq
delena accepted this revision. Dec 4 2016, 3:41 AM
delena edited edge metadata.
This revision is now accepted and ready to land. Dec 4 2016, 3:41 AM
This revision was automatically updated to reflect the committed changes.