This is an archive of the discontinued LLVM Phabricator instance.

Fix signbit related bitcast optimization for ppc_fp128
ClosedPublic

Authored by timshen on Dec 1 2015, 6:44 PM.

Details

Summary

Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst, x)) combines for ppc_fp128, since signbit computation is more complicated.

Discussion thread: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html

Diff Detail

Event Timeline

timshen updated this revision to Diff 41589.Dec 1 2015, 6:44 PM
timshen retitled this revision from to Disable signbit related bitcast optimization for ppc_fp128.
timshen updated this object.
timshen set the repository for this revision to rL LLVM.
timshen retitled this revision from Disable signbit related bitcast optimization for ppc_fp128 to Disable signbit related bitcast optimization for illegal types (e.g ppc_fp128).Dec 1 2015, 7:02 PM
timshen updated this object.Dec 1 2015, 8:22 PM
hfinkel edited edge metadata.Dec 2 2015, 1:22 PM

Is the code generation for ppc_fp128 better this way? If not, please actually fix the problem. If so: 1) please specifically test (in the regression test) for the better code generation and 2) make the change specific to ppc_fp128 [doing this for all illegal types is not good because on soft-float implementations this is still useful even though the types are illegal].

timshen updated this revision to Diff 41697.Dec 2 2015, 5:05 PM
timshen retitled this revision from Disable signbit related bitcast optimization for illegal types (e.g ppc_fp128) to Disable signbit related bitcast optimization for ppc_fp128.
timshen updated this object.
timshen edited edge metadata.
timshen removed rL LLVM as the repository for this revision.

The code generation for ppc_fp128 better, since it's originally wrong and now correct. Changed checks to pin down the exact correct behavior.

timshen set the repository for this revision to rL LLVM.Dec 2 2015, 5:05 PM
timshen updated this revision to Diff 41701.Dec 2 2015, 5:14 PM

Oops, wrong diff. Updated diff.

hfinkel requested changes to this revision.Dec 2 2015, 5:16 PM
hfinkel edited edge metadata.

The code generation for ppc_fp128 better, since it's originally wrong and now correct. Changed checks to pin down the exact correct behavior.

One of the things that I really like about working on LLVM is that It is the general philosophy of this project not to simply disable optimizations in otherwise-broken corner cases, but to properly fix them. Obviously this is not always practical, but it certainly is in this case.

Please prototype a fix which generates the correct mask for ppc_fp128 and compare the resulting code quality. If you don't wish to do this, please let me know, and I (or someone else who works on PowerPC here) will do it for you.

Also, please upload your patches with full context. See http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface for instructions.

This revision now requires changes to proceed.Dec 2 2015, 5:16 PM
timshen updated this revision to Diff 41844.Dec 3 2015, 11:41 PM
timshen retitled this revision from Disable signbit related bitcast optimization for ppc_fp128 to Fix signbit related bitcast optimization for ppc_fp128.
timshen updated this object.
timshen edited edge metadata.
timshen removed rL LLVM as the repository for this revision.

I have updated the combiners, with these optimizations correctly implemented.

Compared to disabling these optimizations:
For fabs, both are 12 instructions, but the optimized version have no floating point instructions.
For fneg, they are exactly the same.
For fcopysign, it's way better - no constants, no function call, less instructions.

Links to the actual assembly code:
https://gist.github.com/innocentim/ae3c3807161b236bbbc7
https://gist.github.com/innocentim/5373dd6ca13c5b3a35b3

I'm not sure if we should keep fabs & fneg combiners. I kept them in the updated patch just as part of the investigation result.

Adding Kit and Bill as they can probably help out here as well.

Thanks for continuing to work on this!

The EXTRACT_ELEMENT index is Endian sensitive in this case, right? I suspect you need to use index 1 for big-Endian targets.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7331

I think you need a check for !LegalTypes here because EXTRACT_ELEMENT should only be used prior to legalization, and i64 might not be legal (if you're compiling for ppc32).

test/CodeGen/PowerPC/fp128-bitcast-after-operation.ll
2

Please also run these tests for the pwr7 and some ppc32 target.

timshen updated this revision to Diff 42459.Dec 10 2015, 12:53 PM
timshen edited edge metadata.

Done.

timshen updated this revision to Diff 42461.Dec 10 2015, 12:57 PM

Renamed the helper function to start with a lower case letter.

hfinkel accepted this revision.Dec 10 2015, 1:07 PM
hfinkel edited edge metadata.

LGTM, thanks!

This revision is now accepted and ready to land.Dec 10 2015, 1:07 PM

I've gone ahead and committed this here:

dzur:~/sources/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
A test/CodeGen/PowerPC/fp128-bitcast-after-operation.ll
M lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Committed r255305

Thank you Hal and Eric for reviewing and committing!

timshen closed this revision.Jan 8 2016, 11:29 AM