This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Fix up SimplifyDemandedBits for ppcf128
Needs RevisionPublic

Authored by nemanjai on May 27 2020, 6:22 PM.

Details

Reviewers
hfinkel
Carrot
RKSimon
echristo
Group Reviewers
Restricted Project
Summary

On little endian PPC, the order of the doubles in a PPC double-double are reversed. When performing a "demanded bits" simplification, the demanded bits have to be rotated so that the bits come from the correct double.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45475

Diff Detail

Unit TestsFailed

Event Timeline

nemanjai created this revision.May 27 2020, 6:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 27 2020, 6:22 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
echristo accepted this revision.May 27 2020, 6:25 PM
echristo added a subscriber: echristo.

Oof. That's painful, but I guess works for now.

This revision is now accepted and ready to land.May 27 2020, 6:25 PM
RKSimon added inline comments.May 28 2020, 2:06 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
852

Won't this rotate the bits every time you recursively call the function? You might need to limit this to Depth == 0 and then handle/bailout in the ISD::BITCAST case below.

nemanjai updated this revision to Diff 266805.May 28 2020, 4:00 AM
nemanjai marked an inline comment as done.

Add handling for recursively traversing multiple nodes that produce ppcf128 - thanks for spotting this @RKSimon.
Also added a test case for which we perform such a traversal: (select (build_pair)).

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
852

Ah, this is a very good point. I think it would be very difficult to construct a test case where we traverse multiple nodes that produce a ppcf128 value, but it should be possible in principle. Limiting it to Depth == 0 would not achieve the desired result since we are likely entering the recursion from a different node (similarly to how we do so in the attached test case).

nemanjai marked an inline comment as done.May 28 2020, 4:01 AM
RKSimon requested changes to this revision.May 28 2020, 4:58 AM
RKSimon added inline comments.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
855

This is going to have the issue that it doesn't reset the flag to its old state when you leave the function, and TLO by design needs to be passed by reference.

Initially we might just have to bailout if we encounter MVT::ppcf128 until we can come up with a better solution?

This revision now requires changes to proceed.May 28 2020, 4:58 AM
nemanjai marked an inline comment as done.May 28 2020, 5:09 AM
nemanjai added inline comments.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
855

I took a look at the calls to this function and was fairly certain that every time we do a non-recursive call to this, we will create a new TLO (which would set the flag to its initial state of false). However, maybe that is not the case (or not guaranteed to be the case in the future).

My initial thoughts on this were to reset the flag to false upon entering recursion (i.e. Depth == 0) but I ended up not adding that for the above reason.

I believe that if I add something like the following above this condition:

if (Depth == 0)
  TLO.BitsRotatedForPPCF128 = false;

we will achieve the desired result. Leaving the bit set after leaving the function isn't really an issue - as long as it is reset when we enter the recursive call chain.

Conversely, I can add another parameter to the function with a default value of false to signal that we have encountered a ppcf128 and rotated the demanded bits. But I would favour the first approach I outlined above as it does not further complicate the API here.

@RKSimon What do you think about the above suggestion (i.e. clearing the flag at the start of recursion)?

@RKSimon Do you think you'll have a chance to have another look at this some time soon?

oops - sorry I didn't see this as phab still puts it under "Waiting on Authors" - I'll take a look soon I promise!

please pre-commit the pr45475.ll test case against trunk to show the current codegen and rebase the patch to show the codegen delta

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
855

I'm still worried that we're not correctly resetting the flag when we return from deeper SimplifyDemandedBits calls (at Depth > 0 - not a new instance) - if we then have another f128 node that we attempt to recurse into as part of the same SimplifyDemandedBits call tree then the demandedbits that time won't get flipped - e.g. select (i1, f128, f128)?

llvm/test/CodeGen/PowerPC/pr45475.ll
42

do you need the dso_local ?