This is an archive of the discontinued LLVM Phabricator instance.

[SDAG] Handle BUILD_PAIR in ComputeNumSignBits
AbandonedPublic

Authored by danilaml on Dec 18 2019, 5:16 AM.

Details

Summary

ComputeNumSignBits function was missing a case for BUILD_PAIR node.

Handle it by computing sign bits of upper part. If it's all ones, add sign bits
of the lower part.

I've encountered it in some test with really big BB that got split in ISel, so sext i32 to i64 happened in one "subDAG", while the actual operation (in this case - mul) in the other.
I wasn't able to produce some small self-contained test-case, but the change seems small and straight-forward enough to not require one.
Still, I'd be happy if someone has ideas w.r.t testing.

Diff Detail

Repository
rL LLVM

Event Timeline

danilaml created this revision.Dec 18 2019, 5:16 AM
danilaml created this object with visibility "No One".
danilaml created this object with edit policy "No One".
danilaml edited the summary of this revision. (Show Details)Dec 20 2019, 8:31 AM
danilaml added reviewers: spatel, craig.topper.
danilaml changed the repository for this revision from rL LLVM to rG LLVM Github Monorepo.
danilaml changed the visibility from "No One" to "Public (No Login Required)".
danilaml changed the edit policy from "No One" to "All Users".
spatel accepted this revision.Dec 20 2019, 11:53 AM

I'm not sure how to test this either, but seems correct, so LGTM.

This revision is now accepted and ready to land.Dec 20 2019, 11:53 AM
craig.topper requested changes to this revision.Dec 20 2019, 12:27 PM
craig.topper added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
3461

I don't think this is correct. If the first call returns "all sign bits" then we check operand(0) that call will always return at least 1. So we increase the total by at least 1. So its impossible for this code to ever return that the number of sign bits of a build_pair is exactly half the total size of the build_pair. That seems wrong.

This revision now requires changes to proceed.Dec 20 2019, 12:27 PM
spatel requested changes to this revision.Dec 20 2019, 1:18 PM

Rescinding my approval. Craig's right - this won't work as expected.

danilaml marked an inline comment as done.Dec 23 2019, 7:56 AM
danilaml added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
3461

Good catch. Do you know of any way to recover/query whether one value is a replicated sign bit of another?
Computing the lo part sign bits is crucial to trigger proper (u)smul_hilo transformations latter on, so missing this info seems unfortunate.
It looks like the relevant info is there when copyFromRegs are generated, but I couldn't find a way to represent it in the DAG similarly to Assert*ext (other than breaking the semantics somewhat and using Assert* with Lo part node).

danilaml added a comment.EditedDec 23 2019, 9:19 AM

One of the motivating examples (not sure how stable this IR to use as a test) is something like this (needs -mllvm -disable-cgp to prevent sext sinking):

define i64 @foo(i32 %a, i32 %b) local_unnamed_addr {
entry:
  %conv6 = sext i32 %a to i64
  %conv17 = sext i32 %b to i64
  %add = add i64 %conv6, 42
  %cmp0 = icmp slt i64 %conv17, %add
  br i1 %cmp0, label %exit, label %exit2 ; branch to prevent bb merging

exit:
  %mul = mul nsw i64 %conv6, %conv17
  ret i64 %mul
exit2:
  ret i64 0
}

On 32-bit target %conv* would be copyFromReg'ed across BBs in the DAG in a pair of registers. Without knowing that ComputeNumSignBits for their build_pair is 33 (since it's just a copy from sext i32 to i64), LLVM won't generate smul_lohi.

Your best bet is to drill down inside the build_pair to see if it comes from a common source, but it'll be very limited.

Your best bet is to drill down inside the build_pair to see if it comes from a common source, but it'll be very limited.

I believe in this case, the inputs to the build_pair are just two copyfromregs on liveins to the basic block. There's no where to drill down to.

And the other problem is that the sign bits information of the originating basic block is calculated between the last DAG combine and isel, so all the producing instructions have already been split.

bjope added a subscriber: bjope.Dec 24 2019, 4:20 PM
danilaml marked an inline comment as done.Dec 30 2019, 8:09 AM
danilaml added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
3462

@craig.topper I could leave this part out of the patch, however this reduces its usefulness (as it'll no longer solve the problem I've mentioned before).
Does this make sense? Perhaps there is some other way to get the info I want (i.e. that two values are just parts of the same original value, so share it's sign bits)? If not, is there a straightforward way to add it? If I understood correctly, this whole thing will become obsolete after transition to GlobalIsel so I'm not sure if it'll be worth the effort.

@danilaml for your original case, why isn't CodeGenPrepare helping by sinking the sext?

bjope added a comment.Dec 30 2019, 3:16 PM

@danilaml for your original case, why isn't CodeGenPrepare helping by sinking the sext?

Maybe it has been edited, but above the original example it says:

(needs -mllvm -disable-cgp to prevent sext sinking)

About "how to test", maybe it is possible to use unittests/CodeGen/AArch64SelectionDAGTest.cpp (or a similar mockup). We already got a few ComputeKnownBits tests placed there.

Btw, one option (if the SMUL_LOHI is important for your target here) could be to detect the widening multiplication in a pass just before ISel, and rewrite mul into some kind of smul_lohi intrinsic.

@craig.topper this pass runs on IR. In the real code that prompted me to come up with this patch the BB block was split during ISel building phase, because (IUC) it was too big. And just so happens that the split was right between sext to i64 and mul. I've tried to simulate it with the code above.
@bjope Thanks. That might be a valid option for testing (although it's not really target specific at this point). However, it might be moot if it turns out there is no way to either 1) check whether two CopyFromRegs came from the same IR value or 2) find out in other ways that one node is just replicated sign bit of the other node.

@danilaml Abandon this? Did you ever find an alternative or specific test cases?

danilaml abandoned this revision.Jun 25 2020, 3:10 AM

@RKSimon sadly not.