This is an archive of the discontinued LLVM Phabricator instance.

ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141)
ClosedPublic

Authored by spatel on Jul 19 2015, 4:54 PM.

Details

Summary

PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141
contains a test case where we have duplicate entries in a node's uses() list.

After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list.

The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combineRepeatedFPDivisors() check entirely because there really is only one user of the divisor.

Diff Detail

Event Timeline

spatel updated this revision to Diff 30129.Jul 19 2015, 4:54 PM
spatel retitled this revision from to ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141).
spatel updated this object.
spatel added reviewers: hfinkel, majnemer, rtrieu, nlewycky.
spatel added a subscriber: llvm-commits.

One inline comment/question. Otherwise it looks OK to me, but I don't do much in here a lot.

-eric

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8373–8374

Could probably merge it into the existing if statement?

That said, I guess using std::find on it is how we do it for most of these, I assume we don't expect a lot of users and so would want to use a different mechanism?

spatel added inline comments.Jul 20 2015, 1:30 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8373–8374

Sure, I can merge it into the compound 'if'.

I thought about changing this to a set (DenseSet?) rather than SmallVector, but I'm also assuming that we would not expect more than a few users, so left it as-is for now.

If you would prefer using a set here, let me know. Thanks!

spatel updated this revision to Diff 30192.Jul 20 2015, 2:08 PM

Patch updated:
I just discovered 'SmallPtrSet'.
Looks like it was designed exactly for this case, so let's use it here instead of a vector. Then, we don't have to use std::find() for dup checking.

Patch updated:
I just discovered 'SmallPtrSet'.
Looks like it was designed exactly for this case, so let's use it here instead of a vector. Then, we don't have to use std::find() for dup checking.

Egads no, that makes the iteration order unstable.

This whole thing seems wrongly formulated. Why are we walking the *entire* users list prior to finding out whether there are few enough users? I feel like the TLI interface should say how many can be handled rather than accepting a number we want to handle.

And then the code here should be to insert into a SetVector until we exceed that threshold.

Patch updated:
I just discovered 'SmallPtrSet'.
Looks like it was designed exactly for this case, so let's use it here instead of a vector. Then, we don't have to use std::find() for dup checking.

Egads no, that makes the iteration order unstable.

This whole thing seems wrongly formulated. Why are we walking the *entire* users list prior to finding out whether there are few enough users? I feel like the TLI interface should say how many can be handled rather than accepting a number we want to handle.

And then the code here should be to insert into a SetVector until we exceed that threshold.

I'm not following. We're asking the TLI if we have exceeded a minimum threshold, not a maximum. If we exceed the threshold, we want to transform all divisions that have this divisor...so we always have to walk all users?

hans added a subscriber: hans.Jul 22 2015, 9:36 AM
hfinkel edited edge metadata.Jul 25 2015, 4:49 PM

Patch updated:
I just discovered 'SmallPtrSet'.
Looks like it was designed exactly for this case, so let's use it here instead of a vector. Then, we don't have to use std::find() for dup checking.

Egads no, that makes the iteration order unstable.

This whole thing seems wrongly formulated. Why are we walking the *entire* users list prior to finding out whether there are few enough users? I feel like the TLI interface should say how many can be handled rather than accepting a number we want to handle.

And then the code here should be to insert into a SetVector until we exceed that threshold.

I'm not following. We're asking the TLI if we have exceeded a minimum threshold, not a maximum. If we exceed the threshold, we want to transform all divisions that have this divisor...so we always have to walk all users?

Part of the issue here is that we might walk all of the users even though TLI.combineRepeatedFPDivisors will return false for any size. This is wasteful. We should just have TLI.combineRepeatedFPDivisors return the size threshold so that we can avoid walking if there is no point (and return false if no combination is ever desired).

Once we know that we're looking for a certain number of users, we can then see if we have enough. As Chandler says, you should use a SetVector so that the iteration order is stable.

Chandler, to be clear, we're not finding out if there are few enough users, we're walking to find out if there are enough (and, if there are, we need to iterate over all of them anyway).

Part of the issue here is that we might walk all of the users even though TLI.combineRepeatedFPDivisors will return false for any size.

Ah, understood. I forgot the default return is always 'false'.

Let me split this up then:

  1. Move the DAGCombiner's combineRepeatedFPDivisors logic into a helper function because it's getting too big (NFC).
  2. Fix the TLI hook to return the minimum uses threshold (NFC).
  3. Fix the data structure to be a SetVector to fix PR24141.
  1. Move the DAGCombiner's combineRepeatedFPDivisors logic into a helper function because it's getting too big (NFC).

Checked in:
http://reviews.llvm.org/rL243293

  1. Fix the TLI hook to return the minimum uses threshold (NFC).

Submitted for review:
http://reviews.llvm.org/D11531

  1. Fix the data structure to be a SetVector to fix PR24141.

This is a trivial change if the previous change is ok, so I'll abandon this as a patch for review unless someone would like to see the updated patch here prior to commit.

This revision was automatically updated to reflect the committed changes.