This is an archive of the discontinued LLVM Phabricator instance.

[EarlyCSE] hoist div/rem when sibling op exists (PR31028)
AbandonedPublic

Authored by spatel on Mar 16 2017, 9:58 AM.

Details

Summary

This is a more general solution to the SimplifyCFG transform in D30910 as suggested by Eli. It's not quite an "elimination", so we may be stretching the definition of "CSE". I've tried to make the intent clear in the code comments.

  1. I'm re-using the existing "DivOpInfo" struct as the hash table key. So the BypassSlowDivision changes are just to move that to a header to make it visible to EarlyCSE. We could change the name (DivRemKey?), but I didn't want to pollute this patch with the cosmetic diffs. Let me know if I should make either of those changes as a pre-commit.
  1. There's a TTI-cost-based bailout for targets where hoisting a remainder could be more expensive than we'd like. That was another suggestion from D30910. The last test triggers that check because an i128 mul op is not legal on x86-64, so that mul has cost = 2. We can probably make a more flexible cost calculation, but this should conservatively prevent the transform if a target doesn't have an appropriate mul instruction.
  1. The first 7 tests (4 positive and 3 negative) are copied from D30910. I think this solution covers anything we could do in SimplifyCFG, so I'll remove those tests from SimplifyCFG and abandon that patch if this is approved. We get the same improvements to the final x86 and AArch64 asm with this patch.

Diff Detail

Event Timeline

spatel created this revision.Mar 16 2017, 9:58 AM
efriedma set the repository for this revision to rL LLVM.
dberlin edited edge metadata.Mar 30 2017, 1:50 PM

I dunno. Does EarlyCSE do any other insertion?
I'm somewhat wary.
My general feeling is the insertion should be separate from the rest of what it does, but i understand that's hard in earlycse because it does not really produce an analysis.
I just feel weird be shoving more stuff in there :)
*especially* cost based stuff.

I'd actually feel better even if you did this in GVN or something that at least does insertion right now, even if that's not quite a great solution either.

I dunno. Does EarlyCSE do any other insertion?

This is my first experience with EarlyCSE, but I don't think there's anything else that moves an instruction. Everything else is killing from what I see.

I'm somewhat wary.
My general feeling is the insertion should be separate from the rest of what it does, but i understand that's hard in earlycse because it does not really produce an analysis.
I just feel weird be shoving more stuff in there :)
*especially* cost based stuff.

TTI was here of course, but yes, using the cost model is another potential precedent. :)

I'd actually feel better even if you did this in GVN or something that at least does insertion right now, even if that's not quite a great solution either.

OK, I'll have a (first) look at GVN.

(i am generalizing the ability to have instructions that are not in the IR in the value number tables, for NewGVN, but no reason to depend on that for now)

(i am generalizing the ability to have instructions that are not in the IR in the value number tables, for NewGVN, but no reason to depend on that for now)

I (finally) got around to looking at NewGVN by way of PR34228 ( https://bugs.llvm.org/show_bug.cgi?id=34228 ), and this functionality doesn't fit there either based on my limited understanding.

We don't want to keep adding weird stuff into existing passes, so are there objections to making this into a new single-purpose "hoist-div-rem" pass just for this transform?

spatel abandoned this revision.Aug 24 2017, 3:03 PM

Abandoning in favor of a stand-alone pass to do the same thing (D37121).