This is an archive of the discontinued LLVM Phabricator instance.

expose ILP for associative operations in the DAG
AbandonedPublic

Authored by spatel on May 14 2015, 12:18 PM.

Details

Summary

In the aftermath of the reversion of http://reviews.llvm.org/rL236031 (D9232), nobody said this idea was outright crazy...so I'm trying again. :)

This time, I've added a bunch of safety measures:

  1. Target-dependent and opt-in per target; currently only x86
  2. A map to count and limit the number of times we try this transform
  3. Only attempt after type legalization
  4. x86 FADD limit artificially low until we're sure there's no fallout

I ended up with target hooks (shouldReassociate / didReassociate) similar to what Jon suggested in a reply mail to r236031.

There are a bunch of TODO items to make this better, but I'm purposely starting as small as possible.

I initially thought that the tracking map should live in X86TargetLowering with the hooks, but that doesn't seem in tune with the rest of the class (it's all const AFAICT), and the target outlives the Combiner or DAG, so the map would need to be reset before each combine if it lived there?

Diff Detail

Event Timeline

spatel updated this revision to Diff 25794.May 14 2015, 12:18 PM
spatel retitled this revision from to expose ILP for associative operations in the DAG.
spatel updated this object.
spatel edited the test plan for this revision. (Show Details)
spatel added a subscriber: Unknown Object (MLST).
escha added a subscriber: escha.May 14 2015, 1:12 PM

Could you help me try to understand what purpose this serves in terms of complementing the already-existing reassociate pass? That is, what benefit comes from doing it in the DAG in addition to the existing reassociate IR pass?

In D9780#173061, @escha wrote:

Could you help me try to understand what purpose this serves in terms of complementing the already-existing reassociate pass? That is, what benefit comes from doing it in the DAG in addition to the existing reassociate IR pass?

Sure - this came up in the earlier review thread:

  1. http://reviews.llvm.org/D8941 provides a simple motivating example for doing this in the DAG; the multiplies that we want to reassociate don't exist before the DAGCombine that transforms the divisions.
  2. More important: this is not a canonical-type of transform that is suitable for IR; this transform increases register pressure and some targets (like some GPUs, as I learned from the previous reversion) simply won't benefit from this optimization because there is no ILP exposed to the programmer.

On Thu, May 14, 2015 at 2:24 PM, James Molloy <james@jamesmolloy.co.uk> wrote:

It sounds like the best time to do this if at all possible is in MachineInst form,
where we have MachineTraceMetrics and accurate register pressure information.

Is there a reason why you're not doing it there? I know it's more awkward, but
it really seems we need accurate register pressure if we're not going to make it
cripplingly conservative.

I only have marginal reasons for trying this here rather than with MachineInsts: I really wanted to make this easy for any target to opt-in, nobody pushed me away from a DAG transform, and I've never attempted a machine-level transform. If the consensus is that it's better done later when we have more register knowledge, I'll certainly give it a try.

I found my way to the "MachineCombiner" pass, and it seems like this optimization would fit right in there.

But I just thought of something scary: to do this reassociation on FP ops, we would have to extend fast-math-flags to MachineInstrs.

Note: I'm still working on getting FMF into the DAG (D8900), but it could take a while to work through the bugs...

Unless anyone sees a reason to keep this patch alive, I'll abandon this DAG combine.

spatel abandoned this revision.Jun 10 2015, 4:35 PM

Abandoning. Re-implemented using the MachineCombiner pass (D10321).