Making the late transformations opt-in results in less surprising behavior when composing multiple calls to the codegen strategy.
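As a rough illustration of the composition concern, here is a minimal Python sketch. It is not the MLIR C++ API: `LateOptions`, `Strategy`, `run_pipeline`, and the flag names are all hypothetical and only model the idea that late cleanups stay off unless a strategy asks for them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LateOptions:
    # Hypothetical flags: every late transformation is off by default,
    # so it runs only when a strategy explicitly opts in.
    enable_licm: bool = False
    enable_hoisting: bool = False


@dataclass
class Strategy:
    # Hypothetical stand-in for one codegen strategy invocation.
    steps: List[str] = field(default_factory=list)
    late: LateOptions = field(default_factory=LateOptions)

    def apply(self, log: List[str]) -> None:
        # Record the main transformations, then only the requested cleanups.
        log.extend(self.steps)
        if self.late.enable_licm:
            log.append("licm")
        if self.late.enable_hoisting:
            log.append("hoisting")


def run_pipeline(strategies: List[Strategy]) -> List[str]:
    # Compose several strategies and return the order in which steps ran.
    log: List[str] = []
    for strategy in strategies:
        strategy.apply(log)
    return log


tile = Strategy(steps=["tile"])
vectorize = Strategy(steps=["vectorize"], late=LateOptions(enable_licm=True))
pipeline_log = run_pipeline([tile, vectorize])
```

In this sketch the cleanup runs once, after the last strategy, rather than implicitly after every call.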
Repository: rG LLVM Github Monorepo
Thanks for addressing this, Nicolas! LGTM!
I agree that LICM and the other hoisting passes are always beneficial but I wonder if they should run by default as many times as the number of codegen strategies. I think it could be a good idea to disable all the existing (and future) cleanup passes by default to avoid surprises and minimize compile time. WDYT?
@dcaballe thanks!
This is by no means in any final shape or form, I am happy to evolve based on data.
To give more context, atm my first order bottleneck is being able to compose transformations easily and run autotuning at scale.
Once that first bottleneck is lifted, we will have a good statistical way to measure whether such improvements result in concrete gains.
In such a context, I think the metrics I'll be looking for are:
- does a particular change increase / decrease the density of "good" strategies in the search space, or do we lose important points and degrade the overall result? (think of this as an analogy to the final accuracy in an ML model / mixture of models)
- assuming the first point is not adversely impacted, does the change result in speed improvements and ideally faster time to solution? (think of this as the analogy to training speed in ML models)
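To make the two metrics above concrete, here is a minimal Python sketch. The definition of a "good" strategy (runtime within a tolerance of a reference runtime) and the names `good_density` and `time_to_solution` are assumptions made for illustration, not something taken from the autotuning setup discussed here.

```python
from typing import List, Optional, Tuple


def good_density(runtimes: List[float], reference: float,
                 tolerance: float = 0.1) -> float:
    """Fraction of sampled strategies whose runtime is within
    `tolerance` of `reference` (assumed definition of "good")."""
    if not runtimes:
        return 0.0
    limit = reference * (1.0 + tolerance)
    return sum(1 for r in runtimes if r <= limit) / len(runtimes)


def time_to_solution(samples: List[Tuple[float, float]], reference: float,
                     tolerance: float = 0.1) -> Optional[float]:
    """`samples` holds (search cost in seconds, measured runtime) pairs in
    the order the search visited them. Returns the cumulative search cost
    up to and including the first "good" strategy, or None if none is found."""
    limit = reference * (1.0 + tolerance)
    elapsed = 0.0
    for cost, runtime in samples:
        elapsed += cost
        if runtime <= limit:
            return elapsed
    return None
```

Comparing both numbers before and after a change (for example, disabling a cleanup pass by default) would show whether the search space lost good points and whether the search got faster.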
Also, keep in mind that we are operating at the Linalg level here and the number of ops is generally quite limited since they have pretty big granularity: basically we are talking about ops that can fuse together, with the caveat that combinations of map/broadcast/transpose/elementwise can all be fused into a single Linalg op.
As we move towards mixing with finer grained transformations, maybe at the loop and load/store level, I am sure these improvements will be more noticeable than they are today.