[NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file
ClosedPublic

Authored by hsaito on Apr 11 2018, 7:24 PM.

Details

Summary

This is a follow up to D45420 (included here since it is still under review and this change is dependent on that) and D45072 (committed).
Actual change for this patch is LoopVectorize* and cmakefile. All others are all from D45420.

LoopVectorizationLegality is an analysis and thus really belongs to Analysis tree. It is modular enough and it is reusable enough ---- we can further improve those aspects once uses outside of LV picks up.

Hopefully, this will make it easier for people familiar with vectorization theory, but not necessarily LV itself to contribute, by lowering the volume of code they should deal with. We probably should start adding some code in LV to check its own capability (i.e., vectorization is legal but LV is not ready to handle it) and then bail out.

Diff Detail

Repository
rL LLVM
hsaito created this revision.Apr 11 2018, 7:24 PM

Hi Hideki,

Can you just include these changes? Leave the other changes to the other review. It's a lot simpler to review that way.

Just saying it's a followup of D45420 is enough to help review.

cheers,
--renato

Hi Hideki,

Can you just include these changes? Leave the other changes to the other review. It's a lot simpler to review that way.

Just saying it's a followup of D45420 is enough to help review.

cheers,
--renato

You mean exclude. Sure. I can do that.

hsaito updated this revision to Diff 142417.Apr 13 2018, 8:09 AM

D45420 portion excluded. Note that this patch won't build w/o D45420 patch.

No changes to the actual code. Some member functions are moved out of class definition to the .cpp file, since they actually don't belong to the header file in .h/.cpp configuration.

createMissedAnalysis() is renamed since it seems to have a need to move to llvm namespace (from static function).

rengolin accepted this revision.Apr 13 2018, 8:17 AM

Right, pure code movement, and a good clean up at that, LGTM.

Just one nit in the comments and can go as soon as its deps are approved. Thanks!

lib/Transforms/Vectorize/LoopVectorize.cpp
22 ↗(On Diff #142417)

I wouldn't say "moved out", but just reference the file like "in LoopVectorizationLegality".

This revision is now accepted and ready to land.Apr 13 2018, 8:17 AM

Right, pure code movement, and a good clean up at that, LGTM.

Just one nit in the comments and can go as soon as its deps are approved. Thanks!

Thanks a lot. Will commit this after D45420.

hsaito updated this revision to Diff 144383.EditedApr 27 2018, 12:34 PM

Rebased, and Legal is kept under Transform, since D45420 has been abandoned.

Notice that a few member functions in Legal, such as canVectorize(), now take UseVPlanNativePath parameter. This is temporary and helps identify where convergence effort is needed. I thought about simply eliminating asserts in Legal that are based on "EnableVPlanNativePath", but Diego wanted to keep them. Alternative is to have EnableVPlanNativePath as a member data of Legal, but I prefer to pass it through call chain from the convergence work perspective.

hsaito added inline comments.Apr 27 2018, 12:39 PM
include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
238 ↗(On Diff #144383)

New parameter added.

325 ↗(On Diff #144383)

New parameter added.

329 ↗(On Diff #144383)

New parameter added.

lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
890 ↗(On Diff #144383)

Assert kept here.

998 ↗(On Diff #144383)

Assert kept here.

lib/Transforms/Vectorize/LoopVectorize.cpp
4922 ↗(On Diff #144383)

Assert Diego wants to keep.

5029 ↗(On Diff #144383)

Assert Diego wants to keep

Some minor comments inline. Since Renato already approved this patch I can give you LGTM after that.

I thought about simply eliminating asserts in Legal that are based on "EnableVPlanNativePath",

These asserts guarantee that code in the VPlan native path is not executed when such a path is not enabled. They helped me caught some issues when doing stress testing mode so I think it's a good idea to keep them.

Thanks,
Diego

include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
238 ↗(On Diff #144383)

Could you please document the flag here and in the other two places below?

lib/Transforms/Vectorize/LoopVectorize.cpp
7319 ↗(On Diff #144383)

I'm probably missing something but why this renaming is necessary?

hsaito added inline comments.Apr 27 2018, 9:17 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
7319 ↗(On Diff #144383)

You don't want original generic name (static file scope) in llvm namespace that's visible to all other parts of LLVM.

hsaito updated this revision to Diff 144434.Apr 27 2018, 9:44 PM

Added comments requested by Diego.

include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
238 ↗(On Diff #144383)

Done

dcaballe accepted this revision.Apr 28 2018, 9:38 AM

Thanks, Hideki. LGTM!

The only remaining question that I would have is if it would be better to keep LoopVectorizationLegality.h as a private header (i.e., in /lib/Transforms/Vectorize/) instead of as a public header (i.e., in /include/llvm/Transforms/Vectorize/) since currently LoopVectorizationLegality is not needed outside of LV. I think that public headers have some negative impact in the compile time of LLVM so we may want to avoid them when they are not strictly necessary.
However, I leave it up to you. I think this is more like a philosophical question so I would be OK with the current approach.

Please, wait a few days before committing it to see if there are more comments.

Thanks!
Diego.

Thanks, Hideki. LGTM!

The only remaining question that I would have is if it would be better to keep LoopVectorizationLegality.h as a private header (i.e., in /lib/Transforms/Vectorize/) instead of as a public header (i.e., in /include/llvm/Transforms/Vectorize/) since currently LoopVectorizationLegality is not needed outside of LV. I think that public headers have some negative impact in the compile time of LLVM so we may want to avoid them when they are not strictly necessary.
However, I leave it up to you. I think this is more like a philosophical question so I would be OK with the current approach.

Please, wait a few days before committing it to see if there are more comments.

Thanks!
Diego.

A public header has zero impact in compile time for the files that are not including it, for obvious reasons. I really would like other optimizers start thinking about whether vectorization is even possible or not and act accordingly ---- and for that I'd want to lower the bar for people wanting to do it. Moving the code from one place to another has a psychological barrier for many people. Remember, originally, I wanted to move this to Analysis, to make it easy to create an Analysis pass out of it if somebody wants that.

hsaito retitled this revision from [NFC][LV][LoopUtil] Move LoopVectorizationLegality to Analysis tree to [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.Apr 29 2018, 12:12 AM
hsaito updated this revision to Diff 144471.Apr 29 2018, 12:23 AM

rebased again

This revision was automatically updated to reflect the committed changes.

A public header has zero impact in compile time for the files that are not including it, for obvious reasons.

Yup, that's not the problem.

I really would like other optimizers start thinking about whether vectorization is even possible or not and act accordingly ---- and for that I'd want to lower the bar for people wanting to do it. Moving the code from one place to another has a psychological barrier for many people. Remember, originally, I wanted to move this to Analysis, to make it easy to create an Analysis pass out of it if somebody wants that.

It's not just psychological, it's organisational. A header that is inside a source directory is harder to include than one that is inside an include directory (needs ../.. on the path, or change the include path). That extra step is a giveaway for reviewers that you may be trying to rely on internal logic that should not escape its original domain.

In this case, LoopUtils and others were literally created *just* for the loop vectoriser. Any other pass wanting to use those libraries will have to consider how they work, and some work is bound to be needed on both loop vectoriser and new user.

That extra difficulty is intended as a barrier to avoid misunderstandings (and future code gen failures), so as to make *every* "use beyond original intention" explicit and thought through.

This is why we don't move any local header to includes without a clear user, a clear use case and refactoring with a clear purpose.

I agree with you that this functionality belongs to higher ground, but we need to know what the other users will be like before we can move it, to avoid accidental users.

cheers,
--renato

hsaito added a comment.EditedApr 29 2018, 7:05 PM

I really would like other optimizers start thinking about whether vectorization is even possible or not and act accordingly ---- and for that I'd want to lower the bar for people wanting to do it. Moving the code from one place to another has a psychological barrier for many people. Remember, originally, I wanted to move this to Analysis, to make it easy to create an Analysis pass out of it if somebody wants that.

It's not just psychological, it's organisational. A header that is inside a source directory is harder to include than one that is inside an include directory (needs ../.. on the path, or change the include path). That extra step is a giveaway for reviewers that you may be trying to rely on internal logic that should not escape its original domain.

In this case, LoopUtils and others were literally created *just* for the loop vectoriser. Any other pass wanting to use those libraries will have to consider how they work, and some work is bound to be needed on both loop vectoriser and new user.

That extra difficulty is intended as a barrier to avoid misunderstandings (and future code gen failures), so as to make *every* "use beyond original intention" explicit and thought through.

This is why we don't move any local header to includes without a clear user, a clear use case and refactoring with a clear purpose.

I agree with you that this functionality belongs to higher ground, but we need to know what the other users will be like before we can move it, to avoid accidental users.

In this case, the most immediate client for us is actually our internal fast-track VPLan vectorizer (it's outside of LV so that we don't mess up LV accidentally), but we'd like other optimizers aware of potential vectorization (or lack of it) to unleash more power (e.g., fusion, distribution, etc.). LoopVectorizationLegality::canVectorize() is a straightforward interface for any client to use for such purposes, as a starter. Seriously, LVL is a standalone stuff that we could create an Analysis pass --- pending Induction/RecurrenceDescriptors stuff it is dependent on). We can't move it to a higher ground without first open sourcing non-vectorizer client use?????

In this case, the most immediate client for us is actually our internal fast-track VPLan vectorizer (it's outside of LV so that we don't mess up LV accidentally),

Still inside Transform, so should be fine.

but we'd like other optimizers aware of potential vectorization (or lack of it) to unleash more power (e.g., fusion, distribution, etc.).

More specifically?

LoopVectorizationLegality::canVectorize() is a straightforward interface for any client to use for such purposes, as a starter.

I'm not disagreeing with you. :)

Seriously, LVL is a standalone stuff that we could create an Analysis pass --- pending Induction/RecurrenceDescriptors stuff it is dependent on).

The pending part that is the trouble. We'd have to extract them before a move, but we'd also have to have a reason other than "it looks like an analysis pass".

I agree it does, I agree it would be interesting to move it up, but I have no concrete example of what could use it on the current code, so I have no reason to move it up.

We can't move it to a higher ground without first open sourcing non-vectorizer client use?????

I'm not sure what you're referring to, but if you have downstream code that needs it, then yes, you would have to show it (in the form or an RFC and a review) to help us understand your use.

For now, it doesn't really matter where it is, so we don't move unnecessarily.

Hope this makes sense.

cheers,
--renato

In this case, the most immediate client for us is actually our internal fast-track VPLan vectorizer (it's outside of LV so that we don't mess up LV accidentally),

Still inside Transform, so should be fine.

That's why I didn't insist to move Legal to Analysis.

but we'd like other optimizers aware of potential vectorization (or lack of it) to unleash more power (e.g., fusion, distribution, etc.).

More specifically?

For example, in loop fusion scenarios, doing this is not a rocket science.

if (LVL_for_Loop1->canVectorize() == LVL_for_Loop2->canVectorize()) {

// fusion doesn't have to care much about what vectorizer would do downstream.

} else {

// we'd like to make sure fusion will gain more than vectorizing the loop that canVectorize() --- else don't fuse.

}

This is a chicken-egg problem. In my opinion, since vectorizer has so much performance impact, vectorizer's analysis aspect should be made available for others for "easier reuse". Next thing I'd like to move up to a higher ground is CostModel. I'll be sure to send in an RFC for that.

Thanks,
Hideki

For example, in loop fusion scenarios, doing this is not a rocket science.

Right, this is still loop transformation, so staying inside Transform still makes sense.

I'm not disagreeing with you on the topics:

  1. It does look like an analysis pass
  2. Other passes could profit

However, so far, the other passes are all in the transform directory.

I agree we need to split "what" can be done from "how we'll do it" and I agree we need to make LoopUtils more generic. I was just trying to be clear on why we shouldn't move to include/Analysis yet.

This is a chicken-egg problem. In my opinion, since vectorizer has so much performance impact, vectorizer's analysis aspect should be made available for others for "easier reuse".

Not yet, since we can already do that, while keeping the headers inside Transform.

We shouldn't think of the InnerLoopVectorizer as the "loop vectoriser". VPlan, SLP and other loop transformations are all part of the "loop transformations" which will all need to reuse the same analysis.

So, moving things our of the ILV is great and we should do it ASAP!

Moving them away from Transform will need a reason, of which we don't have one yet.

Next thing I'd like to move up to a higher ground is CostModel. I'll be sure to send in an RFC for that.

Great! We'll need to understand how that ties with the existing TTI hooks and TableGen's scheduling information.

For a long time I wanted to have a cost model with peephole analysis, or at least pattern-matching sequences, but the TTI hooks aren't too good for that and I never had the time to refactor it.

Sorry if I sounded terse recently, I was just trying to explain the reasons, not flog the dead horse. :)

Thanks!
--renato

For example, in loop fusion scenarios, doing this is not a rocket science.

Right, this is still loop transformation, so staying inside Transform still makes sense.

I'm not disagreeing with you on the topics:

  1. It does look like an analysis pass
  2. Other passes could profit

    However, so far, the other passes are all in the transform directory.

    I agree we need to split "what" can be done from "how we'll do it" and I agree we need to make LoopUtils more generic. I was just trying to be clear on why we shouldn't move to include/Analysis yet.

There seems like a disconnect. I fully understood the resistance for moving them to Analysis. I thought you are going against making it available from include/llvm/Transform/Vectorize (as opposed to lib/Transforms/Vectorize).
Pushing them to include/llvm/Transform/Vectorize is already a big improvement and I'm quite happy about that.

There seems like a disconnect.

There was. :)

Pushing them to include/llvm/Transform/Vectorize is already a big improvement and I'm quite happy about that.

I'm happy with that, too.

Sorry for the confusion.

Just to be sure, again, LGTM. :)

There seems like a disconnect.

There was. :)

Pushing them to include/llvm/Transform/Vectorize is already a big improvement and I'm quite happy about that.

I'm happy with that, too.

Sorry for the confusion.

Just to be sure, again, LGTM. :)

Thanks a lot.