This is an archive of the discontinued LLVM Phabricator instance.

[WIP] Polyhedral Value Analysis
Needs ReviewPublic

Authored by jdoerfert on Sep 25 2017, 1:32 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary
This is a _draft_ version for a polyhedral value analysis.

The goal of this analysis is a Scalar Evolution like experience with a
polyhedral-model-based backend. The analysis results are piece-wise
defined polyhedral expressions (PEXP). A PEXP is a loop-iteration
dependent, flow sensitive value description. See the class description
in PolyhedralValueInfo.h for more information and examples.

This patch includes isl, the integer set library. It is separated from
the analysis through the use of polyhedral values (PValue), a custom C++
abstraction layer. Once the interface is stable we can think about an
own implementation that does not depend on isl.

Diff Detail

Repository
rL LLVM

Event Timeline

jdoerfert created this revision.Sep 25 2017, 1:32 PM
bollu added a subscriber: bollu.Sep 25 2017, 3:36 PM

@jdoerfert: Thanks for putting this up. I'm trying to read and understand the ideas behind this. I've left a few questions, could you please clarify them for me?

include/llvm/Analysis/PolyhedralValueInfo.h
40

Should this have been i++? Otherwise, the explanation confuses me.

65

I want to clarify that I understand this representation correctly:

  1. in the j-loop case, because the j is external to the body of the loop S(i, j), j is a free parameter.
  1. in the i-loop case, j = base + niters where niters begins from 0? So, plugging in the actual values, we get j = i + l1, where l1 represents the number of iterations of j?
  1. Outside both the i and the j loop, the same logic of base + niters holds. But in this case, since neither i nor j is part of our "scope", they cannot be parameters anymore, and are this just input values to the map: l0 and l1.

Am I correct so far?

Now, a slightly more complex example:

for (i = 10; i < N; i++)
   for(j = 2 * i; j < M; j += 4)
      S(i, j)

I assume the breakdown would be:

  1. j-loop: [j] -> { [] -> [(j)] }
  2. i-loop: [i] -> { [l1] -> [(2 * i + 4 * l1)]
  3. max/none: [] -> { [l0, l1] -> [(2 * l0 + 4 * l1)]

I find the notation of induction-variables-outside-your-scope-is-parametric a little confusing, but I assume you have a rationale for this.

69

What does (fixed) value mean?

I forgot to include a softlink of the isl include folder. In the llvm src directory the following should be sufficent:

ln -s ../lib/Analysis/isl/include/isl include

I would have updated the revision but there is a problem with Phabricator that prevents me from doing so.

bollu added a comment.Sep 25 2017, 3:37 PM

@jdoerfert: Thanks for putting this up. I'm trying to read and understand the ideas behind this. I've left a few questions, could you please clarify them for me?

jdoerfert added inline comments.Sep 25 2017, 3:46 PM
include/llvm/Analysis/PolyhedralValueInfo.h
40

Yes, this is a typo. It should be i++.

65

Your breakdown looks (almost) good. After renaming of the actual result I get:

[j] -> { [] -> [(j)] }
[i] -> { [l0] -> [(2i + 4l0)] }
[]  -> { [l0, l1] -> [(20 + 2l0 + 4l1)] }

Which is the same for the first two cases but you forgot the initial value for i (here 10) which is multiplied by 2 for the start value of j.


I find the notation of induction-variables-outside-your-scope-is-parametric a little confusing, but I assume you have a rationale for this.

Notations are not perfect yet and I appreciate better ones.

69

Since I reuse the affine functions for domains I need a "value" but domains don't have a "value". So it will be the fixed value (1).

Hi Johannes,

thanks for posting the patch. Obviously it misses comments, and unittests for the C++ interface, but it already gives a very good idea where you are heading. I also feel this patch covers a couple of areas:

  1. A new C++ interface which seems to serve multiple purposes
    • Automatic memory management
    • Add some additional math functionality not available in isl at the moment
    • Rename a couple of functions to provide a better interface to LLVM
    • Add a couple of helpers / convenience features on top of isl (e.g. automatic space adjustment)
    • Add some functionality that is directly needed for the domain/range modeling, but not necessarily belongs to the core properties of an isl set.

All of these ideas make a lot of sense. I am however slightly worried regarding the large amount of hand-written interfacing code -- which is often replicated across different classes. How much of isl are you planning to cover?
Will you also introduce callbacks and stuff? I wonder if there would be a way to avoid a lot of this duplication and possibly make the separation between the parts that are just isl interfaces and the parts that are indeed LLVM additions more clear. We also have a larger set of isl extensions in Polly -- including corresponding unit tests. I think sharing these would indeed make a lot of sense.

  1. The polyhedral value analysis

The general interface makes a lot of sense - I played with this myself (after having introduced islpp), but waited for your proposal to come out. It is a clear extension of the LLVM-IR modeling we discussed together in Polly and you nicely mixed it with the ScalarEvolution design of LLVM. Cool! The code is currently a little too large to comment on specific items. Some of the things you do are already things we did in Polly but unfortunately missing most comments, others are clearly new and I need to read through them in detail. As I said on the mailing list thread, I feel that this analysis generally makes a lot of sense -- assuming we agree -- I would like to help getting it into LLVM. If you would like to do this incrementally -- the RISC-V model would make a lot of sense here. Do you believe there is a way to split this into smaller pieces? Maybe this can be split into:

  • Basic Interface
  • Analysis of scalar values without control flow
  • Simple loops
  • Wrapping

I am especially interested in understanding (and having documented well) how integer wrapping is modeled.

  • Specifically, I would like this analysis to speculate the wrapping introduced by casts as they commonly arise e.g. in OpenCL code get_global_id / get_local_id. Not sure if this is currently possibly / optional.
  • Also, did you consider multi-dimensional output spaces. I discussed with Siddharth and we might be able to model certain polynomial expressions this way. Most interestingly, we might be able to do delinearization on this model, which would allow us to effectively delinearize even in the presence of integer casts.
  • Can you make wrapping optional -- meaning can you potentially model precise wrapping?

Also, do you happen to have a design description why all this works? If not, I wonder if we should create this together when building up the code incrementally in LLVM.

I also have a random set of comments, but these are clearly by far not complete but just passby comments.

include/llvm/Analysis/PValue.h
10

generation

11

backend

19

therefore

54

Is this needed?

include/llvm/Analysis/PolyhedralValueInfo.h
160

Why do you need a special InvalidDomain / KnownDomain? isl provides piecewise affs that can map to nans. Can this information not be directly modeled as part of the pwaff?

lib/Analysis/PValue.cpp
72

I feel this would be a lot cleaner if written with the isl++ bindings.

147

Again, islpp would make this so much clearer.

234

We have the very same functions in Polly. I feel that this is code-duplication that we should certainly find a solution to avoid.

259

Leftover comments?

829

isl++ would allow you to use proper lambdas.

914

Interesting. Are you planning to do this dimension adjustment on all binary operations?

If we would model loop dimensions also as parameters this adjustment would happen automatically. Not sure if this makes things easier though.

1046

Leftover?!

1067

This function would look so much nicer with islpp.

lib/Analysis/PolyhedralValueInfo.cpp
449

As isComplex always returns "false", I assume this is all dead code and likely will explode in some cases.

523

This function certainly needs a comment. ;-)

At best it is split in smaller pieces! ;-)

1079

I think LLVM suggest to not use auto, except in obvious cases.

Hi Johannes,

thanks for posting the patch. Obviously it misses comments, and unittests for the C++ interface, but it already gives a very good idea where you are heading. I also feel this patch covers a couple of areas:

  1. A new C++ interface which seems to serve multiple purposes
    • Automatic memory management
    • Add some additional math functionality not available in isl at the moment
    • Rename a couple of functions to provide a better interface to LLVM
    • Add a couple of helpers / convenience features on top of isl (e.g. automatic space adjustment)
    • Add some functionality that is directly needed for the domain/range modeling, but not necessarily belongs to the core properties of an isl set.

All of these ideas make a lot of sense. I am however slightly worried regarding the large amount of hand-written interfacing code -- which is often replicated across different classes. How much of isl are you planning to cover?
Will you also introduce callbacks and stuff? I wonder if there would be a way to avoid a lot of this duplication and possibly make the separation between the parts that are just isl interfaces and the parts that are indeed LLVM additions more clear. We also have a larger set of isl extensions in Polly -- including corresponding unit tests. I think sharing these would indeed make a lot of sense.

  1. The polyhedral value analysis

The general interface makes a lot of sense - I played with this myself (after having introduced islpp), but waited for your proposal to come out. It is a clear extension of the LLVM-IR modeling we discussed together in Polly and you nicely mixed it with the ScalarEvolution design of LLVM. Cool! The code is currently a little too large to comment on specific items. Some of the things you do are already things we did in Polly but unfortunately missing most comments, others are clearly new and I need to read through them in detail. As I said on the mailing list thread, I feel that this analysis generally makes a lot of sense -- assuming we agree -- I would like to help getting it into LLVM. If you would like to do this incrementally -- the RISC-V model would make a lot of sense here. Do you believe there is a way to split this into smaller pieces? Maybe this can be split into:

  • Basic Interface
  • Analysis of scalar values without control flow
  • Simple loops
  • Wrapping

I am especially interested in understanding (and having documented well) how integer wrapping is modeled.

  • Specifically, I would like this analysis to speculate the wrapping introduced by casts as they commonly arise e.g. in OpenCL code get_global_id / get_local_id. Not sure if this is currently possibly / optional.
  • Also, did you consider multi-dimensional output spaces. I discussed with Siddharth and we might be able to model certain polynomial expressions this way. Most interestingly, we might be able to do delinearization on this model, which would allow us to effectively delinearize even in the presence of integer casts.
  • Can you make wrapping optional -- meaning can you potentially model precise wrapping?

Also, do you happen to have a design description why all this works? If not, I wonder if we should create this together when building up the code incrementally in LLVM.

I also have a random set of comments, but these are clearly by far not complete but just passby comments.

These are all very good but also very detailed questions. I would like to postpone them till this patch is in a form that should be reviewed. Right now I'd like to focus on the general idea, query interface and early results. Specific design choices (such as optional wrapping) can be discussed and added on top of a first implementation.

Just as a remark to the multi-dimentional spaces: Creating a multi-dimensional memory view (aka. "deliniarization") is already happening (for simple cases) in the memory analysis (which I did not yet publish). Conceptually that should be the place for such an "analysis". If we would allow multi-dimensional output spaces then to represent vector values which are actually multi-dimensional values.

jdoerfert added inline comments.Sep 26 2017, 2:50 AM
include/llvm/Analysis/PolyhedralValueInfo.h
160

No, because then the information is gone.

Take:

{ [i0] -> [(i0 + 1)] }

with an invalid domain:

{ [i0] : i0 < 5 }

If we can later proof i0 >= 5 we can use the PEXP without restriction. However, if we would have mapped all i0 < 5 to nan we would have lost their original mapping for good. (Also it would make the PVAff more complex.)

lib/Analysis/PValue.cpp
72

When I wrote/copied it there were no isl++ bindings.

234

While I didn't know that I am aware of other code snippets I directly took from Polly, thus which are now duplicated. Though we're talking about 5-15 lines of code not more at a time.

914

Binary operations are all done via combinator functions which automatically adjust dimensions, yes.

lib/Analysis/PolyhedralValueInfo.cpp
449

I'll just have to implement PVSet::isComplex similar to PVAff::isComplex.

Hi Johannes,

thanks for posting the patch. Obviously it misses comments, and unittests for the C++ interface, but it already gives a very good idea where you are heading. I also feel this patch covers a couple of areas:

  1. A new C++ interface which seems to serve multiple purposes
    • Automatic memory management
    • Add some additional math functionality not available in isl at the moment
    • Rename a couple of functions to provide a better interface to LLVM
    • Add a couple of helpers / convenience features on top of isl (e.g. automatic space adjustment)
    • Add some functionality that is directly needed for the domain/range modeling, but not necessarily belongs to the core properties of an isl set.

All of these ideas make a lot of sense. I am however slightly worried regarding the large amount of hand-written interfacing code -- which is often replicated across different classes. How much of isl are you planning to cover?
Will you also introduce callbacks and stuff? I wonder if there would be a way to avoid a lot of this duplication and possibly make the separation between the parts that are just isl interfaces and the parts that are indeed LLVM additions more clear. We also have a larger set of isl extensions in Polly -- including corresponding unit tests. I think sharing these would indeed make a lot of sense.

  1. The polyhedral value analysis

The general interface makes a lot of sense - I played with this myself (after having introduced islpp), but waited for your proposal to come out. It is a clear extension of the LLVM-IR modeling we discussed together in Polly and you nicely mixed it with the ScalarEvolution design of LLVM. Cool! The code is currently a little too large to comment on specific items. Some of the things you do are already things we did in Polly but unfortunately missing most comments, others are clearly new and I need to read through them in detail. As I said on the mailing list thread, I feel that this analysis generally makes a lot of sense -- assuming we agree -- I would like to help getting it into LLVM. If you would like to do this incrementally -- the RISC-V model would make a lot of sense here. Do you believe there is a way to split this into smaller pieces? Maybe this can be split into:

  • Basic Interface
  • Analysis of scalar values without control flow
  • Simple loops
  • Wrapping

I am especially interested in understanding (and having documented well) how integer wrapping is modeled.

  • Specifically, I would like this analysis to speculate the wrapping introduced by casts as they commonly arise e.g. in OpenCL code get_global_id / get_local_id. Not sure if this is currently possibly / optional.
  • Also, did you consider multi-dimensional output spaces. I discussed with Siddharth and we might be able to model certain polynomial expressions this way. Most interestingly, we might be able to do delinearization on this model, which would allow us to effectively delinearize even in the presence of integer casts.
  • Can you make wrapping optional -- meaning can you potentially model precise wrapping?

Also, do you happen to have a design description why all this works? If not, I wonder if we should create this together when building up the code incrementally in LLVM.

I also have a random set of comments, but these are clearly by far not complete but just passby comments.

These are all very good but also very detailed questions. I would like to postpone them till this patch is in a form that should be reviewed. Right now I'd like to focus on the general idea, query interface and early results. Specific design choices (such as optional wrapping) can be discussed and added on top of a first implementation.

Just as a remark to the multi-dimentional spaces: Creating a multi-dimensional memory view (aka. "deliniarization") is already happening (for simple cases) in the memory analysis (which I did not yet publish). Conceptually that should be the place for such an "analysis". If we would allow multi-dimensional output spaces then to represent vector values which are actually multi-dimensional values.

Fair point! Well taken.

Best,
Tobias

fhahn added a subscriber: fhahn.Sep 26 2017, 7:02 AM

I think it would be easier to review if you would create a separate patch that adds isl and then have a patch with just the files for PolyhedralValueInfo & co.

etherzhhb added inline comments.Oct 7 2017, 2:19 PM
include/llvm/Analysis/PValue.h
42–46

Why PVBase need to be friend with these class?

maybe we could make getIslCtx and getSpace as protected function instead?