This is an archive of the discontinued LLVM Phabricator instance.

[Polly] Never add read accesses for synthesizable values
ClosedPublic

Authored by Meinersbur on Dec 21 2015, 5:39 AM.

Details

Summary

Before adding a MK_Value READ MemoryAccess, check whether the read is necessary or synthesizable. Synthesizable values are later generated by the SCEVExpander and therefore do not need to be transferred explicitly. This patch is also a requirement for the "cause inversion" extracted from D13762.

Diff Detail

Repository
rL LLVM

Event Timeline

Meinersbur updated this revision to Diff 43361.Dec 21 2015, 5:39 AM
Meinersbur retitled this revision from to [Polly] Add conditions for unnecessary value reads.
Meinersbur updated this object.
Meinersbur added reviewers: grosser, jdoerfert.
Meinersbur added a project: Restricted Project.
Meinersbur added subscribers: llvm-commits, pollydev.
grosser edited edge metadata.Dec 21 2015, 9:11 AM

Hi Michael,

some of the patches on the way do not apply cleanly. To get a better feeling of the test, I would like to apply it and play with it. For this I wait until the earlier patches are committed.

Here already some comments:

lib/Analysis/ScopInfo.cpp
3980 ↗(On Diff #43361)

There can be accesses to constants, in case the constant is an externally defined _constant_ global. Such constants are really just globals from which we load values.
To handle these you want to use ConstantInt, ConstantFP.

Also, I wonder which test case requires the isa<BasicBlock>(Value).

Also, I still need to understand if all pieces of the code are actually needed (and tested) at the moment. My feeling is that some of these simplifications may currently be never triggered as they are already covered by the code that inserts the memory accesses. (For this I wait until I can actually run the code).

Meinersbur added inline comments.Dec 22 2015, 2:42 PM
lib/Analysis/ScopInfo.cpp
3980 ↗(On Diff #43361)

Also, I wonder which test case requires the isa<BasicBlock>(Value).

Any branching instruction "uses" a BasicBlock. It iterates over the terminators because conditional branches uses predicates (of type i1) that might be defined in a different ScopStmt.

Meinersbur updated this revision to Diff 45983.Jan 26 2016, 5:45 AM
Meinersbur edited edge metadata.

Rebase; adressing Tobias comments

It is actually this patch blocked by D16522, not the others. Sorry for the confusion.

Hi Michael,

in this patch only the isSynthesisable change seems to actually affect the output, everything else appears to not yet affect anything.
My feeling is that some of the statements added here will be removed in your later patch at a different location. I would prefer to keep them in the same patch to better see that they are just moved around.

Regarding the isSynthesable change itself. It is touching an area that is a little fishy. Specifically, it will disable (all?) integer read-only memory access modeling as most (all?)l read-only memory accesses are synthesis-able. I need to think about this a little bit to understand if/what that means, to which extend we do this today and how we can correctly model read-only memory accesses then.

Is the isSynthesisable part of this patch blocking subsequent patches?

Best,
Tobias

Hi Michael,

in this patch only the isSynthesisable change seems to actually affect the output, everything else appears to not yet affect anything.
My feeling is that some of the statements added here will be removed in your later patch at a different location. I would prefer to keep them in the same patch to better see that they are just moved around.

As a reminder, I extracted out this patch out of D15706 on your request. Nothing will be removed again, it is even here because of the later patch.

Regarding the isSynthesable change itself. It is touching an area that is a little fishy.

This and D15706 are intended to reduce the fishyness. This is a good thing, isn't it?

Specifically, it will disable (all?) integer read-only memory access modeling as most (all?)l read-only memory accesses are synthesis-able. I need to think about this a little bit to understand if/what that means, to which extend we do this today and how we can correctly model read-only memory accesses then.

The idea is simple: Everything that is synthesizable (depending only on SCoP parameters and induction variables) can be inserted directly into the code and therefore does not require reading anything. You are right, since integer definitions defined before the scop are added as parameter, there is no reason to model them as accesses. Why should it? Why model those as parameter _and_ pass them using allocas? This is to my understanding the intention from the beginning. If not, please tell me the rules when read-only accesses are needed.

This should also exactly be what's required for OpenMP kernels. The SCoP parameters need to be passed to the kernel in any case because there might be (other) synthesizable SCEVs/isl_pw_aff in there that depend on these params. Passing a parameter a second time via a kernel function argument is redundant.

Is the isSynthesisable part of this patch blocking subsequent patches?

Yes, blocks D15706 and D12975; I am unable to understand the idea of read-only accesses in the current implementation. There seems to be no consistency. Eg. there is a difference whether it is used by an MK_Value, MK_PHI, or MK_PHI incoming value that is defined in a different statement. D15706 and D12975 modify the creation of MemoryAccesses and I cannot change it accurately if I can't even tell whether it is a bug or by design.

Hi Michael,

in this patch only the isSynthesisable change seems to actually affect the output, everything else appears to not yet affect anything.
My feeling is that some of the statements added here will be removed in your later patch at a different location. I would prefer to keep them in the same patch to better see that they are just moved around.

As a reminder, I extracted out this patch out of D15706 on your request. Nothing will be removed again, it is even here because of the later patch.

Sure. I remember that I asked for it. We just need to make sure the patches are split in a way that everything is tested, otherwise it is difficult to distinguish between dead code, code that is used and tested later and code for which we do not add more test coverage.

In this very patch it seems only isSynthesisable is tested. All other additions can be removed without any test failures. My feeling is that they current code will never call ensureValueRead with arguments that trigger the code that can be removed. If this is indeed the case, these changes (with the exception of isSynthesisable) can not be split off from the next patch. In case they can be reached even today, it would be good to add appropriate test cases.

Regarding the isSynthesable change itself. It is touching an area that is a little fishy.

This and D15706 are intended to reduce the fishyness. This is a good thing, isn't it?

Specifically, it will disable (all?) integer read-only memory access modeling as most (all?)l read-only memory accesses are synthesis-able. I need to think about this a little bit to understand if/what that means, to which extend we do this today and how we can correctly model read-only memory accesses then.

The idea is simple: Everything that is synthesizable (depending only on SCoP parameters and induction variables) can be inserted directly into the code and therefore does not require reading anything. You are right, since integer definitions defined before the scop are added as parameter, there is no reason to model them as accesses. Why should it? Why model those as parameter _and_ pass them using allocas? This is to my understanding the intention from the beginning. If not, please tell me the rules when read-only accesses are needed.

This should also exactly be what's required for OpenMP kernels. The SCoP parameters need to be passed to the kernel in any case because there might be (other) synthesizable SCEVs/isl_pw_aff in there that depend on these params. Passing a parameter a second time via a kernel function argument is redundant.

Just to explain the fishiness a little bit. The issue is that we only introduce parameters for SCEVs that are part of loop bounds or array indexes, but not for non-affine index expressions or scalar dependences that are e.g. just compute a value that is stored to memory. We do not introduce parameters for two reasons: First, we do not need them to model iteration domains, so not adding them keeps the dimensionality of the integer sets low. Second, we can not identify parameters the way we do this for affine expressions as some of the scalar evolution expressions used outside of affine index expressions and loop bounds are non-affine. This means not all the information is modeled/carried by our set of parameters, which is why e.g. the OpenMP code generation is required to scan the SCEVs of a ScopStmt for additional values that need to be transferred (See: getReferencesInSubtree).

Now, I think we should probably model some of these "hidden" scalar dependences more explicitly in the ScopStmt, but I agree that this probably nothing we should block this patch on. In fact, the immediate issue is already taken care of as the OpenMP code generation already scan's the SCEVs again.

It just took me one day to reason about this issue. I think the isSynthesisable part of this patch is fine.

Is the isSynthesisable part of this patch blocking subsequent patches?

Yes, blocks D15706 and D12975; I am unable to understand the idea of read-only accesses in the current implementation. There seems to be no consistency. Eg. there is a difference whether it is used by an MK_Value, MK_PHI, or MK_PHI incoming value that is defined in a different statement. D15706 and D12975 modify the creation of MemoryAccesses and I cannot change it accurately if I can't even tell whether it is a bug or by design.

I hoped to have explained the issue a little bit higher up. I don't think your patch can/will fix the issue I just explained, but we should probably sit down and reason about this in a separate patch. Within the model we use today, your patch is clearly an improvement and it does not regress any existing code.

I have two last requests/questions:

  • As written earlier, I think we should make this just about the isSynthesisable()
  • I wonder why we did not catch this earlier. We already check canSynthesize() in both buildPHIAccesses and in buildScalarDependences. Still, it seems we add unnecessary reads. Did you understand where these still sneaked through?

Best,
Tobias

Just to explain the fishiness a little bit. The issue is that we only introduce parameters for SCEVs that are part of loop bounds or array indexes, but not for non-affine index expressions or scalar dependences that are e.g. just compute a value that is stored to memory. We do not introduce parameters for two reasons: First, we do not need them to model iteration domains, so not adding them keeps the dimensionality of the integer sets low. Second, we can not identify parameters the way we do this for affine expressions as some of the scalar evolution expressions used outside of affine index expressions and loop bounds are non-affine. This means not all the information is modeled/carried by our set of parameters, which is why e.g. the OpenMP code generation is required to scan the SCEVs of a ScopStmt for additional values that need to be transferred (See: getReferencesInSubtree).

Now, I think we should probably model some of these "hidden" scalar dependences more explicitly in the ScopStmt, but I agree that this probably nothing we should block this patch on. In fact, the immediate issue is already taken care of as the OpenMP code generation already scan's the SCEVs again.

If this is the rationale, we should modify canSynthesize() to return false for values that are not in the SCoP's params.

Note that synthesizing is useful to remove scalar dependences:

void func(i32 %x)

Stmt1:
  %1 = add i32 %x, 5

Stmt2:
  use(%1)

canSynthesize() returning true causes the add to be %1 to disappear, because it will be synthesized into Stmt2.

  • As written earlier, I think we should make this just about the isSynthesisable()

OK, I will update this patch.

  • I wonder why we did not catch this earlier. We already check canSynthesize() in both buildPHIAccesses and in buildScalarDependences. Still, it seems we add unnecessary reads. Did you understand where these still sneaked through?

Because there is no check for canSynthesize() in buildPHIAccesses() when handling the case where the incoming value is not defined within the same Stmt.

Meinersbur added a comment.

In http://reviews.llvm.org/D15687#337230, @grosser wrote:

Just to explain the fishiness a little bit. The issue is that we only introduce parameters for SCEVs that are part of loop bounds or array indexes, but not for non-affine index expressions or scalar dependences that are e.g. just compute a value that is stored to memory. We do not introduce parameters for two reasons: First, we do not need them to model iteration domains, so not adding them keeps the dimensionality of the integer sets low. Second, we can not identify parameters the way we do this for affine expressions as some of the scalar evolution expressions used outside of affine index expressions and loop bounds are non-affine. This means not all the information is modeled/carried by our set of parameters, which is why e.g. the OpenMP code generation is required to scan the SCEVs of a ScopStmt for additional values that need to be transferred (See: getReferencesInSubtree).

Now, I think we should probably model some of these "hidden" scalar dependences more explicitly in the ScopStmt, but I agree that this probably nothing we should block this patch on. In fact, the immediate issue is already taken care of as the OpenMP code generation already scan's the SCEVs again.

If this is the rationale, we should modify canSynthesize() to return false for values that are not in the SCoP's params.

Note that synthesizing is useful to remove scalar dependences:

void func(i32 %x)

Stmt1:
  %1 = add i32 %x, 5

Stmt2:
  use(%1)

canSynthesize() returning true causes the add to be %1 to disappear, because it will be synthesized into Stmt2.

Right. That's one of the reasons it exists. So no, I do not think we
want it to return false either, but probably want to find a way to keep
track of these additional scalar references while still being able to
look through SCEVs.

Anyhow, let's just keep this in mind.

  • As written earlier, I think we should make this just about the isSynthesisable()

OK, I will update this patch.

  • I wonder why we did not catch this earlier. We already check canSynthesize() in both buildPHIAccesses and in buildScalarDependences. Still, it seems we add unnecessary reads. Did you understand where these still sneaked through?

Because there is no check for canSynthesize() in buildPHIAccesses() when handling the case where the incoming value is not defined within the same Stmt.

Perfect. I think this info will be great in the commit message. In fact,
it would probably be more educational to add the canSynthesize precisely
at this spot. Then your next patch will show nicely that multiple
canSynthesize() calls will be replaced by just one.

Best,
Tobias

Meinersbur updated this revision to Diff 46141.Jan 27 2016, 8:20 AM
Meinersbur retitled this revision from [Polly] Add conditions for unnecessary value reads to [Polly] Never add read accesses for synthesizable values.
Meinersbur updated this object.

remove all conditions except canSynthesize

grosser accepted this revision.Jan 27 2016, 8:23 AM
grosser edited edge metadata.
grosser added inline comments.
test/Isl/CodeGen/synthesizable_phi_write_after_loop.ll
11 ↗(On Diff #46141)

typo: the the

This revision is now accepted and ready to land.Jan 27 2016, 8:23 AM
This revision was automatically updated to reflect the committed changes.