This is an archive of the discontinued LLVM Phabricator instance.

[Polly][ScopBuilder] Introduce -polly-stmt-granularity=scalar-indep option.
ClosedPublic

Authored by Meinersbur on Sep 29 2017, 7:23 AM.

Details

Summary

The option splits BasicBlocks into minimal statements such that no additional scalar dependencies are introduced.

The algorithm is based on a union-find structure, and unites sets if putting them into separate statements would introduce a scalar dependencies. As a consequence, instructions may be split into separate statements such their relative order is different than the statements they are in. This is accounted for instructions whose relative order matters (e.g. memory accesses).

The algorithm is generic in that heuristic changes can be made with relative ease. We might relax the order requirement for read-reads or accesses to different base pointers. Forwardable instructions can be made to not cause a join.

This implementation gives us a speed-up of 82% in SPEC 2006 456.hmmer benchmark by allowing loop-distribution in a hot loop.

Diff Detail

Event Timeline

Meinersbur created this revision.Sep 29 2017, 7:23 AM
nandini12396 edited edge metadata.Sep 29 2017, 7:35 AM

@Meinersbur : A quick q. Even if the split introduced new scalar dependences, wouldnt forward op tree pass take care of it?

bollu added inline comments.Sep 29 2017, 8:00 AM
lib/Analysis/ScopBuilder.cpp
110

Nit: Could you add a one line explanation as to what the granularity refers to, please?

758

I don't understand, what is this loop trying to do? As best as I can tell, it checks that if we have seen Leader previously (Inserted == false), then we fuse Leader` with everything that was added after Leader.

Is this to ensure that in situations like this:

A
B
A

A and B are forced into the same basic block?

Either way, I'd appreciate it if it was written in a separate function or something, because it seems nice and modular.

766

Nit: could you please provide the concrete type used here? (Instruction *)

772

AFAICT, we don't need to call getLeaderValue on SeenLeaders.back(), because SeenLeaders only contains leaders.

787

Nit: please give this a concrete type (Instruction *)

796

Nit: could you factor this out into a separate function or a lambda? insertLeaderInstructions(...), or insertEquivalenceClassInstructions

@Meinersbur : A quick q. Even if the split introduced new scalar dependences, wouldnt forward op tree pass take care of it?

ForwardOpTree cannot forward everything, only 'speculatable' instructions and loads that are known to not have changed.

Even though this might cover a large number of cases, I tried to get something that conservatively doesn't do anything worse than it was before. More aggressive strategies (such as splitting each write access into its own statement) can be implemented as well.

grosser edited edge metadata.Sep 29 2017, 8:32 AM

Cool. Would be great to see the LICM working.

I mean see it working with LICM. Thanks Michael!

ForwardOpTree cannot forward everything, only 'speculatable' instructions and loads that are known to not have changed.

Ok. So, this option will, in future cover https://reviews.llvm.org/D37337#859617 ?

Meinersbur updated this revision to Diff 117659.Oct 4 2017, 5:26 AM
Meinersbur marked 3 inline comments as done.
  • Add unit tests
  • Refactor
  • Handle scalar dependencies due to incoming values
  • Add comments
Meinersbur retitled this revision from [Polly][ScopBuilder][WIP] Introduce -polly-stmt-granularity=scalar-indep option. to [Polly][ScopBuilder] Introduce -polly-stmt-granularity=scalar-indep option..Oct 4 2017, 5:26 AM
Meinersbur marked 2 inline comments as done.Oct 4 2017, 5:32 AM

ForwardOpTree cannot forward everything, only 'speculatable' instructions and loads that are known to not have changed.

Ok. So, this option will, in future cover https://reviews.llvm.org/D37337#859617 ?

Yes, although I think Tobias also had in mind what ForwardOpTree does today. Currently getScopStmt(Instruction*) allows returning only a single ScopStmt anyway, such that at this stage we cannot duplicate instructions into multiple statements. Although we could choose a 'primary' location for each instruction. There is definitely still room for improvement.

lib/Analysis/ScopBuilder.cpp
772

Former leaders may have merged into other leaders since it was added.

Hi Michael,

thanks for the update. Is the commit message up-to-date (AKA is the WIP still open or has it been resolved). Would be good to know before I start a more in-depth review!

Best,
Tobias

test/ScopInfo/granularity_scalar-indep.ll
5

interleaved

Meinersbur edited the summary of this revision. (Show Details)Oct 4 2017, 5:55 AM

Hi Michael,

thanks for the update. Is the commit message up-to-date (AKA is the WIP still open or has it been resolved). Would be good to know before I start a more in-depth review!

The issue has been resolved bt commits rr314662 to r314665. I removed the [WIP] tag, but forgot to update the summary.

bollu added inline comments.Oct 4 2017, 7:13 AM
lib/Analysis/ScopBuilder.cpp
683

Just to clarify: An Instruction I is ordered if there exists an instruction J such that I; J != J; I, correct?

711

Consider joinPHIWritesToEpilogue?

771

Consider a name such as joinAllAfterEpilogue? I feel that this name carries the intent a little better.

test/ScopInfo/granularity_scalar-indep_noepilogue.ll
3 ↗(On Diff #117659)

Nit: epilogie -> epilogue

Meinersbur added inline comments.Oct 4 2017, 4:08 PM
lib/Analysis/ScopBuilder.cpp
683

Not exactly, but it's the right idea.

Nit: Equality (!=) is not the right relation. If I;J is a tuple, then those are never equal. If it is a set, they are always equal. What you mean is some equivalence relation for semantics, which is difficult to define formally. I also don't intend to guarantee that if this function is true, that such an instruction J exists. Interactions with a third instruction K between I and J might also be possible, or depend on the code around. It is more that if it returns false and any permutation of non-ordered instructions have the same semantics in any context.

I'd stay with a verbal definition.

711

I call "epilogue" the last statement of a basic block where the PHI writes are added to. That is, PHI writes are always in the epilogue, whether this function is called or not.

More exact would be joinStatementsContainingDefinitionsOfIncomingValuesInSuccessorBlockWithEpilogue.

If PHI writes are added to the statement containing the definition instead of the writes, this function becomes useless. I hope to implement that sometime.

771

Good idea.

test/ScopInfo/granularity_scalar-indep.ll
5

thanks

test/ScopInfo/granularity_scalar-indep_noepilogue.ll
3 ↗(On Diff #117659)

thanks

bollu edited edge metadata.Oct 4 2017, 5:15 PM

Thanks for the very clean patch :) I really like reviewing your code, it's always really well written.

lib/Analysis/ScopBuilder.cpp
683

Thanks for the correction!

Consider adding the explanation you just gave to the comment, in that case?

"An unordered instruction is an instruction, such that a sequence of unordered instructions can be permuted without changing semantics. An ordered instruction is an instruction that is not unordered."

grosser accepted this revision.Oct 4 2017, 11:01 PM

Very cool. That looks good to me (after addressing Siddharth's comments).

lib/Analysis/ScopBuilder.cpp
681

its semantics

This revision is now accepted and ready to land.Oct 4 2017, 11:01 PM
Meinersbur updated this revision to Diff 117795.Oct 5 2017, 4:13 AM
Meinersbur marked 11 inline comments as done.
  • Address comments
  • Fix return instead of continue bug
This revision was automatically updated to reflect the committed changes.