This is an archive of the discontinued LLVM Phabricator instance.

BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts
ClosedPublic

Authored by grosser on Jan 18 2017, 11:56 PM.

Details

Summary

Before this change we created an additional reload in the copy of the incoming
block of a PHI node to reload the incoming value, even though the necessary
value has already been made available by the normally generated scalar loads.
In this change, we drop the code that generates this redundant reload and
instead just reuse the scalar value already available.

Besides making the generated code slightly cleaner, this change also makes sure
that scalar loads go through the normal logic, which means they can be remapped
(e.g. to array slots) and corresponding code is generated to load from the
remapped location. Without this change, the original scalar load at the
beginning of the non-affine region would have been remapped, but the redundant
scalar load would continue to load from the old PHI slot location.

We also document the current behavior a little bit better.

Diff Detail

Repository
rL LLVM

Event Timeline

grosser created this revision.Jan 18 2017, 11:56 PM
Meinersbur accepted this revision.Jan 19 2017, 5:29 AM

Did you forget to prefix the title with "[Polly]"?

include/polly/CodeGen/BlockGenerators.h
837–906 ↗(On Diff #84946)

Good description

lib/CodeGen/BlockGenerators.cpp
1444 ↗(On Diff #84946)

Any reason to not use getNewValue?

Could you consider adding an assertion verifying that BBCopy/PHI already exist in the maps?

With those two changes both cases of StmtR->contains(IncomingBB) would share some code that could be put before the condition, maybe the condition can be entirely removed.

Instead of the const-cast, we could remove the const-qualifier which doesn't serve any purpose anyway.

This revision is now accepted and ready to land.Jan 19 2017, 5:29 AM
This revision was automatically updated to reflect the committed changes.

Hi Michael,

thanks for the very useful review comments. I integrated most of them, but could not pull out the getNewValue easily. I still need to understand if/when I can pull out the insertionpoint update logic. I committed this for now, as this already seems to be a strict improvement. Will look into the insertion point logic at again.