This is an archive of the discontinued LLVM Phabricator instance.

[PartialInlining] Enhance code outliner to sink locals declared outside the outline region
ClosedPublic

Authored by davidxl on Jun 2 2017, 12:11 PM.

Details

Summary

This is a follow up to the previous patch that shrinkwraps allocas of locals that are declared inside the outline region.

With this patch, the machinery is introduced to handle more general case when life time markers are outside the region. This is the first step which is not yet hooked up with AA yet.

Diff Detail

Event Timeline

davidxl created this revision.Jun 2 2017, 12:11 PM
davidxl retitled this revision from [PartialInlining] Enhance partial inliner to sink wraps locals declared outside the outline region to [PartialInlining] Enhance code outliner to sink locals declared outside the outline region.
davidxl updated this revision to Diff 101283.Jun 2 2017, 3:06 PM

Fix bugs in tests.

wmi added inline comments.Jun 5 2017, 8:36 AM
lib/Transforms/Utils/CodeExtractor.cpp
166

Why we choose to skip BitCast, GetElementPtr, ..., Alloca, but treat other instructions like BinaryOp instruction conservatively. Why not just check memory access instructions like load/store/call and treat them conservatively, but skip all the other instructions?

187–197

Can store have similar logic as load here?

292–300

I think it is possible one alloca has multiple pairs of lifetime_start and lifetime_end. Like loop unrolling may create such case. It is also possible for one lifetime_start to have multiple lifetime_ends. But I guess the most common case in inlined blocks is one pair of lifetime_start and lifetime_end, so we may just add some check to avoid the other uncommon cases.

337–348

If AI have multiple bitcasts, we will only sink one of them and may cause problem?

davidxl added inline comments.Jun 5 2017, 11:09 AM
lib/Transforms/Utils/CodeExtractor.cpp
166

Done.

187–197

Done.

292–300

Added check to fail when this is detected.

337–348

I think it is already handled. If there are more than one bitcast uses outside the region, the getter will return null.

davidxl updated this revision to Diff 101430.Jun 5 2017, 11:09 AM

Addressed Wei's feedback.

wmi edited edge metadata.Jun 7 2017, 9:45 AM

One comment about simplifying the test. Other than that, LGTM.

test/Transforms/CodeExtractor/live_shrink.ll
19–34

The target of the test is used to check whether alloca can be shrinked properly. Can we use -skip-partial-inlining-cost-analysis and simplify the test by keeping only one call @_ZN1A7memfuncEv?

davidxl updated this revision to Diff 101789.Jun 7 2017, 11:55 AM

Simplify test cases.

davide added inline comments.Jun 7 2017, 12:12 PM
lib/Transforms/Utils/CodeExtractor.cpp
147–162

I think this is a case where two nested lambda expressions make the code hard to understand. I'd convert them to a for loop.

363

You can probably move this inside the NDEBUG block.

davide added inline comments.Jun 7 2017, 12:15 PM
include/llvm/Transforms/Utils/CodeExtractor.h
110

s/sinked/sunk/

lib/Transforms/Utils/CodeExtractor.cpp
289–290

Can you add a comment explaining why you skip ordered load/stores?

davidxl updated this revision to Diff 101802.Jun 7 2017, 1:17 PM
davidxl marked 3 inline comments as done.

Addressed Davide's comments.

Any more comments?

lib/Transforms/Utils/CodeExtractor.cpp
289–290

not necessarily needed. Removed.

davide edited edge metadata.Jun 10 2017, 8:54 PM

No more comments from me.

This revision was automatically updated to reflect the committed changes.