This is an archive of the discontinued LLVM Phabricator instance.

[LIR] move code to support for structs and hand unrolled loops
ClosedPublic

Authored by haicheng on Dec 20 2015, 6:31 PM.

Details

Summary

Some code is moved to let LIR support structs and hand unrolled loops in the future.

isConsecutiveAccess() is moved from SLP vectorizer to valuetracking so that lir can use in the future

processLoopStoreOfLoopLoad() is hoisted under processLoopBlock()

Many checks are moved to isLegalStore()

Diff Detail

Repository
rL LLVM

Event Timeline

haicheng updated this revision to Diff 43344.Dec 20 2015, 6:31 PM
haicheng retitled this revision from to [LIR] move code to support for structs and hand unrolled loops .
haicheng updated this object.
haicheng added reviewers: mssimpso, gberry, mcrosier.
haicheng set the repository for this revision to rL LLVM.
haicheng updated this revision to Diff 43379.Dec 21 2015, 8:59 AM

remove isConsecutiveAccess() from this patch

mcrosier edited edge metadata.Dec 21 2015, 9:40 AM

Did you run this on our test suite to verify there is no function change?

lib/Transforms/Scalar/LoopIdiomRecognize.cpp
307–308

forMemset -> ForMemset

308

forMemcpy -> ForMemcpy

333

I would probably just drop this extra check (i.e., if (HasMemset || HasMemsetPattern) {) as these checks are already built into the next bit of logic. I also assume the common case is that these builtins are typically available.

367

I would add a comment here that indicates we're now looking for a store that can be turned into a memcpy. The HasMemcpy check implies this, but I find it best to be verbose.

376

Shouldn't this be returning false (i.e., this is not a legal store for memset/memcpy optimizations)?

383

Shouldn't this be returning false (i.e., this is not a legal store for memset/memcpy optimizations)?

393

Shouldn't this be returning false (i.e., this is not a legal store for memset/memcpy optimizations)?

399

Shouldn't this be returning false (i.e., this is not a legal store for memset/memcpy optimizations)?

415

These two variables should be capitalized.

422–425

You can remove the extra curly braces. I.e.,

if (ForMemset)

StoreRefsForMemset.push_back(SI);

else if (ForMemcpy)

StoreRefsForMemcpy.push_back(SI);
474

This function is still called processLoopStores; please remove the extra 's'

690–691

At this point you've already verified the value operand of the store is a non-volatile load. Therefore, you can replace this dyn_cast<> with a cast<>. You can probably drop the check of LI in the assert, but I'd keep the isSimple check.

697–698

At this point you've already verified the Load pointer operand is an AddRecExpr. Therefore, you can replace this dyn_cast<> with a cast<>. You can also drop the assert on the next line because we'll already throw an assert if we cast a null value.

Did you run this on our test suite to verify there is no function change?

There is actually a small functional change. Now, for every basic block, I first generate all the memsets, then all the memcpys because they are in two different loops. Before, they may mix since they are processed in the same loop.

I compared the binaries of all spec2000/2006. This is the only difference I found and only occurred once in spec2006/gcc.

Did you run this on our test suite to verify there is no function change?

There is actually a small functional change. Now, for every basic block, I first generate all the memsets, then all the memcpys because they are in two different loops. Before, they may mix since they are processed in the same loop.

I compared the binaries of all spec2000/2006. This is the only difference I found and only occurred once in spec2006/gcc.

Can you be more specific about the difference?

haicheng updated this revision to Diff 43389.Dec 21 2015, 1:02 PM
haicheng edited edge metadata.
haicheng marked 13 inline comments as done.

Address chad's comments

mcrosier added inline comments.Dec 21 2015, 1:28 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
133

Why is the PatternValue logic being hoisted out into the body of processLoopStore? Why not sink it back to avoid passing a nullptr when optimizing series of memset instructions (MSI)?

I guess the bigger question is why isn't the PatternValue used in the context of optimizing MSIs? Can you not combine two memset_pattern16 memsets?

332–403

I think we dropped part of the original comment.

// If the stored value is a byte-wise value (like i32 -1), then it may be
400

Perhaps add an assert here like:

assert((ForMemset || ForMemcpy) && "Unexpected memset/memcpy store type.");

haicheng updated this revision to Diff 43407.Dec 21 2015, 3:20 PM
haicheng marked 3 inline comments as done.

Address Chad's comments

mcrosier accepted this revision.Dec 23 2015, 7:17 AM
mcrosier edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Dec 23 2015, 7:17 AM
mcrosier closed this revision.Dec 23 2015, 10:05 AM

Committed in r256336.