This is an archive of the discontinued LLVM Phabricator instance.

[loop idiom Recognition] support memcpy for multiple consecutive loads and stores
AcceptedPublic

Authored by DIVYA on Jun 23 2017, 12:20 PM.

Details

Summary

This pass converts multiple consecutive loads and stores inside a loop to memcpy operation.
For example, the stores and loads in the below code can be converted to a memcpy .

struct foo
{

int a;
int b;

} f,g;

for(int i=0;i<n;i++)
{

f[i].a  =  g[i].a;
f[i].b  =  g[i].b;

}

Worked in collaboration with Sebastian Pop and Aditya Kumar.

Diff Detail

Event Timeline

DIVYA created this revision.Jun 23 2017, 12:20 PM
DIVYA edited the summary of this revision. (Show Details)Jun 23 2017, 12:23 PM
DIVYA updated this revision to Diff 104483.Jun 28 2017, 11:36 AM
mgrang added a subscriber: mgrang.Jun 28 2017, 11:49 AM
mgrang added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
161 ↗(On Diff #104483)

Extra newline.

371 ↗(On Diff #104483)

Extra newline.

458 ↗(On Diff #104483)

Ditto.

497 ↗(On Diff #104483)

Ditto.

919 ↗(On Diff #104483)

Use i != e instead of i < e.

951 ↗(On Diff #104483)

This loop looks a bit hacky. Can this be re-written in a better way?

954 ↗(On Diff #104483)

Check indentation.

1204 ↗(On Diff #104483)

Newline removed.

1607 ↗(On Diff #104483)

Newline removed.

test/Transforms/LoopIdiom/memcpy_structPattern.ll
12 ↗(On Diff #104483)

Can the test case be simplified/minimized by removing unwanted/extra attributes?

mgrang added inline comments.Jun 28 2017, 11:52 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
901 ↗(On Diff #104483)

Extra / in comments. Also check comment spacing.

test/Transforms/LoopIdiom/memcpy_structPattern.ll
1 ↗(On Diff #104483)

Unit test names follow all lowercase naming convention: memcpy_struct_pattern.ll

DIVYA updated this revision to Diff 105328.Jul 5 2017, 1:40 PM
  • Removed Extra Newlines
  • Added IndexQueue(SmallVector) for storing possible consecutive stores
  • changed testcase filename
haicheng edited edge metadata.EditedJul 5 2017, 6:56 PM

Please check the inlined comment.

Is it possible to reuse the code of processLoopStores() and processLoopStridedStore()?

Haicheng

lib/Transforms/Scalar/LoopIdiomRecognize.cpp
527 ↗(On Diff #105328)

I think this comment should be changed.

900 ↗(On Diff #105328)

I think we need to emphasize that loads and stores are consecutive.

930 ↗(On Diff #105328)

What if FirstStoreLoad is nullptr?

968 ↗(On Diff #105328)

Same thing here.

1070 ↗(On Diff #105328)

Please check the format.

1077 ↗(On Diff #105328)

memset?

1105 ↗(On Diff #105328)

memset?

DIVYA updated this revision to Diff 105424.Jul 6 2017, 7:09 AM
  • Updated some comments
DIVYA updated this revision to Diff 105428.Jul 6 2017, 7:31 AM

Just a few more comments. Please use clang format on your code.

Haicheng

lib/Transforms/Scalar/LoopIdiomRecognize.cpp
984 ↗(On Diff #105428)

I think you can remove this one or change it to an assert.

At this point, I think we know the stores have the same strides and loads have the same strides as their users (stores), so the strides of the loads should be the same.

1151 ↗(On Diff #105428)

What if some Loads/Stores are atomic and some are not?

DIVYA updated this revision to Diff 105544.Jul 6 2017, 2:56 PM
  • If there are multiple stores and loads ,which are consecutive, but if any of them is atomic , then it won't be converted to memcpy.
  • isAtomicStoreLoad flag checks for any atomic store or load, when there are multiple stores and loads .
  • If there is single store and load , then it will converted to memcpy or atomic memcpy depending on whether the store and load are atomic or not.
haicheng added inline comments.Jul 6 2017, 5:43 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
981 ↗(On Diff #105544)

Can we just return from here?

DIVYA updated this revision to Diff 105630.Jul 7 2017, 6:30 AM
DIVYA marked 2 inline comments as done.Jul 25 2017, 11:26 AM
DIVYA added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
968 ↗(On Diff #105328)

This will be checked in the isLegalStore() function itself.

haicheng added inline comments.Jul 25 2017, 11:42 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
930 ↗(On Diff #105328)

Then you can use cast instead of dyn_cast

DIVYA updated this revision to Diff 108658.Jul 28 2017, 8:18 AM
  • Changed dyn_cast to cast
DIVYA updated this revision to Diff 108659.Jul 28 2017, 8:27 AM
DIVYA marked an inline comment as done.
DIVYA marked 2 inline comments as done.Aug 2 2017, 1:45 PM

I think you can refactor your code to reuse most of the implementation of processLoopStores() and processLoopStridedStore(). Maybe just add a flag to do the extra work for the loads.

DIVYA updated this revision to Diff 110415.Aug 9 2017, 9:16 AM
  • Refactored code to reuse implementation of processLoopStores() and processLoopStridedStore()
  • Added ForMemcpy flag
haicheng added inline comments.Aug 9 2017, 10:05 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
134 ↗(On Diff #110415)

I think you can pass in an enum which has value ForMemset, ForMemsetPattern, ForMemcpy like LegalStoreKind above or just use LegalStoreKind if possible.

651–675 ↗(On Diff #110415)

I think code here can refactor with code above

737 ↗(On Diff #110415)

I think processLoopStoreOfLoopLoad() can refactor with processLoopStridedStore(), then you don't need if...else... here.

DIVYA updated this revision to Diff 110461.Aug 9 2017, 1:05 PM
  • Used enum for ForMemcpy
  • Refactored some code
DIVYA marked an inline comment as done.Aug 9 2017, 1:09 PM
DIVYA added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
737 ↗(On Diff #110415)

processLoopStoreOfLoopLoad() function was already present,so I haven't refactored it with processLoopStridedStore() in this patch.I can do that in the next patch

DIVYA marked 3 inline comments as done.Aug 9 2017, 1:10 PM
haicheng added inline comments.Aug 9 2017, 3:40 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
646 ↗(On Diff #110461)

Please add a comment to make it clear it is for memcpy.

or add if (memcpy)

or write this part as a switch.

669–674 ↗(On Diff #110461)

clang-format here

737 ↗(On Diff #110415)

I think you can go ahead.

DIVYA updated this revision to Diff 110716.Aug 11 2017, 6:58 AM
DIVYA marked 2 inline comments as done.
  • modified comment
DIVYA added inline comments.Aug 11 2017, 2:11 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
737 ↗(On Diff #110415)

So should I do it in this patch or next one?

haicheng added inline comments.Aug 11 2017, 2:16 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
737 ↗(On Diff #110415)

I think in this patch.

DIVYA updated this revision to Diff 110973.Aug 14 2017, 8:35 AM
  • processLoopStoreOfLoopLoad() refactored with processLoopStridedStore(),
DIVYA marked an inline comment as done.Aug 14 2017, 8:36 AM
DIVYA updated this revision to Diff 110998.Aug 14 2017, 9:15 AM
DIVYA updated this revision to Diff 111020.Aug 14 2017, 10:12 AM
DIVYA updated this revision to Diff 111030.Aug 14 2017, 10:29 AM
haicheng added inline comments.Aug 14 2017, 3:05 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
830 ↗(On Diff #111030)

a space between stride and comma.

847–857 ↗(On Diff #111030)

This part can be written as

if (Memset)
  SplatValue = 
else if (memsetpattern)
  Patternvalue =
966 ↗(On Diff #111030)

This can be changed to

if (memset)

969 ↗(On Diff #111030)

this can be changed to

else if (memset_pattern)

1022–1029 ↗(On Diff #111030)

I think after creating newCall, setDebugLoc and debug dumping can be refactored too.

DIVYA updated this revision to Diff 111184.Aug 15 2017, 9:10 AM
DIVYA marked 5 inline comments as done.
  • setDebugLoc and debug dumping refactored
haicheng added inline comments.Aug 15 2017, 3:33 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
143 ↗(On Diff #111184)

I think we should use MemIdiom and IsLoopMemset can be removed.

DIVYA added inline comments.Aug 16 2017, 7:07 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
143 ↗(On Diff #111184)

But when processLoopStridedStore() is called from processLoopMemSet() function , then IsLoopMemset is set true and when it is called from processLoopStores() function , then IsLoopMemset is set false.Even though both can have MemIdiom set to LegalStoreKind::Memset .So using MemIdiom alone would not be sufficient.As, we need IsLoopMemset to indicate if memset which can be promoted to a large memset.

haicheng added inline comments.Aug 16 2017, 8:13 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
143 ↗(On Diff #111184)

Yes, you are right. I think we can move the check avoidLIRForMultiBlockLoop() much earlier so that we don't need this flag and we can save compilation time. But, we can do that in the next patch.

965–968 ↗(On Diff #111184)

I think here can be written as

if (Memset)
else if (MemsetPattern)
else (Memcpy)

not nested if

DIVYA updated this revision to Diff 111356.Aug 16 2017, 8:44 AM
  • Removed nested if
DIVYA marked an inline comment as done.Aug 16 2017, 8:45 AM
haicheng added inline comments.Aug 16 2017, 12:27 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
577 ↗(On Diff #111356)

SL[i]->getValueOperand() => FirstStoredVal

644 ↗(On Diff #111356)

SL[k]->getValueOperand() => SecondStoredVal

943 ↗(On Diff #111356)

not only memset here

DIVYA updated this revision to Diff 111407.Aug 16 2017, 1:14 PM
DIVYA marked 3 inline comments as done.
  • modified comment
haicheng accepted this revision.Aug 17 2017, 7:28 AM

LGTM

Thanks for working on that. Please fully check the correctness before committing the patch.

This revision is now accepted and ready to land.Aug 17 2017, 7:28 AM
mcrosier resigned from this revision.Sep 18 2017, 8:15 AM
sanjoy resigned from this revision.Jan 29 2022, 5:44 PM