This is an archive of the discontinued LLVM Phabricator instance.

[loop idiom Recognition] support memcpy for multiple consecutive loads and stores
Accepted, Public

Authored by DIVYA on Jun 23 2017, 12:20 PM.

Details

Summary

This pass converts multiple consecutive loads and stores inside a loop into a single memcpy operation.
For example, the loads and stores in the code below can be converted to a memcpy.

struct foo
{
  int a;
  int b;
} *f, *g;

for (int i = 0; i < n; i++)
{
  f[i].a = g[i].a;
  f[i].b = g[i].b;
}
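
As a rough illustration of the intended effect (a sketch, not the code the pass emits), the loop in the summary copies every field of each element, so it moves the same bytes as one memcpy of n * sizeof(struct foo) bytes. This assumes the two arrays do not overlap and that the copied fields cover the whole struct with no padding. The function names copy_fieldwise and copy_memcpy are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

struct foo {
  int a;
  int b;
};

// The original loop: two consecutive loads and stores per iteration.
void copy_fieldwise(foo *f, const foo *g, int n) {
  for (int i = 0; i < n; i++) {
    f[i].a = g[i].a;
    f[i].b = g[i].b;
  }
}

// Equivalent form when f and g do not overlap: fields a and b cover the
// whole struct, so a single memcpy moves the same bytes.
void copy_memcpy(foo *f, const foo *g, int n) {
  if (n > 0)
    std::memcpy(f, g, static_cast<std::size_t>(n) * sizeof(foo));
}
```

If only some fields were copied, or the regions could overlap, the two functions would no longer be interchangeable, which is why the pass has to prove both properties before rewriting the loop.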

Worked in collaboration with Sebastian Pop and Aditya Kumar.

Event Timeline

DIVYA created this revision.Jun 23 2017, 12:20 PM
DIVYA edited the summary of this revision.Jun 23 2017, 12:23 PM
DIVYA updated this revision to Diff 104483.Jun 28 2017, 11:36 AM
mgrang added a subscriber: mgrang.Jun 28 2017, 11:49 AM
mgrang added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
157

Extra newline.

366

Extra newline.

451

Ditto.

489

Ditto.

1017

Use i != e instead of i < e.

1039

This loop looks a bit hacky. Can this be re-written in a better way?

1042

Check indentation.

1086–1087

Newline removed.

1490–1491

Newline removed.

test/Transforms/LoopIdiom/memcpy_structPattern.ll
12 ↗(On Diff #104483)

Can the test case be simplified/minimized by removing unwanted/extra attributes?

mgrang added inline comments.Jun 28 2017, 11:52 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1015–1016

Extra / in comments. Also check comment spacing.

test/Transforms/LoopIdiom/memcpy_structPattern.ll
1 ↗(On Diff #104483)

Unit test names follow all lowercase naming convention: memcpy_struct_pattern.ll

DIVYA updated this revision to Diff 105328.Jul 5 2017, 1:40 PM
  • Removed extra newlines
  • Added IndexQueue (a SmallVector) for storing possible consecutive stores
  • Changed the test case filename
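
The thread does not show how the patch decides that a group of stores is consecutive. One plausible check, sketched here purely as an assumption, is to sort the accesses by byte offset and verify that they tile a single block whose length equals the per-iteration stride. The names Access and coversOneStride are hypothetical and do not come from the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One store (or load) in the loop body: byte offset from a common base
// pointer and the number of bytes accessed. Hypothetical type.
struct Access {
  std::int64_t Offset;
  std::int64_t Size;
};

// Returns true if the accesses, sorted by offset, tile one contiguous block
// with no gaps or overlaps, and that block is exactly Stride bytes long, so
// that successive iterations are also adjacent in memory.
bool coversOneStride(std::vector<Access> Accesses, std::int64_t Stride) {
  if (Accesses.empty())
    return false;
  std::sort(Accesses.begin(), Accesses.end(),
            [](const Access &L, const Access &R) { return L.Offset < R.Offset; });
  std::int64_t Next = Accesses.front().Offset;
  for (const Access &A : Accesses) {
    if (A.Offset != Next)
      return false; // gap or overlap between neighbouring accesses
    Next += A.Size;
  }
  return Next - Accesses.front().Offset == Stride;
}
```

For the struct in the summary, the two 4-byte stores at offsets 0 and 4 with a stride of 8 would pass this check; a struct with an uncopied field in between would not.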
haicheng edited edge metadata.EditedJul 5 2017, 6:56 PM

Please check the inlined comment.

Is it possible to reuse the code of processLoopStores() and processLoopStridedStore()?

Haicheng

lib/Transforms/Scalar/LoopIdiomRecognize.cpp
521–523

I think this comment should be changed.

1015–1016

I think we need to emphasize that loads and stores are consecutive.

1016

Please check the format.

1016

memset?

1017

What if FirstStoreLoad is nullptr?

1047

memset?

1055

Same thing here.

DIVYA updated this revision to Diff 105424.Jul 6 2017, 7:09 AM
  • Updated some comments
DIVYA updated this revision to Diff 105428.Jul 6 2017, 7:31 AM

Just a few more comments. Please run clang-format on your code.

Haicheng

lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1054–1055

What if some Loads/Stores are atomic and some are not?

1068

I think you can remove this one or change it to an assert.

At this point, I think we know the stores have the same strides and loads have the same strides as their users (stores), so the strides of the loads should be the same.

DIVYA updated this revision to Diff 105544.Jul 6 2017, 2:56 PM
  • If there are multiple consecutive stores and loads and any of them is atomic, the loop is not converted to memcpy.
  • The isAtomicStoreLoad flag records whether any store or load is atomic when there are multiple stores and loads.
  • If there is a single store and load, the loop is converted to memcpy or atomic memcpy, depending on whether the store and load are atomic.
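
The rule described in this update can be sketched as a small decision function. This is an illustration under assumptions, not the patch's code: the types Lowering and StoreLoadPair and the function chooseLowering are hypothetical, and treating a pair as atomic when either its store or its load is atomic is an assumption about the single-pair case.

```cpp
#include <vector>

enum class Lowering { None, Memcpy, AtomicMemcpy };

// One store together with the load that feeds it. Hypothetical type.
struct StoreLoadPair {
  bool StoreIsAtomic;
  bool LoadIsAtomic;
};

Lowering chooseLowering(const std::vector<StoreLoadPair> &Pairs) {
  if (Pairs.empty())
    return Lowering::None;

  // Plays the role of the isAtomicStoreLoad flag mentioned in the update.
  bool AnyAtomic = false;
  for (const StoreLoadPair &P : Pairs)
    AnyAtomic = AnyAtomic || P.StoreIsAtomic || P.LoadIsAtomic;

  // Several consecutive pairs: bail out if any access is atomic.
  if (Pairs.size() > 1)
    return AnyAtomic ? Lowering::None : Lowering::Memcpy;

  // A single pair keeps the existing behaviour.
  return AnyAtomic ? Lowering::AtomicMemcpy : Lowering::Memcpy;
}
```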
haicheng added inline comments.Jul 6 2017, 5:43 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1063

Can we just return from here?

DIVYA updated this revision to Diff 105630.Jul 7 2017, 6:30 AM
DIVYA marked 2 inline comments as done.Jul 25 2017, 11:26 AM
DIVYA added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1055

This will be checked in the isLegalStore() function itself.

haicheng added inline comments.Jul 25 2017, 11:42 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1017

Then you can use cast instead of dyn_cast

DIVYA updated this revision to Diff 108658.Jul 28 2017, 8:18 AM
  • Changed dyn_cast to cast
DIVYA updated this revision to Diff 108659.Jul 28 2017, 8:27 AM
DIVYA marked an inline comment as done.
DIVYA marked 2 inline comments as done.Aug 2 2017, 1:45 PM

I think you can refactor your code to reuse most of the implementation of processLoopStores() and processLoopStridedStore(). Maybe just add a flag to do the extra work for the loads.

DIVYA updated this revision to Diff 110415.Aug 9 2017, 9:16 AM
  • Refactored code to reuse implementation of processLoopStores() and processLoopStridedStore()
  • Added ForMemcpy flag
haicheng added inline comments.Aug 9 2017, 10:05 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
134

I think you can pass in an enum which has value ForMemset, ForMemsetPattern, ForMemcpy like LegalStoreKind above or just use LegalStoreKind if possible.

686–710

I think the code here can be refactored with the code above.

732

I think processLoopStoreOfLoopLoad() can be refactored with processLoopStridedStore(), so you don't need the if...else here.

DIVYA updated this revision to Diff 110461.Aug 9 2017, 1:05 PM
  • Used enum for ForMemcpy
  • Refactored some code
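
The reviewer's suggestion to pass an enum rather than a boolean flag might look roughly like the sketch below. IdiomKind is a hypothetical stand-in; the real LegalStoreKind in LoopIdiomRecognize.cpp may have different enumerators, and intrinsicFor is invented for illustration only.

```cpp
#include <string>

// Hypothetical stand-in for the enum discussed in the review.
enum class IdiomKind { Memset, MemsetPattern, Memcpy };

// One flat dispatch on the kind, instead of a bool flag that can only
// distinguish two of the three cases.
std::string intrinsicFor(IdiomKind Kind) {
  switch (Kind) {
  case IdiomKind::Memset:
    return "memset";
  case IdiomKind::MemsetPattern:
    return "memset_pattern16";
  case IdiomKind::Memcpy:
    return "memcpy";
  }
  return ""; // unreachable for valid enumerators
}
```

With an enum, adding a further idiom later means adding one enumerator and one case, rather than threading another boolean through every caller.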
DIVYA marked an inline comment as done.Aug 9 2017, 1:09 PM
DIVYA added inline comments.
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
732

The processLoopStoreOfLoopLoad() function was already present, so I haven't refactored it with processLoopStridedStore() in this patch. I can do that in the next patch.

DIVYA marked 3 inline comments as done.Aug 9 2017, 1:10 PM
haicheng added inline comments.Aug 9 2017, 3:40 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
643

Please add a comment to make it clear it is for memcpy.

or add if (memcpy)

or write this part as a switch.

672–677

clang-format here

732

I think you can go ahead.

DIVYA updated this revision to Diff 110716.Aug 11 2017, 6:58 AM
DIVYA marked 2 inline comments as done.
  • modified comment
DIVYA added inline comments.Aug 11 2017, 2:11 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
732

So should I do it in this patch or next one?

haicheng added inline comments.Aug 11 2017, 2:16 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
732

I think in this patch.

DIVYA updated this revision to Diff 110973.Aug 14 2017, 8:35 AM
  • processLoopStoreOfLoopLoad() refactored with processLoopStridedStore()
DIVYA marked an inline comment as done.Aug 14 2017, 8:36 AM
DIVYA updated this revision to Diff 110998.Aug 14 2017, 9:15 AM
DIVYA updated this revision to Diff 111020.Aug 14 2017, 10:12 AM
DIVYA updated this revision to Diff 111030.Aug 14 2017, 10:29 AM
haicheng added inline comments.Aug 14 2017, 3:05 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
865

a space between stride and comma.

880–895

This part can be written as

if (Memset)
  SplatValue = 
else if (memsetpattern)
  Patternvalue =

1056

This can be changed to

if (memset)

1059

this can be changed to

else if (memset_pattern)

1086–1093

I think after creating newCall, setDebugLoc and debug dumping can be refactored too.

DIVYA updated this revision to Diff 111184.Aug 15 2017, 9:10 AM
DIVYA marked 5 inline comments as done.
  • setDebugLoc and debug dumping refactored
haicheng added inline comments.Aug 15 2017, 3:33 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
143

I think we should use MemIdiom and IsLoopMemset can be removed.

DIVYA added inline comments.Aug 16 2017, 7:07 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
143

But when processLoopStridedStore() is called from processLoopMemSet(), IsLoopMemset is set to true, and when it is called from processLoopStores(), IsLoopMemset is set to false, even though both can have MemIdiom set to LegalStoreKind::Memset. So using MemIdiom alone would not be sufficient: we need IsLoopMemset to indicate whether this is a memset that can be promoted to a large memset.
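
The point can be made concrete with a small sketch: two call paths can agree on the idiom kind and still differ in the flag, so the kind cannot stand in for it. The types IdiomKind and StridedStoreRequest and both helper functions are hypothetical and only mirror the two callers named in the comment.

```cpp
// Hypothetical stand-in for LegalStoreKind.
enum class IdiomKind { Memset, MemsetPattern, Memcpy };

// The two pieces of information processLoopStridedStore() receives in this
// discussion, bundled together for illustration.
struct StridedStoreRequest {
  IdiomKind MemIdiom;
  bool IsLoopMemset;
};

// Call path through processLoopMemSet(): an existing memset in the loop.
StridedStoreRequest fromProcessLoopMemSet() {
  return {IdiomKind::Memset, true};
}

// Call path through processLoopStores(): plain stores recognised as memset.
StridedStoreRequest fromProcessLoopStores() {
  return {IdiomKind::Memset, false};
}
```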

haicheng added inline comments.Aug 16 2017, 8:13 AM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
143

Yes, you are right. I think we can move the check avoidLIRForMultiBlockLoop() much earlier so that we don't need this flag and we can save compilation time. But, we can do that in the next patch.

1015–1045

I think here can be written as

if (Memset)
else if (MemsetPattern)
else (Memcpy)

not nested if

DIVYA updated this revision to Diff 111356.Aug 16 2017, 8:44 AM
  • Removed nested if
DIVYA marked an inline comment as done.Aug 16 2017, 8:45 AM
haicheng added inline comments.Aug 16 2017, 12:27 PM
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
577

SL[i]->getValueOperand() => FirstStoredVal

644

SL[k]->getValueOperand() => SecondStoredVal

938–978

not only memset here

DIVYA updated this revision to Diff 111407.Aug 16 2017, 1:14 PM
DIVYA marked 3 inline comments as done.
  • modified comment
haicheng accepted this revision.Aug 17 2017, 7:28 AM

LGTM

Thanks for working on that. Please fully check the correctness before committing the patch.

This revision is now accepted and ready to land.Aug 17 2017, 7:28 AM
mcrosier resigned from this revision.Sep 18 2017, 8:15 AM
sanjoy resigned from this revision.Jan 29 2022, 5:44 PM