This is an archive of the discontinued LLVM Phabricator instance.

[CGP][ARM] Don't align memcpy args when optimizing for size
Needs Review · Public

Authored by dmgreen on Aug 19 2022, 7:38 AM.

Details

Summary

This alignment of memcpy args was added back in D7908. It should be limited when optimizing for size, so that no unnecessary extra padding is added. It currently seems to be used only on ARM.
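The intent described in the summary can be modelled with a small standalone sketch. This is not the CodeGenPrepare code from the patch: `FunctionInfo`, `PointerArg`, `chooseAlign`, `MinSize` and `PrefAlign` are hypothetical names used only for illustration, and the rule that small objects are left alone is an assumption of the model.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the function containing the memcpy call.
struct FunctionInfo {
  bool OptForSize; // models a function that is being optimized for size
};

// Hypothetical stand-in for the object behind a memcpy pointer argument.
struct PointerArg {
  uint64_t CurrentAlign; // alignment the object already has, in bytes
  uint64_t ObjectSize;   // size of the object, in bytes
};

// Returns the alignment the model would give the object.
// MinSize and PrefAlign are assumed to come from the target.
inline uint64_t chooseAlign(const FunctionInfo &F, const PointerArg &A,
                            uint64_t MinSize, uint64_t PrefAlign) {
  // When optimizing for size, leave the alignment alone so that no
  // padding can be introduced by this transformation.
  if (F.OptForSize)
    return A.CurrentAlign;
  // Assumption of this sketch: small objects are not worth realigning.
  if (A.ObjectSize < MinSize)
    return A.CurrentAlign;
  // Otherwise raise the alignment to the preferred value, never lower it.
  return std::max(A.CurrentAlign, PrefAlign);
}
```

In this model, a 64-byte object with alignment 1 keeps alignment 1 when the function is optimized for size and is raised to the preferred alignment otherwise.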

Event Timeline

dmgreen created this revision. Aug 19 2022, 7:38 AM
Herald added a project: Restricted Project. Aug 19 2022, 7:38 AM
dmgreen requested review of this revision. Aug 19 2022, 7:38 AM
SjoerdMeijer added inline comments. Aug 19 2022, 8:21 AM
llvm/test/CodeGen/ARM/memcpy-no-inline.ll, lines 59–85

I understand the logic of this patch, but I am struggling with the tests.
In this new test below, it's unclear to me why we shouldn't be generating the libcall.

To make the differences clear, I wonder whether these tests should check more of the output, or whether they should be llc .. | llvm-objdump -d .. tests, so that we can actually see the code size.

dmgreen updated this revision to Diff 455277. Aug 24 2022, 10:31 AM

I've updated the test with all the output, and just shown the diffs here. With that test, if the array is no longer aligned then a series of loads/stores will be needed, because an LDM would have alignment requirements that are too high.

efriedma added inline comments.
llvm/lib/CodeGen/CodeGenPrepare.cpp, line 2130

Not sure I understand the placement of this check. Increasing the alignment of an alloca or a MemIntrinsic doesn't directly increase code size. And increasing the alignment of a global variable only increases code size if we're forced to insert extra padding.
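The point about padding can be illustrated with a little arithmetic. The following is a minimal standalone sketch, not LLVM code: `Global`, `paddingFor` and `sectionSize` are hypothetical names, and the model assumes globals are laid out back to back in declaration order.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes of padding needed so that an object placed at Offset starts on an
// Align-byte boundary. Align must be nonzero.
inline uint64_t paddingFor(uint64_t Offset, uint64_t Align) {
  return (Align - Offset % Align) % Align;
}

// Hypothetical description of a global variable: size and alignment in bytes.
struct Global {
  uint64_t Size;
  uint64_t Align;
};

// Total size of a section holding the globals in order, including any
// padding inserted to satisfy each global's alignment.
template <std::size_t N>
uint64_t sectionSize(const Global (&Gs)[N]) {
  uint64_t Offset = 0;
  for (const Global &G : Gs) {
    Offset += paddingFor(Offset, G.Align); // padding before this global
    Offset += G.Size;                      // the global itself
  }
  return Offset;
}
```

Under these assumptions, a 1-byte global followed by a 5-byte array occupies 6 bytes when the array has alignment 1, and 9 bytes once the array's alignment is raised to 4, because 3 bytes of padding are needed. If the first global is already 4 bytes with alignment 4, raising the alignment of a following 8-byte global from 1 to 4 costs nothing: the total stays at 12 bytes.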