This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyLibCalls] Retain attributes added by Builder.CreateMem*
ClosedPublic

Authored by arichardson on Oct 3 2022, 7:01 AM.

Details

Summary

This currently does not make much of a difference (only one tests is
affected), but it is helpful e.g. for the out-of-tree CHERI target where
Builder.CreateMemCpy() can add attributes other than parameter alignment.

Diff Detail

Event Timeline

arichardson created this revision.Oct 3 2022, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2022, 7:01 AM
arichardson requested review of this revision.Oct 3 2022, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2022, 7:01 AM
nikic accepted this revision.Oct 3 2022, 9:10 AM

LGTM

This currently does not make much of a difference (only one tests is
affected), but it is helpful e.g. for the out-of-tree CHERI target where
Builder.CreateMemCpy() can add attributes other than parameter alignment.

Out of interest, what attributes does it add?

This revision is now accepted and ready to land.Oct 3 2022, 9:10 AM

LGTM

This currently does not make much of a difference (only one tests is
affected), but it is helpful e.g. for the out-of-tree CHERI target where
Builder.CreateMemCpy() can add attributes other than parameter alignment.

Out of interest, what attributes does it add?

CHERI has strict requirements on load/store alignment to preserve the hidden validity bits (you have to use aligned 16 byte loads/stores on CHERI-extended 64-bit architectures). This means we can't emit memcpy/memmove inline if the known alignment is only 8 bytes since 8 byte loads&stores will invalidate all validity bits. This means something like a 16 byte memcpy with 8 byte alignment has to be lowered to a memcpy() call so that it preserves validity bits if it is aligned to 16 bytes at runtime.

I added a new attribute "no_preserve_cheri_tags" that the frontend (as well as a few passes such as SLC) can use to indicate that the memory being copied will never have validity tags sets, which means we can inline the calls using smaller loads&stores.

jrtc27 added a comment.Oct 3 2022, 9:40 AM

(Well, you can emit it inline so long as you're happy to branch, at which point it's not really much of a gain over making a call to memcpy and has the issue of code bloat, but maybe for particularly hot places you'd want to still inline a partially-specialised version, and you have to for __builtin_memcpy_inline)

This revision was landed with ongoing or failed builds.Oct 4 2022, 6:12 AM
This revision was automatically updated to reflect the committed changes.