This is an archive of the discontinued LLVM Phabricator instance.

[LangRef] State that storing an aggregate fills padding with undef
ClosedPublic

Authored by aqjune on Aug 18 2020, 7:28 PM.

Details

Summary

This patch makes LangRef be explicit about the value of padding when storing an aggregate.
It states that when an aggregate is stored into memory, padding is filled with undef.

Here is a clue that supports this change (edited to reflect the discussion from llvm-dev):

The two items below are not relevant with this patch because Clang lowers load/store of individual field of struct into load/stores of the corresponding pointer with a primitive type. Also, when copy is needed, it uses memcpy instead of load/store of an aggregate, as discussed in the llvm-dev. However, this patch is still valid (as discussed) because it is needed to explain the two optimizations above.

  • According to C17, the value of padding bytes when storing values in structures or unions is unspecified.
  • I updated Alive2 and it did not find any problematic transformation from LLVM unit tests and while running translation validation of a few C programs.

Diff Detail

Event Timeline

aqjune created this revision.Aug 18 2020, 7:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2020, 7:28 PM
aqjune requested review of this revision.Aug 18 2020, 7:28 PM
aqjune edited the summary of this revision. (Show Details)Aug 18 2020, 8:16 PM

Makes sense.

llvm/docs/LangRef.rst
9331

Probably makes sense to add corresponding language to loads (that they ignore the padding).

I would like to keep the SCCP reasoning and I don't have any knowledge of it being problematic. Given that we already apply this semantic, I doubt too many people would be offended by specifying it.

aqjune updated this revision to Diff 286732.Aug 20 2020, 1:10 AM

Mention that load ignores padding of aggregate.

aqjune marked an inline comment as done.Aug 20 2020, 1:10 AM
efriedma added inline comments.Aug 21 2020, 3:17 PM
llvm/docs/LangRef.rst
9242

"is not read" might be a little strong; I'm not sure we want to promise the generated code won't access the address in question, just that the bytes in question aren't part of the value representation.

aqjune updated this revision to Diff 287168.Aug 22 2020, 1:17 AM

Update wording

aqjune marked an inline comment as done.Aug 22 2020, 1:18 AM
aqjune added inline comments.
llvm/docs/LangRef.rst
9242

Yes, what I intended to say was that it is accessed but not participate into the returned value. Updated the sentence

aqjune marked an inline comment as done.Aug 22 2020, 1:18 AM
nikic accepted this revision.Aug 29 2020, 1:41 AM

LGTM

llvm/docs/LangRef.rst
9242
This revision is now accepted and ready to land.Aug 29 2020, 1:41 AM
aqjune edited the summary of this revision. (Show Details)Aug 29 2020, 10:52 PM
This revision was landed with ongoing or failed builds.Aug 29 2020, 10:54 PM
This revision was automatically updated to reflect the committed changes.
aqjune marked an inline comment as done.Aug 29 2020, 10:57 PM