Page MenuHomePhabricator

[SelectionDAG] Fix argument copy elision with irregular types
ClosedPublic

Authored by LemonBoy on May 10 2021, 1:08 AM.

Details

Summary

D29668 enabled to avoid a useless copy of the argument value into an alloca if the caller places it in memory (as it often happens on x86) by directly forwarding the pointer to it. This optimization is illegal if the type contains padding bytes: if a truncating store into the alloca is replaced the upper bits are filled with garbage and produce code misbehaving at runtime.

Diff Detail

Event Timeline

LemonBoy created this revision.May 10 2021, 1:08 AM
LemonBoy requested review of this revision.May 10 2021, 1:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 10 2021, 1:08 AM
LemonBoy edited the summary of this revision. (Show Details)May 10 2021, 1:08 AM
rnk added a comment.May 17 2021, 1:27 PM

I don't see why this optimization should be declared illegal. Why isn't it the caller's responsibility to store these narrow integers canonically? C doesn't have narrow integers, but GCC doesn't do this. If a variable is passed in memory, it just uses the memory that was passed into it.

llvm/test/CodeGen/X86/arg-copy-elide.ll
80

We don't want this, do we? Seems like it would be a regression.

I don't see why this optimization should be declared illegal. Why isn't it the caller's responsibility to store these narrow integers canonically? C doesn't have narrow integers, but GCC doesn't do this. If a variable is passed in memory, it just uses the memory that was passed into it.

AFAICS it's the callee's responsibility to chop the extra bits off.
In this (compile at -O0) example you can see that @expect is correctly truncating the i3 value passed in-register, but fails to do so for the third argument that's passed in-memory.

rnk added a comment.May 20 2021, 11:08 AM

Thinking about this more, I think this is a good change. Currently we use the zext and sext attributes to document if the high bits of an argument are initialized, and if so, to what. If those attributes aren't present, we shouldn't do copy elision. I have a minor concern that needs to be addressed, though.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9895

Don't we still need this check? I don't see any other code that checks that the allocated type and argument type match. Otherwise we could try to copy elide an i32 stored into an i64 alloca, for example. Please add a test for that, sorry for not already having one.

llvm/test/CodeGen/X86/arg-copy-elide.ll
80

This code turns out to not be representative of what clang generates, so I don't think we need to worry about it. See this bool example here:
https://gcc.godbolt.org/z/efPavanfb

void escape(bool *);
void bool_arg(bool b) { escape(&b); }

We fail to copy elide because the argument is i1, but the alloca is i8. I think this will be true generally for non-byte-sized integers, so there is really no harm to this change to copy elision.

LemonBoy added inline comments.May 21 2021, 3:43 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9895

In tryToElideArgumentCopy there's a check between the object size pointed by the old and new frame indices, in case of a partially initialized i64 being forwarded as a i32 the check will fail.
No strong feeling about this, I can put the check back if you prefer.

LemonBoy updated this revision to Diff 346980.May 21 2021, 3:44 AM

Add a test w/ partially-initialized alloca.

rnk added inline comments.May 21 2021, 9:20 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9895

I think we should keep the store size vs. alloca size check here. If I were debugging this code later, I would be confused as to how an alloca like this became a copy elision candidate that gets rejected later in tryToElideArgumentCopy.

LemonBoy updated this revision to Diff 347065.May 21 2021, 9:39 AM

Reinstate the alloca initialization check.

rnk accepted this revision.May 21 2021, 9:54 AM

lgtm, thanks.

This revision is now accepted and ready to land.May 21 2021, 9:54 AM