This is an archive of the discontinued LLVM Phabricator instance.

Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0.
ClosedPublic

Authored by mjacob on Jan 28 2015, 5:47 PM.

Details

Summary

Previously SelectionDAGBuilder asserted that the pointer operands of
memcpy / memset / memmove intrinsics are in address space < 256. This assert
implicitly assumed the X86 backend, where all address spaces < 256 are
equivalent to address space 0 from the code generator's point of view. On some
targets (R600 and NVPTX) several address spaces < 256 have a target-defined
meaning, so this assert made little sense for these targets.

This patch removes this wrong assertion and adds extra checks before lowering
these intrinsics to library calls. If a pointer operand can't be casted to
address space 0 without changing semantics, a fatal error is reported to the
user.

The new behavior should be valid for all targets that give address spaces != 0
a target-specified meaning (NVPTX, R600, X86). NVPTX lowers big or
variable-sized memory intrinsics before SelectionDAG construction. All other
memory intrinsics are inlined (the threshold is set very high for this target).
R600 doesn't support memcpy / memset / memmove library calls (previously the
illegal emission of a call to such library function triggered an error
somewhere in the code generator). X86 now emits inline loads and stores for
address spaces 256 and 257 up to the same threshold that is used for address
space 0 and reports a fatal error otherwise.

I call this a "partial fix" because there are still cases that can't be
lowered. A fatal error is reported in these cases.

Diff Detail

Event Timeline

mjacob updated this revision to Diff 18936.Jan 28 2015, 5:47 PM
mjacob retitled this revision from to Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0..
mjacob updated this object.
mjacob edited the test plan for this revision. (Show Details)
mjacob added reviewers: arsenm, theraven, compnerd.
mjacob added subscribers: alex, Unknown Object (MLST).
compnerd added inline comments.Jan 28 2015, 6:00 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4560

The phrasing here could be a bit better IMO. How about:

report_fatal_error(Twine("cannot lower memory intrinsic in address space ") + AS);
mjacob updated this revision to Diff 18950.Jan 29 2015, 4:13 AM

Add a test case for inlining of small constant memcpys from / to address space 256 on X86.

mjacob added inline comments.Jan 29 2015, 4:38 AM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4560

I wanted the error to be explicit about the fact that lowering memory intrinsics is partially supported. Waiting for another opinion on this.

mjacob updated this object.Feb 3 2015, 11:10 AM
arsenm added inline comments.Feb 17 2015, 3:34 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4560

I don't think an error message should include FIXME

mjacob updated this revision to Diff 41346.Nov 29 2015, 3:39 AM

Change error message to be more concise, addressing reviewers' comments.

mjacob marked 2 inline comments as done.Nov 29 2015, 3:40 AM
theraven added inline comments.Nov 29 2015, 8:44 AM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4560

I'm not sure why we're doing this assertion at all. The correct thing to do is delegate to the target here and ask it for the *correct* memcpy variant. We currently work around this limitation by doing an IR pass that translates memcpy intrinsics into the correct calls but that prevents expansion in the back end.

mjacob added inline comments.Nov 29 2015, 9:35 AM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4560

I agree. A few points, though:

  1. This removes an assert in SelectionDAGBuilder which (a) is completely x86-specific and (b) is not present in release builds which are used by most LLVM users, generating wrong code without even giving a warning.
  1. This solution works fine for the common case where the memcpy comes from a small aggregate assignment in the frontend or is formed from a few consecutive loads and stores by the optimizer. In other cases an error is reported instead of emitting wrong code.

While the change here is not solving all problems for everyone, it makes the situation strictly better than before in my opinion.

ping

I think this won't get in the way of implementing a more comprehensive solution to the memory intrinsic lowering problem. Instead I see it as a first step. However, as we don't really seem to come to a conclusion here, I wonder whether it makes sense to discuss the problem in a broader scope first (on the mailing list).

hfinkel accepted this revision.Dec 10 2015, 10:44 AM
hfinkel added a reviewer: hfinkel.
hfinkel added a subscriber: hfinkel.

ping

I think this won't get in the way of implementing a more comprehensive solution to the memory intrinsic lowering problem. Instead I see it as a first step. However, as we don't really seem to come to a conclusion here, I wonder whether it makes sense to discuss the problem in a broader scope first (on the mailing list).

I agree; this is a reasonable incremental step. When some target actually wants to do the "right" thing here for the more-general case, adding the associated infrastructure will be straightforward. Without such a use case, however, the error is the best we can do.

LGTM.

This revision is now accepted and ready to land.Dec 10 2015, 10:44 AM
mjacob closed this revision.Dec 12 2015, 1:36 PM