This is an archive of the discontinued LLVM Phabricator instance.

[instrinsics] Add @llvm.memcpy.inline instrinsics
ClosedPublic

Authored by gchatelet on Dec 19 2019, 8:36 AM.

Details

Summary

This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++.
A follow up CL will add the intrinsics in Clang.

Event Timeline

gchatelet created this revision.Dec 19 2019, 8:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 19 2019, 8:36 AM

This looks pretty reasonable to me. Test coverage is low though:
I think we need to add it to the read/write intrinsic tests (that I thought existed). Also some tests that verify we handle memcpy in a certain way should be copied for this intrinsic.

llvm/include/llvm/IR/Intrinsics.td
512

Nit: I'd move this below just after the non-inline version.

gchatelet updated this revision to Diff 234907.Dec 20 2019, 8:29 AM
gchatelet marked an inline comment as done.
  • address comments

This looks pretty reasonable to me. Test coverage is low though:
I think we need to add it to the read/write intrinsic tests (that I thought existed). Also some tests that verify we handle memcpy in a certain way should be copied for this intrinsic.

Thx for the review @jdoerfert. I tried to add the relevant tests, do you have any other tests in mind?

llvm/include/llvm/IR/Intrinsics.td
512

Ok although it's not part of the 'Standard C Library Intrinsics' per say.

gchatelet updated this revision to Diff 234915.Dec 20 2019, 9:08 AM
  • - Add a Lint test for overlapping buffers

This looks fine to me, I wait and hope someone else accepts.

Can someone else have a look and approve this patch?

Can someone else have a look and approve this patch?

Looks like @efriedma suggested this approach - perhaps he can review?

Needs a LangRef update.

It would be nice to do a bit more work so various optimizations handle llvm.memcpy.inline in a way that isn't completely conservative (for example, in alias analysis), but I guess that doesn't need to be in the initial patch.

llvm/include/llvm/IR/IntrinsicInst.h
659

Maybe hide the base class getLength() with a version that returns a ConstantInt instead? You could use it, for example, in Lint.cpp.

llvm/include/llvm/IR/Intrinsics.td
515

Maybe clarify what "inlined" means. In this context, it should mean specifically that it doesn't call any external function, including memcpy. (It would be difficult to reliably guarantee that it doesn't contain any call instructions; for example, we might outline a memcpy sequence, and that's probably fine.)

llvm/test/CodeGen/X86/memcpy-inline.ll
13

It would be a good idea to have a testcase for a larger size that wouldn't normally be inlined. (Also, maybe a test on another target that doesn't have a "memcpy" instruction.)

gchatelet updated this revision to Diff 238193.Jan 15 2020, 1:23 AM
gchatelet marked 6 inline comments as done.
  • Rebase and address comments
gchatelet added inline comments.Jan 15 2020, 1:24 AM
llvm/include/llvm/IR/IntrinsicInst.h
659

For Lint.cpp it has to go through findValue which erases the type, not sure what you prefer, make findValue a template or cast (as it's currently done)

llvm/test/CodeGen/X86/memcpy-inline.ll
13

Also, maybe a test on another target that doesn't have a "memcpy" instruction.

I'm not sure what you want to test with this.

It would be nice to do a bit more work so various optimizations handle llvm.memcpy.inline in a way that isn't completely conservative (for example, in alias analysis), but I guess that doesn't need to be in the initial patch.

I'm not too familiar with AA but my understanding is that the definition in llvm/include/llvm/IR/Intrinsics.td already describes the arguments with enough precision to be helpful for the AA pass.

def int_memcpy_inline
    : Intrinsic<[],
      [ llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyint_ty, llvm_i1_ty ],
      [ IntrArgMemOnly, NoCapture<0>, NoCapture<1>, WriteOnly<0>, ReadOnly<1>,
      ImmArg<2>, ImmArg<3> ]>;

Did you have something specific in mind? Or am I missing something completely?

It would be nice to do a bit more work so various optimizations handle llvm.memcpy.inline in a way that isn't completely conservative (for example, in alias analysis), but I guess that doesn't need to be in the initial patch.

I'm not too familiar with AA but my understanding is that the definition in llvm/include/llvm/IR/Intrinsics.td already describes the arguments with enough precision to be helpful for the AA pass.

def int_memcpy_inline
    : Intrinsic<[],
      [ llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyint_ty, llvm_i1_ty ],
      [ IntrArgMemOnly, NoCapture<0>, NoCapture<1>, WriteOnly<0>, ReadOnly<1>,
      ImmArg<2>, ImmArg<3> ]>;

Did you have something specific in mind? Or am I missing something completely?

The attributes are helpful, but not enough to figure out how many bytes are modified.

llvm/include/llvm/IR/IntrinsicInst.h
659

You don't need to go through findValue to find a value that's already a ConstantInt.

llvm/test/CodeGen/X86/memcpy-inline.ll
13

Probably not that important, given the tests you have already; nevermind.

gchatelet marked 2 inline comments as done.Jan 16 2020, 2:23 AM

It would be nice to do a bit more work so various optimizations handle llvm.memcpy.inline in a way that isn't completely conservative (for example, in alias analysis), but I guess that doesn't need to be in the initial patch.

I'm not too familiar with AA but my understanding is that the definition in llvm/include/llvm/IR/Intrinsics.td already describes the arguments with enough precision to be helpful for the AA pass.

def int_memcpy_inline
    : Intrinsic<[],
      [ llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyint_ty, llvm_i1_ty ],
      [ IntrArgMemOnly, NoCapture<0>, NoCapture<1>, WriteOnly<0>, ReadOnly<1>,
      ImmArg<2>, ImmArg<3> ]>;

Did you have something specific in mind? Or am I missing something completely?

The attributes are helpful, but not enough to figure out how many bytes are modified.

I see, I looked it up a bit but it will take some time for me to understand the framework and where to add this information.
Can you provide me with pointers?
I can do this as a follow up patch also or someone more knowledgeable than me maybe?

gchatelet updated this revision to Diff 238434.Jan 16 2020, 2:24 AM
  • Simplify code

@efriedma does it look good enough as is? Any other suggestions?

A couple more minor issues.

I see, I looked it up a bit but it will take some time for me to understand the framework and where to add this information.
Can you provide me with pointers?

There's code in BasicAAResult::getModRefInfo, which calls MemoryLocation::getForSource(). I guess you get the implementation of getForSource() for free, by inheriting from MemTransferInst, but I'm not sure we actually call it with your current patch, because you didn't modify classof for AnyMemCpyInst.

There might be a few other places where we could check for llvm.memcpy.inline, but nothing really high-priority, I think.

llvm/include/llvm/IR/IntrinsicInst.h
669

Should memcpy_inline count as an AnyMemIntrinsic/AnyMemTransferInst/AnyMemCpyInst?

llvm/lib/Analysis/Lint.cpp
351

isIntN(64)? Your LangRef documentation explicitly says it doesn't have to be an i64.

You can just drop the assertion, and use getLimitedValue() instead of getZExtValue().

gchatelet updated this revision to Diff 239279.Jan 21 2020, 4:35 AM
gchatelet marked 3 inline comments as done.
  • address comments
  • - Add a Lint test for overlapping buffers
  • Address comments
  • Simplify code
  • - Address comments, fix test
gchatelet updated this revision to Diff 239570.Jan 22 2020, 6:30 AM
  • rebasing

Looks good to you @efriedma?

llvm/include/llvm/IR/IntrinsicInst.h
669

Yes I believe it should.

xbolva00 added inline comments.
llvm/include/llvm/IR/Intrinsics.td
522

NoAlias?

gchatelet updated this revision to Diff 240158.Jan 24 2020, 4:12 AM
gchatelet marked 2 inline comments as done.
  • Add NoAlias
llvm/include/llvm/IR/Intrinsics.td
522

Good catch.

xbolva00 added inline comments.Jan 24 2020, 4:37 AM
llvm/include/llvm/IR/Intrinsics.td
522

IntrWillReturn too? (Memcpy has it too)

gchatelet updated this revision to Diff 240170.Jan 24 2020, 5:54 AM
  • Fixed test - addressed comments
gchatelet marked an inline comment as done.Jan 24 2020, 5:54 AM
gchatelet updated this revision to Diff 240495.Jan 27 2020, 1:39 AM
  • rebasing

A gentle ping

This revision is now accepted and ready to land.Jan 27 2020, 4:28 PM
This revision was automatically updated to reflect the committed changes.

Thx everyone for the review !

lebedev.ri added inline comments.
llvm/docs/LangRef.rst
11732–11733

This is kinda too late, but i'm failing to find where it's explained why the size is restricted to being an immediate?

gchatelet marked 2 inline comments as done.Feb 10 2020, 2:38 AM
gchatelet added inline comments.
llvm/docs/LangRef.rst
11732–11733

The discussion happened on the POC patch

lebedev.ri marked an inline comment as done.Feb 10 2020, 2:46 AM
lebedev.ri added inline comments.
llvm/docs/LangRef.rst
11732–11733

I see, efficient lowering problems. Thanks.