This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++.
A follow up CL will add the intrinsics in Clang.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 42838 Build 43419: arc lint + arc unit
Event Timeline
This looks pretty reasonable to me. Test coverage is low though:
I think we need to add it to the read/write intrinsic tests (that I thought existed). Also some tests that verify we handle memcpy in a certain way should be copied for this intrinsic.
llvm/include/llvm/IR/Intrinsics.td | ||
---|---|---|
512 | Nit: I'd move this below just after the non-inline version. |
Thx for the review @jdoerfert. I tried to add the relevant tests, do you have any other tests in mind?
llvm/include/llvm/IR/Intrinsics.td | ||
---|---|---|
512 | Ok although it's not part of the 'Standard C Library Intrinsics' per say. |
Needs a LangRef update.
It would be nice to do a bit more work so various optimizations handle llvm.memcpy.inline in a way that isn't completely conservative (for example, in alias analysis), but I guess that doesn't need to be in the initial patch.
llvm/include/llvm/IR/IntrinsicInst.h | ||
---|---|---|
659 | Maybe hide the base class getLength() with a version that returns a ConstantInt instead? You could use it, for example, in Lint.cpp. | |
llvm/include/llvm/IR/Intrinsics.td | ||
515 | Maybe clarify what "inlined" means. In this context, it should mean specifically that it doesn't call any external function, including memcpy. (It would be difficult to reliably guarantee that it doesn't contain any call instructions; for example, we might outline a memcpy sequence, and that's probably fine.) | |
llvm/test/CodeGen/X86/memcpy-inline.ll | ||
12 | It would be a good idea to have a testcase for a larger size that wouldn't normally be inlined. (Also, maybe a test on another target that doesn't have a "memcpy" instruction.) |
llvm/include/llvm/IR/IntrinsicInst.h | ||
---|---|---|
659 | For Lint.cpp it has to go through findValue which erases the type, not sure what you prefer, make findValue a template or cast (as it's currently done) | |
llvm/test/CodeGen/X86/memcpy-inline.ll | ||
12 |
I'm not sure what you want to test with this. |
I'm not too familiar with AA but my understanding is that the definition in llvm/include/llvm/IR/Intrinsics.td already describes the arguments with enough precision to be helpful for the AA pass.
def int_memcpy_inline : Intrinsic<[], [ llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyint_ty, llvm_i1_ty ], [ IntrArgMemOnly, NoCapture<0>, NoCapture<1>, WriteOnly<0>, ReadOnly<1>, ImmArg<2>, ImmArg<3> ]>;
Did you have something specific in mind? Or am I missing something completely?
The attributes are helpful, but not enough to figure out how many bytes are modified.
llvm/include/llvm/IR/IntrinsicInst.h | ||
---|---|---|
659 | You don't need to go through findValue to find a value that's already a ConstantInt. | |
llvm/test/CodeGen/X86/memcpy-inline.ll | ||
12 | Probably not that important, given the tests you have already; nevermind. |
I see, I looked it up a bit but it will take some time for me to understand the framework and where to add this information.
Can you provide me with pointers?
I can do this as a follow up patch also or someone more knowledgeable than me maybe?
A couple more minor issues.
I see, I looked it up a bit but it will take some time for me to understand the framework and where to add this information.
Can you provide me with pointers?
There's code in BasicAAResult::getModRefInfo, which calls MemoryLocation::getForSource(). I guess you get the implementation of getForSource() for free, by inheriting from MemTransferInst, but I'm not sure we actually call it with your current patch, because you didn't modify classof for AnyMemCpyInst.
There might be a few other places where we could check for llvm.memcpy.inline, but nothing really high-priority, I think.
llvm/include/llvm/IR/IntrinsicInst.h | ||
---|---|---|
666 | Should memcpy_inline count as an AnyMemIntrinsic/AnyMemTransferInst/AnyMemCpyInst? | |
llvm/lib/Analysis/Lint.cpp | ||
351 | isIntN(64)? Your LangRef documentation explicitly says it doesn't have to be an i64. You can just drop the assertion, and use getLimitedValue() instead of getZExtValue(). |
- address comments
- - Add a Lint test for overlapping buffers
- Address comments
- Simplify code
- - Address comments, fix test
llvm/include/llvm/IR/Intrinsics.td | ||
---|---|---|
522 | NoAlias? |
llvm/include/llvm/IR/Intrinsics.td | ||
---|---|---|
522 | IntrWillReturn too? (Memcpy has it too) |
llvm/docs/LangRef.rst | ||
---|---|---|
11756–11757 ↗ | (On Diff #240792) | This is kinda too late, but i'm failing to find where it's explained why the size is restricted to being an immediate? |
llvm/docs/LangRef.rst | ||
---|---|---|
11756–11757 ↗ | (On Diff #240792) | The discussion happened on the POC patch |
llvm/docs/LangRef.rst | ||
---|---|---|
11756–11757 ↗ | (On Diff #240792) | I see, efficient lowering problems. Thanks. |
Maybe hide the base class getLength() with a version that returns a ConstantInt instead? You could use it, for example, in Lint.cpp.