motivated to resolve https://github.com/rust-lang/rust/issues/107436
on https://reviews.llvm.org/D150970, remove the following memcpy
Differential D150967
[MemCpyOpt] precommit test for memcpy removal for immutable arguments from attributes (NFC) khei4 on May 19 2023, 7:44 AM. Authored by
Details motivated to resolve https://github.com/rust-lang/rust/issues/107436 on https://reviews.llvm.org/D150970, remove the following memcpy
Diff Detail
Unit Tests
Event Timeline
Comment Actions apply feedback, add cases for attributes union of Callsite and Parameter and negative case
Comment Actions To write down what I said on the last call: For this transform, we need to be careful because we don't know what kind of accesses the called function will perform. The function may have an unknown assumption on the alignment and may access an unknown number of bytes. As such, we can only perform the transform if we know the alignment/size of the object that is passed to the function. If we have something like this define void @test(i1 %c, ptr noalias %val) { %val1 = alloca [4 x i8], align 4 %val2 = alloca [16 x i8], align 16 %val3 = select i1 %c, ptr %val1, ptr %val2 call void @llvm.memcpy.p0.p0.i64(ptr align 4 %val3, ptr align 4 %val, i64 4, i1 false) call void @f(ptr nocapture noalias readonly %val3) ret void } then it might be that %c is always known false, so @f assumes it gets the %val2 = alloca [16 x i8], align 16 pointer. As such, it might assume that the pointer is aligned to 16 bytes and is 16 bytes large. As such, we need to be sure that %val satisfies this as well if we do the replacement. I think in practice, this means we should only do the replacement if a) the pointer passed to the function is an alloca (which ensures we know alignment & size) b) the memcpy copies the full size of the alloca (which ensures that the new pointer is dereferencable for the required range) and c) the new pointer has alignment >= the alloca alignment. Comment Actions Thank you so much! Sounds reasonable! For the first step, I'll assume *syntactically* possible args' alignments are coming to the args, and see whether they satisfy the following conditions. I'll entrust further analysis(for example, the above always-known-value case you gave without any folding) on other passes except MemCpyOptimizer, at least for the first step! (I think it might not be valuable to do it on this pass).
Sounds good! Moreover, b seems like possible to extend less than the size of alloca :). I'll add tests to check each conditions! Thanks! Comment Actions add the cases memcpy and alloca have alignment assumptions Comment Actions add test cases
Comment Actions tweak
TODO: sort test cases to correspond the transformation branch This comment was removed by khei4. Comment Actions I think if the dest is unescaped alloca, allocasize and memcpy length corresponding is unnecessary, because
example: https://llvm.godbolt.org/z/Pd383eo46 But I feel like it’s a little bit too detailed to consider before merging the general case ;) Comment Actions This should be safe (though not very useful...)
In this case, the original pointer (%val1) is dereferenceable to 4 bytes, but the new one (%val) is only known dereferenceable to 1 byte, so we are replacing dereferencable(bigger) with dereferenceable(smaller), no? That wouldn't be valid. Comment Actions
Yeah. As you said not only memcpy length and alloca size, but the new pointer's dereferenceability matters. Yeah, if %val1's dereferenceablity is shown in someway(although I'm not sure alloca can be ensured dereferenceable without initializer), that'll be a problem! Thanks! But anyway currently these case seem a little far from a practical case! Comment Actions add
Comment Actions add cases
remove obsolete comment Comment Actions I think this is missing tests where some of the attributes on the argument are missing: E.g. only readonly nocapture but not noalias or only readonly noalias but not nocapture. Comment Actions add
Although on may alias case might be conservative because current implementation checks unescaped alloca only. |
For readonly, the LangRef says: This attribute indicates that the function does not write through this pointer argument, even though it may write to the memory that the pointer points to.
IIUC, it means there is no write that through this argument, but we can't guarantee that the memory is not written. So maybe we need to make sure the function @fnr has the attribute memory(read)?