Details
Diff Detail
Event Timeline
The patch shows the current idea so we can consult the design first, and then do boring marking work.
@efriedma Happy to hear your comments so I can address them now.
The approach is fine.
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
196 | I think you're missing a check for null-pointer-is-valid metadata here. |
InstCombine does something weird for me.
define dso_local i32 @fff6(i8* nocapture readonly %s) local_unnamed_addr #5 {
entry:
%call = tail call i64 @strlen(i8* %s) #8
%conv = trunc i64 %call to i32
ret i32 %conv
}
I add nonnull for strlen so... -> call i64 @strlen(i8* nonnull %s) but after -instcombine we have undef here (@spatel ?)
define dso_local i32 @fff6(i8* nocapture readonly %s) local_unnamed_addr #5 {
entry:
ret i32 undef
}
I will look more closely to visitCallSite...
Ping @spatel whether he has any idea why instcombine turns
define dso_local i32 @fff6(i8* nocapture readonly %s) local_unnamed_addr #5 {
entry:
%call = tail call i64 @strlen(i8* nonnull %s) #8
%conv = trunc i64 %call to i32
ret i32 %conv
}
to undef
I don't see this happening in the minimal case. Can you post the full IR and 'opt' command that you're using?
https://pastebin.com/8bxCuV5V + this patch
opt -instcombine p.ll -S
After I mark it, CI->dump() prints " %call = call i64 @strlen(i8* nonnull %s) #2" but later, some code turns this to undef.
You can't return the same value, or you'll invoke instcombine's RAUW function which does this:
// If we are replacing the instruction with itself, this must be in a // segment of unreachable code, so just clobber the instruction. if (&I == V) V = UndefValue::get(I.getType());
Maybe we can also drop nonnull attributes for certain functions with size argument, here, in instcombine?
Clang side was not so successful: https://reviews.llvm.org/D30806
Side idea:
From C function callsites with size arg we could theoretically determine some pointer aliasing info..
I'm not sure what you're pinging, exactly. Can you please update the patch summary to describe what this patch does.
Also, do we need D30806 to be fixed and recommitted first?
Some comments and questions.
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
191 | Any reason you do not use isKnownNonZero ? | |
204 | I don't get this check. Can you explain it to me (potentially in a comment)? | |
286 | Why is non-null supposed to be "allowed" for the destination but not the source in case the length is 0? | |
448 | It seems fragile to drop the non-null conditionally only. Shouldn't we always drop it and add it back if possible? | |
548 | As above. | |
565 | As above. | |
1079 | Here, and actually everywhere we currently look for a constant size, we could reuse the more powerful isKnownNonZero function. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
204 | In some AS, dereferencing null ptr may be fine. See discussion: https://reviews.llvm.org/D53666 Maybe this is not a proper check, well, suggestions welcome | |
448 | Ideally it should be on Clang side. They had tried but the solutions were reverted after miscompiles. |
Not directly for this one, but we need it for the final goal (enable nonnull arg promotion in FunctionAttrs) so ISCCP or other passes can be more powerful.
If we fix it on Clang side, this patch would be smaller (without dropping nonnull attr)
Another thing we need to solve is a “call sinking” (if enabled, ISCCP misses many opportunities to remove null checks)
And we would be even more powerful with FunctionAttrs as a Module pass to use DT in arg attribute promotion logic (now we do it only for entry block).
Just for some context:
I'm working on attribute interference to improve all IPOs right now.
In my experiments argument promotion is often limited by alias and dereferencability information.
Therefore I'd like this to happen so non-null information can be deduced by default which will
then improve dereferenceability information in the new interference pass.
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
204 | You do this check already 3 lines above ( F->nullPointerIsDefined())). | |
448 | That is not a good answer to my question. Doing it in clang seems reasonable, if that is not what we want/can do we can do it here but properly. Thus, I'd argue it should not depend on the length as otherwise, we get some weird behaviors that cause us to drop it if we normalized the length to a constant before but not if we didn't. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
204 | Not enough. Then we have e.g. "call void @llvm.memset.p5i8.i64(i8 addrspace(5)* nonnull align 8 {{.*}}, i8 0, i64 40, i1 false" As noted by Eli, "null pointers are potentially valid in non-zero address-spaces" (test/CodeGenOpenCL/amdgpu-nullptr.cl) | |
448 | In some problematic C functions, yes, we should drop this attribute always for certain parameters. ...and here we would tag it properly when we are sure that the size is > 0... we should not do it e.g. for: strcmp(a, b, n), since n = 0 is valid and we could miscompile code due to null check removal transformation. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
204 | This is very very cryptic. First, the explicitly named check and then some condition without explanation. Maybe use NullPointerIsDefined instead as it combines both checks and it's clear what is happening. | |
448 |
Afaik, null pointers are never "valid" but often assumed to be valid. That said, I don't see why you want to drop non-null only conditionally. Your very argument is that for a lenght of 0 we should not have non-null, however, your patch actually keeps it in this and other cases for which != 0 was not shown. |
From some libc function we can get more info:
memcmp(p, d, 16)
Both p and d can be annotated with derefencable(16), right?
For functions like tihs, if the size is != null, we know the pointers are nonnull. And, yes, if the size is a constant, we know that is the dereferenceable number.
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
195 | No need to check first. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
286 | s = NULL; The strncat() function is similar, except that
so "d" (dest) is always null terminated (and must be nonnull). |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
1149 | @spatel @jdoerfert @lebedev.ri infinite loop here https://github.com/llvm/llvm-project/blob/master/llvm/test/Transforms/InstCombine/memset-1.ll#L63 define float* @pr25892(i32 %size) #0 { Maybe you can give me a hint what is wrong? Thanks |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
1149 | Anyway, we dont need to drop removeNonNull(CI, {0}); for llvm.memset. It is only issue for libc's memset, and we catch it, e.g: if (LenC && LenC->getZExtValue() > 0) { annotateDereferenceableBytes(CI, {0}, LenC->getZExtValue()); annotateNonNull(CI, {0}); } else if (!isIntrinsic) { removeNonNull(CI, {0}); } But I would like to know what is going here :) |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
1149 | And maybe we can (after this patch) propose to annotate llvm.memset/memcpy/memmove with nonnull in td file? @jdoerfert With -OO no opts run -> no problem even with nonnull attrs. |
I added initial comments, will have to go through again though.
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
214 | Maybe make it clear that this annotates nonnull only if it is not a valid pointer, e.g., name this annotateNonNullBasedOnAccess or sth. | |
217 | Please do not use this method, it is super confusing. I'd argue you should use the getCallee anyway but if you think the surrounding function is the one you need, I'd prefer getCaller. | |
225 | If you do not keep track (=statistics), there is probably no need to check first, as below, I think. | |
232 | I don't think you need to check first. | |
518 | What about this one? | |
548 | I'm confused, here and before, you annotate and later, under some condition, remove the annotation again. This seems risky. Here for example, Len = 0 will not remove the attribute but an unknown Len will. I doubt that is intentional. | |
678 | I would argue we actually want to annotate deref all the time but do also do nonnull while we are at it. (The Attributor derives nonnull from deref if appropriate.) Long story short, why do we derive nonnull but not deref(1) here and in other places where we expect a \0 char? (We could also add deref(N) if we already computed the static string size.) | |
2384 | We could even use "isKnownNonZero" or other helper functions here. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
217 | Ok, getCaller. CallInst has no getCallee. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
217 | getCallee -> getCalledFunction I still think the callee is what is important here (conceptually), though I doubt it makes a difference in practice. | |
548 | Oh, this is all confusing (as I said). | |
678 |
Yes, or more general, for everything we know a minimal "length".
I don't think so, but see below.
No. Though, we do already call GetStringLength all the time here, why not manifest the information?
It does, yes.
That doesn't happen in the Attributor right now but it should because deref(1) can be combined (e.g., in a phi) with null to deref_or_null(1) while nonnull cannot be combined with null to anything. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
2384 | This is true but since we we need to compute “N” anyway, I think isKnownNonZero is a overkill a bit, since it computes what we already have - “N”. - should I use it even here? But I could use it in some places, yes, (memcpy/memmove/memset).. |
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
2384 | For a lot of calls a nonnull size implies other stuff, especially, noalias, deref_or_null. |
Found a bug in LLVM.
We do not copy attributes, it seems.
define i8* @memset_size_select4(i1 %b, i8* %ptr) {
; CHECK-LABEL: @memset_size_select4(
; CHECK-NEXT: [[SIZE:%.*]] = select i1 [[B:%.*]], i32 10, i32 50
; CHECK-NEXT: call void @llvm.memset.p0i8.i32(i8* align 1 [[PTR:%.*]], i8 0, i32 [[SIZE]], i1 false)
; CHECK-NEXT: ret i8* [[PTR]]
;
%size = select i1 %b, i32 10, i32 50 %memset = tail call i8* @memset(i8* nonnull dereferenceable_or_null(40) %ptr, i32 0, i32 %size) ret i8* %memset
}
Fixed with
CallInst *NewCI = B.CreateMemSet(CI->getArgOperand(0), Val, Size, 1); NewCI->setAttributes(CI->getAttributes());
I will update this patch soon.
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
2384 | Until I start heavy mechanical work here I would like to know if this approch is acceptable. Value *LibCallSimplifier::optimizeMemSet(CallInst *CI, IRBuilder<> &B, bool isIntrinsic) { Value *Size = CI->getArgOperand(2); if (ConstantInt *LenC = dyn_cast<ConstantInt>(Size)) { annotateDereferenceableBytes(CI, {0}, LenC->getZExtValue()); annotateNonNull(CI, {0}); } else if (isKnownNonZero(Size, DL)) { annotateDereferenceableBytes(CI, {0}, 1); annotateNonNull(CI, {0}); } else if (!isIntrinsic) { removeNonNull(CI, {0}); } Now we catch even this case :) %size = select i1 %b, i32 10, i32 50 %memset = tail call i8* @memset(i8* nonnull %ptr, i32 0, i32 %size) #1 ret i8* %memset } define i8* @memset_size_select_or(i1 %b, i8* %ptr, i32 %v) { %size = ashr i32 -2, %v %memset = tail call i8* @memset(i8* nonnull %ptr, i32 0, i32 %size) #1 ret i8* %memset } for select of constants we can do better job than deref(1) if (ConstantInt *LenC = dyn_cast<ConstantInt>(Size)) { annotateDereferenceableBytes(CI, {0}, LenC->getZExtValue()); annotateNonNull(CI, {0}); } else if (isKnownNonZero(Size, DL)) { const APInt *X, *Y; uint64_t DerefMin = 1; if (match(Size, m_Select(m_Value(), m_APInt(X), m_APInt(Y)))) { Min = std::min(X->getZExtValue(), Y->getZExtValue()); annotateDereferenceableBytes(CI, {0}, DerefMin); } annotateNonNull(CI, {0}); } else if (!isIntrinsic) { removeNonNull(CI, {0}); } Ok? |
For some cases I would like to copy just attributes for argno 0 - but I dont know to do it :(
I mean something like this:
NewCI->setAttrubutes(0, CI->getAttributes(0))
Maybe you can help me?
test/Transforms/InstCombine/mem-deref-bytes.ll | ||
---|---|---|
76 | @jdoerfert please check this "side-effect" improvement while working on nonnull annotation.. if "null-pointer-is-valid"="true" and "nonnull dereferenceable_or_null(40)" -> nonnull dereferenceable(40) is ok? if nonnull is missing, see memcmp_const_size_update_deref6, See newly added "CI->paramHasAttr(ArgNo, Attribute::NonNull) check" in annotateDereferenceableBytes. |
Take the attribute list (getAttributes()) then getParamAttributes(ArgNo), then probably foreach addParamAttr(NewArgNo), and then set the attribute list. Make sure to always check that you update the list properly.
test/Transforms/InstCombine/mem-deref-bytes.ll | ||
---|---|---|
76 | This looks fine to me, also test6. nonnull + dereferenceable_or_null(X) is always dereferenceable(X) regardless of the validity of null. |
Ok, thanks.
In terms of scope of annotations, I think it is good enough.
Now we have to check if all test changes are OK.
test/Transforms/InstCombine/mem-deref-bytes.ll | ||
---|---|---|
76 | Thanks! |
I think I can tag many libc functions without size argument with nonnull args directly in “BuildLibCalls” - better approach.
New patch soon.
Or should we assume if NullPointerIsDefined = true that strlen could work with null pointer?
Anway if libc tags with nonnull, like int memcmp(void * nonnull dst, void * nonnull src, size_t n) this applies to function signature, not to specific callsites so this fix will not work, indeed. We need drop nonnull from signatures in BuildLibCalls.
I think we just doing shitty workarounds and gcc 4.9 optimizes it fully since ptr cannot be null even if size is 0.
And we have sanitizer check to do that:
https://reviews.llvm.org/D9673?id=25492
Jakub Jelinek at GCC Bugzilla:
C says e.g. for memchr:
"The memchr function locates the first occurrence of c (converted to an unsigned
char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s."
and elsewhere:
"If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function."
As memchr first argument must point to an object and NULL does not point to any object, NULL is not a valid value. It is similar to memcpy (NULL, NULL, 0) or memset (NULL, 0, 0) etc. being invalid.”
Sanitizer warns so I dont know why se should permit UB.
Maybe @aaron.ballman could help us
- whether Jakub Jelinek is right
- whether memcpy/memmove/memset with NULL dst/src pointers is UB (even for size = 0)
memcpy(0,0,0) is UB after all
https://stackoverflow.com/questions/5243012/is-it-guaranteed-to-be-safe-to-perform-memcpy0-0-0
So we can move nonnull annotation to BuildLibCalls.
Take a look at the list archives for the discussion about attribute nonnull and nullable a while ago. The short version is that the library interface specification might have a wildcard clause that NULL pointers are invalid unless explicitly allowed, but it works perfectly fine on any sensible implementation except the ds9k. When the combination of glibc annotations and aggressive gcc optimizations started, it pointlessly broke a lot of code for no good reason.
It is still a UB and UBSanitizer warns. If somebody uses (typical combo) clang and gcc, one may be very surprised why debug/dev (with clang) builds work and release (with gcc) builds not.
While in 2014 or so there was no way to find this UB in codebases, we have now UB Sanitizer. In 2019 there is no excuse "it is a UB but it still kinda work (except gcc, ds9k, ..)". Add simple "if" - problem solved. For general case, anybody can send a paper to specify this behaviour to C committee.
Now, because of "memcpy(NULL, NULL, 0) fans" we cannot optimize null checks (imagine you have safe wrapper (which always check if nonnull ptrs) above standard string functions..) - "no good reason". ?
I see a lot of opportunity from nonnull/dereferenceable annotation so I dont see why we should not to do it (ok, few codes may break - any sane codebase is checked by sanitizers anyway).
Please read the discussion. The general consensus was and likely is that this is at least somewhat bogus in the standard at best and the cost of not artificially breaking code is much higher than the benefit. Yes, GCC made different choices, but that's no excuse.
And nobody wrote a paper to fix “the bogus”.
@jdoerfert what do you think?
As I see, annotating callsites is atm pointless since FunctionAttrs/Attributor cannot propagate it - they do only if “function signature (decl)” contains nonnull. I would continue to work on this current “safe” approach if there is a small promise that Attributor will handle this kind of propagation too :)
@uenoku informed me that Attributor will handle it with a WIP patch which is not yet commited. Good.
So this patch has a sense in terms of global scope. PTAL as I finished here all annotation work.
As I recall, there are two issues. One is: do we add the attributes based on the function names. The second is: What to do with the attributes that already exist in, e.g., some glibc header files.
My opinion is that:
- We should definitely have this optimization capability
- But we should have this after we filter out the annotations from memcpy/memset (etc.) in Clang (when the size is dynamic or zero). There should be a flag to turn off the filtering, but at least for now, it should be on by default.
- We should have a function attribute reflecting the same filtering setting to appropriately control adding the attributes in the optimizer.
Yes, NULL pointers with memcpy/memset are UB from the standards perspective, but it's unclear to me that the value in exploiting this UB outweighs the potential problems introduced. While I generally agree with the viewpoint that says that people should use sanitizers, etc., and believe that they should in this case as well, because memcpy/memset are *so* common, because there's a wide-spread and reasonable disagreement as to whether this behavior should be undefined at all, and because there's no strong and broad case for performance improvements (unlike, for example, exploiting the lack of signed-integer overflow), I think that we should filter.
do we add the attributes based on the function names.
I dont see this as a problem - reserved names.
What to do with the attributes that already exist in, e.g., some glibc header files.
Can you point me on real example/OS/glibc version where this actually happens?
I have Ubuntu 18 LTS and glibc and I see no such attributes in IR.
But we should have this after we filter out the annotations from memcpy/memset (etc.) in Clang (when the size is dynamic or zero)
I dont think we want add/remove attributes in Clang based on size argument. It seems just bad to look at function names in Clang and go thru Clang AST to inspect size argument.
Either drop all and possibly add it back here in instcombine or leave it and drop it in eg. BuildLibCalls?
- these are libc functions and almost all C code is compiled by gcc, I think if the code triggered UB, gcc 4.9 and newer already exploited this UB -> UB releaved by “miscompile” or GCC UB sanitizer
- small percentange of C code which is compiled only by clang is hopefully tested with sanitizer -> UB revealed
On the other side the cost of patching Clang and llvm (+ maintenance) to keep this UB working in the user code (I would be very interested to see how many % of all code it could broken in 2019 - I believe very very small percentange) seems quite high.
The glibc header files have the attributes, AFAIKT: https://github.com/bminor/glibc/blob/master/string/string.h - and this looks like the header on the Fedora 27 system that I just checked.
/* Copy N bytes of SRC to DEST. */ extern void *memcpy (void *__restrict __dest, const void *__restrict __src, size_t __n) __THROW __nonnull ((1, 2));
If you're not seeing the attributes in the IR, and I'm not seeing them either, we might just not be adding them to avoid this issue. I do see them in Clang's AST:
|-FunctionDecl x </usr/include/string.h:42:1, /usr/include/sys/cdefs.h:289:63> /usr/include/string.h:42:14 used memcpy 'void *(void *restrict, const void *restrict, size_t)' extern | |-ParmVarDecl x <col:22, col:39> col:39 __dest 'void *restrict' | |-ParmVarDecl x <col:47, col:70> col:70 __src 'const void *restrict' | |-ParmVarDecl x <line:43:8, col:15> col:15 __n 'size_t':'unsigned long' | |-NoThrowAttr x </usr/include/sys/cdefs.h:55:35> | `-NonNullAttr x <line:289:44, /usr/include/string.h:43:44> 1 2
You seem to be assuming that it would be obvious that a given program would be negatively affected by this, and I don't think this would be obvious at all. Sanitizers are great tools for debugging problems, but they're dynamic tools, so they only test the code paths and the parts of the configuration space that the instrumented program actually executes. Best practice may be to fuzz test all code with sanitizers, which can provide more confidence in finding problems like this, but only a very small percentage of the code that exists has been tested in this way (or has been subjected to any suitable, effective static analysis).
I look at this as something about which reasonable people can disagree. No one really has data on this, AFAIK, and so I just have my judgement. I suspect that the percentage of affected programs here is small, but not tiny. memcpy is a very-commonly-used function, and so I suspect we're in a situation where a small (but not tiny) percentage of a very large amount of code yields a large number of affected programs.
If Clang already drops it [0], that is good, then this patch is totally fine.
[0] @aaron.ballman, @rsmith, @rjmcall - can you tell us more what Clang actually does here?
If Clang already drops it [0], that is good, then this patch is totally fine.
But this patch would add the nonnull attributes back, no? I'm not sure that's fine.
We should ditch “removeNonNull” handling since Clang never annotates callsites (Clang only annotates function signatures - currently nonnull is dropped, great!)
What do you think, @jdoerfert ? removeNonNull sometimes caused instcombine’s infinite loop + i think we dont need removeNonNull at all.
I would like to move forward here..
Sorry for the delay but it will take me a while to look through all the changes and make sure we did not overlook sth.
Some minor nits, two questions that you could clarify or resolve, other than that I think this is fine.
lib/Transforms/Utils/SimplifyLibCalls.cpp | ||
---|---|---|
582 | Somewhere above and somewhere below you copy all attributes, here only the param 0 attributes, is there a reason why? | |
1036 | Leftover call. | |
2647 | Move to the beginning of the function, puts has often no users (IMHO). | |
3062 | Is there a reason this is not happening for the two functions below? | |
test/Transforms/InstCombine/strstr-1.ll | ||
91 | Add a newline at the end. |
Btw
We should ditch “removeNonNull” handling since Clang never annotates callsites (Clang only annotates function signatures - currently nonnull is dropped, great!)
I will go this way and remove it. removeNonNull may pessimize things / create compiler infinite loops. Some other may infer that pointer X is nonnull and then InstCombine would remove it in strncpy(D, X, N); // non const N
Imagine
%add.ptr = getelementptr inbounds i8, i8* %string_literal, i64 1
%call2 = call i64 @strlen(i8* nonnull dereferenceable(1) %string_literal) #7 %sub3 = add i64 %call2, -2 %call4 = call i8* @strncpy(i8* nonnull %call1, i8* nonnull %add.ptr, i64 %sub3) #8
This is IR which cause instcombine infinite loop. InstCombine sees that sub3 is not "known zero" and will remove nonnull from add.ptr. As I noted, I am not worried if we remove "removeNonNull", since Clang does not annotate call sites, Clang just emit function signatures (and these do not contain "nonnulls").
Hi @jdoerfert
Do you plan to adjust Attributor's place in opt pipeline? Attributor must run after InstCombine (too) to propagate nonull/deref attributes from callsites to function arguments.
Any reason you do not use isKnownNonZero ?