https://llvm.org/bugs/show_bug.cgi?id=31142 :
SROA was dropping the nonnull metadata on loads from allocas that got optimized out. This patch simply preserves nonnull metadata on loads through SROA and mem2reg.
Differential D27114
Preserve nonnull metadata on Loads through SROA & mem2reg. luqmana on Nov 24 2016, 2:23 PM. Authored by
Details https://llvm.org/bugs/show_bug.cgi?id=31142 : SROA was dropping the nonnull metadata on loads from allocas that got optimized out. This patch simply preserves nonnull metadata on loads through SROA and mem2reg.
Diff Detail
Event TimelineComment Actions I don't get it, this patch looks like it is *removing* SROA's existing logic to propagate metadata? And it doesn't add any new test cases, just deletes an existing test? Comment Actions Could you also preserve !range metadata? It behaves similarly to nonzero, and inttoptr/ptrtoint interchanges them.
Comment Actions
Err, no, you're probably right; I haven't really worked with it, and LangRef doesn't say anything, so "must" defaults to undefined behavior. You still can't propagate backwards without proving post-dominance. Comment Actions That should hold in this case no? Lines 361-381 assert that LI is dominated by the Store and the Store trivially dominates ReplVal? Comment Actions
Post-dominates, not dominates... they're sort of similar, but post-dominance involves following the control flow edges the other way. A compiler textbook should cover this. Comment Actions Address comments from review.
Comment Actions PostDominatorTree doesn't provide the kind of post-dominance you need for this transformation: LLVM basic blocks have implicit edges which exit the function early, and those edges aren't reflected in the tree. For example, execution doesn't actually continue after a call to exit(). (See also the llvm::isGuaranteedToTransferExecutionToSuccessor() utility.) Comment Actions Hmm, is it possible then to do this sort of analysis in LLVM in this context? Or is this optimization just not possible currently? Comment Actions I don't think there's any existing analysis which computes the sort of post-dominator tree you want. Anyway, I'm not sure this is really what you want... have you experimented with inserting llvm.assume calls? Comment Actions The problem wasn't that. We do have llvm.assume which InstCombine turns assume( (load addr) != null ) to a load addr !nonnull. Then, SROA/mem2reg end up eating the !nonnull metadata. Some more details at https://github.com/rust-lang/rust/issues/37945 which was linked in the bug link in the description. Comment Actions
You can do the reverse transform in mem2reg: load addr !nonnull -> assume( (load addr) != null ), then when the load gets erased the nonnull assumption is preserved. Comment Actions Or we could stop instcombine from doing this in the first place. I think we should canonicalize on the assume formulation as it is strictly more general than the nonnull metadata formulation. Comment Actions For this change I just implemented @efriedma's suggestion. I think changing how nonnull-ness is canonicalized is outside the scope of this. Comment Actions Could you clean up the formatting to match LLVM coding standards? At the very least clang-format would help here (lots of lines over 80 columns, indent looks all wrong).
Comment Actions Thanks for the changes. Someone else can correct me if I'm wrong, but I think we're delaying improvements in nonnull optimization until we have a clang hack to avoid disastrous optimizations of libc functions: Comment Actions Yes, that thread seems largely in favor of getting something in place in Clang first. If no one else works on a patch to Clang I may in a week or so, but sadly my queue is very full at the moment. Comment Actions I don't think that's relevant to this patch. Once we have the nonnull in metadata form, we should preserve it. The particular problem with optimizing based on nonnull function-argument annotations is propagating the nonnull annotation from a callsite back into the caller. For that, we should wait on associated Clang updates (so that we can avoid doing it for memcpy and friends based on header-file annotations). This change should proceed. Comment Actions Bunch of little comments below. I'd like to see two high level thinsg before this goes in though:
Comment Actions Updated the patch to address the little comments. As for the second point, I think that's better handled in a separate patch. If Eli is happy with the current approach I'd like to move forward with just this patch. Comment Actions
Semantically, this seems consistent with existing code dealing with !nonnull. That said, creating calls to @llvm.assume has the potential to cause performance regressions; there are a bunch of places in the optimizer which don't handle them well, so it's probably worth checking that this change doesn't create more problems than it solves. Comment Actions Are there any existing performance tests so I can determine if this might cause a regression? Comment Actions Well, you can run the LLVM testsuite (http://llvm.org/docs/lnt/quickstart.html)... but that's written entirely written in C and C++, and I don't think clang produces nonnull metadata at all. Maybe you have some Rust performance tests you could use? Comment Actions Apologies if this isn't directly related to this patch, but one thing I noticed and mentioned in D28337: InstCombine is not calling into InstSimplify (and hence value tracking) with a context instruction in many cases. And so computeKnownBitsFromAssume() is not triggered in those cases. If we fix that (assuming that the lack of context instruction optional parameter is an oversight), we should make better use of the assumption machinery. Comment Actions
I was more referring to the fact than m_OneUse etc. doesn't work the way you want it to in the presence of llvm.assume calls. Comment Actions Hmmm, I recall us not using as much assume due to supposed LLVM slowdowns but don't have any on hand. Comment Actions So at least running the test suite I didn't come across any performance regressions. Could we land this and observe the buildbots or something for any regressions and back out as necessary?
Comment Actions I'm not familiar with this code, but I'd like to see this functionality, so any assistance in reviewing is appreciated.
Comment Actions Hal's earlier comment said we don't need to hold this up for clang, but for reference that fix is proposed now: @chandlerc / @davide / @efriedma / @hfinkel - can you continue your review? I don't know enough to approve. Comment Actions I was hoping Chandler would continue to review, but I guess I can do it. The test here is really bare-bones; it doesn't really cover all the relevant codepaths. It doesn't check the effect of the isKnownNonNullAt, and it doesn't cover one of the codepaths in PromoteMemoryToRegister.cpp at all. Otherwise looks fine per earlier discussion. Comment Actions FWIW, I'm happy to continue it or for you to continue it. I was planning to GC these threads when the clang patch actually landed, but if you can get to them sooner, so much the better.
FWIW, I think this really needs to be addressed before it lands. Comment Actions I added another test for all the relevant codepaths in PromoteMemoryToRegister. Comment Actions LGTM. Luqman, thanks for putting up with the long delay in reviewing this. Do you have commit access, or should I commit this for you? |
Is the intent to extend this for some subset of metadata types beyond just MD_nonnull?
Either way, just make this one line for now?
NewLI->copyMetadata(LI, LLVMContext::MD_nonnull);