This avoids creating unnecessary casts if the IP used to be a dbg info
intrinsic. Fixes PR37727.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Thanks for doing this. Do you think we have the same bug here w.r.t lifetime markers, assume intrinsics, etc? If so, maybe this patch would be the right vehicle to add something like Instruction::isMetaInstruction, as suggested by Hideki in a previous thread.
As far as testing is concerned, I’ve started running a script to check for debug info IR generation differences in the LLVM test suite. It could be worthwhile to check in a specific regression test, but at the moment this is a bit cumbersome to write. See this commit for an example: https://github.com/llvm-mirror/llvm/commit/8c3bbfcc15b89aad062e9b6e9251ce5666a43427
(Edited because I hit send too early.)
Yep, I think the other meta instructions will cause problems here (and other places) too. I can add a test case for this patch to the loop vectorizer tests, which exposed this problem.
Didn't we recently add a more convenient mechanism to determine all meta-instructions? Or am I misremembering and that was MIR?
Not sure this is the best fix. The function is named findInsertPointAfter; from the name, I would expect it to behave similarly to BasicBlock::getFirstInsertionPt (which does not try to move past debug info).
We added instructionsWithoutDebug recently to deal with this sort of issue, but it's not directly usable here.
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
95 ↗ | (On Diff #150297) | This handling isn't consistent with the handling for landing pads/etc. |
Yeah, maybe we should have a function that takes an iterator and advances the iterator to the first non debug/meta instruction?
Yes, I think it makes sense to introduce this. It might also be convenient to add an analogous method to Instruction to replace some uses of getNextNode().
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
95 ↗ | (On Diff #150297) | Would it be correct to write "IP = advancePastNonAffectingInsts(IP)", everywhere we see "++IP" here? |
I think so, we could also refactor rL334317 to be less cumbersome.
Are you going to work on this or you want me to? [I suspect we're going to find more places where we need the utility so the sooner we get this done, the better].
Something straight forward would be to just add something like the skipDebug below as a static function to BasicBlock, similar to instructionsWithoutDebug
+BasicBlock::iterator BasicBlock::skipDebug(iterator Iter) { + while (isa<DbgInfoIntrinsic>(*Iter)) + Iter++; + return Iter; +}
I am happy to add something like that, but I will only get around to that early next week I guess. Maybe it would be even better to have this helper skip all 'meta' instructions that are not supposed to have an impact on codegen and also adopt instructionsWithoutDebug. Otherwise those issues will just keep popping up for various kinds of such instructions.
Or maybe it would be even more convenient to just have a new iterator type, which skips such meta instructions.
Presumably there's a lot of pre-existing code which traffics in BasicBlock::iterator. It might be hard to refactor that code to use a different iterator kind.
Something straight forward would be to just add something like the skipDebug below as a static function to BasicBlock
My preference is to have a function in the llvm namespace which advances a BasicBlock::iterator, because I don't think of a BasicBlock as something that can advance an iterator.
Yeah, maybe we should have a function that takes an iterator and advances the iterator to the first non debug/meta instruction?
I think so, we could also refactor rL334317 to be less cumbersome.
Are you going to work on this or you want me to? [I suspect we're going to find more places where we need the utility so the sooner we get this done, the better].
I'm going on a week-long vacation starting tomorrow, so I'm happy to have someone else push this forward :).
Skimming through the LangRef, here are the intrinsics which should be non-affecting: llvm.donothing, llvm.dbg.*, and llvm.{var, ptr, }.annotation.*. I'm skeptical of adding lifetime markers and llvm.assume to this list, because these intrinsics have behaviors that optimizations may want to consider.
The reason I don't necessarily want to go that route is that people now have to think about which iterator they should use when walking instructions in a BB.
I think it creates more confusion than it helps. Your other idea is fine to me, but I'd love to hear from others (Eli/Vedant/Adrian etc.)
I've updated the patch to add a skipDebugInstructionsForward utility. We have a very similar function defined in MachineBasicBlock.h, maybe we should consolidate those functions?
With this approach, there is one scenario where we could still get different codegen depending on debug info if there are debug intrinsics before a funclet/landingpad instruction.
The MachineBasicBlock.h function operates on MachineInstrs the one in BasicBlock operates on Instructions. I'm not sure how to unify these: perhaps by adding a skipForwardWhile(std::function(bool() shouldSkip) primitive to ilist?
Probably overkill, but I'd imagine the way to do this would be a trait
class that implements the two different tests depending on the type of the
node. (I guess a generic "isDebugInstruction" function that uses that trait
could be a handy general purpose tool to have, maybe? (so you don't have to
think so hard about different way s of testing whether something is a debug
instruction in MC versus IR))
I like the trait suggestion! There potentially are a few useful helper functions we provide, and using a trait we could provide them in a generic way for both IR and machine instructions & co. As it is now, I am just slightly worried that we will end up with 2 different but quite similar APIs for dealing with debug info in IR/machine instructions.
include/llvm/IR/BasicBlock.h | ||
---|---|---|
440 ↗ | (On Diff #151344) | I think this should skip llvm.ptr.annotation, llvm.var.annotation, etc. as well. Stepping back a bit, I shouldn't have suggested adding a 'skip meta instruction' API as a part of your SCEVExpander bug fix. IMO it would be cleaner to split the new API out, add a unit test, and incorporate some light refactoring along with it. Would you mind if I started a separate review to add this API? I had a patch for this prepped before I went out-of-office, but didn't have the time to start a review then. I'd be happy to take a stab at this. |
Perhaps you are thinking about MachineInstr::isMetaInstruction?
https://llvm.org/doxygen/MachineInstr_8h_source.html#l00918
include/llvm/IR/BasicBlock.h | ||
---|---|---|
440 ↗ | (On Diff #151344) |
Not at all, that would be great. I will probably only be able to pick this patch up again in a couple of days. |
I'm planning on verifying the change. The testing process is roughly:
$ find test/Transforms -name \*.ll | parallel opt-check-dbg-invar.sh $OPT {} -loop-vectorize
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
105 ↗ | (On Diff #152287) | Why not skip debug instructions here, instead? It should make less work for callers of findInsertPointAfter, right? |
95 ↗ | (On Diff #150297) | I suppose this wouldn't be useful, because dbg.values aren't allowed to occur before phis / eh instructions. |
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
105 ↗ | (On Diff #152287) | IIRC Eli suggested to not do it in this function, as it should behave similar to BasicBlock::getFirstInsertionPt , which also does not skip debug info. |
I had to revert this, as there is crash in relwithdebug builds, presumably because insertion points get out of sync. I'll do some digging.