This is an archive of the discontinued LLVM Phabricator instance.

[SCEVExp] Advance found insertion point until we find a non-dbg instruction.
ClosedPublic

Authored by fhahn on Jun 7 2018, 4:37 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

fhahn created this revision.Jun 7 2018, 4:37 AM
vsk added subscribers: hsaito, debug-info.EditedJun 7 2018, 8:40 AM

Thanks for doing this. Do you think we have the same bug here w.r.t lifetime markers, assume intrinsics, etc? If so, maybe this patch would be the right vehicle to add something like Instruction::isMetaInstruction, as suggested by Hideki in a previous thread.

As far as testing is concerned, I’ve started running a script to check for debug info IR generation differences in the LLVM test suite. It could be worthwhile to check in a specific regression test, but at the moment this is a bit cumbersome to write. See this commit for an example: https://github.com/llvm-mirror/llvm/commit/8c3bbfcc15b89aad062e9b6e9251ce5666a43427

(Edited because I hit send too early.)

fhahn added a comment.Jun 7 2018, 9:24 AM

Yep, I think the other meta instructions will cause problems here (and other places) too. I can add a test case for this patch to the loop vectorizer tests, which exposed this problem.

Didn't we recently add a more convenient mechanism to determine all meta-instructions? Or am I misremembering and that was MIR?

Not sure this is the best fix. The function is named findInsertPointAfter; from the name, I would expect it to behave similarly to BasicBlock::getFirstInsertionPt (which does not try to move past debug info).

We added instructionsWithoutDebug recently to deal with this sort of issue, but it's not directly usable here.

lib/Analysis/ScalarEvolutionExpander.cpp
95 ↗(On Diff #150297)

This handling isn't consistent with the handling for landing pads/etc.

fhahn added a comment.Jun 7 2018, 12:49 PM

Not sure this is the best fix. The function is named findInsertPointAfter; from the name, I would expect it to behave similarly to BasicBlock::getFirstInsertionPt (which does not try to move past debug info).

We added instructionsWithoutDebug recently to deal with this sort of issue, but it's not directly usable here.

Yeah, maybe we should have a function that takes an iterator and advances the iterator to the first non debug/meta instruction?

spatel added a subscriber: spatel.Jun 7 2018, 3:23 PM
vsk added a comment.Jun 8 2018, 1:15 PM

Not sure this is the best fix. The function is named findInsertPointAfter; from the name, I would expect it to behave similarly to BasicBlock::getFirstInsertionPt (which does not try to move past debug info).

We added instructionsWithoutDebug recently to deal with this sort of issue, but it's not directly usable here.

Yeah, maybe we should have a function that takes an iterator and advances the iterator to the first non debug/meta instruction?

Yes, I think it makes sense to introduce this. It might also be convenient to add an analogous method to Instruction to replace some uses of getNextNode().

lib/Analysis/ScalarEvolutionExpander.cpp
95 ↗(On Diff #150297)

Would it be correct to write "IP = advancePastNonAffectingInsts(IP)", everywhere we see "++IP" here?

davide added a subscriber: davide.Jun 8 2018, 1:49 PM

Not sure this is the best fix. The function is named findInsertPointAfter; from the name, I would expect it to behave similarly to BasicBlock::getFirstInsertionPt (which does not try to move past debug info).

We added instructionsWithoutDebug recently to deal with this sort of issue, but it's not directly usable here.

Yeah, maybe we should have a function that takes an iterator and advances the iterator to the first non debug/meta instruction?

I think so, we could also refactor rL334317 to be less cumbersome.
Are you going to work on this or you want me to? [I suspect we're going to find more places where we need the utility so the sooner we get this done, the better].

fhahn added a comment.Jun 8 2018, 2:16 PM

Not sure this is the best fix. The function is named findInsertPointAfter; from the name, I would expect it to behave similarly to BasicBlock::getFirstInsertionPt (which does not try to move past debug info).

We added instructionsWithoutDebug recently to deal with this sort of issue, but it's not directly usable here.

Yeah, maybe we should have a function that takes an iterator and advances the iterator to the first non debug/meta instruction?

I think so, we could also refactor rL334317 to be less cumbersome.
Are you going to work on this or you want me to? [I suspect we're going to find more places where we need the utility so the sooner we get this done, the better].

Something straight forward would be to just add something like the skipDebug below as a static function to BasicBlock, similar to instructionsWithoutDebug

+BasicBlock::iterator BasicBlock::skipDebug(iterator Iter) {
+  while (isa<DbgInfoIntrinsic>(*Iter))
+    Iter++;
+  return Iter;
+}

I am happy to add something like that, but I will only get around to that early next week I guess. Maybe it would be even better to have this helper skip all 'meta' instructions that are not supposed to have an impact on codegen and also adopt instructionsWithoutDebug. Otherwise those issues will just keep popping up for various kinds of such instructions.

fhahn added a comment.Jun 8 2018, 2:21 PM

Or maybe it would be even more convenient to just have a new iterator type, which skips such meta instructions.

vsk added a comment.Jun 8 2018, 3:10 PM

Or maybe it would be even more convenient to just have a new iterator type, which skips such meta instructions.

Presumably there's a lot of pre-existing code which traffics in BasicBlock::iterator. It might be hard to refactor that code to use a different iterator kind.

Something straight forward would be to just add something like the skipDebug below as a static function to BasicBlock

My preference is to have a function in the llvm namespace which advances a BasicBlock::iterator, because I don't think of a BasicBlock as something that can advance an iterator.

Yeah, maybe we should have a function that takes an iterator and advances the iterator to the first non debug/meta instruction?

I think so, we could also refactor rL334317 to be less cumbersome.
Are you going to work on this or you want me to? [I suspect we're going to find more places where we need the utility so the sooner we get this done, the better].

I'm going on a week-long vacation starting tomorrow, so I'm happy to have someone else push this forward :).

Skimming through the LangRef, here are the intrinsics which should be non-affecting: llvm.donothing, llvm.dbg.*, and llvm.{var, ptr, }.annotation.*. I'm skeptical of adding lifetime markers and llvm.assume to this list, because these intrinsics have behaviors that optimizations may want to consider.

davide added a comment.Jun 8 2018, 3:34 PM

Or maybe it would be even more convenient to just have a new iterator type, which skips such meta instructions.

The reason I don't necessarily want to go that route is that people now have to think about which iterator they should use when walking instructions in a BB.
I think it creates more confusion than it helps. Your other idea is fine to me, but I'd love to hear from others (Eli/Vedant/Adrian etc.)

fhahn updated this revision to Diff 151344.Jun 14 2018, 6:54 AM
fhahn retitled this revision from [SCEVExpander] Ignore dbg info when finding insertion point. to [SCEVExp] Advance found insertion point until we find a non-dbg instruction..

I've updated the patch to add a skipDebugInstructionsForward utility. We have a very similar function defined in MachineBasicBlock.h, maybe we should consolidate those functions?

With this approach, there is one scenario where we could still get different codegen depending on debug info if there are debug intrinsics before a funclet/landingpad instruction.

I've updated the patch to add a skipDebugInstructionsForward utility. We have a very similar function defined in MachineBasicBlock.h, maybe we should consolidate those functions?

The MachineBasicBlock.h function operates on MachineInstrs the one in BasicBlock operates on Instructions. I'm not sure how to unify these: perhaps by adding a skipForwardWhile(std::function(bool() shouldSkip) primitive to ilist?

Probably overkill, but I'd imagine the way to do this would be a trait
class that implements the two different tests depending on the type of the
node. (I guess a generic "isDebugInstruction" function that uses that trait
could be a handy general purpose tool to have, maybe? (so you don't have to
think so hard about different way s of testing whether something is a debug
instruction in MC versus IR))

fhahn added a comment.Jun 14 2018, 2:54 PM

I like the trait suggestion! There potentially are a few useful helper functions we provide, and using a trait we could provide them in a generic way for both IR and machine instructions & co. As it is now, I am just slightly worried that we will end up with 2 different but quite similar APIs for dealing with debug info in IR/machine instructions.

vsk added inline comments.Jun 18 2018, 2:18 PM
include/llvm/IR/BasicBlock.h
440 ↗(On Diff #151344)

I think this should skip llvm.ptr.annotation, llvm.var.annotation, etc. as well.

Stepping back a bit, I shouldn't have suggested adding a 'skip meta instruction' API as a part of your SCEVExpander bug fix. IMO it would be cleaner to split the new API out, add a unit test, and incorporate some light refactoring along with it.

Would you mind if I started a separate review to add this API? I had a patch for this prepped before I went out-of-office, but didn't have the time to start a review then. I'd be happy to take a stab at this.

mattd added a subscriber: mattd.Jun 18 2018, 2:31 PM

Didn't we recently add a more convenient mechanism to determine all meta-instructions? Or am I misremembering and that was MIR?

Perhaps you are thinking about MachineInstr::isMetaInstruction?
https://llvm.org/doxygen/MachineInstr_8h_source.html#l00918

fhahn added inline comments.Jun 18 2018, 2:50 PM
include/llvm/IR/BasicBlock.h
440 ↗(On Diff #151344)

Would you mind if I started a separate review to add this API? I had a patch for this prepped before I went out-of-office, but didn't have the time to start a review then. I'd be happy to take a stab at this.

Not at all, that would be great. I will probably only be able to pick this patch up again in a couple of days.

vsk added inline comments.Jun 19 2018, 4:49 PM
include/llvm/IR/BasicBlock.h
440 ↗(On Diff #151344)

Great, 'rL335083: [IR] Introduce helpers to skip debug instructions (NFC)' has landed, hope that helps.

fhahn updated this revision to Diff 152287.Jun 21 2018, 6:56 AM

Update to use skipDebugInfo introduced in rL335083

efriedma accepted this revision.Jun 21 2018, 12:02 PM

LGTM assuming you're planning to regression-test this somehow.

This revision is now accepted and ready to land.Jun 21 2018, 12:02 PM
vsk added a comment.Jun 21 2018, 12:07 PM

LGTM assuming you're planning to regression-test this somehow.

I'm planning on verifying the change. The testing process is roughly:
$ find test/Transforms -name \*.ll | parallel opt-check-dbg-invar.sh $OPT {} -loop-vectorize

lib/Analysis/ScalarEvolutionExpander.cpp
105 ↗(On Diff #152287)

Why not skip debug instructions here, instead? It should make less work for callers of findInsertPointAfter, right?

95 ↗(On Diff #150297)

I suppose this wouldn't be useful, because dbg.values aren't allowed to occur before phis / eh instructions.

fhahn updated this revision to Diff 152676.Jun 25 2018, 8:38 AM

Added test case.

fhahn added inline comments.Jun 25 2018, 8:39 AM
lib/Analysis/ScalarEvolutionExpander.cpp
105 ↗(On Diff #152287)

IIRC Eli suggested to not do it in this function, as it should behave similar to BasicBlock::getFirstInsertionPt , which also does not skip debug info.

vsk accepted this revision.Jun 25 2018, 9:47 AM

LGTM, thanks!

This revision was automatically updated to reflect the committed changes.
fhahn added a comment.Jun 26 2018, 7:06 AM

I had to revert this, as there is crash in relwithdebug builds, presumably because insertion points get out of sync. I'll do some digging.