This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Introduce VPCmpInst sub-class in the instruction-level representation
AbandonedPublic

Authored by dcaballe on Aug 15 2018, 5:12 PM.

Details

Summary

This patch introduces VPCmpInst, a sub-class of VPInstruction used to model details of comparison instructions in VPlan, such as the comparison's predicate. At this point, we don't see the need of distinguishing between integer and floating point comparison at VPlan representation level.

VPCmpInst is needed in D50480 to properly model a new compare VPInstruction (i.e., a compare that is not part of the input IR) generated during the vectorization process.

Diff Detail

Event Timeline

dcaballe created this revision.Aug 15 2018, 5:12 PM
rkruppe added a comment.EditedAug 16 2018, 11:04 AM

The implementation looks good to me. The interface chosen here (directly mirroring CmpInst from the Value hierarchy in the VPValue hierarchy) also seems like the right direction to me. Besides avoiding the problematic concept of "underlying Instructions" altogether, it also gives a convenient place to put any helper functionality that the vectorizer code might want when generating and manipulating such comparisons.

That said, this is currently the only subclass of its kind and it would be weird if it remained that way. In other words, I would expect that VPInstruction would gain more such subclasses (e.g., VPBinaryOperator, VPSelectInst, etc.) as the vectorizer starts to create and manipulate these instructions. Does this make sense to y'all?

lib/Transforms/Vectorize/LoopVectorizationPlanner.h
157

This duplicates the assertion in the other overload that's called in the next line, right? I don't mind that, it's good to be a bit more defensive, just want to make sure I didn't miss anything and that this is intentional.

That said, this is currently the only subclass of its kind and it would be weird if it remained that way. In other words, I would expect that VPInstruction would gain more such subclasses (e.g., VPBinaryOperator, VPSelectInst, etc.) as the vectorizer starts to create and manipulate these instructions. Does this make sense to y'all?

Our general intention is to make VPInstructions as easy to use as Instructions to many LLVM developers, who aren't necessarily very familiar with vectorizer, and also reduce the duplicate development/maintenance effort. If that requires more subclassing, we'll consider doing that but very carefully, and only as needed basis. At this point, letting VPInstruction to have all the functionality of Instruction is not an objective. We are starting from implementing just enough to satisfy vectorizer needs and minimizing unnecessary divergence in doing so (i.e., what's implemented can be used in a very similar manner).

Sorry that I'm not directly answering your question. Hope this helps in evaluating between the two alternatives: new opcode approach in D50480 and subclassing approach in this patch, however. I think subclassing here helps us avoid unnecessary divergence.

Our general intention is to make VPInstructions as easy to use as Instructions to many LLVM developers, who aren't necessarily very familiar with vectorizer, and also reduce the duplicate development/maintenance effort. If that requires more subclassing, we'll consider doing that but very carefully, and only as needed basis. At this point, letting VPInstruction to have all the functionality of Instruction is not an objective. We are starting from implementing just enough to satisfy vectorizer needs and minimizing unnecessary divergence in doing so (i.e., what's implemented can be used in a very similar manner).

Sorry that I'm not directly answering your question. Hope this helps in evaluating between the two alternatives: new opcode approach in D50480 and subclassing approach in this patch, however. I think subclassing here helps us avoid unnecessary divergence.

Thanks, it doesn't answer my question in the exact terms I used, but it hits all the notes I was curious about. I agree with the goals (accessibility to developers, minimizing unnecessary divergence) and also with getting there incrementally and on demand.

dcaballe added inline comments.Aug 17 2018, 9:24 AM
lib/Transforms/Vectorize/LoopVectorizationPlanner.h
157

Thanks, Robin. IIRC, the initial implementation didn't invoke the overload. I agree with what you said. It's better to be defensive, just in case the implementation of this one ends up not invoking the overload again.

In case you missed it, there is some discussion in D50480 regarding this code. Your feedback would be appreciated.

Thanks!
Diego

Jumping from D50480:

This patch aims to model a rather special early-exit condition that restricts the execution of the entire loop body to certain iterations, rather than model general compare instructions. If preferred, an "EarlyExit" extended opcode can be introduced instead of the controversial ICmpULE. This should be easy to revisit in the future if needed.

This patch is fine as is, or rather much better with ICmpULE than EarlyExit.

This patch focuses on modeling an early-exit compare and then generating it, w/o making strategic design decisions supporting future vplan-to-vplan transformations, the interfaces they may need, potential templatization, or other long-term high-level VPlan concerns. These should be explained and discussed separately along with pros and cons of alternative solutions for supporting the desired interfaces and for holding their storage, including subclassing VPInstructions, using detached Instructions, or other possibilities.

Sure. I agree.

[Full disclosure] I have a big mental barrier in accepting your "early-exit" terminology here since I relate that term to "break out of the loop", but that's just the terminology difference. Nothing to do with the substance of this patch. [End of full disclosure]

Regarding "using detached Instructions". I fully go against that because that'll forever prohibit moving the VPlan/VPInstructions into Analysis. IR Verifier will trigger if there is a detached IR Instruction at the end of an Analysis pass. I already had a hallway chat with @lattner about a possibility of using IR Instructions and IR CFG in the detached mode (and that also requires many utilities to be usable in detached mode) and he was totally pessimistic about it. That was two years ago at 2016 Developer Conference, but nothing really has changed since then in that regard. That was the end of my hope for using detached IR Instructions, instead of introducing VPInstructions. Detached Instructions under the hood of VPInstructions is not very useful if we can't keep them between vectorization Analysis pass and vectorization Transformation pass.

Ayal added a comment.Aug 28 2018, 7:45 AM

Jumping from D50480:

This patch aims to model a rather special early-exit condition that restricts the execution of the entire loop body to certain iterations, rather than model general compare instructions. If preferred, an "EarlyExit" extended opcode can be introduced instead of the controversial ICmpULE. This should be easy to revisit in the future if needed.

This patch is fine as is, or rather much better with ICmpULE than EarlyExit.

This patch focuses on modeling an early-exit compare and then generating it, w/o making strategic design decisions supporting future vplan-to-vplan transformations, the interfaces they may need, potential templatization, or other long-term high-level VPlan concerns. These should be explained and discussed separately along with pros and cons of alternative solutions for supporting the desired interfaces and for holding their storage, including subclassing VPInstructions, using detached Instructions, or other possibilities.

Sure. I agree.

[Full disclosure] I have a big mental barrier in accepting your "early-exit" terminology here since I relate that term to "break out of the loop", but that's just the terminology difference. Nothing to do with the substance of this patch. [End of full disclosure]

Regarding "using detached Instructions". I fully go against that because that'll forever prohibit moving the VPlan/VPInstructions into Analysis. IR Verifier will trigger if there is a detached IR Instruction at the end of an Analysis pass. I already had a hallway chat with @lattner about a possibility of using IR Instructions and IR CFG in the detached mode (and that also requires many utilities to be usable in detached mode) and he was totally pessimistic about it. That was two years ago at 2016 Developer Conference, but nothing really has changed since then in that regard. That was the end of my hope for using detached IR Instructions, instead of introducing VPInstructions. Detached Instructions under the hood of VPInstructions is not very useful if we can't keep them between vectorization Analysis pass and vectorization Transformation pass.

(Just for the record, the detached ICmpInst used in https://reviews.llvm.org/D50480?id=161564 passes when verifyModule() is called right after LVP.plan(). If the Undef's it uses are of concern, its operands could be nullified after construction.)

dcaballe abandoned this revision.Aug 21 2019, 12:46 PM

Not needed for now.

Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2019, 12:46 PM
Herald added a subscriber: psnobl. · View Herald Transcript