This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Record IR flags on VPWidenRecipe directly (NFC).
ClosedPublic

Authored by fhahn on Apr 24 2023, 10:26 AM.

Details

Summary

This patch introduces a VPIRFlags class to record various IR flags. This
allows de-coupling of IR flags from the underlying instructions. The
main benefit is that it allows dropping of IR flags from recipes
directly, without the need to go through
State::MayGeneratePoisonRecipes. The plan is to remove
MayGeneratePoisonRecipes once all relevant recipes are transitioned.

It also allows dropping IR flags during VPlan-to-VPlan transforms, which
will be used in a follow-up patch to implement truncateToMinimalBitwidths
as VPlan-to-VPlan transform.

Diff Detail

Event Timeline

fhahn created this revision.Apr 24 2023, 10:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 24 2023, 10:26 AM
fhahn requested review of this revision.Apr 24 2023, 10:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 24 2023, 10:26 AM
Ayal added inline comments.May 3 2023, 1:26 PM
llvm/lib/Transforms/Vectorize/VPlan.h
954

Record which of the three operators is recorded, to ensure appropriate flags are accessed?
Initialize all to off?

972

Ensuring the generated code will be treated as if via Instruction::dropPoisonGeneratingFlags().
Can instead keep a flag in VPIRFlags indicating that such flags should be dropped, instead of replicating that logic here.

992

Better than having VPIRFlags hold FastMathFlags as an internal struct/class/union?

997

This replaces Instruction::copyIRFlags(const Value *V, bool IncludeWrapFlags = true) which sets the flags of "this" according to those of V, whether on or off, but does so in the opposite direction - setting flags of I according to those of "this", as in:

if (isa<OverflowingBinaryOperator>(I)) {
  I->setHasNoSignedWrap(hasNoSignedWrap());
  I->setHasNoUnsignedWrap(hasNoUnsignedWrap());
} else if (isa<PossiblyExactOperator>(I)) {
  I->setIsExact(isExact());
} else if (isa<FPMathOperator>(I)) {
  I->copyFastMathFlags(getFastMathFlags());
}

where each clause assumes or checks that VPIRFlags also records the same operator.
But it does so by only turning-on in I every flag that is turned-on in VPIRFlags - assuming are all initialized off in both VPIRFlags and I?

The current approach could use each turn-on flag to indicate which operator is recorded, i.e., could do

if (isExact())
  I->setIsExact();
else if (getFastMathFlags().any())
  I->setFastMathFlags(getFastMathFlags());
else {
  if (hasNoUnsignedWrap())
    I->setHasNoUnsignedWrap();
  if (hasNoSignedWrap())
    I->setHasNoSignedWrap();
}

(just to clarify understanding.)

(Note that GEPs should record an inBounds flag.)

1052–1055

Should the recipe "have a" VPIRFlags rather than "be a" VPIRFlags?

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
640

Retain comment somewhere?

fhahn updated this revision to Diff 519519.May 4 2023, 9:14 AM
fhahn marked 6 inline comments as done.

Address latest comments, thanks!

fhahn added inline comments.May 4 2023, 9:24 AM
llvm/lib/Transforms/Vectorize/VPlan.h
954

Done. It looks like there's no easy way to check if an opcode on its own is a overflowing/possibly exact/FP op, so I added a helper enum class.

972

I am not sure how this would work when we don't have an Instruction available from which to drop.

We could record a new flag when calling dropPoisonGeneratingFlags on a recipe, but this would then also require to call another function to call Instruction::dropPoisonGeneratingFlags() during code-gen. We still need to look up the flags to set during code-gen, so it seems cleaner to properly unset them here.

Also when thinking about printing the flags as part of the VPlan printing (which isn't the case yet)

992

This is mostly to keep keep things tightly packed, but I can change that.

997

Updated using the recorded type of the op, thanks!

1052–1055

The main benefit of inheriting is to make the common transferFlags and dropPoisonGeneratingFlags available. But the name VPIRFlags is not ideal for inheriting.. I changed it to VPOpWithIRFlags, but can also change the VPWidenRecipe (and others transitioned in future patches) to hold an instance instead.

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
640

I am not sure where a good place would be. I am not sure it is necessary, because now it it is sufficient to directly drop the flags in collectPoisonGeneratingRecipes.

fhahn updated this revision to Diff 520049.May 6 2023, 2:12 AM

Change to VPRecipeWithIRFlags inherting from VPRecipeBase and add classof implementation to allow for more generic invalidation (dyn_cast<VPRecipeWithIRFlags>) which simplifies follow-on patches that extend invalidation to additional recipes.

Ayal added inline comments.May 7 2023, 8:10 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1077

Update comment - recipes that support flags can drop them now rather than be collected to have them dropped later. This suggests that recipeBase should indicate if it supports flags or not? In the long run, any recipe for an underlying having poison generating flags should be capable of dropping them, right?

llvm/lib/Transforms/Vectorize/VPlan.h
954

An alternative may be to keep the OpType next to its flags, retaining a simple bit vector mask, which could be reset by a single zeroing of a short/int, as in:

unsigned HasWrapFlags : 1;
unsigned HasNUW : 1;
unsigned HasNSW : 1;

unsigned HasFastMathFlags : 1;
unsigned AllowReassoc : 1;
unsigned NoNaNs : 1;
unsigned NoInfs : 1;
unsigned NoSignedZeros : 1;
unsigned AllowReciprocal : 1;
unsigned AllowContract : 1;
unsigned ApproxFunc : 1;

unsigned HasInBoundFlag : 1;
unsigned HasInBound : 1;

where "Other" corresponds to all "Has*Flags" (or everything) being false.

972

Agreed: better drop poison generating flags eagerly rather than record a dropPoison flag and (continuing to) drop them lazily.

A main motivation for trying to reuse existing FMF struct and dropPoisonGeneratingFlags() API rather than reimplement them is to help keep the two in sync - which deserves a comment.

TL;DR:
The Instruction whose flags needs to drop will be available when generated, as the concluding part of transfer/copy flags, as done now.

Admittedly the flags still need to be recorded and looked up.

If a dropPoison flag is recorded, it should effect isExact() and hasNo*Wrap() for them to be consistent, so may as well flip their bit, or else inline them into transferFlags() - but they may be needed in the future, including by VPlan printing.

992

Keeping things tightly packed looks great.

A union OTOH would emphasize that these are mutually exclusive alternatives, and also allow initializing all to zero at once.

The FMF flags are currently used only as a whole - to record and apply all using setFastMathFlags(getFastMathFlags()). I.e., access to individual flags is not needed nor provided. If/when such access is needed, it could and should be provided via getFastMathFlags(), as that provides their full API, but currently requires to first construct all FMFs, redundantly.
WDYT?

1007

Can drop only relevant flags according to OpType.

1044

This conceptually "sets" the IR flags of I rather than only adding those enabled? The two are identical iff I has (initially) all its flags unset. Cf. copyIRFlags().

1052–1055

Class essentially holds IR flags, as in FMF (and more), so original name sounds fine as does having VPWidenRecipe hold an instance.
Worth reasoning about how this aspect would relate to the class hierarchy of recipes now or as more are encountered?
BTW, CmpInst::Predicate "flags" may also be supported in a similar way, cf. D50823

1054

May as well set each flag of I directly here rather than first having getFastMathFlags() set each in FMFs and then copying all to I. Unless these flags are held in an FMF to be set directly.

fhahn updated this revision to Diff 520211.May 7 2023, 2:28 PM
fhahn marked 8 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1077

Added a comment.

With the current system, recipes which have IR flags can be casted to VPRecipeWithIRFlags, similar how IR instructions/operators can be casted to OverflowingBinaryOperator & co. D149082 migrates VPWidenGEPRecipe and D150027 migrates VPRecplicateRecipe and removes MayGeneratePoisonRecipes

llvm/lib/Transforms/Vectorize/VPlan.h
954

Updated to simply include it in the bitfield, thanks!

972

I added a note to dropPoisonGeneratingFlags that it needs to be kept in sync with Instruction::dropPoisonGeneratingFlags.

I think it is very unlikely that one of the existing flags will get re-classified as poison-generating. If a new poison-generating flag gets added to the IR, we will never set this flags unless it is added here. And then it should also be added to dropPoisonGeneratingFlags, so keeping things in sync should not be too much trouble (once all recipes migrated).

Printing is added in D150029. Please let me know if you still would prefer to separately track whether poison-generating flags need to be dropped.

992

Updated to use a union, thanks!

Also updated to set fast-math flags individually. I can also add individual getters.

1007

Updated, thanks!

1044

Changed the name to setFlags to make it clearer.

1052–1055

The current modeling is similar to OverflwoingBinaryOperator/PossiblyExactOperator in IR, providing a convenient way to check if an object may have flags, allowing for checking via isa/dyn_cast.

IMO CmpInst::Predicate is slightly different to the wrapping flags, as it is more like an (essential) argument, similar to designation type for casts.

1054

updated to set them directly here.

Ayal added a comment.May 8 2023, 1:16 AM

Thanks! Looks good to me, adding last minor nits.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1080
1080–1081
llvm/lib/Transforms/Vectorize/VPlan.h
940
949

Can reuse FastMathFlags, but that would complicate initialization and turn the union from char to int?

967

Better as unsigned char?

976

(0 >> false)
Ah, the purpose of having Has*Flags as indicator bits was to embed them with all other bits so that all could be initialized with one zeroing assignment, and they could direct relate to the bits they cover by placing them next to each other. Now an enum may (again) be preferred, referring to named structs, and possibly switched upon.

1034

These getters can be dropped after inlining them into their only use below in setFlags(), similar to how the latter accesses FMFs flags - directly w/o getters.

1036

This getFastMathFlags() is now useless and can be dropped.
May be needed in the future, at which point it should also be tested.

1049
fhahn updated this revision to Diff 520304.May 8 2023, 2:01 AM
fhahn marked 9 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1080

Adjusted, thanks!

1080–1081

Adjusted. thanks!

llvm/lib/Transforms/Vectorize/VPlan.h
940

Adjusted, thanks!

949

Yeah I think that would make things more complicated unfortunately and unnecessarily increase the size for now.

967

Updated, thanks!

976

Switched back to the enum and updated code to use switch.

1034

Yep, removed them.

1036

Yep, removed!

1049

updated, thanks!

Ayal accepted this revision.May 8 2023, 2:50 AM

Ship it!

This revision is now accepted and ready to land.May 8 2023, 2:50 AM
This revision was automatically updated to reflect the committed changes.