This is an archive of the discontinued LLVM Phabricator instance.

[TLI][AArch64] Extend ReplaceWithVeclib to replace vector FREM instructions for scalable vectors
AbandonedPublic

Authored by jolanta.jensen on Jul 27 2023, 8:10 AM.

Details

Reviewers

paulwalker-arm
mgabka
huntergr

Summary

This patch teaches the ReplaceWithVeclib pass how to replace
vector FREM instructions for scalable vectors.

Depends on D157258

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	60,040 ms	x64 debian > MLIR.Examples/standalone::test.toy

Event Timeline

jolanta.jensen created this revision. Jul 27 2023, 8:10 AM

Herald added a project: Restricted Project. Jul 27 2023, 8:10 AM

Herald added subscribers: hiraditya, kristof.beyls.

jolanta.jensen requested review of this revision. Jul 27 2023, 8:10 AM

Herald added subscribers: llvm-commits, wangpc, alextsao1999.

Harbormaster completed remote builds in B248579: Diff 544781. Jul 27 2023, 10:55 AM

mgabka added inline comments. Jul 28 2023, 2:13 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
142–153	From what I can see, this is the only part of the code that differs from "replaceWithTLIFunction", so most of the code below is duplicated. I think a smart refactoring of replaceWithTLIFunction would be enough to make it work for both CI and Instructions. The other thing is that the code you wrote duplicates a lot: you can use a suitable container and just add an extra element to it when you are creating the type for the masked function, so that you do not need to create the type twice and add the input types twice.
180	I guess that this is handled by "replaceAllUsesWith", but it is worth checking with a test.
251	This is not needed; the mechanism here should work for both fixed and scalable types if mappings exist. If we want to reject the transformation for scalable vector types, I think we should reject it earlier, i.e. where we detect frem. Please follow the LLVM coding guideline on using braces with simple if statements (https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements); it applies to other places in this patch as well.
253	Hi Jolanta, in my opinion it would be better to have a single main entry point here and then branch from inside replaceWithCallToVeclib. That way you can avoid some of the code duplication, for example the debug messages.
254	When moving this outside this function, this check can be combined with the one above, since the ScalableVectorType class exists.
260–263	The function name "replaceInstructionWithCallToVeclib" does not suggest that this is specific to fmod/frem. At the moment we have no intention to extend it beyond frem, so in my opinion it is worth renaming the function to avoid confusion.
264	Please use an early exit instead. Early exits from a function are much better than a long nested if block because they improve code readability. Please apply this where possible to the other checks you are doing.
272–273	Why look for it in cases where you are not going to use it? It isn't efficient. Please change it.
274–276	This message is actually printed after looking for the mappings, so it should probably be moved up.
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll
414	Having both arguments of frem as inputs to this function would simplify the test; the same applies to the other tests. The other thing is that other tests in this file use fast math flags, so please use them for frem as well.
llvm/test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64.ll
2 ↗	(On Diff #544781)	Could you explain why you decided to modify this test?
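
The early-exit request in the comment on line 264 can be illustrated with a small standalone sketch. This is plain C++ rather than code from the patch, and `Candidate`, `shouldReplaceNested` and `shouldReplaceEarlyExit` are hypothetical names chosen only for illustration; both functions are meant to compute the same answer.

```cpp
#include <string>

// Hypothetical stand-in for the properties the pass checks on an instruction.
struct Candidate {
  bool IsFRem;
  bool IsScalableVector;
  std::string VectorName; // empty when no mapping was found
};

// Nested style: every extra check pushes the real work one level deeper.
bool shouldReplaceNested(const Candidate &C) {
  if (C.IsFRem) {
    if (C.IsScalableVector) {
      if (!C.VectorName.empty()) {
        return true;
      }
    }
  }
  return false;
}

// Early-exit style: each unsupported case is rejected up front.
bool shouldReplaceEarlyExit(const Candidate &C) {
  if (!C.IsFRem)
    return false;
  if (!C.IsScalableVector)
    return false;
  return !C.VectorName.empty();
}
```

The second form keeps the main path at one indentation level, which is presumably the readability gain the reviewer has in mind.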

Addressed review comments.

jolanta.jensen added inline comments. Jul 31 2023, 3:55 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
142–153	Lines 129-134 also differ. Line 77 has an assert that is not valid here. I created small functions for lines 54-66 and 116-127 that are shared. And I fixed the code duplication above.
180	Indeed, it is handled by "replaceAllUsesWith": void Value::replaceAllUsesWith(Value *New) { doRAUW(New, ReplaceMetadataUses::Yes); } void Value::doRAUW(Value *New, ReplaceMetadataUses ReplaceMetaUses) { ... if (ReplaceMetaUses == ReplaceMetadataUses::Yes && isUsedByMetadata()) ValueAsMetadata::handleRAUW(this, New); ... Do we need a test to confirm, or is this explanation good enough?
251	There are no mappings for frem with fixed vectors in the SLEEF library, so we need to check that it's a scalable type. I'll check the code for breaches of the coding standard and correct them.
253	Fixed.
254	I changed to ScalableVectorType.
260–263	Fixed.
264	Fixed.
272–273	Fixed.
274–276	Fixed.
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll
414	Fixed.
llvm/test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64.ll
2 ↗	(On Diff #544781)	As I understand the commit message and the implementation itself, the ReplaceWithVeclib pass is to be run on intrinsics operating on vectors, not on scalar values. This pass did not make any changes to the IR and is only confusing.

Harbormaster completed remote builds in B249158: Diff 545580. Jul 31 2023, 5:42 AM

mgabka added inline comments. Jul 31 2023, 5:45 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
142–153	I am not against having "replaceFremWithCallToVeclib"; I think that it could stay. However, I would like you to try not to create "replaceFremWithCallToVeclib". In my opinion the code will be more readable if we have only replaceWithTLIFunction, which takes an Instruction, and do not create functions like "replaceWithNewCallInst" or "addFunctionToCompilerUsed". If it makes the code cleaner, it might be worth moving some of the frem-related things, like creating the function type, to a helper function.

Addressing review comment about refactoring.

jolanta.jensen added inline comments. Aug 1 2023, 4:07 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
142–153	Fixed. All frem handling is in replaceWithTLIFunction. I think it's better to have it all gathered there than break out into separate function(s) as it's not that much code.

Harbormaster completed remote builds in B249445: Diff 545997. Aug 1 2023, 5:07 AM

mgabka added inline comments. Aug 1 2023, 5:19 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
58	You can obtain NumElements and ElementType from the return type of the Frem instruction, so you do not need to pass them as arguments to this function, and you can remove this assert as well.
64	IRBuilder has getAllOnesMask; please use it instead.
120–121	This assert can be moved up so that you won't need a temporary variable.
122	Could you move this comment above the if stmt and remove the braces?
128	Extra new line, it is not needed.
170	Could you move this comment above the if stmt and remove the braces?
171–174	IMO this would be more readable if written as if statements. Moreover, if the type is not double or float we should return false from this function, am I right?

StringRef ScalarName;
if (ElementType->isFloatTy())
  ScalarName = TLI.getName(LibFunc_fmodf);
else if (ElementType->isDoubleTy())
  ScalarName = TLI.getName(LibFunc_fmod);
else
  return false;

Thanks to this, the statement below is not needed and the code is clearer and more compact.
174	Please move the comment above the if stmt and remove the braces.
178	These 2 if stmts can be combined; C/C++ guarantees that the conditions are evaluated from left to right.
178–179	Could you move the comments above the if and remove the braces? To be fair, I think this is not needed at all. Can we rely on the check performed by TLI.getVectorizedFunction and add a debug message just below the final return false?
268	Could you remove the braces here?
llvm/test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64.ll
2 ↗	(On Diff #544781)	Yes, you are correct; I am aware that this test has been quite bad from the beginning. I would like to rearrange the SLEEF tests to match what I did for armpl in https://reviews.llvm.org/home/menu/view/137/, i.e. have separate tests for LV and for replace-with-veclib. For now I would leave it as it is and remove the code you added. AFAIK we should not see calls to @fmod or @fmodf in the situations where we want to use _ZGVsMxvv_fmod or _ZGVsMxvv_fmodf; in those cases frem instructions with vector operands should be present in the IR instead.
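
The suggestion on lines 171–174 (pick the scalar routine by element type and bail out for anything else) can be modelled outside LLVM as follows. This is only a sketch: `ElementKind` and `scalarNameFor` are hypothetical names, and the hard-coded strings stand in for whatever `TLI.getName(LibFunc_fmodf)` and `TLI.getName(LibFunc_fmod)` would return.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the element type of the frem result vector.
enum class ElementKind { Float, Double, Half };

// Pick the scalar routine, or std::nullopt for element types without a
// mapping so that the caller can return false straight away.
std::optional<std::string> scalarNameFor(ElementKind Kind) {
  if (Kind == ElementKind::Float)
    return "fmodf";
  if (Kind == ElementKind::Double)
    return "fmod";
  return std::nullopt;
}
```

With this shape the caller never performs a lookup with an empty name, which is the case the author later describes as costing two unnecessary lookups.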

Addressing review comments.

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
122	Fixed.
170	Fixed.
171–174	Fixed. In the unusual case the ScalarName is empty we will just do 2 unnecessary lookups.
174	Fixed.
178	Fixed.
178–179	Yes, we can. It calls into: bool isFunctionVectorizable(StringRef F, const ElementCount &VF) const { return !(getVectorizedFunction(F, VF, false).empty() && getVectorizedFunction(F, VF, true).empty()); } Removed the isFunctionVectorizable() check and added a debug message if a vectorized function wasn't found.

jolanta.jensen added inline comments. Aug 2 2023, 5:17 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
58	I would need to do a cast in a similar way to what I do in replaceFremWithCallToVeclib, i.e. something like (dyn_cast<ScalableVectorType>(I.getType()))->getElementCount(), and I would need a variable for it in the whole function scope even if only frem is using it. I think it's less fuss to send it as an argument. But if you think it's better to obtain it from the Frem instruction anyway, I'll change it. I realized I do not use ElementType and I removed it.
64	It does something similar, except that I'll get a Value, not a Type, since it does more than I need, and I would need to convert back. Value *getAllOnesMask(ElementCount NumElts) { VectorType *VTy = VectorType::get(Type::getInt1Ty(Context), NumElts); return Constant::getAllOnesValue(VTy); } I kept line 65 as it is, but I changed the code on lines 98-100 to use getAllOnesMask.
120–121	Fixed.
128	Fixed.
268	Fixed. I kept a blank line before the return to make it more visible what is returned.

Addressing review comments.

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
58	Fixed.

Harbormaster completed remote builds in B249791: Diff 546489. Aug 2 2023, 1:26 PM

Addressing yet more review comments.

Harbormaster completed remote builds in B250087: Diff 546890. Aug 3 2023, 10:16 AM

jolanta.jensen edited the summary of this revision. Aug 3 2023, 10:37 AM

mgabka added inline comments. Aug 4 2023, 7:29 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
54–84	I think most of us are more used to reading positive conditions, which is why I would suggest writing it as: if (CI) { handle intrinsic } else { handle frem; the assert to check that the instruction is a frem would make more sense here }. I would suggest making a similar change in the if/else block you added below.
63	You can just use here: Tys.push_back(VectorType::get(Type::getInt1Ty(M->getContext()), NumElements));
74	IMO this assert is redundant. It essentially checks whether Function::Create works correctly, which is not needed in my opinion.
93	I know that this is not part of your work, but I realised that this is not tested at all. Could you create an NFC patch with tests regenerated with "--check-globals" (the same applies to the non-scalable test) and rebase your patch?
103	not needed change
150	This is wrong, please read about StringRef (https://discourse.llvm.org/t/std-string-vs-llvm-stringref/65873/2). You can just write it as: StringRef TLIName = TLI.getVectorizedFunction(ScalarName, NumElements
151–155	I am not sure that this is even needed. In the TLI, all mappings for SLEEF or ArmPL for scalable vectors are masked, so I would assume that we either replace with a masked call or do not replace at all. @paulwalker-arm what is your opinion?
161	I think LLVM prefers to use an in-line C-style comment in this case: /*Masked*/ true
163	I think this is going to pollute the dbg log too much, as it will add a debug message for each unsupported intrinsic. Please remove it.
258	nit: better to use consistent spelling, so please use either FRem or frem everywhere
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll
393	Please use the same naming scheme as the other functions in this file, so this should be llvm_frem_vscale_f64. Please apply this to the other tests as well.
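
The StringRef remark on line 150 presumably concerns copying a lookup result into an owning string when a non-owning reference would do. The sketch below uses `std::string_view` as a rough standard-library analogue of `StringRef`; the table and `lookupVectorName` are hypothetical and are not the TLI API.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical mapping table that owns the names for the program's lifetime.
const std::map<std::string, std::string, std::less<>> &mappingTable() {
  static const std::map<std::string, std::string, std::less<>> Table = {
      {"fmod", "_ZGVsMxvv_fmod"}, {"fmodf", "_ZGVsMxvv_fmodf"}};
  return Table;
}

// Returns a view into the table's storage; an empty view means "not found".
std::string_view lookupVectorName(std::string_view ScalarName) {
  const auto &Table = mappingTable();
  auto It = Table.find(ScalarName);
  if (It == Table.end())
    return {};
  return It->second;
}
```

Because the table outlives every caller in this sketch, the returned view can be used directly; wrapping it in a `std::string` would only add a copy.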

paulwalker-arm added inline comments. Aug 4 2023, 7:39 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
151–155	Checking for both is fine and akin to what we do in LoopVectorize. There's no requirement for scalable vector math routines to require a mask, it's just masked versions are easier to work with when tail-folding.
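
Checking for both an unmasked and a masked mapping, as discussed for lines 151–155, might look roughly like the standalone model below. `MappingKey`, `Mapping` and `findMapping` are hypothetical, and trying the unmasked variant first is an assumption of this sketch, not something the thread states.

```cpp
#include <initializer_list>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical key: scalar name plus whether the vector variant takes a mask.
using MappingKey = std::pair<std::string, bool>;

struct Mapping {
  std::string VectorName;
  bool Masked;
};

// Try the unmasked variant first, then fall back to the masked one.
// The lookup order is an assumption made for this sketch.
std::optional<Mapping>
findMapping(const std::map<MappingKey, std::string> &Table,
            const std::string &ScalarName) {
  for (bool Masked : {false, true}) {
    auto It = Table.find(MappingKey{ScalarName, Masked});
    if (It != Table.end())
      return Mapping{It->second, Masked};
  }
  return std::nullopt;
}
```

The `Masked` flag in the result tells the caller whether an all-ones mask operand has to be appended to the replacement call.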

Addressing review comments

jolanta.jensen added inline comments. Aug 7 2023, 3:06 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
54–84	I agree. It was ordered this way because frem handling was before intrinsics handling in replaceWithCallToVeclib. But since branching was reverted to reside in runImpl, where it was in the first version of the patch, it does not make sense to have intrinsics handling before frem handling here. Hopefully the code will be clearer this way, even if I don't like those if...else... blocks anyway and would prefer separate functions for intrinsics and frem.
63	Fixed. This way I could move declaration of IRBuilder more locally too.
74	This assert is part of the original implementation, removing it would be introducing unnecessary nfc (even if it has been moved to avoid creation of an extra variable -- so the move is a nfc anyway). I would like to keep it since it is a part of the original implementation.
93	Fixed.
150	Yeah, copy-pasted from the original code in replaceWithCallToVeclib. Corrected -- but only here, in my own code.
161	Fixed.
163	Why should it pollute? We won't get it more times than LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Looking up TLI mapping for `" << ScalarName << "` and vector width " << NumElements << ".\n"); It matches the above print in case we find nothing. I think it's user friendly and I would like to keep it.
258	Fixed.
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll
393	The other test names start with @llvm since they call functions whose names start with @llvm. My test names start with @frem since I call frem. In my opinion it is consistent with the naming scheme in this file.

Harbormaster completed remote builds in B250705: Diff 547687. Aug 7 2023, 4:31 AM

jolanta.jensen added inline comments. Aug 7 2023, 5:32 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
52	This is a bug. It will lose its value when it comes to line 104. It needs to be instantiated at once, i.e. ElementCount NumElements = (dyn_cast<ScalableVectorType>(I.getType()))->getElementCount();

jolanta.jensen added inline comments. Aug 7 2023, 6:00 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
52	Or actually, the issue here is that we set the ElementCount inside an if statement, so if the TLIFunc is already present, it will not be set.

mgabka added inline comments. Aug 7 2023, 6:05 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
51	This is wrong: in cases where the function is already declared in the module you are not initialising it, and on line 112 you are going to print an empty string. The code below adds the function declaration to the module, and you are setting OldName only in that situation. A similar situation applies to the NumElements variable. So this code is incorrect, as those variables should be set to the correct value in both situations. I also believe the code below could be written a bit shorter; please check if you can do anything with it. As inspiration, have a look at the code I suggested in the other place.
103	Not needed change, please revert it.
103–122	This can be written in a much more compact way, for example:

SmallVector<Value *> Args(I.operand_values());
SmallVector<OperandBundleDef, 1> OpBundles;
// Preserve the operand bundles if it is an intrinsic call.
if (CI)
  CI->getOperandBundlesAsDefs(OpBundles);
// For masked calls to frem add a mask operand.
else if (Masked)
  Args.push_back(IRBuilder.getAllOnesMask(NumElements));
CallInst *Replacement = IRBuilder.CreateCall(TLIFunc, Args, OpBundles);
if (isa<FPMathOperator>(Replacement)) {
  // Preserve fast math flags for FP math.
  Replacement->copyFastMathFlags(&I);
}

163	I was too fast; this function only works for the frem instruction, not for all llvm intrinsics. In my view there is no need to add messages when the optimization won't happen. You added messages for when it happens, so the lack of one means that it is not happening.
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll
393	In my opinion it only makes things more difficult, as it breaks the alphabetical order: when looking for the output for frem we need to look at the end of the file instead of the place where all the math operations whose names start with "f" are.
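
The bug described for lines 51 and 52 (a value assigned only on the path that inserts a new declaration) can be reproduced in a few lines of standalone C++. `DeclSet`, `logMessageBuggy` and `logMessageFixed` are hypothetical names, and the message format is invented for the sketch.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a module: the set of declared function names.
using DeclSet = std::set<std::string>;

// Buggy shape: OldName is only assigned when a new declaration is inserted.
std::string logMessageBuggy(DeclSet &Decls, const std::string &Old,
                            const std::string &New) {
  std::string OldName;
  if (!Decls.count(New)) {
    Decls.insert(New);
    OldName = Old;
  }
  return "Replaced `" + OldName + "` with `" + New + "`";
}

// Fixed shape: OldName is set before branching, so both paths see it.
std::string logMessageFixed(DeclSet &Decls, const std::string &Old,
                            const std::string &New) {
  const std::string OldName = Old;
  if (!Decls.count(New))
    Decls.insert(New);
  return "Replaced `" + OldName + "` with `" + New + "`";
}
```

When the target function is already declared, the buggy variant reports an empty old name, which matches the empty string the reviewer expects to see printed.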

mgabka added inline comments. Aug 7 2023, 6:53 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp

103–122

I was wrong here: when the Instruction is a CI, it has the call as the first operand, so it needs to be a bit different. I also think that it would be good to always handle the mask (assuming it is always the last operand).

SmallVector<Value *> Args;
SmallVector<OperandBundleDef, 1> OpBundles;
// Preserve the operand bundles and copy arguments if it is an intrinsic call.
if (CI) {
  Args.assign(CI->arg_begin(), CI->arg_end());
  CI->getOperandBundlesAsDefs(OpBundles);
} else {
  Args.assign(I.op_begin(), I.op_end());
}
// if mask is requested we need to add it
if (Masked)
  Args.push_back(IRBuilder.getAllOnesMask(NumElements));

CallInst *Replacement = IRBuilder.CreateCall(TLIFunc, Args, OpBundles);

Bug fixes and review comments.

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
51	Yup. this is a bug as well. Corrected. The code below (everything that is in if (CI) block) is part of the original implementation and I would prefer not to touch it.
103–122	This is part of the original implementation and I would prefer not to touch it. And CallInst part does not handle scalable vectors so no mask. And I don't think we even can assume mask is always the last operand even if so is the case for FRem.
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll
393	replace-intrinsics-with-veclib-sleef-scalable.ll is alphabetically ordered for intrinsics tests but replace-intrinsics-with-veclib-armpl.ll is not. It starts with cos and the next one is sin. The remainder of the test file is fairly alphabetical even if log10 comes after log2. But I moved the frem tests so they come in alphabetical order as they are listed alphabetically in VecFuncs.def. I also renamed them to llvm_frem even if they are not intrinsics.

jolanta.jensen added inline comments. Aug 7 2023, 9:41 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
161–163	Forgot about this one. Will be included in next patch.

Harbormaster completed remote builds in B250815: Diff 547824. Aug 7 2023, 12:10 PM

Mask fix.

jolanta.jensen added inline comments. Aug 9 2023, 3:50 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
161–163	Removed the unnecessary debug print.

Harbormaster completed remote builds in B251341: Diff 548551. Aug 9 2023, 6:07 AM

jolanta.jensen added a child revision: D157525: [LV][AArch64] Enable scalable vectorization of loops that contain FREM instructions. Aug 9 2023, 10:28 AM

mgabka added inline comments. Aug 10 2023, 1:14 AM

llvm/include/llvm/Analysis/VectorUtils.h
184	Please add documentation for the newly added parameter. In my opinion the interface will be better if we use it for both scalable and fixed vectors; otherwise it is confusing what this parameter is and when it is used.
llvm/lib/CodeGen/ReplaceWithVeclib.cpp
55	This may fail, as not all instructions handled here return a vector. You need the EC only in the frem case, to create the Type for the mask, and in that case it is guaranteed that the instruction returns a vector type. Please change it.
57	not needed change
74	FWIW you have already refactored the code quite a bit, so removing an unneeded bit as part of this work is not a problem.
75	The whole idea of std::optional is that the value may not exist (https://en.cppreference.com/w/cpp/utility/optional/value), and you need to handle that, otherwise you will get an exception.
76	IMO this dbg message is not needed as it is too detailed. LLVM tends to have more high-level dbg output, otherwise the dbg logs grow too quickly. For this pass, dbg messages showing that there is a replacement and what has been replaced with what are enough in my opinion.
103–122	The final goal is to make this pass work for scalable vectors, so introducing unnecessary if/else blocks just to remove them later is not a good idea in my opinion. Since we now want to use getParamIndexForOptionalMask, we will know the position of the mask.
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll
195	A comment explaining why the transformation is not happening would be useful here.
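
The std::optional point on line 75 can be shown in isolation: calling `value()` on an empty `std::optional` throws `std::bad_optional_access`, so the empty case should be checked first. In the sketch below `maskPosition` and `maskIndexOrMinusOne` are hypothetical helpers, and placing the mask last is an assumption borrowed from the discussion rather than a documented guarantee.

```cpp
#include <optional>

// Hypothetical helper: position of the mask parameter, if there is one.
std::optional<unsigned> maskPosition(bool HasMask, unsigned NumParams) {
  if (!HasMask || NumParams == 0)
    return std::nullopt;
  return NumParams - 1; // assumes the mask is the last parameter
}

// Check for the empty case before dereferencing instead of calling value()
// unconditionally.
int maskIndexOrMinusOne(bool HasMask, unsigned NumParams) {
  std::optional<unsigned> Pos = maskPosition(HasMask, NumParams);
  if (!Pos)
    return -1;
  return static_cast<int>(*Pos);
}
```

A caller that receives -1 can simply skip the replacement, which mirrors the "check if we got a return value" fix the author describes later.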

jolanta.jensen added inline comments. Aug 10 2023, 1:40 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
55	Change to what? I was sending ElementCount as an argument but this was disliked and I was asked to retrieve it here. How do you want it? Could you explain a bit more?
75	In my opinion the check on line 74 is enough but I can add one more check for the position of the mask.
103–122	It's not only about the mask. CallInst needs its OpBundles as well. But sure, if we add support for scalable vectors for CI it may be rewritten. But we are not there yet.

Addressing review comments.

jolanta.jensen added inline comments. Aug 10 2023, 10:23 AM

llvm/include/llvm/Analysis/VectorUtils.h
184	Updated the documentation. For fixed vectors we do not need to retrieve the Vectorization Factor. The Module parameter is only used for scalable vectors to retrieve the Vectorization Factor. But I realized the EC parameter can be misused if we use it with a CallInst and there is no Function in the Module. Any suggestions on how to mitigate this?

// 1. We don't accept a zero lanes vectorization factor.
// 2. We don't accept the demangling if the vector function is not
// present in the module unless we handle an Instruction.
if (VF == 0)
  return std::nullopt;
if (!EC && !M.getFunction(VectorName))
  return std::nullopt;

llvm/lib/Analysis/VFABIDemangling.cpp
455	Here, the EC parameter can be misused if it is set for a CallInst but the VectorFunction is not present in the Module. Any advice how to mitigate?
llvm/lib/CodeGen/ReplaceWithVeclib.cpp
55	Sending NumElements as argument again, this time as optional one. Please check if this is a better solution.
57	Added a blank line again. Same with line 68.
74	Removed the assert checking for the function type. i.e. on line 62.
75	Added a check if we got a return value.
76	Removed.
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll
195	Fixed.
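
The concern about the EC parameter raised for VectorUtils.h line 184 and VFABIDemangling.cpp line 455 can be seen in a reduced model of the scalable path. `DeclaredVFs` and `resolveScalableVF` are hypothetical stand-ins (a map from declared vector function name to its known-minimum element count, and a plain function), not the real `tryDemangleForVFABI`.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a module: declared vector function name mapped to
// the known-minimum element count taken from its signature.
using DeclaredVFs = std::map<std::string, unsigned>;

// Resolve the VF of a scalable vector function. A caller-supplied EC wins;
// otherwise the declaration must exist so its signature can be inspected.
std::optional<unsigned>
resolveScalableVF(const std::string &VectorName, const DeclaredVFs &Module,
                  std::optional<unsigned> EC = std::nullopt) {
  unsigned VF = 0;
  if (EC) {
    VF = *EC;
  } else {
    auto It = Module.find(VectorName);
    if (It == Module.end())
      return std::nullopt;
    VF = It->second;
  }
  // A zero-lane vectorization factor is never accepted.
  if (VF == 0)
    return std::nullopt;
  return VF;
}
```

In this model, supplying an EC skips the declaration check entirely, which is the misuse the author asks about for the CallInst case.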

Harbormaster completed remote builds in B251732: Diff 549092. Aug 10 2023, 4:37 PM

jolanta.jensen added inline comments. Aug 10 2023, 11:31 PM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
44	This will malfunction if false is sent as an argument. To be corrected.

Bugfix to a bug introduced in previous patch.

jolanta.jensen added inline comments. Aug 11 2023, 2:55 AM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
44	Now fixed.

Harbormaster completed remote builds in B251895: Diff 549313. Aug 11 2023, 5:31 AM

jolanta.jensen abandoned this revision. Aug 29 2023, 4:10 AM

Herald added a subscriber: sunshaoce. Aug 29 2023, 4:10 AM

Revision Contents

Path	Size
llvm/include/llvm/Analysis/VectorUtils.h	8 lines
llvm/lib/Analysis/VFABIDemangling.cpp	28 lines
llvm/lib/CodeGen/ReplaceWithVeclib.cpp	137 lines
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll	23 lines
llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll	23 lines

Diff 549313

llvm/include/llvm/Analysis/VectorUtils.h

... 173 lines not shown ...
 ///
 /// \param MangledName -> input string in the format
 /// _ZGV<isa><mask><vlen><parameters>_<scalarname>[(<redirection>)].
 /// \param M -> Module used to retrieve informations about the vector
 /// function that are not possible to retrieve from the mangled
 /// name. At the moment, this parameter is needed only to retrieve the
 /// Vectorization Factor of scalable vector functions from their
 /// respective IR declarations.
-std::optional<VFInfo> tryDemangleForVFABI(StringRef MangledName,
-                                          const Module &M);
+/// \param EC -> Vectorization Factor for scalable vector functions if
+/// we are handling an Instruction since not possible to retrieve from
+/// the Module. Must not be set for an CallInst.
+std::optional<VFInfo>
+tryDemangleForVFABI(StringRef MangledName, const Module &M,
+                    std::optional<ElementCount> EC = std::nullopt);

 /// This routine mangles the given VectorName according to the LangRef
 /// specification for vector-function-abi-variant attribute and is specific to
 /// the TLI mappings. It is the responsibility of the caller to make sure that
 /// this is only used if all parameters in the vector function are vector type.
 /// This returned string holds scalar-to-vector mapping:
 /// _ZGV<isa><mask><vlen><vparams>_<scalarname>(<vectorname>)
 ///
... 806 lines not shown ...

llvm/lib/Analysis/VFABIDemangling.cpp

... 308 lines not shown ...
     if (auto *VTy = dyn_cast<VectorType>(Ty))
       return VTy->getElementCount();

   return ElementCount::getFixed(/*Min=*/1);
 }
 } // namespace

 // Format of the ABI name:
 // _ZGV<isa><mask><vlen><parameters>_<scalarname>[(<redirection>)]
-std::optional<VFInfo> VFABI::tryDemangleForVFABI(StringRef MangledName,
-                                                 const Module &M) {
+std::optional<VFInfo>
+VFABI::tryDemangleForVFABI(StringRef MangledName, const Module &M,
+                           std::optional<ElementCount> EC) {
   const StringRef OriginalName = MangledName;
   // Assume there is no custom name <redirection>, and therefore the
   // vector name consists of
   // _ZGV<isa><mask><vlen><parameters>_<scalarname>.
   StringRef VectorName = MangledName;

   // Parse the fixed size part of the manled name
   if (!MangledName.consume_front("_ZGV"))
... 102 lines not shown ...
   // Adjust the VF for scalable signatures. The EC.Min is not encoded
   // in the name of the function, but it is encoded in the IR
   // signature of the function. We need to extract this information
   // because it is needed by the loop vectorizer, which reasons in
   // terms of VectorizationFactor or ElementCount. In particular, we
   // need to make sure that the VF field of the VFShape class is never
   // set to 0.
   if (IsScalable) {
+    if (EC) {
+      VF = EC->getKnownMinValue();
+    } else {
       const Function *F = M.getFunction(VectorName);
       // The declaration of the function must be present in the module
       // to be able to retrieve its signature.
       if (!F)
         return std::nullopt;
       const ElementCount EC = getECFromSignature(F->getFunctionType());
       VF = EC.getKnownMinValue();
+    }
   }
   // 1. We don't accept a zero lanes vectorization factor.
   // 2. We don't accept the demangling if the vector function is not
-  // present in the module.
+  // present in the module unless we handle an Instruction.
   if (VF == 0)
     return std::nullopt;
-  if (!M.getFunction(VectorName))
+  if (!EC && !M.getFunction(VectorName))
     return std::nullopt;

   const VFShape Shape({ElementCount::get(VF, IsScalable), Parameters});
   return VFInfo({Shape, std::string(ScalarName), std::string(VectorName), ISA});
 }

 VFParamKind VFABI::getVFParamKindFromString(const StringRef Token) {
   const VFParamKind ParamKind = StringSwitch<VFParamKind>(Token)
... 20 lines not shown ...

llvm/lib/CodeGen/ReplaceWithVeclib.cpp

Show All 32 Lines	STATISTIC(NumCallsReplaced,
"Number of calls to intrinsics that have been replaced.");		"Number of calls to intrinsics that have been replaced.");

STATISTIC(NumTLIFuncDeclAdded,		STATISTIC(NumTLIFuncDeclAdded,
"Number of vector library function declarations added.");		"Number of vector library function declarations added.");

STATISTIC(NumFuncUsedAdded,		STATISTIC(NumFuncUsedAdded,
"Number of functions added to `llvm.compiler.used`");		"Number of functions added to `llvm.compiler.used`");

static bool replaceWithTLIFunction(CallInst &CI, const StringRef TLIName) {		static bool
Module *M = CI.getModule();		replaceWithTLIFunction(Instruction &I, const StringRef TLIName,
		std::optional<ElementCount> NumElements = std::nullopt,
Function *OldFunc = CI.getCalledFunction();		std::optional<bool> Masked = false) {
		jolanta.jensen: This will malfunction if false is sent as an argument. To be corrected.
		jolanta.jensen: Now fixed.
		Module *M = I.getModule();
		CallInst *CI = dyn_cast<CallInst>(&I);

// Check if the vector library function is already declared in this module,		// Check if the vector library function is already declared in this module,
// otherwise insert it.		// otherwise insert it.
Function *TLIFunc = M->getFunction(TLIName);		Function *TLIFunc = M->getFunction(TLIName);
		StringRef OldName =
		mgabka: This is wrong: when the function is already declared in the module you do not initialise it, and on line 112 you will print an empty string. The code below adds the function declaration to the module, and you set OldName only in that situation. The same applies to the NumElements variable. This is incorrect, as those variables should be set to the correct value in both situations. I also believe the code below could be written a bit shorter; please check whether you can do anything with it. For inspiration, have a look at the code I suggested in the other place.
		jolanta.jensen: Yup, this is a bug as well. Corrected. The code below (everything in the if (CI) block) is part of the original implementation and I would prefer not to touch it.
		CI ? CI->getCalledFunction()->getName() : I.getOpcodeName();
		jolanta.jensen: This is a bug. It will lose its value by the time it gets to line 104. It needs to be initialized at once, i.e. ElementCount NumElements = (dyn_cast<ScalableVectorType>(I.getType()))->getElementCount();
		jolanta.jensen: Or actually, the issue here is that we set the ElementCount inside an if statement, so if the TLIFunc is already present, it will not be set.
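The bug discussed in this thread (values that are assigned only on the "declaration is missing" path) can be illustrated outside LLVM. The following is a minimal standalone sketch, not the patch's code: `ToyModule`, `Replacement` and `planReplacement` are hypothetical names, and a `std::set` stands in for the module's symbol table. The point is only that the old name and lane count are fixed before the lookup, so they are valid on both paths.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a module's list of declared functions.
struct ToyModule {
  std::set<std::string> Declared;
};

// Hypothetical record of what a replacement needs later on.
struct Replacement {
  std::string OldName;
  unsigned Lanes;
  bool NewlyDeclared;
};

Replacement planReplacement(ToyModule &M, const std::string &NewName,
                            const std::string &OldName, unsigned Lanes) {
  // Initialize everything that later code reads before looking up the
  // declaration, so the values do not depend on which path is taken.
  Replacement R{OldName, Lanes, false};
  // insert() reports whether the name was newly added to the set.
  R.NewlyDeclared = M.Declared.insert(NewName).second;
  return R;
}
```

Calling `planReplacement` twice with the same new name yields the same `OldName` and `Lanes` both times; only `NewlyDeclared` differs.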
if (!TLIFunc) {		if (!TLIFunc) {
TLIFunc = Function::Create(OldFunc->getFunctionType(),		if (CI) {
Function::ExternalLinkage, TLIName, *M);		// Intrinsics handling.
		mgabka: This may fail: not all instructions used here will return a vector. You need the EC only in the frem case to create the type for the mask, and in that case you are guaranteed that it returns a vector type. Please change it.
		jolanta.jensen: Change to what? I was sending ElementCount as an argument but this was disliked and I was asked to retrieve it here. How do you want it? Could you explain a bit more?
		jolanta.jensen: Sending NumElements as an argument again, this time as an optional one. Please check if this is a better solution.
		Function *OldFunc = CI->getCalledFunction();
		FunctionType *OldFuncTy = OldFunc->getFunctionType();
		TLIFunc =
		mgabka: You can obtain NumElements and ElementType from the frem instruction's return type, so you do not need to pass them as arguments to this function, and you can remove this assert as well.
		jolanta.jensen: I would need to do a cast the same way as I do in replaceFremWithCallToVeclib, i.e. something like (dyn_cast<ScalableVectorType>(I.getType()))->getElementCount(), and I would need a variable for it in the whole function scope even if only frem uses it. I think it's less fuss to send it as an argument. But if you think it's better to obtain it from the frem instruction anyway, I'll change it. I realized I do not use ElementType and I removed it.
		jolanta.jensen: Fixed.
		Function::Create(OldFuncTy, Function::ExternalLinkage, TLIName, *M);
TLIFunc->copyAttributesFrom(OldFunc);		TLIFunc->copyAttributesFrom(OldFunc);
		} else {
		// FRem handling.
		assert(I.getOpcode() == Instruction::FRem &&
		mgabka: You can just use this here: Tys.push_back(VectorType::get(Type::getInt1Ty(M->getContext()), NumElements));
		jolanta.jensen: Fixed. This way I could also move the declaration of IRBuilder to a more local scope.
		"Must be a FRem instruction.");
		mgabka: IRBuilder has getAllOnesMask; please use it instead.
		jolanta.jensen: It does something similar, except that I get a Value rather than a Type, since it does more than I need, and I would need to convert back:
		    Value *getAllOnesMask(ElementCount NumElts) {
		      VectorType *VTy = VectorType::get(Type::getInt1Ty(Context), NumElts);
		      return Constant::getAllOnesValue(VTy);
		    }
		I kept line 65 as it is, but I changed the code on lines 98-100 to use getAllOnesMask.
		if (Masked.value() && !NumElements)
		return false;
		Type *RetTy = I.getType();
		SmallVector<Type *> Tys = {RetTy, RetTy};
		if (Masked.value()) {
		// Get the mask position.
		std::optional<llvm::VFInfo> Info =
		VFABI::tryDemangleForVFABI(TLIName, *M, NumElements.value());
		if (!Info)
		return false;
		mgabka: IMO this assert is redundant: it is essentially checking that Function::Create works correctly, which is not needed in my opinion.
		jolanta.jensen: This assert is part of the original implementation; removing it would introduce an unnecessary NFC change (it has been moved to avoid creating an extra variable, so the move is an NFC change anyway). I would like to keep it since it is part of the original implementation.
		mgabka: FWIW, you have already refactored the code quite a bit, so removing an unneeded bit as part of this work is not a problem.
		jolanta.jensen: Removed the assert checking the function type, i.e. on line 62.
		std::optional<unsigned> MaskPos = Info->getParamIndexForOptionalMask();
		mgabka: The whole idea of std::optional is that the value may not exist (https://en.cppreference.com/w/cpp/utility/optional/value); you need to handle that case, otherwise you will get an exception.
		jolanta.jensen: In my opinion the check on line 74 is enough, but I can add one more check for the position of the mask.
		jolanta.jensen: Added a check that we got a return value.
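The point about std::optional can be shown with a small standalone sketch. It is not the patch's code: `insertMaskType` is a hypothetical helper, and plain strings stand in for LLVM `Type *` values. It tests the optional before dereferencing it and also rejects an out-of-range position, returning false rather than calling `value()` on an empty optional.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: inserts a placeholder mask type at MaskPos.
// Returns false when no position is available or it is out of range.
bool insertMaskType(std::vector<std::string> &Tys,
                    std::optional<unsigned> MaskPos) {
  // Check the optional first; value() on an empty optional throws
  // std::bad_optional_access.
  if (!MaskPos || *MaskPos > Tys.size())
    return false;
  Tys.insert(Tys.begin() + *MaskPos, "mask");
  return true;
}
```

With two operand types and a mask position of 2, the mask ends up as the third parameter; with `std::nullopt`, the list is left unchanged.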
		if (!MaskPos)
		mgabka: IMO this debug message is not needed, as it is too detailed. LLVM tends to have higher-level debug output; otherwise the debug logs grow too quickly. For this pass, debug messages that show that a replacement happened and what was replaced with what are enough, in my opinion.
		jolanta.jensen: Removed.
		return false;
		Tys.insert(Tys.begin() + MaskPos.value(),
		VectorType::get(Type::getInt1Ty(M->getContext()),
		NumElements.value()));
		}
		TLIFunc = Function::Create(FunctionType::get(RetTy, Tys, false),
		Function::ExternalLinkage, TLIName, *M);
		}
		mgabka: I think most of us are more used to reading positive conditions, which is why I would suggest writing it as: if (CI) { handle intrinsic } else { handle frem }, where the assert checking that the instruction is a frem would make more sense in the else branch. I would suggest a similar change in the if/else block you added below.
		jolanta.jensen: I agree. It was ordered this way because frem handling came before intrinsics handling in replaceWithCallToVeclib. But since the branching was moved back to runImpl, where it was in the first version of the patch, it does not make sense to have intrinsics handling before frem handling here. Hopefully the code will be clearer this way, even if I don't like those if...else... blocks and would prefer separate functions for intrinsics and frem.
LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Added vector library function `"		LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Added vector library function `"
<< TLIName << "` of type `" << *(TLIFunc->getType())		<< TLIName << "` of type `" << *(TLIFunc->getType())
<< "` to module.\n");		<< "` to module.\n");

mgabka: Unneeded change.
jolanta.jensen: Added a blank line again. Same with line 68.
++NumTLIFuncDeclAdded;		++NumTLIFuncDeclAdded;

// Add the freshly created function to llvm.compiler.used,		// Add the freshly created function to llvm.compiler.used,
// similar to as it is done in InjectTLIMappings		// similar to as it is done in InjectTLIMappings
appendToCompilerUsed(*M, {TLIFunc});		appendToCompilerUsed(*M, {TLIFunc});
		mgabka: I know that this is not part of your work, but I realised that this is not tested at all. Could you create an NFC patch with tests regenerated with "--check-globals" (the same applies to the non-scalable test) and rebase your patch?
		jolanta.jensen: Fixed.

LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Adding `" << TLIName		LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Adding `" << TLIName
<< "` to `@llvm.compiler.used`.\n");		<< "` to `@llvm.compiler.used`.\n");
++NumFuncUsedAdded;		++NumFuncUsedAdded;
}		}

// Replace the call to the vector intrinsic with a call		// Replace the call to the FRem instruction/vector intrinsic with a call
// to the corresponding function from the vector library.		// to the corresponding function from the vector library.
IRBuilder<> IRBuilder(&CI);		IRBuilder<> IRBuilder(&I);
SmallVector<Value *> Args(CI.args());		CallInst *Replacement = nullptr;
		if (CI) {
		// Intrinsics handling.
		SmallVector<Value *> Args(CI->args());
// Preserve the operand bundles.		// Preserve the operand bundles.
SmallVector<OperandBundleDef, 1> OpBundles;		SmallVector<OperandBundleDef, 1> OpBundles;
CI.getOperandBundlesAsDefs(OpBundles);		CI->getOperandBundlesAsDefs(OpBundles);
CallInst *Replacement = IRBuilder.CreateCall(TLIFunc, Args, OpBundles);		Replacement = IRBuilder.CreateCall(TLIFunc, Args, OpBundles);
assert(OldFunc->getFunctionType() == TLIFunc->getFunctionType() &&		} else {
"Expecting function types to be identical");		// FRem handling.
CI.replaceAllUsesWith(Replacement);		if (Masked.value() && !NumElements)
		return false;
		SmallVector<Value *> Args(I.operand_values());
		if (Masked.value())
		Args.push_back(IRBuilder.getAllOnesMask(NumElements.value()));
		Replacement = IRBuilder.CreateCall(TLIFunc, Args);
		}
		I.replaceAllUsesWith(Replacement);
if (isa<FPMathOperator>(Replacement)) {		if (isa<FPMathOperator>(Replacement)) {
		mgabka: This assert can be moved up so you won't need a temporary variable.
		jolanta.jensen: Fixed.
// Preserve fast math flags for FP math.		// Preserve fast math flags for FP math.
		mgabka: Could you move this comment above the if statement and remove the braces?
		jolanta.jensen: Fixed.
		mgabka: This can be written in a much more compact way, for example:
		    SmallVector<Value *> Args(I.operand_values());
		    SmallVector<OperandBundleDef, 1> OpBundles;
		    // Preserve the operand bundles if it is an intrinsic call.
		    if (CI)
		      CI->getOperandBundlesAsDefs(OpBundles);
		    // For masked calls to frem add a mask operand.
		    else if (Masked)
		      Args.push_back(IRBuilder.getAllOnesMask(NumElements));
		    CallInst *Replacement = IRBuilder.CreateCall(TLIFunc, Args, OpBundles);
		    if (isa<FPMathOperator>(Replacement)) {
		      // Preserve fast math flags for FP math.
		      Replacement->copyFastMathFlags(&I);
		    }
		mgabka: I was wrong here: when the Instruction is a CI, it has the call as the first operand, so it needs to be a bit different. I also think it would be good to always handle the mask (assuming it is always the last operand):
		    SmallVector<Value *> Args;
		    SmallVector<OperandBundleDef, 1> OpBundles;
		    // Preserve the operand bundles and copy arguments if it is an intrinsic call.
		    if (CI) {
		      Args.assign(CI->arg_begin(), CI->arg_end());
		      CI->getOperandBundlesAsDefs(OpBundles);
		    } else
		      Args.assign(I.op_begin(), I.op_end());
		    // If a mask is requested we need to add it.
		    if (Masked)
		      Args.push_back(IRBuilder.getAllOnesMask(NumElements));
		    CallInst *Replacement = IRBuilder.CreateCall(TLIFunc, Args, OpBundles);
		jolanta.jensen: This is part of the original implementation and I would prefer not to touch it. Also, the CallInst part does not handle scalable vectors, so there is no mask. And I don't think we can even assume the mask is always the last operand, even if that is the case for FRem.
		mgabka: The final goal is to make this pass work for scalable vectors, so introducing unnecessary if/else blocks just to remove them later is not a good idea in my opinion. Since we now want to use getParamIndexForOptionalMask, we will know the position of the mask.
		jolanta.jensen: It's not only about the mask. CallInst needs its OpBundles as well. But sure, if we add support for scalable vectors for CI, it may be rewritten. But we are not there yet.
Replacement->copyFastMathFlags(&CI);		Replacement->copyFastMathFlags(&I);
}		}
		LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Replaced call to `" << OldName
LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Replaced call to `"		<< "` with call to `" << TLIName << "`.\n");
<< OldFunc->getName() << "` with call to `" << TLIName
<< "`.\n");
++NumCallsReplaced;		++NumCallsReplaced;
return true;		return true;
		mgabka: Extra new line; it is not needed.
		jolanta.jensen: Fixed.
}		}

		static bool replaceFremWithCallToVeclib(const TargetLibraryInfo &TLI,
		Instruction &I) {
		auto *VectorArgTy = dyn_cast<ScalableVectorType>(I.getType());
		// We have TLI mappings for FRem on scalable vectors only.
		if (!VectorArgTy)
		return false;
		ElementCount NumElements = VectorArgTy->getElementCount();
		auto *ElementType = VectorArgTy->getElementType();
		StringRef ScalarName;
		if (ElementType->isFloatTy())
		ScalarName = TLI.getName(LibFunc_fmodf);
		else if (ElementType->isDoubleTy())
		ScalarName = TLI.getName(LibFunc_fmod);
		else
		return false;
		LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Looking up TLI mapping for `"
		<< ScalarName << "` and vector width " << NumElements
		<< ".\n");
		StringRef TLIName = TLI.getVectorizedFunction(ScalarName, NumElements);
		if (!TLIName.empty()) {
		mgabka: This is wrong; please read about StringRef: https://discourse.llvm.org/t/std-string-vs-llvm-stringref/65873/2. You can just write it as: StringRef TLIName = TLI.getVectorizedFunction(ScalarName, NumElements);
		jolanta.jensen: Yeah, copy-pasted from the original code in replaceWithCallToVeclib. Corrected, but only here, in my own code.
		LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Found unmasked TLI function `"
		<< TLIName << "`.\n");
		return replaceWithTLIFunction(I, TLIName);
		mgabka: From what I can see, this is the only part of the code that differs from "replaceWithTLIFunction", hence most of the code below is duplication. I think a smart refactoring of replaceWithTLIFunction would be enough to make it work for both CI and Instructions. The other thing is that the code you wrote duplicates a lot: you can use a suitable container and just add an extra element to it when creating the type for the masked function. That way you do not need to create the type twice and add the input types twice.
		jolanta.jensen: Lines 129-134 also differ. Line 77 has an assert that is not valid here. I created small functions for lines 54-66 and 116-127, which are shared. And I fixed the code duplication above.
		mgabka: I am not against having "replaceFremWithCallToVeclib"; I think that it could stay. However, I would like you to try not to create "replaceFremWithCallToVeclib". In my opinion the code will be more readable if we have only replaceWithTLIFunction, which takes an Instruction, and do not create functions like "replaceWithNewCallInst" or "addFunctionToCompilerUsed". If it makes the code cleaner, it might be worth moving some of the frem-related things, like creating the function type, to a helper function.
		jolanta.jensen: Fixed. All frem handling is in replaceWithTLIFunction. I think it's better to have it all gathered there than to break it out into separate function(s), as it's not that much code.
		}
		TLIName = TLI.getVectorizedFunction(ScalarName, NumElements, true);
		mgabka: I am not sure that this is even needed: in the TLI, all mappings for SLEEF or ArmPL for scalable vectors are masked, so I would assume that we either replace with a masked call or do not replace at all. @paulwalker-arm, what is your opinion?
		paulwalker-arm: Checking for both is fine and akin to what we do in LoopVectorize. There's no requirement for scalable vector math routines to require a mask; it's just that masked versions are easier to work with when tail-folding.
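The "try unmasked, then masked" lookup order that this thread settles on can be sketched without LLVM. This is an illustration only: `Table`, `Mapping` and `findMapping` are hypothetical, and the table is a plain `std::map` rather than TargetLibraryInfo.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <optional>
#include <string>
#include <tuple>

// Hypothetical result of a lookup: the vector routine and whether it
// expects a trailing mask argument.
struct Mapping {
  std::string VectorName;
  bool Masked;
};

// Toy table keyed by (scalar name, lane count, masked).
using Table = std::map<std::tuple<std::string, unsigned, bool>, std::string>;

std::optional<Mapping> findMapping(const Table &T, const std::string &Scalar,
                                   unsigned Lanes) {
  // Prefer the unmasked variant; fall back to the masked one.
  for (bool Masked : {false, true}) {
    auto It = T.find(std::make_tuple(Scalar, Lanes, Masked));
    if (It != T.end())
      return Mapping{It->second, Masked};
  }
  return std::nullopt;
}
```

A caller would add an all-true mask argument only when the returned `Masked` flag is set, and skip the replacement entirely when the result is empty.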
		if (!TLIName.empty()) {
		LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Found masked TLI function `"
		<< TLIName << "`.\n");
		return replaceWithTLIFunction(I, TLIName, NumElements, /*Masked*/ true);
		}
		return false;
		mgabka: I think LLVM prefers an inline C-style comment in this case: /*Masked*/ true
		jolanta.jensen: Fixed.
		}

		mgabka: I think this is going to pollute the debug log too much, as it will add a debug message for each unsupported intrinsic. Please remove it.
		jolanta.jensen: Why should it pollute? We won't get it more times than
		    LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Looking up TLI mapping for `" << ScalarName << "` and vector width " << NumElements << ".\n");
		It matches the above print in case we find nothing. I think it's user friendly and I would like to keep it.
		mgabka: I was too fast: this function only works for the frem instruction, not for all LLVM intrinsics. In my view there is no need to add messages when the optimization won't happen. You added messages for when it happens, so the lack of one means that it is not happening.
		jolanta.jensen: Forgot about this one. Will be included in the next patch.
		jolanta.jensen: Removed the unnecessary debug print.
static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,		static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
CallInst &CI) {		CallInst &CI) {
if (!CI.getCalledFunction()) {		if (!CI.getCalledFunction()) {
return false;		return false;
}		}

auto IntrinsicID = CI.getCalledFunction()->getIntrinsicID();		auto IntrinsicID = CI.getCalledFunction()->getIntrinsicID();
		mgabka: Could you move this comment above the if statement and remove the braces?
		jolanta.jensen: Fixed.
if (IntrinsicID == Intrinsic::not_intrinsic) {		if (IntrinsicID == Intrinsic::not_intrinsic) {
// Replacement is only performed for intrinsic functions		// Replacement is only performed for intrinsic functions
return false;		return false;
}		}
		mgabka: IMO this would be more readable if written as if statements. Moreover, if the type is neither double nor float we should return false from this function, am I right?
		    StringRef ScalarName;
		    if (ElementType->isFloatTy())
		      ScalarName = TLI.getName(LibFunc_fmodf);
		    else if (ElementType->isDoubleTy())
		      ScalarName = TLI.getName(LibFunc_fmod);
		    else
		      return false;
		Thanks to this, the statement below is not needed and the code is clearer and more compact.
		jolanta.jensen: Fixed. In the unusual case where ScalarName is empty, we will just do 2 unnecessary lookups.
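The selection logic suggested above (pick the scalar routine by element type, otherwise bail out) can be sketched as a standalone function. `ElemKind` and `scalarRemainderName` are hypothetical names used only for this illustration; `fmodf` and `fmod` are the standard C library remainder functions for float and double.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the element type of the vector operand.
enum class ElemKind { Half, Float, Double };

// Returns the scalar libm routine matching the element type, or an empty
// optional for element types that have no mapping.
std::optional<std::string_view> scalarRemainderName(ElemKind K) {
  if (K == ElemKind::Float)
    return std::string_view("fmodf");
  if (K == ElemKind::Double)
    return std::string_view("fmod");
  // Unsupported element type: report failure early instead of looking up
  // an empty name later.
  return std::nullopt;
}
```

Returning an empty optional for unsupported types plays the role of the early `return false` in the review suggestion, so no lookup is attempted with an empty name.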
		mgabka: Please move the comment above the if statement and remove the braces.
		jolanta.jensen: Fixed.

mgabka: Unneeded change.
mgabka: Unneeded change; please revert it.
// Convert vector arguments to scalar type and check that		// Convert vector arguments to scalar type and check that
// all vector operands have identical vector width.		// all vector operands have identical vector width.
ElementCount VF = ElementCount::getFixed(0);		ElementCount VF = ElementCount::getFixed(0);
		mgabka: These two if statements can be combined; C/C++ guarantees that the conditions are evaluated from left to right.
		jolanta.jensen: Fixed.
SmallVector<Type *> ScalarTypes;		SmallVector<Type *> ScalarTypes;
		mgabka: Could you move the comments above the if and remove the braces? To be fair, I think this is not needed at all: can we rely on the check performed by TLI.getVectorizedFunction and add a debug message just below the final return false?
		jolanta.jensen: Yes, we can. It calls into:
		    bool isFunctionVectorizable(StringRef F, const ElementCount &VF) const {
		      return !(getVectorizedFunction(F, VF, false).empty() &&
		               getVectorizedFunction(F, VF, true).empty());
		    }
		Removed the isFunctionVectorizable() check and added a debug message for when a vectorized function isn't found.
for (auto Arg : enumerate(CI.args())) {		for (auto Arg : enumerate(CI.args())) {
		mgabka: I guess that this is handled by "replaceAllUsesWith", but it is worth checking with a test.
		jolanta.jensen: Indeed, it is handled by "replaceAllUsesWith":
		    void Value::replaceAllUsesWith(Value *New) {
		      doRAUW(New, ReplaceMetadataUses::Yes);
		    }
		    void Value::doRAUW(Value *New, ReplaceMetadataUses ReplaceMetaUses) {
		      ...
		      if (ReplaceMetaUses == ReplaceMetadataUses::Yes && isUsedByMetadata())
		        ValueAsMetadata::handleRAUW(this, New);
		      ...
		Do we need a test to confirm, or is this explanation good enough?
auto *ArgType = Arg.value()->getType();		auto *ArgType = Arg.value()->getType();
// Vector calls to intrinsics can still have		// Vector calls to intrinsics can still have
// scalar operands for specific arguments.		// scalar operands for specific arguments.
if (isVectorIntrinsicWithScalarOpAtArg(IntrinsicID, Arg.index())) {		if (isVectorIntrinsicWithScalarOpAtArg(IntrinsicID, Arg.index())) {
ScalarTypes.push_back(ArgType);		ScalarTypes.push_back(ArgType);
} else {		} else {
// The argument in this place should be a vector if		// The argument in this place should be a vector if
// this is a call to a vector intrinsic.		// this is a call to a vector intrinsic.
if (!TLIName.empty()) {		if (!TLIName.empty()) {
return replaceWithTLIFunction(CI, TLIName);		return replaceWithTLIFunction(CI, TLIName);
}		}

return false;		return false;
}		}

static bool runImpl(const TargetLibraryInfo &TLI, Function &F) {		static bool runImpl(const TargetLibraryInfo &TLI, Function &F) {
bool Changed = false;		bool Changed = false;
SmallVector<CallInst *> ReplacedCalls;		SmallVector<Instruction *> ReplacedCalls;
for (auto &I : instructions(F)) {		for (auto &I : instructions(F)) {
		mgabka: This is not needed: the mechanism here should work for both fixed and scalable types if mappings exist. If we want to reject the transformation for scalable vector types, I think we should reject it earlier, i.e. where we detect frem. Please see the LLVM coding guideline on using braces with simple if statements, https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements; it applies to other places in this patch.
		jolanta.jensen: There are no mappings for frem with fixed vectors in the SLEEF library, so we need to check that it's a scalable type. I'll check the code for breaches of the standard and correct them.
if (auto *CI = dyn_cast<CallInst>(&I)) {		if (auto *CI = dyn_cast<CallInst>(&I)) {
if (replaceWithCallToVeclib(TLI, *CI)) {		if (replaceWithCallToVeclib(TLI, *CI)) {
		mgabka: Hi Jolanta, in my opinion it would be better to have a single main entry point here and then branch from inside replaceWithCallToVeclib. That way you can avoid some of the code duplication, for example the debug messages.
		jolanta.jensen: Fixed.
ReplacedCalls.push_back(CI);		ReplacedCalls.push_back(&I);
		mgabka: When moving this outside this function, this check can be combined with the one above, since the ScalableVectorType class exists.
		jolanta.jensen: I changed to ScalableVectorType.
		Changed = true;
		}
		} else if (I.getOpcode() == Instruction::FRem) {
		// If there is a suitable TLI mapping for FRem instruction,
		mgabka (Not Done): nit: better to use consistent spelling, so please use either FRem or frem everywhere.
		jolanta.jensen (Author, Done): Fixed.
		// replace the instruction.
		if (replaceFremWithCallToVeclib(TLI, I)) {
		ReplacedCalls.push_back(&I);
Changed = true;		Changed = true;
}		}
		mgabka (Not Done): The function name "replaceInstructionWithCallToVeclib" does not suggest that this is specific to fmod/frem. At the moment we have no intention to extend it beyond frem, so in my opinion it is worth renaming the function to avoid confusion.
		jolanta.jensen (Author, Done): Fixed.
}		}
		mgabka (Not Done): Please use an early exit instead. Early exits from a function are much better than a long nested if block, since they improve code readability. Please apply this where possible to the other checks you are doing.
		jolanta.jensen (Author, Done): Fixed.
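The early-exit style requested in this thread can be illustrated with a minimal, self-contained sketch. It does not use LLVM: `FakeInst`, `lookupVectorName`, and `pickReplacement` are hypothetical stand-ins invented for illustration, and the two mapped names are simply copied from the SLEEF test expectations in this revision.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for illustration only; this is NOT an LLVM type.
struct FakeInst {
  bool IsFRem;
  bool IsScalableVector;
  std::string ScalarName; // e.g. "fmod"
};

// Hypothetical lookup standing in for a TLI mapping query.
// Returns an empty string when no mapping is known.
std::string lookupVectorName(const std::string &Scalar) {
  if (Scalar == "fmod")
    return "_ZGVsMxvv_fmod";
  if (Scalar == "fmodf")
    return "_ZGVsMxvv_fmodf";
  return "";
}

// Each failed precondition returns immediately, so the success path stays
// at one indentation level and the lookup runs only when it will be used.
std::string pickReplacement(const FakeInst &I) {
  if (!I.IsFRem)
    return "";
  if (!I.IsScalableVector)
    return "";
  return lookupVectorName(I.ScalarName);
}

// Usage example: only a scalable-vector frem with a known mapping is replaced.
void demoEarlyExit() {
  assert(pickReplacement({true, true, "fmod"}) == "_ZGVsMxvv_fmod");
  assert(pickReplacement({true, false, "fmod"}).empty());
  assert(pickReplacement({false, true, "fmod"}).empty());
}
```

Ordering the cheap checks before the lookup also addresses the efficiency point raised elsewhere in this review, since the mapping is never searched for instructions that will be rejected anyway.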
}		}
// Erase the calls to the intrinsics that have been replaced		// Erase the calls to the intrinsics and the FRem instructions that have been
// with calls to the vector library.		// replaced with calls to the vector library.
for (auto *CI : ReplacedCalls) {		for (auto *I : ReplacedCalls) {
		mgabka (Not Done): Could you remove the braces here?
		jolanta.jensen (Author, Done): Fixed. I kept a blank line before the return to make it more visible what is returned.
CI->eraseFromParent();		I->eraseFromParent();
}		}
return Changed;		return Changed;
}		}
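The loop above records replaced instructions in ReplacedCalls and erases them only after the scan over the function has finished. A minimal sketch of that collect-then-erase pattern follows; it uses a `std::list` of strings as a hypothetical stand-in for the instruction list, and `eraseReplaced` is an invented name, not part of the patch.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction list; NOT an LLVM type.
using InstList = std::list<std::string>;

// Remove every "frem" entry, but only after the scan has finished.
// Returns true if the list was modified, mirroring the `Changed` flag.
bool eraseReplaced(InstList &Insts) {
  std::vector<InstList::iterator> Replaced;
  for (auto It = Insts.begin(); It != Insts.end(); ++It)
    if (*It == "frem")
      Replaced.push_back(It);

  // std::list::erase invalidates only the erased iterator, so the other
  // saved iterators remain valid while we delete.
  for (auto It : Replaced)
    Insts.erase(It);

  return !Replaced.empty();
}

// Usage example: the first call removes the entry, the second is a no-op.
void demoEraseReplaced() {
  InstList Insts = {"call", "frem", "ret"};
  InstList Expected = {"call", "ret"};
  assert(eraseReplaced(Insts));
  assert(Insts == Expected);
  assert(!eraseReplaced(Insts));
}
```

Deferring the erasure keeps the traversal simple, because nothing is removed from the container while it is still being iterated.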

		mgabka (Not Done): Why look for it in cases where you are not going to use it? It isn't efficient. Please change it.
		jolanta.jensen (Author, Done): Fixed.
////////////////////////////////////////////////////////////////////////////////		////////////////////////////////////////////////////////////////////////////////
// New pass manager implementation.		// New pass manager implementation.
////////////////////////////////////////////////////////////////////////////////		////////////////////////////////////////////////////////////////////////////////
		mgabka (Not Done): This message is actually printed after looking for the mappings, so it should probably be moved up.
		jolanta.jensen (Author, Done): Fixed.
PreservedAnalyses ReplaceWithVeclib::run(Function &F,		PreservedAnalyses ReplaceWithVeclib::run(Function &F,
FunctionAnalysisManager &AM) {		FunctionAnalysisManager &AM) {
const TargetLibraryInfo &TLI = AM.getResult<TargetLibraryAnalysis>(F);		const TargetLibraryInfo &TLI = AM.getResult<TargetLibraryAnalysis>(F);
auto Changed = runImpl(TLI, F);		auto Changed = runImpl(TLI, F);
if (Changed) {		if (Changed) {
PreservedAnalyses PA;		PreservedAnalyses PA;
PA.preserveSet<CFGAnalyses>();		PA.preserveSet<CFGAnalyses>();
PA.preserve<TargetLibraryAnalysis>();		PA.preserve<TargetLibraryAnalysis>();
[46 lines not shown]

llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll

[186 lines not shown]
; CHECK-SAME: (<vscale x 4 x float> [[IN:%.*]]) #[[ATTR1]] {		; CHECK-SAME: (<vscale x 4 x float> [[IN:%.*]]) #[[ATTR1]] {
; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 4 x float> @llvm.exp2.nxv4f32(<vscale x 4 x float> [[IN]])		; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 4 x float> @llvm.exp2.nxv4f32(<vscale x 4 x float> [[IN]])
; CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]		; CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]
;		;
%1 = call fast <vscale x 4 x float> @llvm.exp2.nxv4f32(<vscale x 4 x float> %in)		%1 = call fast <vscale x 4 x float> @llvm.exp2.nxv4f32(<vscale x 4 x float> %in)
ret <vscale x 4 x float> %1		ret <vscale x 4 x float> %1
}		}

		; TLI mappings for FREM instruction. They are not utilized since the names of the vector library functions
		mgabka (Not Done): A comment explaining why the transformation is not happening would be useful here.
		jolanta.jensen (Author, Done): Fixed.
		; cannot be demangled.

		define <vscale x 2 x double> @llvm_frem_vscale_f64(<vscale x 2 x double> %in1, <vscale x 2 x double> %in2) #0 {
		; CHECK-LABEL: define <vscale x 2 x double> @llvm_frem_vscale_f64
		; CHECK-SAME: (<vscale x 2 x double> [[IN1:%.*]], <vscale x 2 x double> [[IN2:%.*]]) #[[ATTR1]] {
		; CHECK-NEXT: [[OUT:%.*]] = frem fast <vscale x 2 x double> [[IN1]], [[IN2]]
		; CHECK-NEXT: ret <vscale x 2 x double> [[OUT]]
		;
		%out = frem fast <vscale x 2 x double> %in1, %in2
		ret <vscale x 2 x double> %out
		}

		define <vscale x 4 x float> @llvm_frem_vscale_f32(<vscale x 4 x float> %in1, <vscale x 4 x float> %in2) #0 {
		; CHECK-LABEL: define <vscale x 4 x float> @llvm_frem_vscale_f32
		; CHECK-SAME: (<vscale x 4 x float> [[IN1:%.*]], <vscale x 4 x float> [[IN2:%.*]]) #[[ATTR1]] {
		; CHECK-NEXT: [[OUT:%.*]] = frem fast <vscale x 4 x float> [[IN1]], [[IN2]]
		; CHECK-NEXT: ret <vscale x 4 x float> [[OUT]]
		;
		%out = frem fast <vscale x 4 x float> %in1, %in2
		ret <vscale x 4 x float> %out
		}


declare <2 x double> @llvm.log.v2f64(<2 x double>)		declare <2 x double> @llvm.log.v2f64(<2 x double>)
declare <4 x float> @llvm.log.v4f32(<4 x float>)		declare <4 x float> @llvm.log.v4f32(<4 x float>)
declare <vscale x 2 x double> @llvm.log.nxv2f64(<vscale x 2 x double>)		declare <vscale x 2 x double> @llvm.log.nxv2f64(<vscale x 2 x double>)
declare <vscale x 4 x float> @llvm.log.nxv4f32(<vscale x 4 x float>)		declare <vscale x 4 x float> @llvm.log.nxv4f32(<vscale x 4 x float>)

define <2 x double> @llvm_log_f64(<2 x double> %in) {		define <2 x double> @llvm_log_f64(<2 x double> %in) {
; CHECK-LABEL: define <2 x double> @llvm_log_f64		; CHECK-LABEL: define <2 x double> @llvm_log_f64
[176 lines not shown]
%1 = call fast <vscale x 4 x float> @llvm.pow.nxv4f32(<vscale x 4 x float> %in, <vscale x 4 x float> %power)		%1 = call fast <vscale x 4 x float> @llvm.pow.nxv4f32(<vscale x 4 x float> %in, <vscale x 4 x float> %power)
ret <vscale x 4 x float> %1		ret <vscale x 4 x float> %1
}		}

attributes #0 = { "target-features"="+sve" }		attributes #0 = { "target-features"="+sve" }
;.		;.
; CHECK: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }		; CHECK: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
; CHECK: attributes #[[ATTR1]] = { "target-features"="+sve" }		; CHECK: attributes #[[ATTR1]] = { "target-features"="+sve" }
;.		;.
		mgabka (Not Done): Having both arguments of frem as inputs to this function would simplify the test; the same applies to the other tests. Also, the other tests in this file use fast-math flags, so please use them for frem as well.
		jolanta.jensen (Author, Done): Fixed.

llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals
	; RUN: opt -mattr=+sve -vector-library=sleefgnuabi -replace-with-veclib -S < %s | FileCheck %s			; RUN: opt -mattr=+sve -vector-library=sleefgnuabi -replace-with-veclib -S < %s | FileCheck %s

	target triple = "aarch64-unknown-linux-gnu"			target triple = "aarch64-unknown-linux-gnu"

	; NOTE: The existing TLI mappings are not used since the -replace-with-veclib pass is broken for scalable vectors.			; NOTE: The existing TLI mappings are not used since the -replace-with-veclib pass is broken for scalable vectors.

				;.
				; CHECK: @[[LLVM_COMPILER_USED:[a-zA-Z0-9_$"\\.-]+]] = appending global [2 x ptr] [ptr @_ZGVsMxvv_fmod, ptr @_ZGVsMxvv_fmodf], section "llvm.metadata"
				;.
	define <vscale x 2 x double> @llvm_ceil_vscale_f64(<vscale x 2 x double> %in) {			define <vscale x 2 x double> @llvm_ceil_vscale_f64(<vscale x 2 x double> %in) {
	; CHECK-LABEL: @llvm_ceil_vscale_f64(			; CHECK-LABEL: @llvm_ceil_vscale_f64(
	; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 2 x double> @llvm.ceil.nxv2f64(<vscale x 2 x double> [[IN:%.*]])			; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 2 x double> @llvm.ceil.nxv2f64(<vscale x 2 x double> [[IN:%.*]])
	; CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]			; CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
	;			;
	%1 = call fast <vscale x 2 x double> @llvm.ceil.nxv2f64(<vscale x 2 x double> %in)			%1 = call fast <vscale x 2 x double> @llvm.ceil.nxv2f64(<vscale x 2 x double> %in)
	ret <vscale x 2 x double> %1			ret <vscale x 2 x double> %1
	}			}
	[128 lines not shown]
	; CHECK-LABEL: @llvm_fma_vscale_f32(			; CHECK-LABEL: @llvm_fma_vscale_f32(
	; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 4 x float> @llvm.fma.nxv4f32(<vscale x 4 x float> [[A:%.*]], <vscale x 4 x float> [[B:%.*]], <vscale x 4 x float> [[C:%.*]])			; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 4 x float> @llvm.fma.nxv4f32(<vscale x 4 x float> [[A:%.*]], <vscale x 4 x float> [[B:%.*]], <vscale x 4 x float> [[C:%.*]])
	; CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]			; CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]
	;			;
	%1 = call fast <vscale x 4 x float> @llvm.fma.nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x float> %b, <vscale x 4 x float> %c)			%1 = call fast <vscale x 4 x float> @llvm.fma.nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x float> %b, <vscale x 4 x float> %c)
	ret <vscale x 4 x float> %1			ret <vscale x 4 x float> %1
	}			}

				; NOTE: TLI mapping for FREM instruction.

				define <vscale x 2 x double> @llvm_frem_vscale_f64(<vscale x 2 x double> %in1, <vscale x 2 x double> %in2) {
				; CHECK-LABEL: @llvm_frem_vscale_f64(
				; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 2 x double> @_ZGVsMxvv_fmod(<vscale x 2 x double> [[IN1:%.*]], <vscale x 2 x double> [[IN2:%.*]], <vscale x 2 x i1> shufflevector (<vscale x 2 x i1> insertelement (<vscale x 2 x i1> poison, i1 true, i64 0), <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer))
				; CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
				;
				%out = frem fast <vscale x 2 x double> %in1, %in2
				ret <vscale x 2 x double> %out
				}

				define <vscale x 4 x float> @llvm_frem_vscale_f32(<vscale x 4 x float> %in1, <vscale x 4 x float> %in2) {
				; CHECK-LABEL: @llvm_frem_vscale_f32(
				; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 4 x float> @_ZGVsMxvv_fmodf(<vscale x 4 x float> [[IN1:%.*]], <vscale x 4 x float> [[IN2:%.*]], <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 true, i64 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer))
				; CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]
				;
				%out = frem fast <vscale x 4 x float> %in1, %in2
				ret <vscale x 4 x float> %out
				}
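The replacement names checked above, `_ZGVsMxvv_fmod` and `_ZGVsMxvv_fmodf`, appear to follow the vector function ABI naming layout `_ZGV<isa><mask><vlen><params>_<scalar>`. Under that assumption, `s` selects SVE, `M` marks a masked variant (consistent with the extra all-true `i1` mask operand in the CHECK lines), `x` marks a scalable vector length, and `vv` describes two vector parameters. The sketch below splits a name into those fields. It is a simplified illustration, not LLVM's demangler: `VFABIParts` and `splitVFABIName` are hypothetical names, and parameter tokens are kept as an opaque string.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for illustration; not an LLVM data structure.
struct VFABIParts {
  char ISA = 0;        // 's' in the names checked above
  bool Masked = false; // 'M' = masked, 'N' = unmasked
  std::string VLen;    // "x" = scalable, otherwise a decimal lane count
  std::string Params;  // parameter tokens, kept opaque, e.g. "vv"
  std::string Scalar;  // scalar function name, e.g. "fmod"
};

// Split Name into its fields; returns false if it does not fit the layout.
bool splitVFABIName(const std::string &Name, VFABIParts &Out) {
  const std::string Prefix = "_ZGV";
  if (Name.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  std::size_t Pos = Prefix.size();
  if (Name.size() < Pos + 3) // need ISA, mask and at least one VLen character
    return false;

  VFABIParts P;
  P.ISA = Name[Pos++];
  const char Mask = Name[Pos++];
  if (Mask != 'M' && Mask != 'N')
    return false;
  P.Masked = (Mask == 'M');

  if (Name[Pos] == 'x') {
    P.VLen = "x";
    ++Pos;
  } else {
    while (Pos < Name.size() &&
           std::isdigit(static_cast<unsigned char>(Name[Pos])))
      P.VLen += Name[Pos++];
  }
  if (P.VLen.empty())
    return false;

  // Everything up to the next '_' is the parameter list; the rest is the
  // scalar function name.
  const std::size_t Sep = Name.find('_', Pos);
  if (Sep == std::string::npos || Sep + 1 >= Name.size())
    return false;
  P.Params = Name.substr(Pos, Sep - Pos);
  P.Scalar = Name.substr(Sep + 1);
  Out = P;
  return true;
}

// Usage example with the two names from the CHECK lines.
void demoSplitVFABIName() {
  VFABIParts P;
  assert(splitVFABIName("_ZGVsMxvv_fmodf", P));
  assert(P.ISA == 's' && P.Masked && P.VLen == "x");
  assert(P.Params == "vv" && P.Scalar == "fmodf");
  assert(!splitVFABIName("llvm.log.nxv2f64", P));
}
```

A name that does not start with `_ZGV` is rejected outright, which is the same kind of failure the ArmPL test comment in this revision describes as a name that cannot be demangled.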

	define <vscale x 2 x double> @llvm_log_vscale_f64(<vscale x 2 x double> %in) {			define <vscale x 2 x double> @llvm_log_vscale_f64(<vscale x 2 x double> %in) {
	; CHECK-LABEL: @llvm_log_vscale_f64(			; CHECK-LABEL: @llvm_log_vscale_f64(
	; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 2 x double> @llvm.log.nxv2f64(<vscale x 2 x double> [[IN:%.*]])			; CHECK-NEXT: [[TMP1:%.*]] = call fast <vscale x 2 x double> @llvm.log.nxv2f64(<vscale x 2 x double> [[IN:%.*]])
	; CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]			; CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
	;			;
	%1 = call fast <vscale x 2 x double> @llvm.log.nxv2f64(<vscale x 2 x double> %in)			%1 = call fast <vscale x 2 x double> @llvm.log.nxv2f64(<vscale x 2 x double> %in)
	ret <vscale x 2 x double> %1			ret <vscale x 2 x double> %1
	}			}
	[202 lines not shown]
	; CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]			; CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]
	;			;
	%1 = call fast <vscale x 4 x float> @llvm.trunc.nxv4f32(<vscale x 4 x float> %in)			%1 = call fast <vscale x 4 x float> @llvm.trunc.nxv4f32(<vscale x 4 x float> %in)
	ret <vscale x 4 x float> %1			ret <vscale x 4 x float> %1
	}			}

	declare <vscale x 2 x double> @llvm.ceil.nxv2f64(<vscale x 2 x double>)			declare <vscale x 2 x double> @llvm.ceil.nxv2f64(<vscale x 2 x double>)
	declare <vscale x 4 x float> @llvm.ceil.nxv4f32(<vscale x 4 x float>)			declare <vscale x 4 x float> @llvm.ceil.nxv4f32(<vscale x 4 x float>)
	declare <vscale x 2 x double> @llvm.copysign.nxv2f64(<vscale x 2 x double>, <vscale x 2 x double>)			declare <vscale x 2 x double> @llvm.copysign.nxv2f64(<vscale x 2 x double>, <vscale x 2 x double>)
				mgabka (Not Done): Please use the same naming scheme as the other functions in this file, so this should be llvm_frem_vscale_f64. Please apply this to the other tests.
				jolanta.jensen (Author, Done): The other test names start with @llvm since they call functions whose names start with @llvm. My test names start with @frem since I call frem. In my opinion this is consistent with the naming scheme in this file.
				mgabka (Not Done): In my opinion it only makes things more difficult, as it breaks the alphabetical order: when looking for the output for frem we need to look at the end of the file instead of the place where the math operations whose names start with "f" are.
				jolanta.jensen (Author, Done): replace-intrinsics-with-veclib-sleef-scalable.ll is alphabetically ordered for intrinsics tests, but replace-intrinsics-with-veclib-armpl.ll is not: it starts with cos and the next one is sin. The remainder of the test file is fairly alphabetical, even if log10 comes after log2. But I moved the frem tests so they come in alphabetical order, as they are listed alphabetically in VecFuncs.def. I also renamed them to llvm_frem even though they are not intrinsics.
	declare <vscale x 4 x float> @llvm.copysign.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>)			declare <vscale x 4 x float> @llvm.copysign.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>)
	declare <vscale x 2 x double> @llvm.cos.nxv2f64(<vscale x 2 x double>)			declare <vscale x 2 x double> @llvm.cos.nxv2f64(<vscale x 2 x double>)
	declare <vscale x 4 x float> @llvm.cos.nxv4f32(<vscale x 4 x float>)			declare <vscale x 4 x float> @llvm.cos.nxv4f32(<vscale x 4 x float>)
	declare <vscale x 2 x double> @llvm.exp.nxv2f64(<vscale x 2 x double>)			declare <vscale x 2 x double> @llvm.exp.nxv2f64(<vscale x 2 x double>)
	declare <vscale x 4 x float> @llvm.exp.nxv4f32(<vscale x 4 x float>)			declare <vscale x 4 x float> @llvm.exp.nxv4f32(<vscale x 4 x float>)
	declare <vscale x 2 x double> @llvm.exp2.nxv2f64(<vscale x 2 x double>)			declare <vscale x 2 x double> @llvm.exp2.nxv2f64(<vscale x 2 x double>)
	declare <vscale x 4 x float> @llvm.exp2.nxv4f32(<vscale x 4 x float>)			declare <vscale x 4 x float> @llvm.exp2.nxv4f32(<vscale x 4 x float>)
	declare <vscale x 2 x double> @llvm.fabs.nxv2f64(<vscale x 2 x double>)			declare <vscale x 2 x double> @llvm.fabs.nxv2f64(<vscale x 2 x double>)
	[33 lines not shown]


Revision Contents

Diff 549313

llvm/include/llvm/Analysis/VectorUtils.h

llvm/lib/Analysis/VFABIDemangling.cpp

llvm/lib/CodeGen/ReplaceWithVeclib.cpp

llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll

llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll
