Patch adds five new GICombinerRules, one for each of the following unary FP instrs: G_FNEG, G_FABS, G_FPTRUNC, G_FSQRT, and G_FLOG2. The combine rules perform the FP operation on the constant operand and replace the original instr with the result. Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules.
Details
Diff Detail
Event Timeline
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1481 | Why go through double instead of preserving the APFloat? |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1462 | Wrong rounding mode |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1481 | Because GIDefMatchData wants to have the variable uninitialized, which would call the default ctor of APFloat which is private. See following pseudo code: APFLoat MatchDataN; // calls APFloat() if (matchCombineConstantFoldFpUnary(MI, MatchDataN)) replaceInstWithAPFloat(MI, MatchDataN); // dummy function for example The error looks like: llvm-project/build/lib/Target/AArch64/AArch64GenPreLegalizeGICombiner.inc:343:11: error: calling a private constructor of class 'llvm::APFloat' APFloat MatchData23; ^ llvm-project/llvm/include/llvm/ADT/APFloat.h:842:3: note: implicitly declared private here APFloat() : U(IEEEdouble()) { ^ |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1481 | I guess you could work around this by keeping it wrapped in Optional<APFloat> |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1462 | Should the correct rounding mode be rmNearestTiesToEven? Is that only for G_FPTRUNC to LLT::scalar(16) and not LLT::scalar(32)? | |
1481 | That could work. I ended up liking the current solution with replaceInstWithFConstant over my initial prototype where I tried passing around the APFloat&, because buildFConstant(DstOp, double) will convert the double to the appropriate APFloat depending on the LLT of the DstOp. That way we can just take advantage of the existing replaceInstWithFConstant function. |
- match / apply functions use Optional<APFloat> to pass info about the constant FP.
- Updated rounding modes to be rmNearestTiesToEven
- Refactored constantFoldFpUnary
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1486–1489 | Yes, that's why there should be one. An FP LLT isn't any arbitrary number of bits, there's still only the handful of valid FP type combinations. We're probably going to have to add something to track f16 vs. bf16, but for now it's still a switch over a handful of valid FP sizes |
- Added LLT helper function to get float semantic from scalar LLT (IEEE fp semantics only at the moment)
- Updated constantFoldFpUnary to use the new helper function
- Rebased / fixed merge conflicts
llvm/include/llvm/Target/GlobalISel/Combine.td | ||
---|---|---|
294 | Is there any reason these are all separate combines, when they're all using the same function? Have you found it useful to be able to turn these on/off per-opcode? Most other combines look like this: def fconstant_matchinfo: GIDefMatchData<"Optional<APFloat>">; def constant_fold_unary: GICombineRule < (defs root:$root, fconstant_matchinfo:$info), (match (wip_match_opcode G_FNEG, G_FABS, G_FPTRUNC, G_FSQRT, G_FLOG2):$root, [{ return Helper.matchCombineConstantFoldFpUnary(*${root}, ${info}); }]), (apply [{ return Helper.applyCombineConstantFoldFpUnary(*${root}, ${info}); }]) >; | |
295 | Do you need separate matchinfo definitions? I've noticed all the combines do this, but I think it would be better to just say def fconstant_matchinfo: GIDefMatchData<"Optional<APFloat>">; and then reuse it in every combine that uses it versus redefining matchinfo for every combine. (The rest of the combines could probably be cleaned up similarly in a later commit if this works) | |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
1449 | This should probably assert or be a llvm_unreachable, since we never expect to run this with any other opcode. |
llvm/include/llvm/Target/GlobalISel/Combine.td | ||
---|---|---|
294 | No reason, except I didn't see that we could have a list of opcodes. Will fix that. | |
295 | When they are all refactored into a single combine rule, we'll end up with one matchinfo for free. | |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
1449 | Makes sense to have an assert. If somehow control flow ended up there, then the developer would likely want the compiler to assert rather than return None. |
- Collapsed the many combine rules into a single combine rule with a list of matchable opcodes
- Made the default switch case in constantFoldFpUnary be llvm_unreachable
llvm/include/llvm/Target/GlobalISel/Combine.td | ||
---|---|---|
295 | Tablegen isn't smart enough to reuse identical matchinfos (although this should really be fixed) |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1443 | Probably not anymore | |
1459 | You're right, no longer necessary | |
llvm/lib/CodeGen/LowLevelType.cpp | ||
64 | I figured calling it on the vector element type would likely be the typical intended use, and calling on the aggregate vector type would likely be bug (and wouldn't really make much sense). |
- Rebased and fixed merge conflicts
- Removed unnecessary type checks in constantFoldFpUnary
- Removed G_FPTRUNC special case for S64 in constantFoldFpUnary
Is there any reason these are all separate combines, when they're all using the same function? Have you found it useful to be able to turn these on/off per-opcode?
Most other combines look like this: