This is an archive of the discontinued LLVM Phabricator instance.

Expand the fp_to_int/int_to_fp/fp_round/fp_extend as libcall for fp128
ClosedPublic

Authored by steven.zhang on Nov 12 2020, 3:10 AM.

Details

Summary

X86 and AArch64 expand it as libcall inside the target. And PowerPC also want to expand them as libcall for P8. So, I implement this in the legalizer and remove the code for X86/AArch64 to avoid the duplicate code.

Diff Detail

Event Timeline

steven.zhang created this revision.Nov 12 2020, 3:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 12 2020, 3:10 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
steven.zhang requested review of this revision.Nov 12 2020, 3:10 AM
RKSimon added inline comments.Nov 12 2020, 4:28 AM
llvm/include/llvm/CodeGen/TargetLowering.h
1058

Maybe limit the scope to something like isSupportedCastOperation?

X86 uses Custom action based on the output type and checks the input type in the Custom handler. If the input type and output type are legal the custom handler returns the SDValue that was passed to the custom handler. That tells LegalizeDAG that no custom handling was required and it should be considered legal. Would that work for your case?

X86 uses Custom action based on the output type and checks the input type in the Custom handler. If the input type and output type are legal the custom handler returns the SDValue that was passed to the custom handler. That tells LegalizeDAG that no custom handling was required and it should be considered legal. Would that work for your case?

I think not. The way why X86 works is that, return SDValue() for custom handler means legal. But what PowerPC wants is to expand it. We didn't have good way to tell the legalizer to expand it when we are inside the custom handler as far as I know...

Return SDValue() means Expand. Returning the original op means Legal. Returning any other node means custom handling was done. X86 does all 3.

steven.zhang abandoned this revision.Nov 12 2020, 11:02 PM

Return SDValue() means Expand. Returning the original op means Legal. Returning any other node means custom handling was done. X86 does all 3.

Wow~ That is exactly what I want! Thank you for the comments @craig.topper @RKSimon

I have to reopen this revision as the custom way still cannot meet PowerPC's requirement, because it will try to expand it first. From the implementation, it seems that we are trying our best to expand the node without checking the legalization of the expanded node, which results in several runtime call eventually. And sometimes, the expand never fall into libcall for some opcode ... See

case ISD::SINT_TO_FP:
case ISD::STRICT_SINT_TO_FP:
  Tmp1 = ExpandLegalINT_TO_FP(Node, Tmp2);
  Results.push_back(Tmp1);
  if (Node->isStrictFPOpcode())
    Results.push_back(Tmp2);
  break;

What we are trying to do is to call the runtime call for such operations.

Address Reviewer's comments.

I have to reopen this revision as the custom way still cannot meet PowerPC's requirement, because it will try to expand it first. From the implementation, it seems that we are trying our best to expand the node without checking the legalization of the expanded node, which results in several runtime call eventually. And sometimes, the expand never fall into libcall for some opcode ... See

case ISD::SINT_TO_FP:
case ISD::STRICT_SINT_TO_FP:
  Tmp1 = ExpandLegalINT_TO_FP(Node, Tmp2);
  Results.push_back(Tmp1);
  if (Node->isStrictFPOpcode())
    Results.push_back(Tmp2);
  break;

What we are trying to do is to call the runtime call for such operations.

Address Reviewer's comments.

Yeah if you want to emit a libcall that is an issue. AArch64 and X86 both worked around this by emitting libcalls directly from Custom handling. See the callers of LowerF128Call in both targets. But this change might be better.

I have to reopen this revision as the custom way still cannot meet PowerPC's requirement, because it will try to expand it first. From the implementation, it seems that we are trying our best to expand the node without checking the legalization of the expanded node, which results in several runtime call eventually. And sometimes, the expand never fall into libcall for some opcode ... See

case ISD::SINT_TO_FP:
case ISD::STRICT_SINT_TO_FP:
  Tmp1 = ExpandLegalINT_TO_FP(Node, Tmp2);
  Results.push_back(Tmp1);
  if (Node->isStrictFPOpcode())
    Results.push_back(Tmp2);
  break;

What we are trying to do is to call the runtime call for such operations.

Address Reviewer's comments.

Yeah if you want to emit a libcall that is an issue. AArch64 and X86 both worked around this by emitting libcalls directly from Custom handling. See the callers of LowerF128Call in both targets. But this change might be better.

Thank you for the information. I will try to remove those code from AArch64 and X86 and use the way I did in D91757 to common all the code.

simoll added a subscriber: simoll.Nov 19 2020, 2:50 AM

I know that Simon mentioned that perhaps we should limit this to cast operations but TBH, I think there may be value in making this general and passing in the SDNode itself rather than opcode/type(s). Then all of this logic can be greatly simplified into something like this:

void SelectionDAGLegalize::LegalizeOp(SDNode *Node) {
  LLVM_DEBUG(dbgs() << "\nLegalizing: "; Node->dump(&DAG));
  if (TLI.opRequiresLibCall(Node))
    ConvertNodeToLibcall(Node);

And presumably, there would be another function for the TLI to provide the RTLIB::Libcall for the specific SDNode.

And of course, this needs some tests.

llvm/include/llvm/CodeGen/TargetLowering.h
1081

The semantics here are rather subtle and I don't think the name describes the semantics adequately.
"Supported" in this case means that some expansion can be performed - not that the operation is Legal for the target. The purpose of this appears to be to determine if the conversion requires a libcall. So why not name it that? Perhaps:

virtual bool castRequiresLibCall(unsigned Op, EVT SrcVT, EVT DestVT) const {
  return false;
}

And then of course, flip the ternary operator above.

Thank you for Nemanjai's comments. I have changed it as suggested as it is indeed more clean, but with a different name.

I have removed all the places that call the LowerF128Call Craig mentioned for AArch64 and X86 and do it inside legalizer in an uniform way. However, I see some test change with this and I am not sure if it is expected change due to the different implementation of expanding the libcall. Can someone please help me double confirm this ?

I will rebase D91757 for powerpc side change.

steven.zhang retitled this revision from [NFC] Add hook for target to customize different legalization action according to the input type to Add hook for target to customize different legalization action according to the input type.Nov 20 2020, 3:04 AM

It would appear to me that the only test case changes are that the libcalls were turned into tail calls - but of course my knowledge of X86 and ARM asm is next to zero. Presumably the original lowering didn't set the necessary flags on the call and now the legalizer does so.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2980 ↗(On Diff #306638)

A return after llvm_unreachable? Does the build compiler warn about no return here?

llvm/test/CodeGen/AArch64/arm64-fp128.ll
7 ↗(On Diff #306638)

These appear to have been turned into tail calls with no stack frame. That seems like a positive sideeffect but an ARM/AArch64 expert should certainly confirm that.

steven.zhang added inline comments.Nov 22 2020, 6:18 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2980 ↗(On Diff #306638)

No... Also, there is already such kind of code. See

SDValue AArch64TargetLowering::LowerOperation(SDValue Op,
                                              SelectionDAG &DAG) const {
  LLVM_DEBUG(dbgs() << "Custom lowering: ");
  LLVM_DEBUG(Op.dump());

  switch (Op.getOpcode()) {
  default:
    llvm_unreachable("unimplemented operand");
    return SDValue();
steven.zhang edited the summary of this revision. (Show Details)

Add the PowerPC side support. I have verify the functionality with PowerPC and it works well. Could someone from AArch64 or X86 verify the functionality for this patch as I see some test change though it looks good (I didn't have the X86 and AArch64 env) ?

craig.topper added inline comments.Nov 24 2020, 9:25 PM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4329

The input type should be legal here so I don't think this comment makes sense as is.

4342

The result of a dyn_cast shouldn't be checked for null by an assert. That should just be a cast. Is it dyn_cast in the type legalizer too?

Address comments.

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4329

Ah, the evil for copy-paste. Yes, it is legal here. I will update the comments. For PowerPC, i1 is legal type, so, we need this part of code to extend it to the type that has libcall.

4342

Good catch. I will update it with cast. In fact, the type legalizer didn't skip this extra operand which seems wrong. See SoftenFloatRes_FP_ROUND(). We can fix it later I think.

yubing added a subscriber: yubing.Nov 26 2020, 9:09 PM

Could we check whether the input type is fp128 at the begining of PPCTargetLowering::LowerFP_TO_INT and if it is a fp128 input type, we directly ConvertNodeToLibcall?

Besides, I found a strange thing:
In PPCTargetLowering::LowerFP_TO_INT, we will return original op if the input type is fp128, which means in some subtarget , fp128's fptoint is legal and have corresponding instructions.
But what you are going to do is to do a hack and expand it to libcall at the beginning of legalizeOp(...). It seems there is a conflict here.

qiucf added a subscriber: qiucf.Nov 26 2020, 10:19 PM

Besides, I found a strange thing:
In PPCTargetLowering::LowerFP_TO_INT, we will return original op if the input type is fp128, which means in some subtarget , fp128's fptoint is legal and have corresponding instructions.
But what you are going to do is to do a hack and expand it to libcall at the beginning of legalizeOp(...). It seems there is a conflict here.

If I understand correctly, the situation is:

  1. fptoint is legal on subtarget power9, but needs libcall to implement on power8 or earlier.
  2. We can't set fptoint as Expand on power8, since in this case Expand won't fallthrough to libcall.
  3. ConvertNodeToLibcall can't be referenced in target-specific lowering.
  4. Besides, the framework only decides the action of fptoint by return-value type. But things like 'double to i32' doesn't need libcall on power8.

Could we check whether the input type is fp128 at the begining of PPCTargetLowering::LowerFP_TO_INT and if it is a fp128 input type, we directly ConvertNodeToLibcall?

I am not sure if this is feasible as ConvertNodeToLibcall is for legalizer not target lower.

Besides, I found a strange thing: In PPCTargetLowering::LowerFP_TO_INT, we will return original op if the input type is fp128, which means in some subtarget , fp128's fptoint is legal and have corresponding instructions. But what you are going to do is to do a hack and expand it to libcall at the beginning of legalizeOp(...). It seems there is a conflict here.

That is something that we did to fix the limitation of current setOperationAction mechanism and I am fine if we have any better solution for this... In fact, they are not conflict as the hook means that, expand it as libcall without querying the operation action as the operation action cannot handle the case that there are more than one input type.

We used to handle this by having it as custom and inside the hook, different the behavior by different return value. But we have no way to do something like this:

Legalize the sint_to_fp as libcall if the source type is fp128.

Any thoughts ?

Could we check whether the input type is fp128 at the begining of PPCTargetLowering::LowerFP_TO_INT and if it is a fp128 input type, we directly ConvertNodeToLibcall?

I am not sure if this is feasible as ConvertNodeToLibcall is for legalizer not target lower.

Besides, I found a strange thing: In PPCTargetLowering::LowerFP_TO_INT, we will return original op if the input type is fp128, which means in some subtarget , fp128's fptoint is legal and have corresponding instructions. But what you are going to do is to do a hack and expand it to libcall at the beginning of legalizeOp(...). It seems there is a conflict here.

That is something that we did to fix the limitation of current setOperationAction mechanism and I am fine if we have any better solution for this... In fact, they are not conflict as the hook means that, expand it as libcall without querying the operation action as the operation action cannot handle the case that there are more than one input type.

We used to handle this by having it as custom and inside the hook, different the behavior by different return value. But we have no way to do something like this:

Legalize the sint_to_fp as libcall if the source type is fp128.

Any thoughts ?

The Expand path falls back to Libcall if the expand fails. So if you return SDValue() from the custom handler, what does the expand for sint_to_fp try to right now for your fp128 cases?

I know do know the Expand for FP_EXTEND will never fail and tries to do a stack memory operation. But maybe we're missing a extload legality check there that we could use to make it fail.

Could we check whether the input type is fp128 at the begining of PPCTargetLowering::LowerFP_TO_INT and if it is a fp128 input type, we directly ConvertNodeToLibcall?

I am not sure if this is feasible as ConvertNodeToLibcall is for legalizer not target lower.

Besides, I found a strange thing: In PPCTargetLowering::LowerFP_TO_INT, we will return original op if the input type is fp128, which means in some subtarget , fp128's fptoint is legal and have corresponding instructions. But what you are going to do is to do a hack and expand it to libcall at the beginning of legalizeOp(...). It seems there is a conflict here.

That is something that we did to fix the limitation of current setOperationAction mechanism and I am fine if we have any better solution for this... In fact, they are not conflict as the hook means that, expand it as libcall without querying the operation action as the operation action cannot handle the case that there are more than one input type.

We used to handle this by having it as custom and inside the hook, different the behavior by different return value. But we have no way to do something like this:

Legalize the sint_to_fp as libcall if the source type is fp128.

Any thoughts ?

The Expand path falls back to Libcall if the expand fails. So if you return SDValue() from the custom handler, what does the expand for sint_to_fp try to right now for your fp128 cases?

I know do know the Expand for FP_EXTEND will never fail and tries to do a stack memory operation. But maybe we're missing a extload legality check there that we could use to make it fail.

Yeah, we can make FP_EXTEND work if check the legality. For ExpandLegalINT_TO_FP(), it will try to use the stack memory operation to convert the int to f64 first, then, extend it to f128. We also can check the legality of the FPRound/FPExtend to make it fail. Then, we will hit assertion: assert(!isSigned && "Legalize cannot Expand SINT_TO_FP for i64 yet"); So, we can make it fail to if it is not signed. Then, get the final step to have a complicate algorithm to do the convert. But we can make it fail if check the legality.

This is the first attempt I made before but hit too many issues and I feel that the expand will try best to expand it without checking the legality. If we have concern on opening such a backdoor for legalization action, I can update the patch to add the legality check to make the expand fail.

steven.zhang retitled this revision from Add hook for target to customize different legalization action according to the input type to Expand the fp_to_int/int_to_fp/fp_round/fp_extend as libcall for fp128.
steven.zhang edited the summary of this revision. (Show Details)

Make the expand failed for fp_to_int/int_to_fp/fp_round/fp_extend as well as the strict counterpart.

Gentle ping ...

As we have the time pressure for this, if there is concern from other target about this patch, I will abandon this patch and do it inside PowerPC as what AArch64/X86 did now. Personally, I like the idea to common them into legalizer.

craig.topper added inline comments.Dec 8 2020, 8:09 PM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4336

Why is STRICT_FP_ROUND handled above and not down here?

4351

We shouldn't to get call getTypeToTransformTo here. The type is already legal right?

Address comments.

steven.zhang added inline comments.Dec 11 2020, 2:01 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4351

Good catch.

Gentle ping ...

This revision is now accepted and ready to land.Dec 14 2020, 9:15 PM

Rebase tests change for PowerPC.

This revision was landed with ongoing or failed builds.Dec 17 2020, 12:00 AM
This revision was automatically updated to reflect the committed changes.