SPE floating-point compare instructions only update the GT bit in the CR
field. All predicates must therefore be reduced to GT/LE.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 43191 Build 43919: arc lint + arc unit
Event Timeline
Justin,
last week, I found out, that there is a second place for this GT/LE modification needed. Therefore I moved the switch/case into a separate function.
diff --git a/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp b/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp index 2cb0387..b678656 --- a/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp +++ b/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp @@ -3860,6 +3878,36 @@ static PPC::Predicate getPredicateForSetCC(ISD::CondCode CC) { } } +// For SPE operations, the Result is stored in GT +// Return the corresponding GT or LE code for this +// Prior to this, the Compare must have been modified to EF?CMP?? in SelectCC +static PPC::Predicate getPredicateForSetCCForSPE(ISD::CondCode CC) { + PPC::Predicate Opc = PPC::PRED_SPE; + switch(CC) { + case ISD::SETOEQ: + case ISD::SETEQ: + case ISD::SETOLT: + case ISD::SETLT: + case ISD::SETOGT: + case ISD::SETGT: + Opc = PPC::PRED_GT; + break; + case ISD::SETUNE: + case ISD::SETNE: + case ISD::SETULE: + case ISD::SETLE: + case ISD::SETUGE: + case ISD::SETGE: + Opc = PPC::PRED_LE; + break; + default: + printf("Undefined SPE Predicate for CC %u\n",CC); + break; + } + return Opc; +} + + /// getCRIdxForSetCC - Return the index of the condition register field /// associated with the SetCC condition, and whether or not the field is /// treated as inverted. That is, lt = 0; ge = 0 inverted. @@ -4890,6 +4937,13 @@ void PPCDAGToDAGISel::Select(SDNode *N) { } unsigned BROpc = getPredicateForSetCC(CC); + // Override BROpc if SPE with f64/f32 operation + // Watch out: N->getOperand(0).getValueType is not the same as N->getValueType(0) + if (PPCSubTarget->hasSPE() + && ( N->getOperand(0).getValueType() == MVT::f64 + || N->getOperand(0).getValueType() == MVT::f32) ) { + BROpc = getPredicateForSetCCForSPE(CC); + } unsigned SelectCCOp; if (N->getValueType(0) == MVT::i32) @@ -5048,6 +5102,12 @@ void PPCDAGToDAGISel::Select(SDNode *N) { PCC |= getBranchHint(PCC, FuncInfo, N->getOperand(4)); SDValue CondCode = SelectCC(N->getOperand(2), N->getOperand(3), CC, dl); + + if (PPCSubTarget->hasSPE() && N->getOperand(2).getValueType().isFloatingPoint()) { + // For SPE instructions, the result is in GT bit of the CR + PCC = getPredicateForSetCCForSPE(CC); + } + SDValue Ops[] = { getI32Imm(PCC, dl), CondCode, N->getOperand(4), N->getOperand(0) }; CurDAG->SelectNodeTo(N, PPC::BCC, MVT::Other, Ops);
Not sure, if the naming of the function getPredicateForSetCCForSPE() is good. Maybe we need to rename this.
Best regards,
Kei
Thanks, @kthomsen. Can you provide a LLVM IR test case for this? And test cases for the other two reviews I added you on, since you wrote the patches initially?
I don't have build test cases for the patches. For the next 4 weeks, I will be on vacation and at two conferences.
My (local) tests are done by creating a C file with some specific code to test the patch and have a look into the generated assembler file and the results by running the binary on a target machine.
Looks like, that I need to find someone, who can show/guide/help me how to write and run a test cases.
@kthomsen you can use 'clang -emit-llvm' to emit LLVM IR from your C test cases. There are reducer passes that you can run on that as well, to reduce the test case to the smallest needed, but I can't recall what they are.
Thanks @Jim, yes the code was copied from a comment in another review and I didn't run clang-format on it. Thanks for pointing out 'git-clang-format', I didn't know about that before.
Ping on this? I've been using this patch for quite some time now, and really want to get it in before 10.
Looks like to me that Vector Compare in SPE also has different CR bit semantics .
So this patch will only handle floating point part for SPE? If so, maybe we should make it explicit in title.
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp | ||
---|---|---|
3871 | Can we merge this into getPredicateForSetCC? So that we don't need to add code in every callsite of getPredicateForSetCC? | |
3891 | If we don't merge this into getPredicateForSetCC, maybe it would be better to list those should have been legalized ones like getPredicateForSetCC. | |
5088 | Why we are using isFloatingPoint() here, while checking MVT::f32/MVT::f64 above? | |
llvm/test/CodeGen/PowerPC/spe.ll | ||
2 | Maybe we should include one RUN line for -O0 to test FastIsel as well? | |
2 | Why not generate the checks using script? | |
149–150 | Why don't we check bits with ugt? |
Correct. We currently only do codegen for SPE floating point, not for vector. I currently have no plans to implement vector support any time soon, if ever.
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp | ||
---|---|---|
3891 | Since this is called in more places than getPredicateForSetCC, this comment makes more sense to implement. | |
5088 | No real reason why. I don't see any difference between them in this use case, but I can unify the conditions. | |
llvm/test/CodeGen/PowerPC/spe.ll | ||
2 | Is there an example of generating checks via a script? These checks were generated from a C file provided by @kthomsen a while back, with -emit-llvm, and pared down to the smallest working set I could. | |
149–150 | The asm generated with ugt is much larger, including extra efscmpeq checks as well, so I chose not to include that. |
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp | ||
---|---|---|
3891 | ? Why more places than getPredicateForSetCC? I think we call it after each getPredicateForSetCC | |
5088 | Yes, please. | |
llvm/test/CodeGen/PowerPC/spe.ll | ||
2 | Just run llvm/utils/update_llc_test_checks.py --llc-binary ../build/bin/llc llvm/test/CodeGen/PowerPC/spe.ll . | |
149–150 | But then we lose the test point here? |
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp | ||
---|---|---|
3891 | You're right. I didn't check the full path below before the getPredicateForSetCCForSPE(). However, getPredicateForSetCCForSPE() is *only* to be used for floating point checks. getPredicateForSetCC() doesn't take a type. I guess I could add a type argument for getPredicateForCC(). | |
llvm/test/CodeGen/PowerPC/spe.ll | ||
2 | Thanks. I was unaware of that script, and should probably poke around more to see what else is there that I can make use of in the future. | |
149–150 | Good point. Originally, this test was just to make sure the instruction itself was generated. The other tests I adapted because they were easy to add in the correctness of the condition check. |
Address comments. As part of this I ran 'update_llc_test_checks.py' on the
whole spe.ll file. I'm not sure if that's necessary for this, but I included
the output anyway. If it's better as a separate change I can do that.
LGTM.
Yes, it would be better if you can run update_llc_test_checks.py without this patch first, commit it, rebase, then run update_llc_test_checks.py with this patch again.
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp | ||
---|---|---|
5086 | Unnecessary new line? |
You can refer https://llvm.org/docs/Phabricator.html#committing-a-change to commit a change.
If your commit message has Differential Revision: <URL>, the differential will close automatically.
arc patch D<Revision> is used to fetch a differential from Phabricator.
https://llvm.org/docs/Phabricator.html#committing-someone-s-change-from-phabricator
@Jim Yeah, I know. I normally do fix the commit message, but forgot before pushing this one. I'll have to check, but I thought arc had a way to update the commit message when generating the review.
arc amend copies Phabricator summary to the git commit message. It also rewrites metadata tags such as Reviewed By:. I use a script to strip unneeded metadata tags before committing: https://lists.llvm.org/pipermail/llvm-dev/2020-January/137895.html
Can we merge this into getPredicateForSetCC? So that we don't need to add code in every callsite of getPredicateForSetCC?