Details
Diff Detail
Unit Tests
Event Timeline
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
537 ↗ | (On Diff #365026) | What about all the optimizations we have for mul by constant using SHXADD? Won't this miss out on those? |
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
537 ↗ | (On Diff #365026) | I do not think it will affect the optimization with SHxADD. For the ones such as For the rules such as My patch still generate better code. c1 + x * 72 (c1 is a non-simm12 constant) before current patch lui Ry, higher-bits of c1 addi Ry, Ry, lower-12-bits of c1 sh3add Rz, x, x sh3add Rz, Rz, Ry after my patch addi Ry, x, c1/72 sh3add Ry, Ry, Ry sll Ry, Ry, 3 I will add those cases to the test file. |
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
537 ↗ | (On Diff #365026) | Isn’t it possible for the MUL you create here to have 11/13/25/41/73/37/21 as a constant that should use SHXADD? |
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
537 ↗ | (On Diff #365026) | Yes, it is possible. And for mul with 11/13/25/41/73/37/21 which use shxadd, my change still generates better asm. I will add in the tests. |
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
537 ↗ | (On Diff #365026) | Sorry, your concern is right. I can not handle mul with 11/13/25/41/73/37/21. |
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
537 ↗ | (On Diff #365026) | I have skip those constants in the zba extension. And will figure out a better way for them in the future. Thanks for your help. |
Is there any better way to implement this optimization other than ISelDagToDag?
- It seems hard to calculate c1/c0 by writing TD mapping rules.
- I also tried DAG transform in RISCVTargetLowering::ReplaceNodeResults, it is also not easy, since it involves more code about legalization.
It should probably be a DAG combine and the transform that's turning (mul (add X, C1), C2) into (add (mul X, C2), C1 * C2) should ask the target if it is profitable, or at least call isLegalAddImmediate for C1*C2. @spatel or @lebedev.ri, what do you think?
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
18 ↗ | (On Diff #365158) | #include <set>. and it should be after all llvm headers |
534 ↗ | (On Diff #365158) | Use std::array so you can use begin()/end() on the std::set. Though you could sort the array and use std::binary_search and avoid the set completely. |
Thanks for your help! I have added a new target hook function isMulAddWithConstNotProfitable to let the DAGCombiner consult the target before combining (mul (add X, C1), C2).
This solution seems more clear. And it really improve most riscv's assembly code, except two of them, which I have made inline comments.
llvm/test/CodeGen/RISCV/addimm-mulimm.ll | ||
---|---|---|
241 | This should not be a regression, since 8 bytes are saved, and the mulw should cost no more than 3 cycles since 73 is a small integer, so does for 19/25/41/73/11/13... | |
305 | This can be further optimized to (SLLIW (SH1ADD a0, a0, a0), 6). I will make another patch for this optimization. |
llvm/test/CodeGen/RISCV/addimm-mulimm.ll | ||
---|---|---|
305 | The best form should be ; RV64IM-NEXT: addi a0, a0, 1000 It can be done via new rules in RISCVInstrInfoB.td |
llvm/test/CodeGen/RISCV/addimm-mulimm.ll | ||
---|---|---|
305 | I have submitted another patch to optimize that case. |
llvm/test/CodeGen/RISCV/addimm-mulimm.ll | ||
---|---|---|
214 | Why are RV32IM check lines, in a file whose name and path don't say bitmanip, mentioning bitmanip instructions? |
llvm/include/llvm/CodeGen/TargetLowering.h | ||
---|---|---|
2086 | I think my origin default true is right, the default return value should not be false. Since my hook is called as if (AddNode.getNode()->hasOneUse() && TLI.isMulAddWithConstProfitable(AddNode, ConstNode)) return true; So it should return default true for undetermined cases. |
llvm/include/llvm/CodeGen/TargetLowering.h | ||
---|---|---|
2086 | Actually my previous version is isMulAddWithConstNotProfitable, which return default false, and return true for clear regression on specific targets. Craig suggested me to remove the Not, and inverse the condition when calling. The core issue is, the original DAGCombiner will do the folding if the AddNode has only one use, which will harm performance in some situation. And the solution is adding another check (along with the hasOneUse) to let the target prevent the transform if the target does think there is regresssion. But if the target is also not sure, what default value should be ? And the hook name should have Not or should not have a Not ? |
llvm/include/llvm/CodeGen/TargetLowering.h | ||
---|---|---|
2086 | The default should be whatever makes SelectionDAG behave the same as it currently does unless the changes in behaviour turn out to be useful for the majority of targets. TLI functions should be positive not negative; there are no hooks that have Not in them (other than when they refer to a not instruction). I'm not sure what the issue is though? Whether you have a NotProfitable function that defaults to false or a Profitable function that defaults to true makes no semantic difference other than when the caller needs to put a ! in front of it, it's purely a stylistic issue. |
llvm/include/llvm/CodeGen/TargetLowering.h | ||
---|---|---|
2086 | I will keep current form without Not, and still return default true to "make SelectionDAG behave the same as it currently does". And I have improved my comments more clear as /// Return true if it may be profitable to fold /// (mul (add x, c1), c2) -> (add (mul x, c2), c1*c2), and return false /// to prevent the folding for definite regression. /// The target should check the cost of materializing c1, c2 and c1*c2 into /// registers. If it is not sure about some cases, a default true /// can be returned to let the DAGCombiner decide. |
ping ... Can this patch be approved ? It seems there is no objection on adding a hook isMulAddWithConstProfitable
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
16858–16859 | "finds no regression" doesn't make sense to me. A regression is a bug, but if the target says it's profitable then there is no regression; a regression would be if a target said it was not profitable but it in fact was, and so the emitted code got worse. So when a target returns false, it's not that it finds a regression, because nothing has happened yet. This also feeds into a related point. When comments in DAGCombiner talk about regressions, they generally mean "this transformation should make sense on most targets, but many of them don't enable it currently because they have patterns or custom lowering that would need to be adapted to handle it otherwise they won't match important cases any more" (with the "won't match important cases" being the regression), generally marked as TODO (all but one, with the odd-one-out still saying that something should probably be improved). This is not that case, this is just asking the target whether the transformation is profitable, simply as a "does your instruction set benefit from this transformation?". | |
llvm/lib/Target/RISCV/RISCVISelLowering.h | ||
464 | What's the point of duplicating this doxygen comment? It's just going to get outdated, and anyone who wants to know what it does (which is pretty obvious from the name) can just look at TargetLowering.h. If we copied the documentation to every override the tree would be a mess. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
16858–16859 | Thanks. I have made the comments more clear. | |
llvm/lib/Target/RISCV/RISCVISelLowering.h | ||
464 | Thanks. I have removed the redundant comments. |
llvm/include/llvm/CodeGen/TargetLowering.h | ||
---|---|---|
2083 | The comment's still not great, putting grammatical issues aside. A lot of it is just explaining the basics of TLI hooks, but also is overly prescriptive with what backends should do to evaluate it (and I also don't like "to avoid definite worse code generated", often TLI hooks end up being best-effort heuristics, unable to give a definitive answer, because that might require extremely expensive whole-function checks that depend on knowing what other transformations are going to be made). I'd just go with something like (borrowing style from surrounding examples):
|
llvm/include/llvm/CodeGen/TargetLowering.h | ||
---|---|---|
2083 | Thanks, I have updated the comments according to your suggested expression. One more issue, English is not my mother language and your are appreciated to help me fix any "grammatical issues" you mentioned. ^_^ |
llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
---|---|---|
9062 | Nit: IMO, it would be more intuitive to compare the other way around, swapping the operands and changing the condition to >=. |
llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
---|---|---|
9062 | I can swap the operands, but why changing the condition to >=, should using > be better? Since I think this should consider i64 on rv64 and i32 on rv32. |
llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
---|---|---|
9062 | Yes, > is correct, and I agree with Luis that this is the more natural way round to express it. |
llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
---|---|---|
9062 | Sorry, ignore the part about the equals :) I mixed some thoughts when writing that. |
llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
---|---|---|
9062 | Thanks. The order has been changed. |
The comment's still not great, putting grammatical issues aside. A lot of it is just explaining the basics of TLI hooks, but also is overly prescriptive with what backends should do to evaluate it (and I also don't like "to avoid definite worse code generated", often TLI hooks end up being best-effort heuristics, unable to give a definitive answer, because that might require extremely expensive whole-function checks that depend on knowing what other transformations are going to be made).
I'd just go with something like (borrowing style from surrounding examples):