Update code comments in DAGCombiner::ReduceLoadWidth and refactor
the handling of SRL a bit. The refactoring is done with the intent
of adding support for folding away SRA by using SEXTLOAD in a
follow-up patch.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
This is an attempt to refactor the code a bit, making the diff in the D116930 child revision smaller.
There is still some code duplication since SRL is handled in two places. I kind of focused on making the introduction of SRA support easier, while still refactoring the second part of SRL handling just to describe that part a bit more. Maybe it too aggressive refactoring?
There are a complicated set of conditions to handle the various patterns/inputs. I don't know if I have accounted for all of the combinations.
I wonder if we'd be better off breaking it up, but if you feel confident that it's correct, let's keep it. :)
See inline for some minor changes.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
12097–12098 | This sounds like it was originally written with a little-endian-only implementation. /// If the result of a load is shifted/masked/truncated to an effectively /// narrower type, try to transform the load to a narrower type and/or /// use an extending load. And fix the capitalization? "DAGCombiner::reduceLoadWidth()" | |
12117 | is masked -> are masked | |
12149 | that it -> that is | |
12178 | needs to be -> need to be | |
12219 | This assumes that ExtVTBits is a power-of-2, but is that enforced/asserted? | |
12267–12268 | This seems backwards. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
12219 | I've not really understood why these checks exist. I mean, if there is no SRL at all we won't even take this path. So the code below (I figure mainly isLegalNarrowLdSt) need to ensure the legality anyway for the more general situation. So maybe this is some kind of early out (possibly saving a tiny amount of compile time). Or it is protecting the AND masking below somehow (but I can't really see that the AND masking depend on these properties). Nevertheless, just like you have spotted, these checks aren't even making sense when ExtVTBits isn't a power of two. As it happens we do get here for multiple lit tests with ExtEVTBits not being a power-of-2. If I simply remove the checks, then I get diffs in several lit tests. I did examined one such test, and it turned out that we ended up with some slightly different order of transforms (resulting in (and (load i32), 7) instead of (and (sexload i8 to i32), 7), not really sure whichever would be better in that particular case). If the sextload is preferred, then I guess there should be another DAGCombine added that is doing such a transform(?). If I instead bail out here if ExtVTBits isn't a power-of-2, then I get diffs in 3 lit tests. Those diffs looked like regressions as the instruction count increased in all three cases. Skipping the tests seems like the better solution of the two above, but I'd rather fix that in a separate patch. And there might actually be some alternatives to explore. The AND mask hack below might detect that ExtVT can be reduced further into something that is a power-of-2, so bailing out on ExtVTBits not being a power-of-2 before the AND mask check was perhaps too restricting. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
12219 | Then this is even more confusing than I thought! |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
12267–12268 | To be clear, I think the code is correct. I just meant that the comment seems inverted for endian. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
12267–12268 | I relaxed the description a bit. We could end up adjusting the pointer both for big/little endian here. Such as only loading a single byte from the middle of an i64. |
clang-format not found in user’s local PATH; not linting file.