Page MenuHomePhabricator

[DAGCombine] Refactor DAGCombiner::ReduceLoadWidth. NFCI

Authored by bjope on Jan 12 2022, 5:24 AM.



Update code comments in DAGCombiner::ReduceLoadWidth and refactor
the handling of SRL a bit. The refactoring is done with the intent
of adding support for folding away SRA by using SEXTLOAD in a
follow-up patch.

Diff Detail

Event Timeline

bjope created this revision.Jan 12 2022, 5:24 AM
bjope requested review of this revision.Jan 12 2022, 5:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2022, 5:24 AM
bjope added a comment.Jan 12 2022, 5:34 AM

This is an attempt to refactor the code a bit, making the diff in the D116930 child revision smaller.

There is still some code duplication since SRL is handled in two places. I kind of focused on making the introduction of SRA support easier, while still refactoring the second part of SRL handling just to describe that part a bit more. Maybe it too aggressive refactoring?

There are a complicated set of conditions to handle the various patterns/inputs. I don't know if I have accounted for all of the combinations.
I wonder if we'd be better off breaking it up, but if you feel confident that it's correct, let's keep it. :)

See inline for some minor changes.


This sounds like it was originally written with a little-endian-only implementation.
How about generalizing to:

/// If the result of a load is shifted/masked/truncated to an effectively 
/// narrower type, try to transform the load to a narrower type and/or
/// use an extending load.

And fix the capitalization? "DAGCombiner::reduceLoadWidth()"


is masked -> are masked


that it -> that is


needs to be -> need to be


This assumes that ExtVTBits is a power-of-2, but is that enforced/asserted?


This seems backwards.
It's the little-endian target that needs to adjust the pointer. We're chopping off the LSB, so this is always converting ShAmt back to zero for big-endian?

bjope added inline comments.Jan 13 2022, 10:52 AM

I've not really understood why these checks exist. I mean, if there is no SRL at all we won't even take this path. So the code below (I figure mainly isLegalNarrowLdSt) need to ensure the legality anyway for the more general situation. So maybe this is some kind of early out (possibly saving a tiny amount of compile time). Or it is protecting the AND masking below somehow (but I can't really see that the AND masking depend on these properties).

Nevertheless, just like you have spotted, these checks aren't even making sense when ExtVTBits isn't a power of two.

As it happens we do get here for multiple lit tests with ExtEVTBits not being a power-of-2.

If I simply remove the checks, then I get diffs in several lit tests. I did examined one such test, and it turned out that we ended up with some slightly different order of transforms (resulting in (and (load i32), 7) instead of (and (sexload i8 to i32), 7), not really sure whichever would be better in that particular case). If the sextload is preferred, then I guess there should be another DAGCombine added that is doing such a transform(?).

If I instead bail out here if ExtVTBits isn't a power-of-2, then I get diffs in 3 lit tests. Those diffs looked like regressions as the instruction count increased in all three cases.

Skipping the tests seems like the better solution of the two above, but I'd rather fix that in a separate patch. And there might actually be some alternatives to explore. The AND mask hack below might detect that ExtVT can be reduced further into something that is a power-of-2, so bailing out on ExtVTBits not being a power-of-2 before the AND mask check was perhaps too restricting.

spatel added inline comments.Jan 13 2022, 12:07 PM

Then this is even more confusing than I thought!
It's fine if you want to leave any more changes to other patches. You've probably stepped through this more than anyone else by now. :)

Adding Sam as a potential reviewer - code history shows that the last significant changes were:

spatel added inline comments.Jan 13 2022, 12:15 PM

To be clear, I think the code is correct. I just meant that the comment seems inverted for endian.

bjope updated this revision to Diff 400318.Jan 15 2022, 2:13 PM

Updates based on review feedback.

bjope marked 4 inline comments as done.Jan 15 2022, 2:17 PM
bjope added inline comments.

I relaxed the description a bit. We could end up adjusting the pointer both for big/little endian here. Such as only loading a single byte from the middle of an i64.

bjope added inline comments.

I made a follow-up patch related to this here D117406.

spatel accepted this revision.Jan 16 2022, 8:10 AM


This revision is now accepted and ready to land.Jan 16 2022, 8:10 AM
This revision was landed with ongoing or failed builds.Jan 16 2022, 11:25 AM
This revision was automatically updated to reflect the committed changes.