- User Since
- May 22 2014, 1:24 PM (237 w, 6 d)
Unrelated diffs got uploaded?
Reverted at rL349056 because:
Sounds like everyone is in agreement about the overall direction. There are just a few inline comments/questions to answer, and then we should be good to go.
Added check and test for non-power-of-2 type.
- Add hasOneUse() checks and test, so we're not converting to funnel shift if there's potential for perf regression.
- I went ahead and committed the PhaseOrdering test since it shows a bug independent of this patch (rL348980).
Tue, Dec 11
Unsetting 'Approved' until we have a fix for the bug.
Reopening - this patch was reverted at rL348844 because it broke an ARM test.
Add tests for vsetcc simplification edge cases. These are currently folded early in generic DAGCombiner via SimplifySetCC().
Mon, Dec 10
Fix code comment to better explain what we're doing.
I don't have a good understanding of the potential edge cases with aggregates, so adding some other potential reviewers.
LGTM - but I think there's a bug in cheapToScalarize() that I didn't notice before. Please put a FIXME note in there.
Determine op size based on the opcode parameter (no need to change the function signature).
Sun, Dec 9
Sat, Dec 8
LGTM - the code change itself is just removing a hack.
Fri, Dec 7
Thu, Dec 6
(Adding more potential reviewers for MI changes)
Wed, Dec 5
I just looked over the codegen changes so far, but I want to add some more knowledgeable x86 hackers to have a look too. There are 2 concerns:
- Are there any known uarch problems with overlapping loads?
- Are there any known uarch problems with unaligned accesses (either scalar or SSE)?
Tue, Dec 4
Ping * 2.
LGTM. We can handle any of the TODO items as follow-ups.
Mon, Dec 3
I just see a few inline nits. Adding Fabian in case he has any feedback.
I think this is the right change for IR (I'm trying to do something similar with D50992 but that's been held up while we try to get the backend in shape). .
Sun, Dec 2
I've been looking at this:
...and I'm still not exactly sure how we should fix it, but I'm confident that what this patch is proposing is not the right answer. We don't want to create a dummy instruction only to analyze and delete it.
My best guess is that we can handle this case using a small enhancement to ValueTracking plus a minimal change to the callers in InstCombine and/or SimplifyCFG. These are hacks vs. a proper solution that uses a dominator tree and a more significant fix to CVP, but it's what we already do in both InstCombine and SimplifyCFG, so there's really no additional cost for optimizing simple cases like this.