This is an archive of the discontinued LLVM Phabricator instance.

[MergeICmps] Enable the MergeICmps Pass by default.
ClosedPublic

Authored by courbet on Feb 1 2018, 6:53 AM.

Diff Detail

Event Timeline

courbet created this revision.Feb 1 2018, 6:53 AM
spatel added a comment.Feb 1 2018, 1:22 PM

Can you create at least one llc test each for x86 and PPC, so we see the result of merging + expansion together? This could just be copying existing tests from test/Transforms/MergeICmps?

courbet updated this revision to Diff 132574.Feb 2 2018, 5:32 AM

Add llc test for X86 and PPC

Can you create at least one llc test each for x86 and PPC, so we see the result of merging + expansion together? This could just be copying existing tests from test/Transforms/MergeICmps?

Done. Can you double check the PPC test ? It does make sense to me, but I'm not an expert.

Can you create at least one llc test each for x86 and PPC, so we see the result of merging + expansion together? This could just be copying existing tests from test/Transforms/MergeICmps?

Done. Can you double check the PPC test ? It does make sense to me, but I'm not an expert.

Thanks. This test looks fine to me as PPC given the 'align 4' on the loads. I recommend checking these tests in now, so we just see the improvements from enabling MergeICmps when this patch is committed.

For reference, the current PPC codegen is something like this:

# %bb.0:                                # %entry
	lwz 5, 0(3)
	lwz 6, 0(4)
	cmplw	5, 6
	bne	0, .LBB0_2
# %bb.1:                                # %land.rhs.i
	lwz 3, 4(3)
	lwz 4, 4(4)
	cmpw	3, 4
	b .LBB0_3
.LBB0_2:
	crxor 2, 2, 2
.LBB0_3:                                # %opeq1.exit
	li 3, 0
	li 4, 1
	isel 3, 4, 3, 2
	blr

My concern is that I don't see consideration for misalignment when combining to larger ops. We generally don't care on x86, but it could be a problem for some PPC? This may be a non-issue if some other pass is expected to split misaligned memops later.

courbet updated this revision to Diff 132596.Feb 2 2018, 7:59 AM

Rebase the tests to show codegen diffs.

spatel added a comment.Feb 2 2018, 8:28 AM

Rebase the tests to show codegen diffs.

Thanks!

Thinking about this a bit more, since MergeICmps and MemCmpExpansion use the same TLI hook, this shouldn't be a problem unless there's already a bug in PPC's enabling of that hook. Given that MemCmpExpansion was originally written with a PPC target in mind, there's not much risk of that.

So I think this patch is fine now, but let's give someone who's closer to current-day PPC metal a chance to reply.

nemanjai accepted this revision.Feb 5 2018, 12:09 PM

The PPC code looks way nicer. The alignment is fine as well - unaligned scalar loads are cheap on modern PPC HW and the loads are in bounds. Of course, I have no comment on the X86 code as I don't know enough about the target.
LGTM.

This revision is now accepted and ready to land.Feb 5 2018, 12:09 PM
spatel accepted this revision.Feb 5 2018, 12:30 PM

The x86 side LGTM.

Just wanted to also mention that this passes a double bootstrap build/lit/lnt on PPC.

Thanks a lot for the review !

This revision was automatically updated to reflect the committed changes.

FWIW, this broke compilation of Qt for x86 (at least on windows, and the reduced example also triggers for other x86 targets), see PR36557 for details. Do you have time to look at it sometime soon, or is it appropriate to revert this until that is sorted out?