This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate
ClosedPublic

Authored by inouehrs on Jun 28 2017, 8:35 AM.

Details

Summary

On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris (xori + xoris).
But LLVM currently generates three or four instructions for this purpose, and it also clobbers one GPR.

This patch makes the backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate.

e.g. (x | 0xFFFFFFFF) should be

	ori 3, 3, 65535
	oris 3, 3, 65535

but now LLVM generates

	li 4, 0
	oris 4, 4, 65535
	ori 4, 4, 65535
	or 3, 3, 4
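
The two-instruction form works because ori and xori operate on the low 16 bits of the register, while oris and xoris operate on bits 16..31, so the two halves of a 32-bit immediate can be applied independently. The following C model is only an illustrative sketch of that reasoning, not code from the patch; the helper names (ori, oris, xori, xoris, or_imm32, xor_imm32, check_or_xor_imm32) are hypothetical and simply mirror the instruction semantics on a 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Model of ori: OR the register with a zero-extended 16-bit immediate. */
static uint64_t ori(uint64_t rs, uint16_t ui) { return rs | (uint64_t)ui; }

/* Model of oris: OR the register with the 16-bit immediate shifted left by 16. */
static uint64_t oris(uint64_t rs, uint16_t ui) { return rs | ((uint64_t)ui << 16); }

/* Model of xori: XOR the register with a zero-extended 16-bit immediate. */
static uint64_t xori(uint64_t rs, uint16_t ui) { return rs ^ (uint64_t)ui; }

/* Model of xoris: XOR the register with the 16-bit immediate shifted left by 16. */
static uint64_t xoris(uint64_t rs, uint16_t ui) { return rs ^ ((uint64_t)ui << 16); }

/* Hypothetical helper: x | imm as an ori + oris pair, with no scratch register. */
uint64_t or_imm32(uint64_t x, uint32_t imm) {
  uint16_t lo = (uint16_t)(imm & 0xFFFFu); /* low halfword, used by ori   */
  uint16_t hi = (uint16_t)(imm >> 16);     /* high halfword, used by oris */
  return oris(ori(x, lo), hi);
}

/* Hypothetical helper: x ^ imm as an xori + xoris pair. */
uint64_t xor_imm32(uint64_t x, uint32_t imm) {
  uint16_t lo = (uint16_t)(imm & 0xFFFFu);
  uint16_t hi = (uint16_t)(imm >> 16);
  return xoris(xori(x, lo), hi);
}

/* Checks the split form against the plain C operators for a few immediates. */
void check_or_xor_imm32(void) {
  const uint64_t x = 0x123456789ABCDEF0ULL;
  const uint32_t imms[] = {0xFFFFFFFFu, 0x00FFFFFFu, 0x0000FFFFu, 0xFFFF0000u};
  for (unsigned i = 0; i < sizeof imms / sizeof imms[0]; ++i) {
    assert(or_imm32(x, imms[i]) == (x | (uint64_t)imms[i]));
    assert(xor_imm32(x, imms[i]) == (x ^ (uint64_t)imms[i]));
  }
}
```

Calling check_or_xor_imm32() from a small main() exercises the model. When one halfword of the immediate is zero, the corresponding instruction is a no-op in this model, so a single instruction would be enough for such constants.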

Event Timeline

inouehrs created this revision. Jun 28 2017, 8:35 AM
nemanjai added inline comments. Jul 25 2017, 2:42 AM
lib/Target/PowerPC/PPCISelDAGToDAG.cpp
3356

We don't want to do this with i32 nodes? Or is this already handled correctly due to code elsewhere?

inouehrs updated this revision to Diff 108578. Jul 27 2017, 9:30 PM

added test cases for i32

inouehrs added inline comments. Jul 27 2017, 9:34 PM
lib/Target/PowerPC/PPCISelDAGToDAG.cpp
3356

This code optimizes both i64 and i32 on ppc64.

unsigned long f64b(unsigned long x) {
  return x | 0xFFFFFFuLL;
}

unsigned f32b(unsigned x) {
  return x | 0xFFFFFF;
}

I added test cases for i32 (ori_test_e and xori_test_e).

kbarton accepted this revision. Aug 22 2017, 1:43 PM

LGTM

This revision is now accepted and ready to land. Aug 22 2017, 1:43 PM
This revision was automatically updated to reflect the committed changes.