This is an archive of the discontinued LLVM Phabricator instance.

[PPC] Shorter sequence to load 64bit constant with same hi/lo words
ClosedPublic

Authored by Carrot on Oct 12 2016, 9:29 AM.

Details

Summary

This is a patch to implement pr30640.

When a 64-bit constant has the same hi/lo words, we can use rldimi to copy the low word into the high word of the same register.
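
As an editorial illustration (not part of the patch), the idea can be sketched by emulating the three instructions in C++. The immediates -12851 and 52685 are the ones quoted in the review comments further down. The `rldimi` operands (shift 32, mask begin 0) and the helper name `loadSameHiLo` are assumptions made for this sketch, and only the non-wrapping mask case of `rldimi` is modeled.

```cpp
#include <cassert>
#include <cstdint>

// lis: place a signed 16-bit immediate in bits 16..31 of the low word;
// the result is sign-extended to 64 bits.
inline std::uint64_t lis(std::int16_t imm) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(imm) * 65536);
}

// ori: OR an unsigned 16-bit immediate into the low halfword.
inline std::uint64_t ori(std::uint64_t rs, std::uint16_t imm) {
    return rs | imm;
}

inline std::uint64_t rotl64(std::uint64_t x, unsigned n) {
    return n == 0 ? x : (x << n) | (x >> (64 - n));
}

// rldimi ra, rs, sh, mb: rotate rs left by sh, then insert the result into ra
// under the mask covering IBM bits mb..63-sh (bit 0 is the most significant).
// Assumption: only the non-wrapping case (mb <= 63 - sh) is handled here.
inline std::uint64_t rldimi(std::uint64_t ra, std::uint64_t rs,
                            unsigned sh, unsigned mb) {
    assert(sh < 64 && mb <= 63 - sh);
    const std::uint64_t mask = (~0ULL >> mb) & (~0ULL << sh);
    return (rotl64(rs, sh) & mask) | (ra & ~mask);
}

// Hypothetical helper: build a constant whose high and low words are equal.
inline std::uint64_t loadSameHiLo(std::int16_t hi16, std::uint16_t lo16) {
    std::uint64_t r = lis(hi16);   // upper halfword of the low word
    r = ori(r, lo16);              // lower halfword of the low word
    return rldimi(r, r, 32, 0);    // copy the low word into the high word
}
```

With these definitions, `loadSameHiLo(-12851, 52685)` yields `0xCDCDCDCDCDCDCDCD`: the sign-extension bits that `lis` leaves in the high word are overwritten by the `rldimi` insert.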

This optimization caused the test case bperm.ll to fail because of a suboptimal heuristic in the function SelectAndParts64. That function chooses either the AND or the ROTATE method to extract bit groups from a register, and then ORs the groups together. This optimization lowers the cost of loading the 64-bit constant mask used by the AND method, which results in a different code sequence. In this test case, however, the ROTATE method is actually better: its final OR operation can be avoided, since rldimi can insert the rotated bits directly into the target register. So this patch also enhances SelectAndParts64 to prefer the ROTATE method when the two methods have the same cost and there are multiple bit groups that need to be ORed together.
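
The tie-break described above can be sketched as follows. This is an editorial paraphrase of the summary, not the code in SelectAndParts64: the names `ExtractMethod` and `chooseMethod` and the cost parameters are hypothetical, and falling back to AND on a tie with a single bit group is an assumption, since the summary does not cover that case.

```cpp
#include <cassert>

enum class ExtractMethod { And, Rotate };

// Hypothetical decision helper: pick the cheaper method, and on a tie prefer
// ROTATE when more than one bit group has to be combined, because rldimi can
// insert the rotated bits directly and the final OR disappears.
inline ExtractMethod chooseMethod(unsigned andCost, unsigned rotateCost,
                                  unsigned numGroups) {
    assert(numGroups > 0);
    if (andCost != rotateCost)
        return andCost < rotateCost ? ExtractMethod::And
                                    : ExtractMethod::Rotate;
    // Equal cost. Assumption: keep AND when there is only one group.
    return numGroups > 1 ? ExtractMethod::Rotate : ExtractMethod::And;
}
```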

Diff Detail

Event Timeline

Carrot updated this revision to Diff 74392.Oct 12 2016, 9:29 AM
Carrot retitled this revision to [PPC] Shorter sequence to load 64bit constant with same hi/lo words.
Carrot updated this object.
Carrot added a reviewer: hfinkel.
Carrot added a subscriber: llvm-commits.
hfinkel accepted this revision.Oct 12 2016, 12:24 PM
hfinkel edited edge metadata.

LGTM

test/CodeGen/PowerPC/pr30640.ll
8

Please check the full materialization sequence.

This revision is now accepted and ready to land.Oct 12 2016, 12:24 PM
Carrot updated this revision to Diff 74452.Oct 12 2016, 3:56 PM
Carrot edited edge metadata.
Carrot marked an inline comment as done.

Updated the test case with full related code sequence.

hfinkel added inline comments.Oct 12 2016, 4:07 PM
test/CodeGen/PowerPC/pr30640.ll
8

Please use regexes here to avoid a dependence on unrelated RA choices:

; CHECK: lis [[REG1:[0-9]+]], -12851
; CHECK: ori [[REG2:[0-9]+]], [[REG1]], 52685

and so on. Only the output register is fixed.
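
As an editorial illustration of why the captured register helps, the two CHECK lines can be approximated with a single C++ `std::regex` that uses a backreference in place of the `[[REG1]]` variable. This is only a rough analogue: FileCheck matches each CHECK line separately, while this sketch requires the two instructions to be separated by whitespace only, and the function names are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if asmText contains a lis/ori pair in which ori reads the
// register that lis wrote, whatever register numbers were allocated.
inline bool matchesLisOri(const std::string &asmText) {
    static const std::regex pattern(
        R"(lis ([0-9]+), -12851\s+ori ([0-9]+), \1, 52685)");
    return std::regex_search(asmText, pattern);
}

// Usage sketch: different register choices still match, as long as the
// data flow from lis to ori is preserved.
inline void demoLisOri() {
    assert(matchesLisOri("\tlis 3, -12851\n\tori 3, 3, 52685\n"));
    assert(matchesLisOri("\tlis 4, -12851\n\tori 5, 4, 52685\n"));
    assert(!matchesLisOri("\tlis 4, -12851\n\tori 5, 6, 52685\n"));
}
```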

Carrot updated this revision to Diff 74578.Oct 13 2016, 1:56 PM
Carrot marked an inline comment as done.

Updated test case with regex.

LGTM

This revision was automatically updated to reflect the committed changes.