Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example:
  ldrh w0, [x2]
  ldrh w1, [x2, #2]
becomes
  ldr w0, [x2]
  ubfx w1, w0, #16, #16
  and w0, w0, #0xffff
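To see why the rewritten sequence yields the same register values, the transformation can be modelled in plain C++. This is only an illustrative sketch, not code from the patch: the names HalfwordPair, loadSeparately, loadMerged and checkEquivalent are hypothetical, and it assumes a little-endian host (on a big-endian target the two halves would be swapped).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Values that end up in w0 and w1 in the example (hypothetical helper type).
struct HalfwordPair {
  uint32_t lo; // halfword at offset 0 (w0)
  uint32_t hi; // halfword at offset 2 (w1)
};

// Original form: two zero-extending 16-bit loads (ldrh, ldrh).
HalfwordPair loadSeparately(const unsigned char *p) {
  uint16_t a, b;
  std::memcpy(&a, p, sizeof a);     // ldrh w0, [x2]
  std::memcpy(&b, p + 2, sizeof b); // ldrh w1, [x2, #2]
  return {a, b};
}

// Merged form: one 32-bit load followed by two bitfield extracts.
// Assumption: little-endian byte order, so offset 0 is the low half.
HalfwordPair loadMerged(const unsigned char *p) {
  uint32_t w;
  std::memcpy(&w, p, sizeof w);      // ldr  w0, [x2]
  uint32_t hi = (w >> 16) & 0xffffu; // ubfx w1, w0, #16, #16
  uint32_t lo = w & 0xffffu;         // and  w0, w0, #0xffff
  return {lo, hi};
}

// Asserts that both forms produce identical values for the 4 bytes at p.
void checkEquivalent(const unsigned char *p) {
  HalfwordPair s = loadSeparately(p);
  HalfwordPair m = loadMerged(p);
  assert(s.lo == m.lo && s.hi == m.hi);
}
```

Note that the upper half must be extracted before the lower half is masked, because the mask overwrites the register that holds the loaded word.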
Differential D13771
[AArch64] Add support for converting halfword loads into a 32-bit word load
Closed, Public. Authored by junbuml on Oct 15 2015, 7:43 AM.
Details
Summary
Convert two halfword loads into a single 32-bit word load with bitfield extract:
  ldrh w0, [x2]
  ldrh w1, [x2, #2]
becomes
  ldr w0, [x2]
  ubfx w1, w0, #16, #16
  and w0, w0, #0xffff
Hi Jun,
  ldrh w1, [x0]
becomes something like:
  ldp w1, w3, [x0]
Thanks Chad for the review.
mcrosier edited edge metadata.
LGTM, with one minor nit.
This revision is now accepted and ready to land. Oct 19 2015, 7:27 AM
I tested it on A57 with spec2000 and spec2006. The transformation was applied pretty widely, but a clear performance gain was observed only in spec2006/h264ref (about 2%), and no performance regressions were found. I can add a flag to turn it on only for a specific architecture. Please let me know if you have any suggestions.
James, Chad
Revision Contents
Diff 37590
lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
test/CodeGen/AArch64/arm64-ldp.ll
Just delete and fall through.