The purpose of this patch is to make LSR generate better code for SystemZ in the cases of memory intrinsics and comparison of immediate with memory. In particular, these instructions cannot use an index register and accept only a small immediate offset. Improvements on benchmarks have been confirmed.
In order to achieve this, the following common code changes were made:
- New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls whether LSR does instruction-based addressing evaluations by passing the Instruction pointers to isLegalAddressingMode() and isFoldableMemAccessOffset().
- isLegalAddressingMode() gets a new optional Instruction* parameter (defaults to nullptr), which LSR passes if the target returns true in LSRWithInstrQueries(). All target methods have been updated as well.
- In LSR / isAddressUse(): handle address operands of memset, memmove and memcpy as address uses.
- In LSR / RateFormula(): Don't add to ImmCost if the instructions are already checked. It only adds confusion when the results are otherwise equal. Call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads / stores.
- In LSR / isAMCompletelyFolded(): Let target look at instructions if it returns true in LSRWithInstrQueries().
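
The interaction between the new hook and the optional Instruction* parameter can be sketched roughly as follows. This is a minimal standalone illustration, not the code from the patch: `Opcode`, `Instruction`, `AddrMode`, `TargetInfoBase`, `ToyTarget` and `isLegalForUser()` are hypothetical stand-ins for the real LLVM types, and the concrete displacement ranges (signed 20-bit by default, unsigned 12-bit for the restricted instructions) are assumptions chosen only to make the example concrete.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the LLVM classes; only the query pattern matters.
enum class Opcode { Load, Store, MemCpy, CompareImmWithMem };
struct Instruction { Opcode Op; };
struct AddrMode { std::int64_t BaseOffs; bool HasIndexReg; };

struct TargetInfoBase {
  virtual ~TargetInfoBase() = default;

  // Defaults to false: LSR keeps doing instruction-agnostic queries.
  virtual bool LSRWithInstrQueries() const { return false; }

  // The Instruction pointer is optional; nullptr means "no specific user".
  virtual bool isLegalAddressingMode(const AddrMode &AM,
                                     const Instruction *I = nullptr) const {
    (void)I; // the generic rule ignores the user instruction
    // Assumed generic rule: signed 20-bit displacement, index allowed.
    return AM.BaseOffs >= -(1 << 19) && AM.BaseOffs < (1 << 19);
  }
};

// A toy target that opts in and restricts some instructions.
struct ToyTarget : TargetInfoBase {
  bool LSRWithInstrQueries() const override { return true; }

  bool isLegalAddressingMode(const AddrMode &AM,
                             const Instruction *I = nullptr) const override {
    if (I && (I->Op == Opcode::MemCpy || I->Op == Opcode::CompareImmWithMem))
      // Assumed restriction: no index register, unsigned 12-bit offset.
      return !AM.HasIndexReg && AM.BaseOffs >= 0 && AM.BaseOffs < 4096;
    return TargetInfoBase::isLegalAddressingMode(AM, I);
  }
};

// What the LSR side does: pass the user instruction only if the target asks.
bool isLegalForUser(const TargetInfoBase &TTI, const AddrMode &AM,
                    const Instruction *User) {
  const Instruction *Query = TTI.LSRWithInstrQueries() ? User : nullptr;
  return TTI.isLegalAddressingMode(AM, Query);
}
```

With this shape, targets that do not override LSRWithInstrQueries() see no change in behavior, since they are always queried with a null instruction.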
SystemZ:
- isLSRCostLess() overridden to check instruction counts as X86 does.
- isLegalAddressingMode() and isFoldableMemAccessOffset() improved to handle memcpy and comparison of immediate with memory.
- LSRWithInstrQueries() returns true.
- Minor updates of tests dag-combine-01.ll and loop-01.ll
- Two new tests in loop-01.ll
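
As a rough illustration of an isLSRCostLess() that checks instruction counts first, the comparison can be written as a lexicographic ordering. This is a sketch under assumptions, not the patch itself: the `LSRCost` struct below is a hypothetical stand-in modeled on the TTI cost structure, and the order of the fields after `Insns` is assumed for the example.

```cpp
#include <tuple>

// Hypothetical stand-in for the LSR cost record used by the TTI hook.
struct LSRCost {
  unsigned Insns;
  unsigned NumRegs;
  unsigned AddRecCost;
  unsigned NumIVMuls;
  unsigned NumBaseAdds;
  unsigned ImmCost;
  unsigned SetupCost;
};

// Lexicographic comparison: the instruction count decides first, and the
// remaining fields only break ties (their order here is an assumption).
bool isLSRCostLess(const LSRCost &C1, const LSRCost &C2) {
  return std::tie(C1.Insns, C1.NumRegs, C1.AddRecCost, C1.NumIVMuls,
                  C1.NumBaseAdds, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.Insns, C2.NumRegs, C2.AddRecCost, C2.NumIVMuls,
                  C2.NumBaseAdds, C2.ImmCost, C2.SetupCost);
}
```

With this ordering, a solution that needs fewer instructions is preferred even if it uses more registers; register count only matters when the instruction counts are equal.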
Typo: displacement