Page MenuHomePhabricator

[PowerPC] Combine address computation to favour selecting DForm instructions
Needs RevisionPublic

Authored by nemanjai on Jul 22 2019, 7:47 AM.

Details

Summary

When selecting a memory operation for which the only available D-Form instructions have alignment requirements for the displacement immediate, we end up having to select X-Form instructions if the immediate isn't a required multiple. However, if the base register is used for multiple memory operations, we don't need to be so pessimistic.

For example,

vector int test(char *Ptr) {
  vector int a = *(vector int *)(Ptr+7);
  vector int b = *(vector int *)(Ptr+23);
  vector int c = *(vector int *)(Ptr+39);
  vector int d = *(vector int *)(Ptr+55);
  return a + b + c + d;
}

We currently materialize the constants 3, 7, 23, 39 in registers and use those as index registers with R3 as the base. However, if we computed the value r3 + 55, we can use DQ-Form instructions with displacements (48, 23, 16, 0).

This patch performs these adjustments and fixes up some of the addressing computation functions so as to avoid redundant code.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai created this revision.Jul 22 2019, 7:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2019, 7:47 AM

Please note that essentially all the code in the SelectAddress* functions is somewhat orthogonal to this. Those changes are required to optimize what this patch does, but if the reviewers prefer, I can pull those out into a separate patch. However, I opted to keep them part of this patch at least initially as motivation for those changes on their own would be hard to understand without the context of this patch.

The idea is great! Some comments.

lib/Target/PowerPC/PPCISelDAGToDAG.cpp
5492

I am not sure the intention that you select the vector instead of map. The map make more sense to me.

5507

Maybe, we need to respect the volatile and atomic flag ?

5679

bool getBaseAndRequiredAlignmentFor(SDNode *MemOp, const SDValue *&Base, unsigned &alignment) makes more sense.

5686

There are some helper function to calculate the base and offset. i.e.
BaseIndexOffset::match(Store, DAG)

5714

The intention of addMemOpToDFormTracker is to collect the information and keep it in tracker. However, we are walking the tracker again later to get the uses of current base and discard the tracker. Why not generate the relationship between the base and offset here directly ? i.e.

maintain a map between: Base -> { reqAlign, {BaseIndexOffset1, ... } } and BaseIndexOffset::match() will help us get the base and offset. Maybe, we need also keep the load/store pointer for each BaseIndexOffset as we will update it later.

Inside the updateMemOp..., walk the map one by one and handle it for each Base as what you did.

5739

uint64_t

5746

If we maintain the base map, we don't need to walk the uses as for each load/store, we will get the alignment and compare with the alignment in the map for each base.

5760

Don't need the loop if it is a map.

test/CodeGen/PowerPC/unal-altivec-wint.ll
19

I didn't see the benefit for this change. Can we avoid it ?

steven.zhang requested changes to this revision.EditedAug 16 2019, 2:25 AM

We need to double check the base offset we select to avoid exceeding the 16-bit as much as possible

lib/Target/PowerPC/PPCISelDAGToDAG.cpp
5655

If the offset exceed the 16-bit, we should try to scale it into 16-bit.

; Function Attrs: norecurse nounwind readonly
define dso_local signext i32 @foo(i8* nocapture readonly %p) local_unnamed_addr #0 {
entry:
  %add.ptr = getelementptr inbounds i8, i8* %p, i64 69340
  %0 = bitcast i8* %add.ptr to i32*
  %1 = load i32, i32* %0, align 4, !tbaa !2
  %add.ptr1 = getelementptr inbounds i8, i8* %p, i64 69344
  %2 = bitcast i8* %add.ptr1 to i32*
  %3 = load i32, i32* %2, align 4, !tbaa !2
  %cmp = icmp eq i32 %1, %3
  %conv = zext i1 %cmp to i32
  ret i32 %conv
}
This revision now requires changes to proceed.Aug 16 2019, 2:25 AM