Hi all,
This is an RFC to teach TypePromotion to promote PHI-nodes that have a single use by a ZExt or SExt, where the incoming values are either constants or loads whose result the target can extend for free.
Here is a motivating example:
#include <stdint.h>
int foo(uint8_t* p) {
  uint16_t index = *p;
  do {
    index = p[index];
  } while (index < 100);
  return index;
}
For AArch64, this results in the following IR:
define dso_local i32 @_Z3fooPh(i8* nocapture readonly %p) local_unnamed_addr #0 {
entry:
  %0 = load i8, i8* %p, align 1, !tbaa !8
  br label %do.body
do.body: ; preds = %do.body, %entry
  %index.0.in = phi i8 [ %0, %entry ], [ %1, %do.body ]
  %idxprom = zext i8 %index.0.in to i64
  %arrayidx = getelementptr inbounds i8, i8* %p, i64 %idxprom
  %1 = load i8, i8* %arrayidx, align 1, !tbaa !8
  %cmp = icmp ult i8 %1, 100
  br i1 %cmp, label %do.body, label %do.end, !llvm.loop !11
do.end: ; preds = %do.body
  %conv2 = zext i8 %1 to i32
  ret i32 %conv2
}
This in turn leads to an unnecessary 'and' at the start of the loop. The goal here is to turn that PHI-node into an i64 and push the zext up to the loads that feed it.
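To make the intended shape a bit more concrete, here is a rough source-level analogue in C. It is only a sketch: foo_promoted is a hypothetical name that I made up for illustration, and it models the idea (a wide loop-carried value fed directly by zero-extending loads) rather than showing the actual output of the pass. Both functions should compute the same result for any input on which the loop terminates.

```c
#include <stdint.h>

/* Original form: the loop-carried value only ever holds an 8-bit load
   result, and it is widened at its single use as an array index. */
int foo(uint8_t *p) {
  uint16_t index = *p;
  do {
    index = p[index];
  } while (index < 100);
  return index;
}

/* Hypothetical analogue of the promoted form (illustration only):
   the loop-carried value is already 64 bits wide, so each load is
   zero-extended once and no extra extension is needed inside the loop. */
int foo_promoted(uint8_t *p) {
  uint64_t index = *p;   /* extending load before the loop */
  do {
    index = p[index];    /* extending load feeds the wide value directly */
  } while (index < 100);
  return (int)index;
}
```

The comparison still only depends on the 8-bit value that was loaded, so widening the loop-carried value does not change which iteration exits the loop.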
Let me know what you think of this approach, and whether there are more appropriate ways to get better codegen for cases like this.