In byval promotion, the 'load' instructions generated in the callers
use the common alignment of the promoted byval argument's alignment
and the corresponding offset:
  define internal void @callee_load_first_element(%struct.ss* byval(%struct.ss) align 16 %b) {
    %temp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0
    ...
  }
will be optimized into:
  define internal void @callee_load_first_element(i32 %b.0.val) {
    %temp2 = add i32 %b.0.val, 1
    ret void
  }

  define i32 @caller_load_first_element() {
    %S = alloca %struct.ss, align 16
    ...
    %S.0 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0
    %S.0.val = load i32, i32* %S.0, align 16
    %S.1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1
    %S.1.val = load i64, i64* %S.1, align 4
    call void @callee_load_first_element(i32 %S.0.val, i64 %S.1.val)
    ...
  }
(So '%S.0.val' has 'align 16', the same alignment as the original
'%struct.ss' argument.)
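The byval rule above can be sketched as a small standalone helper. This is only an illustration: 'common_alignment' is a hypothetical name and the code is not taken from the patch. It assumes the "common alignment" is the largest power of two that divides both the argument's alignment and the byte offset of the promoted part.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): the largest power of two that
// divides both the base alignment and the byte offset. An offset of 0
// leaves the base alignment unchanged.
inline std::uint64_t common_alignment(std::uint64_t base_align,
                                      std::uint64_t offset) {
  // Alignments are expected to be non-zero powers of two.
  assert(base_align != 0 && (base_align & (base_align - 1)) == 0);
  std::uint64_t bits = base_align | offset;
  // Isolate the lowest set bit of (base_align | offset).
  return bits & (~bits + 1);
}
```

With the numbers from the example, an argument aligned to 16 gives 'align 16' at offset 0 and 'align 4' at offset 4, which matches the two loads in the caller.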
The "usual" (non-byval) promotion doesn't follow this rule. The
corresponding 'load' instruction generated in the caller gets the
maximum alignment of all the loads of that part of the argument in the
callee, and any alignment on the argument itself is ignored:
  define internal void @callee_load_from_aligned(%struct.ss* align 16 %b) {
    %temp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0
    %temp1 = load i32, i32* %temp, align 4
    %temp2 = add i32 %temp1, 1
    %temp3 = load i32, i32* %temp, align 8
    ret void
  }
will be optimized into:
  define internal void @callee_load_from_aligned(i32 %b.0.val) {
    %temp2 = add i32 %b.0.val, 1
    ret void
  }

  define i32 @caller_load_from_aligned() {
    %S = alloca %struct.ss, align 16
    ...
    %1 = getelementptr %struct.ss, %struct.ss* %S, i64 0, i32 0
    %S.val = load i32, i32* %1, align 8
    call void @callee_load_from_aligned(i32 %S.val)
    ...
  }
(So '%S.val' has 'align 8', while the pointer, argument '%b' in the
callee, is aligned to 16.)
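For contrast, the non-byval rule described above can be modelled as follows. Again this is a sketch under assumptions: 'max_load_alignment' is a hypothetical name, and the input is simply the list of 'align' values on the callee's loads of one promoted part.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model (not from the patch): the caller-side load takes the
// largest alignment among the callee's loads of that part. The argument's
// own 'align' attribute is not consulted at all.
inline std::uint64_t max_load_alignment(
    const std::vector<std::uint64_t>& callee_load_aligns) {
  assert(!callee_load_aligns.empty());
  return *std::max_element(callee_load_aligns.begin(),
                           callee_load_aligns.end());
}
```

For the callee above, the loads carry 'align 4' and 'align 8', so the model yields 8 even though '%b' is declared 'align 16'.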
The intent of the patch is to align the behavior of both promotion
schemes: byval and non-byval. However, if the callee contains a load
whose 'align' value is larger than the argument's, the non-byval
promotion will use that larger alignment, while the byval one won't.
Isn't this incorrect for non-zero offsets?
Overall, I'm not sure I really understand what you're trying to achieve here. What we're interested in here is whether loads are speculatable, which is the case when they are dereferenceable and aligned. This can be either because the load is guaranteed to execute, or because we have known dereferenceability/alignment information. Byval deref/align will be taken into account for the latter check (allCallersPassValidPointerForArgument).
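To make the speculation condition concrete, here is a toy model of it. It is not LLVM's implementation: 'PointerFacts' and 'can_hoist_load' are hypothetical names, and the model assumes the only known facts are a dereferenceable byte count and an alignment for the pointer argument. It also shows why a non-zero offset matters: the alignment known at 'pointer + offset' can be smaller than the pointer's own alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of what is known about a pointer argument at
// every call site (not an LLVM type).
struct PointerFacts {
  std::uint64_t dereferenceable_bytes;  // bytes known to be dereferenceable
  std::uint64_t align;                  // known alignment (power of two)
};

// A load of `size` bytes at `offset` that needs `required_align` can be
// moved into the caller if it is guaranteed to execute in the callee, or
// if the known facts prove it is dereferenceable and sufficiently aligned.
inline bool can_hoist_load(const PointerFacts& facts, std::uint64_t offset,
                           std::uint64_t size, std::uint64_t required_align,
                           bool guaranteed_to_execute) {
  if (guaranteed_to_execute) return true;
  assert(facts.align != 0 && (facts.align & (facts.align - 1)) == 0);
  bool in_bounds = offset + size <= facts.dereferenceable_bytes;
  // Alignment actually known at (pointer + offset): lowest set bit of
  // (align | offset).
  std::uint64_t bits = facts.align | offset;
  std::uint64_t align_at_offset = bits & (~bits + 1);
  return in_bounds && align_at_offset >= required_align;
}
```

In this model, a pointer with 16 dereferenceable bytes and 'align 16' supports a speculative 'align 16' load at offset 0, but at offset 4 only an alignment of 4 is provable, so a load there that claims 'align 16' is rejected unless it is guaranteed to execute.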