This is an archive of the discontinued LLVM Phabricator instance.

[VectorCombine] loosen alignment constraint for load transform
ClosedPublic

Authored by spatel on Dec 16 2020, 7:25 AM.

Details

Summary

As discussed in D93229, we only need a minimal alignment constraint when querying whether a hypothetical vector load is safe. We still pass/use the potentially stronger alignment attribute when checking costs and creating the new load.

There's already a test that changes with the minimum code change, so splitting this off as a preliminary proposal independent of any gep/offset enhancements.

Diff Detail

Event Timeline

spatel created this revision.Dec 16 2020, 7:25 AM
spatel requested review of this revision.Dec 16 2020, 7:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 16 2020, 7:25 AM
lebedev.ri accepted this revision.Dec 16 2020, 7:53 AM

LGTM, thanksThi, this makes sense to me.
float variant takes too long to proof, but i8 works fine:

----------------------------------------
define <4 x i8> @src(* dereferenceable(4) align(1) %p) {
%0:
  %s = load i8, * dereferenceable(4) align(1) %p, align 4
  %r = insertelement <4 x i8> undef, i8 %s, i32 0
  ret <4 x i8> %r
}
=>
define <4 x i8> @tgt(* dereferenceable(4) align(1) %p) {
%0:
  %1 = bitcast * dereferenceable(4) align(1) %p to *
  %2 = load <4 x i8>, * %1, align 4
  %r = shufflevector <4 x i8> %2, <4 x i8> undef, 0, 4294967295, 4294967295, 4294967295
  ret <4 x i8> %r
}
Transformation seems to be correct!
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
141–142

This probably warrants a comment why we pass this alignment (that we only care about dereferenceability)

llvm/test/Transforms/VectorCombine/X86/load.ll
407

They are not really in conflict, just the first one being overly pessimistic.
Since we know what the alignment must be on all execution paths,
the alignment specified on the argument could be enhanced by knowledge backpropagation:
https://godbolt.org/z/dvqEEb

This revision is now accepted and ready to land.Dec 16 2020, 7:53 AM
spatel added inline comments.Dec 16 2020, 8:50 AM
llvm/test/Transforms/VectorCombine/X86/load.ll
407

Ah, thanks for the explanation. I was thinking of a gep case where the alignment could be over-specified, but that will be UB based on:
http://llvm.org/docs/LangRef.html#load-instruction
"Overestimating the alignment results in undefined behavior."

lebedev.ri added inline comments.Dec 16 2020, 9:22 AM
llvm/test/Transforms/VectorCombine/X86/load.ll
409

align 1 here doesn't actually say that the alignment is/will be exactly 1, it specifies the lower bound,
it is perfectly fine for the actual pointer to be 1024-byte aligned instead.