This test case (which I hope is free of UB) has two stores of 0 at offsets 20 and 24 into a chunk of memory:
store i32 0, i32* %helper.20.32
store i32 0, i32* %helper.24.32, align 8
and a 64-bit load, aligned to 4 bytes:
%load.helper.20.64 = load i64, i64* %helper.20.64, align 4
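As a quick sanity check on the layout, here is a small model (not LLVM code) of the byte ranges involved, using the offsets from the IR above:

```python
# Byte ranges within %helper, taken from the IR above.
store_20 = set(range(20, 24))   # store i32 0 at +20
store_24 = set(range(24, 28))   # store i32 0 at +24 (that address is 8-aligned)
load_64  = set(range(20, 28))   # i64 load at +20, align 4

# The 64-bit load covers both stores exactly, so it must observe both
# of them and cannot be reordered past either store.
assert store_20 | store_24 == load_64
print("load overlaps store at +24:", bool(load_64 & store_24))
```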
This is on AArch32, so during type legalisation the i64 load is split into two 32-bit loads. The second of them:
t35: i32,ch = load<(load 4 from %ir.helper.20.64 + 4)> t21, t37, undef:i32
gets marked as being align 8 (note: the base+offset address is align 8, not the base). This load half is then deemed not to alias with the store to %helper.24.32, because the alignment just set is treated as the base alignment, not the base+offset alignment.
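To make the failure mode concrete, here is a paraphrased model (not the exact LLVM source) of the alignment-based disjointness shortcut in the DAG alias check: two accesses smaller than a shared alignment can only overlap if their offset windows within an aligned block overlap. Feeding it the wrongly-promoted align 8 on the split load half reproduces the bad "no alias" answer:

```python
def disjoint_by_alignment(align0, off0, n0, align1, off1, n1):
    # Paraphrase of DAGCombiner's alignment-based alias shortcut:
    # if both accesses share an alignment larger than their size,
    # compare their offset windows modulo that alignment.
    if align0 == align1 and n0 == n1 and align0 > n0:
        a0, a1 = off0 % align0, off1 % align1
        return a0 + n0 <= a1 or a1 + n1 <= a0
    return False

# The split half really sits at %helper + 24, the same 4 bytes as the
# store, but its memoperand is "base %helper.20.64 + 4" with the base
# wrongly recorded as align 8. Window [4,8) vs the store's [0,4):
print(disjoint_by_alignment(8, 4, 4, 8, 0, 4))  # True -> wrongly "no alias"
```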
The test case seems to need a <4 x i32>, which on ARM is converted to a VLD1_UPD; I believe this pushes certain optimisations back until after legalisation. Originally it needed -combiner-global-alias-analysis, but this version shows the same error without it.
Here I've set the updated alignment only if the alignment also holds true for the base.
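A minimal sketch of that rule (hypothetical helper name, not the patch itself): the stronger alignment discovered for base+offset is only recorded when the base itself is at least that aligned, so a 4-aligned base is never promoted to align 8.

```python
def updated_memop_align(base_align, discovered_align):
    # Hypothetical sketch of the fix: keep the newly discovered
    # alignment only when it also holds for the base pointer;
    # otherwise fall back to the known base alignment.
    return discovered_align if base_align >= discovered_align else base_align

# Base %helper.20.64 is only align 4, so the align 8 seen at +4 is dropped:
print(updated_memop_align(4, 8))  # 4
```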
Should this just use getAlignment() rather than looking at the two underlying alignments?