During the SeparateConstOffsetFromGEP pass, sign extensions are distributed and then later recombined. The recombination stage is somewhat problematic: it does not distinguish add instructions from sub instructions when matching the sext(a) +/- sext(b) -> sext(a +/- b) pattern. The issue is incredibly rare and only occurs under the conditions below. In my testing so far, the test case only miscompiles on PowerPC; however, as far as I can tell the issue is not in PowerPC-specific code, and it appears that the right IR input could theoretically trigger it on other platforms as well.
The IR contains:
%unextendedA
%unextendedB
%subuAuB = %unextendedA - %unextendedB
%extA = extend %unextendedA
%extB = extend %unextendedB
%addeAeB = %extA + %extB
The problematic code will transform that into:
%unextendedA
%unextendedB
%subuAuB = %unextendedA - %unextendedB
%extA = extend %unextendedA
%extB = extend %unextendedB
%addeAeB = extend %subuAuB ; Obviously not semantically equivalent to the IR input.
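As a concrete check of why the two forms differ, here is a minimal C++ sketch of the arithmetic. It assumes a 32-bit to 64-bit sign extension, which is my choice for illustration and not taken from the test case; the function names original and transformed are hypothetical.

```cpp
#include <cstdint>

// Sign-extend a 32-bit value to 64 bits (assumed widths, for illustration only).
static int64_t sext(int32_t v) { return static_cast<int64_t>(v); }

// What the input IR computes: extend(A) + extend(B).
int64_t original(int32_t a, int32_t b) { return sext(a) + sext(b); }

// What the transformed IR computes: extend(A - B).
// The narrow subtraction is done on unsigned values so that wraparound
// does not invoke signed-overflow undefined behaviour in this sketch.
int64_t transformed(int32_t a, int32_t b) {
    uint32_t diff = static_cast<uint32_t>(a) - static_cast<uint32_t>(b);
    return sext(static_cast<int32_t>(diff));
}
```

With A = 5 and B = 3, original yields 8 while transformed yields 2. The two only agree when B is 0.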
The operands must be in the same order as above, and other optimization passes must preserve this pattern until this section of the code is reached, which is unlikely given the possibly unnecessary extensions. My test case was heavily reduced, at both the C level and the IR level, from cblas_ztrsv, but it still contains many extraneous instructions that create the dependencies necessary to preserve the miscompilation.
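To illustrate the kind of mix-up described above, here is a toy C++ model of a lookup that matches a previously seen narrow instruction by its operands alone, next to one that also checks the opcode. This is not the pass's actual code; Op, NarrowInst, findIgnoringOpcode and findWithOpcode are all hypothetical names invented for the sketch.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a binary instruction on unextended values.
enum class Op { Add, Sub };

struct NarrowInst {
    Op op;
    std::string lhs;   // first operand name
    std::string rhs;   // second operand name
    std::string name;  // result name
};

// Lookup that matches on the ordered operand pair only and ignores the
// opcode, so a sub can be returned when an add was wanted.
std::optional<std::string> findIgnoringOpcode(const std::vector<NarrowInst>& seen,
                                              const std::string& a,
                                              const std::string& b) {
    for (const NarrowInst& inst : seen)
        if (inst.lhs == a && inst.rhs == b)
            return inst.name;
    return std::nullopt;
}

// Lookup that additionally requires the opcode to match.
std::optional<std::string> findWithOpcode(const std::vector<NarrowInst>& seen,
                                          Op op,
                                          const std::string& a,
                                          const std::string& b) {
    for (const NarrowInst& inst : seen)
        if (inst.op == op && inst.lhs == a && inst.rhs == b)
            return inst.name;
    return std::nullopt;
}
```

In this model, if the only instruction seen so far is the sub producing subuAuB, a search on behalf of extA + extB finds subuAuB with the first lookup and finds nothing with the second. Swapping the operand order defeats both lookups, which mirrors the operand-order condition noted above.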
I do appreciate your documentation, jingyue.