This is an archive of the discontinued LLVM Phabricator instance.

InstCombine: fix extraction when performing vector/array punning
ClosedPublic

Authored by evgeny777 on Feb 3 2017, 6:46 AM.

Details

Summary

Current implementation of visitShuffleVectorInst emits incorrect IR, when shufflevector instruction is followed by multiple bitcast instructions. Consider the following example:

define void @test(<16 x i8> %w, i32* %o1, float* %o2) {
  %v = shufflevector <16 x i8> %w, <16 x i8> undef, <4 x i32> <i32 12, i32 13, i32 14, i32 15>
  %f = bitcast <4 x i8> %v to float
  %i = bitcast <4 x i8> %v to i32
  store i32 %i, i32* %o1, align 4
  store float %f, float* %o2, align 4
  ret void
}

If you try to run opt -instcombine over it, you'll get the following IR:

define void @test(<16 x i8> %w, i32* %o1, float* %o2) {
  %v.bc = bitcast <16 x i8> %w to <4 x i32>
  %v.extract = extractelement <4 x i32> %v.bc, i32 3
  %v.extract1 = shufflevector <16 x i8> %w, <16 x i8> undef, <16 x i32> <i32 3, i32 4, i32 5, i32 6, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
  %v.bc2 = bitcast <16 x i8> %v.extract1 to <4 x float>
  %v.extract3 = extractelement <4 x float> %v.bc2, i32 0
  store i32 %v.extract, i32* %o1, align 4
  store float %v.extract3, float* %o2, align 4
  ret void
}

The weird shuffle %v.extract1 is emitted, because BegIdx is modified inside for-loop and becomes 3 instead of 12 (as it should be)

If you apply this patch you'll get following:

define void @test(<16 x i8> %w, i32* %o1, float* %o2) {
  %v.bc = bitcast <16 x i8> %w to <4 x i32>
  %v.extract = extractelement <4 x i32> %v.bc, i32 3
  %v.bc1 = bitcast <16 x i8> %w to <4 x float>
  %v.extract2 = extractelement <4 x float> %v.bc1, i32 3
  store i32 %v.extract, i32* %o1, align 4
  store float %v.extract2, float* %o2, align 4
  ret void
}

The broken optimization causes incorrect calculation of DCT matrix with arm-neon extensions enabled and clang -O3

Diff Detail

Repository
rL LLVM

Event Timeline

evgeny777 created this revision.Feb 3 2017, 6:46 AM

Any comments on it?

mkuper accepted this revision.Feb 15 2017, 11:00 AM

LGTM, thanks!

This revision is now accepted and ready to land.Feb 15 2017, 11:00 AM
This revision was automatically updated to reflect the committed changes.