This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Fix bug in custom shuffle combine
ClosedPublic

Authored by tlively on May 18 2020, 2:37 PM.

Details

Summary

The code previously assumed the source of the bitcast in the combined
pattern was a vector type, but this is not always true. This patch
adds a check to avoid an assertion failure in that case.

Diff Detail

Event Timeline

tlively created this revision.May 18 2020, 2:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 18 2020, 2:37 PM

LGTM.

A separate question: In wasm_simd128.h, why are the type for all return values v128_t? If we don't [[ https://github.com/llvm/llvm-project/blob/56079e1de1129837aa7569d8b3bb5e50afc0f1ea/clang/lib/Headers/wasm_simd128.h#L315 | cast return types to v128_t ]] and return __f32x4 as is, are bitcasts still gonna be generated?

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1716–1717

Nit: double spaces between , and undef

1717–1718

Nit: t0 -> T0 for consistency

aheejin accepted this revision.May 19 2020, 3:59 AM
This revision is now accepted and ready to land.May 19 2020, 3:59 AM
tlively updated this revision to Diff 264999.May 19 2020, 12:53 PM
tlively marked 2 inline comments as done.
  • Address comments
This revision was automatically updated to reflect the committed changes.

LGTM.

A separate question: In wasm_simd128.h, why are the type for all return values v128_t? If we don't [[ https://github.com/llvm/llvm-project/blob/56079e1de1129837aa7569d8b3bb5e50afc0f1ea/clang/lib/Headers/wasm_simd128.h#L315 | cast return types to v128_t ]] and return __f32x4 as is, are bitcasts still gonna be generated?

Ping

LGTM.

A separate question: In wasm_simd128.h, why are the type for all return values v128_t? If we don't [[ https://github.com/llvm/llvm-project/blob/56079e1de1129837aa7569d8b3bb5e50afc0f1ea/clang/lib/Headers/wasm_simd128.h#L315 | cast return types to v128_t ]] and return __f32x4 as is, are bitcasts still gonna be generated?

Ping

Sorry for the very delayed response. No, without those casts there are no bitcasts generated. For some reason I don't understand, though, the bit casts are generated in a different order when the input and output type have the same number of elements. This is why the problem only showed up for wasm_f32x4_splat. If I changed the definition of v128_t to be the same as __i64x2, I got the same issue with wasm_f64x2_splat instead.

Sorry not sure if I understand... what my question was, why do we need to casts return values to v128_t at all? (I'm not very familiar with the header file) So for example, for f32x4_splat, can't we do just

static __inline__ __f32x4 __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (__f32x4){__a, __a, __a, __a};
}

instead of

static __inline__ v128_t __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (v128_t)(__f32x4){__a, __a, __a, __a};
}

? The same for all other intrinsics. Is there a reason that all intrinsic's return type should be v128_t?

Sorry not sure if I understand... what my question was, why do we need to casts return values to v128_t at all? (I'm not very familiar with the header file) So for example, for f32x4_splat, can't we do just

static __inline__ __f32x4 __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (__f32x4){__a, __a, __a, __a};
}

instead of

static __inline__ v128_t __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (v128_t)(__f32x4){__a, __a, __a, __a};
}

? The same for all other intrinsics. Is there a reason that all intrinsic's return type should be v128_t?

Oh gotcha. This is just a design decision. The header originally had different user-facing types for the different lane interpretations, but after some feedback from some early users we decided to just expose a single v128_t type for users to worry about. Internally we still need to use all the different types just to get the compiler to emit the correct code.