[WebAssembly] Support for a ternary atomic RMW instruction
Needs Review | Public

Authored by aheejin on Wed, Jul 11, 9:16 AM.

Details

Reviewers
dschuff
Summary

This adds support for a ternary atomic RMW instruction: cmpxchg.

aheejin created this revision. Wed, Jul 11, 9:16 AM
aheejin planned changes to this revision. Wed, Jul 11, 2:28 PM

Turns out this is not currently able to make use of truncating/extending instructions when the 'success flag' of the LLVM IR cmpxchg instruction is used.

aheejin updated this revision to Diff 155126. Thu, Jul 12, 1:02 AM

Variable name change

I think this CL can be reviewed as is. I will add an optimization for the success flag in a separate CL, because this one would otherwise get too long. The problem I was talking about is this: say we have these two test cases:

define i64 @cmpxchg_i8_i64_loaded_value(i8* %p, i64 %exp, i64 %new) {
  %exp_t = trunc i64 %exp to i8
  %new_t = trunc i64 %new to i8
  %pair = cmpxchg i8* %p, i8 %exp_t, i8 %new_t seq_cst seq_cst
  %old = extractvalue { i8, i1 } %pair, 0
  %e = zext i8 %old to i64
  ret i64 %e
}

define i1 @cmpxchg_i8_i64_success(i8* %p, i64 %exp, i64 %new) {
  %exp_t = trunc i64 %exp to i8
  %new_t = trunc i64 %new to i8
  %pair = cmpxchg i8* %p, i8 %exp_t, i8 %new_t seq_cst seq_cst
  %succ = extractvalue { i8, i1 } %pair, 1
  ret i1 %succ
}

In LLVM IR (not wasm), unlike the atomicrmw instruction, the cmpxchg instruction returns a pair of { loaded value, success flag }. The additional 'success flag' indicates whether the loaded value and the expected value match. With this CL, the first function compiles to

cmpxchg_i8_i64_loaded_value:
  .param    i32, i64, i64
  .result   i64
  i64.atomic.rmw8_u.cmpxchg  $push0=, 0($0), $1, $2
  return    $pop0
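
For readers less familiar with the IR-level semantics, the { loaded value, success flag } pair can be modelled in C++ roughly as follows. This is only a sketch: `cmpxchg_u8` is a hypothetical helper name, not something in this patch, and it relies on `std::atomic::compare_exchange_strong` leaving the loaded value in its `expected` argument.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not part of the patch): mirrors the result shape of
//   %pair = cmpxchg i8* %p, i8 %exp, i8 %new seq_cst seq_cst
// i.e. { loaded value, success flag }.
inline std::pair<std::uint8_t, bool> cmpxchg_u8(std::atomic<std::uint8_t>& mem,
                                                std::uint8_t expected,
                                                std::uint8_t desired) {
  // On failure, compare_exchange_strong overwrites `expected` with the value
  // it loaded. On success, `expected` is untouched and already equals the
  // loaded value. Either way, `expected` ends up holding the loaded value.
  bool success = mem.compare_exchange_strong(expected, desired,
                                             std::memory_order_seq_cst,
                                             std::memory_order_seq_cst);
  return {expected, success};
}
```

The first test function only consumes the first element of this pair, which is why a single i64.atomic.rmw8_u.cmpxchg is enough there.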

But for the second function (which is a little contrived, because the success flag is usually not returned from a function but used in a loop condition), this fails to make use of the i64.atomic.rmw8_u.cmpxchg instruction. The result is something like

cmpxchg_i8_i64_success:
  .param    i32, i64, i64
  .result   i32
  i32.wrap/i64  $push6=, $1
  tee_local  $push5=, $3=, $pop6
  i32.wrap/i64  $push0=, $2
  i32.atomic.rmw8_u.cmpxchg  $push1=, 0($0), $pop5, $pop0
  i32.const  $push2=, 255
  i32.and   $push3=, $3, $pop2
  i32.eq    $push4=, $pop1, $pop3
  return    $pop4

which is suboptimal. (This only happens when a truncation/extension is involved.)
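
To make it clearer what that sequence computes, here is a rough C++ model of it. This is a sketch under assumptions: `success_flag_model` is a made-up name, and it assumes the 8-bit cmpxchg compares only the low 8 bits of the expected operand and returns the loaded byte zero-extended.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of the i32 sequence above; comments name the wasm
// instruction each step corresponds to.
inline bool success_flag_model(std::atomic<std::uint8_t>& mem,
                               std::uint64_t exp, std::uint64_t desired) {
  std::uint32_t exp32 = static_cast<std::uint32_t>(exp);      // i32.wrap/i64 $1
  std::uint32_t new32 = static_cast<std::uint32_t>(desired);  // i32.wrap/i64 $2
  // i32.atomic.rmw8_u.cmpxchg: operate on the low byte, zero-extend the result.
  std::uint8_t exp8 = static_cast<std::uint8_t>(exp32);
  mem.compare_exchange_strong(exp8, static_cast<std::uint8_t>(new32));
  std::uint32_t loaded = exp8;  // always holds the loaded byte after the call
  // i32.const 255; i32.and; i32.eq: rebuild the success flag by comparing the
  // loaded byte against the masked expected value.
  return loaded == (exp32 & 255u);
}
```

The wrap, mask, and compare are the extra work compared to the single-instruction result of the first function.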

I think we need another set of patterns to optimize this. And in case we want to use both the loaded value and the success flag, which I guess is the most common case, we need another set of patterns for that as well. I'll add that in another CL separately.
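
As an illustration of the "both values used" case, a typical source-level shape is a compare-and-swap retry loop. The sketch below is only an example of that shape, not a test from this CL, and `atomic_double_u8` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical example: atomically doubles a byte and returns the old value.
// The success flag drives the loop condition, and the loaded value (written
// back into `old` on failure) feeds the next attempt.
inline std::uint8_t atomic_double_u8(std::atomic<std::uint8_t>& mem) {
  std::uint8_t old = mem.load();
  while (!mem.compare_exchange_weak(old, static_cast<std::uint8_t>(old * 2))) {
    // `old` now holds the freshly loaded value; retry with it.
  }
  return old;
}
```

A loop like this is where both elements of the cmpxchg result are live at once.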

aheejin updated this revision to Diff 155726. Mon, Jul 16, 11:38 AM
  • Add a TODO