Page MenuHomePhabricator

[CostModel] fix cost calc bug for sadd/ssub with overflow
ClosedPublic

Authored by spatel on Nov 3 2020, 6:16 AM.

Details

Summary

As noted in D90554, there's an opcode typo in using an easily misused cost model API: getCmpSelInstrCost(). Beyond that, the assumed sequence of ops is questionable, but that would be another patch.

My guess is that the x86 test diffs show that we are probably wrong both before and after this change, so there will be no practical difference.
As an example, I tried this test which shows a cost of '7' either way:

define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) {
  %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb)
  %ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1
  %r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0
  %z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r
  ret <4 x i32> %z
}

$ llc -o - sadd.ll -mattr=avx
	vpaddd	%xmm1, %xmm0, %xmm2
	vpcmpgtd	%xmm2, %xmm0, %xmm0
	vpxor	%xmm0, %xmm1, %xmm0
	vblendvps	%xmm0, LCPI0_0(%rip), %xmm2, %xmm0

Diff Detail

Event Timeline

spatel created this revision.Nov 3 2020, 6:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2020, 6:16 AM
Herald added a subscriber: mcrosier. · View Herald Transcript
spatel requested review of this revision.Nov 3 2020, 6:16 AM

Thanks for doing this. But now I'm surprised to not see any Arm changes?

spatel added a comment.Nov 3 2020, 6:48 AM

Thanks for doing this. But now I'm surprised to not see any Arm changes?

I double-checked, and nothing there is wiggling. So I partly stepped through the same v4i32 example using -mtriple=thumbv8m.main -mattr=mve, and there's apparently no difference in the underlying throughput costs of the (mis-specified type) cmp and sel. Bugs on top of bugs?

samparker accepted this revision.Nov 3 2020, 6:51 AM

Okay, fair enough! I was getting confused that this is still throughput only.

This revision is now accepted and ready to land.Nov 3 2020, 6:51 AM
This revision was automatically updated to reflect the committed changes.