This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] [mc] Corrected several VI opcodes to avoid printing _e64
ClosedPublic

Authored by dp on May 12 2017, 4:22 AM.

Details

Summary

There are VI instructions which can only be encoded using VOP3 format (VOP2/VOP3 in CI). However disassembler appends "_e64" suffix which is misleading for VI ISA.

Full list of opcodes affected by this issue:

v_bcnt_u32_b32
v_bfm_b32
v_cvt_pk_i16_i32
v_cvt_pk_u16_u32
v_cvt_pkaccum_u8_f32
v_cvt_pknorm_i16_f32
v_cvt_pknorm_u16_f32
v_cvt_pkrtz_f16_f32
v_ldexp_f32
v_mbcnt_hi_u32_b32
v_mbcnt_lo_u32_b32

See Bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936

Diff Detail

Repository
rL LLVM

Event Timeline

dp created this revision.May 12 2017, 4:22 AM
dp retitled this revision from Corrected several VI opcodes to avoid printing _e64 to [AMDGPU] [mc] Corrected several VI opcodes to avoid printing _e64.May 12 2017, 4:23 AM
dp edited the summary of this revision. (Show Details)May 12 2017, 4:44 AM
artem.tamazov edited edge metadata.May 12 2017, 4:51 AM

Generally, looks good.
A subtle detail: I am thinking the we shall ensure compatibility between ISA generations as much as possible.
Let's consider the following use case:

  1. We have valid CI ISA with VOP3 instructions.
  2. Disassembly for CI yields text with _e64 suffixes.
  3. The text should be assembled for VI without need to remove _e64 suffix.

Similar use case: disassembled CI ISA with VOP2 (with _e32) should not be assembled for VI.

Please check.

dp added a comment.EditedMay 12 2017, 5:08 AM

Generally, looks good.
A subtle detail: I am thinking the we shall ensure compatibility between ISA generations as much as possible.
Let's consider the following use case:

  1. We have valid CI ISA with VOP3 instructions.
  2. Disassembly for CI yields text with _e64 suffixes.
  3. The text should be assembled for VI without need to remove _e64 suffix.

Similar use case: disassembled CI ISA with VOP2 (with _e32) should not be assembled for VI.

Please check.

This is how current implementation works.
'_e32' and '_e64' suffices are allowed for opcodes which have only one encoding, provided that there is no conflict between specified and actual encoding size.

Examples:

v_bfm_b32 v5, s1, v2          // ok on SI,VI
v_bfm_b32_e32 v5, s1, v2  // ok on SI, error on VI
v_bfm_b32_e64 v5, s1, v2  // ok on SI,VI
artem.tamazov added inline comments.May 12 2017, 5:23 AM
test/CodeGen/AMDGPU/constant-fold-mi-operands.ll
28 ↗(On Diff #98746)

WRT all changes in CodeGen tests:
Is there is a way to indicate "_e64 or nothing" in tests? I think that would minimize changes and copypasting.

dp added inline comments.May 12 2017, 7:39 AM
test/CodeGen/AMDGPU/constant-fold-mi-operands.ll
28 ↗(On Diff #98746)

Looks like FileCheck supports only a subset of regular expressions.

Empty expressions are not allowed:

v_mbcnt_lo_u32{{_e64|}}           // syntax error: empty subexpression

It is possible to specify a blank or a metasymbol like \b (word boundary) but it does not work as expected:

v_mbcnt_lo_u32{{_e64| }}           // does not match with v_mbcnt_lo_u32 for some reason
v_mbcnt_lo_u32{{_e64|.}}           // does not work either

The following pattern works but it looks ugly:

v_mbcnt_lo{{_u32_e64|_u32}}           // works

Any ideas?

artem.tamazov added inline comments.May 12 2017, 8:14 AM
test/CodeGen/AMDGPU/constant-fold-mi-operands.ll
28 ↗(On Diff #98746)

{{...}} can contain any regexp, so could you try this:

v_mbcnt_lo_u32{{(?:_e64)?}}
dp updated this revision to Diff 98795.May 12 2017, 10:21 AM

Updated as suggested by Artem to minimize changes in CodeGen tests.
Unfortunately, {{(...)?}} does not work in many cases so I had to use {{(...)*}}

artem.tamazov accepted this revision.May 15 2017, 5:12 AM

Thanks, good!

This revision is now accepted and ready to land.May 15 2017, 5:12 AM
This revision was automatically updated to reflect the committed changes.