This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Accept plain register name where an address is expected.
AbandonedPublic

Authored by jonpa on Sep 12 2021, 4:28 AM.

Details

Reviewers
uweigand
Summary

Allow an address to be written as just a single (base) register in assembly.

Example: 'lg %r0, 0(%r1)' and 'lg %r0, %r1' are equivalent.

This patch is not checking the expected kind of address (MemKind) or HasLength / HasVectorIndex - would that be better?

Diff Detail

Unit TestsFailed

Event Timeline

jonpa created this revision.Sep 12 2021, 4:28 AM
jonpa requested review of this revision.Sep 12 2021, 4:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2021, 4:28 AM

This patch is not checking the expected kind of address (MemKind) or HasLength / HasVectorIndex - would that be better?

I believe %reg should be treated equivalently to 0(%reg) for any type of address, including those with length or vector index, so this should be fine. However, it would be good to validate that behavior against GAS, and also add a few test cases for instructions with those address types.

Otherwise the patch looks good to me.

jonpa updated this revision to Diff 373449.EditedSep 19 2021, 4:07 AM

I believe %reg should be treated equivalently to 0(%reg) for any type of address, including those with length or vector index, so this should be fine. However, it would be good to validate that behavior against GAS, and also add a few test cases for instructions with those address types.

For addresses with a Length field, it seems GAS does require the full address with the explicit displacement as well:

        xc      0(32,%r2), 0(%r2)
        xc      0(32,%r2), %r2
#       xc      (32,%r2), %r2    # not accepted by GAS
#       xc      32,%r2, %r2      # not accepted by GAS

The same is also the case with a vector index:

         vsceg   %v0, 0(%v1, %r3), 1
#        vsceg   %v0, (%v1, %r3), 1  # not accepted by GAS
#        vsceg   %v0, %v1, %r3, 1    # not accepted by GAS

So I did not see any changes I could make for the patch, but I nevertheless added an extra test for the second (plain) address of an XC.

An interesting twist is that you have to use the full name (%rX); if you were to use just the register number (1-15), it would be considered a displacement instead of a register ...

Hmmm... to my surprise it now seems that GAS is treating a plain register name as an offset:

lg      %r0, 0(%r1)
lg      %r0, %r1
lg      %r0, 1

-> as ; objdump ->

Disassembly of section .text:

0000000000000000 <.text>:
   0:	e3 00 10 00 00 04 	lg	%r0,0(%r1)
   6:	e3 00 00 01 00 04 	lg	%r0,1
   c:	e3 00 00 01 00 04 	lg	%r0,1

With this patch though as we agreed:

llvm-mc --show-encoding
	.text
	lg	%r0, 0(%r1)                     # encoding: [0xe3,0x00,0x10,0x00,0x00,0x04]
	lg	%r0, 0(%r1)                     # encoding: [0xe3,0x00,0x10,0x00,0x00,0x04]
	lg	%r0, 1                          # encoding: [0xe3,0x00,0x00,0x01,0x00,0x04]

Mayebe this was fixed recently in GAS?

as --version
GNU assembler version 2.32-30.2ibm.fc31
Copyright (C) 2019 Free Software Foundation, Inc.

I believe %reg should be treated equivalently to 0(%reg) for any type of address, including those with length or vector index, so this should be fine. However, it would be good to validate that behavior against GAS, and also add a few test cases for instructions with those address types.

For addresses with a Length field, it seems GAS does require the full address with the explicit displacement as well:

        xc      0(32,%r2), 0(%r2)
        xc      0(32,%r2), %r2
#       xc      (32,%r2), %r2    # not accepted by GAS
#       xc      32,%r2, %r2      # not accepted by GAS

Yes, if you have a length field, you need the full address. I was wondering whether you can omit the length field completely, e.g. have something like

xc %r2, %r2

but it looks like this is isn't accepted anyway as the length is mandatory (makes sense to me as well).

An interesting twist is that you have to use the full name (%rX); if you were to use just the register number (1-15), it would be considered a displacement instead of a register ...

Hmmm... to my surprise it now seems that GAS is treating a plain register name as an offset:

lg      %r0, 0(%r1)
lg      %r0, %r1
lg      %r0, 1

-> as ; objdump ->

Disassembly of section .text:

0000000000000000 <.text>:
   0:	e3 00 10 00 00 04 	lg	%r0,0(%r1)
   6:	e3 00 00 01 00 04 	lg	%r0,1
   c:	e3 00 00 01 00 04 	lg	%r0,1

With this patch though as we agreed:

llvm-mc --show-encoding
	.text
	lg	%r0, 0(%r1)                     # encoding: [0xe3,0x00,0x10,0x00,0x00,0x04]
	lg	%r0, 0(%r1)                     # encoding: [0xe3,0x00,0x10,0x00,0x00,0x04]
	lg	%r0, 1                          # encoding: [0xe3,0x00,0x00,0x01,0x00,0x04]

Huh, this is a problem. We'll need to verify this with the binutils team -- that behavior seems just broken to me. (And if the current behavior of GAS is just broken, then I'm wondering whether the correct fix isn't rather to make this an error in GAS too, and not change LLVM at all ...)

jonpa abandoned this revision.Sep 20 2021, 6:28 AM