This is an archive of the discontinued LLVM Phabricator instance.

Support Intel "l" suffixes for x86_64 R8-R15 registers.
AbandonedPublic

Authored by mtrent on Dec 4 2019, 9:26 PM.

Details

Summary

Intel's 64-bit architecture documentation allows the low byte of registers
r8-r15 to be named using either a "b" suffix ("r8b") or an "l" suffix ("r8l").
This commit adds "l" suffix alternate strings to the r8b - r15b registers,
using TableGen's Register "AltName" mechanism.
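
As a rough illustration of what an alternate-name table means for register lookup, here is a minimal Python sketch. It is not the TableGen change in this revision and not LLVM's parser; the names ALT_NAMES, CANONICAL and canonical_register are hypothetical and exist only for this example.

```python
# Hypothetical alias table (illustration only, not LLVM code): each canonical
# AMD-style low-byte name r8b..r15b gets one Intel-style alternate, r8l..r15l.
ALT_NAMES = {f"r{n}l": f"r{n}b" for n in range(8, 16)}
CANONICAL = set(ALT_NAMES.values())


def canonical_register(name):
    """Return the canonical low-byte register name, or None if unknown."""
    key = name.lower()
    if key in CANONICAL:
        return key
    # Alternate spellings resolve to the same canonical register.
    return ALT_NAMES.get(key)


if __name__ == "__main__":
    for spelling in ("r8l", "R15B", "r16l"):
        print(spelling, "->", canonical_register(spelling))
```

Both spellings resolve to a single register, so in this sketch the alias adds an accepted input string and nothing else.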

Event Timeline

mtrent created this revision.Dec 4 2019, 9:26 PM
Herald added a project: Restricted Project.Dec 4 2019, 9:26 PM
Herald added a subscriber: hiraditya.
mtrent added a comment.Dec 5 2019, 7:40 AM

Test case?

Suggestion?

Do you have examples of other tools that accept this? I checked the GNU assembler and it didn't accept r8l

mtrent added a comment.EditedDec 5 2019, 2:00 PM

Do you have examples of other tools that accept this? I checked the GNU assembler and it didn't accept r8l

I don't. I know Apple's (old) GNU-based assembler does not accept r8l. I do not know whether any Intel-provided tools accept r8l, but those are the most likely candidates. I'm going from some (old) user reports stating it should work, as well as documentation found online, such as:

https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
https://software.intel.com/en-us/articles/introduction-to-x64-assembly
https://stackoverflow.com/questions/1753602/what-are-the-names-of-the-new-x86-64-processors-registers
https://stackoverflow.com/questions/43991779/why-does-apple-use-r8l-for-the-byte-registers-instead-of-r8b

The first Intel URL documents r8l exclusively. The second Intel URL seems to favor r8b while acknowledging r8l. The Stack Overflow links seem to explain why the world prefers using the AMD register names.

I don't have overly strong feelings about this. If r8l and friends are strictly just alternate strings of r8b this seems like a reasonable request for compatibility with code written using r8l. Again, based on some developer feedback I have, there do exist people who expect r8l to work, for whatever reason. If there were a convenient way to force people to opt into this alternate syntax I could go for that, although I don't know of an existing case that handles this, and I don't think this is worth creating some new flag or classification. If someone with "sufficient authority" were to say this Intel syntax is no longer valid, or if LLVM will not support it, I'm also OK with dropping this request and returning my bug reports as "Not To Be Fixed".

I also found this where NASM indicated they wouldn't support it https://sourceforge.net/p/nasm/bugs/324/

I'm not sure what to do here. I'd like to see at least some other widely used tool supporting this. I worry we'll end up in a situation years from now where other tools try to match clang for what seems to have started as a quirk in Intel's documentation nearly 15 years ago.

mtrent updated this revision to Diff 232458.Dec 5 2019, 2:51 PM

Add tests for these alternate registers.

mtrent added a comment.Dec 5 2019, 3:09 PM

I'm not sure what to do here. I'd like to see at least some other widely used tool supporting this. I worry we'll end up in a situation years from now where other tools try to match clang for what seems to have started as a quirk in Intel's documentation nearly 15 years ago.

I suppose another way to say it is, we need someone to weigh LLVM's cost of "Allowing code that uses Intel-style register names to exist" against LLVM's cost of "Encouraging Intel-style register names to exist." And this is strictly in the context of x86_64, and not, say, other assembly languages.

In my opinion the cost of code maintenance within LLVM is quite low. TableGen supports alternate strings, and the impact on the parser is negligible. Also, the register names will be canonicalized to the AMD-style names if run through a disassembler pass; folks who write "r8l" will have to read "r8b" in otool, lldb, and other tools. That suggests LLVM isn't bending over backwards to accommodate or encourage these names.
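
The round trip described above can be sketched in Python. This is a toy model, not LLVM's assembler or disassembler: encode_mov_imm8 and decode_mov_imm8 are hypothetical helpers that handle only "mov <low byte of r8-r15>, imm8" (REX.B prefix 0x41, opcode 0xB0 plus the low three bits of the register number, then the immediate), and the printed syntax is an assumption made for the example.

```python
import re

# Accept either the "b" or the "l" suffix for the low byte of r8-r15.
_LOW_BYTE = re.compile(r"r(8|9|1[0-5])[bl]")


def encode_mov_imm8(reg, imm):
    """Encode `mov reg, imm8` where reg is the low byte of r8-r15."""
    match = _LOW_BYTE.fullmatch(reg.lower())
    if match is None:
        raise ValueError(f"not a low-byte r8-r15 register: {reg!r}")
    if not 0 <= imm <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    number = int(match.group(1))
    # The suffix never reaches the encoding: only the register number does.
    return bytes([0x41, 0xB0 + (number - 8), imm])


def decode_mov_imm8(code):
    """Decode the three-byte form above, always printing the "b" suffix."""
    rex, opcode, imm = code
    if rex != 0x41 or not 0xB0 <= opcode <= 0xB7:
        raise ValueError("unsupported encoding in this sketch")
    return f"mov r{8 + opcode - 0xB0}b, 0x{imm:02x}"


if __name__ == "__main__":
    for spelling in ("r8l", "r8b"):
        code = encode_mov_imm8(spelling, 0xFF)
        print(spelling, code.hex(), decode_mov_imm8(code))
```

Because the suffix is dropped before encoding, both spellings produce identical bytes, and the decoder in this sketch can only ever print the "b" form.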

I'm not sure how to settle the cost of "future tools, years from now" against LLVM's karmic account.

Apparently fasm (the "flat assembler", x64, Linux), as accessible via "tio.run", will accept the "l" suffix as an alternate form of the r*b registers. Here's a dorky existence proof:

format ELF executable 3
use64

_start:

mov r8l, 0xff
mov r9l, 0xff
mov r10l, 0xff
mov r11l, 0xff
mov r12l, 0xff
mov r13l, 0xff
mov r14l, 0xff
mov r15l, 0xff
mov r8b, r8l
mov r9b, r9l
mov r10b, r10l
mov r11b, r11l
mov r12b, r12l
mov r13b, r13l
mov r14b, r14l
mov r15b, r15l

mov eax, 4
mov ebx, 1
mov ecx, msg
mov edx, 13
int 0x80
mov eax, 1

mov ebx, 0
int 0x80

msg db "Hello, World!"

Program output:
Hello, World!

Console:
flat assembler version 1.73.16 (16384 kilobytes memory, x64)
2 passes, 179 bytes.

Real time: 0.008 s
User time: 0.004 s
Sys. time: 0.004 s
CPU share: 100.87 %
Exit code: 0

So there is an example.

I contacted our documentation people yesterday to point out this difference between Intel and AMD documentation. They have agreed to fix this in the next release of the SDM.

mtrent added a comment.Dec 9 2019, 9:47 AM

Do we know what form that fix will take? And does that affect this PR?

craig.topper added subscribers: jyknight, rnk.

Looks like the flat assembler supports it, but doesn't document it as supported? https://flatassembler.net/docs.php?article=manual#2.1.19

I believe the Intel SDM is going to change all references to R8L to be R8B.

Adding @rnk and @jyknight as they had expressed an opinion about this in a brief chat on Discord.

Yes, I had expressed a dislike of adding these aliases, as there's no pressing need to do so.

X86_64 has been around for 20 years now -- and in all that time, none of the widely-used assemblers have supported these aliases. Adding new aliases now just adds to confusion and non-portability, which doesn't really help anyone.

Given that the only thing actually using these register names appears to be documentation which is going to be adjusted, that's even more reason not to do it.

rnk added a comment.Dec 16 2019, 12:57 PM

+1, let's not do it.

Very good, I will note this is "not to be fixed" and return the request to support.

mtrent abandoned this revision.Dec 16 2019, 1:09 PM

@mtrent A new Intel SDM was released today that changes the names to R8B..R15B