This is an archive of the discontinued LLVM Phabricator instance.

[x86/asm] Make variants work when converting at&t inline asm input to intel asm output
ClosedPublic

Authored by thakis on Nov 15 2021, 7:10 AM.

Details

Summary

asm always has AT&T-style input (asm inteldialect has Intel-style asm
input), so EmitGCCInlineAsmStr() always has to pick the same variant since it
cares about the input asm string, not the output asm string.

For PowerPC, that default variant is 1. For other targets, it's 0.

Without this, the included test case errors out with

error: unknown use of instruction mnemonic without a size suffix
         mov rax, rbx

since it picks the intel branch and then tries to interpret it as AT&T
when selecting intel-style output with -x86-asm-syntax=intel.

Diff Detail

Event Timeline

thakis created this revision.Nov 15 2021, 7:10 AM
thakis requested review of this revision.Nov 15 2021, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 15 2021, 7:10 AM

Looks reasonable to me, but every time I touch anything related to inline asm, something always breaks, so I'm also a bit worried.

The IBM folks touched this line not long ago (https://reviews.llvm.org/D98276), so let's bring them in on this.

https://reviews.llvm.org/D98276 just moved things around, so I think it's mostly independent of this change.

But it looks like llvm/test/CodeGen/PowerPC/asm-dialect.ll is failing with this if the ppc target is enabled.

Related changes there are https://reviews.llvm.org/rGe21237e59a8e172e4b038c4998411538853d6bc5 and https://reviews.llvm.org/rG266db7fe043f8f29bc76a7242761d35b3e70cd82.

It looks like the only backends that have tests for variants checked in are ppc and x86. It looks like ppc always wants variant 1, while x86 always wants variant 0 for asm and always variant 1 for asm inteldialect.

It does look like MAI->getAssemblerDialect() is generally more used as the dialect for writing. So in general, I think reading it here is incorrect, since variant selection is more a property of the contents of the asm string.

So I think the thing to do here is probably to do something like int AsmPrinterVariant = MMI->getTarget().UnqualifiedInlineAsmVariant(); and add LLVMTargetMachine::UnqualifiedInlineAsmVariant() that returns 0 but make it return 1 in PPCTargetMachine.

(https://reviews.llvm.org/rG1e9097e36a7acdcd75a2f8f229275afd33542db3 is also somewhat related.)

thakis updated this revision to Diff 387286.Nov 15 2021, 8:45 AM
thakis edited the summary of this revision. (Show Details)

new approach to unbreak powerpc

Thanks @hans for the heads-up, I do indeed think this patch would break inline asm on z/OS.

This seems to be the same underlying problem I had raised earlier this year in this patch (which never went anywhere): https://reviews.llvm.org/D95444 . To quote:

LLVM has two notions of "assembler dialect" that are conceptually distinct, but are currently somewhat intermixed.

The first notion corresponds to platform-specific syntactical differences in assembler text, e.g. the distinction between AT&T and Intel-style assembler for x86 (and similar differences on other architectures). The second notion corresponds to C/C++ source level differences in handling the inputs/outputs of *inline* assembler statements, specifically the difference between GNU inline asm and MS inline asm.

These are conceptually different, but the current inline assembler handling always enforces AT&T style assembler with GNU inline asm and Intel-style assembler with MS inline asm. This is done even on non-Intel platforms, which may not even have AT&T vs. Intel assembler -- that's not an issue for MS inline asm, as this itself only exists on Intel, but it is a problem if a platform wants to use different assembler styles for *GNU* inline asm. (Specifically, we want to do this on SystemZ to distinguish between GAS and HLASM style assembler text.)

Maybe some of the ideas in the above patch can be re-activated to fix this particular Intel problem without breaking z/OS?

Thanks @hans for the heads-up, I do indeed think this patch would break inline asm on z/OS.

Do you have any test coverage of z/OS? All tests pass with this as far as I can tell.

uweigand added a subscriber: Kai.Nov 15 2021, 9:17 AM

Thanks @hans for the heads-up, I do indeed think this patch would break inline asm on z/OS.

Do you have any test coverage of z/OS? All tests pass with this as far as I can tell.

The z/OS test coverage is still a bit weak -- @Kai, do we have inline assembler tests in the tree already?

Alternatively, you can tell me which command to run manually to test.

(But in general, if there isn't test coverage, expect others to break you.)

Kai added a comment.Nov 15 2021, 10:42 AM

The z/OS test coverage is still a bit weak -- @Kai, do we have inline assembler tests in the tree already?

No, there are currently no inline assembler tests in tree. We are still working on it.

Sounds like this can land as-is then and you'll figure it out, yes?

I think gcc's documentation says that inline assembly parsing variant follows -masm?

If the integrated assembler is disabled, I would think we need to follow -masm to match the dialect of the non inline assembly?

I think gcc's documentation says that inline assembly parsing variant follows -masm?

Right, that's what I'm trying to make happen (in D113707), and this is one of the prereqs to making the llvm bits work for that. With that patch, TUs compiled with -masm=intel will emit asm inteldialect instrs for inline asm.

craig.topper added a comment.EditedNov 15 2021, 11:56 AM

I think gcc's documentation says that inline assembly parsing variant follows -masm?

Right, that's what I'm trying to make happen (in D113707), and this is one of the prereqs to making the llvm bits work for that. With that patch, TUs compiled with -masm=intel will emit asm inteldialect instrs for inline asm.

I thought asm inteldialect is used to select EmitMSInlineAsmStr in AsmPrinterInlineAsm.cpp. Is that also changing? Are the only differences between those functions related to the assembly syntax and not the originating C code?

I thought asm inteldialect is used to select EmitMSInlineAsmStr in AsmPrinterInlineAsm.cpp. Is that also changing? Are the only differences between those functions related to the assembly syntax and not the originating C code?

That's true, and that's not changing, at least that's my current plan. D113932 is the last change in LLVM land I want to make in preparation for the clang change.

The idea is that att style __asm__ becomes asm and -masm=intel __asm__ becomes asm inteldialect.

I'm not super duper familiar with the asm code (also not 100% unfamiliar though), so maybe there's some better way to handle this? But this approach seems to work well in my tests, seems to pass tests, and is fairly non-invasive.

It's important to keep _output_ and _input_ asm dialects mentally separated. At least that's confusing to me :)

asm vs asm inteldialect controls the dialect in inline asm inputs. Either of those can output at&t or intel asm at the end (controlled by the -x86-asm-syntax= llvm option).

thakis retitled this revision from [x86] Make assembler variant selection work when outputting intel-style asm to [x86] Make assembler variant selection work when converting at&t inline asm inputs to intel asm output.Nov 15 2021, 12:54 PM
thakis retitled this revision from [x86] Make assembler variant selection work when converting at&t inline asm inputs to intel asm output to [x86/asm] Make variants work when converting at&t inline asm input to intel asm output.

I thought asm inteldialect is used to select EmitMSInlineAsmStr in AsmPrinterInlineAsm.cpp. Is that also changing? Are the only differences between those functions related to the assembly syntax and not the originating C code?

That's true, and that's not changing, at least that's my current plan. D113932 is the last change in LLVM land I want to make in preparation for the clang change.

I've now uploaded an updated version of D113707, the clang change. Maybe looking at that makes it clearer how the three changes are related :)

hans accepted this revision.Nov 17 2021, 7:08 AM

lgtm

This revision is now accepted and ready to land.Nov 17 2021, 7:08 AM