This is an archive of the discontinued LLVM Phabricator instance.

[clang] Fix handling of physical registers in inline assembly operands.
ClosedPublic

Authored by jonpa on Sep 8 2020, 4:56 AM.

Details

Summary

Improve EmitAsmStmt() to

  • Not tie physregs with the "+r" constraint, but instead add the hard register as an input constraint. This makes "+r" and "=r":"r" look the same in the output.

Background: Macro intensive code may contain inline assembly statements with multiple operands constrained to the same physreg. Such a case with the operand constraints "+r" : "r" currently triggers the TwoAddressInstructionPass assertion against any extra use of a tied register. Furthermore, TwoAddress will insert a COPY to that physreg even though isel has already done so (for the non-tied use), which may lead to a second redundant instruction currently.

A simple fix for this is to not emit tied physreg uses in the first place for the "+r" constraint, which is what this patch does.

  • Give an error on multiple outputs to the same physical register.

This should be reported and this is also what GCC does.

Diff Detail

Event Timeline

jonpa created this revision.Sep 8 2020, 4:56 AM
jonpa requested review of this revision.Sep 8 2020, 4:56 AM
lebedev.ri retitled this revision from [CFE] Fix handling of physical registers in inline assembly operands. to [clang] Fix handling of physical registers in inline assembly operands..Sep 8 2020, 5:16 AM
lebedev.ri edited subscribers, added: cfe-commits; removed: llvm-commits.
iii added a subscriber: iii.Sep 10 2020, 9:53 AM
jonpa added a reviewer: void.Sep 23 2020, 3:24 AM
jonpa removed a reviewer: void.

Ping!

Last chance for anyone to speak up about any objections to committing this...

craig.topper added inline comments.Oct 6 2020, 11:35 PM
clang/lib/CodeGen/CGStmt.cpp
2090

Can we get rid of the count call above and just use the bool from the std::pair returned by insert?

jonpa updated this revision to Diff 296618.Oct 7 2020, 1:43 AM
jonpa marked an inline comment as done.

Updated per review by eliminating the call to 'count'.

clang/lib/CodeGen/CGStmt.cpp
2090

Ah, yes.

jyu2 added inline comments.Oct 7 2020, 7:53 AM
clang/lib/CodeGen/CGStmt.cpp
2060

Use llvm::SmallSet?

jonpa updated this revision to Diff 296901.Oct 8 2020, 2:27 AM
jonpa marked an inline comment as done.

Updated per review to use SmallSet.

aaron.ballman added inline comments.Oct 8 2020, 9:17 AM
clang/lib/CodeGen/CGStmt.cpp
2088

Diagnostics in Clang are typically not grammatically correct, so I think it should be multiple instead of Multiple.

jonpa updated this revision to Diff 297136.Oct 9 2020, 12:44 AM
jonpa marked an inline comment as done.

Updated message string per review.

aaron.ballman accepted this revision.Oct 9 2020, 5:20 AM

This LGTM, but I'm not super well-versed with asm statements to begin with. You should wait a bit for other reviewers to weigh in before landing.

This revision is now accepted and ready to land.Oct 9 2020, 5:20 AM
jyu2 accepted this revision.Oct 9 2020, 2:40 PM

LGTM

Herald added a project: Restricted Project. · View Herald TranscriptOct 13 2020, 6:11 AM
jonpa reopened this revision.Oct 13 2020, 11:46 PM

Patch temporarily reverted.

This revision is now accepted and ready to land.Oct 13 2020, 11:46 PM
jonpa requested review of this revision.Oct 14 2020, 12:04 AM
jonpa added a comment.Oct 16 2020, 5:10 AM

The problem seems to be with a tied earlyclobber operand:

asm("" : "+&r"(a));

Per the comment in InlineAsm::ConstraintInfo::Parse(), only output can be earlyclobber.

I am not sure if the earlyclobber ('&') should with this patch be removed from the added use of the register, or if this has to for some reason really be tied to the def?

The problem seems to be with a tied earlyclobber operand:

asm("" : "+&r"(a));

Per the comment in InlineAsm::ConstraintInfo::Parse(), only output can be earlyclobber.

I am not sure if the earlyclobber ('&') should with this patch be removed from the added use of the register, or if this has to for some reason really be tied to the def?

So we have a read/write tied earlyclobber operand that is also tied to a phyreg via an asm register declaration? That means that: the physreg is used as input, it is also used as output, and it is written to before all inputs are read (i.e. no *other* input may share the same register, even if it would otherwise hold the same value).

I believe there is no way to represent the effect of a read/write tied earlyclobber in any other way. E.g. trying to use separate inputs/outputs (one with "=&r", one with "r") does not work since the earlyclobber on the output would prevent the same register to be used for the input, but it *has* to be the same register ...

On on other hand, in this special case the original problem cannot actually occur, because there cannot be any other input using the same physreg (that's what the earlyclobber ensures), so maybe be just don't do that transformation if an earlyclobber is present?

jonpa updated this revision to Diff 298624.Oct 16 2020, 6:52 AM

That makes sense... Patch updated to keep the tying of operands for this case of earlyclobber.

@nickdesaulniers : Can you see if this patch now passes your builds?

nickdesaulniers requested changes to this revision.Oct 16 2020, 11:16 AM

Thanks for updating the patch.

Please add a test case for the earlyclobber case. In particular, if none of the in-tree tests failed for the initial version of this patch that landed, you should add one that would have.

I should be able to git checkout c78da037783bda0f27f4d82060149166e6f0c796, observe the test case fail, then arc patch D87279 and observe the test case pass.

The current version of this patch, Diff 298624, passes the previously failing Linux kernel builds.

This revision now requires changes to proceed.Oct 16 2020, 11:16 AM
jonpa updated this revision to Diff 298819.Oct 17 2020, 3:51 AM

Sorry, here's the test case included as well...

nickdesaulniers accepted this revision.Oct 19 2020, 1:12 PM

Looks great, thanks!

This revision is now accepted and ready to land.Oct 19 2020, 1:12 PM

@nathanchance reports in https://github.com/ClangBuiltLinux/linux/issues/1512
that

// clang x.c
register long current_stack_pointer asm("rsp");
void vmcs_set_bits() {
  asm goto("" : "+r"(current_stack_pointer) : : : do_exception);
do_exception:;
}

now seems to be crashing as a result of this change.

This change seems like it's done in the wrong layer -- why do we add a special-case in Clang to emit "={r11},{r11}", instead of fixing LLVM to not break when given the previously-output "={r11},0"?

Furthermore, since earlyclobber was exempted from this change, doesn't the same TwoAddressInstructionPass bug still affect earlyclobber uses?

Sorry for the delay.

This change seems like it's done in the wrong layer -- why do we add a special-case in Clang to emit "={r11},{r11}", instead of fixing LLVM to not break when given the previously-output "={r11},0"?

We were starting out with a test case that was the result of macro-intensive code (Linux kernel). It had a tied physreg def/use, with an additional use of the same physreg, which did not compile with clang. There were two problems: TwoAddress could not handle this case and therefore asserted, and also the extra COPY in the output of the same physreg that would also have to be dealt with.

Instead of trying to improve TwoAddress to recognize and handle a special case like this while also making sure no redundant COPYs would be produced, we found it more reasonable to handle this in the front end. The reasoning was that "= {r7},0" should be completely equivalent with "= {r7},{r7}", and they should be treated the same by the compiler. A tied physreg did not seem useful in the first place, so we decided to always emit the second form from the front end.

This is pretty much what the original commit message said: "Background: Macro intensive user code may contain inline assembly statements with multiple operands constrained to the same physreg. Such a case (with the operand constraints "+r" : "r") currently triggers the TwoAddressInstructionPass assertion against any extra use of a tied register. Furthermore, TwoAddress will insert a COPY to that physreg even though isel has already done so (for the non-tied use), which may lead to a second redundant instruction currently. A simple fix for this is to not emit tied physreg uses in the first place for the "+r" constraint, which is what this patch does."

Furthermore, since earlyclobber was exempted from this change, doesn't the same TwoAddressInstructionPass bug still affect earlyclobber uses?

I was never under the impression that this was a bug in TwoAddress, but rather an unhandled ("odd") case of phys-regs. ISel produced the first COPY, and TwoAddress the second one. How should that be cleaned up? Seemed best to avoid adding unnecessarily to TwoAddress if possible to avoid it...

Sorry, we don't have early clobber operands on SystemZ, so I haven't really thought much about that. We decided we had to exclude them from this patch per the reasoning of https://reviews.llvm.org/D87279#2334519.

So I would guess it might still be a problem to compile an early-clobber tied def/use of a physreg with an extra use (or several) of the same physreg. If that case is something you intend to handle, I would agree that perhaps then the front end change of this patch would not be needed anymore as it is the alternative approach.

@nathanchance reports in https://github.com/ClangBuiltLinux/linux/issues/1512 that... now seems to be crashing as a result of this change.

I tested x.c, and it "works" on SystemZ, while it did not pass ISel on X86, from what I understand. Given the above reasoning of the equivalency between the two expressions (use either tied or explicit of the same phys-reg), I would also wonder if X86 is "choking on valid IR"...

As another matter, the SystemZ backend actually can't handle x.c, as it triggers `SpillGPRs.LowGPR != SpillGPRs.HighGPR && "Should be saving %r15 and something else". This is however a problem in the prologe/epilog emission if anything... I guess the case of just clobbering SP in a function has not been seen before.