This is an archive of the discontinued LLVM Phabricator instance.

[libc][NFC] Fix GCC inline asm compilation problems + workaround clang assertion
ClosedPublic

Authored by abrachet on Jan 24 2022, 8:54 PM.

Details

Summary

The clobber list "cc" is added to inline assembly to workaround a clang assertion that triggers when building with a clang built with assertions enabled. See bug 53391.

The sqrtl input constraint was changed from f to t to fix compilation errors in gcc. Not that it matters much, it's nearly impossible to build with gcc anyway.

See https://godbolt.org/z/z3bc6a9PM showing functionally same output assembly.

Diff Detail

Event Timeline

abrachet created this revision.Jan 24 2022, 8:54 PM
abrachet requested review of this revision.Jan 24 2022, 8:54 PM
lntue accepted this revision.Jan 24 2022, 9:44 PM
This revision is now accepted and ready to land.Jan 24 2022, 9:44 PM
sivachandra accepted this revision.Jan 24 2022, 9:54 PM

That a compiler unnecessarily requires the "cc" clobber specification is OK but GCC requiring the input operand to have the "f" modifier instead of the "t" modifier seems not so correct. Should we really care about the GCC problem now?

This revision was landed with ongoing or failed builds.Jan 25 2022, 8:40 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 25 2022, 8:40 AM

That a compiler unnecessarily requires the "cc" clobber specification is OK but GCC requiring the input operand to have the "f" modifier instead of the "t" modifier seems not so correct. Should we really care about the GCC problem now?

Ok I've committed without the changes to sqrtl

That a compiler unnecessarily requires the "cc" clobber specification is OK but GCC requiring the input operand to have the "f" modifier instead of the "t" modifier seems not so correct. Should we really care about the GCC problem now?

Ok I've committed without the changes to sqrtl

When something arcane like this is there for a specific reason like working around a compiler bug, then this should always have comments and ideally those comments should have URLs to the compiler bug report so in future it's clear why something is there and when it can be removed. If these instructions don't actually touch the condition codes in hardware, then a proper asm would not say they do. If there's a need to do something other than what the instruction's hardware definition should suggest, it needs a comment. Instructions like fsqrt are documented in the Intel manual to touch the FPU status flags. Perhaps GCC is using the "cc" clobber to also indicate the condition code-like portions of the FPU status register. (However, I don't see any indications of this in GCC's source where it emits fsqrt itself.)

In the case of "f" vs "t" I think this is a misuse of the constraint semantics, which are specified by GCC. The "t" constraint for x86 is a specific register, not a class of registers. In GCC, when two operands must actually be the same thing exactly then this must be specified directly by using a matching constraint (just digits). So here, you need to use "=t" in the output and then "0" in the input to say that the input operand must be identical to the output operand. Otherwise you're saying they should be two separate operands that both meet the "t" constraint. Since that constraint is only met by a single register, that's impossible. Using the "f" constraint tells it to use some other x87 register, which won't be the actual input to the machine instruction.

In the case of "f" vs "t" I think this is a misuse of the constraint semantics, which are specified by GCC. The "t" constraint for x86 is a specific register, not a class of registers. In GCC, when two operands must actually be the same thing exactly then this must be specified directly by using a matching constraint (just digits). So here, you need to use "=t" in the output and then "0" in the input to say that the input operand must be identical to the output operand. Otherwise you're saying they should be two separate operands that both meet the "t" constraint. Since that constraint is only met by a single register, that's impossible. Using the "f" constraint tells it to use some other x87 register, which won't be the actual input to the machine instruction.

Thanks for explaining. I verified that using the "0" modifier for the input argument fixes the GCC problem. Without that, the error GCC was giving was:

error: output operand 0 must use '&' constraint
    3 |     __asm__ __volatile__("fsqrt" : "=t"(out) : "t"(in));

Which got me to add =& constraint to the output operand and GCC promptly complained about the "impossible constraint" that you have explained.

I think in fact the proper thing for this case is to just use a single operand "+t" to directly indicate that the one operand is both input and output. I'm not sure that "=&t" and "0" will behave any different, but this seems to be exactly what "+" is for.

I think in fact the proper thing for this case is to just use a single operand "+t" to directly indicate that the one operand is both input and output. I'm not sure that "=&t" and "0" will behave any different, but this seems to be exactly what "+" is for.

Indeed, thanks!