This is an archive of the discontinued LLVM Phabricator instance.

[Inline-asm] Add diagnosts for unsupported inline assembly arguments
ClosedPublic

Authored by pengfei on Jul 30 2021, 2:36 AM.

Details

Summary

GCC supports power-of-2 size structures for the arguments. Clang supports fewer than GCC. But Clang always crashes for the unsupported cases.

This patch adds sema checks to do the diagnosts to solve these crashes.

Diff Detail

Event Timeline

pengfei requested review of this revision.Jul 30 2021, 2:36 AM
pengfei created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJul 30 2021, 2:36 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
jyu2 added a comment.Aug 2 2021, 1:28 AM

I don't find any document about GCC and ICC's behavior and the generated

code doesn't make much sense to me. Besides, if we change "=rm" to "=r"
or "=m", GCC and ICC will report error differently. Which gives me an
impression that the use of constraint is undefined behavior.

Here is gcc document: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands

To me it is defined behavior. Or I may missing some thing here.

constraint:
A string constant specifying constraints on the placement of the operand; See Constraints, for details.

Output constraints must begin with either ‘=’ (a variable overwriting an existing value) or ‘+’ (when reading and writing). When using ‘=’, do not assume the location contains the existing value on entry to the asm, except when the operand is tied to an input; see Input Operands.

After the prefix, there must be one or more additional constraints (see Constraints) that describe where the value resides. Common constraints include ‘r’ for register and ‘m’ for memory. When you list more than one possible location (for example, "=rm"), the compiler chooses the most efficient one based on the current context. If you list as many alternates as the asm statement allows, you permit the optimizers to produce the best possible code. If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, local register variables may provide a solution (see Local Register Variables).

@jyu2, Thanks for the detailed information.

Besides, if we change "=rm" to "=r"
or "=m", GCC and ICC will report error differently. Which gives me an
impression that the use of constraint is undefined behavior.
To me it is defined behavior. Or I may missing some thing here.

Sorry, I didn't describe it clearly. I meant it's ub when they are used together with structure type. E.g., https://godbolt.org/z/ssY5K4xEP
As you can see, when we use "=rm", GCC and ICC happen to work, that's because GCC only allows the structure to be passed on register with ICC only allows memory. I didn't find a document which describes the relationships between structure and its output constraint.
Besides, https://godbolt.org/z/5ex9x3dvv shows GCC and ICC have some rules when we passing a structure (Adding one more element in the exapmles will fail on both GCC and ICC). I didn't find the document either.

code doesn't make much sense to me.

This patch doesn't change any functionality of Clang. We just adding more sema checks to avoid the ugly crash. This is required from our user.
I believe we can do more things to march the behavior of GCC and ICC. I have added a FIXME in the code. Considering we don't have a clear document and such a case is in rare. And we don't have such requirment from user. I think this patch is till better than none.
Does it make sense?

jyu2 added a comment.Aug 2 2021, 11:45 AM

First I did not say "code doesn't make much sense to me". This is copy form your description.

Okay, thank for explanation. So if I understand right, both gcc and icc generated code is not making sense to you. So instead, you want clang give error for this when output size greater than input size.
A question:
What about input size greater than output size? It seems the error already emit for that case in Sema. Should also handle the same with your case in Sema?

clang/lib/Sema/SemaStmtAsm.cpp
671

Why not emit error in here?

First I did not say "code doesn't make much sense to me". This is copy form your description.

Oop, my bad understanding. :)

both gcc and icc generated code is not making sense to you.

Yes. Unless there's document to specify the behavior clearly.

So instead, you want clang give error for this when output size greater than input size.

Giving error only when output size greater than input size and output type is not one of the 3 cases in CGStmt.cpp:2480. I.e.: OutputDomain != AD_Other || OutSize <= InSize. Otherwise, Clang will crash there.

What about input size greater than output size? It seems the error already emit for that case in Sema. Should also handle the same with your case in Sema?

Sema emits error when 1) InputDomain != AD_Other 2) Output is mentioned and OutSize < InSize. So input size is allow to be greater than output size only when output is not mentioned. I don't fully understand the code in CGStmt.cpp, but according to comment in line 2463, it seems only larger output result needs to be handled.

clang/lib/Sema/SemaStmtAsm.cpp
671

Actually, adding this condition will let it fall through and emit error at line 688.

pengfei updated this revision to Diff 363746.Aug 3 2021, 8:00 AM

Implemented the handling for structure types.

pengfei retitled this revision from [Inline-asm] Add semacheck for unsupported constraint to [Inline-asm] Add structure type handling when they are tied in input and output constraints.Aug 3 2021, 8:05 AM
pengfei edited the summary of this revision. (Show Details)
pengfei updated this revision to Diff 363974.Aug 3 2021, 11:39 PM

Address Jennifer comments from off-line.

pengfei updated this revision to Diff 413453.Mar 7 2022, 6:57 AM

Refactor.

Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2022, 6:57 AM
pengfei retitled this revision from [Inline-asm] Add structure type handling when they are tied in input and output constraints to [Inline-asm] Add diagnosts for unsupported inline assembly arguments.Mar 7 2022, 7:03 AM
pengfei edited the summary of this revision. (Show Details)
skan added a subscriber: skan.Mar 7 2022, 5:33 PM
skan added inline comments.
clang/test/Sema/asm.c
317–345

Better to add comments about the size of the struct.

360–361

Better to comment that we are checking the size is power of 2 here.

pengfei updated this revision to Diff 413676.Mar 7 2022, 6:27 PM
pengfei marked 2 inline comments as done.

Thanks @skan !

jyu2 added inline comments.Mar 9 2022, 8:17 AM
clang/lib/Sema/SemaStmtAsm.cpp
677

Error message is not very clear to me. I think we should create more specified error message there. Like power of two, or size < 8 or > pointer size?

Using error message selector.

pengfei updated this revision to Diff 414276.Mar 9 2022, 10:18 PM

Address review comment.

clang/lib/Sema/SemaStmtAsm.cpp
677

Use getIntTypeForBitwidth instead. So we don't need to check for each case.

pengfei updated this revision to Diff 414277.Mar 9 2022, 10:22 PM

Remove outdated comment

jyu2 added inline comments.Mar 10 2022, 3:31 PM
clang/lib/Sema/SemaStmtAsm.cpp
622

Are you sure you want to change the Input/output Domain? Since you changed this, could you add both codegen and sema check tests for struct type(you already has sema check for struct type, but I don't see any array type) and array type.

Thanks.
Jennifer

675

Do you need return NS after diagnostic?

pengfei updated this revision to Diff 414614.Mar 11 2022, 1:49 AM

Address review comments.

clang/lib/Sema/SemaStmtAsm.cpp
622

The input / output domain are just for sema check here. I don't think it changes codegen behavior.

675

Good catch!

jyu2 added inline comments.Mar 21 2022, 7:48 AM
clang/lib/Sema/SemaStmtAsm.cpp
622

Yes, for seam check, once error emit, compiler stop at CodeGen. However, if no error emit, compiler will go though the CodeGen.

Since pointer/struct and constantArray are allowed for small size of type, so would you please add code gen test for that?

In your test where no-error should emit part...

asm ("" : "=rm" (a): "0" (1)); // no-error

Thanks.
Jennifer

pengfei updated this revision to Diff 417303.Mar 22 2022, 8:13 AM

Address review comment.

clang/lib/Sema/SemaStmtAsm.cpp
622

Thanks Jennifer! I see your point. I'll only add sema check (rather than loose in some way) in this patch.
The codegen part is complicated, some cases will pass but some still fail. I have left comments in the tests.

jyu2 accepted this revision.Mar 22 2022, 8:21 PM

This is okay with me.

This revision is now accepted and ready to land.Mar 22 2022, 8:21 PM
This revision was landed with ongoing or failed builds.Mar 22 2022, 9:12 PM
This revision was automatically updated to reflect the committed changes.