This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Add support for the X asm constraint
ClosedPublic

Authored by sbaranga on Apr 13 2016, 8:30 AM.

Details

Summary

This patch adds support for the X asm constraint.

To do this, we lower the constraint to either a "w" or "r" constraint
depending on the operand type (both constraints are supported on ARM).

Fixes PR26493

Diff Detail

Event Timeline

sbaranga updated this revision to Diff 53568.Apr 13 2016, 8:30 AM
sbaranga retitled this revision from to [ARM] Add support for the X asm constraint.
sbaranga updated this object.
sbaranga added reviewers: rengolin, t.p.northover.
sbaranga added a subscriber: llvm-commits.
rengolin edited edge metadata.Apr 13 2016, 8:59 AM

Hi Silviu,

The documentation on the X constraint is... lacking. All it says is "Any operand whatsoever is allowed.". Sigh...

So, before we get support for it, I'd like to see a few examples from user code on how people expect it to work (probably the best we can do for now).

Also, you don't seem to be testing much. I think you need to have better instructions (adds, movs, loads) to make sure we're testing the whole range of the X constraint. I'm guessing *any* operand is a bit vague, but does it support immediates, as well as registers, expressions?

Maybe it'd be easier to create a bug, block PR27197 with it, and explain the expected behaviour, with reproducible C files and command lines.

cheers,
--renato

lib/Target/ARM/ARMISelLowering.cpp
11450

What about VFP?

11457

No "t" for 32-bit FP registers?

test/CodeGen/ARM/inlineasm-X-constraint.ll
1

why do you need -no-integrated-as?

Hi Silviu,

The documentation on the X constraint is... lacking. All it says is "Any operand whatsoever is allowed.". Sigh...

Hi Renato,

Our definition is "Allows an operand of any kind, no constraint whatsoever. Typically useful to pass a label for an asm branch or call." (LangRef.rst).
So basically we're free to choose.

FWIW, our approach is similar to the X86 one.

So, before we get support for it, I'd like to see a few examples from user code on how people expect it to work (probably the best we can do for now).

The PR (https://llvm.org/bugs/show_bug.cgi?id=26493) text has some good pointers:
http://www.ethernut.de/en/documents/arm-inline-asm.html

Also, you don't seem to be testing much. I think you need to have better instructions (adds, movs, loads) to make sure we're testing the whole range of the X constraint. I'm guessing *any* operand is a bit vague, but does it support immediates, as well as registers, expressions?

I'll add more tests. I think the types of the operands in these instructions are more important than the instructions themselves.

Maybe it'd be easier to create a bug, block PR27197 with it, and explain the expected behaviour, with reproducible C files and command lines.

The only use case of this that I'm aware of is the one from http://www.ethernut.de/en/documents/arm-inline-asm.html.

Cheers,
Silviu

lib/Target/ARM/ARMISelLowering.cpp
11450

Good point.

11457

Why "t" (wouldn't "w" be enough)?

test/CodeGen/ARM/inlineasm-X-constraint.ll
1

I don't, I'll just remove this.

Do'h! I missed the PR reference, my apologies.

Though, wrt the blog post, I think it's a very lean source for such a generic case like this. I suggest you try to use the constraint on GCC and come up with a general description for its behaviour, what's allowed and what's not. Maybe even talk to a few GCC engineers once you have a better idea, and then be able to accurately describe what we want to implement.

Without that, it'll be hard to review the patch, and adding a hack to make your specific use case pass wouldn't be very constructive, either. :)

cheers,
--renato

lib/Target/ARM/ARMISelLowering.cpp
11457

"t" would select an S register, while "w" would select a D reg.

If this is "anything", shouldn't it also allow immediate values and expressions?

Do'h! I missed the PR reference, my apologies.

Though, wrt the blog post, I think it's a very lean source for such a generic case like this. I suggest you try to use the constraint on GCC and come up with a general description for its behaviour, what's allowed and what's not. Maybe even talk to a few GCC engineers once you have a better idea, and then be able to accurately describe what we want to implement.

Without that, it'll be hard to review the patch, and adding a hack to make your specific use case pass wouldn't be very constructive, either. :)

cheers,
--renato

Sure!

Cheers,
Silviu

lib/Target/ARM/ARMISelLowering.cpp
11457

Our 'w' matching strategy would also match SPRs, DPRs and QPRs (see getRegForInlineAsmConstraint), based on the size of ConstraintVT. If that fails, the generic getRegForInlineAsmConstraint will still be able to give this a register class.

It does need a regression test though.

Regarding immediates/expressions: at least our implementation doesn't allow it (on any targets). I'll check what gcc does.

jgreenhalgh added a subscriber: jgreenhalgh.EditedApr 14 2016, 3:44 AM

Decaying this to "w" or "r" would potentially pessimize code generation, and might well break use cases. The whole point of "X" is to prevent the compiler from having to reload an operand you don't actually care about using in the output assembly (a scratch, or in that blog post, a hidden dependency). As GCC isn't going to put any effort in to forcing the form of the operand, I'd expect most uses that actually try to print out an "X" constrained register to be using it as a shorthand for getting constants or labels out. There are normally more expressive constraints if that is what you need.

Thanks, James!

The current implementation will only perform the lowering part (and force this to a register by converting to "r" or "w") if the operand is not a basic block, function, or integer constant - so this should cover some of those use cases.

It is true that GCC would be more efficient in some cases (one example would be FP constants), but we would still fit into the definition of "no constraint whatsoever" and therefore correct - which is an improvement from the current situation, where we'll simply crash on this constraint.

We can still improve our handling of this constraint, but that would be something that is target independent and outside the scope of this patch.

Thanks,
Silviu

It is true that GCC would be more efficient in some cases (one example would be FP constants), but we would still fit into the definition of "no constraint whatsoever" and therefore correct - which is an improvement from the current situation, where we'll simply crash on this constraint.

I agree bad codegen trumps ICE crashes, but James mentioned it "might well break use cases". I'm interested in those...

James, do you have some pointers on the expected usage of this constraint in the wild? The more the merrier!

cheers,
--renato

It is true that GCC would be more efficient in some cases (one example would be FP constants), but we would still fit into the definition of "no constraint whatsoever" and therefore correct - which is an improvement from the current situation, where we'll simply crash on this constraint.

I agree bad codegen trumps ICE crashes, but James mentioned it "might well break use cases". I'm interested in those...

James, do you have some pointers on the expected usage of this constraint in the wild? The more the merrier!

A Debian grep gets a few examples out from x86 mmx headers (and a couple for other architectures) https://codesearch.debian.net/results/%22%3DX%22/page_0

I'm making no promises as to how correct those uses are (http://sources.debian.net/src/php7.0/7.0.5-3/Zend/zend_multiply.h/?hl=56#L56 looks pretty fragile!), but I don't know of any better sources for real world code. I was only guessing this would break somebody's use case (because it is so liberal at the moment), rather than having one in particular in mind.

A Debian grep gets a few examples out from x86 mmx headers (and a couple for other architectures) https://codesearch.debian.net/results/%22%3DX%22/page_0

Hey, I didn't know about this code search, nice!

Silviu, it'd be good to get a few examples from that search result page and at least understand what's going on, and maybe prepare some better error messages than seg faults.

I'm making no promises as to how correct those uses are (http://sources.debian.net/src/php7.0/7.0.5-3/Zend/zend_multiply.h/?hl=56#L56 looks pretty fragile!), but I don't know of any better sources for real world code. I was only guessing this would break somebody's use case (because it is so liberal at the moment), rather than having one in particular in mind.

Crap. That example can go wrong in so many ways... The complete lack of type safety makes that macro a nightmare for compilers... sigh...

I think we can retain our current approach that, whatever makes sense, passes, and what doesn't breaks with an error message in the lines of "stupid user".

--renato

A Debian grep gets a few examples out from x86 mmx headers (and a couple for other architectures) https://codesearch.debian.net/results/%22%3DX%22/page_0

Hey, I didn't know about this code search, nice!

Silviu, it'd be good to get a few examples from that search result page and at least understand what's going on, and maybe prepare some better error messages than seg faults.

Yeah, I'm currently working through a list of additional use cases to see what works and why other things don't work.
Printing integer immediates and function addresses works.
For labels, we hit an issue where the label's basic block is not considered to have its address taken which can cause the basic block that's being pointed to by the label to be removed (at least in some cases), resulting in invalid output. I'll raise a PR for that.

I'm making no promises as to how correct those uses are (http://sources.debian.net/src/php7.0/7.0.5-3/Zend/zend_multiply.h/?hl=56#L56 looks pretty fragile!), but I don't know of any better sources for real world code. I was only guessing this would break somebody's use case (because it is so liberal at the moment), rather than having one in particular in mind.

Crap. That example can go wrong in so many ways... The complete lack of type safety makes that macro a nightmare for compilers... sigh...

I think we can retain our current approach that, whatever makes sense, passes, and what doesn't breaks with an error message in the lines of "stupid user".

You'll at least get an assembler error, which is much nicer.

Cheers,
Silviu

sbaranga updated this revision to Diff 54194.Apr 19 2016, 8:18 AM
sbaranga edited edge metadata.

Added further regression tests.

Only use a 'w' constraint if we know that we have at least VFPv2.

rengolin accepted this revision.Apr 22 2016, 1:42 PM
rengolin edited edge metadata.

Hi Silviu,

I'm ok with this approach, since it's simple and seems to cover the cases pretty well. I'm guessing that the error message when the prediction goes wrong is "invalid constraint for asm operand", which is *much* better than a segfault. Also, confusion around macros are really the user's fault for writing such an egregious code in the first place.

It'd be good if James could have a final look, but I'm ok with the change.

cheers,
--renato

test/CodeGen/ARM/inlineasm-X-constraint.ll
5

shouldn't you verify that the register used in vmsr is also used in vadd?

Same comment for other CHECK lines below.

This revision is now accepted and ready to land.Apr 22 2016, 1:42 PM
sbaranga added inline comments.Apr 25 2016, 3:38 AM
test/CodeGen/ARM/inlineasm-X-constraint.ll
5

The vadd operand ($0) doesn't appear in the vmsr instruction. It's there just to add the dependency (which is the usecase shown in the PR).

It'd be good if James could have a final look, but I'm ok with the change.

I think this will be fine. Missing the constants and labels might come back to bite you (I found an example of s390 using the "X" constraint in the kernel [ arch/s390/include/asm/jump_label.h ] to get at a label for hotpatching), but the break is no worse than you have at the moment (and that's s390, I didn't see anything scary in arch/arm or arch/arm64).

If you're happy to be more restrictive than GCC here, then I think think the patch is fine.

Please heavily comment the testcases to reflect that they are not representative of the way the "X" constraint should be used in the real world, and you may want to add further comments to the lowering function to make it clear that it is tighter than neccessary.

If you're happy to be more restrictive than GCC here, then I think think the patch is fine.

We're always happy to be more restrictive than GCC. :)

Please heavily comment the testcases to reflect that they are not representative of the way the "X" constraint should be used in the real world, and you may want to add further comments to the lowering function to make it clear that it is tighter than neccessary.

Indeed, please do! Also, if you could add the asm() statement that generated each example and a short explanation where the match is binding to, it'd be most helpful, given that you can't match the parameter in some cases.

With the additional comments in the test, LGTM. Thanks!

test/CodeGen/ARM/inlineasm-X-constraint.ll
5

Right, ok.

Thanks! Yes, I think being more restrictive than GCC should be fine at least for now.

sbaranga updated this revision to Diff 54850.Apr 25 2016, 7:28 AM
sbaranga edited edge metadata.

Add more comments for the regresion tests (explaining what's the
tests are doint and what's going on when we fail to emit constants/
labels).

Also add some additional comments in the lowering function
(pointing out that lowering to "r" or "w" is more restrictive than
necessary, possibly not handling as expected some code).

sbaranga closed this revision.Apr 25 2016, 7:35 AM

Excellent! Please go ahead and commit. Thanks!

Thanks, that looks perfect now.

Thanks for reviewing! Committed in r267411.

Cheers,
Silviu