Extend support for expressions which represent a variable access in ms-style inline-asm, to allow the incorporation of both registers and variables.
Currently, expression such as 'asm mov eax, [var + eax]' would have been reduced to the (equivalent of) 'asm mov eax, [var]'
This patch amends it
clang counterpart: D35775
Details
- Reviewers
rnk myatsina echristo m_zuckerman
Diff Detail
- Repository
- rL LLVM
Event Timeline
I would really prefer to see fewer rewrites and more pass through of inline assembly to MC. The way we parse inline asm in two phases is a confusing source of bugs. It also makes it harder to test the Intel asm parser, since the more rewrites we do the more cross-repository tests we have to add.
include/llvm/MC/MCParser/MCTargetAsmParser.h | ||
---|---|---|
77 | format | |
test/MC/X86/intel-syntax-error.s | ||
21–24 | I see, these are in fact valid x86. :) Can you move them to a valid test case so we don't lose coverage, like test/MC/X86/intel-syntax.s? |
I would really prefer to see fewer rewrites and more pass through of inline assembly to MC
As do I. What are you suggesting? Keep rewrites to the absolute necessary minimum?
Pepole do some funky stuff on MS-Inline-Asm, a large portion of it should be 'legalized' in order for the Assembler to be able to digest.
The other way around is to allow this nonsense in the Assembler as well.
Both options are bad, but I believe the latter is a bit more.
If it's of any comfort - have this 'new' rewrite is to be allowed - AOK_Imm/AOK_ImmPrefix/AOK_SizeDirective can be implemented as a special case of it (AOK_DotOperator should be cancelled).
Net effect will remain the same - but it'll be (possibly) less of a mess of code.
Hm, rewriting everything in terms of a general AOK_IntelExpr does sound better, even if it is kind of the opposite direction from what I was thinking. :) What I'm imagining is that any time there is a symbolic reference to C/C++ enums, globals, or variables, we basically fully parse that operand and decompose it into the usual parts of an X86 memory operand: segment, base, index, scale, displacement, etc, check that those parts make sense, and emit a completely rewritten operand. Right now the parsing code in MC has a hard time deciding what's legal because it doesn't know whether certain C/C++ variable references refer to stack locals or global variables. A local variable will require using a base register, but a global will not, but on x86_64, a global will require %rip-relative addressing, which can't use base+index registers.
Doing this exposes a lot of x86 details that aren't really supposed to be in the supposedly target neutral asmparser interface. That battle may already be lost, though.
This battle can fairly been viewed as lost, until a 'global' refactoring/reconsideration will be announced (if ever) regarding the way targets' specific parsers and AsmParser are to integrate one with each other, in terms of responsibilities. those 'leaks' you've mentioned from X86 to AsmParser are a fine exhibition of it. I have some ideas regarding how we may overcome/'hide' such phenomena in a way which should make rewrites unnecessary, but they are currently seem 'too heavy' (and reaching code parts wihch I should get more familiar with ere tempering). perhaps a llvm-dev discussion should be commenced? to either find a more clean and elegant way, or setting 'guidelines' which will confine this witchcraft to an agreed upon, tolerable minimum.
In the meantime - I deem we should 'allow' ourselves to 'rewrite' only where no scale-able alternative can be conceived, and try to concentrate/uniform all rewrites under one hood (in an utopic future - it should ease rewrites elimination)
lib/Target/X86/AsmParser/X86AsmParser.cpp | ||
---|---|---|
1281 | Please keep the curly braces on the for loop. I know elsewhere they are omitted, but this is just too clever for me. | |
1283 | If we have to keep doing comparing source locations, this code would greatly benefit from some helper functions that do things like the following: isLocBetween, isLocAfter, isLocBefore, so you don't have to deal in pointers all the time. | |
1308–1319 | This is too clever. IMO we should take SymName, make an SMLoc out of it, and use the comparison helpers to figure out if the string is before or after '[', and do the appropriate rewrite. | |
1525 | User-facing diagnostics are supposed to be sentence fragments that start with lower case. We also try to avoid contractions in user-facing diagnostics, so this should be "cannot" instead of "Can't". |
format