Page MenuHomePhabricator

[X86][AsmParser] re-introduce 'offset' operator
Needs ReviewPublic

Authored by coby on Sep 5 2017, 3:27 AM.



Amend MS offset operator implementation, to more closely fit with its MS counterpart:

  1. InlineAsm: evaluate non-local source entities to their (address) location
  2. Provide a mean with which one may acquire the address of an assembly label via MS syntax, rather than yielding a memory reference (i.e. "offset asm_label" and "$asm_label" should be synonymous
  3. address PR32530

Presented diffs are relative to D37412

InlineAsm tests: D37466

Diff Detail


Event Timeline

coby created this revision.Sep 5 2017, 3:27 AM

Adding @mcrosier , as he originally introduced the feature

coby edited the summary of this revision. (Show Details)Sep 6 2017, 2:47 AM
rnk added inline comments.Sep 18 2017, 1:58 PM

Can you add some more tests? I think this combines with everything: scale, index, base, segment, and integer displacement.

coby added a comment.Sep 19 2017, 12:50 AM

sure, will update shortly

coby updated this revision to Diff 117285.EditedOct 1 2017, 7:45 AM

Minimal source changes - introduced 'new' enum values definitions as adapted on dependent change (D37412)
Added few more assembly level testing, as requested by @rnk
Minimized 'offset_operator.ll' test-case, and auto-updated it checks

RKSimon edited edge metadata.Nov 2 2017, 3:13 AM

@coby What is happening with this patch?

coby added a comment.Nov 2 2017, 3:43 AM

@coby What is happening with this patch?

currently suspended due to a more dire matters.
nonetheless - as far as functionality is to be considered, and unless any of the listed reviewers will object, this one require no further development.

avt77 edited edge metadata.Nov 28 2017, 11:31 PM

@coby, is it possible to commit this patch? I'd like to fix based on this source.

mcrosier resigned from this revision.Apr 11 2018, 11:29 AM

Thanks for working on this! I'd be happy to see this fixed. It also likely helps with PR36676.

I patched it in to play with it (it still applies with just two hunks not applying via patch, but manually applying them seems easy.) I got an assert on this code (which is from Windows.h):

$ cat repro.ii
void Int64ShrlMod32(unsigned __int64 Value) {
  __asm {
    mov edx, dword ptr [Value+4]
$ bin/clang-7.0 -cc1 -triple i386-pc-windows-msvc19.11.0 -S -disable-free -main-file-name moved.i -mrelocation-model static -mthread-model posix -relaxed-aliasing -fmath-errno -masm-verbose -mconstructor-aliases -target-cpu pentium4 -mllvm -x86-asm-syntax=intel -D_MT -flto-visibility-public-std --dependent-lib=libcmt --dependent-lib=oldnames -fms-volatile -fdiagnostics-format msvc -dwarf-column-info -debugger-tuning=gdb -target-linker-version 305 -momit-leaf-frame-pointer -ffunction-sections -coverage-notes-file /Users/thakis/src/llvm-build-goma/moved.gcno -resource-dir /Users/thakis/src/llvm-build-goma/lib/clang/7.0.0 -Os -Wno-msvc-not-found -WCL4 -fdebug-compilation-dir /Users/thakis/src/llvm-build-goma -ferror-limit 19 -fmessage-length 254 -fno-use-cxa-atexit -fms-extensions -fms-compatibility -fms-compatibility-version=19.11 -fdelayed-template-parsing -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -o moved.asm -x cpp-output repro.ii
Assertion failed: (AR.IntelExp.isValid() && "cannot write invalid intel expression"), function parseMSInlineAsm, file /Users/thakis/src/llvm-rw/lib/MC/MCParser/AsmParser.cpp, line 5709.
Stack dump:
0.	Program arguments: /Users/thakis/src/llvm-build-goma/bin/clang-7.0 -cc1 -triple i386-pc-windows-msvc19.11.0 -S -disable-free -main-file-name moved.i -mrelocation-model static -mthread-model posix -relaxed-aliasing -fmath-errno -masm-verbose -mconstructor-aliases -target-cpu pentium4 -mllvm -x86-asm-syntax=intel -D_MT -flto-visibility-public-std --dependent-lib=libcmt --dependent-lib=oldnames -fms-volatile -fdiagnostics-format msvc -dwarf-column-info -debugger-tuning=gdb -target-linker-version 305 -momit-leaf-frame-pointer -ffunction-sections -coverage-notes-file /Users/thakis/src/llvm-build-goma/moved.gcno -resource-dir /Users/thakis/src/llvm-build-goma/lib/clang/7.0.0 -Os -Wno-msvc-not-found -WCL4 -fdebug-compilation-dir /Users/thakis/src/llvm-build-goma -ferror-limit 19 -fmessage-length 254 -fno-use-cxa-atexit -fms-extensions -fms-compatibility -fms-compatibility-version=19.11 -fdelayed-template-parsing -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -o moved.asm -x cpp-output repro.ii 
1.	repro.ii:5:1: current parser token '}'
2.	repro.ii:1:45: parsing function body 'Int64ShrlMod32'
3.	repro.ii:1:45: in compound statement ('{}')
0  clang-7.0                0x000000010d82ebc8 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
1  clang-7.0                0x000000010d82ddc5 llvm::sys::RunSignalHandlers() + 85
2  clang-7.0                0x000000010d82f1d2 SignalHandler(int) + 258
3  libsystem_platform.dylib 0x00007fffc1fd2b3a _sigtramp + 26
4  libsystem_platform.dylib 0x00007fff532a277c _sigtramp + 2435644508
5  libsystem_c.dylib        0x00007fffc1e57420 abort + 129
6  libsystem_c.dylib        0x00007fffc1e1e893 basename_r + 0
7  clang-7.0                0x000000010d5320cd (anonymous namespace)::AsmParser::parseMSInlineAsm(void*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, unsigned int&, unsigned int&, llvm::SmallVectorImpl<std::__1::pair<void*, bool> >&, llvm::SmallVectorImpl<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >&, llvm::SmallVectorImpl<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >&, llvm::MCInstrInfo const*, llvm::MCInstPrinter const*, llvm::MCAsmParserSemaCallback&) + 7981
8  clang-7.0                0x000000010e936e52 clang::Parser::ParseMicrosoftAsmStatement(clang::SourceLocation) + 5746
9  clang-7.0                0x000000010e92bf66 clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::AllowedConstructsKind, clang::SourceLocation*, clang::Parser::ParsedAttributesWithRange&) + 1238
10 clang-7.0                0x000000010e92b938 clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::AllowedConstructsKind, clang::SourceLocation*) + 264
11 clang-7.0                0x000000010e933739 clang::Parser::ParseCompoundStatementBody(bool) + 1497

@coby Are you still looking as this? If not (@coby hasn't been active on phab since 2017) @thakis would you be in a position to commandeer this and complete it?

@coby Are you still looking as this? If not (@coby hasn't been active on phab since 2017) @thakis would you be in a position to commandeer this and complete it?

I can try, but I don't know this code well. I've uploaded a rebased but otherwise unmodified version at

The crash I posted above is because IntelExpr() used to map a Scale of 0 to 1. If I add that back in, my crash goes away.

However, offset is also supposed to work with quoted symbols. If I allow that like so:

+  if (!isParsingInlineAsm()) {
+    if ((getTok().isNot(AsmToken::Identifier) &&
+         getTok().isNot(AsmToken::String)) ||
+        getParser().parsePrimaryExpr(Val, End))
+      return Error(Start, "unexpected token!");

then one of the two examples in PR36676 starts working, but the other asserts due to an unknown relocation. The problem is that offset is modeled as a X86Operand::CreateImm() with a fixup, but since intel asm instrs don't have sizes the logic sees that the imm is 0 and hence creates a 16-bit imm and we then get a 2 byte reloc, which aren't implemented. This is wrong anyhow since push offset foo should create a pointer size relocation. X86AsmParser::ParseIntelOperand() could not create an imm but a memory ref if SM.isOffsetOperator() and that makes things go, but then the push is encoded with a PUSH32rmm instead of a PUSH32i8, which is a byte larger and different from what clang-cl writes if you don't pass /FA, so I'd like to fix that too. I haven't yet figured out how :-)