This is an archive of the discontinued LLVM Phabricator instance.

ms-inline-asm: Scope inline asm labels to functions
ClosedPublic

Authored by ehsan on Jul 18 2014, 12:17 PM.

Details

Summary

This fixes PR20023. In order to implement this scoping rule, we maintain
a map from the label's name as it appears in the source code to an internal
name which is guaranteed to be an invalid mangled name. Every time that we
encounter a label, we add an entry to this internal map, and when we fail
to parse something as a valid identifier, we get a LookupInlineAsmLabel
callback from MC which allows us to optimistically interpret the identifier
as a label name. By the time that we pop the function's scope, we go
through this map, and for each entry verify that we have seen a label
somewhere in the function body, and emit a diagnostic otherwise. This has
the undesired side effect that for some unknown names, we may issue an
extra diagnostic (as can be seen in test/Sema/ms-inline-asm.c.

Diff Detail

Repository
rL LLVM

Event Timeline

ehsan updated this revision to Diff 11659.Jul 18 2014, 12:17 PM
ehsan retitled this revision from to ms-inline-asm: Scope inline asm labels to functions.
ehsan updated this object.
ehsan edited the test plan for this revision. (Show Details)
ehsan added a reviewer: rnk.
ehsan added a subscriber: Unknown Object (MLST).
rnk added inline comments.Jul 18 2014, 12:39 PM
include/clang/Sema/Sema.h
3134 ↗(On Diff #11659)

micro nit: & on the right.

lib/Parse/ParseStmtAsm.cpp
308–312 ↗(On Diff #11659)

This seems like the responsibility of the assembly parser. IMO it should call back into clang when it encounters a label in inline assembly mode, and then clang can set up the mapping.

test/CodeGen/ms-inline-asm.c
499–500 ↗(On Diff #11659)

Can you use the asm { ... } syntax for the test case? In the future, we need to lower consecutive asm statements into a single LLVM inline assembly blob, because right now LLVM can insert spills and restores between the two blobs. In other words, you can't rely on this today:

__asm mov eax, x
__asm test eax, eax ; eax may be different, but we usually have no reason to spill here

If you use the __asm { } syntax, we won't have to change this test in the future.

ehsan added inline comments.Jul 18 2014, 1:41 PM
lib/Parse/ParseStmtAsm.cpp
308–312 ↗(On Diff #11659)

Do we have access to the sema callback there? AFAICT we don't... :/

test/CodeGen/ms-inline-asm.c
499–500 ↗(On Diff #11659)

Ah, this reminds me that I intended to work on a patch to merge these __asm blocks. :-)

ehsan added inline comments.Jul 19 2014, 1:04 PM
lib/Parse/ParseStmtAsm.cpp
308–312 ↗(On Diff #11659)

Actually I think I can make this work... Will upload a new patch later today.

ehsan updated this revision to Diff 11687.Jul 19 2014, 2:14 PM

Addressed the review comments.

rnk added inline comments.Jul 24 2014, 11:52 AM
lib/Parse/ParseStmtAsm.cpp
96 ↗(On Diff #11687)

You can return a StringRef directly.

lib/Sema/SemaDecl.cpp
1492 ↗(On Diff #11687)

I believe ActOnPopScope happens on any closing curly brace, so this won't do the "right" thing:

void f() {
  bool failed;
  __asm retry:
  {
    failed = g();
  } // The label is cleared here.
  if (failed)
    __asm jmp retry
}

Thinking more carefully, I don't think having the label map right on Sema works at all in the general case of nested functions. Consider this C++ example with inner classes:

void f() {
  __asm some_label:
  struct A {
    static void g() {
      __asm jmp some_label ; This should jump forwards
      __asm some_label:
      __asm nop
    }
  };
}

You can build similar test cases from C++11 lambdas and Obj-C blocks, and now OpenMP captured statements.

We need to associate this label map with the function we're currently processing. Now that I'm considering the actual implementation difficulties, I'm going back to your "proposal #3" on PR20023, where we model these labels as regular label statements. That way, we reuse all the right name lookup machinery. We won't be able to 'goto' a label from an inline asm statement or 'jmp' to a non-asm label, so if you go this way, please diagnose both cases. You can probably add a bit to LabelDecl to track whether the label was from MS inline asm or regular source code.

Apologies for the runaround, but I think it's probably better to rewrite along these lines. What do you think?

ehsan updated this revision to Diff 13793.Sep 17 2014, 11:21 AM

Updated the patch to use the LabelDecl machinery.

Note that there are a couple of issues with the diagnostics that I'm generating (see the TODO comments.)

The first issue is that I currently diagnose jumping to an asm label from a goto statement in ActOnGotoStmt, which means that we currently can't diagnose the case where we first see the goto statement, generate an implicit label from it, and later on see an asm block which defines a label with that name. The reason why the proper diagnostic is hard is that I don't know where the right place to perform it is. Note that we need access to the SourceLocation of the GotoStmt and also to the SourceLocation of the place where the label is declared in the asm block.

The second issue with the diagnostics is that as I'm generating the "use of undeclared label" error, I only have access to the LabelDecl and its location, which means that for example the diagnostic in t10 in test/Sema/ms-inline-asm.c is misplaced.

Any help with fixing the above two issues is appreciated.

Also, I have tested most of the combinations of goto's, jumps and asm/non-asm labels that I have thought of, but I may have missed something. If you see cases for which I have not included a test case, please point them out!

rnk edited edge metadata.Sep 17 2014, 1:10 PM

Nice, this completely eliminates the lookup difficulties in the previous approach.

include/clang/AST/Decl.h
312 ↗(On Diff #13793)

This will cause a memory leak for names > 16 chars. When the ASTContext is destroyed, it doesn't actually call destructors of individual AST nodes, it merely frees the memory. This should be a StringRef that points to memory from 'new (Context, 1) char[Len]', probably.

lib/Parse/ParseStmtAsm.cpp
99–101 ↗(On Diff #13793)

I think you can sink this into translateLocation if you make it take a SourceMgr and SMLoc.

lib/Sema/SemaStmt.cpp
2357–2358 ↗(On Diff #13793)

Does it help if you issue the same diagnostic when processing 'foo:' in this example?

void f() {
  goto foo;
  __asm {
  foo:
    nop
  }
}

After issuing the diagnostic, you can avoid the "undeclared label" knock-on errors by pretending that you did in fact define the label.

lib/Sema/SemaStmtAsm.cpp
522 ↗(On Diff #13793)

This should be a member variable of Sema, rather than a global.

534 ↗(On Diff #13793)

We should be able to use the 'L' assembler prefix to suppress the symbol table entry for the label. I'm happy as long as both 'nm' and 'dumpbin /symbols' agree that there are no MSASMLABEL symbols in the object file. I think it will with the code as written.

I'd also try to get the user-written label in there, so that the -S output is more intelligible, something like:

OS << "L__MSASMLABEL_" << counter++ << "__" << ExternalLabelName;

Users can't make labels that start with 'L', so we should be safe from collisions, we just need a unique counter to make sure we don't collide with ourselves.

test/Sema/ms-inline-asm.c
148 ↗(On Diff #13793)

Can you add a .cpp test case involving nested scopes, like:

void f() {
  __asm some_label:
  struct A {
    static void g() {
      __asm jmp some_label ; This should jump forwards
      __asm some_label:
      __asm nop
    }
  };
}
ehsan added inline comments.Sep 17 2014, 1:22 PM
include/clang/AST/Decl.h
312 ↗(On Diff #13793)

Oh, I didn't know that! What's this 'new (Contex, 1) char[Len]' trick? I'm not familiar with it, I don't think...

lib/Parse/ParseStmtAsm.cpp
99–101 ↗(On Diff #13793)

Yeah looks like it. Will do.

lib/Sema/SemaStmt.cpp
2357–2358 ↗(On Diff #13793)

I cannot issue the diagnostic there because I will not have a GotoStmt pointer when processing foo:. That's really the main issue.

What I would ideally do would be to not issue any diagnostics in ActOnGotoStmt, and wait until the scope is fully processed, then visit all of the GotoStmts in the scope, check their LabelDecls to see if they're asm labels and issue the diagnostic there. But I don't know where I would hook into in order to be able to visit all of the GotoStmts at the end of a scope.

lib/Sema/SemaStmtAsm.cpp
522 ↗(On Diff #13793)

Sure.

534 ↗(On Diff #13793)

I think you brought this issue up before. :) I am turning these names into local symbol names in http://reviews.llvm.org/D4587 by using getPrivateGlobalPrefix() when rewriting AOK_Label. I don't think that we need to store that prefix in the AST, do we?

Your suggestion to add the user-written label in the name is a good one, it just needs me to change all of these tests again! ;) But I'll do it in the next iteration.

On the issue of counter uniqueness, the current global counter obviously ensures that. If we have one Sema object per each compilation, then making it a Sema member won't change things, so we should be good there.

test/Sema/ms-inline-asm.c
145 ↗(On Diff #13793)

Any ideas how I can fix this? Or is the current misplaced diagnostic acceptable?

148 ↗(On Diff #13793)

Certainly!

rnk added inline comments.Sep 17 2014, 1:35 PM
include/clang/AST/Decl.h
312 ↗(On Diff #13793)

It's an operator new overload declared in ASTContext.h. Basically, it's equivalent to 'new char[Len]', except it goes calls the 'void* operator new(size_t Bytes, ASTContext &Context, size_t Alignment)' overload.

lib/Sema/SemaStmt.cpp
2357–2358 ↗(On Diff #13793)

Looks like we have JumpDiagnostics.cpp for this purpose... See if you can work it in there?

lib/Sema/SemaStmtAsm.cpp
534 ↗(On Diff #13793)

OK, doing this in LLVM MC seems great.

test/Sema/ms-inline-asm.c
145 ↗(On Diff #13793)

Moving to JumpDiagnostics.cpp will probably fix it.

ehsan added inline comments.Sep 17 2014, 5:28 PM
include/clang/AST/Decl.h
312 ↗(On Diff #13793)

Hmm, who is responsible for freeing this memory? The comments in ASTContext.h seem to suggest that I should call ASTContext::Deallocate on the pointer, but I'm not sure when I should do that if the destructor is not invoked. Looking at other parts of the code base, it seems like nobody worries about deleting memory allocated this way...

lib/Sema/SemaStmt.cpp
2357–2358 ↗(On Diff #13793)

Sure.

rnk added inline comments.Sep 17 2014, 5:42 PM
include/clang/AST/Decl.h
312 ↗(On Diff #13793)

When the ASTContext gets destroyed, it frees all memory allocated with it. It's basically a simple pool allocation scheme. The AST lives until compilation is over (i.e. forever). The clang driver even passes a -disable-free flag to prevent the -cc1 process from destroying the AST because even that is slower than just ending the process.

ehsan updated this revision to Diff 13816.Sep 17 2014, 7:19 PM
ehsan edited edge metadata.

I think this addresses all of the review comments...

ehsan added inline comments.Sep 17 2014, 7:21 PM
include/clang/AST/Decl.h
312 ↗(On Diff #13793)

Thanks! This makes perfect sense. Should I submit another patch to improve the documentation about this in ASTContext.h?

rnk accepted this revision.Sep 17 2014, 8:26 PM
rnk edited edge metadata.

lgtm

include/clang/Sema/ScopeInfo.h
323–331 ↗(On Diff #13816)

Is it enough to say setHasBranchProtectedScope() instead of setHasInlineMSAsmBlocks()? I would say that an MS inline asm blob is a "protected scope" and add it to the list of things above HasBranchProtectedScope.

This revision is now accepted and ready to land.Sep 17 2014, 8:26 PM
rnk added inline comments.Sep 17 2014, 8:27 PM
include/clang/AST/Decl.h
312 ↗(On Diff #13793)

Maybe, but I could swear it's in there somewhere, under Ownership.h or something.

ehsan updated this revision to Diff 13818.Sep 17 2014, 8:59 PM
ehsan edited edge metadata.

Addressed the review comment.

ehsan closed this revision.Sep 21 2014, 7:31 PM
ehsan updated this revision to Diff 13917.

Closed by commit rL218230 (authored by @ehsan).