This fixes PR20023. In order to implement this scoping rule, we maintain
a map from the label's name as it appears in the source code to an internal
name which is guaranteed to be an invalid mangled name. Every time that we
encounter a label, we add an entry to this internal map, and when we fail
to parse something as a valid identifier, we get a LookupInlineAsmLabel
callback from MC which allows us to optimistically interpret the identifier
as a label name. By the time that we pop the function's scope, we go
through this map, and for each entry verify that we have seen a label
somewhere in the function body, and emit a diagnostic otherwise. This has
the undesired side effect that for some unknown names, we may issue an
extra diagnostic (as can be seen in test/Sema/ms-inline-asm.c.
Details
Diff Detail
Event Timeline
include/clang/Sema/Sema.h | ||
---|---|---|
3134 | micro nit: & on the right. | |
lib/Parse/ParseStmtAsm.cpp | ||
308–312 | This seems like the responsibility of the assembly parser. IMO it should call back into clang when it encounters a label in inline assembly mode, and then clang can set up the mapping. | |
test/CodeGen/ms-inline-asm.c | ||
499–500 | Can you use the asm { ... } syntax for the test case? In the future, we need to lower consecutive asm statements into a single LLVM inline assembly blob, because right now LLVM can insert spills and restores between the two blobs. In other words, you can't rely on this today: __asm mov eax, x __asm test eax, eax ; eax may be different, but we usually have no reason to spill here If you use the __asm { } syntax, we won't have to change this test in the future. |
lib/Parse/ParseStmtAsm.cpp | ||
---|---|---|
308–312 | Actually I think I can make this work... Will upload a new patch later today. |
lib/Parse/ParseStmtAsm.cpp | ||
---|---|---|
96 | You can return a StringRef directly. | |
lib/Sema/SemaDecl.cpp | ||
1492 | I believe ActOnPopScope happens on any closing curly brace, so this won't do the "right" thing: void f() { bool failed; __asm retry: { failed = g(); } // The label is cleared here. if (failed) __asm jmp retry } Thinking more carefully, I don't think having the label map right on Sema works at all in the general case of nested functions. Consider this C++ example with inner classes: void f() { __asm some_label: struct A { static void g() { __asm jmp some_label ; This should jump forwards __asm some_label: __asm nop } }; } You can build similar test cases from C++11 lambdas and Obj-C blocks, and now OpenMP captured statements. We need to associate this label map with the function we're currently processing. Now that I'm considering the actual implementation difficulties, I'm going back to your "proposal #3" on PR20023, where we model these labels as regular label statements. That way, we reuse all the right name lookup machinery. We won't be able to 'goto' a label from an inline asm statement or 'jmp' to a non-asm label, so if you go this way, please diagnose both cases. You can probably add a bit to LabelDecl to track whether the label was from MS inline asm or regular source code. Apologies for the runaround, but I think it's probably better to rewrite along these lines. What do you think? |
Updated the patch to use the LabelDecl machinery.
Note that there are a couple of issues with the diagnostics that I'm generating (see the TODO comments.)
The first issue is that I currently diagnose jumping to an asm label from a goto statement in ActOnGotoStmt, which means that we currently can't diagnose the case where we first see the goto statement, generate an implicit label from it, and later on see an asm block which defines a label with that name. The reason why the proper diagnostic is hard is that I don't know where the right place to perform it is. Note that we need access to the SourceLocation of the GotoStmt and also to the SourceLocation of the place where the label is declared in the asm block.
The second issue with the diagnostics is that as I'm generating the "use of undeclared label" error, I only have access to the LabelDecl and its location, which means that for example the diagnostic in t10 in test/Sema/ms-inline-asm.c is misplaced.
Any help with fixing the above two issues is appreciated.
Also, I have tested most of the combinations of goto's, jumps and asm/non-asm labels that I have thought of, but I may have missed something. If you see cases for which I have not included a test case, please point them out!
Nice, this completely eliminates the lookup difficulties in the previous approach.
include/clang/AST/Decl.h | ||
---|---|---|
312 ↗ | (On Diff #13793) | This will cause a memory leak for names > 16 chars. When the ASTContext is destroyed, it doesn't actually call destructors of individual AST nodes, it merely frees the memory. This should be a StringRef that points to memory from 'new (Context, 1) char[Len]', probably. |
lib/Parse/ParseStmtAsm.cpp | ||
99–101 | I think you can sink this into translateLocation if you make it take a SourceMgr and SMLoc. | |
lib/Sema/SemaStmt.cpp | ||
2357–2358 ↗ | (On Diff #13793) | Does it help if you issue the same diagnostic when processing 'foo:' in this example? void f() { goto foo; __asm { foo: nop } } After issuing the diagnostic, you can avoid the "undeclared label" knock-on errors by pretending that you did in fact define the label. |
lib/Sema/SemaStmtAsm.cpp | ||
496 | This should be a member variable of Sema, rather than a global. | |
508 | We should be able to use the 'L' assembler prefix to suppress the symbol table entry for the label. I'm happy as long as both 'nm' and 'dumpbin /symbols' agree that there are no MSASMLABEL symbols in the object file. I think it will with the code as written. I'd also try to get the user-written label in there, so that the -S output is more intelligible, something like: OS << "L__MSASMLABEL_" << counter++ << "__" << ExternalLabelName; Users can't make labels that start with 'L', so we should be safe from collisions, we just need a unique counter to make sure we don't collide with ourselves. | |
test/Sema/ms-inline-asm.c | ||
133 | Can you add a .cpp test case involving nested scopes, like: void f() { __asm some_label: struct A { static void g() { __asm jmp some_label ; This should jump forwards __asm some_label: __asm nop } }; } |
include/clang/AST/Decl.h | ||
---|---|---|
312 ↗ | (On Diff #13793) | Oh, I didn't know that! What's this 'new (Contex, 1) char[Len]' trick? I'm not familiar with it, I don't think... |
lib/Parse/ParseStmtAsm.cpp | ||
99–101 | Yeah looks like it. Will do. | |
lib/Sema/SemaStmt.cpp | ||
2357–2358 ↗ | (On Diff #13793) | I cannot issue the diagnostic there because I will not have a GotoStmt pointer when processing foo:. That's really the main issue. What I would ideally do would be to not issue any diagnostics in ActOnGotoStmt, and wait until the scope is fully processed, then visit all of the GotoStmts in the scope, check their LabelDecls to see if they're asm labels and issue the diagnostic there. But I don't know where I would hook into in order to be able to visit all of the GotoStmts at the end of a scope. |
lib/Sema/SemaStmtAsm.cpp | ||
496 | Sure. | |
508 | I think you brought this issue up before. :) I am turning these names into local symbol names in http://reviews.llvm.org/D4587 by using getPrivateGlobalPrefix() when rewriting AOK_Label. I don't think that we need to store that prefix in the AST, do we? Your suggestion to add the user-written label in the name is a good one, it just needs me to change all of these tests again! ;) But I'll do it in the next iteration. On the issue of counter uniqueness, the current global counter obviously ensures that. If we have one Sema object per each compilation, then making it a Sema member won't change things, so we should be good there. | |
test/Sema/ms-inline-asm.c | ||
130 | Any ideas how I can fix this? Or is the current misplaced diagnostic acceptable? | |
133 | Certainly! |
include/clang/AST/Decl.h | ||
---|---|---|
312 ↗ | (On Diff #13793) | It's an operator new overload declared in ASTContext.h. Basically, it's equivalent to 'new char[Len]', except it goes calls the 'void* operator new(size_t Bytes, ASTContext &Context, size_t Alignment)' overload. |
lib/Sema/SemaStmt.cpp | ||
2357–2358 ↗ | (On Diff #13793) | Looks like we have JumpDiagnostics.cpp for this purpose... See if you can work it in there? |
lib/Sema/SemaStmtAsm.cpp | ||
508 | OK, doing this in LLVM MC seems great. | |
test/Sema/ms-inline-asm.c | ||
130 | Moving to JumpDiagnostics.cpp will probably fix it. |
include/clang/AST/Decl.h | ||
---|---|---|
312 ↗ | (On Diff #13793) | Hmm, who is responsible for freeing this memory? The comments in ASTContext.h seem to suggest that I should call ASTContext::Deallocate on the pointer, but I'm not sure when I should do that if the destructor is not invoked. Looking at other parts of the code base, it seems like nobody worries about deleting memory allocated this way... |
lib/Sema/SemaStmt.cpp | ||
2357–2358 ↗ | (On Diff #13793) | Sure. |
include/clang/AST/Decl.h | ||
---|---|---|
312 ↗ | (On Diff #13793) | When the ASTContext gets destroyed, it frees all memory allocated with it. It's basically a simple pool allocation scheme. The AST lives until compilation is over (i.e. forever). The clang driver even passes a -disable-free flag to prevent the -cc1 process from destroying the AST because even that is slower than just ending the process. |
include/clang/AST/Decl.h | ||
---|---|---|
312 ↗ | (On Diff #13793) | Thanks! This makes perfect sense. Should I submit another patch to improve the documentation about this in ASTContext.h? |
lgtm
include/clang/Sema/ScopeInfo.h | ||
---|---|---|
323–331 ↗ | (On Diff #13816) | Is it enough to say setHasBranchProtectedScope() instead of setHasInlineMSAsmBlocks()? I would say that an MS inline asm blob is a "protected scope" and add it to the list of things above HasBranchProtectedScope. |
include/clang/AST/Decl.h | ||
---|---|---|
312 ↗ | (On Diff #13793) | Maybe, but I could swear it's in there somewhere, under Ownership.h or something. |
micro nit: & on the right.