This patch implements the microsoft segment pragmas. As posted it works up to testing but it needs a little polish and some lit tests. Particularly I'm pretty sure that the way I assign the code_seg attribute in CheckFunctionDeclaration is wrong. Another thing to note about this patch is that it introduces a unified pragma lexer for microsoft pragmas. This lexer basically just snarfs up some tokens, adds a marker token and passes them all onto the parser, which actually parses the tokens. The idea is that all past and future ms pragmas would migrate to this pattern.
Diff Detail
Event Timeline
Also, to note, we can't actually emit appropriate section properties. See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-March/071058.html . Either way, I still need to add diagnostics for when the section type requested is not fulfillable.
Ping.
Also, I looked at all of the Microsoft pragams and 6 of them use a labeled stack data_seg, bss_seg, const_seg, code_seg, pack and conform. If we choose to template MSSegmentStack, we can reasonably pull in pack (a value of 0 isn't a ligitimate argument and can be used, like nullptr, to indicate that we're pushing or poping w/o changing the setting). Putting in conform is much trickier because conform takes multiple extra arguments. There is significantly more error checking code in the pragma pack handler than in the ms one (for example #pragma_bss(pop) with no stack or after a reset is a no-op not an error). Fusing the logic for stack with the others would either involve a bunch of special-casing or we'd end up having much stricter rules about how some of these pragmas work than msvc does.
Pragmas float_control, managed, unmanaged, strict_gs_check, warning and vtordisp use an unlabeled stack. the first 4 push booleans, as does vtordisp, although in a slightly different way. warning pushes a small integer.
include/clang/Sema/Sema.h | ||
---|---|---|
307–308 | I can't guess what these three fields are. Maybe this should be a struct? | |
313–316 | We should serialize/deserialize these if they occur in a PCH (but we shouldn't do so if they're in a module). I don't think anyone is particularly likely to care about this, but please at least add a FIXME. | |
lib/Parse/ParseDeclCXX.cpp | ||
2618–2621 | Do you really need to check for this in so many places? (Do none of them call the other ones?) ;( | |
lib/Parse/ParsePragma.cpp | ||
1313–1323 | Dropping the eod delimiter from the stream here looks questionable. You have no way to distinguish between #pragma code_seg(blah) and #pragma code_seg (blah) Error recovery might also do strange things if the tokens from the pragma are in the normal token stream, since the rest of the parser isn't aware that annot_pragma_ms_pragma opens some kind of token scope. Maybe you could set the list of tokens as the annotation value on the annot_pragma_ms_pragma token instead? | |
lib/Sema/SemaDecl.cpp | ||
7498 | Commented-out code? |
This update constitutes a significant re-write of the previous version of the patch.
The unified MS-pragma handler has been updated to consume the token stream for the pragma up to eod and attach it to the annotation token, capped with an eof. The parser inserts the tokens back into the stream and parses them.
PragmaStack has been templated for future unification with the stack used by pragma pack.
include/clang/Sema/Sema.h | ||
---|---|---|
7071–7094 | IMO we should only have one action here, and make the attribute kind a parameter. We'll need to add one more for init_seg anyway. | |
lib/Parse/ParsePragma.cpp | ||
441 | Rather than making a new enum, how about using MSSegmentAttr::CodeSeg, etc? We should be able to include clang/AST/Attr.h from here. | |
449 | gotos aren't common in LLVM, and there aren't any ALL_CAPS names, even for macros or enums. I'd rather have a static helper like eatPastEOFToken() and do 'return eatPastEOFToken(PP)' for all of these gotos. Nevermind that we're returning void. :) | |
502–509 | Once we use MSSegmentAttr::Kind, we collapse these Sema actions down to ActOnPragmaMSSegment() and make the kind a parameter. |
lib/Parse/ParsePragma.cpp | ||
---|---|---|
438 | It's strange to use PP.Lex here -- in the parser you should generally be using ConsumeToken instead. If you're trying to avoid updating PrevTokLocation (which seems like a good idea, since pragma tokens aren't tokens in the usual sense), you're not doing enough, because BalancedDelimiterTracker and ParseStringLiteralExpression will update it. Maybe save and restore it on entry to/exit from this function, and use ConsumeToken here? | |
485 | Don't call this unless isTokenStringLiteral(), or it'll assert. | |
487 | In this case, a diagnostic will already have been produced. (Generally speaking, when a function returns you an invalid *Result or a function returning bool on error returns true, that indicates that it has already produced a diagnostic.) | |
490–493 | This would fit better in the ActOn... functions. Generally, we try to avoid the parser inspecting AST nodes. |
A major overhaul. The patch now considers attribute((section())), declspec(allocate()) and #pragma section. I've also added gcc style diagnostics for section conflicts (a behavior clang lacked for attribute__((section()))). I've added test cases for both Sema and CodeGen. The CodeGen test case reuses an old file-name and moves its contents to a more appropriately named test.
Note: There is one known outstanding issue with the patch: In C mode, due to tentative declarations, we don't actually process uninitialized variables until the end of the TU, which means they get stuck in whatever section the last #pragma bss_seg() dictates, rather than the one that was active when they were defined.
Addresses Aaron's comments. Also rebased and incorporates init_seg using the unified MS pragma handling machinery.
Specific comments:
- added documentation for the __declspec(allocate) attribute
- warnings were added and expanded upon, but could not be fused
- XXXXSeg renamed to Segment
- Fixed style issues around &and *
- SectionInfo will be read during codegen (we can make it private and add an interface if we really need to, noone's going to access it for anything they don't need...)
- Added a significant number of parser tests, found a bug and fixed that
- Added a custom diagnostic for shared, nopage, nocache, discard and remove
- Fixed things that would break MSVC2012
- Other random small fixes requested.
Specific Replies:
- We're using eof instead of eod for our pragmas due to the way that certain parser helpers work. Things like parsestringliteral can walk through an eod but not an eof. Richard Smith suggested this.
- After talking to a couple people here about [&] for the lambdas (I'm new to lambdas and am looking for guidance) there was some consensus that these lambdas are both so short and used so locally that being more explicit about what exactly they capture is not particularly more clear and adds code.
- The Add followed by Remove process (rather than making the add conditional based on it not being immediately removed) significantly simplified the implantation logic by allowing Unify to always look at the attribute. Given that this particular piece of code is colder than cold (someone is mis-using an extremely rare and specific pragma) I opted for code size/simplicity over performance.
Looks good, feel free to commit with some tweaks.
include/clang/Basic/DiagnosticParseKinds.td | ||
---|---|---|
775 | I think you omitted "in" before '#pragma' | |
include/clang/Sema/Sema.h | ||
277 | Looks like these are used as flags. Can you give the enumerators explicit hex values to make that clear? I thought "Val & PSK_Push" was a bug. | |
7073 | I'm not sure what the S in PSK stands for. Maybe PMK for PragmaMsKind? | |
7092 | This can be SectionInfo() = default so the default ctor is still trivial. | |
lib/Parse/ParsePragma.cpp | ||
563 | indentation | |
lib/Sema/SemaAttr.cpp | ||
328–342 | This function looks like it has bugs, but it's dead and should be deleted. | |
lib/Sema/SemaDecl.cpp | ||
7065 | indentation | |
8958 | indentation |
I think you omitted "in" before '#pragma'