This is an archive of the discontinued LLVM Phabricator instance.

Implementation of #pragma (data|code|const|bss)_seg
ClosedPublic

Authored by whunt on Mar 12 2014, 6:07 PM.

Details

Summary

This patch implements the microsoft segment pragmas. As posted it works up to testing but it needs a little polish and some lit tests. Particularly I'm pretty sure that the way I assign the code_seg attribute in CheckFunctionDeclaration is wrong. Another thing to note about this patch is that it introduces a unified pragma lexer for microsoft pragmas. This lexer basically just snarfs up some tokens, adds a marker token and passes them all onto the parser, which actually parses the tokens. The idea is that all past and future ms pragmas would migrate to this pattern.

Diff Detail

Event Timeline

whunt added a comment.Mar 12 2014, 6:08 PM

Also, to note, we can't actually emit appropriate section properties. See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-March/071058.html . Either way, I still need to add diagnostics for when the section type requested is not fulfillable.

whunt added a comment.Mar 14 2014, 2:15 PM

Ping.

Also, I looked at all of the Microsoft pragams and 6 of them use a labeled stack data_seg, bss_seg, const_seg, code_seg, pack and conform. If we choose to template MSSegmentStack, we can reasonably pull in pack (a value of 0 isn't a ligitimate argument and can be used, like nullptr, to indicate that we're pushing or poping w/o changing the setting). Putting in conform is much trickier because conform takes multiple extra arguments. There is significantly more error checking code in the pragma pack handler than in the ms one (for example #pragma_bss(pop) with no stack or after a reset is a no-op not an error). Fusing the logic for stack with the others would either involve a bunch of special-casing or we'd end up having much stricter rules about how some of these pragmas work than msvc does.

Pragmas float_control, managed, unmanaged, strict_gs_check, warning and vtordisp use an unlabeled stack. the first 4 push booleans, as does vtordisp, although in a slightly different way. warning pushes a small integer.

rsmith added inline comments.Mar 17 2014, 7:24 PM
include/clang/Sema/Sema.h
307–308

I can't guess what these three fields are. Maybe this should be a struct?

313–316

We should serialize/deserialize these if they occur in a PCH (but we shouldn't do so if they're in a module). I don't think anyone is particularly likely to care about this, but please at least add a FIXME.

lib/Parse/ParseDeclCXX.cpp
2618–2621

Do you really need to check for this in so many places? (Do none of them call the other ones?) ;(

lib/Parse/ParsePragma.cpp
1313–1323

Dropping the eod delimiter from the stream here looks questionable. You have no way to distinguish between

#pragma code_seg(blah)

and

#pragma code_seg
(blah)

Error recovery might also do strange things if the tokens from the pragma are in the normal token stream, since the rest of the parser isn't aware that annot_pragma_ms_pragma opens some kind of token scope. Maybe you could set the list of tokens as the annotation value on the annot_pragma_ms_pragma token instead?

lib/Sema/SemaDecl.cpp
7498

Commented-out code?

whunt added inline comments.Mar 25 2014, 4:10 PM
include/clang/Sema/Sema.h
307–308

Done.

313–316

FIXME added.

lib/Parse/ParseDeclCXX.cpp
2618–2621

Other patches that add pragmas touch each of these places.

lib/Parse/ParsePragma.cpp
1313–1323

Fixed.

lib/Sema/SemaDecl.cpp
7498

Moved to ActOnFunctionDeclarator to get the definitionness.

whunt updated this revision to Unknown Object (????).Mar 25 2014, 4:26 PM

This update constitutes a significant re-write of the previous version of the patch.
The unified MS-pragma handler has been updated to consume the token stream for the pragma up to eod and attach it to the annotation token, capped with an eof. The parser inserts the tokens back into the stream and parses them.
PragmaStack has been templated for future unification with the stack used by pragma pack.

rnk added inline comments.Mar 26 2014, 4:15 PM
include/clang/Sema/Sema.h
7071–7094

IMO we should only have one action here, and make the attribute kind a parameter. We'll need to add one more for init_seg anyway.

lib/Parse/ParsePragma.cpp
441

Rather than making a new enum, how about using MSSegmentAttr::CodeSeg, etc? We should be able to include clang/AST/Attr.h from here.

449

gotos aren't common in LLVM, and there aren't any ALL_CAPS names, even for macros or enums.

I'd rather have a static helper like eatPastEOFToken() and do 'return eatPastEOFToken(PP)' for all of these gotos. Nevermind that we're returning void. :)

502–509

Once we use MSSegmentAttr::Kind, we collapse these Sema actions down to ActOnPragmaMSSegment() and make the kind a parameter.

rsmith added inline comments.Mar 26 2014, 6:00 PM
lib/Parse/ParsePragma.cpp
438

It's strange to use PP.Lex here -- in the parser you should generally be using ConsumeToken instead. If you're trying to avoid updating PrevTokLocation (which seems like a good idea, since pragma tokens aren't tokens in the usual sense), you're not doing enough, because BalancedDelimiterTracker and ParseStringLiteralExpression will update it. Maybe save and restore it on entry to/exit from this function, and use ConsumeToken here?

485

Don't call this unless isTokenStringLiteral(), or it'll assert.

487

In this case, a diagnostic will already have been produced. (Generally speaking, when a function returns you an invalid *Result or a function returning bool on error returns true, that indicates that it has already produced a diagnostic.)

490–493

This would fit better in the ActOn... functions. Generally, we try to avoid the parser inspecting AST nodes.

whunt updated this revision to Unknown Object (????).Apr 2 2014, 6:21 PM

A major overhaul. The patch now considers attribute((section())), declspec(allocate()) and #pragma section. I've also added gcc style diagnostics for section conflicts (a behavior clang lacked for attribute__((section()))). I've added test cases for both Sema and CodeGen. The CodeGen test case reuses an old file-name and moves its contents to a more appropriately named test.

Note: There is one known outstanding issue with the patch: In C mode, due to tentative declarations, we don't actually process uninitialized variables until the end of the TU, which means they get stuck in whatever section the last #pragma bss_seg() dictates, rather than the one that was active when they were defined.

whunt updated this revision to Unknown Object (????).Apr 7 2014, 6:25 PM

Addresses Aaron's comments. Also rebased and incorporates init_seg using the unified MS pragma handling machinery.
Specific comments:

  • added documentation for the __declspec(allocate) attribute
  • warnings were added and expanded upon, but could not be fused
  • XXXXSeg renamed to Segment
  • Fixed style issues around &and *
  • SectionInfo will be read during codegen (we can make it private and add an interface if we really need to, noone's going to access it for anything they don't need...)
  • Added a significant number of parser tests, found a bug and fixed that
  • Added a custom diagnostic for shared, nopage, nocache, discard and remove
  • Fixed things that would break MSVC2012
  • Other random small fixes requested.

Specific Replies:

  • We're using eof instead of eod for our pragmas due to the way that certain parser helpers work. Things like parsestringliteral can walk through an eod but not an eof. Richard Smith suggested this.
  • After talking to a couple people here about [&] for the lambdas (I'm new to lambdas and am looking for guidance) there was some consensus that these lambdas are both so short and used so locally that being more explicit about what exactly they capture is not particularly more clear and adds code.
  • The Add followed by Remove process (rather than making the add conditional based on it not being immediately removed) significantly simplified the implantation logic by allowing Unify to always look at the attribute. Given that this particular piece of code is colder than cold (someone is mis-using an extremely rare and specific pragma) I opted for code size/simplicity over performance.
rnk accepted this revision.Apr 8 2014, 1:00 PM

Looks good, feel free to commit with some tweaks.

include/clang/Basic/DiagnosticParseKinds.td
775

I think you omitted "in" before '#pragma'

include/clang/Sema/Sema.h
277

Looks like these are used as flags. Can you give the enumerators explicit hex values to make that clear? I thought "Val & PSK_Push" was a bug.

7073

I'm not sure what the S in PSK stands for. Maybe PMK for PragmaMsKind?

7092

This can be SectionInfo() = default so the default ctor is still trivial.

lib/Parse/ParsePragma.cpp
563

indentation

lib/Sema/SemaAttr.cpp
328–342

This function looks like it has bugs, but it's dead and should be deleted.

lib/Sema/SemaDecl.cpp
7065

indentation

8958

indentation

aaron.ballman resigned from this revision.Oct 13 2015, 5:58 AM
aaron.ballman removed a reviewer: aaron.ballman.
Eugene.Zelenko closed this revision.Oct 3 2016, 5:30 PM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in rL205810.