This is an archive of the discontinued LLVM Phabricator instance.

[MSVC] Add support for pragma function
ClosedPublic

Authored by steplong on Apr 29 2022, 2:55 PM.

Details

Summary

MSVC pragma function tells the compiler to generate calls to functions in the pragma function list, instead of using the builtin. Needs https://reviews.llvm.org/D124701

https://docs.microsoft.com/en-us/cpp/preprocessor/function-c-cpp?view=msvc-170

Diff Detail

Event Timeline

steplong created this revision.Apr 29 2022, 2:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2022, 2:55 PM
steplong requested review of this revision.Apr 29 2022, 2:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2022, 2:55 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
hans added a comment.May 2 2022, 5:23 AM

From the MS docs:

Once a function pragma is seen, it takes effect at the first function definition that contains a specified intrinsic function. The effect continues to the end of the source file, or to the appearance of an intrinsic pragma specifying the same intrinsic function. You can only use the function pragma outside of a function, at the global level.

Should we try to handle the interaction between pragma intrinsic and pragma function, i.e. that the former "undoes" the latter? And should we error/warn if the pragma occurs not in namespace scope?

clang/include/clang/Sema/Sema.h
10319

Pass by const-ref maybe?

From the MS docs:

Once a function pragma is seen, it takes effect at the first function definition that contains a specified intrinsic function. The effect continues to the end of the source file, or to the appearance of an intrinsic pragma specifying the same intrinsic function. You can only use the function pragma outside of a function, at the global level.

Should we try to handle the interaction between pragma intrinsic and pragma function, i.e. that the former "undoes" the latter? And should we error/warn if the pragma occurs not in namespace scope?

Oh I wasn't sure how to check if the function pragma was outside of a function. Yup, we can try to handle pragma intrinsic

steplong updated this revision to Diff 426433.May 2 2022, 9:38 AM
steplong edited the summary of this revision. (Show Details)
steplong updated this revision to Diff 426702.May 3 2022, 7:30 AM
steplong updated this revision to Diff 426802.May 3 2022, 12:24 PM
steplong updated this revision to Diff 426998.May 4 2022, 7:15 AM

Removed unnecessary braces (mentioned in the LLVM coding standard)

hans added a comment.May 4 2022, 7:48 AM

And should we error/warn if the pragma occurs not in namespace scope?

Oh I wasn't sure how to check if the function pragma was outside of a function.

In the ActOn.. methods I think you can check what the CurContext is.

clang/lib/Parse/ParsePragma.cpp
3542

SmallVector would be more common LLVM style, I think. The ActOn.. methods could take the argument as a const-ref to SmallVectorImpl<StringRef>. This applies to the function pragma below as well.

3561

since the above is just a warning, we should probably still call the ActOn.. method?

3605

same as above: since it's just a warning, maybe we still want to call the ActOn.. method?

clang/lib/Sema/SemaAttr.cpp
1079

Do we want to avoid duplicates in MSFunctionNoBuiltins? Or maybe it doesn't matter?

steplong updated this revision to Diff 427040.May 4 2022, 9:23 AM

Changed std::vector to llvm::SmallVector

steplong added inline comments.May 4 2022, 9:27 AM
clang/lib/Parse/ParsePragma.cpp
3561

Hmm, I'm not sure because all the msgs above are also warnings (diag::warn_)

clang/lib/Sema/SemaAttr.cpp
1079

Yea, I didn't think it really mattered. I originally wanted to use a set, but I needed the strings to be stored in contiguous memory for NoBuitinAttr::CreateImplicit() in Sema::AddRangeBasedNoBuiltin()

steplong updated this revision to Diff 427127.May 4 2022, 1:52 PM

Error if pragma function/intrinsic is used inside a function

hans added a comment.May 6 2022, 5:54 AM

It needs tests for the warnings about badly formed pragmas, and for pragmas outside file/namespace scope.

clang/lib/Parse/ParsePragma.cpp
3561

Yes, but for those, (missing left paren, or unknown intrinsic) it makes sense to do nothing. If it's just the right paren missing, I don't think it makes sense to ignore the intrinsic? Can you check how we handle similar situations with other intrinsics?

clang/lib/Sema/SemaAttr.cpp
1079

A set would be nicer conceptually of course.

How about using an llvm::SmallSetVector. That will solve the problem of duplicates, it will be deterministic, and you can use getArrayRef() to get the values in contiguous memory, or maybe being() and end() will work too.

It needs tests for the warnings about badly formed pragmas, and for pragmas outside file/namespace scope.

Appreciate the comments. I have a couple of questions:

Is the tests for badly formed pragmas, clang/test/Preprocessor/pragma_microsoft.c?
It looks like MSVC only cares about it not being in a function like mentioned in the comments in D125011. Do we try to mimic MSVC behavior because MSVC rejects #pragma function(memset whichout the right paren?
MSVC also rejects #pragma function(foo) if foo isn't extern C or declared. We probably to match that behavior even if it isn't documented?

steplong added inline comments.May 6 2022, 12:06 PM
clang/lib/Sema/SemaAttr.cpp
1079

Hmm do you know where I can find an Allocator? getArrayRef().data() returns a pointer to a constant StringRef, but CreateImplicit takes a pointer to a StringRef.

steplong updated this revision to Diff 427716.May 6 2022, 12:49 PM
steplong added a reviewer: aaron.ballman.

Changed it to use SmallSetVector instead of SmallVector. Had to use an temporary SmallVector in AddRangeBasedNoBuiltin() because I'm not sure how to get a pointer to StringRef. getArrayRef().copy().data() needs an Allocator.

Also, I can't figure out why the test pragma-ms-function-intrinsic.c is failing from the pragma not being in file context. The pragmas are not inside a function.

Will add Sema and parsing tests in the next update to this patch.

aaron.ballman added inline comments.May 9 2022, 7:57 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
988 ↗(On Diff #427716)
clang/include/clang/Sema/Sema.h
758
10320–10325

This makes it easier for callers who have a SmallVector of a different extent but the correct type.

clang/lib/Parse/ParsePragma.cpp
3569

Given that this patch is about adding pragma function, I think it'd be better to split the changes to the intrinsic pragma out into their own patch set to keep a clear separation.

3572–3574

Formatting looks off here -- you should run your patch through clang-format (https://clang.llvm.org/docs/ClangFormat.html#script-for-patch-reformatting).

3578

Can we use ExpectAndConsume() here instead?

3602–3607

Another ExpectAndConsume()?

clang/lib/Sema/SemaAttr.cpp
1070

What is CurContext when this gets called for your .c test file? I would have expected it to be the TranslationUnitDecl which should be a file context (getting the redecl context shouldn't do anything in the cases I've seen).

steplong added inline comments.May 10 2022, 6:46 AM
clang/lib/Parse/ParsePragma.cpp
3578

I don't think we can use ExpectAndConsume() here because it's not a Parser

clang/lib/Sema/SemaAttr.cpp
1070

It looks like it's a FunctionDecl

aaron.ballman added inline comments.May 10 2022, 7:26 AM
clang/lib/Parse/ParsePragma.cpp
3578

Oh good point! Sorry for the noise.

clang/lib/Sema/SemaAttr.cpp
1070

Wha? That seems rather surprising -- which function does it think the pragma is scoped to?

You might consider putting breakpoints in PushDeclContext() and PopDeclContext() to see what's going on (or search for other places where we assign to CurContext and break on those).

steplong updated this revision to Diff 428375.May 10 2022, 7:37 AM
steplong edited the summary of this revision. (Show Details)
  • Clang-formatted patch
  • Split out the pragma intrinsic stuff
  • Replaced SmallSetVector in pragma handler with SmallVector. MSNoBuiltins is still a SmallSetVector in Sema.
  • Commented out if not file scope block in ActOn.. method. Will investigate next
  • Added pragma warnings test
steplong added inline comments.May 10 2022, 8:14 AM
clang/lib/Sema/SemaAttr.cpp
1070

This is what I see when I run it on pragma-ms-function.c:

PUSHING TranslationUnit
PUSHING Function foo1
FOUND PRAGMA FUNCTION
POPPING Function foo1
PUSHING TranslationUnit
PUSHING Function foo2
FOUND PRAGMA FUNCTION
POPPING Function foo2
PUSHING TranslationUnit
PUSHING Function foo3
POPPING Function foo3
PUSHING TranslationUnit

I'm logging the swap in PopDeclContext() with POPPING and PUSHING and the push in PushDeclContext() with just PUSHING.
I'm also logging FOUND PRAGMA in the pragma handler

aaron.ballman added inline comments.May 10 2022, 8:34 AM
clang/lib/Sema/SemaAttr.cpp
1070

Huh. For fun, can you try changing the test to:

void foo1(char *s, char *d, size_t n) {
  bar(s);
  memset(s, 0, n);
  memcpy(d, s, n);
}

int i; // Now there's a declaration here

#pragma function(strlen, memset)

and see if you get different results? I'm wondering if what's happening is that the CurContext is being updated after we've lexed the next token from the preprocessor, which means we're still in the context of foo1 until after we've processed the pragma due to it being a preprocessing token. It still wouldn't make much sense to me, because I think we should hit that on the } for foo1(), but it's a shot in the dark.

steplong added inline comments.May 10 2022, 9:09 AM
clang/lib/Sema/SemaAttr.cpp
1070

It looks like you're right. The int i makes the pragma show up after the TranslationUnit is pushed.

PUSHING TranslationUnit
PUSHING Function foo1
POPPING Function foo1
PUSHING TranslationUnit
FOUND PRAGMA FUNCTION
PUSHING Function foo2
POPPING Function foo2
PUSHING TranslationUnit
FOUND PRAGMA FUNCTION
PUSHING Function foo3
POPPING Function foo3
PUSHING TranslationUnit

I really appreciate the review and you helping me debug this

aaron.ballman added inline comments.May 10 2022, 9:45 AM
clang/lib/Sema/SemaAttr.cpp
1070

Ugggghhhhh I was hoping I was wrong because I think the solution requires some experimentation that may or may not pan out.

The way the pragma is currently being handled is: we parse everything at preprocessing time (PragmaMSFunctionHandler::HandlePragma()) and then call directly into Sema (Actions.ActOnPragmaMSFunction()).

The way I think we may have to handle the pragma is: identify the pragma at preprocessing time, emit an annotated token stream, parse that token stream from the parser, and then call Sema from there. e.g., more like:

https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParsePragma.cpp#L2698
https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/Parser.cpp#L830
https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParsePragma.cpp#L894 (you would call ActOnPragmaMSFunction() from here)

This way, the appropriate parsing contexts are all updated (hopefully)!

I really appreciate the review and you helping me debug this

I'm very happy to help!

steplong updated this revision to Diff 428456.May 10 2022, 11:56 AM
  • Changed pragma-ms-functions.c to only check that the no-builtin attributes are added to the functions
  • Moved pragma handler to Parser
  • Changed scope check to only file scope. MSVC accepts stuff like that we don't:
struct S {
  int a;
#pragma function(memset)
};
aaron.ballman added inline comments.May 11 2022, 4:19 AM
clang/lib/Sema/SemaAttr.cpp
1070

Now that we solved the mystery of the context -- do we need getRedeclContext() here?

1103

I think we should rename this function to be more specific to the pragma. How about AddImplicitMSFunctionNoBuiltinAttr() -- it's a mouthful, but it's at least a descriptive mouthful that won't be called too often.

clang/lib/Sema/SemaDecl.cpp
10224–10226

I'd be delighted if a follow-up NFC commit changed this name as well -- "RangeBased" doesn't tell me much of anything in these identifiers (it leaves me wondering why there's no range passed in to the function).

clang/test/Preprocessor/pragma_microsoft.c
210–211 ↗(On Diff #428456)

It's a bit unfortunate that these don't suggest that an identifier is what's missing given that empty parens isn't a particularly useful pragma, however, I don't insist on a change either.

220 ↗(On Diff #428456)

It'd be pretty reasonable to add the struct test here as well with a comment that Microsoft accepts it but we reject it on purpose with an appeal to the MS docs that say it must appear at file scope.

steplong updated this revision to Diff 428657.May 11 2022, 7:19 AM
  • Renamed AddRangeBasedNoBuiltin to AddImplicitMSFunctionNoBuiltinAttr
  • Renamed err_pragma_intrinsic_function_scope to err_pragma_expected_file_scope
  • Added comment about MSVC accepting struct test
  • Removed ->getRedeclContext(). Seems to be ok (i.e. tests are still passing)
aaron.ballman accepted this revision.May 11 2022, 8:32 AM

Thanks for all the hard work on this, LGTM! Can you please add a release note to clang/docs/ReleaseNotes.rst for the new functionality (probably best under the New Pragmas in Clang heading) when you land it?

This revision is now accepted and ready to land.May 11 2022, 8:32 AM
steplong added inline comments.May 11 2022, 12:05 PM
clang/test/Preprocessor/pragma_microsoft.c
210 ↗(On Diff #428657)

Hmm does it make sense for this to be a warning and not an error?

aaron.ballman added inline comments.May 12 2022, 6:32 AM
clang/test/Preprocessor/pragma_microsoft.c
210 ↗(On Diff #428657)

I'm glad I'm not the only one who questioned this. :-D I tend to be of the opinion that syntax errors should be errors rather than warnings. However, this is using an existing warning that's used by a bunch of other pragmas so I think it's defensible to use it here, especially given that this pragma form is commonly used for compiler extensions (so who knows, the syntax could be valid for another compiler, I guess).

aaron.ballman added inline comments.May 12 2022, 6:43 AM
clang/lib/Sema/SemaAttr.cpp
1070

It looks like we need getRedeclContext() after all, consider:

#include <cstring>

extern "C" {
void foo();
#pragma function(memset)
}
steplong updated this revision to Diff 428941.May 12 2022, 7:39 AM
  • Added test case for the getRedeclContext() case
  • Brought back getRedeclContext()
  • Clang-formatted
  • Added description in docs
aaron.ballman accepted this revision.May 12 2022, 8:11 AM

LGTM, thanks for all the work here!

clang/docs/ReleaseNotes.rst
272 ↗(On Diff #428941)
steplong updated this revision to Diff 428981.May 12 2022, 9:46 AM
  • Updated line in docs
This revision was automatically updated to reflect the committed changes.