This is an archive of the discontinued LLVM Phabricator instance.

Add a test to distinguish between reserved tokens and normal identifiers
ClosedPublic

Authored by ygao on Mar 10 2014, 3:07 PM.

Details

Summary

Hi,
The -fms-extensions option affects a number of subtle front-end C/C++
behaviors, and it would be useful to be able to distinguish MS keywords
from regular identifiers in the ms-extensions mode even if the triple
does not define a Windows target. It should make life easier if anyone
needs to port their Windows codes to elsewhere.

  • Gao

Diff Detail

Event Timeline

This makes sense to me, but I'll defer to the folks working on MS support.

Personally, I think this is a dangerous idea.

This page lists some of the features that code can expect to see when _MSC_EXTENSIONS is enabled.

Most notably, "Out of Class Definition of static const Integral (or enum) Members" is permitted which our -fms-extensions should not enable.

This misfeature comes up a lot and _MSC_EXTENSIONS is often how people check for it.

ygao added a comment.Mar 10 2014, 7:43 PM

Hi David,
Thanks for reviewing the change. I might need some clarifications.
I tried the sample codes from the stackoverflow link (thanks for doing the
research!), and it appears that we do not implement this bad behavior from MSVC
even with "clang -cc1 -triple i686-pc-win32 -fms-extensions". So I deduce that
"clang -fms-extensions" will behave differently than "cl /Ze" on Windows.
On the other hand, this proposed patch is only making -fms-extensions behave
the same on Windows and non-Windows targets. Since _MSC_EXTENSIONS is currently
defined on Windows target, I guess programmers will already be surprised if
they are using this macro to test MSVC-specific behaviors?

  • Gao.

Consider the following:

struct S {
  static const int i = 42;
};

const int *use() { return &S::i; }

Compiling with clang -cc1 -x c++ -S -o - -emit-llvm -triple i686-linux-gnu does not produce a definition of i.

Compiling with clang -cc1 -x c++ -S -o - -emit-llvm -triple i686-pc-win32 produces i defined as @"\01?i@S@@2HB" = linkonce_odr constant i32 42, align 4.

This matches MSVC's behavior: ?i@S@@2HB

We won't define _MSC_EXTENSIONS unless we are targeting the MSVC ABI and enable the MSVC-specific keywords with -fms-extensions.

ygao added a comment.Mar 11 2014, 11:28 AM

Hi David,
Many thanks for the code sample! I can reproduce the behavior with clang and with Visual Studio.
I still have a question regarding the MSVC behavior. I used Visual Studio 2012 and when I change
the project property to disable language extensions (/Za), Visual Studio will leave the static const
member as undefined. This seems to indicate that if clang wants to match MSVC behavior, it
should only produce a definition at -fms-extensions mode rather than being on by default, unless
there is a way to disable this behavior. Maybe I am still missing something.

  • Gao

In clang, -fms-extensions controls the conforming extensions. This is not a conforming extension.

ygao added a comment.Mar 12 2014, 10:40 AM

Hi David,

To summarize our discussion:

  1. the _MSC_EXTENSION macro indicates compatibility with "cl.exe /Ze"
  2. Some of the behaviors of "cl.exe /Ze" is implemented for Windows target, but

not under control of -fms-extensions

  1. Turning on -fms-extensions for non-Windows target will get the MS keywords,

but not the ABI-specific behaviors expected of "cl.exe /Ze"

Since it is not feasible to define _MSC_EXTENSIONS for -fms-extensions, is
it acceptable to define a new macro, which the users can use to guard their
MS keywords? For example, _MS_EXTENSIONS.

  • Gao
I think it would be better to use `__has_extension` instead:

http://clang.llvm.org/docs/LanguageExtensions.html#has-feature-and-has-
extension

We're trying to arrange that wchar_t is always available, but
if the compiler predefines it then we can't. What
has_extension
test can we make that tells us when this name is predefined?
Skimming through the doc it didn't look like there was one for this.
Thanks,
--paulr

I don't think I understand the problem you are trying to solve.

If you want to determine whether or not MSVC keywords are enabled, I don't see why it wouldn't be unreasonable to add a msvc_keywords argument for __has_extension.

I don't think I understand the problem you are trying to solve.

We provide a header file to our licensees.
That header file declares wchar_t.
But if the licensee uses -fms-extensions, that causes clang to
predeclare
wchar_t and our declaration causes a compile-time error.
So, we need a way to conditionalize our declaration.

If you want to determine whether or not MSVC keywords are enabled, I

don't see why it wouldn't be unreasonable to add a msvc_keywords
argument for __has_extension.

If "MSVC keywords" includes "predeclared types" I guess that would work.
--paulr

http://llvm-reviews.chandlerc.com/D3034

ygao updated this revision to Unknown Object (????).Mar 12 2014, 6:58 PM

Hi David,
Thanks for the suggestions.
I am uploading a patch for implementing a msvc_keywords extension.

  • Gao

This change looks good to me, I'd like @rsmith's opinion on the spelling of msvc_keywords before you commit it.

If/when you do commit this, please make sure to commit this with a different message than the current one.

rsmith added inline comments.Mar 12 2014, 7:27 PM
lib/Lex/PPMacroExpansion.cpp
1043

I'm opposed to having something so vague and general. What happens when we add support for more MS keywords?

Please make the extension check be a test for a specific extension.

@rsmith, perhaps we want has_extension(keyword_<keyword_goes_here>) ?

Alternatively, perhaps __is_keyword or __is_identifier?

ygao updated this revision to Unknown Object (????).Mar 14 2014, 11:22 AM

Hi David, Richard,
The following patch implements a new __has_keyword() test.
There are several existing keyword tests in clang/test/Lexer directory that
can be rewritten to make use of this new test. I will do that in a follow-up
patch if this test looks acceptable,

  • Gao

The patch looks good.

What should we do about the iso646 alternative tokens (and, or, compl, ...)? These are technically not keywords (and this implementation of __has_keyword will return 0 for them), but they are also not usable as identifiers, and it would be useful for code to detect them (because in MSVC compat mode we need to treat them weirdly).

ygao added a comment.Mar 14 2014, 5:31 PM

I do not know. Are these alternative tokens pretty much set or are we expecting to add more in the future? Are
they recognized as keywords only under MSVC compat mode?
I guess we could have another has_operator_token() test, but I can be convinced to rename has_keyword()
to mean any (non-attribute) special tokens.
Also, this proposed implementation is not testing those objective-c "at" keywords (only a keyword if after an "@"
in objc mode). I am not sure if this is a problem.

ygao added a comment.Mar 14 2014, 9:02 PM

They're not keywords, they're weird ways of spelling things like && and |.
Under MSVC compat mode, we should not recognize them at all (MSVC doesn't do
so in its default mode, and appears to treat them as macros in its
"conforming" mode).

In that case, something like the following should return true if the operator
token is to be recognized. Right?

bool isSpecialOperatorToken(IdentifierInfo *II)
{

if (LangOpts.MSVCCompat)
  return false;
return II->isCPlusPlusOperatorKeyword();

}

You could instead provide an __is_identifier, which I think might be what
you're really looking for here ("is this token usable as an identifier?").

True, but I would prefer if it is called something else.
All the other has_***() methods return true if the given parameter is
something special, whereas it seems that
is_identifier() would return
false if something is special (hence not an identifier).
Maybe __has_special_token(parameter)?

ygao updated this revision to Unknown Object (????).Mar 24 2014, 6:21 PM

Hi,
I tried both a has_reserved_token() implementation and is_identifier(), and I realize
that the __is_identifier() test requires quite a bit less code. Uploaded is the new patch,

  • Gao
ygao added a comment.Mar 31 2014, 12:06 PM

A gentle ping.

rsmith accepted this revision.Apr 10 2014, 6:15 PM

Looks good. Please also provide a documentation update for this feature.

ygao closed this revision.May 6 2014, 7:08 PM

Updated the clang manual in r206099. Closing this code review.