This is an archive of the discontinued LLVM Phabricator instance.

__builtin_FILE_NAME()
ClosedPublic

Authored by karapsinie on Feb 27 2023, 8:00 AM.

Details

Summary

My commit adds "__builtin_FILE_NAME()" to clang, which insert only the filename because the full path is not always needed.

Diff Detail

Event Timeline

karapsinie created this revision.Feb 27 2023, 8:00 AM
Herald added a project: Restricted Project. · View Herald Transcript
karapsinie requested review of this revision.Feb 27 2023, 8:00 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 27 2023, 8:00 AM

Execute "arc diff git merge-base HEAD origin --update D144878"

shafik added inline comments.Feb 27 2023, 10:44 AM
clang/lib/AST/Expr.cpp
2291

It looks like a copy of the code from ExpandBuiltinMacro since we are already calling processPathForFileMacro from that same file why not factor out this code? If we can avoid code duplication we should to prevent possible future refactors only updating one copy of the code.

It'd be nice to have compiler feature parity. I created a GCC feature request for __builtin_FILE_NAME: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108978

karapsinie added a comment.EditedMar 1 2023, 3:28 AM

I tried rebase revision, but edit other revision (https://reviews.llvm.org/D145040).
We need to rebase the revision, because the tests are falling, because there is no this commit: https://reviews.llvm.org/rG24d144571dbffc6993d13fb7ca781248eed024de

karapsinie updated this revision to Diff 501451.Mar 1 2023, 3:34 AM

Execute "git-clang-format HEAD~1"

karapsinie updated this revision to Diff 501484.Mar 1 2023, 6:06 AM

Duplicate code moved to "Preprocessor::processPathToFilename"

karapsinie updated this revision to Diff 501486.Mar 1 2023, 6:10 AM

Execute "git-clang-format HEAD~1"

karapsinie marked an inline comment as done.Mar 1 2023, 9:23 AM

If all is well, approve and merge the commit.

PTAL.

Have you seen the comments on the GCC issue that @MaskRay filed? Is that something we should do as well? (It doesn't have to be part of this patch, but it'd be good to ensure we're collaborating with GCC so we get the same builtin functionality in this space.)

There should be a release note and documentation for the new functionality. One thing to consider for the docs is explaining what's going on with text encodings (and perhaps that text applies to this entire class of file-related builtins). e.g., if the system code page is Shift-JIS does this builtin return the text in Shift-JIS or UTF-8 or something else?

clang/unittests/AST/ASTImporterTest.cpp
1553

You should back out the spurious whitespace changes (I'd recommend setting your editor to not discard trailing whitespace on save as that's a common culprit for this).

karapsinie updated this revision to Diff 503696.Mar 9 2023, 1:46 AM

Removed the spurious whitespace changes.

karapsinie marked an inline comment as done.Mar 9 2023, 11:51 PM

PTAL.

Have you seen the comments on the GCC issue that @MaskRay filed? Is that something we should do as well? (It doesn't have to be part of this patch, but it'd be good to ensure we're collaborating with GCC so we get the same builtin functionality in this space.)

There should be a release note and documentation for the new functionality. One thing to consider for the docs is explaining what's going on with text encodings (and perhaps that text applies to this entire class of file-related builtins). e.g., if the system code page is Shift-JIS does this builtin return the text in Shift-JIS or UTF-8 or something else?

This is my first open-source commit.
I don't know where or what to write.
Maybe you will do it or tell me how to do it?

clang/unittests/AST/ASTImporterTest.cpp
1553

Ok.

PTAL.

Have you seen the comments on the GCC issue that @MaskRay filed? Is that something we should do as well? (It doesn't have to be part of this patch, but it'd be good to ensure we're collaborating with GCC so we get the same builtin functionality in this space.)

There should be a release note and documentation for the new functionality. One thing to consider for the docs is explaining what's going on with text encodings (and perhaps that text applies to this entire class of file-related builtins). e.g., if the system code page is Shift-JIS does this builtin return the text in Shift-JIS or UTF-8 or something else?

This is my first open-source commit.

Oh, awesome! Welcome!

I don't know where or what to write.
Maybe you will do it or tell me how to do it?

Happily!

Our release notes live in clang/docs/ReleaseNotes.rst and I'd probably add it under this heading specifically: https://github.com/llvm/llvm-project/blob/main/clang/docs/ReleaseNotes.rst#non-comprehensive-list-of-changes-in-this-release
Our documentation for builtins live in clang/docs/LanguageExtensions.rst and I'd probably add documentation under this heading specifically: https://github.com/llvm/llvm-project/blob/main/clang/docs/LanguageExtensions.rst#source-location-builtins

Both our documentation and our release notes are written in Sphinx (RST) which is a markdown format, more details of which can be found at: https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html

In terms of the encoding question I was asking, that's information we'll have to figure out (CC @tahonermann and @cor3ntin for text encoding question). My guess (which needs verification) is that we convert the file name from the system encoding to UTF-8 internally, and this builtin will likely return UTF-8 as a result. If that's correct, I think that behavior is reasonable, but I've CCed some experts who can tell me all the things I forgot to consider.

As for speaking with GCC devs, I think the issue that @MaskRay filed (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108978) got the discussion started. They asked if we should add __builtin_BASE_FILE at the same time -- is that something you think we should add and would be willing to work on as well? If so, then you should probably add a comment to that bug saying that's something we'd be fine adding so the GCC folks know the scope of the request. And if that's not something you think should be added, you should probably add a comment to that bug saying so and why. FWIW, you can see the implementation for __BASE_FILE__ and __FILE_NAME__ here: https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPMacroExpansion.cpp#L1537.

In terms of the encoding question I was asking, that's information we'll have to figure out (CC @tahonermann and @cor3ntin for text encoding question). My guess (which needs verification) is that we convert the file name from the system encoding to UTF-8 internally, and this builtin will likely return UTF-8 as a result. If that's correct, I think that behavior is reasonable, but I've CCed some experts who can tell me all the things I forgot to consider.

These things are bag of bytes - we likely can't promise _anything_ because there is no requirements of filenames to be encodable in _any_ encoding on some systems. In practice we can assume it's UTF-8 everywhere and convert to UTF-8 on windows because that's the only string literal encoding clang supports for now. If we did support other literal encodings, we would need to answer that question. IE, we could try to convert, but the conversion might not work.

clang/include/clang/Lex/Preprocessor.h
2859

processPathToFileName would be better for consistency

clang/lib/Lex/PPMacroExpansion.cpp
1993–1994

we probably want to pass a SmallVectorImpl<char> here, instead of a SmallString, to avoid making that function a template
(look at processPathForFileMacro definition for example)

2000–2001
karapsinie marked an inline comment as done.

Fulfilled the wishes of cor3ntin

karapsinie marked 3 inline comments as done.

Fulfilled the wishes of cor3ntin №2

Updated a release note and documentation.

aaron.ballman added inline comments.Mar 14 2023, 10:10 AM
clang/docs/ReleaseNotes.rst
113–114 ↗(On Diff #504648)
clang/include/clang/AST/Expr.h
4689–4690
clang/lib/AST/Expr.cpp
2283–2284
clang/lib/Lex/PPMacroExpansion.cpp
2000–2002

Minor coding style nit.

clang/lib/Parse/ParseExpr.cpp
805–817

It looks like the formatting here got wrecked, you should back these changes out.

Fulfilled the wishes of aaron.ballman

Execute "git-clang-format HEAD~1"

karapsinie marked 4 inline comments as done.

Added "__" for "builtin..."

aaron.ballman accepted this revision.Mar 16 2023, 6:17 AM

LGTM aside from a small formatting issue. Do you need someone to land this on your behalf? If so, what name and email address would you like used for patch attribution?

clang/lib/Parse/ParseExpr.cpp
804–805

Formatting here also got messed up a bit.

This revision is now accepted and ready to land.Mar 16 2023, 6:17 AM
karapsinie marked an inline comment as done.Mar 16 2023, 8:58 AM

LGTM aside from a small formatting issue. Do you need someone to land this on your behalf? If so, what name and email address would you like used for patch attribution?

Yes, I need help with landing.
Name: Ilya Karapsin.
Email: ilya84050@gmail.com.

clang/lib/Parse/ParseExpr.cpp
804–805

If these two lines are restored, "git-clang-format" will break the comments.

aaron.ballman added inline comments.Mar 17 2023, 6:44 AM
clang/lib/Parse/ParseExpr.cpp
804–805

I'll fix it on commit. Do you still get that behavior if you use git-clang-format to format only the *patch* though? https://clang.llvm.org/docs/ClangFormat.html#script-for-patch-reformatting

This revision was automatically updated to reflect the committed changes.