Page MenuHomePhabricator

adds __FILE_BASENAME__ builtin macro
Needs ReviewPublic

Authored by weimingz on Feb 29 2016, 1:58 PM.
Tags
None
Tokens
"Like" token, awarded by thinkingfish."Like" token, awarded by johnkdoe."Like" token, awarded by NSProgrammer."Like" token, awarded by JetForMe.

Details

Summary

FILE will be expanded to the full path. In some scenarios, only part of the full path is needed.
This patch adds the option below:

-ffile-macro-prefix-to-remove=/foo/bar

to strip away matched prefix.

Diff Detail

Event Timeline

weimingz updated this revision to Diff 49414.Feb 29 2016, 1:58 PM
weimingz retitled this revision from to adds __FILE_BASENAME__ builtin macro.
weimingz updated this object.
weimingz added a subscriber: cfe-commits.
bcraig added a subscriber: bcraig.Feb 29 2016, 2:57 PM

Note: this doesn't count as an official "LGTM".

The code change seems fine to me. I think this has been implemented in gcc as well, but I don't recall for certain. If this has been implemented in gcc, then I would expect the semantics to be the same. If it hasn't been implemented in gcc, then we might want to pick a different name for the macro (e.g. CLANG_FILE_BASENAME).

Instead of doing this, would it make sense to have a flag like -ffile-basename that changes what FILE expands to?

I had wished I'd be able to have some control over FILE (I'd like to say "make all FILEs relative to this given directory for my use case), and changing FILE to something else in all the world's code isn't easy – so maybe having a flag that provides some control over FILE instead of adding a separate macro would be a good idea.

joerg added a subscriber: joerg.Feb 29 2016, 3:20 PM

I've added functionality like that to NetBSD's GCC a long time ago. That functionality is useful for a variety of situations, cutting down file space or canonicalization are two uses.

Note: this doesn't count as an official "LGTM".

The code change seems fine to me. I think this has been implemented in gcc as well, but I don't recall for certain. If this has been implemented in gcc, then I would expect the semantics to be the same. If it hasn't been implemented in gcc, then we might want to pick a different name for the macro (e.g. CLANG_FILE_BASENAME).

GCC has a macro "BASE_FILE", which I initially thought was for this purpose, but it's not.
BASE_FILE

This macro expands to the name of the main input file, in the form of a C string constant. This is the source file that was specified on the command line of the preprocessor or C compiler.

I will rename the macro to CLANG_FILE_BASENAME

Instead of doing this, would it make sense to have a flag like -ffile-basename that changes what FILE expands to?

I had wished I'd be able to have some control over FILE (I'd like to say "make all FILEs relative to this given directory for my use case), and changing FILE to something else in all the world's code isn't easy – so maybe having a flag that provides some control over FILE instead of adding a separate macro would be a good idea.

do you mean this?
if source file is /a/b/c/d/foo.c and if -ffile-name-stem-remove=/a/b/c, then _FILE_ will be expanded to "d/foo.c" ?

weimingz updated this revision to Diff 49445.Feb 29 2016, 6:18 PM

rename the macro to CLANG_FILE_BASENAME per Ben's comments.

thakis edited edge metadata.Feb 29 2016, 6:25 PM

Instead of doing this, would it make sense to have a flag like -ffile-basename that changes what FILE expands to?

I had wished I'd be able to have some control over FILE (I'd like to say "make all FILEs relative to this given directory for my use case), and changing FILE to something else in all the world's code isn't easy – so maybe having a flag that provides some control over FILE instead of adding a separate macro would be a good idea.

do you mean this?
if source file is /a/b/c/d/foo.c and if -ffile-name-stem-remove=/a/b/c, then _FILE_ will be expanded to "d/foo.c" ?

Yes, something like that. Maybe it could look like -f__file__-expansion=basename (to make FILE expand to just the basename, what you want), -f__file__-expansion=relative-to:/a/b/c to make it relative to a given path.

Instead of doing this, would it make sense to have a flag like -ffile-basename that changes what FILE expands to?

I had wished I'd be able to have some control over FILE (I'd like to say "make all FILEs relative to this given directory for my use case), and changing FILE to something else in all the world's code isn't easy – so maybe having a flag that provides some control over FILE instead of adding a separate macro would be a good idea.

do you mean this?
if source file is /a/b/c/d/foo.c and if -ffile-name-stem-remove=/a/b/c, then _FILE_ will be expanded to "d/foo.c" ?

Yes, something like that. Maybe it could look like -f__file__-expansion=basename (to make FILE expand to just the basename, what you want), -f__file__-expansion=relative-to:/a/b/c to make it relative to a given path.

I think we can do this separately. A "basename" macro is easier for programmers to use and no build system change needed.

bcraig edited edge metadata.Mar 1 2016, 10:24 AM

LGTM. You should probably wait for someone else to approve it though.

I think we can do this separately. A "basename" macro is easier for programmers to use and no build system change needed.

Hm, I would think that adding a flag to your CFLAGS is easier than getting all your dependencies to use a clang-only new macro…

I think we can do this separately. A "basename" macro is easier for programmers to use and no build system change needed.

Hm, I would think that adding a flag to your CFLAGS is easier than getting all your dependencies to use a clang-only new macro…

Depends a lot on the person and code base.
Pro-macro situations:
I've had code bases where the bulk of the code refused to use compiler / environment provided preprocessor tokens. In those environments, all of the uses of FILE and the like were in one place. Adding a compiler test and switching to CLANG_FILE_BASENAME would be easy in those code bases. Changing CFLAGS in dozens of different compiles wouldn't.

I've also seen uses of FILE that would break if the file name changed drastically and unexpectedly.

Pro-compilation switch situation:
If FILE is used in a lot of places, and the files are compiled with multiple compilers, then switching to a different macro would be problematic. If it is easy to change CFLAGS in all of your builds then the compilation switch makes sense.

weimingz updated this revision to Diff 49713.Mar 3 2016, 12:21 AM
weimingz edited edge metadata.

Add "-fFILE-prefix-to-remove" flag to support the trim of the prefix.
Passing special value ALL_DIR to remove all dir parts.

For example FILE is /a/b/c
-fFILE-prefix-to-remove=/a/ will cause FILE be expanded to b/c

hfinkel added a subscriber: hfinkel.Mar 3 2016, 7:20 AM

Add "-fFILE-prefix-to-remove" flag to support the trim of the prefix.
Passing special value ALL_DIR to remove all dir parts.

For example FILE is /a/b/c
-fFILE-prefix-to-remove=/a/ will cause FILE be expanded to b/c

I *really* don't like having __FILE__ in the option name. That will look visually disjointing on the command line. How about: -ffile-macro-prefix-to-remove?

weimingz updated this revision to Diff 49762.Mar 3 2016, 11:39 AM
weimingz updated this object.

Change the option name to -ffile-macro-prefix-to-remove

weimingz updated this revision to Diff 50300.EditedMar 10 2016, 10:33 AM

rebased

rebased

ping~

rebased

ping~

HI,

Any comments/suggestions?

alexr added a subscriber: alexr.Apr 8 2016, 11:46 AM

There are multiple features in this patch that should be considered separately. Please split the patch.

That said, I don't think we want to add a fundamental new preprocessor feature like FILE_BASENAME without at least getting some early buy-in from the WG14 and WG21 committees. I suspect they’d not want to add to the preprocessor at all.

The flag concept seems much more tractable. Please split it into two flags, one to force FILE to only be the basename, and one that handles the prefix stripping. That is, a magic value of ALL_DIR seems like the wrong choice here.

Sounds good. Will do.

Thanks for reviewing

weimingz updated this revision to Diff 53105.Apr 8 2016, 6:19 PM

per Alex's suggestion, split into 2 flags:
-ffile-macro-prefix-to-remove=xxx : remove matched prefix
-ffile-macro-basename-only : remove the whole dir part

weimingz updated this object.Apr 8 2016, 6:22 PM
dexonsmith requested changes to this revision.Apr 13 2018, 6:47 AM
dexonsmith added a subscriber: dexonsmith.

Hal requested splitting the patch in two, since the two features are separable, but they both still seem to be here. Perhaps start with the prefix patch?

include/clang/Driver/Options.td
828

This opt name doesn’t match the command line option.

831

Same here.

include/clang/Lex/PreprocessorOptions.h
122

This identifier name is reserved for the implementation.

126

Same here.

lib/Lex/PPMacroExpansion.cpp
1608

Why not use =?

This revision now requires changes to proceed.Apr 13 2018, 6:47 AM
JetForMe awarded a token.EditedApr 13 2018, 1:13 PM
JetForMe added a subscriber: JetForMe.

Just want to comment that this functionality is also good for security, as it enables hiding of path information in final builds (while still allowing logging to show file and line number). Because a lot of third-party code can get incorporated into a build, command-line options to modify the rendering of FILE are preferred.

weimingz updated this revision to Diff 142499.Apr 13 2018, 6:05 PM
weimingz edited the summary of this revision. (Show Details)

split the original into two parts. This one supports -ffile-macro-prefix-to-remove function.

I locally verified it. To add a test case, do we need to use VFS?

split the original into two parts. This one supports -ffile-macro-prefix-to-remove function.

I locally verified it. To add a test case, do we need to use VFS?

VFS might work.

NSProgrammer added a subscriber: NSProgrammer.
jamie added a subscriber: jamie.Nov 19 2018, 4:34 AM
Ka-Ka added a subscriber: Ka-Ka.Dec 18 2018, 12:56 AM
Ka-Ka added a comment.EditedDec 18 2018, 7:20 AM

Ping?

Do you think this small lit testcase below work as a testcase for the -ffile-macro-prefix-to-remove option?

// RUN: %clang_cc1 -emit-llvm %s  -o - -ffile-macro-prefix-to-remove=%s | FileCheck %s
char this_file[] = __FILE__;
// CHECK: @this_file = global [1 x i8] zeroinitializer
kristina added a subscriber: kristina.EditedJan 3 2019, 6:53 AM

I like the idea of just getting the filename reliably, but I think an extension to strip path prefixes is a bit of an overkill especially as yet another compiler flag. I implemented a similar thing in my fork in form of a directive to __generate(...) which is my own weird extension to do meta-programming using the preprocessor, that's the implementation with no extra parameters (just as an idea, I know the implementation is not great):

else if (PS == kMPS_DirectiveFile)
 {
   PresumedLoc PLoc = SourceMgr.getPresumedLoc(Tok.getLocation());
   
   OutBuffer.clear();
   OutBuffer.push_back('"');
   
   if (PLoc.isValid()) {
     TokenBuffer.clear();
     TokenBuffer.append(PLoc.getFilename());
     
     size_t LastSlashPos = TokenBuffer.find_last_of('/');
     if (LastSlashPos != StringRef::npos) {
       std::string LastPathComponent = 
         Lexer::Stringify(TokenBuffer.substr(LastSlashPos+1));
       
       OutBuffer.append(LastPathComponent);
     }
     else {
       Lexer::Stringify(TokenBuffer);
       OutBuffer.append(TokenBuffer.str());
     }
   }
   else {
     OutBuffer.push_back('?');
   }
 }

I think this type of approach (I don't mean the PP extension, just the methodology) for getting the filename without the full path or shedding parts of the full path is a lot nicer. I'm not entirely sure why so many headers are getting pulled in either, though, I think the only issue is the path separator? Aside from that I don't think this needs a compiler flag, but a macro for just getting the filename reliably in upstream would be nice. Your flag idea means various build systems may have to do gymnastics to find out the right path to strip so it seems cumbersome.

To throw in my 2 cents. I don’t really have a preference between a compiler flag vs a different macro that’s just for the file name sans path prefix. But I have a real need for this to get into clang: with 1.2 million lines of code, the regular placement of log statements and custom asserts leads to megabytes in binary size from all the FILE usages, and that could easily be a few hundred KB with this kind of support in clang.

To throw in my 2 cents. I don’t really have a preference between a compiler flag vs a different macro that’s just for the file name sans path prefix. But I have a real need for this to get into clang: with 1.2 million lines of code, the regular placement of log statements and custom asserts leads to megabytes in binary size from all the FILE usages, and that could easily be a few hundred KB with this kind of support in clang.

Yeah that would only require adding a new macro in lib/Lex/PPMacroExpansion.cpp plus a test instead of a huge diff as it is now. So yes as I said, that would be a great patch. However this implementation is is needlessly complicated and I don't think this should be yet another driver/compiler flag for a seemingly obscure use case. So yes I'm with you on that, filename without the path prefix is definitely an extension I'd like to see upstreamed (I proposed it on cfe-dev a while ago but it wasn't well received so I never put it up for review (that was before the weird preprocessor stuff that I do now in my fork)).

kristina added a comment.EditedJan 3 2019, 11:17 AM

I should add that this is not just about reproducible builds but about code size as mentioned by @NSProgrammer. There are ways to cheat with builtins but the problem is, entire __FILE__ string is still emitted into read-only data, despite the constexpressiveness of the value. The common way of getting the short filename in code would the the following (the #else case) which is a constant evaluated but has a side effect of dumping the full paths in rodata regardless (ICF seems unable to clean them up even in full LTO builds).

#ifdef __clang_extension_generate
    #define OS_CUR_FILENAME  __generate("file!0")
#else
    #define OS_CUR_FILENAME (__builtin_strrchr(__FILE__, '/') ? \
        __builtin_strrchr(__FILE__, '/') + 1 : __FILE__)
#endif

The extension on the other hand avoids that. So it's a win for code size and reproducible builds. I would urge reconsidering something like __FILE_NAME__, or whatever name is preferable for just getting the last path component at preprocessor stage, despite it being poorly received by other developers the last time it seems to be a feature wanted by consumers. I would urge maintainers to reply to get another consensus regarding this since my stance is still the same, if a "missing" feature is widely hacked around, it's clearly desirable. While I understand that a lot have an aversion to nonstandard Clang-specific extensions, as long as a feature check is available it should be down to the consumer to decide if they want to make use of such extensions.

Ping for author? This is one of the changes I'd like to see through, though the complexity here can be massively reduced as stated above. I wouldn't mind opening a new diff if the author is away with just a change in PPMacroExpansion and a testcase (which should probably be included here) but I would appreciate some form of consensus since this exact idea with a special macro for filename was previously rejected.

dstenb added a subscriber: dstenb.Jan 10 2019, 12:22 AM
NSProgrammer added a comment.EditedFri, Feb 22, 9:48 AM

We would prefer a macro like __FILE_NAME__ over a build flag for code reading consistency (they would clearly do different things vs varying based on an obscure flag being present/absent). (This contradicts my previous statement of having no preference since discussing it more, the macro approach would actually be preferable, but either is acceptable).

This patch is languishing, unless the original author thinks otherwise, a new patch to push this through would be great.

kristina added a comment.EditedThu, Feb 28, 9:11 AM

If the author is still missing at the end of next week, any objections to me resubmitting a similar patch that just implements __FILE_NAME__ or __BASE_NAME__ (Need a few more opinions here I guess, personally I think __FILE_NAME__ makes more sense)?

I'll carve it out from my PP extension which simply looks for the last path separator (depending on the OS) and only renders the filename after it (or the whole path if there's no separator). No need for additional complications like depths etc. Since this idea was shot down last time, is it possible to get a few people to voice their opinion before I mark this as abandoned and carve out and clean up this from my PP extension and add proper tests for it?

Would be appreciated, as this sort of thing is very useful (IMO) so would like to know if anyone is really against this proposal.