This is an archive of the discontinued LLVM Phabricator instance.

[PGO]: Do not use invalid Char in instrumentation variable names
ClosedPublic

Authored by davidxl on Dec 4 2015, 12:26 PM.

Details

Summary

: and < are not valid characters for assemblers. See MCAsmInfo::isValidUnquotedName in MCAsmInfo.cpp and method llvm::printLLVMNameWithoutPrefix in AsmWriter.cpp.

Because this, when the such symbols names emitted in .s file will be quoted which will also upset the assembler.

To reproduce the problem, simply build the following source with option -fprofile-instr-generate -c -no-integrated-as:

static int cmp(int a, int b)
{

return a > b;

}

extern int foo(int (*)(int, int));

int bar()
{

return foo(cmp);

}

However there is a downside of this fix -- this is not a backward compatible fix -- once the fix is in, the old version of the indexed profile can not be guaranteed to be usable again (only partially usable) -- the static function's profile data will be dropped (can not be found anymore).

Need more opinion on the impact of partially breaking the old version of indexed format.

Diff Detail

Repository
rL LLVM

Event Timeline

davidxl updated this revision to Diff 41924.Dec 4 2015, 12:26 PM
davidxl retitled this revision from to [PGO]: Do not use invalid Char in instrumentation variable names.
davidxl updated this object.
davidxl added reviewers: bogner, vsk.
davidxl added a subscriber: llvm-commits.
silvas added a subscriber: silvas.Dec 4 2015, 7:50 PM

I would prefer not to silently break compatibility. These things tend to come back and bite. In particular, static functions are quite common, and we are unlikely to "do the right thing" with just this patch (actually, I know that this will affect the intrinsics in our built-in headers, so most code is likely to be affected). Do we have a versioning mechanism available so that we can produce a proper diagnostic and reject the file?

Personally, I am not strongly opposed to changing the format. We are still rolling out PGO to our customers and haven't produced explicit documentation on this aspect, but my intention is that we will document to our users that we do not guarantee any compatibility of the profraw or profdata files across releases. So in principle I have no concerns with breaking old profraw or profdata files as long as we cleanly reject them (besides a slight apprehension since I often am comparing compilers across releases and it is convenient to not have to have a separate llvm-profdata for each release).

So I will defer to Justin about whether we should break compatibility over this aspect and if so, how we should handle the transition.

davidxl updated this revision to Diff 41985.Dec 4 2015, 10:17 PM

Updated patch.

With this change, the format change will be backward compatible.

Clang test case changes will be in another patch -- mostly mechanical. Existing test cases also include one binary profile test that covers the old naming scheme.

davidxl updated this revision to Diff 41986.Dec 4 2015, 10:19 PM

The correct update.

vsk edited edge metadata.Dec 10 2015, 5:38 PM

This lgtm.

Tests are needed.

vsk added a comment.Dec 10 2015, 5:57 PM

Oof, this hit me a second after I pressed enter. It looks like we don't have coverage for this at the moment.

I will post test changes in a different patch -- existing tests already cover the case.

I will post test changes in a different patch -- existing tests already cover the case.

No. This patch changes functionality and needs a test.

Very well. I had just added a test case to cover version 3. Will add
another one in this patch to cover the new naming scheme.

thanks,

David

davidxl updated this revision to Diff 42498.Dec 10 2015, 10:16 PM
davidxl edited edge metadata.

Update the patch with a test case.

This revision was automatically updated to reflect the committed changes.