This is an archive of the discontinued LLVM Phabricator instance.

[clang][docs] Update SanitizerSpecialCaseList docs
AbandonedPublic

Authored by ellis on Jun 12 2023, 5:05 PM.

Details

Summary

The docs in SanitizerSpecialCaseList.rst seem to be outdated.

  • "The meaning of * in regular expression for entity names is different - it is treated as in shell wildcarding."
    • This is wrong because */foo.c will match /path/to/foo.c instead of just something/foo.c. Instead, * is treated as .* and filepaths are treated as strings.
  • I've removed fun:MyFooBar since it made it seem like we could match demangled names.

To confirm these changes, I've extended the profile-filter-new.c test and updated it to use split-file.

Diff Detail

Event Timeline

ellis created this revision.Jun 12 2023, 5:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 5:05 PM
ellis retitled this revision from [clang][docs] Update Sanitizer docs and test to [clang][docs] Update SanitizerSpecialCaseList docs.Jun 12 2023, 5:14 PM
ellis edited the summary of this revision. (Show Details)
ellis added reviewers: vlad.tsyrklevich, MaskRay.
ellis published this revision for review.Jun 12 2023, 5:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 5:16 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
MaskRay accepted this revision.Jun 12 2023, 6:21 PM

"The meaning of * in regular expression for entity names is different - it is treated as in shell wildcarding."

Right. See SpecialCaseList.cpp

I've removed fun:MyFooBar since it made it seem like we could match demangled names.

Yes, this is a limitation that we cannot match just C files (src:*.c matches .cc and .cpp files as well)...

clang/test/CodeGen/profile-filter-new.c
1

is more robust since if %t previously exists as a file (stale file after testing), the new test will fail...

rm -rf %t is also clear that all files in the old directory are removed, so more difficult to reference dangling files after updating tests...

// RUN: rm -rf %t && split-file %s %t && cd %t is common as well to avoid %t/ occurrences...

11

Consider dropping Inputs/ for all files. It's unneeded and misleading (not in the source directory).

This revision is now accepted and ready to land.Jun 12 2023, 6:21 PM

For this patch, the best test is actually llvm/unittests/Support/SpecialCaseListTest.cpp. clang/test/CodeGen/profile-filter-new.c is a bit loosely connected to the gist of the patch.

This is unrelated to this change but related to the issue this change is addressing. The use of regular expressions for special case list with the special handling of * is error-prone, I've seen many people having issues it. Furthermore, regular expression for matching file names requires excessive escaping (see for example https://fuchsia-review.git.corp.google.com/c/fuchsia/+/763162) and most patterns used for function names only utilize *, not requiring the full strength of regular expression. I think we should consider replacing regular expressions with glob patterns. These are already implemented in https://github.com/llvm/llvm-project/blob/3391bdc255f1a75c59d71c7305959e84d8d5f468/llvm/include/llvm/Support/GlobPattern.h so the implementation should be straightforward, but it's going to be a breaking change. I don't know how widespread the use of special case lists is and whether that's going to be an issue?

ellis planned changes to this revision.Jun 14 2023, 9:36 AM

This is unrelated to this change but related to the issue this change is addressing. The use of regular expressions for special case list with the special handling of * is error-prone, I've seen many people having issues it. Furthermore, regular expression for matching file names requires excessive escaping (see for example https://fuchsia-review.git.corp.google.com/c/fuchsia/+/763162) and most patterns used for function names only utilize *, not requiring the full strength of regular expression. I think we should consider replacing regular expressions with glob patterns. These are already implemented in https://github.com/llvm/llvm-project/blob/3391bdc255f1a75c59d71c7305959e84d8d5f468/llvm/include/llvm/Support/GlobPattern.h so the implementation should be straightforward, but it's going to be a breaking change. I don't know how widespread the use of special case lists is and whether that's going to be an issue?

Oh neat, I wasn't aware of GlobPattern.h. Personally, I would be happy with this change, but it would require changing some blocklists on my end (which is fine). I'm not sure what the processes is for a breaking change like this, but since clang 16 has just been released, it seems like a good time to change this. Thoughts?

This is unrelated to this change but related to the issue this change is addressing. The use of regular expressions for special case list with the special handling of * is error-prone, I've seen many people having issues it. Furthermore, regular expression for matching file names requires excessive escaping (see for example https://fuchsia-review.git.corp.google.com/c/fuchsia/+/763162) and most patterns used for function names only utilize *, not requiring the full strength of regular expression. I think we should consider replacing regular expressions with glob patterns. These are already implemented in https://github.com/llvm/llvm-project/blob/3391bdc255f1a75c59d71c7305959e84d8d5f468/llvm/include/llvm/Support/GlobPattern.h so the implementation should be straightforward, but it's going to be a breaking change. I don't know how widespread the use of special case lists is and whether that's going to be an issue?

Oh neat, I wasn't aware of GlobPattern.h. Personally, I would be happy with this change, but it would require changing some blocklists on my end (which is fine). I'm not sure what the processes is for a breaking change like this, but since clang 16 has just been released, it seems like a good time to change this. Thoughts?

I think that making this change now and highlighting it as a breaking change for LLVM 17 is probably the right strategy. We should also consider extending the existing GlobPattern implementation to support brace expansion (i.e. {foo,bar,baz}) to better match the existing functionality.

vitalybuka edited reviewers, added: vitalybuka; removed: vlad.tsyrklevich.Jun 18 2023, 8:44 PM