This is an archive of the discontinued LLVM Phabricator instance.

Use raw strings to avoid deprecation warnings in regexp patterns
ClosedPublic

Authored by amccarth on Jun 4 2019, 3:49 PM.

Details

Summary

In LLDB, where tests run with the debug version of Python, we get a series of deprecation warnings because \( escape sequences are treated as part of the string literal rather than an escaped paren for the regexp pattern.

I believe the change preserves the original intent rather than the actual behavior when running with Python 3, but another pair of eyes would be appreciated.

Diff Detail

Repository
rL LLVM

Event Timeline

amccarth created this revision.Jun 4 2019, 3:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 4 2019, 3:49 PM
Herald added a subscriber: delcypher. · View Herald Transcript
jdenny added a comment.Jun 4 2019, 4:52 PM

This looks equivalent to me.

But how do I see the bad behavior or deprecation warnings? Python 3.6.7 with -ddd didn't reveal anything on a simple example I just tried.

Thanks.

jdenny accepted this revision.Jun 4 2019, 5:18 PM

Never mind. I see it now. LGTM.

llvm/utils/lit/lit/TestRunner.py
52 ↗(On Diff #203041)

The fact that \' will remain as \' bugs me because the ' need not be escaped as part of the regex. Hopefully that won't be deprecated one day.

This revision is now accepted and ready to land.Jun 4 2019, 5:18 PM

But how do I see the bad behavior or deprecation warnings? Python 3.6.7 with -ddd didn't reveal anything on a simple example I just tried.

I see the deprecation warnings when running the LLDB tests on Windows, which selects the debug version of Python 3.6.7.

I don't think other LLVM tests are run with the debug version of Python, so the warnings do not show. My larger goal here is the clean up the Windows LLDB test output so we can see the real problems and fix enough of them to make it reasonable to run the tests on some (more?) the Windows bots.

Thanks for the review.

llvm/utils/lit/lit/TestRunner.py
52 ↗(On Diff #203041)

Would you prefer the following?

kPdbgRegex = '%dbg\\(([^)\'"]*)\\)'

(It's kind of disappointing the Python raw string literals aren't always the best choice.)

amccarth updated this revision to Diff 203616.Jun 7 2019, 1:36 PM

Changed the approach in the first fix to use explicit escaping after Joel Denny's comment got me to re-think it.

amccarth marked 3 inline comments as done.Jun 7 2019, 2:08 PM
amccarth added inline comments.
llvm/utils/lit/lit/TestRunner.py
52 ↗(On Diff #203041)

I've gone ahead and switched this to a non-raw string and just added the extra escaping.

This revision was automatically updated to reflect the committed changes.
amccarth marked an inline comment as done.
jdenny added inline comments.Jun 7 2019, 5:38 PM
llvm/trunk/utils/lit/lit/TestRunner.py
1430

The \+ doesn't need the escape here either, right? While your patch doesn't impact this one, it's another unknown/unnecessary escape at the regex level, similar to the \' we discussed. I'm not a python expert, and I don't know whether it's actually worthwhile to worry about unnecessary escapes at the regex level.

Thanks for the patch, by the way.